Add additional metadata field for registering RI datasets / training / learning material used in a course or programme #48

Open
5 tasks done
vronk opened this issue Dec 9, 2022 · 31 comments
Labels
admin Admin - Part of application featurerequest New feature request - Kind of issue frontend Front End - Part of application todo Ready to start development - Status of issue
Milestone

Comments

@vronk

vronk commented Dec 9, 2022

Provide an explicit definition of a "MOOC" and of other types of courses, maybe in the form of a glossary

  • Discuss and formulate the definitions, also comparing MOOC and Online Courses
  • write and make available glossary
  • Introduce in db: as additional education type
    (or as attribute (would require change of datamodel))
  • corresponding changes in managing courses (adding the education type)
  • Adapt search options on "home page" Choose Education-filter

Edit on 2024-05-14 by Patrick:
The purpose of this issue has changed, please check here: #48 (comment)

@vronk vronk added the documentation Improvements or additions to documentation - Part of application label Dec 9, 2022
@vronk vronk added this to the February 2023 milestone Dec 9, 2022
@IvdL22

IvdL22 commented Jan 31, 2023

Thank you @vronk for creating this task. I noticed that we already have a draft document with definitions started by @PixlTracer: https://docs.google.com/document/d/1FMquEoMn7EIOHH6I00K_518cDIVzBsvShUnKCut0FGE/edit?usp=sharing. We will finalise the glossary in that document and notify you when done.

@patrickakk
Contributor

@IvdL22 @PixlTracer :

Would it be possible to provide 2 examples of a MOOC? With 2 links to more detailed information? So that we can use that as reference when working on this topic?

@IvdL22

IvdL22 commented Feb 13, 2023

@patrickakk
Contributor

@IvdL22 @PixlTracer @vronk Summary from today's meeting:

It's no longer the intention to include MOOCs in the DHCR.

The open educational resources from Dariah Teach should be included. One example provided by Iulliana is:
https://teach.dariah.eu/course/view.php?id=68&section=2

Important information fields are: title, description and link.

The next step would be for Patrick to analyse the structure of the new items and see if and how we can integrate this in the registry.

Is that a correct summary?

Note: since this is additional work, I'll move this task to the mai23 milestone.

@patrickakk patrickakk modified the milestones: feb23, mai23 Feb 22, 2023
@patrickakk patrickakk assigned patrickakk and unassigned IvdL22 and PixlTracer Feb 22, 2023
@patrickakk patrickakk added featurerequest New feature request - Kind of issue and removed documentation Improvements or additions to documentation - Part of application labels Feb 22, 2023
@IvdL22

IvdL22 commented Mar 7, 2023

@patrickakk I agree with your summary and suggestion.

@patrickakk patrickakk changed the title Define course types Include open educational resources from Dariah Teach into application May 7, 2023
@IvdL22

IvdL22 commented May 10, 2023

@patrickakk @PixlTracer I have tested this by adding one course to the registry: https://dhcr.clarin-dariah.eu/courses/my-courses

You as admin can view it. It is not listed.

Conclusion: the metadata needs to be more flexible in order to be able to include open-source educational courses.

Institution - I have created DARIAH-TEACH
City - not applicable
Country - not applicable
ECTS - 1
Start date - anytime
Duration - it could be 1 week or 1 month, depending on the learner
Discipline - I could not select DH so I selected Other
Technique - I did not know which one to select. Could we add Other as an option?
Objects - DH

@patrickakk patrickakk added the specsmissing Development can’t start because specifications are missing – Status of issue label May 11, 2023
@patrickakk patrickakk assigned IvdL22 and PixlTracer and unassigned patrickakk May 11, 2023
@patrickakk patrickakk changed the title Include open educational resources from Dariah Teach into application Include open educational resources into application May 22, 2023
@patrickakk patrickakk added frontend Front End - Part of application admin Admin - Part of application labels May 22, 2023
@patrickakk patrickakk modified the milestones: Mai 2023, July 2023 May 22, 2023
@patrickakk
Contributor

Moved to July milestone since the specs are not finished. Maybe we need to move it again, depending on how long this takes.

@patrickakk
Contributor

@PixlTracer @IvdL22 @vronk
With this comment, I'll try to summarize the meeting from 2023-05-17 as well as propose some solutions:

Characteristics

Open education resources could be reused by teachers in other courses. They could also be used directly by students.

Question: We used various terms. Which one is correct? Is this: Open Education Resource (OER)?

Currently they can be found in multiple locations:
-DARIAH [ca 16 courses]: https://teach.dariah.eu/course/index.php
-CLARIN [ca 10 courses]: https://www.clarin.eu/content/training-materials
-Upskills [ca 6 courses]: https://upskills.fil.bg.ac.rs/

When they are added to the DHCR, the metadata of all of them will be available in one location, and the complete overview can be accessed through the API.

An example of recently updated (publicly shown) PhD courses through the API can be found at:
https://dhcr.clarin-dariah.eu/api/v2/courses/index?recent&course_type_id=4
The same could be available once a new education type "Open Education Resource" is created.
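
A minimal sketch of how a client could build such API URLs. The endpoint and the `recent` and `course_type_id` parameters come from the example URL above; the helper name `build_courses_url` is hypothetical, and the id of a future "Open Education Resource" type is not known yet, so it stays a parameter.

```python
from urllib.parse import urlencode

API_INDEX = "https://dhcr.clarin-dariah.eu/api/v2/courses/index"


def build_courses_url(course_type_id: int, recent: bool = True) -> str:
    """Build a course index URL filtered by education type (hypothetical helper)."""
    query = urlencode({"course_type_id": course_type_id})
    # In the example URL, `recent` is a bare flag without a value.
    if recent:
        query = "recent&" + query
    return f"{API_INDEX}?{query}"


# Reproduces the PhD example (course_type_id=4) from the comment above.
print(build_courses_url(4))
```

Once the new education type exists, only the id passed to the helper would change.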

It was proposed to enter the courses manually, since this is a small number of courses and the metadata is not expected to change often. The information could be checked/updated yearly by the administrators (Anna & Iulliana).

The characteristics of the OERs conflict with the current data model. Three options were discussed:

  1. Add one course which "summarizes" the OERs in one place (similar to the ACDH tool gallery). In this case it might be difficult for a user to find the details.
  2. Create a separate list/database/table/data model for the new characteristics. This is almost the same as creating a separate registry and requires a lot of development hours. Also, the items won't be available through the current filter/search options.
  3. Create a new Education Type and define a "set of rules" for how we can deal with the differences in structure. This might be the preferred option, depending on how we can change/adapt the current validation rules.

Which characteristics are different? OERs have:

-No physically present institution and department
-No city and country and no location on the map (lat, lon)
-No start date and is not recurring
-No fixed duration [unit and type] (that would depend on the student)
-No lecturer name and email = no problem, since those fields are not required
-No entry requirements? = no problem, since this field is not required
-Are there Tadirah objects which apply?

The goal is that the data model represents the real world. Currently the validation rules apply to all courses, which means there are no exceptions based on education type. For example, every course has to be at an existing institution, which has to be at a physically existing location. Another example is that every course needs to have at least one start date. The presence of these data is also important when users are using the filter or sort options.

We could try to implement validation rules which behave differently based on the Education Type. If we choose this solution, I'll have to look into it further to see if the model supports this.

What should the new validation rules do?

Institution and location difference

  • Create institutions for the OER materials. How many do we need?
    • Create validation rules which prevent non-OER courses from using those institutions.
  • Create city and country "Remote" for all OER courses
    • OER courses should be forced to use only the "Remote" location. Otherwise they will "pollute" the filter results when searching for a regular course.

No start date available

  • Option 1: Allow an empty start date for OERs.
    • Disadvantage: In the registry (list) they will look like courses which are not starting any more.
  • Option 2: Set a start date on a fixed day of every month and enable recurring. For example 1st of Jan, 1st of Feb, etc.
    • Disadvantage: Based on the start date (always very soon), they might get more attention than the currently existing courses.
  • Option 3: Something else?

No fixed duration [unit and type]

We could use the same approach as for the location difference:
Create a new duration unit "Flexible" which is required for OER courses and can not be used by other courses.
Would that be a good solution?
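
To make the proposal above more concrete, here is a rough sketch of validation rules that depend on the education type. It is written in Python for illustration only and does not reflect the registry's actual code. The field names (`education_type`, `city`, `country`, `duration_unit`, `start_dates`) are assumptions, and the start date handling follows Option 1 (empty start date allowed for OERs).

```python
OER_TYPE = "Open Education Resource"  # assumed name of the new education type
REMOTE = "Remote"                     # proposed placeholder city/country
FLEXIBLE = "Flexible"                 # proposed duration unit


def validate_course(course: dict) -> list:
    """Return a list of validation errors; an empty list means the course is valid."""
    errors = []
    is_oer = course.get("education_type") == OER_TYPE
    uses_remote = REMOTE in (course.get("city"), course.get("country"))
    uses_flexible = course.get("duration_unit") == FLEXIBLE

    if is_oer:
        # OER courses are forced into the "Remote" location and "Flexible" duration.
        if course.get("city") != REMOTE or course.get("country") != REMOTE:
            errors.append("OER courses must use the 'Remote' location")
        if not uses_flexible:
            errors.append("OER courses must use the 'Flexible' duration unit")
        # Option 1: no start date check for OER courses.
    else:
        # Regular courses may not use the OER-only values.
        if uses_remote:
            errors.append("only OER courses may use the 'Remote' location")
        if uses_flexible:
            errors.append("only OER courses may use the 'Flexible' duration unit")
        if not course.get("start_dates"):
            errors.append("regular courses need at least one start date")
    return errors
```

The same idea extends to the OER-only institutions: one more branch that rejects those institutions for non-OER courses.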

How can this new feature be communicated / be visible for users? How should OER courses be visible / findable for users?

  1. The user can filter on the new education type "Open Education Resource". (Use filter button)

  2. A section could be added to the menu, which contains an explanation of OERs as well as a button which automatically activates the filter and shows all the OERs. An example of clicking on this button, for the course type "PhD", would look like this: https://dhcr.clarin-dariah.eu/?course_type_id=4

  3. A short url could be created, for easy dissemination, for example: https://dhcr.clarin-dariah.eu/open-education-resources
    Is this needed? Which short url is preferred?

  4. Is anything else needed?

Is this summary correct? Did I miss anything? Are there suggestions/additions? And finally, what do you think of the proposed solution?

@vronk
Author

vronk commented Jun 7, 2023

I have to admit, I am somewhat surprised about the turn this task took.
As the analysis also shows, the educational resources are ontologically quite a different animal than courses:
the former are some kind of digital object, the latter an activity.

And! For a catalogue of resources, DARIAH alone already has the DARIAH Campus: https://campus.dariah.eu/resources/page/1
besides many other catalogues and directories. Notably, dariahTeach materials are also already included in DARIAH-Campus.

So I am decidedly against extending the functionality to support OER as first-class citizens next to courses.
In my recollection, the original idea was to allow for links/pointers to resources (learning materials) pertaining to an existing course,
not a separate index of learning materials.

@patrickakk patrickakk added specsmissing Development can’t start because specifications are missing – Status of issue and removed indev Currently in development – Status of issue labels Jun 11, 2024
@patrickakk
Contributor

@IvdL22
cc @PixlTracer
Is this the correct summary of what was said in the meeting on the 10th of June?

CLARIN needs the possibility to add links (plural) to datasets used in a course to the course metadata.
DARIAH needs the possibility to add links (also plural?) to training material that is (re)used in a course to the course metadata.

On the other hand, based on the requirements here:
#48 (comment)

Are we sure there will always be one or no link? This means not more than 1? Does that apply in all cases? we will not know. we could offer a 2nd link field (which might not be used much)...

and

Should there be a url validation check? absolutely!

The current (almost finished) implementation only contained one text field, with place for one link and a link checker that required a valid http status code. So it's not possible to enter more than one link.

The information above almost certainly needs more than one item to be added, which should be implemented in the data-model in a completely different way. (One to many relationship).
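
As an illustration of the one-to-many relationship mentioned above, here is a small sketch using Python's built-in `sqlite3`. The table and column names (`courses`, `course_links`, `course_id`, `url`) and the example data are hypothetical; the registry's real schema and database engine may differ.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a course table and a link table with a one-to-many relation."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE courses (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE course_links (
            id        INTEGER PRIMARY KEY,
            course_id INTEGER NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
            url       TEXT NOT NULL
        );
    """)


def add_course(conn: sqlite3.Connection, name: str, urls: list) -> int:
    """Insert one course together with any number of links."""
    course_id = conn.execute(
        "INSERT INTO courses (name) VALUES (?)", (name,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO course_links (course_id, url) VALUES (?, ?)",
        [(course_id, url) for url in urls],
    )
    conn.commit()
    return course_id


def links_for(conn: sqlite3.Connection, course_id: int) -> list:
    """Return all links of one course in insertion order."""
    rows = conn.execute(
        "SELECT url FROM course_links WHERE course_id = ? ORDER BY id",
        (course_id,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
# Placeholder course and URLs, for illustration only.
course_id = add_course(
    conn,
    "Example course",
    ["https://example.org/dataset-1", "https://example.org/training-material"],
)
print(links_for(conn, course_id))
```

With a single text field on the course itself, the second link in this example could not be stored at all.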

Based on the new information above the current implementation does not provide what is needed.

During the meeting it was decided to discuss what is needed during the WG meeting on the 19th June.

I'll revert and not commit the work already done, put the issue on hold and move it to the July milestone. Maybe we can all agree on what is needed soon?

patrickakk added a commit that referenced this issue Jun 11, 2024
patrickakk added a commit that referenced this issue Jun 11, 2024
patrickakk added a commit that referenced this issue Jun 11, 2024
@IvdL22

IvdL22 commented Jun 11, 2024

Hi @patrickakk cc @PixlTracer Thank you. The way I see this implementation is the following:

Description: If you use a CLARIN or DARIAH resource in your course, be it training material, dataset, tool or service, please add the URL.

Optional field 1: CLARIN resource

Optional field 2: DARIAH resource

What do you think? Would this be possible to implement? I could add the text like this to the slide and then discuss with the WG.

@patrickakk
Contributor

@IvdL22 @PixlTracer

I would suggest to go a (few) steps back in the process: Why do we develop this feature?

Is this because CLARIN and DARIAH want to know which resources are used and in which courses or how often they are used? In that case you want to measure something, which makes it an important feature?
And in that case, why do you want to ask during a WG meeting what they need?

What do we do when there is more than one link to add?

In the case of a CLARIN dataset, do you expect they always only used one?
What do you do when they used more than one and can't enter the information? Then your report is unreliable?

Do we need to add attributes to the link? (Source: DARIAH/CLARIN, Type: Dataset, tool, training material, etc.)?

How do you want to see/summarize the data which is entered?

With the requirements/wishes currently available, I would suggest the following steps:

a) Iulliana and Anna provide at least three(3) real world examples of courses and the links that can be added

b) We create a dummy preview of a report for both organizations and check if that contains the information that's needed, in the format/with the attributes needed.

c) Based on point B, we decide which information, attributes and data structure is needed.

d) Dummy user interface preview. Now, this needs to be accepted before proceeding to the next step.

e) Decide if the new fields should be provided by the API as well.

f) Technical implementation

This is probably not a small feature. Given the larger number of working hours needed for this, I would suggest having at least 1 dedicated meeting about it, where at the end everybody commits to a set of requirements, to avoid doing the same work over and over again.
And at least involve Matej @vronk as well.

And please think about this: It's easy to make changes to the wishes now, more complicated when the feature is finished and way more complicated when some data has been entered. So what do you need from this feature in 1 year from now?
A small misunderstanding now, can cause a lot of additional working hours later.

@vronk
Author

vronk commented Jun 14, 2024

The idea is that there can be more than 1 URL reference for datasets/training material (or other resources), no distinction between CLARIN and DARIAH

a separate 1:N table: external_resources

[<“label”,”URL”,”type”, “affiliation”>,..] 

“label” is an open text describing the resource (optional)
“type”= Dataset, Training Material, Service, Software, …(optional)
“affiliation” = CLARIN, DARIAH, … (optional)
{label}/{URL}

{affiliation}{type}: {label}
{URL}

CLARIN Dataset: CLARIAH.NL corpus of child speech
https://clariah.nl/…

Link checking upon submission (optionally if little effort)
accept HTTP Status >= 200 < 400

The new fields should be available via the API too.
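
The proposed structure could be sketched roughly as follows (Python, for illustration only). The field names follow the proposal above; the class name `ExternalResource`, the helper names, and the way missing optional fields are skipped in the rendering are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExternalResource:
    url: str                           # the only mandatory field
    label: Optional[str] = None        # open text describing the resource
    type: Optional[str] = None         # e.g. "Dataset", "Training Material"
    affiliation: Optional[str] = None  # e.g. "CLARIN", "DARIAH"


def render(resource: ExternalResource) -> str:
    """Render as '{affiliation} {type}: {label}' followed by the URL on its own line."""
    prefix = " ".join(p for p in (resource.affiliation, resource.type) if p)
    heading = ": ".join(p for p in (prefix, resource.label) if p)
    return f"{heading}\n{resource.url}" if heading else resource.url


def is_acceptable_status(status: int) -> bool:
    """Link check rule from the proposal: accept 200 <= status < 400."""
    return 200 <= status < 400


example = ExternalResource(
    url="https://clariah.nl/…",  # truncated in the proposal, kept as-is
    label="CLARIAH.NL corpus of child speech",
    type="Dataset",
    affiliation="CLARIN",
)
print(render(example))
```

Redirects (3xx) pass the check under this rule, while client and server errors (4xx, 5xx) do not.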

@vronk
Author

vronk commented Jun 14, 2024

@IvdL22, @PixlTracer could you please provide at least three real world examples of such external resources according to the proposed data structure

@IvdL22 IvdL22 changed the title Add field that points to the training/learning material available Add field that points to the infrastructure resources, e.g. datasets / training / learning material used in a course or programme Jun 14, 2024
@IvdL22 IvdL22 changed the title Add field that points to the infrastructure resources, e.g. datasets / training / learning material used in a course or programme Add additional metadata field for registering RI datasets / training / learning material used in a course or programme Jun 14, 2024
@IvdL22

IvdL22 commented Jun 14, 2024

@vronk @patrickakk @PixlTracer Matej, thanks for the technical solution.

Example
This is a course in the registry:
Puheen analyysin perusteet (Introduction to Speech Analysis)

“label” Introduction to Speech Analysis
“type”=Training Material
“affiliation” = CLARIN
{label}/{URL} https://www.clarin.eu/content/introduction-speech-analysis

“label” Route to a Wing Corpus
“type”= Dataset
“affiliation” = FIN-CLARIN
{label}/{URL} http://urn.fi/urn:nbn:fi:lb-2020112929

The same course also uses the CLARIN VLO to show students how to search for other corpora, so I could also add the service.

Is this example enough?

Edited by patrickakk on 2024-07-03. Reason: Fixed links

@patrickakk
Contributor

@IvdL22 @PixlTracer
cc @vronk
Thank you for providing one example. Would it be possible to provide the other two examples as well?

Could you include at least one example which uses DARIAH resources? And in general: include as many exceptions and complicated situations as possible?

@IvdL22
Could you also specify how the VLO should be added to the example you provided?

@IvdL22

IvdL22 commented Aug 27, 2024

@patrickakk
cc @vronk @PixlTracer
Regarding the VLO, if a teacher is using a corpus found via the VLO, the labels will be the same:

“label” Nijmegen corpora of casual speech
“type”= Dataset
“affiliation” = CLARIAH-NL
{label}/{URL} https://hdl.handle.net/1839/2581a242-fde3-4349-ab06-4920a964803d

Regarding the DARIAH Campus: if a teacher is using a learning or training resource in the DH programme entered in the registry, the teacher should be able to add the link to the resource:

“label” Formal Ontologies: A Complete Novice's Guide
“type”=Training Material
“affiliation” = DARIAH CAMPUS
{label}/{URL} https://campus.dariah.eu/resource/posts/formal-ontologies-a-complete-novices-guide

Please let me know if some things are not clear. Thank you for your help.

@patrickakk
Contributor

@IvdL22
cc @PixlTracer @vronk

Thank you for the examples. Do you agree, when considering the specifications here:
#48 (comment)
that the affiliation in the last two examples should be "CLARIN" and "DARIAH"?

Since according to :

“affiliation” = CLARIN, DARIAH, … (optional)

There are only two options for this value. The idea behind this was that a resource belongs either to CLARIN or to DARIAH (or there is an exception).

Is there a special reason for specifying this in more detail?

This is an important difference, please let me know if you think otherwise or have any questions.

@IvdL22

IvdL22 commented Aug 27, 2024

@patrickakk Both CLARIN and DARIAH have national consortia and repositories, so the affiliation can also be CLARIN-FIN, etc. As @vronk suggested, there could be more values, not only two. The course contributors should be free to fill the institution hosting the resource they use in teaching.

@patrickakk
Contributor

@IvdL22
cc @PixlTracer

Thank you for the explanation and valuable insights. I was aware of the national consortia, but until now it wasn't clear to me that you wanted to specify at that level of detail.
Maybe situations like this are also a good example for everybody of why it is useful to ask for examples and ask such questions at this (early) stage of the development process?

  1. Type of values
    You also mentioned "course contributors should be free to fill..." To avoid any misunderstandings about that, I will start by describing the kind of input that was meant, and as a second step we could discuss the values.

[<“label”,”URL”,”type”, “affiliation”>,..]
“label” is an open text describing the resource (optional)
“type”= Dataset, Training Material, Service, Software, …(optional)
“affiliation” = CLARIN, DARIAH, … (optional)

  • Label will look like a text field in the UI, where the course contributor can enter a freely chosen value. The field is optional.
  • URL will look like a text field in the UI, where the course contributor can enter a freely chosen value. The field is mandatory.
  • Type will look like a drop-down menu in the UI with a list of predefined values. The field is optional.
  • Affiliation will also be a drop-down menu with a predefined list of values. How else would you prevent pollution or similar items/duplicates? The field is optional.

Does that clarify a bit?
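
A rough sketch of how these input rules could be checked on submission (Python, for illustration only). The value lists are assumptions, since the specification leaves both lists open-ended ("…"), and the function name `validate_resource_input` is hypothetical.

```python
from urllib.parse import urlparse

# Assumed predefined values; the specification leaves both lists open-ended.
ALLOWED_TYPES = {"Dataset", "Training Material", "Service", "Software"}
ALLOWED_AFFILIATIONS = {"CLARIN", "DARIAH"}


def validate_resource_input(data: dict) -> list:
    """Return a list of validation errors for one submitted resource."""
    errors = []

    # URL: free text, mandatory, must look like an absolute http(s) URL.
    url = (data.get("url") or "").strip()
    parsed = urlparse(url)
    if not url:
        errors.append("url is mandatory")
    elif parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("url must be an absolute http(s) URL")

    # Label: free text, optional, so there is nothing to check.

    # Type and affiliation: optional, but restricted to the predefined values.
    resource_type = data.get("type")
    if resource_type and resource_type not in ALLOWED_TYPES:
        errors.append(f"unknown type: {resource_type}")
    affiliation = data.get("affiliation")
    if affiliation and affiliation not in ALLOWED_AFFILIATIONS:
        errors.append(f"unknown affiliation: {affiliation}")
    return errors
```

With only CLARIN and DARIAH in the list, a value such as "FIN-CLARIN" would be rejected, which is exactly the point that needs to be agreed on.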

  2. Kind of affiliation
    As far as I understood, the idea was to let them choose from either CLARIN, DARIAH, or none.
    Now you mentioned it should be possible to specify the national consortia.
    If we change this, how would it be possible to answer, for example, the following questions using the dataset:
    -Which courses use a DARIAH resource?
    -Which courses are in the German language and use a CLARIN resource?

A possible solution could be to define an affiliation parent type. Values for this could be DARIAH or CLARIN?
And define the national consortia as affiliation child types.
Are we sure that every national consortium belongs to only one parent type?
And of course this makes the implementation more complex, which means more working hours are needed.
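
A small sketch of the parent/child idea and how it would answer the two questions above (Python, for illustration only). The mapping is not verified membership data, the courses are made up, and one consortium is deliberately given two parents to show what the open question would mean for the data structure.

```python
# Illustrative parent/child mapping; not verified membership data.
AFFILIATION_PARENTS = {
    "CLARIN": {"CLARIN"},
    "DARIAH": {"DARIAH"},
    "FIN-CLARIN": {"CLARIN"},
    "DARIAH CAMPUS": {"DARIAH"},
    # Assumed to have two parents, purely to illustrate the open question.
    "CLARIAH-NL": {"CLARIN", "DARIAH"},
}


def courses_using(parent, courses, language=None):
    """Names of courses with at least one resource under the given parent type."""
    result = []
    for course in courses:
        if language is not None and course["language"] != language:
            continue
        parents = set()
        for affiliation in course["affiliations"]:
            parents |= AFFILIATION_PARENTS.get(affiliation, set())
        if parent in parents:
            result.append(course["name"])
    return result


# Hypothetical courses, for illustration only.
courses = [
    {"name": "Course A", "language": "German", "affiliations": ["FIN-CLARIN"]},
    {"name": "Course B", "language": "English", "affiliations": ["DARIAH CAMPUS"]},
    {"name": "Course C", "language": "German", "affiliations": ["CLARIAH-NL"]},
]

print(courses_using("DARIAH", courses))                     # Which courses use a DARIAH resource?
print(courses_using("CLARIN", courses, language="German"))  # German-language courses using a CLARIN resource
```

If a consortium can have more than one parent, the parent has to be stored as a set (or a join table) rather than a single column, which adds to the implementation effort.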

On the other hand, the example provided by vronk shows something different:

CLARIN Dataset: CLARIAH.NL corpus of child speech
https://clariah.nl/…

affiliation = clarin
type = dataset
label = CLARIAH.NL corpus of child speech

Are we still all on the same page?
Is this understandable or should we discuss this in a (short) meeting?

@IvdL22

IvdL22 commented Sep 19, 2024

@patrickakk , if the example by @vronk is clear and you know how to implement it, why do you ask me for more examples? :)
@vronk Matej, could you please help us move forward with this task? I provided examples but I am not able to clarify the specifications. Thank you :)

@patrickakk
Contributor

@IvdL22 To make sure that:
a) What we defined, matches what you need
b) To check if there are any misunderstandings
c) Making sure that we all have the same expectations
d) Avoid wasting valuable development hours, by checking those things before starting the work

@patrickakk patrickakk modified the milestones: July 2024, 2024-11 Oct 2, 2024
@vronk
Author

vronk commented Nov 6, 2024

have a flat list (modifiable by administrators) containing both CLARIN and national consortia.
allow multi-value in the field

example:
affiliation: ['CLARIN', 'FIN-CLARIN']
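
A minimal sketch of the flat, multi-value approach (Python, for illustration only). The list entries beyond the example above are assumptions, as are the helper names. The resource labels are borrowed from examples earlier in this thread, and their affiliation values here are illustrative.

```python
# Flat list, editable by administrators; entries beyond the example are assumptions.
AFFILIATIONS = ["CLARIN", "DARIAH", "FIN-CLARIN", "CLARIAH-NL"]


def unknown_affiliations(values):
    """Return the submitted values that are not in the administrators' list."""
    return [value for value in values if value not in AFFILIATIONS]


def resources_with(affiliation, resources):
    """Labels of resources whose multi-value affiliation field contains the value."""
    return [r["label"] for r in resources if affiliation in r["affiliation"]]


resources = [
    {"label": "Route to a Wing Corpus", "affiliation": ["CLARIN", "FIN-CLARIN"]},
    {"label": "Formal Ontologies: A Complete Novice's Guide", "affiliation": ["DARIAH"]},
]

print(resources_with("CLARIN", resources))
print(unknown_affiliations(["CLARIN", "FIN-CLARIN"]))
```

Because a resource can carry both the umbrella value and the national consortium, questions like "which courses use a CLARIN resource?" stay answerable without a parent/child hierarchy.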

@patrickakk patrickakk modified the milestones: 2024-11, 2025-03 Nov 6, 2024
@patrickakk patrickakk added todo Ready to start development - Status of issue and removed specsmissing Development can’t start because specifications are missing – Status of issue labels Nov 6, 2024
4 participants