Evaluation of voc4cat-tool for management of vocabularies #235
Hi Alessandro,
Great! You would be the first to build another vocabulary from the template. There may be some rough edges / missing notes. But our voc4cat shows that it is working well.
IDranges give users pre-reserved ranges of IDs. So they already know the final concept-IRI when they submit new terms (given that the PR is accepted). The pipeline checks that every user creates new concepts only within their pre-reserved IDrange. See also iri-design.md. The IDrange-support is only useful/needed if terms include a unique number in the IRI. Some other vocabularies use UUID4-type IDs in their IRIs. In this case the IDrange-based coordination is not necessary and the IDrange-based checks may be removed from the pipeline.
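To illustrate the range check described above, here is a minimal sketch. It is not the actual voc4cat-tool code. The names `IdRange`, `concept_number` and `ids_outside_range` are hypothetical, the example IRIs are made up, and the sketch assumes that the numeric ID is the last `_`-separated part of the IRI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdRange:
    """Hypothetical model of one pre-reserved ID range (bounds inclusive)."""
    user: str
    first_id: int
    last_id: int


def concept_number(iri: str) -> int:
    """Extract the numeric ID from an IRI such as https://example.org/vocab_0000012.

    Assumption: the number is the last '_'-separated part of the final path segment.
    """
    tail = iri.rstrip("/").rsplit("/", 1)[-1]
    return int(tail.rsplit("_", 1)[-1])


def ids_outside_range(new_iris, id_range):
    """Return the new concept IRIs whose number is not in the user's reserved range."""
    return [
        iri
        for iri in new_iris
        if not (id_range.first_id <= concept_number(iri) <= id_range.last_id)
    ]


# Usage: a hypothetical user with IDs 10..20 reserved submits two new concepts.
reserved = IdRange(user="alice", first_id=10, last_id=20)
submitted = ["https://example.org/vocab_0000012", "https://example.org/vocab_0000042"]
rejected = ids_outside_range(submitted, reserved)
```

A vocabulary that uses UUID4-type IDs has no number to extract, which is why this kind of check would simply be dropped there.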
What do you mean by label, skos:label? (This is not supported at the moment.) Changing the Excel sheet requires that the SHACL profile and the code that reads and writes xlsx are changed accordingly. Adding more fields to the concept scheme sheet would be relatively easy. Some additional fields like license would be useful for us, too.
Some features are supported: multi-language (each language is on a separate line), multiple creators can be given as comma-separated list. We have collected some ideas about possible future changes in #124. (addressing e.g. deprecation) Modifying the concept sheet is more complex which is why we are collecting several possible changes in #124.
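The two conventions mentioned above (one row per language, creators as a comma-separated list) could be read roughly as follows. This is only a sketch under assumptions: the row layout `(concept_iri, language_code, pref_label)` and the function names are hypothetical and do not reflect the actual template columns or the voc4cat-tool code.

```python
from collections import defaultdict


def split_creators(cell: str) -> list[str]:
    """Split a comma-separated creator cell into a clean list of entries."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def labels_by_language(rows):
    """Group rows of (concept_iri, language_code, pref_label) per concept.

    Each language of a concept sits on its own row, so rows that share
    an IRI are collected into one {language: label} mapping.
    """
    labels = defaultdict(dict)
    for iri, lang, label in rows:
        labels[iri][lang] = label
    return dict(labels)


# Usage with made-up data
creators = split_creators("https://example.org/alice, https://example.org/bob")
labels = labels_by_language([
    ("https://example.org/vocab_0000012", "en", "catalyst"),
    ("https://example.org/vocab_0000012", "de", "Katalysator"),
])
```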
No problem. It would be great if the template is useful beyond the original project it was created in. |
Hi,
Sounds good.
Ok, I think in this case it won't be useful to me, but I'll keep the information in mind in case it becomes useful in the future.
I would like to add properties that describe the vocabulary, such as skos:prefLabel, omv:acronym, omv:resourceLocator, omv:knownUsage, dct:audience, doap:repository, dct:license, and dct:language. That is why I was asking whether this is possible. These are properties I am currently using, and I would like to keep them. I also see that the "4th draft for next xlsx template (v1.0) 2024-02-11" does indeed include a row for Repository.
I don't know SHACL in detail, but I see that the profile is quite generic.
I'll take the next few days to check which properties I’m collecting to define the various concepts, so I can identify which ones are missing in your template. The goal is not to lose the properties I have collected so far to describe the concepts in the vocabulary.
Best |
Hi, dct:source (e.g. http://purl.obolibrary.org/obo/ENVO_01000155, http://purl.obolibrary.org/obo/ENVO_01000156) PS: If I wanted to add a field to the Excel sheet and have its counterpart in the ttl, where should I make the change? |
Just an FYI: Several properties are for recording changes. So far we have kept the provenance info rather simple, as it was in the Australian vocpub profile which we re-used. Moreover, git has all the history details. With git blame, quite detailed provenance is easily accessible, e.g. https://github.com/nfdi4cat/voc4cat/blame/main/vocabularies/voc4cat/0000005.ttl (this could in principle be converted to a history expressed with properties in RDF).
The column in the concepts sheet results in skos:broader and skos:narrower relations (but not skos:broadMatch).
By ttl, do you mean the SHACL profile? I would first write a test turtle file with a concept that has all the additional properties and then work on the SHACL profile until validation passes. With a local install of voc4cat-tool you can run validation (without conversion to Excel). Modifying the Excel sheet requires changing the conversion code, which means that you create a custom voc4cat-tool. To avoid this, it would be best if we could agree on a common set of properties (= columns in Excel). The profiles can be different, so you could mark properties that are mandatory for voc4cat as optional in your vocabulary profile. Hope it makes sense... |
I tried to make a pull request (oggioniale/elter-vocabularies#1) following the readme here. However, I encountered an error.
The idranges.toml contains this:
|
Hmm. Your case is special and fails because you don't have any user defined in the IDranges file. Therefore, id_range is empty => ValidationError. The validation is too strict because we have never thought about using the tools without setting IDranges for the users. I have to look how to approach this best. |
From the URLs in your test, I saw that you use SKOSMOS to make the vocabulary available. We also have concrete plans to make the voc4cat vocabulary available over an NFDI SKOSMOS service in Q1-2025. Do you know if a SKOSMOS-compatible profile exists? If that is the case, I would be interested in making the current profile compatible. |
I saw this and the provenance is very important for us.
Perfect!
Ahh ok! I thought there was a direct correspondence between the excel sheet and the ttl file that was produced, without having to go through a validator.
This makes a lot of sense. I already have the ttl corresponding to my excel sheet. But I don't have a shacl shape to validate it. But I think this will be a next step for me. |
I tried yesterday to enter a user (myself). Now id_range is filled but it still gives me an error.
|
Copied from PR-comments (to record all difficulties here): One problem was that the Excel file must be named just dataLevel.xlsx for the vocabulary "dataLevel", not vocab_dataLevel.xlsx. The error message is in principle clear, but one does not think of the Excel file name as the source of the problem. The Excel filename determines which vocabulary is looked up in IDranges. This should be better documented, and we should add a special check for this with a more meaningful error message. Another problem was that the Excel-Template version was changed to 0.0.1 (on the Introduction sheet). It should stay at 0.4.3. It is the version of the Excel structure, not of the vocabulary contents. |
With a little Python experience it is not particularly difficult to run the commands of the gh-action on your local computer. Especially if you plan to change more, running locally is important to iterate faster. I can give a short summary of how to set up a local clone to do this, if you are interested. |
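A check of the kind suggested above might look like the following sketch. It only illustrates the idea and is not taken from voc4cat-tool. The function name `vocabulary_for_file` and the error wording are hypothetical, and the list of configured vocabularies is assumed to have been read from the IDranges file beforehand.

```python
from pathlib import Path


def vocabulary_for_file(xlsx_path, configured_vocabularies):
    """Map an Excel file to its vocabulary via the file name (without extension).

    Raises ValueError with a hint if the name matches no configured vocabulary.
    """
    name = Path(xlsx_path).stem
    if name in configured_vocabularies:
        return name

    # Look for a configured vocabulary name hidden inside the file name,
    # e.g. "vocab_dataLevel" contains "dataLevel", to suggest a rename.
    hints = [v for v in configured_vocabularies if v.lower() in name.lower()]
    hint = f" Did you mean '{hints[0]}.xlsx'?" if hints else ""
    raise ValueError(
        f"No vocabulary named '{name}' is configured in the IDranges file. "
        f"The Excel file name must match the vocabulary name exactly.{hint}"
    )


# Usage: the correctly named file resolves to its vocabulary.
vocab = vocabulary_for_file("inbox/dataLevel.xlsx", ["dataLevel"])
```

Calling the function with `vocab_dataLevel.xlsx` would raise an error that points directly at the file name instead of at the ID ranges.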
Hello, the PR is now successful: oggioniale/elter-vocabularies#5. But now I would have expected a folder called dataLabel in https://github.com/oggioniale/elter-vocabularies/tree/main/vocabularies and generated documentation in the workflow artifacts. I'm sorry for all these questions, but I am not yet able to work this out on my own. |
Don't worry. For us, your interest is a great opportunity to test and improve re-usability. We are happy to help. |
I created a PR in your test repository with a fixed Excel-file: oggioniale/elter-vocabularies#6 |
Your current configuration manages multiple vocabularies together in a single git repository. In voc4cat we manage just a single vocabulary in the repository, so this variant is more thoroughly tested in real life. Both variants have pros and cons. For example, the cons of multiple vocabularies in one repository:
Typically, I would use a distinct repository for each vocabulary. Only for small closely related vocabularies I would manage multiple in one. |
I saw your work and I downloaded the Excel file! Thanks! I also understand the errors (ROR and prefix); I would never have found them myself. The prefix for the collection was my mistake. |
Thank you. I understand what you say. In this case, too, I wanted to do a test. |
BTW, the failure in the merge action is caused by a missing gh-pages branch. To use documentation hosting on gh-pages, gh-pages needs to be activated in the settings of the repository and an (initially empty) gh-pages branch must be present. |
I was just wondering where I would find the documentation. |
If you want to host the docs on gh-pages, a redirect from a permanent URL service to gh-pages also needs to be configured. voc4cat's w3id.org config can serve as inspiration. It configures redirects to the correct anchor of the concepts in html and supports multiple vocabulary versions. |
Ahhhhh ok. |
Done that as well! https://oggioniale.github.io/elter-vocabularies/dev/dataLevel/ |
Last test for today. |
We are getting closer to the finish. 😄 🏁 So what made it fail?
Another (minor) issue: You evidently created the gh-pages branch from the main branch. However, the gh-pages branch should be empty at the beginning. The only file that should be added is an empty file named
A note on deleting: Deleting a concept in Excel does not delete the concept from the concept scheme. To completely remove a concept (which should be very rare!), its turtle file in the vocabulary directory has to be deleted via a PR. Due to this it is possible to edit only a reduced subset of a vocabulary via Excel-uploads. |
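The behaviour described in the note above can be pictured with a small model. This is a simplified sketch, not the tool's implementation: it assumes the concept scheme is a mapping from concept ID to concept data, whereas the real workflow operates on one turtle file per concept. The function names are hypothetical.

```python
def merge_upload(existing: dict, uploaded: dict) -> dict:
    """Merge an Excel upload into the existing concept scheme.

    Concepts in the upload are added or updated. Concepts missing from the
    upload are kept, so an upload can safely cover only a subset.
    """
    merged = dict(existing)
    merged.update(uploaded)
    return merged


def remove_concept(existing: dict, concept_id: str) -> dict:
    """Explicit removal, the counterpart of deleting the concept's turtle file via a PR."""
    return {cid: data for cid, data in existing.items() if cid != concept_id}


# Usage with made-up concepts: the upload edits 0000002 and omits 0000001.
scheme = {"0000001": "catalyst", "0000002": "reactor"}
after_upload = merge_upload(scheme, {"0000002": "chemical reactor"})
after_removal = remove_concept(after_upload, "0000001")
```

In this model the omitted concept survives the upload and disappears only after the explicit removal step.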
I have done all the tests to update the concepts, and I think I now understand quite well how the process works. I will run some tests with a larger vocabulary, discuss it with my colleagues, and then let you know. I think I will propose this solution for vocabulary management. The questions about adding certain properties and managing the id_range remain open, though. On the other hand, I think it is good to be able to use only one vocabulary per repository. I will get back to you soon. |
Thanks for the update. Let us know how your proposal is received by your colleagues! If you would like us to present voc4cat or discuss details like how to approach the customization, we (@nmoust or me) would also be available for a zoom call. We will not address #237 immediately. Until your decision it has low priority because it does not affect our vocabulary. |
Dear all,
I am writing to you because I am considering using your tool for managing some vocabularies as part of a European project.
Currently, the management is being done using an Excel sheet, but there is a complete lack of versioning and, above all, the ability to use GitHub as a platform for discussions, issue tracking, etc.
Following the instructions provided in the README of your repository, I created a clone on my GitHub account (https://github.com/oggioniale/elter-vocabularies/tree/main?tab=readme-ov-file). I made the necessary modifications to the file idranges.toml, but I have three questions to ask.
The first concerns the "Section of IDranges" in the file idranges.toml. What are ID ranges in the vocabulary?
The second is whether in the Excel file (e.g., voc4cat_template_043.xlsx) in the "Concept Scheme" sheet, I can add fields beyond those already present in blue. Fields such as language, license, acronym, label, etc. And whether I can define multiplicities for a concept (e.g., Creator).
The third, and final question, concerns the same Excel file, but in the "Concept" sheet. In this case, I would also like to know if I can add columns to enrich the concept, such as properties like prefLabel or definition in different languages, deprecated, created, modified, etc.
I apologize in advance for the many questions, but since what you have created seems to me a very interesting tool, I would like to understand if I can use it.
Best
Alessandro