Add specification for how to extend the schema #27
Comments
Would it be relevant to have extensions elsewhere than the top level as well? Like, extra stuff for host/distribution for instance. |
I can see value in allowing for extensions at the dataset, distribution and host levels (perhaps project as well). For us (so far) the use case for using extensions has been in the import (creation) of a DMP via our API. It could be useful as well in We ended up using the following during the hackathon:
{
  "dmp": {
    "extension": [
      {
        "dmptool": {
          "template": {
            "id": 946,
            "title": "Environmental Resilience Institute Data Management Plan"
          }
        }
      }
    ]
  }
}
Related issue: RDA-DMP-Common/hackathon-2020#3 |
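A minimal sketch of how a consumer might read the extension shown above, assuming extensions live in a top-level `dmp.extension` array whose entries are keyed by tool name. The helper name `find_extension` is hypothetical and not part of the standard or of any DMPTool API.

```python
import json

# Payload copied from the hackathon example above.
payload = json.loads("""
{
  "dmp": {
    "extension": [
      {
        "dmptool": {
          "template": {
            "id": 946,
            "title": "Environmental Resilience Institute Data Management Plan"
          }
        }
      }
    ]
  }
}
""")


def find_extension(dmp_doc, name):
    """Return the first extension object keyed by `name`, or None.

    Hypothetical helper: assumes each entry of dmp.extension is an
    object whose single key is the name of the tool that owns it.
    """
    for entry in dmp_doc.get("dmp", {}).get("extension", []):
        if name in entry:
            return entry[name]
    return None


dmptool_ext = find_extension(payload, "dmptool")
# A consumer that does not know "dmptool" can simply skip the entry.
template_id = dmptool_ext["template"]["id"] if dmptool_ext else None
```

A system that does not recognise a given key can ignore that entry and still process the rest of the DMP.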
Do you have more examples of extensions needed? This could help us find the best strategy for including them. What about doing it in a slightly different way by using within the
|
I think that could be a useful approach. We are currently working through an integration that is using the common standard as the method of communication. We are still in the early stages of the project though and have not finished defining what additional information we would like to pass along. Much of the information is at the project/dmp level for example:
|
I'm new here - sorry if I misinterpreted something in this issue. I was on the call earlier and thought I would add some of my thoughts here.
I like it. In the Frictionless Data community we had a similar discussion: frictionlessdata/datapackage#663 In that case we were looking at adding specific fields. E.g. at the Swiss Polar Institute we are prefixing them with Only one possible problem (hypothetical) with the current suggestion: might two institutions come up with two extensions with the same name and some fields would be the same? I can think of two possible solutions:
|
hello ... also following this morning's call. Thanks to @cpina : this is the reservation I was trying to convey at the call:
Sorry if I misunderstand the question. |
This is a perfect summary, thanks!
My thoughts are: should we make case 2 work (two different extensions share the same name and a field name, but the field does not have the same meaning)? If this is a concern and should work, what's the best way to go (a "name" or a "prefix")? |
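The collision concern discussed above can be sketched as follows. All names here (`tool_a`, `tool_b`, the `status` field, and the `read_field` helper) are hypothetical and only illustrate the difference between flat fields and fields grouped under a tool-specific name.

```python
# Hypothetical extension data from two tools; every name is illustrative.
tool_a_fields = {"status": "approved"}   # tool A means "review status"
tool_b_fields = {"status": "embargoed"}  # tool B means "access status"

# Flat merge: the second "status" silently overwrites the first.
flat = {**tool_a_fields, **tool_b_fields}

# Grouped by tool name: each tool's fields live under its own key.
namespaced = {
    "extension": [
        {"tool_a": tool_a_fields},
        {"tool_b": tool_b_fields},
    ]
}


def read_field(doc, tool, field):
    """Return `field` from the extension owned by `tool`, or None."""
    for entry in doc["extension"]:
        if tool in entry:
            return entry[tool].get(field)
    return None


status_a = read_field(namespaced, "tool_a", "status")
status_b = read_field(namespaced, "tool_b", "status")
```

Grouping by name only moves the uniqueness requirement from the field to the tool name: if two institutions pick the same name, the collision returns, which is the case raised in this comment.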
We are going to begin work on the schema extensions for DMPRoadmap in late March or early April. We will plan to follow the pattern described by @cpina @froggypaule above by using a tool/codebase specific prefix like: Any early suggestions or feedback (once we start work on it) would be welcome. :) |
Hello! A quick one: why the name 'dmproadmap'? I ask because DMPRoadmap is the common code base for DMPTool and DMPonline. Is the name intentional? |
Yes. Any changes we'd be making would benefit the entire codebase (DMPTool, DMPonline, DMPOPIDoR, DMPAssistant, etc.). For example the DMPRoadmap system is driven in part by specific templates (e.g. Horizon2020, NSF, USGS, etc.). We have an API endpoint that allows users to create a DMP by passing in this metadata standard. To help facilitate the use of specific templates we would add a |
ok thanks.... I was just commenting :) |
Hi - I've been reading this thread, and I'm concerned that the consensus seems to be to invent a mechanism for handling namespaces in JSON. I would strongly recommend not doing this.
At the start of this work, we decided to limit our focus and ambition with the standard, so that it was developed and managed as an information exchange format. More formally, it could be described as a metadata application profile. However, the interest in this work has grown and, as such, we are now faced with a decision. Do we accept that there is demand for a more expansive standard - essentially an ontology within which new concepts can be added? Or do we continue to limit our scope, while recognising that there is demand to include extra information in, or alongside, the information exchange?
As I understand it, there are two viable options available to us:
Option 1: Widen our scope, and become an ontology
It could be argued that this is inevitable. In any case, there is already work underway to formally describe the standard as an OWL ontology, so there does appear to be demand for this. If this is the direction of travel for the DMP Common Standard, then I would recommend that we act sooner rather than later, and move from supporting plain JSON to supporting JSON-LD.
Pros:
Cons:
Option 2: Continue as before, with a new section for arbitrary extensions
We had certainly been considering how to handle extensions from the beginning of this work, and this was our original idea. With this approach, the scope of the DMP Common Standard is unchanged, but a place is added for third-parties to add arbitrary data. With this approach, the DMP Common Standard has nothing to say about how these extensions are encoded. If implementers add extensions which cause name collisions, then they will need to sort this out (typically by agreeing conventions).
Pros:
Cons:
My recommendation:
Of these two options, I think that the JSON-LD option is the more future-proof at this point. |
Thanks @paulwalk for clarifying this: having come to the CS quite late, this helps a lot. |
Thanks @paulwalk. Sadly I'm not very familiar with JSON-LD and I need to do some refreshing on it. I 100% agree that we should avoid reinventing the wheel. If any of the ideas in my suggestions already exist in a standard, I would say go with the standard unless there is a very good reason not to for this use case. |
Hi @paulwalk, in case it is decided that the community will go with the first option, we (mainly me, @JoaoMFCardoso, @ljgarcia and Marie-Christine) have been working on the ontology version of the DMP Common Standard (DMP Common Standard Ontology - DCSO), which is already committed as part of this repository (https://github.com/RDA-DMP-Common/RDA-DMP-Common-Standard/tree/master/ontologies). This was a result of the DCS hackathon last year. The goal of the ontology is to have a 1-to-1 mapping to the current DCS, to ensure compatibility between the DCSO and the original DCS standard. We will be very happy to discuss the ontology development (which you can later serialise as JSON-LD) to include the latest changes since the hackathon if you wish. As a note, we are currently working on an (invited) journal paper to showcase the DCSO and its features. So if the community decides to go with JSON-LD, we can also report this development in the paper. |
Hi, I would vote for the JSON-LD way.
@paulwalk It should be possible to remain backwards compatible (when someone ignores One might also ask why JSON-LD and not directly RDF. |
I think it would remain backwards-compatible for people parsing the document as JSON rather than JSON-LD. As far as I can see, the main thing that would be lost would be the namespace URI mapping - but the namespace prefixes would still be in the JSON.
This is really just about tooling. The DMP system APIs are already handling JSON. Developers mostly prefer it to RDF because they get native programming language support etc. JSON-LD seems to hit the "sweet-spot" for many. |
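The backwards-compatibility point above can be sketched with a plain JSON parser. This is only an illustration under stated assumptions: the `@context` mapping, the `https://example.org/...` URI, and the `dmptool:template_id` field are made up here and are not part of the DMP Common Standard or the DCSO.

```python
import json

# Hypothetical JSON-LD-style document; the prefix URI and the prefixed
# field name are invented for illustration only.
doc_text = """
{
  "@context": {"dmptool": "https://example.org/dmptool/terms#"},
  "dmp": {
    "title": "Example plan",
    "dmptool:template_id": 946
  }
}
"""

doc = json.loads(doc_text)          # plain JSON parsing, no JSON-LD processing
context = doc.get("@context", {})   # a plain-JSON consumer may simply ignore this


def split_prefixed(obj):
    """Separate un-prefixed keys from 'prefix:local' keys.

    Simplification: any key containing ':' is treated as prefixed, so
    keys that are full URLs would need extra handling.
    """
    core, prefixed = {}, {}
    for key, value in obj.items():
        prefix, sep, local = key.partition(":")
        if sep:
            prefixed.setdefault(prefix, {})[local] = value
        else:
            core[key] = value
    return core, prefixed


core, prefixed = split_prefixed(doc["dmp"])
```

As described in the comment, what a plain-JSON consumer loses is the mapping from prefix to namespace URI; the prefixes themselves are still visible in the keys.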
I think the use of JSON-LD would only break usage if you decided to use a different way of expressing your attributes. A small digression: IIIF v2 uses JSON-LD, but implementers rapidly started to realise that attribute values can be anything (reference url? regular string? array of reference urls?). IIIF v3 therefore decided to be far more strict, and that is what one should probably do to make other developers' lives easier. Let's not forget that most JSON parsers are just JSON parsers, and are not like XML parsers that can handle namespaces. |
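To illustrate why loosely typed attribute values are a burden, here is the kind of defensive helper a consumer ends up writing when a value may be a single string or an array. The helper name `as_list` is hypothetical; a stricter schema (always an array, for example) would make it unnecessary.

```python
def as_list(value):
    """Normalise a value that may be absent, a scalar, or a list.

    Hypothetical consumer-side helper, needed only when the schema
    does not fix the shape of the attribute.
    """
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


single = as_list("https://example.org/a")
many = as_list(["https://example.org/a", "https://example.org/b"])
missing = as_list(None)
```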
We are currently converting our API over to use this Common Standard metadata schema. We have a few scenarios where we also need to convey information that is required for our system but outside the scope of this schema.
It would be good if the schema provided guidance on how best to include this type of information, so that systems adopting the Common Standard schema follow similar patterns.
For example, the DMPTool API requires that a DMP template identifier be specified along with some other information specific to the caller's system (called 'abc' below) when creating a new DMP.
We will be using the following structure to accomplish this:
Apologies if this has already been discussed and I just missed it in the documentation somewhere.