[Bug]: FsF-R1-01MD checks does not consider dynamic web services API #513

Open
MarioLocati opened this issue Jul 10, 2024 · 2 comments
Labels
bug Something isn't working

@MarioLocati

MarioLocati commented Jul 10, 2024

Description

We are trying to improve the evaluation of the following two DOIs:

  • 10.6092/ingv.it-ahead
  • 10.13127/efsm20

In addition to the DataCite metadata associated with the DOIs, we provide TC211 metadata, respectively:

Data from both can be downloaded through Open Geospatial Consortium (OGC) web service standards:

  • WMS (Web Map Service)
  • WFS (Web Feature Service)

These standards support multiple data encoding formats. The user may request any of the supported formats and may also apply a custom filter that reduces the amount of data returned by the web service.

The "FsF-R1-01MD - Metadata specifies the content of the data" check seems to expect a "file type" and a "file size". With web services there is no way to provide either: many output formats are supported, and the size of the output varies with the selected output format and with any data filter applied.
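
To illustrate why a single "file type" and "file size" cannot be declared, here is a minimal sketch that builds three WFS 2.0.0 GetFeature requests against the same layer. The endpoint, the layer name `demo:faults`, and the output format strings are assumptions made up for illustration (they are not the INGV services), and the formats a real server accepts are listed in its GetCapabilities response.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
WFS_ENDPOINT = "https://example.org/geoserver/wfs"


def build_getfeature_url(type_name, output_format, bbox=None, count=None):
    """Build a WFS 2.0.0 GetFeature URL using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "outputFormat": output_format,
    }
    if bbox is not None:
        # Spatial filter: only features intersecting this box are returned.
        params["bbox"] = ",".join(str(v) for v in bbox)
    if count is not None:
        # Upper limit on the number of features returned.
        params["count"] = str(count)
    return f"{WFS_ENDPOINT}?{urlencode(params)}"


# Same dataset, three different downloads: the format and the filter are
# chosen by the client at request time, not fixed in the metadata.
urls = [
    build_getfeature_url("demo:faults", "application/json"),
    build_getfeature_url("demo:faults", "application/gml+xml; version=3.2"),
    build_getfeature_url(
        "demo:faults", "application/json", bbox=(6.0, 36.0, 19.0, 47.0), count=100
    ),
]

for url in urls:
    print(url)
```

Each URL addresses the same dataset, yet the media type and the byte size of the response differ per request.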

Expected Behavior

If the metadata mentions web services for downloading the data, the check should pass in full, even when file type and file size are not declared in the metadata.

Actual Behavior

The "FsF-R1-01MD" check complains because neither the "file size" nor the "file type" is specified in the metadata.

Possible Fix

The "FsF-R1-01MD" check should be able to identify the presence of download web services, at least the most widely used ones such as those of the Open Geospatial Consortium (OGC). In Europe these are the preferred way to publish spatial data and the way recommended by the INSPIRE Directive (Infrastructure for Spatial Information in the European Community).

Steps to reproduce

Simply perform a check using one of the DOIs provided above.

@MarioLocati MarioLocati added the bug Something isn't working label Jul 10, 2024
@huberrob
Contributor

Dear Mario,

I think I understand the problem. You are delivering data via services (OGC) instead of data objects (datasets), and unfortunately F-UJI was built exclusively to support data objects. I know the difference seems subtle, but it explains the results.
On the other hand, some major standards such as DCAT or ISO do not differentiate much between data and their transport methods, so this might justify changes to F-UJI.

I could therefore imagine implementing an additional test that checks for two service-specific metadata properties: protocol and service endpoint.
In your example, these are specifically (and well) described using gmd:transferOptions, which would provide the necessary metadata. Instead of testing for file size and file type, F-UJI could test for the presence of these properties.
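
A rough sketch of what such a test could look for, assuming ISO 19139 XML in the standard `gmd`/`gco` namespaces. This is not F-UJI code: the function name `find_service_endpoints`, the sample record, and the protocol value `OGC:WFS` are hypothetical and only show the idea of accepting a (protocol, endpoint) pair in place of file size and file type.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespaces.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hand-written minimal record for illustration, not an actual INGV record.
SAMPLE_ISO = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:distributionInfo><gmd:MD_Distribution>
    <gmd:transferOptions><gmd:MD_DigitalTransferOptions>
      <gmd:onLine><gmd:CI_OnlineResource>
        <gmd:linkage><gmd:URL>https://example.org/wfs</gmd:URL></gmd:linkage>
        <gmd:protocol><gco:CharacterString>OGC:WFS</gco:CharacterString></gmd:protocol>
      </gmd:CI_OnlineResource></gmd:onLine>
    </gmd:MD_DigitalTransferOptions></gmd:transferOptions>
  </gmd:MD_Distribution></gmd:distributionInfo>
</gmd:MD_Metadata>"""


def find_service_endpoints(iso_xml):
    """Return the online resources under gmd:transferOptions that declare
    both a protocol and an endpoint URL."""
    root = ET.fromstring(iso_xml)
    endpoints = []
    path = ".//gmd:transferOptions//gmd:CI_OnlineResource"
    for online in root.findall(path, NS):
        url = online.findtext("gmd:linkage/gmd:URL", default="", namespaces=NS)
        protocol = online.findtext(
            "gmd:protocol/gco:CharacterString", default="", namespaces=NS
        )
        url, protocol = url.strip(), protocol.strip()
        if url and protocol:
            endpoints.append({"protocol": protocol, "endpoint": url})
    return endpoints


print(find_service_endpoints(SAMPLE_ISO))
```

A check built along these lines would pass when the returned list is non-empty; which protocol values count as known services is left open here.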

I know that Geo-INQUIRE and, e.g., FAIR-EASE are currently trying to improve FAIR for GEO. Are you also involved via EPOS? It would be good to solve this issue within a broader community.

Robert

@MarioLocati
Author

MarioLocati commented Jul 12, 2024

Dear Robert,
it is great that an additional check for the existence of web service endpoint(s) could be introduced. Please keep this issue updated on your progress.

It is worth mentioning that we do provide a link to the ISO 19115/19139-TC211 metadata in the DataCite DOI metadata: see the "relatedIdentifiers" tag with "relationType" set to "HasMetadata".
This relation is present in both the XML and JSON versions provided by DataCite services but is missing from their JSON-LD output. For the same two example DOIs as above, respectively:

XML: https://api.datacite.org/application/vnd.datacite.datacite+xml/10.6092/ingv.it-ahead
JSON: https://api.datacite.org/application/vnd.datacite.datacite+json/10.6092/ingv.it-ahead
JSON-LD: https://api.datacite.org/application/vnd.schemaorg.ld+json/10.6092/ingv.it-ahead

XML: https://api.datacite.org/application/vnd.datacite.datacite+xml/10.13127/efsm20
JSON: https://api.datacite.org/application/vnd.datacite.datacite+json/10.13127/efsm20
JSON-LD: https://api.datacite.org/application/vnd.schemaorg.ld+json/10.13127/efsm20
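
As a sketch of how a harvester might follow this relation: the URL pattern below is taken from the links listed above, while the JSON key names (`relatedIdentifiers`, `relatedIdentifier`, `relationType`) are assumed to mirror the tag names mentioned in this comment. The sample record and its URL are made up for illustration.

```python
DATACITE_API = "https://api.datacite.org"

# Media types as they appear in the URLs listed above.
MEDIA_TYPES = {
    "xml": "application/vnd.datacite.datacite+xml",
    "json": "application/vnd.datacite.datacite+json",
    "jsonld": "application/vnd.schemaorg.ld+json",
}


def representation_url(doi, fmt):
    """Build the DataCite URL for one representation of a DOI record."""
    return f"{DATACITE_API}/{MEDIA_TYPES[fmt]}/{doi}"


def metadata_links(record):
    """Return identifiers related to the record via 'HasMetadata'."""
    return [
        rel["relatedIdentifier"]
        for rel in record.get("relatedIdentifiers", [])
        if rel.get("relationType") == "HasMetadata" and "relatedIdentifier" in rel
    ]


# Hand-written stand-in for a DataCite JSON record (hypothetical values).
sample_record = {
    "relatedIdentifiers": [
        {
            "relatedIdentifier": "https://example.org/iso19139/record.xml",
            "relatedIdentifierType": "URL",
            "relationType": "HasMetadata",
        },
        {
            "relatedIdentifier": "10.1234/example",
            "relatedIdentifierType": "DOI",
            "relationType": "IsCitedBy",
        },
    ]
}

print(representation_url("10.13127/efsm20", "json"))
print(metadata_links(sample_record))
```

A record that lacks the relation, as described for the JSON-LD output, would simply yield an empty list, so the link to the ISO metadata would never be followed.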

Yes, I am involved in a broader geosciences community: Geo-INQUIRE (reference persons: Laurentiu Danciu @danciul and Javier Quinteros @javiquinte) and EPOS (reference persons: Rossana Paciello @rpaciello, Kety Giuliacci @Kety20 and Daniele Bailo @danielebailo). In addition, I am the coordinator of the INGV Data Management Office, which runs the DOI service associated with our Data Registry. For any dataset published at INGV (DOI prefixes "10.13127" and "10.6092/ingv.it") we therefore have full control over the DataCite and ISO 19115/19139/TC211 metadata, whereas control over the DCAT-AP output will be improved as soon as possible.

huberrob added a commit that referenced this issue Sep 4, 2024
huberrob added a commit that referenced this issue Sep 4, 2024
…ols) are listed in metadata along with data links and verification: by now only checks if common data formats are used (xml, json etc)