Should instrumental FAIR data reference the instrument using a PID? #22

Open
hrzepa opened this issue Jul 3, 2017 · 11 comments
Comments

@hrzepa

hrzepa commented Jul 3, 2017

Data derived from an instrument can be very dependent on the characteristics, capability and calibration of that instrument. Should such instrumentation have its own PID and hence its own metadata which can serve to enhance the FAIR data that originates using it?

@brucellino

Interesting question. In my experience (ALICE experiment at CERN), the conditions were kept in a database which could be retrieved with an identifier. This is not exactly "I" or "A", but was definitely "R" and probably "F", depending on how one considers it. The ability to reproduce results acquired by an instrument depends critically on the calibration of that instrument, and we've seen several instances where false positives were published^[citation needed].

There must be a way to calibrate the analysis of data coming from instruments. The real question is: if this calibration data is not FAIR, can the actual data be considered FAIR?

Usually what is done is that raw data is not "published", but stored unFAIRly - only the calibrated, experiment-approved datasets are published FAIRly (sometimes). This does away with the need for instrument data, but also makes the data a little less reproducible.

On the other hand, one may argue that most instruments are far too complex for a member of the scientific public to reproduce the data.

For the sake of argument, therefore, I would say "no", instrument data does not have to be FAIR, because the FAIR data is instrument independent (i.e., the characteristics of the instrument have been calibrated and corrected for). I say "for the sake of argument" because one could easily argue the other side :)

@dr-shorthair

I would say that iff a description of the instrument is sufficient to assess the quality of its output, then yes, the instrument should have a PID. However, there are experimental arrangements where the protocol includes routine calibration, using a standard or some other means. In such cases the identity of the instrument used is much less important - but instead the protocol (which is what provides assurance about the quality of the data) must be identified so that its description may be obtained.
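As a rough illustration of the distinction drawn above, the following Python sketch models a dataset record that can point at an instrument, a protocol, or both, and picks which identifier a reader would resolve to assess data quality. The class, field names, and the example identifiers are all hypothetical and chosen only for illustration; no existing metadata schema is implied.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for a dataset produced on an instrument."""
    dataset_pid: str
    instrument_pid: Optional[str] = None
    protocol_pid: Optional[str] = None
    # True when the protocol itself prescribes routine calibration
    # (e.g. against a standard), so the protocol vouches for quality.
    protocol_includes_calibration: bool = False


def quality_reference(record: DatasetRecord) -> Optional[Tuple[str, str]]:
    """Return (kind, pid) of the identifier to resolve for quality assessment.

    If the protocol includes routine calibration, the protocol is the more
    informative reference; otherwise fall back to the instrument itself.
    Returns None when neither identifier is present.
    """
    if record.protocol_includes_calibration and record.protocol_pid:
        return ("protocol", record.protocol_pid)
    if record.instrument_pid:
        return ("instrument", record.instrument_pid)
    return None


# Example identifiers below are made up, not real PIDs.
calibrated = DatasetRecord(
    dataset_pid="example:dataset-1",
    instrument_pid="example:instrument-7",
    protocol_pid="example:protocol-3",
    protocol_includes_calibration=True,
)
uncalibrated = DatasetRecord(
    dataset_pid="example:dataset-2",
    instrument_pid="example:instrument-7",
)
print(quality_reference(calibrated))
print(quality_reference(uncalibrated))
```

The point of the sketch is only that a record can carry both kinds of reference, and that which one matters depends on the experimental arrangement.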

@brucellino

I was reminded a few days ago that sometimes instruments are 3-d printed, and can be easily reproduced. The case in point was an automated weather station used in sparse rural settings. In this case, it really does make sense to assign a PID to the machine - or rather to the description of the machine in the modelling language.

The same is true of other pieces of software-defined hardware, like FPGAs. However, these are special cases; I don't know how widely applicable they would be.

@CaroleGoble

On a related note, the Resource Identification Initiative (https://www.force11.org/group/resource-identification-initiative) aims to assign identifiers to physical objects and materials that have a digital manifestation.

@dr-shorthair

dr-shorthair commented Jul 31, 2017

That's interesting. There have been a few attempts at this in the Biodiversity and Geoscience communities.
Currently the initiative with the most traction is probably IGSN - http://www.igsn.org/

@LeifLaaksonen

I am not sure, but we might be talking about several questions here. The discussion started from a PID with its metadata, but as long as we don't fully know how and for what this information is needed, it will be challenging to foresee its usage. To take an equivalent from the IT world: all network(ed) devices have a hard-wired MAC address by default, so if the instrument is on the network it already has a sort of PID. However, the metadata might not be what one would anticipate for a research instrument.

I have not been following the THOR (https://project-thor.eu/) project but they might have some input to this as well?

A very general answer to the original question - I think the instruments should be referenced for FAIR data through PIDs.
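To make the MAC address analogy concrete, here is a minimal Python sketch of what can be read from a MAC address alone. It relies only on the standard address layout: the first three octets form the vendor prefix (OUI), and bit 0x02 of the first octet marks a locally administered (software-assigned) address. The function name and the example addresses are made up for illustration.

```python
import re

MAC_PATTERN = re.compile(r"([0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}")


def parse_mac(mac: str) -> dict:
    """Split a MAC address into the few facts it carries by itself."""
    if not MAC_PATTERN.fullmatch(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    octets = [part.upper() for part in re.split(r"[:-]", mac)]
    first_octet = int(octets[0], 16)
    return {
        # Normalised colon-separated form of the full address.
        "address": ":".join(octets),
        # Vendor prefix: identifies who made the interface, nothing more.
        "oui": ":".join(octets[:3]),
        # Set when the address was assigned in software rather than burned in.
        "locally_administered": bool(first_octet & 0x02),
    }


print(parse_mac("00-1a-2b-3c-4d-5e"))
print(parse_mac("02:00:00:00:00:01"))
```

Nothing about calibration, capability, or even the type of device is recoverable this way, which is the gap a PID with its own metadata record would have to fill.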

@hrzepa
Author

hrzepa commented Aug 2, 2017 via email

@brucellino

Any metadata is definitely not better than no metadata, because any metadata also includes malicious metadata. Sure, having sparse, but true metadata is better than having no metadata, but having no metadata is better than having false metadata.

@hrzepa
Author

hrzepa commented Aug 2, 2017 via email

@LeifLaaksonen

Shouldn't the discussion be on what metadata to add rather than how it is done? From the previous discussion I assume that we all agree on the necessity of PIDs? What metadata should be included in the PID for a hardware device? I am not an expert in networking hardware, but what information is included in the network hardware of a PC, apart from the unique MAC address?

@hrzepa
Author

hrzepa commented Aug 4, 2017 via email

6 participants