Should instrumental FAIR data reference the instrument using a PID? #22
Comments
Interesting question. In my experience (ALICE experiment at CERN), the conditions were kept in a database from which they could be retrieved with an identifier. This is not exactly "I" or "A", but was definitely "R" and probably "F", depending on how one considers it. The ability to reproduce results acquired by an instrument depends critically on the calibration of that instrument, and we've seen several instances where false positives were published^[citation needed]. There must be a way to calibrate the analysis of data coming from instruments. The real question is: if this calibration data is not FAIR, can the actual data be considered FAIR? Usually what is done is that raw data is not "published", but stored unFAIRly - only the calibrated, experiment-approved datasets are published FAIRly (sometimes). This does away with the need for instrument data, but also makes the data a little less reproducible. On the other hand, one may argue that most instruments are far too complex for a member of the scientific public to reproduce the data. For the sake of argument therefore, I would say "no", instrument data does not have to be FAIR, because the FAIR data is instrument independent (i.e., the characteristics of the instrument have been calibrated and corrected for). I say for the sake of argument because one could easily argue the other side :) |
I would say that iff a description of the instrument is sufficient to assess the quality of its output, then yes, the instrument should have a PID. However, there are experimental arrangements where the protocol includes routine calibration, using a standard or some other means. In such cases the identity of the instrument used is much less important - but instead the protocol (which is what provides assurance about the quality of the data) must be identified so that its description may be obtained. |
I was reminded a few days ago that sometimes instruments are 3-d printed, and can be easily reproduced. The case in point was an automated weather station used in sparse rural settings. In this case, it really does make sense to assign a PID to the machine - or rather to the description of the machine in the modelling language. The same is true of other pieces of software-defined hardware, like FPGAs. However, these are special cases; I don't know how widely they would be applicable. |
On a related note, the Resource Identification Initiative (https://www.force11.org/group/resource-identification-initiative) aims to assign identifiers to physical objects and materials that have a digital manifestation. |
That's interesting. There have been a few attempts at this in the Biodiversity and Geoscience communities. |
I am not sure, but we might be talking about a bunch of questions here? The discussion started from a PID with its metadata, but as long as we don't fully know how and for what this information is needed, it will be challenging to foresee its usage. To take the equivalent from the IT world, all network(ed) devices have by default a hard-wired MAC address, so if the instrument is on the network it has a sort of PID already. However, the metadata might not be what is anticipated for a research instrument. I have not been following the THOR (https://project-thor.eu/) project but they might have some input to this as well? A very general answer to the original question - I think the instruments should be referenced for FAIR data through PIDs. |
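To make the MAC address analogy concrete, here is a minimal Python sketch that reads the hardware address of the local machine and formats it as an identifier string. The function names `format_mac` and `local_mac` are made up for illustration. Note that `uuid.getnode()` falls back to a random number when no hardware address can be found, so this is only a sketch of the idea, not a dependable PID.

```python
import uuid


def format_mac(node: int) -> str:
    """Format a 48-bit integer as a colon-separated MAC address string."""
    # Take the six bytes from most significant to least significant.
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def local_mac() -> str:
    """Return the hardware address of this machine as reported by uuid.getnode()."""
    # uuid.getnode() returns a 48-bit integer; if no hardware address is
    # available it returns a random value instead, so the result is not
    # guaranteed to be persistent.
    return format_mac(uuid.getnode())


if __name__ == "__main__":
    print(local_mac())
```

The address identifies a network interface rather than the instrument behind it, and it carries no descriptive metadata of its own, which is the gap noted in the comment above.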
Regarding metadata, I believe that ANY metadata is better than NO metadata. And once the community starts to approach agreement on what metadata is best suited for describing a particular instrument, further metadata can always be added to any particular PID.
I am already discussing the type of metadata that could be added to high resolution NMR spectrometers used in chemistry.
|
Any metadata is definitely not better than no metadata, because any metadata also includes malicious metadata. Sure, having sparse, but true metadata is better than having no metadata, but having no metadata is better than having false metadata. |
We have long added metadata using only workflows rather than manually. I presuppose that it would be manually entered metadata that is likely to be malicious, in which case one should argue that metadata should always be generated as part of a well tested workflow. Perhaps even that workflow should have its own metadata and perchance even a PID!
|
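As an illustration of the workflow-generated metadata suggested above, the following Python sketch builds a metadata record from the data itself, with no hand-typed fields beyond the identifiers passed in. Everything named here is an assumption made for the example: the `build_metadata` function, the field names, the workflow label and the PID strings are hypothetical and not taken from any existing system.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical label for the workflow itself; in practice this could be a PID.
WORKFLOW_ID = "example-workflow/1.0"


def build_metadata(raw: bytes, instrument_pid: str, calibration_id: str) -> dict:
    """Derive a metadata record from the raw data plus the given identifiers."""
    return {
        "instrument_pid": instrument_pid,   # which instrument produced the data
        "calibration_id": calibration_id,   # which calibration was in force
        "workflow": WORKFLOW_ID,            # which workflow wrote this record
        "sha256": hashlib.sha256(raw).hexdigest(),  # checksum computed, not typed
        "size_bytes": len(raw),
        "created": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    record = build_metadata(
        b"example raw data",
        instrument_pid="hypothetical-pid:nmr-spectrometer-01",
        calibration_id="hypothetical-calibration-001",
    )
    print(json.dumps(record, indent=2))
```

Because the checksum and size are computed from the data, a record that does not match its dataset can be detected, which goes some way towards the concern about false metadata.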
Shouldn't the discussion be on what metadata to add rather than how it is done? From the previous discussion I assume that we all agree on the necessity of PIDs? What metadata should be included in the PID for a hardware device? I am not an expert in hardware for networking devices but what information is included in the network hardware for a PC, apart from the unique MAC number? |
If there is concern about malicious metadata, then how it is done is surely important.
I doubt that discussion about what metadata to include can easily be done outside of the community that uses, e.g., any particular instrument, other than of course the standard Dublin Core schema.
|
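For the Dublin Core baseline mentioned above, a minimal record for an instrument might look like the output of the Python sketch below. It uses the standard Dublin Core element namespace and a few of its elements (`title`, `identifier`, `type`, `description`). The wrapping `metadata` element, the `dublin_core_record` helper and all field values, including the PID string, are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialise a dict of Dublin Core element names and values as XML."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; the identifier is not a real PID.
instrument = {
    "title": "High resolution NMR spectrometer (example)",
    "identifier": "hypothetical-pid:nmr-spectrometer-01",
    "type": "PhysicalObject",
    "description": "Example instrument record using generic Dublin Core fields.",
}

if __name__ == "__main__":
    print(dublin_core_record(instrument))
```

Anything instrument-specific, such as field strength or probe type for an NMR spectrometer, has no home in these generic elements, so that part would still have to be agreed within the community that uses the instrument.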
Data derived from an instrument can be very dependent on the characteristics, capability and calibration of that instrument. Should such instrumentation have its own PID and hence its own metadata which can serve to enhance the FAIR data that originates using it?