Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formalize how channel types are shared between modalities that use channels.tsv files. #1436

Closed
sappelhoff opened this issue Mar 8, 2023 · 29 comments
Labels
consistency Spec is (potentially) inconsistent
Milestone

Comments

@sappelhoff
Copy link
Member

sappelhoff commented Mar 8, 2023

came up during the BEP029 (motion) community review:

also related to:

Some BIDS modalities allow for channels.tsv metadata:

  • EEG
  • MEG
  • iEEG
  • NIRS

channels.tsv files MUST have a type column that contains (upper case) strings from a defined vocabulary as specified in the modality specific sections. This column MUST be the second column in channels.tsv files.

In the schema, the valid values for type are specified separately for EEG, MEG, iEEG, and NIRS, see:

Even though in the spec, it says:

For EEG (types are shared with MEG and iEEG):

Restricted keyword list for field `type` in alphabetic order (shared with the
MEG and iEEG modality; however, only the types that are common in EEG data are listed here).
Note that upper-case is REQUIRED:

For MEG (no mention of sharing):

Restricted keyword list for field `type`.
Note that upper-case is REQUIRED:

For iEEG (types are shared with MEG and EEG):

Restricted keyword list for field type in alphabetic order (shared with the MEG
and EEG modality; however, only types that are common in iEEG data are listed here).
Note that upper-case is REQUIRED:

For NIRS (types are shared with EEG, MEG, and iEEG -- see last sentence):

All NIRS channels types MUST correspond to a [valid SNIRF data type](https://github.com/fNIRS/snirf/blob/master/snirf_specification.md#appendix).
Additional channels that are recorded simultaneously with the NIRS
device and stored in the same data file SHOULD be included as well.
However, additional channels that are simultaneously recorded with a different device
SHOULD be stored according to their appropriate modality specification.
For example, motion data that was simultaneously recorded with a different device should be specified
according to BEP029 and not according to the NIRS data type.
Whereas, if the motion data was acquired in with the NIRS device itself, it should be included here with the NIRS data.
Any of the channel types defined in other BIDS specification MAY be used here as well such as `ACCEL` or `MAGN`.
As several of these data types are commonly acquired using NIRS devices they are included as an example at the base of the table.
Note that upper-case is REQUIRED.


We should:

  1. clarify/formalize how channel types may be shared between modalities
  2. think about how to make this clearer in the schema and the associated rendering of channel type tables in the modality sections

I propose that channel types should be shared between all BIDS modalities.

@sappelhoff sappelhoff added this to the 1.9.0 milestone Mar 8, 2023
@sappelhoff sappelhoff added the consistency Spec is (potentially) inconsistent label Mar 8, 2023
@JuliusWelzel
Copy link
Collaborator

Would it make sense to specify for each modality if the keywords in type are modality-specific or can be shared? Some channel types might require modality-specific metadata to be shared -> type ACCEL in MOTION must also have component in the channels.tsv specified, type LATENCY is independent of any other modality-specific channels.tsv column.

I also noted that the glossary for channel types does not specify which modality it refers to (https://bids-specification.readthedocs.io/en/stable/glossary.html#type-sense-1-columns). Currently rendered as type sense 1, type sense 2, ... .

@sjeung
Copy link
Collaborator

sjeung commented Mar 10, 2023

I do like having channel definitions shared across modalities whenever possible.
However I also see the need for distinguishing between types that are more central to the modality and more peripheral.
For instance we should make it explicit that (not all but at least one of) MEG channel types MUST be included if they put the data in MEG folder whereas EEG channels may be included when it happens to have been recorded by the same device.

For instance,
(need to still think about the naming)
eeg core : EEG, EOG ...
eeg additional : eyegaze, audio ...

Then the website can only render the core channel types.
Additional types that are particularly relevant described in text.

And prescribe people with "recommended" modality per channel type?
For instance definition for POS channel type has motion as "recommended" modality, whereas latency can be marked as modality agnostic.

I acknowledge that this does not have to be a hard set thing because depending on the data set one modality is more important than another, but ideally people should be encouraged to separate between different data modalities and this can be made explicit by recommending a modality (or multiple of them) per type.

@sappelhoff
Copy link
Member Author

In the biweekly maintainers meeting yesterday we discussed a plan that I think goes well together with your suggestions, Julius and Sein. Thanks!

The current plan is to:

  1. Go through each channel type and check the definition, and formalize/disambiguate it where needed. Re: Julius comment, it's our goal that each channel type has one definition, as opposed to a different usage depending on BIDS modality. From a brief glance at the spec, I think that should currently still be possible without any breakage of backwards compatibility
  2. Adjust how channel types are handled in the schema -- in particular we'll see whether something like "Home modalities" for channel types would be feasible (e.g., axial gradiometer have the home modality MEG); "other" channels would be classified under a "common type" home modality (or similar)
  3. In each rendered BIDS spec modality section we will then list the channel types that belong to that home modality, together with a note that any other types are valid as well, and with a link to a new appendix page that will list all channel types in a single table

I'll take care of step 1 soon (next week, probably), and Chris will see whether he can carve out some time to do steps 2 and 3 (with help from the rest of us where needed)

cc @rwblair @effigies @ericearl @bendhouseart -- let me know if I missed something.

@JuliusWelzel
Copy link
Collaborator

Sounds like a very good solution you came up with! I can draft a table for MOTION-BIDS which channels are "home" in motion and which are not. Would that help?

@ericearl
Copy link
Collaborator

cc @rwblair @effigies @ericearl @bendhouseart -- let me know if I missed something.

@sappelhoff You got our discussion in there fully, thanks!

@sappelhoff
Copy link
Member Author

yes @JuliusWelzel that'd be helpful, thanks -- you could post it into this thread.

@JuliusWelzel
Copy link
Collaborator

JuliusWelzel commented Mar 12, 2023

Keyword Description Modalities to be shared with
ACCEL Accelerometer channel, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y, or z). fNIRS
ANGACC Angular acceleration channel, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y, or z). NONE
ANGVEL Angular velocity channel, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y, or z). NONE
GYRO Gyrometer channel, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y, or z). fNIRS
JNTANG Joint angle channel between two fixed axis belonging to two bodyparts. Angle SHOULD be defined between proximal and distal bodypart in deg. NONE
LATENCY Latency of samples in seconds from recording onset. MUST be in form of ss[.000000] , where [.000000] is an optional subsecond resolution between 1 and 6 decimal points. ALL
MAGN Magnetic field strength, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y or z). fNIRS
MISC Miscellaneous channels. ALL
ORNT Orientation channel, one channel for each spatial axis or quaternion component. Column component for the axis or quaternion label MUST be added to the *_channels.tsv file (x, y, z, quat_x, quat_y, quat_z, or quat_w). fNIRS
POS Position in space, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y or z). NONE (Eye-Tracking in the future)
VEL Velocity, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y or z). NONE (Eye-Tracking in the future)

This is a first draft on how a table could look like in the spec for MOTION-BIDS. I used the columns.yaml file from the schema to check for duplicate channel types.

I noticed the channel of type MAGN in MOTION-BIDS is similar to MEGMAG. It seems odd to abbreviate a channel of the same concept (both Magenometers) slightly different (*MAG and MAGN).

@sjeung Let me know if you spot any mistakes :)

@JuliusWelzel
Copy link
Collaborator

One uncertainty on my part when creating a BEP, which I am now realizing, is according to which criteria do I include which existing channel types in my schema objects? What is the rationale for including Audio as a channel type in EEG, iEEG, MEG and NIRS?

@JuliusWelzel
Copy link
Collaborator

How is channel type SYSCLOCK different from LATENCY? If there is no difference, it might make sense to use only one going forward.

@sappelhoff
Copy link
Member Author

sappelhoff commented Mar 15, 2023

How is channel type SYSCLOCK different from LATENCY? If there is no difference, it might make sense to use only one going forward.

The SYSCLOCK channel type was introduced with the MEG BEP at some point. I have not seen it being used in any BIDS dataset though. Do you remember the reasoning around it, or can provide some context @robertoostenveld @jasmainak @guiomar @teonbrooks ? For which kind of data is this channel type supposed to be used? The definition currently just says:

System time showing elapsed time since trial started

I find the definition of the LATENCY channel type much more clear:

Latency of samples in seconds from recording onset. MUST be in form of ss[.000000] , where [.000000] is an optional subsecond resolution

Although I would specify what is meant by "recording onset" --> perhaps the onset that is noted down in the acq_time column in the scans.tsv file?

Furthermore, the ss in ss[.000000] suggests that there will never be more than 99 seconds, or do I misunderstand this notation?

For reference, here are some similar metadata and their definitions:

acq_time definition (from scans file):

Acquisition time refers to when the first data point in each run was acquired. Furthermore, if this header is provided, the acquisition times of all files from the same recording MUST be identical. Datetime format and their anonymization are described in Units.

StartTime definition (from physio data):

Start time in seconds in relation to the start of acquisition of the first data sample in the corresponding neural dataset (negative values are allowed).


In general I agree that we should not introduce a new channel type (LATENCY) if an earlier channel type (SYSCLOCK) already covers it. Having that said, the SYSCLOCK type is not very clearly defined currently, so let's figure that out first.

@sappelhoff
Copy link
Member Author

I noticed the channel of type MAGN in MOTION-BIDS is similar to MEGMAG. It seems odd to abbreviate a channel of the same concept (both Magenometers) slightly different (*MAG and MAGN).

I agree it is kind of unfortunate to have this channel type in two versions -- if they do carry the same data.

The definition for MEGMAG currently reads:

MEG magnetometer

Whereas for MAGN, it reads:

Magnetic field strength, one channel for each spatial axis. Column component for the axis MUST be added to the *_channels.tsv file (x, y or z)

Potential solutions:

  1. live with the fact that we'll have MEGMAG for just MEG, and use MAGN for all other magnetometers
  2. add to the definition of MEGMAG, saying that if used under anything but MEG, it must have an associated component metadata entry in channels.tsv, and then use MEGMAG also for motion

I dislike the second suggestion because the type would have "MEG" in its name even when used outside of MEG, and because we already agreed (among maintainers) that it's probably not a smart idea to have channel types of the same name take on different meanings depending on the modality they are being used in.

I dislike the first suggestion because we are essentially duplicating a channel type. But I believe that for backward compatibility reasons, this is the road I'd take.

Any other suggestions?


(btw @JuliusWelzel and @sjeung, shouldn't the (x, y, z) in the description rather refer to the possible values for the component column, which are x, y, z, quat_x, quat_y, quat_z, quat_w, and n/a?)

@sappelhoff
Copy link
Member Author

Lastly @JuliusWelzel, I think you misunderstood the intention of which modalities channel types are shared with --> the idea is that all channel types can be used in any modality (what you have as "ALL" in your table) ... but that each channel type has a "home modality" (e.g., for MAGN it could be motion, for EEG it'd be eeg), and when there is an unclear home modality, the home modality would be "common" (e.g., for MISC or SYSCLOCK).

@jasmainak
Copy link
Collaborator

I don't have any recollection of SYSCLOCK ... maybe @robertoostenveld could help :) Furthermore, it seems that none of the openneuro datasets is using it

@sjeung
Copy link
Collaborator

sjeung commented Mar 16, 2023

Thanks @JuliusWelzel for the table!

Regarding the comment from @sappelhoff about latency channel, yes, the onset is intended to be the acq_time of scans.tsv and the text could be updated to make that clear.
That would also imply that the first sample should always have latency of 0, considering acq_time = when the first data point in each run was acquired.

ss[.000000] was not intended to imply integer digit limit. But I see how it is misleading.
Should we update the notation to s[.000000] and add that s refers to an integer of arbitrary number of digits?

Regarding the MEGMAG type I also favor the first suggestion because the use cases are quite different requiring different metadata.

And the (x,y,z) could simply be omitted or replaced with text description like "... to describe the mapping to the spatial axes"

@JuliusWelzel
Copy link
Collaborator

Regarding the comment from @sappelhoff about latency channel, yes, the onset is intended to be the acq_time of scans.tsv and the text could be updated to make that clear. That would also imply that the first sample should always have latency of 0, considering acq_time = when the first data point in each run was acquired.

Good point, I will update the text accordingly.

ss[.000000] was not intended to imply integer digit limit. But I see how it is misleading. Should we update the notation to s[.000000] and add that s refers to an integer of arbitrary number of digits?

I think it would be more feasible to hh:mm:ss[.000000] as indicated here for time stamps (https://bids-specification.readthedocs.io/en/stable/02-common-principles.html#units). This would make it more human readable if you have big latencies (e.g. 10000 secs).

Regarding the MEGMAG type I also favor the first suggestion because the use cases are quite different requiring different metadata.

+1 for Stefan first proposed solution.

And the (x,y,z) could simply be omitted or replaced with text description like "... to describe the mapping to the spatial axes"

I like that!

@sjeung
Copy link
Collaborator

sjeung commented Mar 16, 2023

When it comes to latency channel values, I would stick to the numeric format for keeping the content of .tsv uniform because I remember running into some problems (both reading and writing) when I mix non-numeric datetime and numerics as entries for .tsv

@JuliusWelzel
Copy link
Collaborator

When it comes to latency channel values, I would stick to the numeric format for keeping the content of .tsv uniform because I remember running into some problems (both reading and writing) when I mix non-numeric datetime and numerics as entries for .tsv

That is a good argument.

Furthermore, the ss in ss[.000000] suggests that there will never be more than 99 seconds, or do I misunderstand this notation?

What would be best to address Stefan's comment?
acq_time has to be in datetime and StartTime does not specifically say it can be subsecond precision, which would be important to indicate.

@robertoostenveld
Copy link
Collaborator

robertoostenveld commented Mar 17, 2023

I understand from a historical perspective how the unidirectional references from EEG to MEG, and iEEG to EEG+MEG, etc., came about: that is the order in which the BEPs were implemented and merged with the specification. The MEG BEP did not yet have the EEG and iEEG to link to, and apparently was not updated when EEG and iEEG were later merged.

I propose that channel types should be shared between all BIDS modalities.

I also agree.

Regarding "SYSCLOCK", I am quite confident that it will be a CTF-specific channel name... Searching for old CTF example datasets (which were from Montreal or Nijmegen), I found https://openfmri.org/dataset/ds000246/ with https://s3.amazonaws.com/openneuro/ds000246/ds000246_R1.0.0/uncompressed/sub-0001/meg/sub-0001_task-AEF_run-02_channels.tsv. That dataset is also here on openneuro, so apparently the search is not so powerful here.

I also see "clock" in https://doi.org/10.34973/37n0-yc51, which is one that we prepared and released around the time that BEP008/MEG was merged. Apparently it passed the validator back then.

Looking now at a CTF dataset, I see that the SYSCLOCK channel represents the time of acquisition, which is different from the time of recording. I.e. the acquisition to the computer is started first, data is checked, and then the record-to-disk button is pressed.

Here is a plot of the values in the fieldtrip tutorial Subject01.ds dataset. Values are in seconds, and the vertical jumps correspond to the inter-trial intervals which in this case are not on disk as the data was recorded in epoched mode. Looking at the data itself, you can see that it is discontinuous, but the SYSCLOCK channel allows to reconstruct how long the "holes" are between trials. I am not 100% sure about the overall offset of 2.14e04, which corresponds to about 6 hours. So I am not 100% sure whether SYSCLOCK=0 corresponds to start of acquisition. 6 hours seems a bit long for the measurement preparation. But this is a 20 year old dataset from the start of the Donders, so who knows.

Screenshot 2023-03-17 at 14 10 00

Edit: SYSCLOCK=0 might correspond to the time of the MEG electronics being started-up or rebooted (even without the acquisition running). That makes more sense, the MEG system would have been switched on in the morning and the measurement might have taken place in the afternoon.

@helenacockx
Copy link
Contributor

I noticed that we might have a redundancy in the proposed keywords for motion channel types currently. There is a channel name GYRO and ANGVEL, while gyroscopes, in fact, measure angular velocity. I think we should choose one of both here.
ANGVEL seems most logical given that the other channel types also describe what the motion trackers measure, rather than what they technically are (e.g., POS). However, GYRO was already implemented in the fINRS-BIDS...

@helenacockx
Copy link
Contributor

One other remark: we have ACCEL, but ANGACC. Wouldn't it make sense to make this consistent too, so ANGACCEL?

@JuliusWelzel
Copy link
Collaborator

I noticed that we might have a redundancy in the proposed keywords for motion channel types currently. There is a channel name GYRO and ANGVEL, while gyroscopes, in fact, measure angular velocity. I think we should choose one of both here.

Thanks for spotting this! We had the exact discussion a while back and agreed on GYRO over ANGVEL see here. I fixed it in 8cad179.

@JuliusWelzel
Copy link
Collaborator

One other remark: we have ACCEL, but ANGACC. Wouldn't it make sense to make this consistent too, so ANGACCEL?

We also discussed this on GH, but unfortunately I don't recall or find the reasoning. Maybe @sjeung can remember?

@sappelhoff
Copy link
Member Author

One other remark: we have ACCEL, but ANGACC. Wouldn't it make sense to make this consistent too, so ANGACCEL?

@helenacockx is this just a suggestion for renaming ANGACC to ANGACCEL? If yes, then I'd be in favor of this.

@robertoostenveld
Copy link
Collaborator

robertoostenveld commented Apr 28, 2023

Yes, that is what it is. I am also in favor.

Note that it also applies to ANGACCChannelCount in the json file.

@sappelhoff
Copy link
Member Author

I have taken the liberty to push these changes to @JuliusWelzel's PR:

@sjeung @JuliusWelzel please confirm that this is fine, then we can merge it and adjust the examples. I'll also have a look at the BIDS validator when the PR is merged.

@JuliusWelzel
Copy link
Collaborator

JuliusWelzel commented May 1, 2023

Looks good to me, thanks!

@sjeung
Copy link
Collaborator

sjeung commented May 2, 2023

I have taken the liberty to push these changes to @JuliusWelzel's PR:

@sjeung @JuliusWelzel please confirm that this is fine, then we can merge it and adjust the examples. I'll also have a look at the BIDS validator when the PR is merged.

Too me, too. I don't recall any particular reason to prefer ANGACC to ANGACCEL, except for the length. I am fine with renaming and will adjust example data sets accordingly.

Thanks for the feedback Helena & for update Stefan!

@JuliusWelzel
Copy link
Collaborator

@sappelhoff
Just checking the Issues I was involved in for the 1.9 release. :)

I think this issue can be closed as it seems to be addressed here. For future BEPs, all available channels type can be found in this schema object, correct? If a new channel type is introduced, it will be added here as well.

@sappelhoff
Copy link
Member Author

indeed @JuliusWelzel, thank you. I am closing this in favor of:

however, #1529 will be taken off the 1.9 release as it's mostly visuals that can also be added after a release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
consistency Spec is (potentially) inconsistent
Projects
None yet
Development

No branches or pull requests

7 participants