From fcfcb37ab037a53cbc094e5faf0aaf5d3a8fe683 Mon Sep 17 00:00:00 2001 From: Dimitri Papadopoulos <3234522+DimitriPapadopoulos@users.noreply.github.com> Date: Mon, 22 Apr 2024 10:44:10 +0200 Subject: [PATCH 01/74] =?UTF-8?q?anonymisation=20=E2=86=92=20deidentififca?= =?UTF-8?q?tion?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the vast majority of cases, strict anonymisation is impossible. Use the more generic term deidentification instead. --- src/modality-agnostic-files.md | 2 +- .../magnetic-resonance-imaging-data.md | 2 +- src/modality-specific-files/positron-emission-tomography.md | 2 +- src/schema/objects/columns.yaml | 4 ++-- src/schema/objects/metadata.yaml | 2 +- src/schema/objects/suffixes.yaml | 2 +- 6 files changed, 7 insertions(+), 7 deletions(-) diff --git a/src/modality-agnostic-files.md b/src/modality-agnostic-files.md index fb5b04ee3e..c6890aa6b4 100644 --- a/src/modality-agnostic-files.md +++ b/src/modality-agnostic-files.md @@ -534,7 +534,7 @@ ses-followup 2009-06-17T13:45:30 110 Template: `code/*` Source code of scripts that were used to prepare the dataset MAY be stored here. -Examples include anonymization or defacing of the data, or +Examples include deidentification or defacing of the data, or the conversion from the format of the source data to the BIDS format (see [source vs. raw vs. derived data](./common-principles.md#source-vs-raw-vs-derived-data)). Extra care should be taken to avoid including original IDs or diff --git a/src/modality-specific-files/magnetic-resonance-imaging-data.md b/src/modality-specific-files/magnetic-resonance-imaging-data.md index 072410f528..06fb0b997c 100644 --- a/src/modality-specific-files/magnetic-resonance-imaging-data.md +++ b/src/modality-specific-files/magnetic-resonance-imaging-data.md @@ -925,7 +925,7 @@ NIfTI headers. 
### `*_asllabeling.*` -An anonymized screenshot of the planning of the labeling slab/plane +A deidentified screenshot of the planning of the labeling slab/plane with respect to the imaging slab or slices. This screenshot is based on DICOM macro C.8.13.5.14. diff --git a/src/modality-specific-files/positron-emission-tomography.md b/src/modality-specific-files/positron-emission-tomography.md index 5ea5f8bbcb..f15f3fad40 100644 --- a/src/modality-specific-files/positron-emission-tomography.md +++ b/src/modality-specific-files/positron-emission-tomography.md @@ -177,7 +177,7 @@ A guide for using macros can be found at --> {{ MACROS___make_sidecar_table("pet.PETTime") }} -We refer to the common principles for the standards for describing dates and timestamps, including possibilities for anonymization (see [Units](../common-principles.md#units)). +We refer to the common principles for the standards for describing dates and timestamps, including possibilities for deidentification (see [Units](../common-principles.md#units)). #### Reconstruction diff --git a/src/schema/objects/columns.yaml b/src/schema/objects/columns.yaml index 3ccb32d324..5ca387a007 100644 --- a/src/schema/objects/columns.yaml +++ b/src/schema/objects/columns.yaml @@ -19,7 +19,7 @@ acq_time__scans: Acquisition time refers to when the first data point in each run was acquired. Furthermore, if this header is provided, the acquisition times of all files from the same recording MUST be identical. - Datetime format and their anonymization are described in + Datetime format and their deidentification are described in [Units](SPEC_ROOT/common-principles.md#units). type: string format: datetime @@ -28,7 +28,7 @@ acq_time__sessions: display_name: Session acquisition time description: | Acquisition time refers to when the first data point of the first run was acquired. 
- Datetime format and their anonymization are described in + Datetime format and their deidentification are described in [Units](SPEC_ROOT/common-principles.md#units). type: string format: datetime diff --git a/src/schema/objects/metadata.yaml b/src/schema/objects/metadata.yaml index 57515dd6f0..1b13a6a614 100644 --- a/src/schema/objects/metadata.yaml +++ b/src/schema/objects/metadata.yaml @@ -1646,7 +1646,7 @@ LabelingLocationDescription: Description of the location of the labeling plane (`"CASL"` or `"PCASL"`) or the labeling slab (`"PASL"`) that cannot be captured by fields `LabelingOrientation` or `LabelingDistance`. - May include a link to an anonymized screenshot of the planning of the + May include a link to a deidentified screenshot of the planning of the labeling slab/plane with respect to the imaging slab or slices `*_asllabeling.*`. Based on DICOM macro C.8.13.5.14. diff --git a/src/schema/objects/suffixes.yaml b/src/schema/objects/suffixes.yaml index 0ddc8647d2..4ed5a281b0 100644 --- a/src/schema/objects/suffixes.yaml +++ b/src/schema/objects/suffixes.yaml @@ -505,7 +505,7 @@ asllabeling: value: asllabeling display_name: ASL Labeling Screenshot description: | - An anonymized screenshot of the planning of the labeling slab/plane + A deidentified screenshot of the planning of the labeling slab/plane with respect to the imaging slab or slices. This screenshot is based on DICOM macro C.8.13.5.14. 
beh: From cbb94e11dea2e6a94d9e68f6782494d3c36c9c79 Mon Sep 17 00:00:00 2001 From: Chris Markiewicz Date: Thu, 25 Apr 2024 06:41:47 -0400 Subject: [PATCH 02/74] [MAINT] Update contributor table generator style (#1805) * chore(code-style): Rewrite f-strings in print_contributors.py to outrage flake8 less * chore(render): Rerender contributors.md * chore(style): Ignore linting errors in auto-generated table --- src/appendices/contributors.md | 13 +++++++------ tools/print_contributors.py | 20 +++++++++----------- 2 files changed, 16 insertions(+), 17 deletions(-) diff --git a/src/appendices/contributors.md b/src/appendices/contributors.md index 2ffab4740f..f48586f04d 100644 --- a/src/appendices/contributors.md +++ b/src/appendices/contributors.md @@ -37,6 +37,7 @@ ecosystem (in alphabetical order). If you contributed to the BIDS ecosystem and your name is not listed, please add it. + | name | contributions | | ---------------------------------------------------- | -------------------------------------- | @@ -121,7 +122,7 @@ If you contributed to the BIDS ecosystem and your name is not listed, please add | Dianne Patterson | 📖 | | Dimitri Papadopoulos Orfanos | 📖💡🤔💬💻 | | Dmitry Petrov | 📖💻 | -| Dora Hermes | 📖💻✅🔍🤔 | +| Dora Hermes | 📖💻✅🔍🤔 | | Dorien Huijser | 📖 | | Douglas N. Greve | 📖 | | Duncan Macleod | 📖🚇 | @@ -194,7 +195,7 @@ If you contributed to the BIDS ecosystem and your name is not listed, please add | Jeanette Mumford | 📖 | | Jefferson Casimir | 🔧 | | Jeffrey G. Ojemann | 📖 | -| Jeffrey S.
Grethe | 💬🐛✅📢💻 | | JegouA | 💻 | | Jelle Dalenberg | 📖 | | Jeremy Moreau | 📖💡 | @@ -211,7 +212,7 @@ If you contributed to the BIDS ecosystem and your name is not listed, please add | Jose Manuel Saborit | 📖 | | Joseph Wexler | 📖💡 | | Joseph Woods | 📖 | -| Julia Guiomar Niso Galán | 🤔🎨🔍👀📋📝🔧🐛💻🔣✅💬📖💡📢 | +| Julia Guiomar Niso Galán | 🤔🎨🔍👀📋📝🔧🐛💻🔣✅💬📖💡📢 | | Julia Sprenger | 📖 | | Julien Cohen-Adad | 📖🔣🤔 | | Julius Welzel | 📖💡🐛💻🔣🤔💬📓 | @@ -269,7 +270,7 @@ If you contributed to the BIDS ecosystem and your name is not listed, please add | Michael Hanke | 📖🤔🔧🐛📢 | | Michael P. Harms | 📖⚠️🔧 | | Michael P. Milham | 💡🔍 | -| Michael P. Notter | 💬📝✅📢📖 | +| Michael P. Notter | 💬📝✅📢📖 | | Michael Schirner | 📖 | | Mikaël Naveau | 🐛 | | Nader Pouratian | 📖 | @@ -288,7 +289,7 @@ If you contributed to the BIDS ecosystem and your name is not listed, please add | Patricia Clement | 💬🐛💻📖🔣💡📋🤔📆⚠️📢 | | Patrick Park | 📖💡💬💻 | | Paule-Joanne Toussaint | 📖 | -| Peer Herholz | 💬📖👀🔧✅📢 | +| Peer Herholz | 💬📖👀🔧✅📢 | | Petra Ritter | 📖 | | Pierre Rioux | 📖 | | Pieter Vandemaele | 📖💻 | @@ -318,7 +319,7 @@ If you contributed to the BIDS ecosystem and your name is not listed, please add | Shashank Bansal | 📖 | | Sjoerd B.
Vos | 📖 | | Soichi Hayashi | 📖🔧🐛 | -| Stefan Appelhoff | 📖💬🤔🐛💡💻👀⚠️📢✅🔧🔌📝🚧🔣 | +| Stefan Appelhoff | 📖💬🤔🐛💡💻👀⚠️📢✅🔧🔌📝🚧🔣 | | Stephan Bickel | 📖 | | Steven Meisler | 🐛💻💬🔧📓 | | Suyash Bhogawar | 📖💡⚠️🔧💬 | diff --git a/tools/print_contributors.py b/tools/print_contributors.py index c49ca6f823..29db86a39c 100644 --- a/tools/print_contributors.py +++ b/tools/print_contributors.py @@ -15,8 +15,9 @@ def contributor_table_header(max_name_length, max_contrib_length): - return f"""| name{" " * (max_name_length-4)} | contributions{" " * (max_contrib_length-13)} | -| {"-" * max_name_length} | {"-"*max_contrib_length} | + return f"""\ +| {"name":<{max_name_length}} | {"contributions":<{max_contrib_length}} | +| {"":-<{max_name_length}} | {"":-<{max_contrib_length}} | """ @@ -24,16 +25,13 @@ def create_line_contributor( contributor: dict[str, str], max_name_length: int, max_contrib_length: int ): name = contributor["name"] + emap = emoji_map() + contributions = "".join( + emoji.emojize(emap[cont]) for cont in contributor["contributions"] + ) - line = f"| {name}{' '*(max_name_length-len(name))} | " - - nb_contrib = len(contributor["contributions"]) * 2 - for contrib in contributor["contributions"]: - line += emoji.emojize(emoji_map()[contrib]) - - line += f"{' '*(max_contrib_length-nb_contrib)} |\n" - - return line + pad = max_contrib_length - len(contributor["contributions"]) * 2 + return f"| {name:<{max_name_length}} | {contributions}{'':<{pad}} |\n" def main(): From 90ec07f1e8357d1ba209570fdac3a1394a9d677b Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 25 Apr 2024 14:55:51 -0400 Subject: [PATCH 03/74] [FIX] Move `rawdata/` into `sourcedata/raw` in alternative structure example, clarify on naming of datasets themselves (#1741) * RF: move `rawdata/` to `sourcedata/raw` in an example + make overall dataset to be BIDS dataset This is my
take on an extended discussion about ambiguity of `rawdata/` example: https://github.com/bids-standard/bids-specification/pull/1734/files#r1534475631 * Minor rewording in description of sourcedata/ content Prior one bundled naming aspect under the same MUST. I separated into separate sentences, added explicit statement that BIDS does not prescribe a particular naming scheme for source data. And added explicit RECOMMENDED on the example how to organize/name files there. * Add one dataset_description.json into an example to make it explicitly a BIDS dataset * My take on dataset naming common principle * [DATALAD RUNCMD] Replace use of rawdata in tests with explicit 'noncompliant' === Do not change lines below === { "chain": [], "cmd": "sed -i -e s,rawdata,noncompliant,g tools/schemacode/bidsschematools/validator.py tools/schemacode/bidsschematools/tests/test_validator.py tools/schemacode/bidsschematools/tests/data/expected_bids_validator_xs_write.log", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^ * Do not use e.g. 
* Move dataset_description.json in the example to be listed after folders * Remove the notion that example layout can in fact be a valid BIDS dataset * Use lower case "recommended" as not part of BIDS spec, and recommend underscores too * Make into a single sentence Co-authored-by: Chris Markiewicz --------- Co-authored-by: Chris Markiewicz --- src/common-principles.md | 53 ++++++++++--------- .../data/expected_bids_validator_xs_write.log | 2 +- .../bidsschematools/tests/test_validator.py | 4 +- tools/schemacode/bidsschematools/validator.py | 2 +- 4 files changed, 33 insertions(+), 28 deletions(-) diff --git a/src/common-principles.md b/src/common-principles.md index 9b10ba19f2..3d9bc233af 100644 --- a/src/common-principles.md +++ b/src/common-principles.md @@ -97,6 +97,12 @@ and/or files (like `events.tsv`) are fully omitted *when they are unavailable or instead of specified with an `n/a` value, or included as an empty file (for example an empty `events.tsv` file with only the headers included). +## Dataset naming + +BIDS does not prescribe a particular naming scheme for directories containing individual BIDS datasets. +However, it is recommended to use a short descriptive name that reflects the content of the dataset, avoid spaces in the name, and use hyphens or underscores to separate words. +BIDS datasets embedded within a larger BIDS dataset MAY follow some convention (see for example [Storage of derived datasets](#storage-of-derived-datasets)). + ## Filesystem structure Data for each subject are placed in subdirectories named "`sub-