Nipoppy: Parkinson's Progression Markers Initiative dataset

This repository contains code to process tabular and imaging data from the Parkinson's Progression Markers Initiative (PPMI) dataset using the Nipoppy framework.

PPMI CSV files to download from LONI

Image collections

  • idaSearch.csv
    • Advanced Search
    • Check every box in "Display in result" column
    • Check "DTI" + "MRI" + "fMRI" in "Modality"

Study data

  • Study Docs: Data & Databases
    • Code_List_-__Annotated_.csv
    • Data_Dictionary_-__Annotated_.csv
  • Subject Characteristics: Patient Status
    • Participant_Status.csv
  • Subject Characteristics: Subject Demographics
    • Age_at_visit.csv
    • Demographics.csv
    • Socio-Economics.csv
  • Medical History: Medical
    • Clinical_Diagnosis.csv
    • Primary_Clinical_Diagnosis.csv
  • Motor Assessments: Motor / MDS-UPDRS
    • MDS-UPDRS_Part_I.csv
    • MDS-UPDRS_Part_III.csv
    • MDS_UPDRS_Part_II__Patient_Questionnaire.csv
    • MDS-UPDRS_Part_I_Patient_Questionnaire.csv
    • MDS-UPDRS_Part_IV__Motor_Complications.csv
  • Non-motor Assessments: ALL
    • All files in this category were downloaded, although not all of them are used or up to date
    • Benton_Judgement_of_Line_Orientation.csv
    • Clock_Drawing.csv
    • Cognitive_Categorization.csv
    • Cognitive_Change.csv
    • Epworth_Sleepiness_Scale.csv
    • Geriatric_Depression_Scale__Short_Version_.csv
    • Hopkins_Verbal_Learning_Test_-_Revised.csv
    • Letter_-_Number_Sequencing.csv
    • Lexical_Fluency.csv
    • Modified_Boston_Naming_Test.csv
    • Modified_Semantic_Fluency.csv
    • Montreal_Cognitive_Assessment__MoCA_.csv
    • Neuro_QoL__Cognition_Function_-_Short_Form.csv
    • Neuro_QoL__Communication_-_Short_Form.csv
    • QUIP-Current-Short.csv
    • REM_Sleep_Behavior_Disorder_Questionnaire.csv
    • SCOPA-AUT.csv
    • State-Trait_Anxiety_Inventory.csv
    • Symbol_Digit_Modalities_Test.csv
    • Trail_Making_A_and_B.csv
    • University_of_Pennsylvania_Smell_Identification_Test_UPSIT.csv

Manifest generation

DICOM download/reorg

PPMI data portal (LONI IDA)

  • Some search fields in the LONI search tool cannot be trusted
    • Examples:
      • Modality
        • Images with Modality=DTI can be anatomical images, and some diffusion images have Modality=MRI
      • Weighting (under Imaging Protocol)
        • Some T1s have Weighting=PD
    • We classify image modality/contrast based only on the Image Description column
      • This approach has its own problems, for example when all of a subject's scans share the same description string. In that case, we determine the image modality/contrast manually and hard-code the mapping in the heuristic.py file used by HeuDiConv
  • The LONI viewer sometimes shows files that look bad/corrupted but are actually fine once converted
    • Observed for some diffusion images (these tend to have ~2700 slices according to the LONI image viewer)
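
The description-based classification with manual overrides can be sketched as follows. This is only an illustration: the regular expressions, the override table, the identifiers in it, and the function name `classify_image` are assumptions made for this example and do not reflect the actual contents of heuristic.py.

```python
import re

# Illustrative patterns only (assumption): the first match wins, so more
# specific contrasts (e.g. FLAIR) are listed before more generic ones (T2).
DESCRIPTION_PATTERNS = [
    (re.compile(r"dti|dwi|diff", re.IGNORECASE), ("dwi", "dwi")),
    (re.compile(r"flair", re.IGNORECASE), ("anat", "FLAIR")),
    (re.compile(r"t2", re.IGNORECASE), ("anat", "T2w")),
    (re.compile(r"t1|mprage", re.IGNORECASE), ("anat", "T1w")),
]

# Hypothetical hard-coded overrides for scans whose description string is
# ambiguous. The participant and image IDs below are placeholders.
MANUAL_OVERRIDES = {
    ("0000", "I000001"): ("anat", "T1w"),
}


def classify_image(participant_id, image_id, description):
    """Return (datatype, suffix) for a scan, or None if it is unrecognized."""
    override = MANUAL_OVERRIDES.get((participant_id, image_id))
    if override is not None:
        return override
    for pattern, label in DESCRIPTION_PATTERNS:
        if pattern.search(description):
            return label
    return None
```

Checking the override table before the patterns means that a manually verified scan is never reclassified by a generic rule.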

Compute Canada

  • Some subjects have a very large number of small DICOM files, which causes us to exceed the inode quota on /scratch
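
One way to find the subjects responsible is to count filesystem entries per subject directory. The sketch below assumes a layout with one directory per subject under a common root. That layout and the function name `count_entries_per_subject` are assumptions for illustration, not part of this repository.

```python
from pathlib import Path


def count_entries_per_subject(dicom_root):
    """Count files and subdirectories under each subject directory.

    Every file and directory uses one inode, so this count approximates
    each subject's contribution to the quota. Results are sorted from the
    largest count to the smallest.
    """
    counts = {}
    for subject_dir in Path(dicom_root).iterdir():
        if subject_dir.is_dir():
            counts[subject_dir.name] = sum(1 for _ in subject_dir.rglob("*"))
    return dict(sorted(counts.items(), key=lambda item: item[1], reverse=True))
```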

BIDS

BIDS data file naming

The tabular/ppmi_imaging_descriptions.json file is used to determine the BIDS datatype and suffix (contrast) associated with an image's MRI series description. It will be updated as new data is processed.

Here is a description of the available BIDS data and the tags that can appear in their filenames:

  • anat
    • The available suffixes are: T1w, T2w, T2starw, and FLAIR
    • Most images have an acq tag:
      • Non-neuromelanin images: acq-<plane><type>, where
        • <plane> is one of: sag, ax, or cor (for sagittal, axial, or coronal scans respectively)
        • <type> is one of: 2D, or 3D
      • Neuromelanin images: acq-NM
    • For some images, the acquisition plane (sag/ax/cor) or type (2D/3D) cannot be easily obtained. In those cases, the filename will not contain an acq tag.
  • dwi
    • All imaging files have the dwi suffix.
    • Most images have a dir tag corresponding to the phase-encoding direction. This is one of: LR, RL, AP, or PA
    • Images where the phase-encoding direction cannot be easily inferred from the series description string do not have a dir tag.
    • Some participants have multi-shell sequences for their diffusion data. These files have an additional acq-B<value> tag, where <value> is the b-value for that sequence.
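
The naming rules above can be sketched as a small helper. This is a hedged illustration rather than the conversion code used in this repository: the function names, the participant label `0001`, and the `.nii.gz` extension are assumptions, and other BIDS entities (such as run) are left out for brevity.

```python
PLANES = {"sag", "ax", "cor"}
SCAN_TYPES = {"2D", "3D"}
DIRECTIONS = {"LR", "RL", "AP", "PA"}


def anat_acq_label(plane=None, scan_type=None, neuromelanin=False):
    """Return the acq label for an anat image, or None if it cannot be built."""
    if neuromelanin:
        return "NM"
    if plane in PLANES and scan_type in SCAN_TYPES:
        return f"{plane}{scan_type}"
    return None  # plane or type unknown: the filename gets no acq tag


def dwi_acq_label(b_value=None):
    """Return the acq label for a multi-shell dwi image, or None otherwise."""
    return None if b_value is None else f"B{b_value}"


def bids_filename(participant, session, suffix, acq=None, direction=None,
                  extension=".nii.gz"):
    """Assemble a BIDS-style filename; acq and dir are omitted when unknown."""
    if direction is not None and direction not in DIRECTIONS:
        raise ValueError(f"unexpected phase-encoding direction: {direction}")
    parts = [f"sub-{participant}", f"ses-{session}"]
    if acq is not None:
        parts.append(f"acq-{acq}")
    if direction is not None:
        parts.append(f"dir-{direction}")
    parts.append(suffix)
    return "_".join(parts) + extension


print(bids_filename("0001", "BL", "T1w", acq=anat_acq_label("sag", "3D")))
# sub-0001_ses-BL_acq-sag3D_T1w.nii.gz
print(bids_filename("0001", "BL", "dwi", acq=dwi_acq_label(1000), direction="PA"))
# sub-0001_ses-BL_acq-B1000_dir-PA_dwi.nii.gz
```

The acq entity is placed before dir, following the entity order in the BIDS specification.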

Currently, only structural (anat) and diffusion (dwi) MRI data are supported. Functional (func) data has not been converted to the BIDS format yet.

HeuDiConv errors

Not solved yet

  • AttributeError: 'Dataset' object has no attribute 'StackID'
  • AssertionError: Conflicting study identifiers found
    • Could be because all of a subject's DICOMs are pooled together in the dicom_org step, in which case this can be fixed by manually running HeuDiConv for each image
  • numpy.AxisError: axis 1 is out of bounds for array of dimension 1
  • AssertionError (assert HEUDICONV_VERSION_JSON_KEY not in json_)
    • Thrown by HeuDiConv
  • AssertionError: we do expect some files since it was called (assert bids_files, "we do expect some files since it was called")
    • Thrown by HeuDiConv

Notes on dwi data

  • Some subjects only have a single diffusion image (e.g., Ax DTI), which might not be usable
  • Some subjects have 2 diffusion images, but they have the same description string (e.g., DTI_gated)
    • In the cases we checked after BIDS conversion, the JSON sidecars seem to have the same PhaseEncodingDirection (j-)
  • Some subjects have multi-shell sequences. Their files seem to follow this pattern:
    • dir-PA: 1 B0, 1 B700, 1 B1000, and 1 B2000 image
    • dir-AP: 4 B0 images
  • Some (~2 for ses-BL) subjects have dir-AP for all their diffusion images
    • These subjects seem to have 4 dir-AP B0 images and 4 other dir-AP images (according to their description strings)
  • Some diffusion images do not contain raw data, but rather tensor model results (FA, ADC, TRACEW). Some of these have been excluded before BIDS conversion, but not all of them
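
A possible way to flag the remaining tensor-derived images is to look for the FA, ADC, or TRACEW tokens in the description string. The sketch below is an assumption about how such a filter could look: the token-matching rule and the function name `is_derived_map` are illustrative, and real description strings may need additional patterns.

```python
import re

# Tokens taken from the note above. The rule that a token must not be
# surrounded by other letters is an assumption made to avoid false
# positives such as "FAST".
DERIVED_TOKENS = re.compile(
    r"(?<![A-Za-z])(FA|ADC|TRACEW)(?![A-Za-z])", re.IGNORECASE
)


def is_derived_map(description):
    """Return True if the description looks like a tensor-model output."""
    return DERIVED_TOKENS.search(description) is not None


descriptions = ["Ax DTI", "DTI_gated_FA", "DTI_gated_ADC", "DTI_gated"]
raw_only = [d for d in descriptions if not is_derived_map(d)]
print(raw_only)
# ['Ax DTI', 'DTI_gated']
```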