Update to match neurodatascience/nipoppy:main #95

Merged: 6 commits merged on May 8, 2024
Changes from all commits
4 changes: 4 additions & 0 deletions .gitignore
@@ -33,3 +33,7 @@ env/

# VS Code
.vscode/

# docs
nipoppy_cli/docs/build
nipoppy_cli/docs/source/schemas/*.json
27 changes: 27 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,27 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
os: ubuntu-22.04
tools:
python: "3.11"
jobs:
pre_build:
- python nipoppy_cli/docs/scripts/pydantic_to_jsonschema.py

python:
install:
- method: pip
path: nipoppy_cli
extra_requirements:
- doc

# Build documentation with Sphinx
sphinx:
configuration: nipoppy_cli/docs/source/conf.py
fail_on_warning: true
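
The `pre_build` job above regenerates JSON schema files from the package's Pydantic models before Sphinx runs (the generated files are the ones newly ignored in `.gitignore` above). The script itself is not included in this diff; a minimal sketch of the general approach, using a placeholder model name and output filename, could look like this:

```python
"""Sketch only: export a Pydantic model's JSON schema for the docs build.

The actual nipoppy_cli/docs/scripts/pydantic_to_jsonschema.py is not shown in
this PR; the model and filename below are placeholders.
"""
import json
from pathlib import Path

from pydantic import BaseModel


class ExampleConfig(BaseModel):  # placeholder for a real nipoppy config model
    dataset_name: str
    sessions: list[str] = []


if __name__ == "__main__":
    # Write into the directory that .gitignore excludes from version control.
    out_dir = Path("nipoppy_cli/docs/source/schemas")
    out_dir.mkdir(parents=True, exist_ok=True)
    schema = ExampleConfig.model_json_schema()  # Pydantic v2 API
    (out_dir / "example_config.json").write_text(json.dumps(schema, indent=4))
```
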
30 changes: 30 additions & 0 deletions README.md
@@ -0,0 +1,30 @@
# Nipoppy

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8084759.svg)](https://doi.org/10.5281/zenodo.8084759)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/license/mit)
[![codecov](https://codecov.io/gh/neurodatascience/nipoppy/graph/badge.svg?token=SN38ITRO4M)](https://codecov.io/gh/neurodatascience/nipoppy)
[![https://github.com/psf/black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://black.readthedocs.io/en/stable/)
[![Documentation Status](https://readthedocs.org/projects/nipoppy/badge/?version=latest)](https://nipoppy.readthedocs.io/en/latest/?badge=latest)

Nipoppy is a lightweight framework for standardized organization and processing of neuroimaging-clinical datasets. Its goal is to help users adopt the
[FAIR](https://www.go-fair.org/fair-principles/) principles
and improve the reproducibility of studies.

The framework includes three components:

1. A specification for dataset organization that extends the [Brain Imaging Data Structure (BIDS) standard](https://bids.neuroimaging.io/) by providing additional guidelines for tabular (e.g., phenotypic) data and imaging derivatives.

![Nipoppy specification](nipoppy_cli/docs/source/_static/img/nipoppy_specification.jpg)

2. A protocol for data organization, curation and processing, with steps that include the following:
- **Organization** of raw data, including conversion of raw DICOMs (or NIfTIs) to [BIDS](https://bids.neuroimaging.io/)
- **Processing** of imaging data with existing or custom pipelines
- **Tracking** of data availability and processing status
- **Extraction** of imaging-derived phenotypes (IDPs) for downstream statistical modelling and analysis

![Nipoppy protocol](nipoppy_cli/docs/source/_static/img/nipoppy_protocol.jpg)

3. A **command-line interface** and **Python package** that provide user-friendly tools for applying the framework. The tools build upon existing technologies such as the [Apptainer container platform](https://apptainer.org/) and the [Boutiques descriptor framework](https://boutiques.github.io/). Several existing containerized pipelines are supported out-of-the-box, and new pipelines can be added easily by the user.
- We have also developed a [**web dashboard**](https://digest.neurobagel.org) for interactive visualizations of imaging and phenotypic data availability.

See the [documentation website](https://neurobagel.org/nipoppy/overview/) for more information!
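
For orientation, the dataset layout implied by the paths used elsewhere in this PR (versioned derivatives directories, `tabular/manifest.csv`, `scratch/raw_dicom/doughnut.csv`) is roughly the following. This tree is an illustration inferred from those paths rather than a quotation of the specification, and the `bids/` entry in particular is an assumption not shown in this diff:

```
<DATASET_ROOT>/
├── bids/                                # BIDS-formatted imaging data (assumed)
├── derivatives/
│   └── <pipeline>/v<version>/output/    # e.g. fmriprep, freesurfer
├── tabular/
│   └── manifest.csv
└── scratch/
    └── raw_dicom/doughnut.csv
```
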
67 changes: 0 additions & 67 deletions docs/README.md

This file was deleted.

4 changes: 2 additions & 2 deletions nipoppy/extractors/fmriprep/run_FC.py
@@ -229,7 +229,7 @@ def run(participant_id: str,
if output_dir is None:
output_dir = f"{DATASET_ROOT}/derivatives/"

fmriprep_dir = f"{DATASET_ROOT}/derivatives/fmriprep/{FMRIPREP_VERSION}/output"
fmriprep_dir = f"{DATASET_ROOT}/derivatives/fmriprep/v{FMRIPREP_VERSION}/output"
DKT_dir = f"{DATASET_ROOT}/derivatives/networks/0.9.0/output"
FC_dir = f"{output_dir}/FC"

@@ -290,4 +290,4 @@ def run(participant_id: str,
with open(FC_config_file, 'r') as f:
FC_configs = json.load(f)

run(participant_id, global_configs, FC_configs, session_id, output_dir)
run(participant_id, global_configs, FC_configs, session_id, output_dir)
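
The change above (together with the matching `v{FS_version}` changes in the FreeSurfer extractor below) standardizes pipeline outputs under a `v`-prefixed version directory. A minimal sketch of the convention these scripts now expect, using a hypothetical helper that is not part of the codebase:

```python
# Sketch of the derivatives path layout implied by this diff; the helper name
# and the example version number are illustrative, not taken from nipoppy code.
def pipeline_output_dir(dataset_root: str, pipeline: str, version: str) -> str:
    # e.g. <DATASET_ROOT>/derivatives/fmriprep/v23.1.3/output
    return f"{dataset_root}/derivatives/{pipeline}/v{version}/output"


print(pipeline_output_dir("/data/my_dataset", "fmriprep", "23.1.3"))
# /data/my_dataset/derivatives/fmriprep/v23.1.3/output
```
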
51 changes: 27 additions & 24 deletions nipoppy/extractors/freesurfer/run_structural_measures.py
@@ -14,28 +14,30 @@
# Globals
# Brainload has two separate functions to extract aseg data.
measure_column_names = ["StructName","Structure","Description","Volume_mm3", "unit"]
aseg_cols = ["StructName", "Volume_mm3"]
dkt_cols = ["StructName", "ThickAvg"]

def get_aseg_stats(participant_stats_dir, aseg_cols):
""" Parses the aseg.stats file
"""
aseg_cols = ["StructName", "Volume_mm3"]
aseg_stats = bl.stat(f'{participant_stats_dir}/aseg.stats')
table_df = pd.DataFrame(aseg_stats["table_data"], columns=aseg_stats["table_column_headers"])[aseg_cols]
measure_df = pd.DataFrame(data=aseg_stats["measures"], columns=measure_column_names)[aseg_cols]
_df = pd.concat([table_df,measure_df],axis=0)
return _df

def get_aparc_stats(participant_stats_dir, aparc_cols, parcel="aparc.DKTatlas"):
def get_DKT_stats(participant_stats_dir, dkt_cols, parcel="aparc.DKTatlas"):
""" Parses the <>.aparc.DKTatlas.stats file
"""
hemi = "lh"
stat_file = f"{hemi}.{parcel}.stats"
lh_dkt_stats = bl.stat(f'{participant_stats_dir}/{stat_file}')
lh_df = pd.DataFrame(lh_dkt_stats["table_data"], columns=lh_dkt_stats["table_column_headers"])[aparc_cols]
lh_df = pd.DataFrame(lh_dkt_stats["table_data"], columns=lh_dkt_stats["table_column_headers"])[dkt_cols]
lh_df["hemi"] = hemi

hemi = "rh"
stat_file = f"{hemi}.{parcel}.stats"
rh_dkt_stats = bl.stat(f'{participant_stats_dir}/rh.aparc.DKTatlas.stats')
rh_df = pd.DataFrame(rh_dkt_stats["table_data"], columns=rh_dkt_stats["table_column_headers"])[aparc_cols]
rh_dkt_stats = bl.stat(f'{participant_stats_dir}/{stat_file}')
rh_df = pd.DataFrame(rh_dkt_stats["table_data"], columns=rh_dkt_stats["table_column_headers"])[dkt_cols]
rh_df["hemi"] = hemi

_df = pd.concat([lh_df,rh_df], axis=0)
@@ -52,17 +54,16 @@ def get_aparc_stats(participant_stats_dir, aparc_cols, parcel="aparc.DKTatlas"):
parser.add_argument('--FS_config', type=str, help='path to freesurfer configs for a given nipoppy dataset', required=True)
parser.add_argument('--participants_list', default=None, help='path to participants list (csv or tsv')
parser.add_argument('--session_id', type=str, help='session id for the participant', required=True)
parser.add_argument('--save_dir', default='./', help='path to save_dir')
parser.add_argument('--output_dir', default=None, help='path to save extracted output (default: derivatives/freesurfer/<version>/IDP/<session>)')

args = parser.parse_args()

global_config_file = args.global_config
FS_config_file = args.FS_config
participants_list = args.participants_list
session_id = args.session_id
save_dir = args.save_dir

session = f"ses-{session_id}"
output_dir = args.output_dir

# Read global configs
with open(global_config_file, 'r') as f:
@@ -77,9 +78,12 @@ def get_aparc_stats(participant_stats_dir, aparc_cols, parcel="aparc.DKTatlas"):
stat_configs = FS_configs["stat_configs"]
stat_config_names = stat_configs.keys()

print(f"Using dataset root: {DATASET_ROOT} and FreeSurfer version: {FS_version}")
print(f"Using dataset root: {DATASET_ROOT} and FreeSurfer version: v{FS_version}")
print(f"Using stat configs: {stat_config_names}")

if output_dir == None:
output_dir = f"{DATASET_ROOT}/derivatives/freesurfer/v{FS_version}/IDP/{session}/"

if participants_list == None:
# use doughnut
doughnut_file = f"{DATASET_ROOT}/scratch/raw_dicom/doughnut.csv"
@@ -97,17 +101,17 @@ def get_aparc_stats(participant_stats_dir, aparc_cols, parcel="aparc.DKTatlas"):


# Extract stats for each participant
fs_output_dir = f"{DATASET_ROOT}/derivatives/freesurfer/{FS_version}/output/{session}/"
fs_output_dir = f"{DATASET_ROOT}/derivatives/freesurfer/v{FS_version}/output/{session}/"

aseg_df = pd.DataFrame()
aparc_df = pd.DataFrame()
dkt_df = pd.DataFrame()
for participant_id in bids_participants:
participant_stats_dir = f"{fs_output_dir}{participant_id}/stats/"
print(f"Extracting stats for participant: {participant_id}")

for config_name, config_cols in stat_configs.items():
print(f"Extracting data for config: {config_name}")
if config_name.strip() == "aseg":
if config_name.strip().lower() == "aseg":
try:
_df = get_aseg_stats(participant_stats_dir, config_cols)
# transpose it to wideform
@@ -122,36 +126,35 @@ def get_aparc_stats(participant_stats_dir, aparc_cols, parcel="aparc.DKTatlas"):
except:
print(f"Error parsing aseg data for {participant_id}")

elif config_name.strip() == "aparc":
elif config_name.strip().lower() == "dkt":
try:
_df = get_aparc_stats(participant_stats_dir, config_cols)
_df = get_DKT_stats(participant_stats_dir, config_cols)
# transpose it to wideform
names_col = config_cols[0]
values_col = config_cols[1]
cols = ["participant_id"] + list(_df["hemi"] + "." + _df[names_col])
vals = [participant_id] + list(_df[values_col])
_df_wide = pd.DataFrame(columns=cols)
_df_wide.loc[0] = vals
aparc_df = pd.concat([aparc_df,_df_wide], axis=0)
dkt_df = pd.concat([dkt_df,_df_wide], axis=0)

except Exception as e:
print(f"Error parsing aparc data for {participant_id} with exception: {e}")
print(f"Error parsing dkt data for {participant_id} with exception: {e}")

else:
print(f"Unknown stat config: {config_name}")

# Save configs
print(f"Saving collated stat tables at: {save_dir}")
aseg_csv = f"{save_dir}/aseg.csv"
aparc_csv = f"{save_dir}/aparc.csv"
print(f"Saving collated stat tables at: {output_dir}")
aseg_csv = f"{output_dir}/aseg.csv"
dkt_csv = f"{output_dir}/dkt.csv"

if len(aseg_df) > 0:
aseg_df.to_csv(aseg_csv, index=None)
else:
print("aseg_df is empty")

if len(aparc_df) > 0:
aparc_df.to_csv(aparc_csv, index=None)
if len(dkt_df) > 0:
dkt_df.to_csv(dkt_csv, index=None)
else:
print("aparc_df is empty")

print("dkt_df is empty")
26 changes: 17 additions & 9 deletions nipoppy/extractors/maget_brain/prepare_data.py
@@ -50,15 +50,17 @@ def get_masked_image(img_path, mask_path, masked_img_path):
fmriprep_dir = f"{DATASET_ROOT}/derivatives/fmriprep/{fmriprep_version}/output/"
maget_dir = f"{DATASET_ROOT}/derivatives/maget_brain/{maget_version}/output/"
maget_preproc_T1w_nii_dir = f"{maget_dir}/ses-{session_id}/preproc_T1w_nii/"
maget_proc_list_file = f"{maget_preproc_T1w_nii_dir}proc_participant.csv"

# Check / create maget subdirs
Path(maget_preproc_T1w_nii_dir).mkdir(parents=True, exist_ok=True)

# get all the subject ids
manifest_csv = f"{DATASET_ROOT}/tabular/manifest.csv"
manifest_df = pd.read_csv(manifest_csv)
bids_id_list = manifest_df["bids_id"].unique()
# get all the subject ids from the doughnut
doughnut_csv = f"{DATASET_ROOT}/scratch/raw_dicom/doughnut.csv"
doughnut_df = pd.read_csv(doughnut_csv)
bids_id_list = doughnut_df["bids_id"].unique()

proc_participants = [] # To be replaced when maget-brain tracker is written...
for bids_id in bids_id_list:
if run_id == None:
img_file_name = f"{bids_id}_ses-{session_id}_desc-preproc_T1w.nii.gz"
Expand All @@ -74,8 +76,14 @@ def get_masked_image(img_path, mask_path, masked_img_path):
mask_path = f"{fmriprep_dir}/{bids_id}/ses-{session_id}/anat/{mask_file_name}"
masked_img_path = f"{maget_preproc_T1w_nii_dir}/{masked_img_file_name}"

try:
get_masked_image(img_path, mask_path, masked_img_path)
except Exception as e:
print(e)

# Check if the masked image exists
if os.path.isfile(masked_img_path):
print(f"Participant segmentation already exist: {bids_id}")
else:
try:
get_masked_image(img_path, mask_path, masked_img_path)
proc_participants.append(bids_id)
except Exception as e:
print(e)

pd.DataFrame(data=proc_participants).to_csv(maget_proc_list_file, header=False, index=False)
1 change: 1 addition & 0 deletions nipoppy/trackers/run_tracker.py
@@ -41,6 +41,7 @@
"fmriprep": ["anat"],
"mriqc": ["anat"],
"tractoflow": ["anat", "dwi"],
"maget_brain": ["anat"]
}
ALL_DATATYPES = sorted(["anat", "dwi", "func", "fmap"])
BIDS_PIPES = ["mriqc","fmriprep"]
26 changes: 0 additions & 26 deletions nipoppy_cli/README.md

This file was deleted.

21 changes: 21 additions & 0 deletions nipoppy_cli/docs/Makefile
@@ -0,0 +1,21 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
./scripts/pydantic_to_jsonschema.py
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)