CLI
The `cli` section is the entrypoint for those who wish to control the workflows (and therefore the resources) from the command line. Each script (save one) corresponds to a separate pipeline, will self-orient, and will submit a specified number of jobs to the `slurm` scheduler. For each of the pipeline scripts, a scheduled coordinator job determines which participants to submit work for, and then a parent job is submitted for each subject that meets the criteria. By default, eight participants (n=8) are submitted at a time to avoid hogging resources. Finally, intermediates can be kept via the `--keep-interm` option for the subject-level scripts.
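Once a coordinator job has been submitted (see the sbatch examples below), the batch it spawns can be watched with ordinary SLURM tooling; the `squeue` call here is generic and not specific to this repo:

```bash
# Show this user's queued and running jobs -- typically the coordinator
# plus up to eight parent (subject-level) jobs at a time.
squeue -u $USER -o "%.18i %.30j %.8T %.10M %R"
```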
Rather than corresponding to a dedicated pipeline, `checks.py` is a preparatory script. It checks the state of the specified project directory against a dictionary of expected files in order to generate `logs/completed_preprocessing.tsv`; the remaining scripts orient themselves with this TSV output. `checks.py` is written to be executed both locally, to index data found on the NAS, and remotely, to index data found on the HPC. Each execution initiates a `git pull`, then updates `completed_preprocessing.tsv`, and concludes with a `git commit` and `push`. In this way, the GitHub repo serves as the go-between for the NAS and HPC. The value of each cell is the timestamp from when the detected file was generated.

`checks.py` only orients to participants who have not been excluded or withdrawn, according to github.com/emu-project/data_pulling/data_pulling/data/pseudo_guid_list.csv. This was decided instead of orienting to all subjects in a certain directory (say, `BIDS/dset`) to help make sure that only consented data are included in analyses. Given the note below, `logs/completed_preprocessing.tsv` should be regenerated periodically so that current consent is reflected in this document.
Note -- `checks.py` only checks blank fields, and only appends. In this way you can index partial datasets in two locations (HPC, NAS) and have one comprehensive file. This note is particularly relevant when new fields are added to `logs/completed_preprocessing.tsv`, which requires a new dataframe to be generated.
Finally, as a special feature, the module `resources.reports.check_complete.check_preproc` can also be used to check whether a single subject has all the required data. The report will be reflected in an updated `logs/completed_preprocessing.tsv`. Maybe this is useful?
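A minimal sketch of invoking that from the command line, assuming it is run from the code directory so the `resources` package is importable; the call signature (and whether `check_preproc` is a function or a submodule) is an assumption, so consult the repository's docstrings for the real interface:

```bash
# Hypothetical single-subject check -- the arguments are placeholders and
# the signature is assumed, not documented here.
python -c "
from resources.reports.check_complete import check_preproc
check_preproc('/path/to/project_directory', 'sub-4001')
"
```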
It is intended that `checks.py` be run prior to each pipeline submission and, if updated records are required, again once the submitted work has finished. As it requires an internet connection, use the login node if working on the HPC (it runs very lightly).

Usage ought to be quite simple: the only required argument, `-t`, takes a personal access token for github.com/emu-project (for cloning `pseudo_guid_list.csv` as well as managing `logs/completed_preprocessing.tsv`). It is recommended to set the PAT as a global variable in your environment for ease of use. The `-p` option will also be useful for deciding which directory to index.
```bash
python checks.py -t $TOKEN_GITHUB_EMU
```
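For example, the PAT can be exported once per shell session and the `-p` option pointed at whichever project directory should be indexed; the token value and path below are placeholders:

```bash
# Keep the PAT in the environment (placeholder value).
export TOKEN_GITHUB_EMU=ghp_xxxxxxxxxxxxxxxx

# Index a specific project directory (placeholder path).
python checks.py -t $TOKEN_GITHUB_EMU -p /path/to/project_directory
```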
Controlling automatic segmentation of hippocampal subfields (ASHS) is accomplished through `ashs.py`.

- First, this script will read in `logs/completed_preprocessing.tsv` and from that dataframe make a list of potential subjects.
- Second, it will search through those potential subjects for those who (a) have a blank field in the log, and (b) have both T1- and T2-weighted files in the project `dset` directory.
- Third, for each subject in a batch of size N, the script will then submit a `workflow.control_ashs`. This is accomplished via the `submit_jobs` function.
- Fourth, output is saved to the project directory `derivatives/ashs`.
A Singularity image of the ASHS container is required, as is the location of a set of ASHS atlases.
Submit an sbatch job, capture the stdout/err in logs. Passing the path to the code directory is required.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runAshs \
  --output=${code_dir}/logs/runAshs_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  ashs.py \
  -c $code_dir \
  -s /path/to/ashs_latest.simg
```
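As a subject-level script, `ashs.py` should also accept the `--keep-interm` option described in the intro; a quick way to confirm before appending it to the sbatch call above (this assumes an argparse-style `--help`):

```bash
# List ashs.py options and confirm --keep-interm is among them
# (assumes the script exposes an argparse-style --help).
python ashs.py --help | grep -- --keep-interm
```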
Controlling AFNI's `@afni_refacer_run` is accomplished through `reface.py`. All refacing methods (deface, reface, reface_plus) are supported via the `method` option (default=reface). The steps are as follows:

- This script will read in `logs/completed_preprocessing.tsv` and from that dataframe make a list of potential subjects.
- Determine which potential subjects have a T1-weighted file but not a refaced file.
- For each subject in a batch of size N, the script will then submit a `workflow.control_reface`. This is accomplished via the `submit_jobs` function.
- Output is saved to the project directory `derivatives/<reface>`.
Submit an sbatch job, capture the stdout/err in logs. Passing the path to the code directory is required.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runReface \
  --output=${code_dir}/logs/runReface_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  reface.py \
  -c $code_dir
```
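To select a different refacing method, supply the `method` option noted above; the exact spelling `--method` is an assumption here, so confirm it against the script's help text before use:

```bash
# Assumed spelling of the method option; deface, reface, and reface_plus
# are the supported values, with reface as the default.
sbatch --job-name=runReface \
  --output=${code_dir}/logs/runReface_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  reface.py \
  -c $code_dir \
  --method reface_plus
```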
`fmriprep.py` takes data from `dset` and produces `freesurfer` and `fmriprep` derivatives. This work is a precursor to the various AFNI workflows (below). Steps are as follows:

- Update the templateflow directory in `/scratch` to combat the purge.
- Determine which participants in `dset` do not have the pre-processed T1-weighted file output by fMRIPrep.
- Submit `workflow.control_fmriprep` for a batch of said subjects.
Output will be saved to the respective derivatives folder in the project directory.
Submit an sbatch job, capture the stdout/err in logs. Specifying the code directory is required.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runPrep \
  --output=${code_dir}/logs/runPrep_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  fmriprep.py \
  -c $code_dir
```
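The coordinator's captured stdout/err can be followed while it runs, since the `--output` flag above pins the log location:

```bash
# Follow the coordinator log written via the sbatch --output flag above.
tail -f ${code_dir}/logs/runPrep_log
```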
Extra pre-processing steps not done by fMRIPrep, as well as setup for an AFNI-style deconvolution, are conducted by `afni_task_subj.py`. Steps are as follows:

- This script will read in `logs/completed_preprocessing.tsv` and from that dataframe make a list of potential subjects.
- Determine whether a `decon_plan` has been supplied, and if so read in the JSON.
- Determine which subjects have fMRIPrep output AND are missing an eroded white matter mask, session-task intersection mask, session-task scaled files, and/or the intended deconvolution file.
- For each subject in a batch of size N, the script will then submit a `workflow.control_afni.control_preproc` to build `afni_data`, and also a `workflow.control_afni.control_deconvolution`. This is accomplished via the `submit_jobs` function.
- Certain output is saved to the project directory `derivatives/afni`, and housekeeping is conducted (see `submit_jobs`).
Naming conventions largely follow the BIDS specification, but some AFNI-esque filenames are also employed.
Submit an sbatch job, capture the stdout/err in logs. Specifying the session, task, and code directory is required.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runAfniTask \
  --output=${code_dir}/logs/runAfniTask_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  afni_task_subj.py \
  -s ses-S2 \
  -t task-test \
  -c $code_dir \
  --blur
```
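The group-level example further below analyzes `ses-S1`/`task-study`, so the corresponding subject-level work would be submitted with the same pattern, changing only the `-s` and `-t` values:

```bash
# Same submission for the session/task used in the group-level example below.
sbatch --job-name=runAfniTask \
  --output=${code_dir}/logs/runAfniTask_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  afni_task_subj.py \
  -s ses-S1 \
  -t task-study \
  -c $code_dir \
  --blur
```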
Extra pre-processing steps not done by fMRIPrep, as well as setup for an AFNI-style resting state regression, are conducted by `afni_resting_subj.py`. Steps are as follows:

- This script will read in `logs/completed_preprocessing.tsv` and from that dataframe make a list of potential subjects.
- Determine which subjects have fMRIPrep output AND are missing an eroded white matter mask, session-task intersection mask, session-task scaled files, and/or the intended correlation matrix.
- For each subject in a batch of size N, the script will then submit a `workflow.control_afni.control_preproc` to build `afni_data`, and also a `workflow.control_afni.control_resting`. This is accomplished via the `submit_jobs` function.
- Certain output is saved to the project directory `derivatives/afni`, and housekeeping is conducted (see `submit_jobs`).
Naming conventions largely follow the BIDS specification, but some AFNI-esque filenames are also employed.
Submit an sbatch job, capture the stdout/err in logs. Specifying the code directory is required, and `--blur` is not recommended.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runAfniRest \
  --output=${code_dir}/logs/runAfniRest_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  afni_resting_subj.py \
  -c $code_dir
```
Group pairwise comparisons are conducted by `afni_task_group.py` via the 3dttest++ implementation of Equitable Thresholding and Clustering (ETAC). The steps employed are as follows:

- This script will read in `logs/completed_preprocessing.tsv` and from that dataframe make a list of potential subjects.
- Determine which subjects have both an EPI-anat intersection mask and a deconvolved file.
- Submit the group-level module `workflow.control_afni.control_task_group`. This is accomplished via the `submit_jobs` function.
- Output is saved to the project directory `derivatives/afni/analyses`.
Submit an sbatch job, capture the stdout/err in logs. The `--blur` option should be identical to that used in `cli/afni_task_subj.py`. Specifying the code directory, session, task, decon filename, and list of behaviors is required.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runTaskGroup \
  --output=${code_dir}/logs/runAfniTaskGroup_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  afni_task_group.py \
  --blur \
  -c $code_dir \
  -s ses-S1 \
  -t task-study \
  -d decon_task-study_UniqueBehs_stats_REML+tlrc \
  -b neg neu
```
An A vs not-A analysis is conducted by `afni_resting_group.py` via 3dttest++ ETAC methods. The steps employed are:

- This script will read in `logs/completed_preprocessing.tsv` and from that dataframe make a list of potential subjects.
- Determine which subjects have both an EPI-anat intersection mask and a Z-transformed file.
- Submit the group-level module `workflow.control_afni.control_resting_group`. This is accomplished via the `submit_jobs` function.
- Output is saved to the project directory `derivatives/afni/analyses`.
Submit an sbatch job, capture the stdout/err in logs. The `--blur` option should be identical to that used in `cli/afni_resting_subj.py`. Specifying the code directory and the seed name (used to generate the Z-transformed matrix) is required.
```bash
code_dir="$(dirname "$(pwd)")"

sbatch --job-name=runRSGroup \
  --output=${code_dir}/logs/runAfniRestGroup_log \
  --mem-per-cpu=4000 \
  --partition=IB_44C_512G \
  --account=iacc_madlab \
  --qos=pq_madlab \
  afni_resting_group.py \
  -c $code_dir \
  -s rPCC
```
Are you still awake?