I am preprocessing 3T data from the HCP project on our HPC. Each subject has 1 T1w, 1 T2w, 4 resting-state scans, and 2 fieldmaps, and my preprocessing output includes 91k CIFTI files. The working directory gets very large for each subject, around 100 GB or even bigger, and then fMRIPrep crashes with a message that the system does not have enough space. Is it normal for the fMRIPrep working directory to get this big? Is there a way to set up a cleanup process, or a flag to control the size of the work dir? I am trying to preprocess subjects in parallel by spinning up multiple containers.
fMRIPrep does not have a flag to directly control the size of the working directory. Make sure you are not using the --low-mem flag, as that increases the number of intermediate files written to the work dir.
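Since you are already scripting the container calls yourself, one workable pattern is a per-subject scratch work directory that you delete only after a clean exit, so a failed run can still resume from the cache. A minimal sketch of such a wrapper, with hypothetical paths and image name (fmriprep.simg) that you would swap for your own:

```bash
#!/bin/bash
# Per-subject fMRIPrep wrapper -- all paths here are hypothetical examples.
SUBJ=$1
BIDS_DIR=/data/bids
OUT_DIR=/data/derivatives
WORK_DIR=/scratch/$USER/fmriprep_work/sub-${SUBJ}   # isolated per-subject work dir
mkdir -p "$WORK_DIR"

singularity run --cleanenv \
    -B "$BIDS_DIR":/data:ro -B "$OUT_DIR":/out -B "$WORK_DIR":/work \
    fmriprep.simg \
    /data /out participant \
    --participant-label "$SUBJ" \
    --cifti-output 91k \
    -w /work
status=$?   # note: no --low-mem above, since it inflates /work

# Delete intermediates only on success, so a crashed run can be resumed
# from the cached work dir instead of starting over.
if [ "$status" -eq 0 ]; then
    rm -rf "$WORK_DIR"
fi
exit "$status"
```

With one such wrapper per container, each parallel job only ever holds one subject's intermediates on scratch at a time.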
Also, why not just use the already preprocessed HCP data?
Currently, the only way to control working-directory size is to restrict yourself to subsets of processing at a time, for example running the anatomical workflow first and then the functional runs in smaller batches; see the sketch below.
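One way to implement that with existing flags is --anat-only for a first pass, then --bids-filter-file to process one functional run at a time while reusing the same work dir so the cached anatomical results are picked up. A sketch under those assumptions; the filter-file entity keys and the task name "rest" are guesses based on the pybids query format, so check the fMRIPrep docs for your version:

```bash
#!/bin/bash
# Staged processing sketch -- hypothetical paths, adjust to your cluster.
SUBJ=$1
IMG=fmriprep.simg
BIND="-B /data/bids:/data:ro -B /data/derivatives:/out -B /scratch/$USER/work:/work"

# Pass 1: anatomical workflow only; its results stay cached in /work.
singularity run --cleanenv $BIND "$IMG" \
    /data /out participant --participant-label "$SUBJ" \
    --anat-only -w /work

# Pass 2: functional runs one at a time via --bids-filter-file, so only
# one run's intermediates accumulate in /work per invocation.
for run in 1 2 3 4; do
    echo "{\"bold\": {\"task\": \"rest\", \"run\": ${run}}}" > filter.json
    singularity run --cleanenv $BIND "$IMG" \
        /data /out participant --participant-label "$SUBJ" \
        --bids-filter-file filter.json \
        --cifti-output 91k -w /work
done
```

If space is very tight, you can additionally prune each run's functional intermediates from /work between loop iterations, at the cost of losing the resume cache for that run.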
One of our current efforts is to move to a mode where we generate a minimal set of derivatives (mostly transforms and reference volumes), such that all other derivatives can be generated deterministically after the fact. This will also dramatically shrink the workflow, and with it the working directory.