I have requested 60 GB of memory and 16 CPUs per fMRIPrep job, distributed by subject on our HPC, and my fMRIPrep command includes --mem_mb 48000 and --nthreads 10. Looking at the job reports, there are times when memory use spikes above 60 GB. When this happens the main process is not killed, but the job hangs indefinitely.
My current workaround is to give the memory more headroom (requesting 120 GB while keeping --mem_mb 48000). So far the jobs are running fine, but I'd like to know why this is happening and whether I'm doing something wrong.
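For context, each subject runs as its own job. The submission request looks roughly like this (a minimal sketch assuming SLURM; the directive values, walltime, array range, and subject-list file are placeholders, not my exact script):

#!/bin/bash
#SBATCH --job-name=fmriprep
#SBATCH --cpus-per-task=16        # 16 CPUs requested per job
#SBATCH --mem=60G                 # 60 GB per job (120G in the current workaround)
#SBATCH --time=24:00:00           # placeholder walltime
#SBATCH --array=1-10              # one array task per subject (placeholder range)

# Map the array index to a subject label (subject_list.txt is a placeholder)
s=$(sed -n "${SLURM_ARRAY_TASK_ID}p" subject_list.txt)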
My fMRIPrep 21.0.2 command:
singularity run --cleanenv \
-B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} \
-B ${TMPDIR}/in:/in \
-B ${TMPDIR}/out:/out \
-B ${TMPDIR}/wrk:/wrk \
${container} /in /out/${s} participant \
--participant_label ${s} \
-w /wrk/${s} \
--nthreads 10 \
--mem_mb 48000 \
--fs-license-file ${license} \
--output-spaces fsaverage6 fsLR MNI152NLin2009cAsym \
--cifti-output \
--skip-bids-validation \
--notrack \
--use-aroma \
--error-on-aroma-warnings
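For completeness, the variables referenced above are set earlier in the job script, roughly like this (the paths and image name shown here are placeholders, and TMPDIR is assumed to point at the node's local scratch provided by the scheduler):

export TEMPLATEFLOW_HOST_HOME=${HOME}/.cache/templateflow   # TemplateFlow cache on the host (placeholder path)
export SINGULARITYENV_TEMPLATEFLOW_HOME=/templateflow       # where the cache is mounted inside the container
container=/path/to/fmriprep-21.0.2.simg                     # placeholder image path
license=/path/to/freesurfer_license.txt                     # placeholder FreeSurfer license path
mkdir -p ${TMPDIR}/in ${TMPDIR}/out ${TMPDIR}/wrk            # per-job input/output/work directories on scratch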