I have requested 60 GB of memory and 16 CPUs per fMRIPrep job, distributed by subject on our HPC, and my fMRIPrep command includes --mem_mb 48000 and --nthreads 10. Looking at the job reports, there are times when memory use spikes above 60 GB. When this happens the main process is not killed, but the job hangs indefinitely.
My current workaround is to give the memory more headroom (requesting 120 GB while keeping --mem_mb 48000). So far the jobs are running fine, but I'd like to know why this is happening and whether I'm doing something wrong.
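For context, each subject runs as its own job. The submission request looks roughly like this (a minimal sketch assuming SLURM; the directive values, walltime, array range, and subject-list file are placeholders, not my exact script):

#!/bin/bash
#SBATCH --job-name=fmriprep
#SBATCH --cpus-per-task=16        # 16 CPUs requested per job
#SBATCH --mem=60G                 # 60 GB per job (120G in the current workaround)
#SBATCH --time=24:00:00           # placeholder walltime
#SBATCH --array=1-10              # one array task per subject (placeholder range)

# Map the array index to a subject label (subject_list.txt is a placeholder)
s=$(sed -n "${SLURM_ARRAY_TASK_ID}p" subject_list.txt)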
My fMRIPrep 21.0.2 command:
singularity run --cleanenv \
-B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} \
-B ${TMPDIR}/in:/in \
-B ${TMPDIR}/out:/out \
-B ${TMPDIR}/wrk:/wrk \
${container} /in /out/${s} participant \
--participant_label ${s} \
-w /wrk/${s} \
--nthreads 10 \
--mem_mb 48000 \
--fs-license-file ${license} \
--output-spaces fsaverage6 fsLR MNI152NLin2009cAsym \
--cifti-output \
--skip-bids-validation \
--notrack \
--use-aroma \
--error-on-aroma-warnings
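For completeness, the variables referenced above are set earlier in the job script, roughly like this (the paths and image name shown here are placeholders, and TMPDIR is assumed to point at the node's local scratch provided by the scheduler):

export TEMPLATEFLOW_HOST_HOME=${HOME}/.cache/templateflow   # TemplateFlow cache on the host (placeholder path)
export SINGULARITYENV_TEMPLATEFLOW_HOME=/templateflow       # where the cache is mounted inside the container
container=/path/to/fmriprep-21.0.2.simg                     # placeholder image path
license=/path/to/freesurfer_license.txt                     # placeholder FreeSurfer license path
mkdir -p ${TMPDIR}/in ${TMPDIR}/out ${TMPDIR}/wrk            # per-job input/output/work directories on scratch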