Fmriprep in singularity not running when submitted with qsub

jmatthews · October 23, 2019, 4:37pm

Hello all,

We have set up fmriprep (1.3.2) in a singularity (3.4.0) container on our grid cluster (CentOS 6.3). If I connect directly to one of our nodes and run my singularity-fmriprep command, the pipeline will run to completion. However, when I pass the command to a node with qsub, fmriprep begins, does the BIDS validation, generates the boilerplate, and stops without errors. We haven’t been able to figure out why this would be happening. Does anyone have similar experiences, or things they suggest we look into? Thank you!

Command and outputs below:

This is my singularity-fmriprep command:

singularity run --cleanenv --bind /data:/mnt /opt/fmriprep-1.3.2 /mnt/chen_lab/jmatthews/GridNodeTest/input/ds002156-1.0.0/ /mnt/chen_lab/jmatthews/GridNodeTest/output/ participant -w /mnt/chen_lab/jmatthews/fmriprep-work/ --fs-license-file /mnt/chen_lab/jmatthews/fmriprep-work/license.txt

This command is saved in a file named “fmriprep_command_long” and passed with command:

qsub -cwd fmriprep_command_long

The resulting qsub files are:
fmriprep_command_long.e1389526
-which is empty
and
fmriprep_command_long.o1389526
-which contains (I have condensed this output with ellipses, as it is long and contains “@” symbols, which the forum flags as user tags and which I don’t know how to get around. If I have left out anything critical, please ask.):

This dataset appears to be BIDS compatible.
…

If you have any questions please post on https://neurostars.org/tags/bids

Making sure the input data is BIDS compliant (warnings can be ignored in most cases).
191022-15:50:00,673 nipype.workflow IMPORTANT:
Running fMRIPREP version 1.3.2:
  * BIDS dataset path: /mnt/chen_lab/jmatthews/GridNodeTest/input/ds002156-1.0.0.
  * Participant list: ['23638'].
  * Run identifier: 20191022-155000_9cac8d1e-745d-471d-90a7-12f82ede0cc1.
…
Works derived from this fMRIPrep execution should include the following boilerplate:

…
Anatomical data preprocessing

…

Functional data preprocessing

…

Many internal operations of fMRIPrep use
…

References

jmatthews · January 28, 2020, 5:18pm

Bumping for answers.

@oesteban someone suggested you may be able to help?

Our (Rotman Research Institute, Toronto) IT department has been getting more requests to use fmriprep on our grid, but no one has been able to solve this problem. I can ask them to join this forum, or put someone in contact with them directly.

Thank you in advance for any help!

oesteban · January 28, 2020, 5:36pm

At first glance, this looks like a memory issue. Two (non-exclusive) options:

Check that you can access the same amount of RAM, and that the same overcommit policies apply, whether you run interactively or via qsub.
Upgrade to fMRIPrep 1.5 or later, which postpones some memory-hungry operations related to the citation boilerplate.
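
One way to check the first option is to print the limits the shell sees, once interactively on the node and once through qsub, and compare the two outputs. This is only a minimal sketch: the function name `report_limits` and the script name used below are hypothetical, and it assumes a Linux node with a POSIX-style shell.

```shell
#!/bin/sh
# Print the memory-related limits visible to the current shell.
# Run this both interactively and via qsub, then compare the outputs.
report_limits() {
    echo "host: $(uname -n)"
    # Values are in KiB, or "unlimited" when no limit is set.
    echo "max virtual memory (ulimit -v): $(ulimit -v)"
    echo "max resident set size (ulimit -m): $(ulimit -m)"
    # Kernel overcommit policy: 0 = heuristic, 1 = always, 2 = strict.
    if [ -r /proc/sys/vm/overcommit_memory ]; then
        echo "overcommit_memory: $(cat /proc/sys/vm/overcommit_memory)"
    fi
    # Total and free memory as reported by the kernel.
    if [ -r /proc/meminfo ]; then
        grep -E '^(MemTotal|MemFree|CommitLimit):' /proc/meminfo
    fi
}

report_limits
```

Saved as, say, `report_limits.sh`, it can be run directly on the node with `sh report_limits.sh > interactive.txt` and submitted with `qsub -cwd report_limits.sh`; a `diff` between `interactive.txt` and the resulting `.o` file shows whether the batch environment imposes a tighter limit.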

Tony · January 28, 2020, 5:57pm

Thanks for your quick response. I thought it was a memory issue too, and I tried several resource-allocation settings for the job, including 4 GB of RAM, but it still stops at the same spot. The fMRIPrep FAQ says this is caused by a Python bug; I am not sure whether it has been fixed. I will try fMRIPrep 1.5+.

My fMRIPrep run is hanging…

When running on Linux platforms (or containerized environments, because they are built around Ubuntu), there is a Python bug that affects fMRIPrep that drives the Linux kernel to kill processes as a response to running out of memory. Depending on the process killed by the kernel, fMRIPrep may crash with a BrokenProcessPool error or hang indefinitely, depending on settings. While we are working on finding a solution that does not run up against this bug, this may take some time. This can be most easily resolved by allocating more memory to the process, if possible.
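
If the kernel's out-of-memory killer is responsible, it usually leaves a trace in the kernel log of the node where the job ran. Below is a minimal sketch of a filter for those messages. The function name `find_oom_kills` is made up for illustration, the patterns reflect the usual wording of Linux OOM messages and may differ between kernel versions, and reading the kernel log may require elevated privileges on some systems.

```shell
#!/bin/sh
# Filter kernel log text on stdin for lines written by the OOM killer.
# Returns success even when nothing matches, so it is safe in pipelines.
find_oom_kills() {
    grep -iE 'out of memory|oom-killer|killed process' || true
}
```

On the node, shortly after the job stops, something like `dmesg | find_oom_kills` would show whether a Python process was killed for exceeding memory. Empty output suggests looking at scheduler-imposed limits instead.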

Tony · January 28, 2020, 6:10pm

The question is: we can run it directly on a compute node without any problem, but when the same job is assigned to the same node with qsub, it stops. What is the minimum memory allocation for fMRIPrep with one subject?
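
For reference, this is roughly how the memory request could be made explicit on both sides, in the scheduler and in fMRIPrep itself. It is a sketch under assumptions, not a tested recipe: `h_vmem` is a common Grid Engine resource name but is site-specific, the 16G, 14000 MB and 4-thread values are placeholders rather than a recommended minimum, and the `DRY_RUN` switch is added here only so the command can be inspected before it is run. `--mem_mb` and `--nthreads` are fMRIPrep's own options for capping memory and threads.

```shell
#!/bin/sh
# Grid Engine reads lines starting with "#$" as qsub options.
# ASSUMPTION: h_vmem is the memory resource on this cluster; 16G is a placeholder.
#$ -cwd
#$ -l h_vmem=16G

# Placeholder values; keep MEM_MB below what the scheduler grants.
MEM_MB="${MEM_MB:-14000}"
NTHREADS="${NTHREADS:-4}"
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

# Same command as in the original post, plus explicit resource caps.
set -- singularity run --cleanenv --bind /data:/mnt /opt/fmriprep-1.3.2 \
    /mnt/chen_lab/jmatthews/GridNodeTest/input/ds002156-1.0.0/ \
    /mnt/chen_lab/jmatthews/GridNodeTest/output/ participant \
    -w /mnt/chen_lab/jmatthews/fmriprep-work/ \
    --fs-license-file /mnt/chen_lab/jmatthews/fmriprep-work/license.txt \
    --nthreads "$NTHREADS" --mem_mb "$MEM_MB"

if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
else
    "$@"
fi
```

If the scheduler limit caps virtual rather than resident memory, a Python process that forks workers can hit it well before physical RAM runs out, which would be consistent with the job stopping silently under qsub but not interactively.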