fMRIPrep breakdown

Summary of what happened:

Dear experts,

I tried to run fMRIPrep to preprocess functional data for about 30 subjects. After running for about one day, it suddenly stopped, and I do not know why.

Command used (and if a helper script was used, a link to the helper script or the command generated):

#!/bin/bash

# User inputs:
base_dir=/share/home            # renamed from HOME to avoid clobbering the shell's $HOME
bids_root_dir=/share/home/data/
nthreads=30
mem=100 # GB
container=singularity # docker or singularity

# Begin:

# Convert memory from GB to MB
mem=${mem//[!0-9]/}              # strip any non-digit characters (e.g., a trailing "gb")
mem_mb=$(( mem * 1000 - 5000 ))  # leave ~5 GB of headroom for buffer space during preprocessing

FREESURFER_HOME=$base_dir/FreeSurferLicense
TEMPLATEFLOW_HOME=/share/home/templateflow

# --cleanenv strips host environment variables, so tell fMRIPrep where the
# bound TemplateFlow directory lives inside the container via the
# SINGULARITYENV_ prefix
export SINGULARITYENV_TEMPLATEFLOW_HOME=/templateflow

# Run fMRIPrep
singularity run --cleanenv -B $TEMPLATEFLOW_HOME:/templateflow \
  $base_dir/software/my_images/fmriprep-latest.simg \
  $bids_root_dir $base_dir/project5/preprocessing_volume/derivatives1 \
  participant \
  --participant-label 016 017 018 019 020 021 023 024 026 027 029 031 032 033 034 036 038 039 040 041 042 043 045 046 047 048 049 050 051 052 053 054 055 056 057 059 \
  --skip-bids-validation \
  --md-only-boilerplate \
  --fs-license-file $FREESURFER_HOME/license.txt \
  --dummy-scans 5 \
  --fs-no-reconall \
  --output-spaces MNI152NLin6Asym:res-2 \
  --nthreads $nthreads \
  --stop-on-first-crash \
  --mem_mb $mem_mb \
  -w $base_dir/work1

Version: latest

Environment (Docker, Singularity / Apptainer, custom installation):

singularity

Data formatted according to a validatable standard? Please provide the output of the validator:

Yes, it is in BIDS format.
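For reference, the validator output requested above can be generated with, e.g., the Node-based bids-validator CLI (one common option; this assumes Node.js/npx is available on the host):

# Run the BIDS validator against the dataset root and save its report
npx bids-validator /share/home/data/ | tee bids_validator_report.txt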

Relevant log outputs (up to 20 lines):

241013-23:54:31,101 nipype.interface INFO:
	 stderr 2024-10-13T23:54:31.101718:++ Max displacement in automask = 1.54 (mm) at sub-brick 282
241013-23:54:31,102 nipype.interface INFO:
	 stderr 2024-10-13T23:54:31.101718:++ Max delta displ  in automask = 1.46 (mm) at sub-brick 282
241013-23:54:47,872 nipype.workflow INFO:
	 [Node] Finished "_runwise_avg1", elapsed time 390.14242s.
2024-10-13 23:54:57,986 [ WARNING] Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f167e0934d0>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /api/1137693/envelope/
2024-10-13 23:54:57,989 [ WARNING] Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f167e0d8fd0>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /api/1137693/envelope/
2024-10-13 23:54:57,991 [ WARNING] Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f167e0dbb50>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /api/1137693/envelope/
job end time is Sun Oct 13 23:55:02 CST 2024

Screenshots / relevant information:


Hi @Yi_Zheng,

Are you submitting this job to a scheduler like SLURM? If so, without an SBATCH header you are only getting the default job allocation for resources and time limit, which on most clusters is quite restrictive. Even with 30 CPUs, if you are processing 30 subjects at once, each subject may get only one CPU, which will lead to slow processing.
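For reference, a minimal SBATCH header might look something like this (the partition name, time limit, and script name are placeholders to adapt to your cluster):

#!/bin/bash
#SBATCH --job-name=fmriprep
#SBATCH --partition=normal       # placeholder: your cluster's partition/queue
#SBATCH --cpus-per-task=30       # match --nthreads
#SBATCH --mem=100G               # match the memory given to fMRIPrep, plus headroom
#SBATCH --time=2-00:00:00        # 2 days; adjust to your cluster's limits
#SBATCH --output=fmriprep_%j.log

bash run_fmriprep.sh             # hypothetical name for the script you posted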

Additionally, --fs-no-reconall is not recommended.

Best,
Steven

There’s nothing in the log that indicates a failure. There are warnings that it couldn’t send an update to Sentry to report success or failure, but those are not errors.
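If your compute nodes have no internet access, those telemetry retries can be silenced by opting out of tracking with fMRIPrep's --notrack flag. A minimal sketch (other options omitted for brevity; paths as in your script):

# Same invocation as before, with Sentry telemetry disabled via --notrack
singularity run --cleanenv /share/home/software/my_images/fmriprep-latest.simg \
  /share/home/data/ /share/home/project5/preprocessing_volume/derivatives1 \
  participant \
  --notrack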

“Version: latest” is meaningless. Please share the actual version as reported by the tool.
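For example, the container can report the exact fMRIPrep version it ships (image path as in your script):

# Ask the container for its fMRIPrep version
singularity run --cleanenv /share/home/software/my_images/fmriprep-latest.simg --version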