Node n4 failed to run on host gpu01.cluster

Summary of what happened:

I am running fMRIPrep on a dataset inside a Singularity container on my school’s HPC. Of 110 subjects, 57 ran without errors. The other 53 all failed with the same pattern: a NodeExecutionError, followed by a TraitError, followed by a FileNotFoundError in node N4.

My most recent attempt ran each subject with 50 GB of memory and 10 CPUs. I found reports of similar errors where the fix was to give each participant a unique fMRIPrep work directory, so the script below does that. (My original script set the work directory to /mnt/fmriprep_work.) Neither approach worked. I also tried increasing memory to 80 GB, which didn’t help either.

Thank you!

Command used (and if a helper script was used, a link to the helper script or the command generated):

#!/bin/bash
#SBATCH --job-name=fmriprep_drop_nodeexec
#SBATCH --partition=main
#SBATCH --nodes=2
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=50GB
#SBATCH --time=5:00:00
#SBATCH --output=singularity_images/output_logs/get_fmriprep_drop_%A_%a_o.log
#SBATCH --error=singularity_images/error_logs/get_fmriprep_drop_%A_%a_e.log 
#SBATCH --array=1-53%18
#SBATCH --mail-user=ibrayyilmaz@cmc.edu
#SBATCH --mail-type=BEGIN,END,FAIL
# Define paths
SINGULARITY_IMG="/hopper/groups/enkavilab/singularity_images/fmriprep-24.1.1.simg"
DATASET_PATH="/hopper/groups/enkavilab/ds004636"
SUBJECT_LIST=("sub-s586" "sub-s172" "sub-s518" "sub-s587" "sub-s591" "sub-s613" "sub-s640" "sub-s647" "sub-s597" "sub-s648" "sub-s601" "sub-s606" "sub-s608" "sub-s590" "sub-s609" "sub-s631" "sub-s607" "sub-s600" "sub-s585" "sub-s599" "sub-s582" "sub-s549" "sub-s373" "sub-s513" "sub-s548" "sub-s584" "sub-s546" "sub-s579" "sub-s541" "sub-s512" "sub-s558" "sub-s593" "sub-s567" "sub-s594" "sub-s556" "sub-s568" "sub-s557" "sub-s595" "sub-s561" "sub-s643" "sub-s626" "sub-s619" "sub-s617" "sub-s610" "sub-s645" "sub-s616" "sub-s627" "sub-s499" "sub-s635" "sub-s603" "sub-s234" "sub-s519" "sub-s553")

# Get the subject for this job
SUBJECT=${SUBJECT_LIST[$SLURM_ARRAY_TASK_ID - 1]}
PARTICIPANT_WORK_DIR="/hopper/groups/enkavilab/fmriprep_work_sub/$SUBJECT"
cd $DATASET_PATH || { echo "Failed to change directory to $DATASET_PATH"; exit 1; }
mkdir -p $PARTICIPANT_WORK_DIR

# Run fMRIPrep
/usr/bin/time -v singularity run --cleanenv -B /hopper/groups/enkavilab:/mnt \
    ${SINGULARITY_IMG} \
    /mnt/ds004636 /mnt/ds004636/derivatives/fmriprep \
    participant \
    --participant-label ${SUBJECT} \
    --fs-license-file /mnt/freesurfer_license.txt \
    --work-dir /mnt/fmriprep_work_sub/$SUBJECT \
    --output-spaces MNI152NLin2009cAsym \
    --nthreads 10 --mem_mb 50000 \
    --omp-nthreads 10 \
    --fs-no-reconall \
    --stop-on-first-crash \
    --skip_bids_validation \
    --use-aroma full \

# Check if fMRIPrep ran successfully
if [ $? -eq 0 ]; then
    echo "fMRIPrep completed successfully for $SUBJECT."
    # Step 3: Log storage usage
    du -sh /hopper/groups/enkavilab/ds004636/derivatives/fmriprep > storage_report_${SUBJECT}.txt
    # Step 4: Drop data with Datalad
    echo "Dropping data for $SUBJECT..."
    datalad drop --nocheck $DATASET_PATH/$SUBJECT || { echo "Failed to drop $SUBJECT"; exit 1; }
    echo "Data successfully dropped for $SUBJECT."
else
    echo "fMRIPrep failed for $SUBJECT. Skipping datalad drop."
    exit 1
fi

Version:

Fmriprep version: 24.1.1
singularity version: 4.1.4-1.el9

Environment (Docker, Singularity / Apptainer, custom installation):

Running it in a singularity Container.

Data formatted according to a validatable standard? Please provide the output of the validator:

[WARNING] UNKNOWN_BIDS_VERSION The BIDSVersion field of 'dataset_description.json' does not match a known release.
The BIDS Schema used for validation may be out of date.

		/dataset_description.json

	Please visit https://neurostars.org/search?q=UNKNOWN_BIDS_VERSION for existing conversations about this issue.

	[WARNING] JSON_KEY_RECOMMENDED A JSON file is missing a key listed as recommended.
		DatasetType
		/dataset_description.json

		GeneratedBy
		/dataset_description.json

		SourceDatasets
		/dataset_description.json

	Please visit https://neurostars.org/search?q=JSON_KEY_RECOMMENDED for existing conversations about this issue.

	[WARNING] GZIP_HEADER_MTIME The gzip header contains a non-zero timestamp.
This may leak sensitive information or indicate a non-reproducible conversion process.

		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2843 more files with the same issue

	Please visit https://neurostars.org/search?q=GZIP_HEADER_MTIME for existing conversations about this issue.

	[WARNING] GZIP_HEADER_FILENAME The gzip header contains a non-empty filename.
This may leak sensitive information or indicate a non-reproducible conversion process.

		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2843 more files with the same issue

	Please visit https://neurostars.org/search?q=GZIP_HEADER_FILENAME for existing conversations about this issue.

	[WARNING] SIDECAR_KEY_RECOMMENDED A data file's JSON sidecar is missing a key listed as recommended.
		B0FieldIdentifier
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz

		649 more files with the same issue

		Manufacturer
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		5002 more files with the same issue

		ManufacturersModelName
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		5002 more files with the same issue

		DeviceSerialNumber
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		5002 more files with the same issue

		StationName
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		SoftwareVersions
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		5002 more files with the same issue

		MagneticFieldStrength
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ReceiveCoilName
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ReceiveCoilActiveElements
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		GradientSetType
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		MRTransmitCoilSequence
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		MatrixCoilMode
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		CoilCombinationMethod
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		PulseSequenceType
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ScanningSequence
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		SequenceVariant
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ScanOptions
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		SequenceName
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		PulseSequenceDetails
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		MTState
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		SpoilingState
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		NumberShots
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ParallelReductionFactorInPlane
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ParallelReductionFactorOutOfPlane
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		ParallelAcquisitionTechnique
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		PartialFourier
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		PartialFourierDirection
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		EffectiveEchoSpacing
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		756 more files with the same issue

		MixingTime
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		PhaseEncodingDirection
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		756 more files with the same issue

		TotalReadoutTime
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		756 more files with the same issue

		EchoTime
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		756 more files with the same issue

		InversionTime
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		DwellTime
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		FlipAngle
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		756 more files with the same issue

		MultibandAccelerationFactor
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		InstitutionName
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		InstitutionAddress
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		InstitutionalDepartmentName
		/sub-s549/ses-1/fmap/sub-s549_ses-1_magnitude.nii.gz
		/sub-s549/ses-1/fmap/sub-s549_ses-1_fieldmap.nii.gz

		2480 more files with the same issue

		TaskName
		/sub-s549/ses-1/func/sub-s549_ses-1_task-discountFix_run-1_recording-cardiac_physio.tsv.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-motorSelectiveStop_run-1_recording-respiratory_physio.tsv.gz

		2523 more files with the same issue

		StimulusPresentation
		/sub-s549/ses-1/func/sub-s549_ses-1_task-surveyMedley_run-1_events.tsv
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_events.tsv

		1040 more files with the same issue

		NumberOfVolumesDiscardedByScanner
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		NumberOfVolumesDiscardedByUser
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		DelayTime
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		AcquisitionDuration
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		DelayAfterTrigger
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		Instructions
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		TaskDescription
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		CogAtlasID
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

		CogPOID
		/sub-s549/ses-1/func/sub-s549_ses-1_task-rest_run-1_bold.nii.gz
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_bold.nii.gz

		1266 more files with the same issue

	Please visit https://neurostars.org/search?q=SIDECAR_KEY_RECOMMENDED for existing conversations about this issue.

	[WARNING] TSV_COLUMN_TYPE_REDEFINED A column required in a TSV file has been redefined in a sidecar file. This redefinition is being ignored.
		onset
		/sub-s549/ses-1/func/sub-s549_ses-1_task-surveyMedley_run-1_events.tsv - defined in /task-surveyMedley_events.json
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_events.tsv - defined in /task-DPX_events.json

		1040 more files with the same issue

		duration
		/sub-s549/ses-1/func/sub-s549_ses-1_task-surveyMedley_run-1_events.tsv - defined in /task-surveyMedley_events.json
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_events.tsv - defined in /task-DPX_events.json

		1040 more files with the same issue

		response_time
		/sub-s549/ses-1/func/sub-s549_ses-1_task-surveyMedley_run-1_events.tsv - defined in /task-surveyMedley_events.json
		/sub-s549/ses-1/func/sub-s549_ses-1_task-DPX_run-1_events.tsv - defined in /task-DPX_events.json

		1040 more files with the same issue

	Please visit https://neurostars.org/search?q=TSV_COLUMN_TYPE_REDEFINED for existing conversations about this issue.

	[WARNING] EVENTS_TSV_MISSING Task scans should have a corresponding 'events.tsv' file.
If this is a resting state scan you can ignore this warning or rename the task to include the word "rest".

		/sub-s640/ses-2/func/sub-s640_ses-2_task-discountFix_run-1_recording-cardiac_physio.tsv.gz
		/sub-s640/ses-2/func/sub-s640_ses-2_task-motorSelectiveStop_run-1_recording-respiratory_physio.tsv.gz

		52 more files with the same issue

	Please visit https://neurostars.org/search?q=EVENTS_TSV_MISSING for existing conversations about this issue.

	[WARNING] EVENT_ONSET_ORDER The onset column in events.tsv files should be sorted.

		/sub-s518/ses-2/func/sub-s518_ses-2_task-CCTHot_run-1_events.tsv
		/sub-s471/ses-2/func/sub-s471_ses-2_task-motorSelectiveStop_run-1_events.tsv

		4 more files with the same issue

	Please visit https://neurostars.org/search?q=EVENT_ONSET_ORDER for existing conversations about this issue.

	[WARNING] SUSPICIOUSLY_LONG_EVENT_DESIGN The onset of the last event is after the total duration of the corresponding scan.
This design is suspiciously long.

		/sub-s601/ses-2/func/sub-s601_ses-2_task-twoByTwo_run-1_bold.nii.gz
		/sub-s648/ses-2/func/sub-s648_ses-2_task-CCTHot_run-1_bold.nii.gz

	Please visit https://neurostars.org/search?q=SUSPICIOUSLY_LONG_EVENT_DESIGN for existing conversations about this issue.

	[WARNING] SUSPICIOUSLY_SHORT_EVENT_DESIGN The onset of the last event is less than half the total duration of the corresponding scan.
This design is suspiciously short.

		/sub-s648/ses-2/func/sub-s648_ses-2_task-ANT_run-1_bold.nii.gz

	Please visit https://neurostars.org/search?q=SUSPICIOUSLY_SHORT_EVENT_DESIGN for existing conversations about this issue.

	[ERROR] NOT_INCLUDED Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder.
		/storage_report_sub-s573.txt
		/storage_report_sub-s618.txt

		53 more files with the same issue

	Please visit https://neurostars.org/search?q=NOT_INCLUDED for existing conversations about this issue.

	[ERROR] NIFTI_HEADER_UNREADABLE We were unable to parse header data from this NIfTI file. Please ensure it is not corrupted or mislabeled.
		/sub-s637/ses-2/anat/sub-s637_ses-2_T1w.nii.gz
		/sub-s637/ses-2/anat/sub-s637_ses-2_T2w.nii.gz

		1073 more files with the same issue

	Please visit https://neurostars.org/search?q=NIFTI_HEADER_UNREADABLE for existing conversations about this issue.


          Summary:                           Available Tasks:          Available Modalities:
          6573 Files, 1.37 TB                motorSelectiveStop        MRI                  
          110 - Subjects 4 - Sessions        twoByTwo                                       
                                             DPX                                            
                                             stopSignal                                     
                                             discountFix                                    
                                             ANT                                            
                                             rest                                           
                                             surveyMedley                                   
                                             CCTHot                                         
                                             stroop                                         
                                             WATT3                                          

	If you have any questions, please post on https://neurostars.org/tags/bids.

Relevant log outputs (up to 20 lines):

Node n4 failed to run on host gpu01.cluster.
Traceback (most recent call last):
  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node n4.

Cmdline:
	N4BiasFieldCorrection -d 3 --input-image /mnt/fmriprep_work_sub/sub-s648/fmriprep_24_1_wf/sub_s648_wf/bold_ses_1_task_DPX_run_1_wf/bold_fit_wf/unwarp_wf/brainextraction_wf/clipper_pre/clipped.nii.gz --convergence [ 50x50x50x50x50, 1e-07 ] --output clipped_corrected.nii.gz --shrink-factor 4
Stdout:

Stderr:
	Exception caught during reference file reading 

	itk::ExceptionObject (0x560fcb626200)
	Location: "unknown" 
	File: /home/conda/feedstock_root/build_artifacts/libitk_1717078286681/work/Modules/IO/NIFTI/src/itkNiftiImageIO.cxx
	Line: 2124
	Description: ITK ERROR: ITK only supports orthonormal direction cosines.  No orthonormal definition found!

	 file /mnt/fmriprep_work_sub/sub-s648/fmriprep_24_1_wf/sub_s648_wf/bold_ses_1_task_DPX_run_1_wf/bold_fit_wf/unwarp_wf/brainextraction_wf/clipper_pre/clipped.nii.gz
	Segmentation fault (core dumped)
Traceback:
	Traceback (most recent call last):
	  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/interfaces/base/core.py", line 453, in aggregate_outputs
	    setattr(outputs, key, val)
	  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/interfaces/base/traits_extension.py", line 330, in validate
	    value = super(File, self).validate(objekt, name, value, return_pathlike=True)
	            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/nipype/interfaces/base/traits_extension.py", line 135, in validate
	    self.error(objekt, name, str(value))
	  File "/opt/conda/envs/fmriprep/lib/python3.11/site-packages/traits/base_trait_handler.py", line 74, in error
	    raise TraitError(
	traits.trait_errors.TraitError: The 'output_image' trait of a N4BiasFieldCorrectionOutputSpec instance must be a pathlike object or string representing an existing file, but a value of '/mnt/fmriprep_work_sub/sub-s648/fmriprep_24_1_wf/sub_s648_wf/bold_ses_1_task_DPX_run_1_wf/bold_fit_wf/unwarp_wf/brainextraction_wf/n4/clipped_corrected.nii.gz' <class 'str'> was specified.
...
FileNotFoundError: No such file or directory '/mnt/fmriprep_work_sub/sub-s648/fmriprep_24_1_wf/sub_s648_wf/bold_ses_1_task_DPX_run_1_wf/bold_fit_wf/unwarp_wf/brainextraction_wf/n4/clipped_corrected.nii.gz' for output 'output_image' of a N4BiasFieldCorrection interface

Screenshots / relevant information:


Hi @ulas_ayyilmaz and welcome to neurostars!

Can you run seff $JOBID on those failing jobs and see what the memory usage was? Also, can you try rerunning with a fresh working directory? Or you can use --clean-workdir if every subject has their own working directory folder (do NOT use it if everyone has the same workdir). Is there anything potentially different about the failing subjects (e.g., number of scans)?

Best,
Steven

Can you share your T1w image? This is relevant to ITK issue #4839, “Support specification of Nifti sform tolerance via environmental variable” (InsightSoftwareConsortium/ITK on GitHub).

Does this dataset validate on its own? It could be a filesystem permissions issue, if the file can be seen but not loaded.

Hi, thank you!

I tried to attach the T1w image (nii.gz, 24 MB) for sub-s648, whose fMRIPrep run failed with the same error, but the chat didn’t allow sharing a file with that extension. Below is the header of the T1w image file:

<class 'nibabel.nifti1.Nifti1Header'> object, endian='<'
sizeof_hdr      : 348
data_type       : b''
db_name         : b''
extents         : 0
session_error   : 0
regular         : b''
dim_info        : 57
dim             : [  3 256 256 186   1   1   1   1]
intent_p1       : 0.0
intent_p2       : 0.0
intent_p3       : 0.0
intent_code     : none
datatype        : int16
bitpix          : 16
slice_start     : 0
pixdim          : [1.      0.8984  0.8984  0.9001  0.00724 1.      1.      1.     ]
vox_offset      : 0.0
scl_slope       : nan
scl_inter       : nan
slice_end       : 185
slice_code      : unknown
xyzt_units      : 10
cal_max         : 3959.0
cal_min         : 112.0
slice_duration  : nan
toffset         : 0.0
glmax           : 0
glmin           : 0
descrip         : b'te=2.78;ti=450;fa=12;ec=0.0000;acq=[0,0];mt=0;rp=1.7;rs=1.0'
aux_file        : b''
qform_code      : scanner
sform_code      : scanner
quatern_b       : -0.70710677
quatern_c       : 0.0
quatern_d       : 0.70710677
qoffset_x       : 78.5498
qoffset_y       : 130.551
qoffset_z       : 153.449
srow_x          : [ 0.      0.     -0.9001 78.5498]
srow_y          : [  0.      -0.8984   0.     130.551 ]
srow_z          : [ -0.8984   0.       0.     153.449 ]
intent_name     : b''
magic           : b'n+1'

Also, regarding the bids-validator error: the error thrown for /sub-s637/ses-2/anat/sub-s637_ses-2_T1w.nii.gz (along with 1000+ other files) applies both to subjects who successfully completed fMRIPrep and to subjects who failed. Since the validator throws this error even for subjects who complete fMRIPrep without a problem, I assume the dataset validates on its own. I also run with "--skip_bids_validation".

Is there anything else I can share? I can’t really make sense of the header data; do you think the missing orthonormal definition mentioned in the error is visible in the header of the T1w file?

Thanks,
Ulas
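
For what it's worth, the sform rows in the header above can be checked by hand. The sketch below uses only the Python standard library and assumes that "orthonormal direction cosines" means the three sform columns, once scaled to unit length, are mutually perpendicular. The helper name `direction_cosine_error` and the 1e-4 tolerance are illustrative choices, not ITK's actual implementation or threshold.

```python
import math

# sform rows copied from the T1w header posted above (srow_x, srow_y, srow_z).
SROW = [
    [0.0, 0.0, -0.9001, 78.5498],
    [0.0, -0.8984, 0.0, 130.551],
    [-0.8984, 0.0, 0.0, 153.449],
]


def direction_cosine_error(srow):
    """Return the largest deviation of the normalized 3x3 block from orthonormality.

    Hypothetical helper: each column of the 3x3 block is scaled to unit
    length, then every pairwise dot product is compared with the identity.
    """
    cols = [[srow[r][c] for r in range(3)] for c in range(3)]
    unit = []
    for col in cols:
        norm = math.sqrt(sum(v * v for v in col))
        if norm == 0.0:
            return math.inf  # a zero-length axis can never be orthonormal
        unit.append([v / norm for v in col])
    worst = 0.0
    for i in range(3):
        for j in range(3):
            dot = sum(a * b for a, b in zip(unit[i], unit[j]))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst


if __name__ == "__main__":
    err = direction_cosine_error(SROW)
    print(f"max deviation from orthonormality: {err:.2e}")
    print("orthonormal within 1e-4" if err < 1e-4 else "NOT orthonormal within 1e-4")
```

Each column of this particular sform has a single non-zero entry, so the axes are perpendicular and the T1w header itself looks clean. If that reading is right, the file worth inspecting is the one ITK actually complained about (clipped.nii.gz in the BOLD unwarp workflow) and the BOLD and fieldmap images it is derived from, rather than the T1w.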

Hi Steven! Nothing is distinctly different about the failing subjects; most subjects have 2 sessions and 10 tasks. I tried to identify patterns in the failures, but really couldn’t. The one pattern I could identify was that when I first ran my code across all subjects, approximately 30 threw a NodeExecutionError and another 30 or so threw a datalad get error. (I was originally running everything as a pipeline of datalad get, fMRIPrep processing, datalad drop.)

Then I checked the errors thrown by the datalad get failures (the ones that didn’t even reach the fMRIPrep step). However, I realized that for all of these subjects the data had actually been downloaded to the .git/annex/objects folder. In the original dataset directory, every file is a symbolic link, and for the subjects that threw the following datalad get error I could find every file the symlinks point to. Because I saw that the data was actually downloaded, I disregarded these errors:

get(error): sub-s557/ses-1/func/sub-s557_ses-1_task-motorSelectiveStop_run-1_recording-cardiac_physio.tsv.gz (file) [S3 bucket does not allow public access; Set both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to use S3
git-annex: .git/annex/othertmp/66d_67b_SHA256E-s87488--85c0d9c263fc8458d406a4793c36038423ff03e12bdcb5cc88c1b7044677bdb2.tsv.gz.log: openFile: failed (Remote I/O error)]
get(error): sub-s557/ses-1/func/sub-s557_ses-x1_task-motorSelectiveStop_run-1_recording-respiratory_physio.tsv.gz (file) [S3 bucket does not allow public access; Set both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to use S3
git-annex: .git/annex/othertmp/8fd_5a6_SHA256E-s35546--e989cad766059ec170b4d0a1a746228cfea50f4d024795c36ef12626c314a69f.tsv.gz.log: openFile: failed (Remote I/O error)]

But then when I ran fMRIPrep on all the subjects who initially failed datalad get, 100% of them failed with the NodeExecutionError - TraitError - FileNotFoundError pattern in node N4. I am not sure whether this access error is related to my actual error.
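
One way to double-check that every annexed file for a subject really resolves to content is to walk the subject folder and list symlinks whose target is missing. This is a minimal sketch under that assumption; `find_unresolved_links` is a hypothetical helper name, and the demo runs on a throwaway directory rather than the real dataset.

```python
import os
import tempfile


def find_unresolved_links(root):
    """Return sorted paths under `root` that are symlinks with a missing target."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


if __name__ == "__main__":
    # Throwaway demo tree; for real use, pass a subject folder instead,
    # e.g. the sub-s557 directory inside the dataset.
    with tempfile.TemporaryDirectory() as tmp:
        real = os.path.join(tmp, "content.tsv.gz")
        with open(real, "wb") as fh:
            fh.write(b"data")
        os.symlink(real, os.path.join(tmp, "good_link.tsv.gz"))
        os.symlink(os.path.join(tmp, "missing"), os.path.join(tmp, "dangling_link.tsv.gz"))
        for path in find_unresolved_links(tmp):
            print("unresolved:", os.path.basename(path))
```

This only confirms that the link targets exist. It does not detect truncated or corrupted downloads, for which a checksum verification such as `git annex fsck` would be the more thorough check.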

Also, here is what seff $JOBID returns:

for a specific failed task

seff 5298_10
Job ID: 5308
Array Job ID: 5298_10
Cluster: cluster
User/Group: ibrayyilmaz/kdis-hpc
State: FAILED (exit code 1)
Nodes: 1
Cores per node: 10
CPU Utilized: 00:05:32
CPU Efficiency: 1.22% of 07:32:20 core-walltime
Job Wall-clock time: 00:45:14
Memory Utilized: 26.66 GB
Memory Efficiency: 53.31% of 50.00 GB

for the last failed task:

Job ID: 5298
Array Job ID: 5298_53
Cluster: cluster
User/Group: ibrayyilmaz/kdis-hpc
State: FAILED (exit code 1)
Nodes: 1
Cores per node: 10
CPU Utilized: 00:04:48
CPU Efficiency: 1.30% of 06:09:50 core-walltime
Job Wall-clock time: 00:36:59
Memory Utilized: 26.18 GB
Memory Efficiency: 52.36% of 50.00 GB

I will try rerunning with --clean-workdir and report back!

Thank you!

I ran a sample with --clean-workdir and still got the same NodeExecutionError - TraitError - FileNotFoundError in node N4.
Thanks

Ah, sorry, I saw the N4 and thought this was an anatomical workflow bug we’d recently been looking at. I’ve pulled that subject and started a run, and I’ll see if I can reproduce this.

I was not able to reproduce this, so I think rerunning with a fresh working directory is the right move. Incidentally, do you still have this file?

Hi! In the script I shared earlier, I created a parent directory “fmriprep_work_sub” in my lab home directory, then created subject-specific work directories inside it. When I ran all the failed subjects this way (with a fresh subject-specific working directory in fmriprep_work_sub), every subject failed with the same NodeExecutionError. And I still have the file:

/mnt/fmriprep_work_sub/sub-s648/fmriprep_24_1_wf/sub_s648_wf/bold_ses_1_task_DPX_run_1_wf/bold_fit_wf/unwarp_wf/brainextraction_wf/clipper_pre/clipped.nii.gz

Would you have any suggestions on how to move forward? Thank you.
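
Since that file still exists, its header can be inspected directly; it is the file ITK reported it could not read. The following is a minimal sketch using only the Python standard library, assuming the file is a single-file NIfTI-1 image. The function name `read_nifti1_orientation` and the script name in the usage comment are made up for illustration; nibabel, which produced the T1w header dump earlier in the thread, exposes the same fields.

```python
import gzip
import struct
import sys


def read_nifti1_orientation(path):
    """Return the orientation-related fields of a NIfTI-1 header as a dict.

    Byte offsets follow the 348-byte NIfTI-1 header layout. Only single-file
    .nii / .nii.gz images are handled; NIfTI-2 is out of scope for this sketch.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as fh:
        hdr = fh.read(348)
    if len(hdr) < 348:
        raise ValueError("file is too short to contain a NIfTI-1 header")

    # sizeof_hdr must read as 348; if it only does so byte-swapped, the file is big-endian.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", hdr[0:4])[0] == 348:
            break
    else:
        raise ValueError("sizeof_hdr is not 348; not a NIfTI-1 header")

    qform_code, sform_code = struct.unpack(endian + "2h", hdr[252:256])
    return {
        "pixdim": struct.unpack(endian + "8f", hdr[76:108]),
        "qform_code": qform_code,
        "sform_code": sform_code,
        "quatern_bcd": struct.unpack(endian + "3f", hdr[256:268]),
        "srow": [struct.unpack(endian + "4f", hdr[o:o + 16]) for o in (280, 296, 312)],
    }


if __name__ == "__main__":
    # Usage: python nifti_orientation.py file1.nii.gz [file2.nii.gz ...]
    for path in sys.argv[1:]:
        print(path)
        for key, value in read_nifti1_orientation(path).items():
            print(f"  {key}: {value}")
```

Outside the container, the /mnt prefix in the path above corresponds to /hopper/groups/enkavilab, per the bind mount in the job script. Comparing the srow rows and the qform/sform codes of clipped.nii.gz with those of the raw BOLD and fieldmap images, for one failing and one succeeding subject, might show whether a skewed or degenerate affine is what ITK is rejecting.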