QSIPrep - RuntimeError: Robust spatial normalization failed after 3 retries

Summary of what happened:

I ran QSIPrep 1.0.1 in a Conda environment using SLURM. The pipeline fails during the nonlinear normalization step (the anat_nlin_normalization node of the anatomical preprocessing workflow). After three retries of antsRegistrationSyN.sh, the ANTs registration fails, raising a “Robust spatial normalization failed” error. Although this error occurred while using Python 3.10, I previously encountered the same error with Python 3.9 and earlier versions of Conda.

Command used:

# This code is wrapped as a SLURM job for individual subjects.
singularity run \
  -B "${BIDS_DIR}:/data:ro" \
  -B "${OUT_DIR}:/out" \
  -B "${FS_LICENSE}:/license.txt:ro" \
  -B "${TEMPLATEFLOW_HOME}:${TEMPLATEFLOW_HOME}" \
  --env "TEMPLATEFLOW_HOME=${TEMPLATEFLOW_HOME}" \
  "${SIF}" \
    /data /out participant \
      --participant-label "${SUBJ}" \
      --fs-license-file /license.txt \
      -w "${PROJ_DIR}/workDir_new" \
      --nprocs "${SLURM_CPUS_PER_TASK}" \
      --mem 64GB \
      --skip-bids-validation \
      --use-syn-sdc \
      --output-resolution 2.2

Version:

conda → 25.3.0
Python → 3.10.18
qsiprep → 1.0.1
nipype → 1.9.1
dipy → 1.8.0
NumPy → 1.26.3
(Others can be traced via given .yml file)

Environment (Docker, Singularity / Apptainer, custom installation):

As explained above, I run the pipeline from a Singularity image inside an Anaconda environment, and process each subject as a separate SLURM job.
Here is the link to the .yml file to recreate the environment: Box

Data was converted with heudiconv and dcm2niix into a BIDS-compliant format:

# Here is the 'tree' output:

Data_dir/qsiPrep_sample/
├── CHANGES
├── dataset_description.json
├── participants.json
├── participants.tsv
├── README
├── scans.json
├── sub-192037
│   └── ses-01
│       ├── anat
│       │   ├── sub-192037_ses-01_run-001_FLAIR.json
│       │   ├── sub-192037_ses-01_run-001_FLAIR.nii.gz
│       │   ├── sub-192037_ses-01_run-001_T1w.json
│       │   ├── sub-192037_ses-01_run-001_T1w.nii.gz
│       │   ├── sub-192037_ses-01_run-001_T2w.json
│       │   └── sub-192037_ses-01_run-001_T2w.nii.gz
│       ├── dwi
│       │   ├── sub-192037_ses-01_run-001_dwi.bval
│       │   ├── sub-192037_ses-01_run-001_dwi.bvec
│       │   ├── sub-192037_ses-01_run-001_dwi.json
│       │   └── sub-192037_ses-01_run-001_dwi.nii.gz
│       ├── func
│       │   ├── sub-192037_ses-01_task-rest_run-001_bold.json
│       │   ├── sub-192037_ses-01_task-rest_run-001_bold.nii.gz
│       │   └── sub-192037_ses-01_task-rest_run-001_events.tsv
│       ├── perf
│       │   ├── sub-192037_ses-01_task-asl_run-001_asl.json
│       │   └── sub-192037_ses-01_task-asl_run-001_asl.nii.gz
│       └── sub-192037_ses-01_scans.tsv
└── task-rest_bold.json

Relevant log outputs (crash report):

Node inputs:

compress_report = auto
explicit_masking = True
flavor = precise
float = True
initial_moving_transform = <undefined>
lesion_mask = <undefined>
moving = T1w
moving_image = <undefined>
moving_mask = <undefined>
num_threads = 3
orientation = LPS
out_report = report.svg
reference = T1w
reference_image = <undefined>
reference_mask = <undefined>
settings = <undefined>
template = MNI152NLin2009cAsym
template_resolution = <undefined>
template_spec = <undefined>

Traceback (most recent call last):
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 66, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 525, in run
    result = self._run_interface(execute=True)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 643, in _run_interface
    return self._run_command(execute)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 769, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node anat_nlin_normalization.

Traceback:
	Traceback (most recent call last):
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/interfaces/base/core.py", line 401, in run
	    runtime = self._run_interface(runtime)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/niworkflows/interfaces/norm.py", line 245, in _run_interface
	    raise RuntimeError(
	RuntimeError: Robust spatial normalization failed after 3 retries.

Outputs generated so far in the Protocols folder:

Protocols/qsiPrep_sample_py310/
├── dataset_description.json
├── logs
│   ├── CITATION.bib
│   ├── CITATION.html
│   ├── CITATION.md
│   └── CITATION.tex
├── sub-192037
│   ├── anat
│   │   ├── sub-192037_from-ACPC_to-anat_mode-image_xfm.mat
│   │   ├── sub-192037_from-anat_to-ACPC_mode-image_xfm.mat
│   │   ├── sub-192037_space-ACPC_desc-aseg_dseg.nii.gz
│   │   ├── sub-192037_space-ACPC_desc-brain_mask.nii.gz
│   │   ├── sub-192037_space-ACPC_desc-preproc_T1w.json
│   │   ├── sub-192037_space-ACPC_desc-preproc_T1w.nii.gz
│   │   └── sub-192037_space-ACPC_dseg.nii.gz
│   ├── figures
│   │   ├── sub-192037_about.html
│   │   └── sub-192037_summary.html
│   ├── log
│   │   └── 20250617-163130_3f7946a4-431d-4d58-b0a6-5f03e5683085
│   │       ├── crash-20250617-232651-dppos-anat_nlin_normalization-8e7f2543-d739-4a1f-b444-50ce6d9fb180.txt
│   │       └── qsiprep.toml
│   └── ses-01
│       ├── anat
│       │   └── sub-192037_ses-01_run-001_from-orig_to-anat_mode-image_xfm.txt
│       ├── dwi
│       │   ├── sub-192037_ses-01_run-001_desc-confounds_timeseries.tsv
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-brain_mask.nii.gz
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-image_qc.tsv
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-preproc_dwi.b
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-preproc_dwi.b_table.txt
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-preproc_dwi.bval
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-preproc_dwi.bvec
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-preproc_dwi.json
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-preproc_dwi.nii.gz
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_desc-slice_qc.json
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_dwiref.nii.gz
│       │   ├── sub-192037_ses-01_run-001_space-ACPC_model-eddy_stat-cnr_dwimap.json
│       │   └── sub-192037_ses-01_run-001_space-ACPC_model-eddy_stat-cnr_dwimap.nii.gz
│       └── figures
│           ├── sub-192037_ses-01_conform.html
│           ├── sub-192037_ses-01_desc-seg_mask.svg
│           ├── sub-192037_ses-01_run-001_desc-biascorrpost_dwi.svg
│           ├── sub-192037_ses-01_run-001_desc-carpetplot_dwi.svg
│           ├── sub-192037_ses-01_run-001_desc-coreg_dwi.svg
│           ├── sub-192037_ses-01_run-001_desc-denoising_dwi.svg
│           ├── sub-192037_ses-01_run-001_desc-resampled_b0ref.svg
│           ├── sub-192037_ses-01_run-001_desc-samplingscheme_dwi.gif
│           └── sub-192037_ses-01_run-001_desc-summary_dwi.html
└── sub-192037.html

Please let me know if you require any other details. Thank you!


Hi @atharvakarnik and welcome to neurostars!

I am not sure why this would matter. Your local anaconda environment shouldn’t be accessed when running a container.

You should add -e and --writable-tmpfs to your singularity run preamble.

I don’t think QSIPrep recognizes the GB suffix. As noted in the documentation, --mem is expected to be in megabytes.

$PROJ_DIR does not appear to be mounted in your singularity run preamble.
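Putting those suggestions together, the preamble changes might look like the sketch below. This is only an illustration: the 64 GB figure comes from the original command, the flag names are Steven's suggestions, and the commented-out singularity lines are placeholders rather than a complete invocation.

```shell
# Convert the memory budget to a plain megabyte figure, in case --mem
# does not parse a "GB" suffix (assumption: --mem expects megabytes).
MEM_GB=64
MEM_MB=$((MEM_GB * 1024))   # 65536

# Sketch of the adjusted preamble (existing -B mounts unchanged):
# singularity run \
#   -e --writable-tmpfs \
#   -B "${PROJ_DIR}:${PROJ_DIR}" \
#   ... \
#   "${SIF}" /data /out participant \
#     --mem "${MEM_MB}" \
#     ...
echo "--mem ${MEM_MB}"
```

Binding `${PROJ_DIR}` matters here because the working directory (`-w "${PROJ_DIR}/workDir_new"`) lives under it; without the mount, the container cannot see that path.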

Is this error subject specific or across the entire dataset?

Just in case can you share the entire SLURM script?

Best,
Steven

Hi @Steven,

Thank you for your reply.

About your comment on previous Python and Anaconda versions - I read somewhere that QSIPrep (and other neuroimaging helper packages) work better with a natively built Python 3.10, and I was previously using Conda 4.10.3, which is built on Python 3.9. I just thought I would clarify this to rule out dependency issues from that side.

Moving on to your question about the error - yes, it occurs across the entire dataset. Given this, I created a sample dataset with a single subject (for quicker turnaround) to test whether the tweaked code works.

Here is the current version of the entire SLURM script:

#!/bin/bash
#SBATCH --partition=all
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=64G
#SBATCH --time=12:00:00
#SBATCH --output=logs/%j_qsiprep.log

PROJ_DIR="/cbica/projects/DPPOS/Pipelines/QSIPrep"
BIDS_DIR="${PROJ_DIR}/Data_dir/qsiPrep_sample"
OUT_DIR="${PROJ_DIR}/Protocols/qsiPrep_sample_py310"
SIF="${PROJ_DIR}/qsiprep-latest.sif"
FS_LICENSE="${PROJ_DIR}/license.txt"

mkdir -pv "${OUT_DIR}"

export TEMPLATEFLOW_HOME="/cbica/projects/DPPOS/.cache/templateflow"

export PATH="$HOME/miniforge3/bin:$PATH"
hash -r
eval "$(conda shell.bash hook)" # Mini conda activation instead of init and sourcing bashrc
CONDA_ENV_NAME="qsiprep-py310"
mamba activate "${CONDA_ENV_NAME}"

if [[ -z "${CONDA_PREFIX:-}" ]]; then
    echo "ERROR: Could not activate Conda env '${CONDA_ENV_NAME}'. Exiting."
    exit 1
fi
echo "Conda environment '${CONDA_ENV_NAME}' activated. CONDA_PREFIX=${CONDA_PREFIX}"

SUBJ="${1}"
if [ -z "${SUBJ}" ]; then
  echo "ERROR: No subject label provided."
  echo "Usage: sbatch qsiprep_dwi.sbatch <label>"
  exit 1
fi

singularity run \
  -B "${BIDS_DIR}:/data:ro" \
  -B "${OUT_DIR}:/out" \
  -B "${FS_LICENSE}:/license.txt:ro" \
  -B "${TEMPLATEFLOW_HOME}:${TEMPLATEFLOW_HOME}" \
  --env "TEMPLATEFLOW_HOME=${TEMPLATEFLOW_HOME}" \
  "${SIF}" \
    /data /out participant \
      --participant-label "${SUBJ}" \
      --fs-license-file /license.txt \
      -w "${PROJ_DIR}/workDir_new" \
      --nprocs "${SLURM_CPUS_PER_TASK}" \
      --mem 64GB \
      --skip-bids-validation \
      --use-syn-sdc \
      --output-resolution 2.2

The above script is wrapped inside this for loop, which submits a job for each subject in the BIDS directory:

for subj_path in "${BIDS_DIR}"/sub-*; do
  if [ -d "${subj_path}" ]; then
    subj=$(basename "${subj_path}")
    label=${subj#sub-}
    echo "-> Submitting QSIPrep job for ${subj} (label=${label})"
    sbatch "${SBATCH_SCRIPT}" "${label}"
  fi
done

Let me know if you need more.

Thanks again,
Atharva

Hi @atharvakarnik,

Please make the changes I recommended and try again. Just in case, make a new working directory when you retry.

Also, unrelated, but I’d recommend submitting the jobs as an sbatch array instead of as individual jobs.
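A minimal sketch of the array approach: write one subject label per line to a file, then let each array task pick its line via `SLURM_ARRAY_TASK_ID`. The demo subject folders and the `labels.txt` filename are made up for illustration; the actual `sbatch` call (commented out) only works on the cluster.

```shell
# Demo BIDS root with two fake subjects (placeholders for illustration).
BIDS_DIR=$(mktemp -d)
mkdir -p "${BIDS_DIR}/sub-192037" "${BIDS_DIR}/sub-192038"

# Build the label list once, outside the job.
LABELS_FILE="${BIDS_DIR}/labels.txt"
for subj_path in "${BIDS_DIR}"/sub-*/; do
  subj=$(basename "${subj_path}")
  echo "${subj#sub-}"        # strip the "sub-" prefix, keep the label
done > "${LABELS_FILE}"

# Submit one array task per line (run on the cluster, not here):
# sbatch --array=1-$(wc -l < "${LABELS_FILE}") qsiprep_dwi.sbatch

# Inside the sbatch script, each task reads its own label (1-based):
SLURM_ARRAY_TASK_ID="${SLURM_ARRAY_TASK_ID:-1}"
SUBJ=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "${LABELS_FILE}")
echo "This task would process sub-${SUBJ}"
```

Compared with the submit loop above, this keeps all tasks under one job ID and lets the scheduler throttle them (e.g. `--array=1-100%10` to cap concurrency).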

Best,
Steven