Having trouble running QSIPrep

Hi @gavkhar_a,

Yes, these errors are preventing QSIPrep from running.

_epi is not a BIDS-valid suffix for DWI.

You should check whether this is consistent across all subjects, and how large the discrepancy is. I recommend using the most up-to-date dcm2niix when converting subjects; the version you are using is 5 years old.
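
For reference, a quick way to confirm which converter version you have (and which version produced an already-converted dataset) could look like this; the JSON path pattern is just an example for your layout:

# Print the dcm2niix version on your PATH (recent releases support --version;
# older ones print it at the top of their usage text)
dcm2niix --version

# Check the version recorded in existing JSON sidecars
grep ConversionSoftwareVersion sub-*/ses-*/dwi/*_dwi.json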

DWI files do not have event files or task labels.

Best,
Steven

Ah got it! Thank you so so much Steven!

Best,
Gavkhar

Hi Steven, thanks again. QSIPrep ran for about an hour and exited with some errors. I attached the slurm file as well as the crash text files it output.

Do you have an insight into what might be the issue?

Also, it seems to be failing to grab the bval and bvec files, but I made sure they are there, so I am wondering why. Thanks in advance!

(also, Rachel says hi :slight_smile: )

slurm.txt (108.7 KB)
crash-20250820-142819-ga2541-get_template_image-27617b43-8cd8-4648-a21b-e4d05802cfcd.txt (5.5 KB)
crash-20250820-141417-ga2541-merge_dwis-b6e27e93-92f3-4f8e-8b5c-d8852171aac5.txt (4.7 KB)

Hi @gavkhar_a,

What is the current BIDS validator output and tree directory structure for this subject?
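
(If helpful, something along these lines will produce both; the paths are placeholders, and the Docker image is just one of several ways to run the validator, which also has web- and npm-based versions:)

# Directory tree for one subject
tree /path/to/bids/sub-XXXX

# BIDS validator via Docker
docker run --rm -v /path/to/bids:/data:ro bids/validator /data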

For this error:

nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node get_template_image.

Traceback:
	Traceback (most recent call last):
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connection.py", line 196, in _new_conn
	    sock = connection.create_connection(
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
	    raise err
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
	    sock.connect(sa)
	TimeoutError: [Errno 110] Connection timed out

	The above exception was the direct cause of the following exception:

	Traceback (most recent call last):
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
	    response = self._make_request(
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connectionpool.py", line 490, in _make_request
	    raise new_e
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
	    self._validate_conn(conn)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
	    conn.connect()
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connection.py", line 615, in connect
	    self.sock = sock = self._new_conn()
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connection.py", line 205, in _new_conn
	    raise ConnectTimeoutError(
	urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x1537cd0d0a60>, 'Connection to templateflow.s3.amazonaws.com timed out. (connect timeout=None)')

	The above exception was the direct cause of the following exception:

	Traceback (most recent call last):
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
	    resp = conn.urlopen(
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
	    retries = retries.increment(
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/urllib3/util/retry.py", line 519, in increment
	    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
	urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): Max retries exceeded with url: /tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-01_T1w.nii.gz (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x1537cd0d0a60>, 'Connection to templateflow.s3.amazonaws.com timed out. (connect timeout=None)'))

	During handling of the above exception, another exception occurred:

	Traceback (most recent call last):
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/interfaces/base/core.py", line 401, in run
	    runtime = self._run_interface(runtime)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/qsiprep/interfaces/anatomical.py", line 226, in _run_interface
	    template_path = get_template(
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/templateflow/conf/__init__.py", line 69, in wrapper
	    return func(*args, **kwargs)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/templateflow/api.py", line 145, in get
	    _s3_get(filepath)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/templateflow/api.py", line 299, in _s3_get
	    r = requests.get(url, stream=True)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/requests/api.py", line 73, in get
	    return request("get", url, params=params, **kwargs)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/requests/api.py", line 59, in request
	    return session.request(method=method, url=url, **kwargs)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
	    resp = self.send(prep, **send_kwargs)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
	    r = adapter.send(request, **kwargs)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/requests/adapters.py", line 688, in send
	    raise ConnectTimeout(e, request=request)
	requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): Max retries exceeded with url: /tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-01_T1w.nii.gz (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x1537cd0d0a60>, 'Connection to templateflow.s3.amazonaws.com timed out. (connect timeout=None)'))

It looks like your compute nodes do not have an internet connection, so your jobs fail when QSIPrep tries to download the anatomical templates from TemplateFlow. To get around this:

  1. install Datalad (conda install -c conda-forge datalad)
  2. clone the Templateflow dataset to somewhere on your cluster (datalad clone https://github.com/templateflow/templateflow.git)
  3. cd to your templateflow directory, and pull the necessary files from the cloud
# if you see no folders in the templateflow directory first run the below
datalad get -n *
# if you see folders then run the below
datalad get tpl-MNI152NLin2009cAsym
  4. In your QSIPrep script, add -B /path/to/your/templateflow:/templateflow to your apptainer run preamble. Also add export APPTAINERENV_TEMPLATEFLOW_HOME=/templateflow before your apptainer run command (see the sketch below this list).
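
A minimal sketch of what that preamble might look like (all paths and the image name are placeholders for your setup; keep whatever QSIPrep arguments you already use):

# Tell the containerized QSIPrep where to find the local TemplateFlow copy
export APPTAINERENV_TEMPLATEFLOW_HOME=/templateflow

apptainer run --cleanenv \
    -B /path/to/your/templateflow:/templateflow \
    -B /path/to/bids:/data:ro \
    -B /path/to/output:/out \
    /path/to/qsiprep.sif \
    /data /out participant \
    <your existing qsiprep arguments>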

For the other error:

Traceback (most recent call last):
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 66, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 525, in run
    result = self._run_interface(execute=True)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 643, in _run_interface
    return self._run_command(execute)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 769, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node merge_dwis.

Traceback:
	Traceback (most recent call last):
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/interfaces/base/core.py", line 401, in run
	    runtime = self._run_interface(runtime)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/qsiprep/interfaces/dwi_merge.py", line 76, in _run_interface
	    to_concat, b0_means, corrections = harmonize_b0s(
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/qsiprep/interfaces/dwi_merge.py", line 700, in harmonize_b0s
	    b0_mean = index_img(dwi_nii, b0_indices).get_fdata().mean()
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nilearn/image/image.py", line 669, in index_img
	    return _index_img(imgs, index)
	  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nilearn/_utils/niimg_conversions.py", line 77, in _index_img
	    img, _get_data(img)[:, :, :, index], img.affine, copy_header=True
	IndexError: index 4 is out of bounds for axis 3 with size 4

Can you print the bvals for this subject? I see the following line in the crash log:

scan_metadata = {'/data/sub-FACT130/ses-pre/dwi/sub-FACT130_ses-pre_dwi.nii.gz': {'AcquisitionMatrixPE': 128, 'AcquisitionNumber': 1, 'AcquisitionTime': '12:21:16.752500', 'BandwidthPerPixelPhaseEncode': 19.531, 'BaseResolution': 128, 'ConversionSoftware': 'dcm2niix', 'ConversionSoftwareVersion': 'v1.0.20180622 GCC6.3.0', 'DerivedVendorReportedEchoSpacing': 0.00080001, 'DeviceSerialNumber': '35469', 'DwellTime': 2.8e-06, 'EchoTime': 0.057, 'EffectiveEchoSpacing': 0.000400005, 'FlipAngle': 90, 'ImageOrientationPatientDICOM': [0.996543, 0.0807755, -0.0194357, -0.0824693, 0.990091, -0.113657], 'ImageType': ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'MOSAIC'], 'InPlanePhaseEncodingDirectionDICOM': 'COL', 'InstitutionAddress': 'Vassar_43_Cambridge_MA_US_02139', 'InstitutionName': 'McGovern_Brain_Institute', 'InstitutionalDepartmentName': 'Department', 'MRAcquisitionType': '2D', 'MagneticFieldStrength': 3, 'Manufacturer': 'Siemens', 'ManufacturersModelName': 'TrioTim', 'Modality': 'MR', 'ParallelReductionFactorInPlane': 2, 'PartialFourier': 0.75, 'PatientPosition': 'HFS', 'PercentPhaseFOV': 100, 'PhaseEncodingDirection': 'j-', 'PhaseEncodingSteps': 96, 'PhaseResolution': 1, 'PixelBandwidth': 1395, 'ProcedureStepDescription': 'INVESTIGATORS_Gabrieli', 'ProtocolName': 'TopUp_Distortion_Map_A-P-A_Diffusion', 'PulseSequenceDetails': '%CustomerSeq%_Andre_ep2d_se_dist_nav_map', 'ReceiveCoilName': '32Ch_Head', 'ReconMatrixPE': 128, 'RepetitionTime': 7.24, 'SAR': 0.212809, 'ScanOptions': 'PFP_FS', 'ScanningSequence': 'EP', 'SequenceName': 'epse2d1_128', 'SequenceVariant': 'SK_SP', 'SeriesDescription': 'TopUp_Distortion_Map_A-P-A_Diffusion', 'SeriesNumber': 17, 'ShimSetting': [6351, -28080, -24530, 440, -193, 187, 88, 113], 'SliceThickness': 2, 'SliceTiming': [3.635, 0, 3.7325, 0.0975, 3.83, 0.1975, 3.93, 0.295, 4.0275, 0.3925, 4.125, 0.49, 4.2225, 0.59, 4.3225, 0.6875, 4.42, 0.785, 4.5175, 0.885, 4.6175, 0.9825, 4.715, 1.08, 4.8125, 1.1775, 4.91, 1.2775, 5.01, 1.375, 5.1075, 1.4725, 5.205, 1.5725, 5.305, 1.67, 5.4025, 1.7675, 5.5, 1.865, 5.5975, 1.965, 5.6975, 2.0625, 5.795, 2.16, 5.8925, 2.26, 5.9925, 2.3575, 6.09, 2.455, 6.1875, 2.555, 6.285, 2.6525, 6.385, 2.75, 6.4825, 2.8475, 6.58, 2.9475, 6.68, 3.045, 6.7775, 3.1425, 6.875, 3.2425, 6.975, 3.34, 7.0725, 3.4375, 7.17, 3.535], 'SoftwareVersions': 'syngo_MR_B17', 'SpacingBetweenSlices': 2, 'StationName': 'MRC35469', 'TaskName': 'topupdwi', 'TotalReadoutTime': 0.0508007, 'TxRefAmp': 346.603}}

Based on that scan metadata, it looks like that is some topupdwi file? Is it possible you are inputting the wrong file into QSIPrep? Can you explain where those topup images came from? Were they something straight off the scanner? Are they just one volume?
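
(For reference, a quick way to check the volume count and b-values, assuming FSL is available and using the path from the crash log:)

# Number of volumes in the DWI series (dim4)
fslinfo /data/sub-FACT130/ses-pre/dwi/sub-FACT130_ses-pre_dwi.nii.gz | grep ^dim4

# Print the b-values alongside it
cat /data/sub-FACT130/ses-pre/dwi/sub-FACT130_ses-pre_dwi.bval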

Hi Rachel!

Best,
Steven

Thank you!

  1. Here is the BIDS tree folder for this subject. I renamed the topup runs (which are a type of fmap) to BIDS-appropriate names and moved them into the fmap folder (see screenshot) after making sure they were PA oriented (now I am not sure about this – please see #2 below).

  2. However, I am now not quite sure about the whole phase-encoding thing. If you look at the JSON file (i.e., the topup JSON I just renamed) of this fmap, it indicates j- encoding. This means PA encoding, correct? But it could be AP, as you previously said? I think I have confused myself at this point… (see also the quick check after this list)
    I was reading more about phase encoding and topup runs here: PhaseEncodingDirection "i" and "j-" in .json file

[screenshot of the subject's BIDS folder tree]

  3. Our HPC where we run our analyses is very strict with its space limits, so I was told that installing and running conda would take up too much space. Would downloading TemplateFlow myself and uploading it to a directory, without involving datalad, work OK? I know it's more manual work, but not impossible I guess?
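
(For reference on #2, the recorded directions can be compared across the DWI and fmap sidecars with something like the commands below; the file names follow the naming above:)

grep PhaseEncodingDirection sub-FACT130/ses-pre/dwi/sub-FACT130_ses-pre_dwi.json
grep PhaseEncodingDirection sub-FACT130/ses-pre/fmap/sub-FACT130_ses-pre_dir-PA_epi.json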

Thanks!
Gavkhar
sub-FACT130_ses-pre_dir-PA_epi.txt (2.9 KB)

Hi @gavkhar_a,

For the future, text is preferable to screenshots.

If you could provide the rest of the details requested in my last post (BIDS validator, bvals, information about the topupdwi file), that would help me debug.

You can use Miniforge, which provides only a minimal installation of conda/mamba.
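
A rough sketch of what that could look like on a Linux x86_64 login node (the installer name and URL may differ for your architecture, and the install location is a placeholder):

# Download and install Miniforge into your own space
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh -b -p $HOME/miniforge3
source $HOME/miniforge3/bin/activate

# Then install datalad
conda install -c conda-forge datalad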

Yes, but you would still need datalad at some point to retrieve the images from the cloud.

Best,
Steven

Hi @Steven, I ran the bids-validator on another subject just to check things, and these are the errors it is throwing at me. I can't understand where they are coming from, as everything looks fine to me folder-tree wise:

[WARNING] GZIP_HEADER_MTIME The gzip header contains a non-zero timestamp.
This may leak sensitive information or indicate a non-reproducible conversion process.

		/ses-pre/fmap/sub-FACT181_ses-pre_dir-AP_epi.nii.gz
		/ses-pre/func/sub-FACT181_ses-pre_task-story_run-01_bold.nii.gz

		4 more files with the same issue

	Please visit https://neurostars.org/search?q=GZIP_HEADER_MTIME for existing conversations about this issue.

	[WARNING] EVENTS_TSV_MISSING Task scans should have a corresponding 'events.tsv' file.
If this is a resting state scan you can ignore this warning or rename the task to include the word "rest".

		/ses-pre/func/sub-FACT181_ses-pre_task-story_run-01_bold.nii.gz

	Please visit https://neurostars.org/search?q=EVENTS_TSV_MISSING for existing conversations about this issue.

	[ERROR] MISSING_DATASET_DESCRIPTION A dataset_description.json file is required in the root of the dataset
		

	Please visit https://neurostars.org/search?q=MISSING_DATASET_DESCRIPTION for existing conversations about this issue.

	[ERROR] ALL_FILENAME_RULES_HAVE_ISSUES Multiple filename rules were found as potential matches. All of them had at least one issue during filename validation.
		/ses-pre/sub-FACT181_ses-pre_task-story_run-01_events.tsv - Rules that matched with issues: rules.files.raw.task.events, rules.files.raw.task.events__mri, rules.files.raw.task.events__motion, rules.files.raw.task.events__pet, rules.files.raw.task.events__mrs

	Please visit https://neurostars.org/search?q=ALL_FILENAME_RULES_HAVE_ISSUES for existing conversations about this issue.

	[ERROR] INVALID_LOCATION The file has a valid name, but is located in an invalid directory.
		/ses-pre/sub-FACT181_ses-pre_scans.tsv - Expected location: /sub-FACT181/ses-pre/
		/ses-pre/fmap/sub-FACT181_ses-pre_dir-AP_epi.nii.gz - Expected location: /sub-FACT181/ses-pre/

		15 more files with the same issue

	Please visit https://neurostars.org/search?q=INVALID_LOCATION for existing conversations about this issue.

	[ERROR] SCANS_FILENAME_NOT_MATCH_DATASET Filenames in scans.tsv file do not match what is present in the BIDS dataset.

		/ses-pre/sub-FACT181_ses-pre_scans.tsv

	Please visit https://neurostars.org/search?q=SCANS_FILENAME_NOT_MATCH_DATASET for existing conversations about this issue.


          Summary:                         Available Tasks:        Available Modalities:
          18 Files, 101 MB                 topupdwi                MRI                  
          1 - Subjects 1 - Sessions        story                                        
                                           rest                                         
                                           topuprest                                    

Here is the bids-tree for this subject:

login-3:sub-FACT181$ ls
ses-pre
login-3:sub-FACT181$ cd ses-pre
login-3:ses-pre$ ls
anat  dwi  fmap  func  sub-FACT181_ses-pre_scans.tsv  sub-FACT181_ses-pre_task-story_run-01_events.tsv
login-3:ses-pre$ cd func
login-3:func$ ls
sub-FACT181_ses-pre_task-rest_run-01_bold.json    sub-FACT181_ses-pre_task-story_run-01_bold.nii.gz
sub-FACT181_ses-pre_task-rest_run-01_bold.nii.gz  sub-FACT181_ses-pre_task-topuprest_run-01_bold.json
sub-FACT181_ses-pre_task-rest_run-01_events.tsv   sub-FACT181_ses-pre_task-topuprest_run-01_bold.nii.gz
sub-FACT181_ses-pre_task-story_run-01_bold.json   sub-FACT181_ses-pre_task-topuprest_run-01_events.tsv
login-3:func$ cd ..
login-3:ses-pre$ cd dwi
login-3:dwi$ ls
sub-FACT181_ses-pre_dwi.bval  sub-FACT181_ses-pre_dwi.bvec  sub-FACT181_ses-pre_dwi.json  sub-FACT181_ses-pre_dwi.nii.gz
login-3:dwi$ 

I am struggling to understand these errors, as I checked and everything seems to be located where the validator is asking for it. All other errors I was able to resolve, but this is where I am stuck.

Thanks much !!
Gavkhar

Hi @gavkhar_a,

It is hard to diagnose without knowing the contents of your scans.tsv file. But that file is optional and you might just be better off deleting it (or at least putting it somewhere else temporarily).

The events file does not belong in the session folder. Events files are only valid for certain data types (e.g., BOLD).

I see a topuprest file in the func folder. Are you sure that’s not supposed to be an fmap?

You need a dataset_description.json file in the root of your dataset.
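
If it helps, creating a minimal one could look like this (the Name value is a placeholder; adjust the path to your BIDS root):

cat > /path/to/bids/dataset_description.json <<'EOF'
{
  "Name": "FACT diffusion study",
  "BIDSVersion": "1.8.0",
  "DatasetType": "raw"
}
EOF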

Best,
Steven

Thanks @Steven! I put the scans.tsv file into a scan_tsv_files folder for now.
The dataset_description.json file exists in the main folder where the other subject folders are (here is the excerpt-screenshot):

I reran the validator and it is not complaining about the topuprest file anymore, but I guess in any case, could this now potentially impede QSIPrep…?

The only error that is left and does not make sense to me is this:

		[ERROR] INVALID_LOCATION The file has a valid name, but is located in an invalid directory.
		/ses-pre/sub-FACT181_ses-pre_dir-AP_epi.nii.gz - Expected location: /sub-FACT181/ses-pre/
		/ses-pre/sub-FACT181_ses-pre_dir-AP_epi.json - Expected location: /sub-FACT181/ses-pre/

because those files are already there:

login-3:fmap$ ls
login-3:fmap$ cd ..
login-3:ses-pre$ ls
anat  dwi  fmap  func  sub-FACT181_ses-pre_dir-AP_epi.json  sub-FACT181_ses-pre_dir-AP_epi.nii.gz
login-3:ses-pre$

And the fmap folder is now empty, so I am not sure how it will find fmaps for the dwi now.

Hi @gavkhar_a,

The scan_tsv_files folder should not be placed within the BIDS root, because it is not a valid folder name. You can add the folder to a .bidsignore file temporarily.
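
For example, from the BIDS root (folder name taken from your earlier post):

echo "scan_tsv_files/" >> .bidsignore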

They should not be there, as they are field maps, and should be in fmap.
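
A minimal sketch of that move, run from the ses-pre directory (file names taken from your listing above):

mv sub-FACT181_ses-pre_dir-AP_epi.nii.gz sub-FACT181_ses-pre_dir-AP_epi.json fmap/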

Best,
Steven

Hi @Steven, I’ve finally been able to run QSIPrep on one subject, and it has been running for almost 25 hours by now, which made me a bit concerned, so I looked at the slurm file. From what I could gather, it looks like the last update is from yesterday at around 12:14, and there is no print-out saying that QSIPrep is completed, etc.

Does this mean it is somehow stuck in a loop? Or could there be another issue at hand?
Slurm file attached. Thank you so so much for all your help – truly couldn’t have gone this far without your support, Steven!

Best,
Gavkhar
slurm-13637849.txt (71.3 KB)

Try increasing memory and CPU. I would also specify the memory and CPU arguments within QSIPrep as well as in the SBATCH header. When you run seff $JOBID with your last job ID, what does the output show?
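
A rough sketch of what that combination could look like (the resource numbers are only illustrative, the QSIPrep flag names can vary slightly between versions, so check qsiprep --help; paths and image name are placeholders):

#!/bin/bash
#SBATCH --job-name=qsiprep
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=50:00:00

apptainer run --cleanenv \
    -B /path/to/bids:/data:ro -B /path/to/output:/out \
    /path/to/qsiprep.sif \
    /data /out participant \
    --nthreads 8 --omp-nthreads 8 --mem-mb 30000 \
    <your other qsiprep arguments>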

this is what I got:

login-2:~$ seff $13637849
Job ID: 3637849
Array Job ID: 3637782_66
Cluster: zaratan
User/Group: asavel/zt-kempton-prj
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 00:07:47
CPU Efficiency: 96.89% of 00:08:02 core-walltime
Job Wall-clock time: 00:08:02
Memory Utilized: 595.92 MB
Memory Efficiency: 14.90% of 3.91 GB

And this is what I have in my script as of now:

#!/bin/bash
#SBATCH --job-name=qsiprep
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=6
#SBATCH --time=50:00:00
#SBATCH --mem=24000
#SBATCH --mail-type=ALL

That seff output does not match with your sbatch header. Are you sure you entered the right ID?

Oh I see, you typed it in with the $. You shouldn’t do that.

ohhh sorry about that:

login-1:~$ seff 13637849
Job ID: 13637849
Cluster: zaratan
User/Group: ga2541/zt-romeo-prj
State: RUNNING
Nodes: 1
Cores per node: 6
CPU Utilized: 00:00:00
CPU Efficiency: 0.00% of 7-05:36:30 core-walltime
Job Wall-clock time: 1-04:56:05
Memory Utilized: 0.00 MB (estimated maximum)
Memory Efficiency: 0.00% of 23.44 GB (23.44 GB/node)
WARNING: Efficiency statistics may be misleading for RUNNING jobs.

Looks like the job is still running; seff doesn’t work for running jobs.

Yes exactly, it is still running – it’s been 1 day and 10 hours as of right now… If I look at the slurm print-out, it looks like the last update was yesterday at around noon, so I guess I will request more memory and CPU as you said. Thank you!

	 [Node] Setting-up "qsiprep_1_0_wf.sub_FACT181_ses_pre_wf.dwi_preproc_ses_pre_wf.hmc_sdc_wf.extract_b0_series" in "/work/qsiprep_1_0_wf/sub_FACT181_ses_pre_wf/dwi_preproc_ses_pre_wf/hmc_sdc_wf/extract_b0_series".
250831-12:14:44,900 nipype.workflow INFO:
	 [Node] Executing "split_eddy_lps" <qsiprep.interfaces.images.SplitDWIsFSL>
250831-12:14:44,912 nipype.workflow INFO:
	 [Node] Executing "extract_b0_series" <qsiprep.interfaces.gradients.ExtractB0s>
login-2:scratch$

I am not sure, though, if I should rerun this specific subject again. I am planning on sending an array now that I know the script works. What CPU and memory would you recommend for an array of ~20 subjects?

You can cancel the job and then seff it.
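
For example (using the job ID from above):

scancel 13637849   # cancel the running job
seff 13637849      # seff then reports the final usage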