QSIPrep with CUDA 12.2

Summary of what happened:

Dear experts,

I tried to run qsiprep version 0.19.1 with the command line below. The program started successfully but then crashed with the CUDA-related error shown in the logs. Our server has CUDA 12.2 installed. Is there a solution for this problem by now? The data were acquired in line with the HCP protocols (AP and PA DWI images with spin-echo field maps, T1w and T2w), but for qsiprep I used only the T1w anatomical images.

After I downloaded the latest qsiprep version, 0.21.5, the script did not work at all (the anatomical preprocessing did not run).

Best
Ralf

Command used (and if a helper script was used, a link to the helper script or the command generated):

docker run --rm -it \
-v /opt_prg/freesurfer_741/license.txt:/opt/freesurfer/license.txt:ro \
-v /server/fo2-22/data/BrainStim_tDCS_3/MRT/BIDS/inputs:/data:ro \
-v /server/fo2-22/data/BrainStim_tDCS_3/MRT/BIDS/inputs/eddy_params.json:/sngl/eddy/eddy_config.json:ro \
-v /server/fo2-22/data/BrainStim_tDCS_3/MRT/BIDS/derivatives:/out \
-v /export/local/tmp_local:/scratch \
pennbbl/qsiprep:latest \
/data /out participant \
--eddy-config /sngl/eddy/eddy_config.json \
--skip-bids-validation \
--participant-label 5058 \
--output-resolution 1.2 \
--anat-modality T1w \
--distortion-group-merge average \
--hmc-model eddy \
--pepolar-method TOPUP \
--stop-on-first-crash \
-w /scratch

Version:

0.19.1 and 0.21.5

Environment (Docker, Singularity / Apptainer, custom installation):

Docker

Data formatted according to a validatable standard? Please provide the output of the validator:

PASTE VALIDATOR OUTPUT HERE

Relevant log outputs (up to 20 lines):

For version 0.19.1

Stderr:
	eddy_cuda: error while loading shared libraries: libcublas.so.10: cannot open shared object file: No such file or directory

For version 0.21.5

240506-08:38:59,982 nipype.utils WARNING:
	 No metadata was found in the pkl file. Make sure you are currently using the same Nipype version from the generated pkl.
240506-08:38:59,984 nipype.workflow WARNING:
	 Error while checking node hash, forcing re-run. Although this error may not prevent the workflow from running, it could indicate a major problem. Please report a new issue at https://github.com/nipy/nipype/issues adding the following information:

	Node: qsiprep_wf.single_subject_5058_wf.anat_preproc_wf.output_grid_wf.autobox_template
	Interface: nipype.interfaces.afni.utils.Autobox

Screenshots / relevant information:


Hi @Ralf, we just updated eddy (and eddy_cuda). Could you please try qsiprep version 0.21.4? The image ships with CUDA 11.1, which should work on your GPUs.

Hi @mattcieslak

First I deleted the image of the latest version and pulled version 0.21.4 (what was actually downloaded was 0.21.5 dev…). However, the script still crashes. The error does not seem to be related to the CUDA version. I used the same commands as before.

240506-13:43:46,966 nipype.workflow INFO:
	 Running with omp_nthreads=8, nthreads=128
240506-13:43:46,967 nipype.workflow IMPORTANT:
	 
    Running qsiprep version 0.21.5.dev0+g36b93fe.d20240504:
      * BIDS dataset path: /data.
      * Participant list: ['5058'].
      * Run identifier: 20240506-134314_c63db3f6-2cb4-4674-b5b4-0b88194d99cb.
    
240506-13:44:10,842 nipype.workflow INFO:
	 Running nonlinear normalization to template
240506-13:44:17,367 nipype.workflow INFO:
	 Combining all dwi files within each available session:
240506-13:44:17,367 nipype.workflow INFO:
	 	- 2 scans in session T1
240506-13:44:17,367 nipype.workflow INFO:
	 	- 2 scans in session T3
240506-13:44:17,367 nipype.workflow INFO:
	 	- 2 scans in session T2
240506-13:44:17,481 nipype.workflow INFO:
	 [{'dwi_series': ['/data/sub-5058/ses-T1/dwi/sub-5058_ses-T1_dir-PA_run-1_dwi.nii'], 'dwi_series_pedir': 'j', 'fieldmap_info': {'suffix': 'rpe_series', 'rpe_series': ['/data/sub-5058/ses-T1/dwi/sub-5058_ses-T1_dir-AP_run-1_dwi.nii']}, 'concatenated_bids_name': 'sub-5058_ses-T1_run-1'}, {'dwi_series': ['/data/sub-5058/ses-T3/dwi/sub-5058_ses-T3_dir-PA_run-1_dwi.nii'], 'dwi_series_pedir': 'j', 'fieldmap_info': {'suffix': 'rpe_series', 'rpe_series': ['/data/sub-5058/ses-T3/dwi/sub-5058_ses-T3_dir-AP_run-1_dwi.nii']}, 'concatenated_bids_name': 'sub-5058_ses-T3_run-1'}, {'dwi_series': ['/data/sub-5058/ses-T2/dwi/sub-5058_ses-T2_dir-PA_run-1_dwi.nii'], 'dwi_series_pedir': 'j', 'fieldmap_info': {'suffix': 'rpe_series', 'rpe_series': ['/data/sub-5058/ses-T2/dwi/sub-5058_ses-T2_dir-AP_run-1_dwi.nii']}, 'concatenated_bids_name': 'sub-5058_ses-T2_run-1'}]
240506-13:44:17,567 nipype.workflow IMPORTANT:
	 Creating dwi processing workflow "dwi_preproc_ses_T1_run_1_wf" to produce output sub-5058_ses-T1_run-1 (1.34 GB / 110 DWIs). Memory resampled/largemem=2.34/2.68 GB.
240506-13:44:17,568 nipype.workflow INFO:
	 Automatically using 5, 5, 5 window for dwidenoise
240506-13:44:18,815 nipype.workflow INFO:
	 Using 8 threads in eddy
240506-13:44:18,832 nipype.workflow INFO:
	 Using single-stage SDC, TOPUP-only
240506-13:44:18,901 nipype.workflow IMPORTANT:
	 Creating dwi processing workflow "dwi_preproc_ses_T3_run_1_wf" to produce output sub-5058_ses-T3_run-1 (1.34 GB / 110 DWIs). Memory resampled/largemem=2.34/2.68 GB.
240506-13:44:18,903 nipype.workflow INFO:
	 Automatically using 5, 5, 5 window for dwidenoise
240506-13:44:19,281 nipype.workflow INFO:
	 Using 8 threads in eddy
240506-13:44:19,297 nipype.workflow INFO:
	 Using single-stage SDC, TOPUP-only
240506-13:44:19,361 nipype.workflow IMPORTANT:
	 Creating dwi processing workflow "dwi_preproc_ses_T2_run_1_wf" to produce output sub-5058_ses-T2_run-1 (1.34 GB / 110 DWIs). Memory resampled/largemem=2.34/2.68 GB.
240506-13:44:19,363 nipype.workflow INFO:
	 Automatically using 5, 5, 5 window for dwidenoise
240506-13:44:19,743 nipype.workflow INFO:
	 Using 8 threads in eddy
240506-13:44:19,759 nipype.workflow INFO:
	 Using single-stage SDC, TOPUP-only
240506-13:44:30,48 nipype.workflow IMPORTANT:
	 Works derived from this qsiprep execution should include the following boilerplate:


Preprocessing was performed using *QSIPrep* 0.21.5.dev0+g36b93fe.d20240504,
which is based on *Nipype* 1.8.6
(@nipype1; @nipype2; RRID:SCR_002502).




[WARNING] This document format requires a nonempty <title> element.
  Please specify either 'title' or 'pagetitle' in the metadata.
  Falling back to 'CITATION'
240506-13:44:46,455 nipype.utils WARNING:
	 No metadata was found in the pkl file. Make sure you are currently using the same Nipype version from the generated pkl.
240506-13:44:46,458 nipype.workflow WARNING:
	 Error while checking node hash, forcing re-run. Although this error may not prevent the workflow from running, it could indicate a major problem. Please report a new issue at https://github.com/nipy/nipype/issues adding the following information:

	Node: qsiprep_wf.single_subject_5058_wf.anat_preproc_wf.output_grid_wf.autobox_template
	Interface: nipype.interfaces.afni.utils.Autobox
	Traceback:
Traceback (most recent call last):

Here is the output of the corresponding log file:

Node: qsiprep_wf.single_subject_5058_wf.anat_preproc_wf.anat_template_wf.anat_conform
Working directory: /scratch/qsiprep_wf/single_subject_5058_wf/anat_preproc_wf/anat_template_wf/anat_conform

Node inputs:

deoblique_header = True
in_file = <undefined>
target_shape = <undefined>
target_zooms = <undefined>

Traceback (most recent call last):
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 292, in _send_procs_to_workers
    num_subnodes = self.procs[jobid].num_subnodes()
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 1308, in num_subnodes
    self._get_inputs()
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 1322, in _get_inputs
    super(MapNode, self)._get_inputs()
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 590, in _get_inputs
    outputs = _load_resultfile(results_fname).outputs
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/pipeline/engine/utils.py", line 293, in load_resultfile
    result = loadpkl(results_file)
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/utils/filemanip.py", line 666, in loadpkl
    raise e
  File "/opt/conda/envs/qsiprep/lib/python3.10/site-packages/nipype/utils/filemanip.py", line 643, in loadpkl
    unpkl = pickle.loads(pkl_contents)
ModuleNotFoundError: No module named 'qsiprep.niworkflows'
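
One possible reading of this traceback, offered as an assumption rather than something confirmed in this thread: the working directory /scratch still holds pickled results from the 0.19.1 run, and those pickles refer to a module (qsiprep.niworkflows) that the newer image cannot import. If that is the case, giving each version its own working directory would avoid loading the stale results. A minimal sketch; the helper name work_dir_for_version and the qsiprep_work_ prefix are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: build a host-side work directory path that is
# unique per qsiprep version, so results pickled by one version are
# never picked up by another.
work_dir_for_version() {
    base=${1%/}    # strip one trailing slash, if present
    # Replace characters that are awkward in directory names with "_"
    safe_version=$(printf '%s' "$2" | tr -c 'A-Za-z0-9._-' '_')
    printf '%s/qsiprep_work_%s\n' "$base" "$safe_version"
}

# Example: print the directory that would be used for version 0.21.4
work_dir_for_version /export/local/tmp_local 0.21.4
```

The printed path could then replace /export/local/tmp_local in the `-v ...:/scratch` mount of the original command, after creating the directory with mkdir -p.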

Here is the log file from the run with version 0.19.1:

Node: qsiprep_wf.single_subject_5058_wf.dwi_preproc_ses_T1_run_1_wf.hmc_sdc_wf.eddy
Working directory: /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/eddy

Node inputs:

args = --ol_nstd=4
cnr_maps = True
dont_peas = False
dont_sep_offs_move = False
environ = {'FSLOUTPUTTYPE': 'NIFTI_GZ', 'OMP_NUM_THREADS': '8'}
estimate_move_by_susceptibility = True
fep = False
field = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/topup/fieldmap_HZ.nii.gz
field_mat = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/topup_to_eddy_reg/topup_reg_image_flirt.mat
flm = quadratic
fudge_factor = 10.0
fwhm = <undefined>
in_acqp = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/gather_inputs/eddy_acqp.txt
in_bval = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/pre_hmc_wf/rpe_concat/merge__merged.bval
in_bvec = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/pre_hmc_wf/rpe_concat/merge__merged.bvec
in_file = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/pre_hmc_wf/rpe_concat/merge__merged.nii.gz
in_index = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/gather_inputs/eddy_index.txt
in_mask = /scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/pre_eddy_b0_ref_wf/synthstrip_wf/mask_to_original_grid/topup_imain_corrected_avg_trans_mask_trans.nii.gz
in_topup_fieldcoef = <undefined>
in_topup_movpar = <undefined>
initrand = <undefined>
interp = spline
is_shelled = True
json = <undefined>
mbs_ksp = <undefined>
mbs_lambda = <undefined>
mbs_niter = <undefined>
method = jac
mporder = 10
multiband_factor = <undefined>
multiband_offset = <undefined>
niter = 5
num_threads = 8
nvoxhp = 1000
out_base = eddy_corrected
outlier_nstd = <undefined>
outlier_nvox = <undefined>
outlier_pos = <undefined>
outlier_sqr = <undefined>
outlier_type = <undefined>
output_type = NIFTI_GZ
repol = True
residuals = False
session = <undefined>
slice2vol_interp = <undefined>
slice2vol_lambda = <undefined>
slice2vol_niter = <undefined>
slice_order = <undefined>
slm = linear
use_cuda = True

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node eddy.

Cmdline:
	eddy_cuda --ol_nstd=4 --cnr_maps --estimate_move_by_susceptibility --field=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/topup/fieldmap_HZ --field_mat=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/topup_to_eddy_reg/topup_reg_image_flirt.mat --flm=quadratic --ff=10.0 --acqp=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/gather_inputs/eddy_acqp.txt --bvals=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/pre_hmc_wf/rpe_concat/merge__merged.bval --bvecs=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/pre_hmc_wf/rpe_concat/merge__merged.bvec --imain=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/pre_hmc_wf/rpe_concat/merge__merged.nii.gz --index=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/gather_inputs/eddy_index.txt --mask=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/pre_eddy_b0_ref_wf/synthstrip_wf/mask_to_original_grid/topup_imain_corrected_avg_trans_mask_trans.nii.gz --interp=spline --data_is_shelled --resamp=jac --mporder=10 --niter=5 --nvoxhp=1000 --out=/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/eddy/eddy_corrected --repol --slm=linear
Stdout:

Stderr:
	eddy_cuda: error while loading shared libraries: libcublas.so.10: cannot open shared object file: No such file or directory
Traceback:
	Traceback (most recent call last):
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 453, in aggregate_outputs
	    setattr(outputs, key, val)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/traits_extension.py", line 330, in validate
	    value = super(File, self).validate(objekt, name, value, return_pathlike=True)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/traits_extension.py", line 135, in validate
	    self.error(objekt, name, str(value))
	  File "/usr/local/miniconda/lib/python3.8/site-packages/traits/base_trait_handler.py", line 74, in error
	    raise TraitError(
	traits.trait_errors.TraitError: The 'out_corrected' trait of an ExtendedEddyOutputSpec instance must be a pathlike object or string representing an existing file, but a value of '/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/eddy/eddy_corrected.nii.gz' <class 'str'> was specified.

	During handling of the above exception, another exception occurred:

	Traceback (most recent call last):
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 400, in run
	    outputs = self.aggregate_outputs(runtime)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 460, in aggregate_outputs
	    raise FileNotFoundError(msg)
	FileNotFoundError: No such file or directory '/scratch/qsiprep_wf/single_subject_5058_wf/dwi_preproc_ses_T1_run_1_wf/hmc_sdc_wf/eddy/eddy_corrected.nii.gz' for output 'out_corrected' of a ExtendedEddy interface
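
The Stderr line says that eddy_cuda needs libcublas.so.10 and that the dynamic loader cannot find it in the container. A small sketch for listing every unresolved library from ldd output; the function name missing_libs is hypothetical, and the commented docker invocation is an untested assumption about the image (in particular that the tag 0.19.1 exists and that ldd is available inside it):

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the name of
# every shared library that the loader reports as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Possible usage (assumption, not verified against the image):
#   docker run --rm --entrypoint sh pennbbl/qsiprep:0.19.1 \
#       -c 'ldd "$(command -v eddy_cuda)"' | missing_libs
```

An empty result would mean that all libraries resolve inside the container.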

I missed it in your original post: you need to tell Docker to allow access to your GPUs by adding --gpus all to docker run.

I put --gpus all after participant, but qsiprep does not recognize it:

qsiprep: error: unrecognized arguments: --gpus all

When I put it directly after run, I get a different error:
docker run --rm --gpus all
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

It's an argument for docker, not qsiprep.
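
To make the split between the two tools concrete: everything between docker run and the image name is read by Docker, and everything after the image name is passed to qsiprep. A dry-run sketch that only prints the command; the mounts and arguments are abbreviated from the original post, and the variable names are illustrative:

```shell
#!/bin/sh
image="pennbbl/qsiprep:latest"

# Options for Docker itself: these must come BEFORE the image name.
docker_opts="--rm -it --gpus all -v /export/local/tmp_local:/scratch"

# Arguments for qsiprep: these come AFTER the image name.
qsiprep_args="/data /out participant --participant-label 5058 -w /scratch"

# Assemble the command and print it instead of running it.
cmd="docker run $docker_opts $image $qsiprep_args"
echo "$cmd"
```

Plain strings are enough here because none of the paths contain spaces.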

Hi @Ralf,

Did you install the NVIDIA Container Toolkit? See "Installing the NVIDIA Container Toolkit" in the NVIDIA Container Toolkit 1.15.0 documentation.
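
A quick way to see which pieces are present on the host before installing anything. This is a sketch under the assumption that the toolkit provides the nvidia-ctk command and the driver provides nvidia-smi; the helper check_tool is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# nvidia-smi  : assumed to come with the NVIDIA driver
# nvidia-ctk  : assumed to come with the NVIDIA Container Toolkit
# docker      : the container runtime itself
for tool in nvidia-smi nvidia-ctk docker; do
    check_tool "$tool"
done

# After installing the toolkit and restarting Docker, the toolkit
# documentation suggests a smoke test along the lines of:
#   docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```

If nvidia-ctk shows up as missing, that would be consistent with the "could not select device driver" error above.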

Hi @Steven,

The toolkit has not been installed so far. I have to ask our sysadmin, because we work behind a firewall (university hospital) and the proxy settings are more complicated. We use Rocky Linux 9.2 with an NVIDIA A40. I'll keep you updated.