Stuck on eddy_cuda with HCP-Development DWI data

Summary of what happened:

Hi, experts. I am using QSIPrep to preprocess DWI data from the HCP-Development dataset and want to use eddy_cuda to speed up the head motion and distortion correction (HMC/SDC) step. I tried both pennbbl/qsiprep:0.20.0 and pennbbl/qsiprep:0.18.1, and both reported the same error: 'cudaErrorName = cudaErrorInsufficientDriver, cudaErrorString = CUDA driver version is insufficient for CUDA runtime version'.

Command used (and if a helper script was used, a link to the helper script or the command generated):

export HOME=/home/user7/Tools/HCPLifespan2BIDS-master

docker run -ti --rm \
    -v $HOME/bids:/data \
    -v $HOME/output:/output \
    -v ${FREESURFER_HOME}/license.txt:/usr/local/freesurfer/license.txt \
    pennbbl/qsiprep:0.20.0 \
    /data /output participant \
    --fs-license-file /usr/local/freesurfer/license.txt \
    --output-resolution 1.5 \
    --distortion-group-merge average \
    --skip-anat-based-spatial-normalization \
    --eddy-config /output/eddy_params.json

Version:

0.18.1 and 0.20.0

Environment (Docker, Singularity / Apptainer, custom installation):

Docker

Relevant log outputs (up to 20 lines):

Cmdline:
eddy_cuda --cnr_maps --field=/tmp/work/qsiprep_wf/single_subject_HCD0001305_wf/
_preproc_wf/hmc_sdc_wf/topup/fieldmap_HZ --field_mat=/tmp/work/qsiprep_wf/single_subject
D0001305_wf/dwi_preproc_wf/hmc_sdc_wf/topup_to_eddy_reg/topup_reg_image_flirt.mat --flm=
ear --ff=10.0 --acqp=/tmp/work/qsiprep_wf/single_subject_HCD0001305_wf/dwi_preproc_wf/hm
dc_wf/gather_inputs/eddy_acqp.txt --bvals=/tmp/work/qsiprep_wf/single_subject_HCD0001305
/dwi_preproc_wf/pre_hmc_wf/rpe_concat/merge__merged.bval --bvecs=/tmp/work/qsiprep_wf/si
e_subject_HCD0001305_wf/dwi_preproc_wf/pre_hmc_wf/rpe_concat/merge__merged.bvec --imain=
p/work/qsiprep_wf/single_subject_HCD0001305_wf/dwi_preproc_wf/pre_hmc_wf/rpe_concat/merg
merged.nii.gz --index=/tmp/work/qsiprep_wf/single_subject_HCD0001305_wf/dwi_preproc_wf/h
sdc_wf/gather_inputs/eddy_index.txt --mask=/tmp/work/qsiprep_wf/single_subject_HCD000130
f/dwi_preproc_wf/hmc_sdc_wf/pre_eddy_b0_ref_wf/synthstrip_wf/mask_to_original_grid/topup
ain_corrected_avg_trans_mask_trans.nii.gz --interp=spline --data_is_shelled --resamp-jac
niter=5 --nvoxhp=1000 --out=/tmp/work/qsiprep_wf/single_subject_HCD0001305_wf/dwi_prepro
f/hmc_sdc_wf/eddy/eddy_corrected --repol --slm=linear
Stdout:
EDDY::: EddyCudaHelperFunctions::InitGpu: cudaGetDevice returned an error: cuda
or_t = 35, cudaErrorName = cudaErrorInsufficientDriver, cudaErrorString = CUDA driver ve
on is insufficient for CUDA runtime version
EDDY::: cuda/EddyCudaHelperFunctions.cu::: static void EDDY::EddyCudaHelperFun
ons::InitGpu(bool): Exception thrown
EDDY::: cuda/EddyGpuUtils.cu::: static std::shared_ptr<EDDY::DWIPredictionMake
EDDY::EddyGpuUtils::LoadPredictionMaker(const EDDY::EddyCommandLineOptions&, EDDY::ScanT
const EDDY::ECScanManager&, unsigned int, float, NEWIMAGE::volume<float>&, bool): Exc
ion thrown
EDDY::: eddy.cpp::: EDDY::ReplacementManager* EDDY::Register(const EDDY::EddyC
andLineOptions&, EDDY::ScanType, unsigned int, const std::vector<float, std::allocator<f
t> >&, EDDY::SecondLevelECModel, bool, EDDY::ECScanManager&, EDDY::ReplacementManager*,
MAT::Matrix&, NEWMAT::Matrix&): Exception thrown
EDDY::: Eddy failed with message EDDY::: eddy.cpp::: EDDY::ReplacementManager*
DY::DoVolumeToVolumeRegistration(const EDDY::EddyCommandLineOptions&, EDDY::ECScanManage
: Exception thrown
Stderr:

Screenshots / relevant information:

My server's NVIDIA driver version is 440.95.01, which supports CUDA up to 10.2, whereas the CUDA runtime in the QSIPrep Docker image is 9.1 (right?). I thought a newer driver should be backward compatible with the older CUDA runtime and toolkit inside the container, so I am puzzled by this error. Could you help me solve it?
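
One way to test the backward-compatibility reasoning above is to compare the host driver version against the minimum driver that the container's CUDA runtime needs. The sketch below assumes the image ships CUDA 9.1, as guessed in the post, and assumes a minimum Linux driver of 390.46 for that runtime (a figure recalled from NVIDIA's CUDA release notes, so worth verifying). `version_ge` is a hypothetical helper, and `sort -V` is assumed to be available (GNU coreutils).

```shell
# ASSUMPTION: minimum Linux driver for the CUDA 9.1 runtime, per NVIDIA's
# CUDA release notes. Verify this value for the toolkit in your image.
MIN_DRIVER_FOR_CUDA_9_1="390.46"

# Hypothetical helper: exit 0 when version $1 >= version $2.
# Relies on version-aware sorting (sort -V, GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Default to the driver version reported in the post; use the live value
# from nvidia-smi when it is installed on the host.
HOST_DRIVER="440.95.01"
if command -v nvidia-smi >/dev/null 2>&1; then
    HOST_DRIVER="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
fi

if version_ge "$HOST_DRIVER" "$MIN_DRIVER_FOR_CUDA_9_1"; then
    echo "Driver $HOST_DRIVER meets the assumed minimum $MIN_DRIVER_FOR_CUDA_9_1"
else
    echo "Driver $HOST_DRIVER is older than the assumed minimum $MIN_DRIVER_FOR_CUDA_9_1"
fi
```

If the check passes, as it does for 440.95.01, the driver itself is probably new enough. That would point toward the container not seeing the host driver at all, rather than a genuine version mismatch.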


Hi @ZitengHan and welcome to neurostars!

For future posts like this, please use the Software Support category which provides a standard post template. You can see I have edited your post to match this template, and have changed your screenshot to text (which is easier for us to make sense of).

Could you please share the contents of your eddy config JSON? Can you also confirm that you have the Docker NVIDIA runtime installed? I also see that you did not add --runtime=nvidia to your docker command, which might help. I mainly use Singularity/Apptainer, so I am not sure that is the right way to enable it, but it is worth a shot.

Best,
Steven
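
Whether `--runtime=nvidia` can work depends on a runtime named `nvidia` being registered with the Docker daemon; otherwise Docker rejects the flag with an "Unknown runtime specified nvidia" error. A minimal sketch for checking this is shown here. `docker_has_nvidia_runtime` is a hypothetical helper name, and the check simply looks for an `nvidia` key in the JSON that `docker info` prints for its registered runtimes.

```shell
# Hypothetical helper: exit 0 when the Docker daemon lists a runtime
# named "nvidia" among its registered runtimes.
docker_has_nvidia_runtime() {
    docker info --format '{{json .Runtimes}}' 2>/dev/null | grep -q '"nvidia"'
}

# Only query the daemon when the docker CLI is available.
if command -v docker >/dev/null 2>&1; then
    if docker_has_nvidia_runtime; then
        echo "nvidia runtime is registered: --runtime=nvidia should be accepted"
    else
        echo "no nvidia runtime registered: --runtime=nvidia will likely be rejected"
    fi
fi
```

On Docker 19.03 or later with the NVIDIA Container Toolkit installed, `--gpus all` is an alternative way to request GPUs that does not rely on passing `--runtime=nvidia`.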

Thank you for your quick reply. I will follow this template in future posts.

Could you please share the contents of your eddy config JSON?

{
  "flm": "linear",
  "slm": "linear",
  "fep": false,
  "interp": "spline",
  "nvoxhp": 1000,
  "fudge_factor": 10,
  "dont_sep_offs_move": false,
  "dont_peas": false,
  "niter": 5,
  "method": "jac",
  "repol": true,
  "num_threads": 40,
  "is_shelled": true,
  "use_cuda": true,
  "cnr_maps": true,
  "residuals": false,
  "output_type": "NIFTI_GZ",
  "args": ""
}
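
In this config, `"use_cuda": true` appears to be the setting that makes QSIPrep call eddy_cuda instead of the CPU build, which matches the `eddy_cuda` command line in the log. A quick sketch for confirming what a given config requests is below. `eddy_wants_cuda` and the `EDDY_CONFIG` variable are hypothetical names, and the check is a plain `grep` rather than a JSON parser, so it assumes the key and its value sit on one line as they do above.

```shell
# Hypothetical helper: exit 0 when the JSON file at $1 sets "use_cuda" to true.
# This is a text match, not a JSON parse; it assumes key and value share a line.
eddy_wants_cuda() {
    grep -Eq '"use_cuda"[[:space:]]*:[[:space:]]*true' "$1"
}

# EDDY_CONFIG is a hypothetical variable; it defaults to the file name used in the thread.
CONFIG="${EDDY_CONFIG:-eddy_params.json}"
if [ -f "$CONFIG" ]; then
    if eddy_wants_cuda "$CONFIG"; then
        echo "$CONFIG requests the CUDA build of eddy"
    else
        echo "$CONFIG does not request the CUDA build of eddy"
    fi
fi
```

Setting `use_cuda` to `false` should be a way to rule out GPU problems while debugging, at the cost of a much slower run.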

Can you also confirm you have the Docker nvidia runtime installed?

Yes, I confirm that I have installed the nvidia-container-runtime. Is this what you are referring to?

I also see you did not add --runtime=nvidia in your docker command, which might help.

I’ll try it right now. Thank you very much.

I also see you did not add --runtime=nvidia in your docker command, which might help.

It turns out I was not exposing the GPU to the Docker container correctly. After adding --gpus all and the NVIDIA_DRIVER_CAPABILITIES and NVIDIA_VISIBLE_DEVICES environment variables to the command, eddy_cuda works fine. Thank you very much for your help!

export HOME=/home/user7

docker run -ti --rm \
    --gpus all \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -v $HOME/bids:/data \
    -v $HOME/output:/output \
    -v ${FREESURFER_HOME}/license.txt:/usr/local/freesurfer/license.txt \
    pennbbl/qsiprep:0.20.0 \
    /data /output participant \
    --fs-license-file /usr/local/freesurfer/license.txt \
    --output-resolution 1.5 \
    --distortion-group-merge average \
    --skip-anat-based-spatial-normalization \
    --eddy-config /output/eddy_params.json \
    -w /output -v -v
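
Before launching a full QSIPrep run, it can be worth checking that the container actually sees the GPU with the same flags. The sketch below runs `nvidia-smi` inside the image. It assumes the image's default entrypoint is the qsiprep command (hence the `--entrypoint` override) and that the NVIDIA container tooling makes `nvidia-smi` available in the container when the `utility` capability is requested. `gpu_visible_in_container` is a hypothetical helper name.

```shell
# Hypothetical helper: run nvidia-smi inside image $1 with GPU access enabled.
# ASSUMPTION: the image's default entrypoint is qsiprep, so it is overridden here.
gpu_visible_in_container() {
    docker run --rm --gpus all \
        -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
        --entrypoint nvidia-smi "$1"
}

# Only attempt the check on a host that has both docker and an NVIDIA driver.
if command -v docker >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
    if gpu_visible_in_container "pennbbl/qsiprep:0.20.0"; then
        echo "GPU is visible inside the container"
    else
        echo "GPU is NOT visible inside the container; check the NVIDIA container tooling"
    fi
fi
```

If `nvidia-smi` prints the same driver version inside the container as on the host, eddy_cuda should be able to find the driver; if the command fails, the problem is probably in the Docker GPU setup rather than in QSIPrep.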

References

docker: Error response from daemon: Unknown runtime specified nvidia. · Issue #838 · NVIDIA/nvidia-docker (github.com)
Found no NVIDIA driver on your system · Issue #533 · NVIDIA/nvidia-docker (github.com)
