Fmriprep on symlinked bids dataset

TL;DR: Can fmriprep work on a dataset whose bids organization is a symlink overlay? If yes, can you help me troubleshoot my issue?

I have created a symlink overlay of a subset of the ukbiobank dataset. I used this code as a starting point for creating the overlay, then customized it to my own file system’s organization.

Here is an example of the file organization:

ukbb_bids/
└── sub-XXX/
	├── anat
	│   ├── sub-XXX_FLAIR.nii.gz -> <path_to_raw>/T2_FLAIR/T2_FLAIR.nii.gz
	│   └── sub-XXX_T1w.nii.gz -> <path_to_raw>/T1/T1.nii.gz
	├── dwi
	│   ├── sub-XXX_acq-AP_dwi.bval -> <path_to_raw>/dMRI/raw/AP.bval
	│   ├── sub-XXX_acq-AP_dwi.bvec -> <path_to_raw>/dMRI/raw/AP.bvec
	│   ├── sub-XXX_acq-AP_dwi.nii.gz -> <path_to_raw>/dMRI/raw/AP.nii.gz
	│   ├── sub-XXX_acq-PA_dwi.bval -> <path_to_raw>/dMRI/raw/PA.bval
	│   ├── sub-XXX_acq-PA_dwi.bvec -> <path_to_raw>/dMRI/raw/PA.bvec
	│   └── sub-XXX_acq-PA_dwi.nii.gz -> <path_to_raw>/dMRI/raw/PA.nii.gz
	└── func
	    ├── sub-XXX_task-hariri_bold.nii.gz -> <path_to_raw>/fMRI/tfMRI.nii.gz
	    ├── sub-XXX_task-hariri_sbref.nii.gz -> <path_to_raw>/fMRI/tfMRI_SBREF.nii.gz
	    ├── sub-XXX_task-rest_bold.nii.gz -> <path_to_raw>/fMRI/rfMRI.nii.gz
	    └── sub-XXX_task-rest_sbref.nii.gz -> <path_to_raw>/fMRI/rfMRI_SBREF.nii.gz

3 directories, 12 files

I now want to run fmriprep 20.2.0 on these data using Singularity. My command is as follows:

singularity run --cleanenv 														\
	-B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} 			\
	${container} ${input} ${TMPDIR}/out participant --participant_label ${s} 	\
	-w ${TMPDIR}/wrk 															\
	--nthreads $SLURM_CPUS_PER_TASK 											\
	--mem_mb $SLURM_MEM_PER_NODE 												\
	--fs-license-file ${license} 												\
	--output-spaces fsaverage6 fsLR MNI152NLin2009cAsym 						\
	--cifti-output 																\
	--skip-bids-validation 														\
	--notrack 																	\
	--omp-nthreads 8 															\
	--use-aroma 																\
	--error-on-aroma-warnings

Here, input is the symlinked BIDS dataset, container is the fmriprep 20.2.0 Singularity container, s is the subject ID, and so on.

When running this on a sample subject, I receive the error:

fmriprep: error: Path does not exist: <input>.

Here, <input> stands for the directory I passed in above.

The directory certainly exists and the permissions are in order for me to access it. I have run the above command on another dataset that is not symlinked and the command works fine.

Any ideas?
Thanks in advance!

When you call Singularity, make sure you bind the symlinked drive to the singularity container. For example, if your data is stored in /path/to/my/BIDS/, then add -B /path to the singularity call. To be safe, do this for both the drive where your symlinks live, and where the actual files are (if they are not on the same drive). Does that make sense?
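To see which directories actually hold the real files, something like the sketch below may help. It walks a symlink overlay and prints each distinct directory that the links resolve to, so you know what has to be bound in addition to the overlay itself. The function name symlink_target_dirs is made up for illustration, and the sketch assumes a `readlink` that supports `-f` (GNU coreutils) and file names without newlines.

```shell
# List the distinct directories that the symlinks under a directory tree resolve to.
# symlink_target_dirs is a hypothetical helper name; assumes `readlink -f` is available.
symlink_target_dirs() {
    find "$1" -type l | while IFS= read -r link; do
        # Resolve the link fully, then keep only the directory part.
        dirname "$(readlink -f "$link")"
    done | sort -u
}

# Example (path is a placeholder): symlink_target_dirs /path/to/my/BIDS
```

Every directory it prints, or a common parent of them, would need its own -B entry.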

Best,
Steven

Thanks, that does make sense and I can try it.

If this is the issue, is there some reason that the above command would work on one dataset and not the symlinked one?

This did not work:

input="/data/bids_symlink"

singularity run --cleanenv 														\
	-B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} 			\
	-B ${input}:input 															\
	${container} input ${TMPDIR}/out participant --participant_label ${s} 		\
	-w ${TMPDIR}/wrk 															\
	--nthreads $SLURM_CPUS_PER_TASK 											\
	--mem_mb $SLURM_MEM_PER_NODE 												\
	--fs-license-file ${license} 												\
	--output-spaces fsaverage6 fsLR MNI152NLin2009cAsym 						\
	--cifti-output 																\
	--skip-bids-validation 														\
	--notrack 																	\
	--omp-nthreads 8 															\
	--use-aroma 																\
	--error-on-aroma-warnings
FATAL:   container creation failed: unable to add /data/bids_symlink to mount list: destination must be an absolute path

I don’t know if it helps, but here are the permissions on bids_symlink

drwxr-sr-x 2 dmoracze UKBB      4096 Sep  2 16:19 bids_symlink

When you run singularity, you are essentially entering a new terminal that only has the software distributed with the container. You bring nothing to it (except maybe your home ~ drive), including any data on your machine. For singularity to find data on the machine, it has to be explicitly linked to the container, which is what -B does.

To address your error, /data is a relative path, not an absolute path. That is, from where you are running the code, /data is a folder, but if you are anywhere else, /data does not exist. Mount the full path. Also, you only need to mount the highest up drive. So, again if it’s something like /Highest_drive/path/to/data/bids_symlink, then you only need -B /Highest_drive

Sorry for the confusion. /data/bids_symlink is the absolute path here.

Actually, I’m giving it /data/UKBB/bids_symlink, I just shortened it here for readability. I am giving the absolute path on my cluster, though.

Here is the full command with all variables (which are all absolute paths).

export TMPDIR=/lscratch/$SLURM_JOB_ID
export TEMPLATEFLOW_HOST_HOME=$HOME/.cache/templateflow
export FMRIPREP_HOST_CACHE=$HOME/.cache/fmriprep
export SINGULARITYENV_TEMPLATEFLOW_HOME="/templateflow"

input="/data/UKBB/bids_symlink"
container="/data/dmoracze/containers/fmriprep-20.2.0.simg"
license="${HOME}/license.txt"

singularity run --cleanenv 														\
	-B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} 			\
	-B ${input}:input 															\
	${container} input ${TMPDIR}/out participant --participant_label ${s} 		\
	-w ${TMPDIR}/wrk 															\
	--nthreads $SLURM_CPUS_PER_TASK 											\
	--mem_mb $SLURM_MEM_PER_NODE 												\
	--fs-license-file ${license} 												\
	--output-spaces fsaverage6 fsLR MNI152NLin2009cAsym 						\
	--cifti-output 																\
	--skip-bids-validation 														\
	--notrack 																	\
	--omp-nthreads 8 															\
	--use-aroma 																\
	--error-on-aroma-warnings

Hmmm, okay. Can you try a test for me? Open a shell in the fmriprep Singularity image from your terminal:

singularity shell -B /data $container

Then see if you can navigate to your data in there.

Also, I see the TMPDIR may not be properly bound to the singularity container either.

Also, as a stylistic tip which you may or may not find helpful, I usually bind the drives and rename them in the singularity container. So I will bind my output directory (usually /path/to/BIDS/derivatives) with the name “output” in the container.

singularity run \
	-B ${scratch}:/workdir \
	-B ${bids_dir}:/mnt:ro \
	-B ${output_dir}:/output \
	-B ${bids_dir}/code/qsiprep/license.txt:/license.txt \
	$IMG /mnt /output participant \
	-w /workdir \
	--fs-license-file /license.txt \
	$OTHER_ARGUMENTS

This should work for a lot of BIDS apps.
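Adapted to the fmriprep call earlier in this thread, the same idea might look like the sketch below. It only prints the command, so it can be inspected before anything runs. The function name print_fmriprep_cmd, its argument order, and the in-container names /scratch and /license.txt are made up for illustration. Unlike the command above, the data directory is bound at its own host path, on the assumption that the overlay uses absolute symlinks that would dangle if the dataset were remounted under /mnt. The templateflow bind and the remaining fmriprep options from the original command are left out for brevity.

```shell
# Print (do not run) a singularity call in which every host path fmriprep touches is bound.
# print_fmriprep_cmd and its argument order are hypothetical; all paths are placeholders.
print_fmriprep_cmd() {
    data_root=$1   # host directory holding both the symlinks and their targets
    bids_dir=$2    # BIDS directory somewhere under data_root
    scratch=$3     # host scratch directory, e.g. $TMPDIR
    license=$4     # FreeSurfer license file on the host
    container=$5   # fmriprep image
    subject=$6     # participant label

    # The data keeps its host path so absolute symlinks still resolve;
    # scratch and the license get short names inside the container.
    printf '%s ' singularity run --cleanenv \
        -B "$data_root:$data_root" \
        -B "$scratch:/scratch" \
        -B "$license:/license.txt" \
        "$container" "$bids_dir" /scratch/out participant \
        --participant_label "$subject" \
        -w /scratch/wrk \
        --fs-license-file /license.txt
    printf '\n'
}

# Example:
# print_fmriprep_cmd /data/UKBB /data/UKBB/bids_symlink "$TMPDIR" "$HOME/license.txt" "$container" "$s"
```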

Hmmm, weird.

singularity shell -B /data $container worked, I can see all the directories in /data

But singularity shell -B /data/UKBB:input $container gave me the same error. So it looks like there’s an issue with the /data/UKBB drive that contains bids_symlink, it might not be an issue with the symlinked directory.
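One thing worth noting: the two commands differ in more than the source directory. The second one also names a container-side destination, input, with no leading slash, and the error text says "destination must be an absolute path". A small check like the sketch below makes that visible before calling Singularity. The helper check_bind_spec is hypothetical and is not part of Singularity; it only tests whether each half of a src[:dest[:opts]] spec starts with a slash.

```shell
# Check that both halves of a `-B src[:dest[:opts]]` bind spec are absolute paths.
# check_bind_spec is a hypothetical helper, not part of Singularity.
check_bind_spec() {
    spec=$1
    src=${spec%%:*}
    case $spec in
        *:*) rest=${spec#*:}; dest=${rest%%:*} ;;   # drop an optional :ro / :rw suffix
        *)   dest=$src ;;                            # no dest given: same path in the container
    esac
    case $src in
        /*) ;;
        *) echo "source is not absolute: $src"; return 1 ;;
    esac
    case $dest in
        /*) ;;
        *) echo "destination is not absolute: $dest"; return 1 ;;
    esac
    echo "ok: $src -> $dest"
}

# Examples:
# check_bind_spec /data                   (passes)
# check_bind_spec /data/UKBB:input        (fails on the destination)
# check_bind_spec /data/UKBB:/input       (passes)
```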

Makes sense, thanks.

As a point of clarification, are both the original data (that is, where the symlinks point to) and the symlinks themselves on /data ?

Ah ok, I think the issue is how my cluster has aliased this drive.

realpath /data/UKBB
     /spin1/USERS1/UKBB

realpath /data/UKBB/bids_symlink
    /vf/users/UKBB/bids_symlink

I’m confused why I’m given two different drives, but I have a direction to figure it out. Thank you!

No problem! Let me know if you continue to run into troubles.

Ok, so maybe not…

I’m choosing to bind /data/UKBB since it contains both the bids symlinks and the raw data to which the links point.

singularity shell -B /data/UKBB:/data/UKBB fmriprep-20.2.0.simg

This works. I can see bids_symlink from the container.

But then this fails:

export TMPDIR=/lscratch/$SLURM_JOB_ID
export TEMPLATEFLOW_HOST_HOME=$HOME/.cache/templateflow
export FMRIPREP_HOST_CACHE=$HOME/.cache/fmriprep
export SINGULARITYENV_TEMPLATEFLOW_HOME="/templateflow"

input="/data/UKBB"
container="/data/dmoracze/containers/fmriprep-20.2.0.simg"
license="${HOME}/license.txt"

singularity run --cleanenv 														\
	-B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} 			\
	-B ${input}:/data/UKBB 															\
	${container} /data/UKBB ${TMPDIR}/out participant --participant_label ${s} 		\
	-w ${TMPDIR}/wrk 															\
	--nthreads $SLURM_CPUS_PER_TASK 											\
	--mem_mb $SLURM_MEM_PER_NODE 												\
	--fs-license-file ${license} 												\
	--output-spaces fsaverage6 fsLR MNI152NLin2009cAsym 						\
	--cifti-output 																\
	--skip-bids-validation 														\
	--notrack 																	\
	--omp-nthreads 8 															\
	--use-aroma 																\
	--error-on-aroma-warnings
fmriprep: error: Path does not exist: </data/UKBB/bids_symlink>.

I’m choosing to map /data/UKBB into the container as /data/UKBB because I want the symlinks to point to the right place.

Well it looks like /data/UKBB points to folders in /spin1 and /vf. Try binding those two drives too.
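A rough way to build those extra binds is sketched below. For each host path you give it, it prints a -B flag for the path as written and another for the location `realpath` resolves it to, with duplicates removed. The helper name bind_flags_for is made up, and the sketch assumes `realpath` is installed and that paths contain no spaces or newlines. It binds the exact resolved directories, so binding the top-level /spin1 and /vf drives instead is the broader alternative.

```shell
# Print one `-B` flag for each given host path and for the real location it resolves to.
# bind_flags_for is a hypothetical helper name; assumes `realpath` is available.
bind_flags_for() {
    for p in "$@"; do
        printf '%s\n' "$p"   # the path as the user (and the symlinks) spell it
        realpath "$p"        # where the cluster actually stores it
    done | sort -u | while IFS= read -r d; do
        printf '%s %s ' -B "$d"
    done
    printf '\n'
}

# Example:
# singularity shell $(bind_flags_for /data/UKBB /data/UKBB/bids_symlink) $container
```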
