Dear fmriprep experts,
I am currently trying to run fMRIPrep (the latest version, on a Singularity 3.01 image) on 76 subjects in parallel, for a dataset containing T1w, T2w, T2star, and 3 BOLD functional files per subject.
I tried running it multiple times, and several issues came up:
- For the same subject, the recon-all output was different across the multiple runs, and this happened for all subjects. Would you have an explanation for this?
The differences across runs fall into the following two cases:
1- When recon-all does not complete fully, it skips some steps, especially the ones producing these files:
aparc.a2009s+aseg.mgz
aparc.DKTatlas+aseg.mgz
and less frequently: aseg.mgz
I don’t get any error in the recon-all.log file; I only see that recon-all skips these steps. However, the crash file in /output/fmriprep/log/ states that recon-all errors occurred, and in the visual HTML report I see “_autorecon30” or “_autorecon31”.
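To see which subjects and files are affected, I run a rough check along these lines (a minimal sketch assuming the FreeSurfer results land under the freesurfer/ folder of my fMRIPrep output directory; the paths reflect my own layout):

# Rough check of which FreeSurfer segmentation outputs are missing per subject.
# The freesurfer/ location below is an assumption about my output directory layout.
from pathlib import Path

fs_dir = Path("/mnt/data/loic2/fmriprep_output_tw_less/freesurfer")
expected = ["aseg.mgz", "aparc.a2009s+aseg.mgz", "aparc.DKTatlas+aseg.mgz"]

for subj in sorted(fs_dir.glob("sub-*")):
    missing = [name for name in expected if not (subj / "mri" / name).exists()]
    if missing:
        print(f"{subj.name}: missing {', '.join(missing)}")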
2- When recon-all completes successfully, I get a different type of error. There is no error in the recon-all.log file, and the crash file in /output/fmriprep/log/ contains the following:
Node: fmriprep_wf.single_subject_004_wf.func_preproc_ses_1_task_compassion_wf.bold_surf_wf.medial_nans
Working directory: /mnt/data/loic2/work/fmriprep_wf/single_subject_004_wf/func_preproc_ses_1_task_compassion_wf/bold_surf_wf/_hemi_lh/medial_nans
Node inputs:
in_file = ['/mnt/data/loic2/work/fmriprep_wf/single_subject_004_wf/func_preproc_ses_1_task_compassion_wf/bold_surf_wf/_hemi_lh/sampler/mapflow/_sampler0/lh.fsaverage5.gii']
subjects_dir = <undefined>
target_subject = ['fsaverage5']
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
result['result'] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 473, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1253, in _run_interface
self.config['execution']['stop_on_first_crash'])))
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1128, in _collate_results
for i, nresult, err in nodes:
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 99, in nodelist_runner
result = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 473, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 557, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 637, in _run_command
result = self._interface.run(cwd=outdir)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 369, in run
runtime = self._run_interface(runtime)
File "/usr/local/miniconda/lib/python3.7/site-packages/niworkflows/interfaces/freesurfer.py", line 328, in _run_interface
newpath=runtime.cwd)
File "/usr/local/miniconda/lib/python3.7/site-packages/niworkflows/interfaces/freesurfer.py", line 463, in medial_wall_to_nan
darray.data[medial] = np.nan
ValueError: assignment destination is read-only
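If it helps, my reading of the last line is that the GIFTI data array being modified is not writable. Here is a toy example (not fMRIPrep’s actual code) that raises the same error, together with the copy-first workaround I would expect to avoid it:

# Toy reproduction of the error class: assigning into a NumPy array whose
# writeable flag is off raises the same ValueError as in the traceback above.
import numpy as np

data = np.zeros(8, dtype=np.float32)
data.flags.writeable = False        # e.g. an array backed by a read-only buffer
medial = np.array([True, False] * 4)

try:
    data[medial] = np.nan           # ValueError: assignment destination is read-only
except ValueError as err:
    print(err)

data = data.copy()                  # a writable copy accepts the same assignment
data[medial] = np.nan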
Moreover, in the visual report I get this error (note that the first task in my resting-state BOLD data is “compassion”):
fmriprep_wf.single_subject_004_wf.func_preproc_ses_1_task_compassion_wf.bold_surf_wf.medial_nans
Finally, in case (2), which is the run that progresses the furthest, I don’t get any /output/fmriprep/func/ directory, and this doesn’t seem normal to me.
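In case it is relevant, this is roughly how I have been looking inside the crash files under /output/fmriprep/log/ (assuming I am using nipype’s loadcrash helper correctly; the filename below is only a placeholder):

# Hypothetical crash-file inspection; the filename is a placeholder, not a real file.
# nipype crash files are (gzipped) pickles holding the failing node and its traceback.
from nipype.utils.filemanip import loadcrash

crash = loadcrash("/output/fmriprep/log/crash-example-medial_nans.pklz")
print(crash["node"])                 # the failing node and its inputs
print("".join(crash["traceback"]))   # the stored Python traceback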
I am running my jobs on a cluster of 12 nodes, each with 20 processors of 2 threads each, i.e. 40 threads per node and 480 threads in total.
I read in previous reports that fMRIPrep was not tested and may not be reliable for more than 4 subjects run in parallel; could this really be an issue here?
I also suspect that my functional data might have issues such as oversized brains… would it be a problem for registration if some voxels end up empty because parts of the brain are cut off?
Here is the script that I submit with SLURM (sbatch):
#! /bin/bash
#
#SBATCH --job-name=limited_mindfcomp_22_03
#SBATCH --output=limited_mindfcomp_22_03.txt
#SBATCH --error=limited_mindfcomp_22_03.err
#SBATCH --cpus-per-task=20
#SBATCH --array=1-76
SUBJ=(002 004 005 007 010 011 012 014 016 017 018 022 025 026 028 029 030 032 034 035 036 037 038 040 042 050 052 053 054 055 056 057 058 059 060 062 063 064 065 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083 087 089 090 091 092 093 094 095 096 097 098 099 101 102 103 104 105 106 108 109)
singularity run --cleanenv -B /mnt:/mnt \
    /mnt/data/singularity_images/fmriprep-latest.simg \
    /mnt/data/loic2/RSBIDS4 /mnt/data/loic2/fmriprep_output_tw_less participant \
    --participant-label ${SUBJ[$SLURM_ARRAY_TASK_ID-1]} \
    --low-mem --stop-on-first-crash --medial-surface-nan \
    --use-aroma --cifti-output --notrack \
    --output-space template fsaverage5 \
    --fs-license-file /mnt/data/loic2/license.txt
Thanks in advance for any suggestion.
Best,
Loïc