fMRIPrep 20.2.0 error in anat_preproc_wf.brain_extraction_wf.atropos_wf.dil_brainmask

Hi!
I ran the workflow for a single subject using a Singularity container but ran into the following error.

Here’s the full log:

Node:fmriprep_wf.single_subject_HC004_wf.anat_preproc_wf.brain_extraction_wf.atropos_wf.dil_brainmask
Working directory: /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/fmriprep_tmp/fmriprep_wf/single_subject_HC004_wf/anat_preproc_wf/brain_extraction_wf/atropos_wf/dil_brainmask

Node inputs:

args = <undefined>
copy_header = True
dimension = 3
environ = {'NSLOTS': '1'}
num_threads = 1
op1 = <undefined>
op2 = 2
operation = MD
output_image = <undefined>

Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 516, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 741, in _run_command
result = self._interface.run(cwd=outdir)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 419, in run
runtime = self._run_interface(runtime)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 814, in _run_interface
self.raise_exception(runtime)
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 745, in raise_exception
).format(**runtime.dictcopy())
RuntimeError: Command:
ImageMath 3 tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled_maths.nii.gz MD /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/fmriprep_tmp/fmriprep_wf/single_subject_HC004_wf/anat_preproc_wf/brain_extraction_wf/thr_brainmask/tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled.nii.gz 2
Standard output:

Standard error:
Illegal instruction (core dumped)
Return code: 132

The command I used was:
singularity run --cleanenv -B /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/BIDS -B /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/fmriprep_tmp -B /scratch:/tmp /usr/pubsw/packages/fmriprep/20.2.0/fmriprep-20.2.0.simg --participant_label $sub_name --mem 42 --nthreads 6 -w /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/fmriprep_tmp --output-spaces {MNI152NLin2009cAsym:res-2,T1w:res-2} --fs-license-file=/autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/BIDS/code/license.txt /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/BIDS /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/BIDS/derivatives/ participant

Any ideas on how to solve this problem?
Thanks!
Kyungsun

I think the --mem argument specifies megabytes, not GB, according to the documentation.

  1. Try making it 42000 if you want 42 GB.
  2. Another way you can improve your command, although purely aesthetic, would be to bind only -B /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/, since both BIDS and fmriprep_tmp are contained within it, and you don’t use /tmp.
  3. Make sure you have read and write permissions to your output and work folders.
  4. Replace the = with a space in the --fs-license-file argument.

Hope this solves your problem!
Steven

Hi Steven, thanks for your help.
I’ve tried, but still ran into the same error.

Does this happen with each subject or is it just one in particular?

It’s happening with each subject with the same error.

Not sure what the problem is if that didn’t fix it. Would you be comfortable sharing data for a single subject so I can see if I can replicate the error?

Have you run the BIDS validator on your dataset? You could also try removing the --nthreads argument.

Hey Steven!

Just following up here. Kyungsun and I are fairly confident the issue comes from running out of space in our home .cache/templateflow directory. We can get one subject to run, but subsequent jobs on our cluster fail as the templateflow directory fills up; if we empty the cache directory, the next job completes. We are on a Linux system at the Martinos Center, with very little space in our home directories but plenty of space on the actual drives. I found a previous thread about how/where TemplateFlow writes its cache (antsBrainExtraction failure? Freesurfer does fine, post #33 by @oesteban), although there @oesteban was helping the user get the Singularity container to write to the home .cache directory, and we want the opposite.

If I run
singularity exec /usr/pubsw/packages/fmriprep/20.2.0/fmriprep-20.2.0.simg python -c "from templateflow.conf import TF_HOME; print(TF_HOME)"
I get the output
/homes/1/hf949/.cache/templateflow

Should I just bind this directory somewhere else? Or include export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/templateflow before the singularity call?

Thanks!!

Hi @hpfisher3,

If you have proper permissions, you can create a symlink between your home cache folder and a new folder where you have more space. For example:

mv ~/.cache /DRIVE/WITH/SPACE/
ln -s /DRIVE/WITH/SPACE/.cache ~/.cache

Then you won’t have to worry about your cache being lost by any software. Just make sure you bind both drives containing your caches to Singularity when running the command.
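If you want to test the pattern first, here’s a throwaway sanity check using temporary placeholder directories (nothing touches your real home):

```shell
# Simulate the cache relocation with throwaway directories.
demo=$(mktemp -d)
mkdir -p "$demo/home/.cache/templateflow" "$demo/bigdrive"

# Move the cache to the "big drive" and leave a symlink behind.
mv "$demo/home/.cache" "$demo/bigdrive/"
ln -s "$demo/bigdrive/.cache" "$demo/home/.cache"

# The old path still resolves through the symlink.
ls "$demo/home/.cache/templateflow" >/dev/null && echo "symlink OK"
rm -rf "$demo"
```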

Hope this works,
Steven

Thanks for your thoughts, Steven!
I’ve tried symlinking the cache folder, but fMRIPrep failed to run with the error below.
The same thing happened when symlinking subdirectories such as .cache/templateflow.

Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1241, in mkdir
self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/homes/2/kh81/.cache/templateflow'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/miniconda/bin/fmriprep", line 6, in <module>
from fmriprep.cli.run import main
File "/usr/local/miniconda/lib/python3.7/site-packages/fmriprep/cli/run.py", line 4, in <module>
from .. import config
File "/usr/local/miniconda/lib/python3.7/site-packages/fmriprep/config.py", line 96, in <module>
from templateflow import __version__ as _tf_ver
File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/__init__.py", line 19, in <module>
from . import api
File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/api.py", line 7, in <module>
from .conf import TF_LAYOUT, TF_S3_ROOT, TF_USE_DATALAD
File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/conf/__init__.py", line 42, in <module>
_update_s3(TF_HOME, local=True, overwrite=True)
File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/conf/_s3.py", line 20, in update
retval = _update_skeleton(skel_file, dest, overwrite=overwrite, silent=silent)
File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/conf/_s3.py", line 52, in _update_skeleton
dest.mkdir(exist_ok=True, parents=True)
File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1245, in mkdir
self.parent.mkdir(parents=True, exist_ok=True)
File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1241, in mkdir
self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: '/homes/2/kh81/.cache'
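This FileExistsError is what you’d expect if ~/.cache inside the container is a symlink whose target wasn’t bound, so the link dangles. A minimal shell sketch of the same mkdir behavior, using throwaway paths only (this is my guess at the mechanism, not something from the fMRIPrep logs):

```shell
# A symlink whose target does not exist (like an unbound drive inside the
# container) is a dangling link: mkdir -p cannot create through it, much
# like pathlib's mkdir in the traceback above.
demo=$(mktemp -d)
ln -s "$demo/not-bound/.cache" "$demo/.cache"   # target never created
mkdir -p "$demo/.cache/templateflow" 2>/dev/null || echo "mkdir failed"
rm -rf "$demo"
```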

I wonder if I should also bind the symlinked cache directory to the Singularity container.
Thanks!

Hi,

Yes you should definitely bind the symlinked cache directory to the singularity container.

Best,
Steven

Hi Steven,
Binding the symlinked directory to the Singularity container solved the issue from my last post, but now we’re back to fMRIPrep failing during the brain-extraction workflow. So the symlink seems to have worked, but the issue still looks space-related. For example, here’s the error output log.

Downloading https://templateflow.s3.amazonaws.com/tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-01_desc-carpet_dseg.nii.gz
  0%|          | 0.00/440 [00:00<?, ?B/s]
 27%|██▋       | 118/440 [00:00<00:00, 1.10kB/s]
441B [00:00, 2.26kB/s]

It seems like the download of the MNI templates is being truncated, and if we navigate to the templateflow folder (in the new cache directory we created), it contains a lot of zero-byte files.
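For reference, the truncated files can be listed like this (assuming the default cache location; adjust TF_DIR if the cache was redirected):

```shell
# List zero-byte files in the TemplateFlow cache; deleting them forces a
# clean re-download on the next run.
TF_DIR="${TEMPLATEFLOW_HOME:-$HOME/.cache/templateflow}"
if [ -d "$TF_DIR" ]; then
    find "$TF_DIR" -type f -size 0 -print
else
    echo "no cache at $TF_DIR"
fi
```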

We’re going to try checking with our IT help desk again, but let us know if you think of anything else to try!

Kyungsun

Just as a sanity check, have you confirmed that the new symlinked cache location is recognized, i.e., that new cache items are actually being stored there? Another thing I might try is downloading these templates separately before running fMRIPrep so it skips the download, although I don’t know off the top of my head which items you need.

If you want, I can send you a zipped version of my templateflow cache, which should contain more than enough to run fmriprep. Just send me a message with your email and I’ll do a secure file transfer.

Hi,

Thanks for sending the templateflow cache. Unfortunately, it produced the exact same error, so I now guess the problem is unrelated to the storage issue.

Here’s the error log:

Node: fmriprep_wf.single_subject_HC030_wf.anat_preproc_wf.brain_extraction_wf.atropos_wf.dil_brainmask
    Working directory: /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/fmriprep_tmp/fmriprep_wf/single_subject_HC030_wf/anat_preproc_wf/brain_extraction_wf/atropos_wf/dil_brainmask

Node inputs:

args = <undefined>
copy_header = True
dimension = 3
environ = {'NSLOTS': '1'}
num_threads = 1
op1 = <undefined>
op2 = 2
operation = MD
output_image = <undefined>

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 516, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 741, in _run_command
    result = self._interface.run(cwd=outdir)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 419, in run
    runtime = self._run_interface(runtime)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 814, in _run_interface
    self.raise_exception(runtime)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 745, in raise_exception
    ).format(**runtime.dictcopy())
RuntimeError: Command:
ImageMath 3 tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled_maths.nii.gz MD /autofs/space/tsogyal_001/users/kyungsun/SPARC_BAY1/fmriprep_tmp/fmriprep_wf/single_subject_HC030_wf/anat_preproc_wf/brain_extraction_wf/thr_brainmask/tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled.nii.gz 2
Standard output:

Standard error:
Illegal instruction (core dumped)
Return code: 132

I’m trying to check with our IT help desk if there’s any issue with my local machine. But let us know if you have any ideas.
Thanks!

Kyungsun

I am also getting this error:

RuntimeError: Command:
ImageMath 3 tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled_maths.nii.gz MD /scratch/wsp/fmriprep_wf/single_subject_MOATMS26_wf/anat_preproc_wf/brain_extraction_wf/thr_brainmask/tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled.nii.gz 2
Standard output:

Standard error:
Illegal instruction (core dumped)
Return code: 132

This is happening for all subjects, even ones that have previously run successfully (with the same version of fmriprep) after deleting their fmriprep and freesurfer folders and clearing their working directories.

The tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled.nii.gz file is in the directory specified. If I run the ImageMath command by hand, I get the same error. If I use GD (grayscale dilation) instead of MD (morphological dilation), the command finishes successfully. I have confirmed using 3dROIstats that tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled.nii.gz is a binary mask with 0s and 1s.
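For anyone hitting the same thing, the by-hand reproduction looks roughly like this (run inside the fMRIPrep container, from the thr_brainmask working directory; the filename is taken from the log above, and out_MD/out_GD are placeholder output names):

```shell
# Re-run the failing node's command manually. If MD (morphological dilation)
# dies with "Illegal instruction" (exit 132) while GD succeeds, the ANTs
# binary was likely built for newer CPU instructions than this node offers.
mask=tpl-OASIS30ANTs_res-01_label-brain_probseg_trans_resampled.nii.gz
if command -v ImageMath >/dev/null 2>&1; then
    ImageMath 3 out_MD.nii.gz MD "$mask" 2; echo "MD exit: $?"
    ImageMath 3 out_GD.nii.gz GD "$mask" 2; echo "GD exit: $?"
else
    echo "ImageMath (ANTs) not on PATH; run this inside the container"
fi
```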

I am using singularity with the command:

singularity run --cleanenv \
    -B ${BIDS_DIR}:${BIDS_DIR},${derivatives_DIR}:${derivatives_DIR},/scratch/wsp:/scratch/wsp,${fsLicense}:${fsLicense},${filterFile}:${filterFile},${cache}:${cache} \
    fmriprep-20.2.1.simg\
    $BIDS_DIR $derivatives_DIR participant \
    --participant-label sub-${subject} \
    --output-spaces fsnative fsaverage T1w MNI152NLin2009cAsym \
    --longitudinal \
    --work-dir /scratch/wsp \
    --cifti-output \
    --bids-filter-file ${filterFile} \
    --dummy-scans 3 \
    --fs-license-file ${fsLicense} \
    --mem_mb 16000 \
    --skip_bids_validation

After getting some help here (Error during brain masking · Issue #2433 · nipreps/fmriprep · GitHub) and here (antsBrainExtraction.sh failing on ImageMath · Issue #1204 · ANTsX/ANTs · GitHub), it was determined that the processors on the cluster I was using were too old for the version of ANTs that fMRIPrep uses (or, more specifically, for the way ANTs was compiled). I have moved processing to a newer cluster, and it is running well.
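To check a node for this ahead of time, you can inspect its CPU flags. Which instruction set the ANTs build actually needs depends on how it was compiled; the extensions below are just common suspects, not confirmed requirements:

```shell
# Print whether this CPU advertises some common SIMD extensions. A binary
# compiled with an extension listed as MISSING will die with SIGILL
# ("Illegal instruction", exit code 132) when it hits such an instruction.
flags=$(grep -m1 '^flags' /proc/cpuinfo)
for ext in sse4_2 avx avx2; do
    if printf '%s\n' "$flags" | grep -qw "$ext"; then
        echo "$ext: supported"
    else
        echo "$ext: MISSING"
    fi
done
```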


Thanks for sharing! I should also try running this on a different cluster.