fMRIPrep - possible antsApplyTransforms and HDF5 issue

Hello fmriprep experts,

Here’s what we’re doing

  • Our aim is to run anat-only (we’re doing this on purpose for an automated pipeline and would like to run the anatomical workflow first).
  • The error occurs at the antsApplyTransforms step.
  • We’re also getting an empty HDF5 (.h5) file.

This is the code that we’ve been running

singularity exec \
-B /corral-secure/projects/A2CPS/:/corral-secure/projects/A2CPS/ \
-e docker://jurrutia/fmriprep:20.2.1 \
fmriprep \
/corral-secure/projects/A2CPS/products/mris/UI_uic/bids/UI10003V1 \
. \
participant --participant_label 10003 \
-w ./work \
--write-graph \
--n-cpus 16 \
--notrack \
--mem_mb 48000 \
--bold2t1w-dof 9 \
--dummy-scans 0 \
--aroma-melodic-dimensionality 0 \
--fd-spike-threshold 0 \
--cifti-output 91k \
--anat-only \
--skip-bids-validation \
--fs-license-file /opt/freesurfer_license/license.txt

Errors that we get

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 516, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 741, in _run_command
    result = self._interface.run(cwd=outdir)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 419, in run
    runtime = self._run_interface(runtime)
  File "/usr/local/miniconda/lib/python3.7/site-packages/niworkflows/interfaces/fixes.py", line 42, in _run_interface
    runtime, correct_return_codes
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 814, in _run_interface
    self.raise_exception(runtime)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 745, in raise_exception
    ).format(**runtime.dictcopy())
RuntimeError: Command:
antsApplyTransforms --default-value 0 --dimensionality 3 --float 1 --input /corral-secure/projects/A2CPS/system/jobs/urrutia/job-f4851f9d-8564-4aea-bdd2-ca819f1f7c85-007-fmriprep-anat-ui10003v1/work/fmriprep_wf/single_subject_10003_wf/anat_preproc_wf/split_seg/aseg_label-WM_mask.nii.gz --interpolation Gaussian --output aseg_label-WM_mask_trans.nii.gz --reference-image /home1/05369/urrutia/.cache/templateflow/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz --transform /corral-secure/projects/A2CPS/system/jobs/urrutia/job-f4851f9d-8564-4aea-bdd2-ca819f1f7c85-007-fmriprep-anat-ui10003v1/work/fmriprep_wf/single_subject_10003_wf/anat_preproc_wf/anat_norm_wf/_template_MNI152NLin6Asym/registration/ants_t1_to_mniComposite.h5
Standard output:
Standard error:
Transform reader for /corral-secure/projects/A2CPS/system/jobs/urrutia/job-f4851f9d-8564-4aea-bdd2-ca819f1f7c85-007-fmriprep-anat-ui10003v1/work/fmriprep_wf/single_subject_10003_wf/anat_preproc_wf/anat_norm_wf/_template_MNI152NLin6Asym/registration/ants_t1_to_mniComposite.h5 caught an ITK exception:
itk::ExceptionObject (0x2323510)
Location: "unknown"
File: /src/ants/build/ITKv5/Modules/IO/TransformBase/src/itkTransformFileReader.cxx
Line: 128
Description: itk::ERROR: TransformFileReaderTemplate(0x231ef70): Could not create Transform IO object for reading file /corral-secure/projects/A2CPS/system/jobs/urrutia/job-f4851f9d-8564-4aea-bdd2-ca819f1f7c85-007-fmriprep-anat-ui10003v1/work/fmriprep_wf/single_subject_10003_wf/anat_preproc_wf/anat_norm_wf/_template_MNI152NLin6Asym/registration/ants_t1_to_mniComposite.h5
Tried to create one of the following:
HDF5TransformIOTemplate
HDF5TransformIOTemplate
MatlabTransformIOTemplate
MatlabTransformIOTemplate
TxtTransformIOTemplate
TxtTransformIOTemplate
You probably failed to set a file suffix, or
set the suffix to an unsupported type.

Here’s what we tried:

  1. We tried opening the HDF5 file.
  2. Added --clean-env, thinking it was a lingering path issue.
  3. Explicitly designated the template via --output-spaces, based on this issue: Resampling funcs to standard spaces fails in 1.4.0a0 · Issue #1626 · nipreps/fmriprep · GitHub
  4. Removed the --aroma-melodic-dimensionality option.
  5. Added -B $HOME:/home/fmriprep --home /home/fmriprep
     • The -B home option did not fix the error.
     • The test data fails in the same way, so it is likely an issue with the system and not with the data.
  6. Re-working $TEMPLATEFLOW_HOME didn’t fix the issue. Based on the logs, this was the output space: MNI152NLin2009cAsym:res-native. We believe the associated template file is this one: /home/fmriprep/.cache/templateflow/tpl-MNI152NLin2009cAsym

This is a lot of information, so please let me know how to make this clearer.

Thank you,
Heejung

Hi, does this reproduce? If you delete /corral-secure/projects/A2CPS/system/jobs/urrutia/job-f4851f9d-8564-4aea-bdd2-ca819f1f7c85-007-fmriprep-anat-ui10003v1/work/fmriprep_wf/single_subject_10003_wf/anat_preproc_wf/anat_norm_wf/_template_MNI152NLin6Asym/registration/ and re-run, do you get the same error again?

Without looking at it more closely, my first thought is that it’s a filesystem issue: either you hit a quota and were unable to finish writing the full .h5 file, or it was written but not properly synced by the time something tried to read it. /corral-secure indicates that this is probably a networked filesystem, which can have synchronization issues.
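
One way to test the truncated-file hypothesis is to check whether the transform file exists, is non-empty, and begins with the HDF5 format signature. The following is a minimal Python sketch, not part of fMRIPrep; the function name is hypothetical. It only looks at the 8-byte signature at offset 0, which assumes the file was written without a user block, so a file that passes could still be truncated further in.

```python
from pathlib import Path

# 8-byte signature at the start of an HDF5 file that has no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def check_h5(path):
    """Return 'missing', 'empty', 'bad-signature', or 'ok' for a file path."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    # Read only the first 8 bytes; the rest of the file is not inspected.
    with p.open("rb") as fh:
        head = fh.read(len(HDF5_SIGNATURE))
    return "ok" if head == HDF5_SIGNATURE else "bad-signature"
```

Calling check_h5 on the ants_t1_to_mniComposite.h5 path from the error message would distinguish a file that was never written from one that is present but empty.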

Thanks for the quick response. I’ll keep you posted after going through your suggested method.

[ So after doing a couple of checks ]

  1. We ruled out a disk quota issue. What we did to rule it out:
     • Ran df.
     • Checked disk space on work, home, and corral-secure.
     • We have written several GB since that first failure.
  2. At this point, we don’t think deleting the files would help, because we did 6 runs in total, each in a new location. In other words, each run had its own registration directory, so there is no overlapping “registration file” from a previous job to delete. A syncing problem seems possible, but not 6 times in a row.
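
The free-space check described above can also be scripted, for example to log it at the start of each job. This is a minimal Python sketch using the standard library's shutil.disk_usage; the helper names and the example paths are placeholders. Note that it reports filesystem-level free space (like df), not per-user quotas, which usually need a site-specific tool.

```python
import shutil


def free_gb(path):
    """Return free space, in GB (10**9 bytes), on the filesystem holding path."""
    return shutil.disk_usage(path).free / 1e9


def report(paths):
    """Map each path to its free space in GB, or None if it cannot be queried."""
    out = {}
    for p in paths:
        try:
            out[p] = round(free_gb(p), 1)
        except OSError:
            # Covers missing paths and filesystems that cannot be queried.
            out[p] = None
    return out


if __name__ == "__main__":
    # Placeholder locations: substitute the work, home, and data directories.
    for path, gb in report([".", "/tmp"]).items():
        print(path, gb)
```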

[ Next plan ]

  • We’ve submitted a new job using the verbose flag -vvv to get more information.

[ You mentioned “unable to finish writing to the full .h5” ]

  • This may be irrelevant, but I’ve been able to run a partially successful anat-only fmriprep on a different compute node (with slightly different parameters than posted above, so there are a million possible reasons for the different output).
  • There, it produces the anat NIfTI outputs but always errors out when writing the citation.html file: Could not generate CITATION.html file:
  • I wonder if something is going wrong while writing to specific files.

What would be a good thing to work on in the meantime, other than the -vvv option?

When that happens, just add the --md-only-boilerplate flag to avoid trying to convert the CITATION text to HTML and LaTeX.

I’m not sure -vvv is going to be a good move, since that level will be debugging the workflow submitter and you’ll get a status of the work queue every two seconds. For users, I don’t think I’d recommend going beyond -v.

Hello @effigies and fmriprep

We’ve found the solution and thought we’d report it to keep it on your radar.
It turns out that the home directory on the host system was overriding the home directory in the container.

Here’s the solution that worked for us:

  1. Set --no-home.
  2. Then pass --home $WHICHEVER_PATH to define the new home directory.
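
For anyone scripting the job submission, the two steps above could be folded into the launch command roughly as follows. This is a Python sketch that only assembles and prints the argument list; the helper name, the placeholder paths, and the trimmed-down set of fMRIPrep arguments are assumptions for illustration (bind mounts and the remaining options from the original call are omitted). The flag combination is the one reported above, not something checked against every Singularity version.

```python
import shlex


def build_command(image, bids_dir, out_dir, label, container_home):
    """Assemble a singularity exec call using the --no-home / --home fix."""
    return [
        "singularity", "exec",
        "--no-home",               # step 1: as reported, avoid mounting the host home
        "--home", container_home,  # step 2: directory to use as home instead
        "-e",                      # clean environment, as in the original call
        image,
        "fmriprep", bids_dir, out_dir, "participant",
        "--participant_label", label,
        "--anat-only",
    ]


cmd = build_command(
    image="docker://jurrutia/fmriprep:20.2.1",
    bids_dir="/path/to/bids",        # placeholder
    out_dir="/path/to/output",       # placeholder
    label="10003",
    container_home="/path/to/home",  # placeholder for $WHICHEVER_PATH
)
print(shlex.join(cmd))  # shlex.join requires Python 3.8+
```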

Thank you,
Heejung