Excessive size resulting from antsApplyTransforms using _xfm.h5 from fMRIprep output


Dear experts,

Thanks to the fMRIPrep Docker image, I was able to preprocess my anatomical and functional images in T1w space with little effort, using the option --output-spaces T1w. So far I have continued my analysis in T1w space, but my project now requires a further analysis in MNI space.

The simplest solution may be to rerun fMRIPrep without the --output-spaces option to obtain the default outputs in MNI space. However, that may take a lot of time and resources. In short, I would like to warp the functional BOLD outputs from T1w space to MNI space as efficiently as possible.

One of my colleagues suggested applying SPM normalize to the functional BOLD outputs. However, this may be only an ad hoc approach, given that fMRIPrep already provides bidirectional composite nonlinear transformation files ending in _xfm.h5, possibly for needs like mine.


After some searching and reading the fMRIPrep source code on GitHub (especially fmriprep/workflows/bold/registration.py and fmriprep/workflows/bold/resampling.py), I arrived at the following terminal command:

antsApplyTransforms --default-value 0 --dimensionality 3 --float 1 --input-image-type 3 --verbose 1 \
--interpolation LanczosWindowedSinc \
--input /input_files/sub-xxxx_ses-2_task-XXXX_run-1_space-T1w_desc-preproc_bold.nii.gz \
--reference-image /input_files/sub-xxxx_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz \
--transform /input_files/sub-xxxx_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5 \
--output /output_files/sub-xxxx_ses-2_task-XXXX_run-1_space-MNI_desc-preproc_bold.nii.gz 

Note that before running the command I had first copied the following into /input_files/:

  • the target to be transformed (3D functional BOLD time series on T1w)
  • the reference image (anatomy image on MNI, a default fMRIPrep output)
  • the transformation file (from_T1w_to_MNI..._xfm.h5, a default fMRIPrep output).

The resulting image looks good, i.e. it matches the MNI anatomical template well. The problem is that the resulting file /output_files/sub-xxxx_ses-2_task-XXXX_run-1_space-MNI_desc-preproc_bold.nii.gz takes up too much disk space: it has 168 time points and takes 3.13GB, while the input /input_files/sub-xxxx_ses-2_task-XXXX_run-1_space-T1w_desc-preproc_bold.nii.gz takes only 87MB.


  1. What caused this excessive size? Is there any solution to the problem (which may already be used by fMRIprep)?

  2. Is there a more efficient way than mine to warp functional BOLD outputs from T1w space to MNI space? Or is it best to rerun fMRIPrep with the default settings?

  3. If I were to use the 3.13GB warped image, would it be qualitatively different from what fMRIPrep would have produced with the default settings? In other words, does the standard fMRIPrep pipeline apply any additional corrections to the normalized functional BOLD image in MNI space?

Thank you so much in advance.

Two things, most likely.

  1. Voxel size and field-of-view. If you’re using your normalized T1w image as your reference, then your outputs are going to match your normalized T1w’s voxel sizes and field-of-view, which will be appropriate for structural data, but probably too high-density to be sensible for BOLD.
  2. Possibly your on-disk data type is larger for your output files. You’d just have to look.
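One way to "just look", as a minimal sketch: the grid shape, voxel sizes, and on-disk data type all live in the fixed 348-byte NIfTI-1 header, so they can be read with the Python standard library alone. The function name read_nifti1_summary and the small datatype table are illustrative and not part of fMRIPrep or ANTs, and the sketch assumes a NIfTI-1 file (not NIfTI-2).

```python
import gzip
import struct

# NIfTI-1 datatype codes for a few common on-disk types.
DTYPE_NAMES = {2: "uint8", 4: "int16", 8: "int32", 16: "float32", 64: "float64"}


def read_nifti1_summary(path):
    """Return shape, voxel sizes and on-disk data type from a NIfTI-1 header."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(348)  # a NIfTI-1 header is 348 bytes long

    # sizeof_hdr (first 4 bytes) must equal 348; use it to detect byte order.
    endian = "<" if struct.unpack("<i", header[:4])[0] == 348 else ">"
    dim = struct.unpack(endian + "8h", header[40:56])        # dim[0] = number of axes
    datatype, bitpix = struct.unpack(endian + "2h", header[70:74])
    pixdim = struct.unpack(endian + "8f", header[76:108])    # pixdim[1..] = spacing

    ndim = dim[0]
    return {
        "shape": dim[1:ndim + 1],
        "zooms": pixdim[1:ndim + 1],
        "dtype": DTYPE_NAMES.get(datatype, f"code {datatype}"),
        "bits_per_voxel": bitpix,
    }
```

Calling it on both the T1w-space input and the MNI-space output should show at a glance whether the grid, the data type, or both have grown.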

It would probably be easiest to rerun fMRIPrep. One thing that doing so buys you is that head-motion-correction, susceptibility-distortion-correction and normalization are applied in a single step, reducing resampling artifacts.

If you still have your working directory, it should not take very long to rerun.

There shouldn’t be any corrections applied, but there may be differences in artifacts based on resampling twice, rather than once. The only other consideration is that, if your input data is stored on-disk as int16, fMRIPrep will save the results as int16 to avoid needlessly inflating precision, so that’s one place a (very small) additional rounding difference could be introduced.


@effigies Thanks a lot for your considerate response.

The main culprits seem to have been i) the --float 1 option of antsApplyTransforms and ii) --reference-image. The 3.13GB output was stored as float32, in contrast to int16 for the default fMRIPrep output. Also, the output dimensions were taken from the reference image, 193 x 229 x 193, in contrast to 63 x 73 x 66 for the default output.
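As a rough check of how far those two factors go, here is the arithmetic as a small sketch. It uses only the numbers quoted in this thread (grid sizes, 168 time points, float32 vs int16) and counts uncompressed bytes, so it will not match the gzip-compressed .nii.gz sizes exactly; the helper name uncompressed_bytes is made up for illustration.

```python
def uncompressed_bytes(shape, n_volumes, bytes_per_voxel):
    """Size of the raw voxel data, ignoring the header and gzip compression."""
    n_voxels = 1
    for n in shape:
        n_voxels *= n
    return n_voxels * n_volumes * bytes_per_voxel


# Output resampled onto the normalized T1w grid, stored as float32 (4 bytes).
mni_t1w_grid = uncompressed_bytes((193, 229, 193), 168, 4)
# Default fMRIPrep-style BOLD grid, stored as int16 (2 bytes).
bold_grid = uncompressed_bytes((63, 73, 66), 168, 2)

print(f"{mni_t1w_grid / 1e9:.2f} GB")            # 5.73 GB
print(f"{bold_grid / 1e6:.0f} MB")               # 102 MB
print(f"{mni_t1w_grid / bold_grid:.0f}x larger")  # 56x larger
```

Roughly a factor of 28 comes from the denser grid and a factor of 2 from the data type, which is the same order of magnitude as the 3.13GB vs 87MB seen on disk after compression.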

Please allow me one more question: what exactly are the _xfm.h5 files for? I had thought that the _xfm.h5 files in the output were the very thing responsible for the T1w-to-template warp. Since the preprocessed images in T1w space have gone through every preprocessing step (including head-motion correction and SDC) except the T1w-to-template warp, I thought it would be legitimate to apply the _xfm.h5 file to them.

Thank you!

They contain an affine transform and nonlinear warp between two spaces. HDF5 is a container format that ANTs uses to specify a sequence of transformations.

I suspect the conceptual issue is what it means to transform data. To resample an image, you need a reference space that allows you to say “this voxel has this coordinate”, which you get from providing your --reference-image. At each voxel, you take the coordinate, and ask the transform file, “what coordinate does this correspond to in the original image?” You then find the nearest voxel(s) to the coordinate, and use an interpolation scheme to combine multiple voxels to get a best guess of the true value at the coordinate you’ve requested, and place that value in your new image.
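The procedure just described can be sketched in one dimension. This is a toy illustration, not what ANTs does internally: it assumes linear interpolation (the command in this thread uses LanczosWindowedSinc), a hypothetical function name resample_1d, and a made-up transform in which the source grid has 2 mm spacing.

```python
import math


def resample_1d(source, reference_coords, to_source_index):
    """Pull-style resampling: loop over the reference grid, look back into the source.

    source:           samples on the input grid (index i holds the value at i)
    reference_coords: coordinates of the output grid points
    to_source_index:  maps an output coordinate to a fractional source index
    """
    last = len(source) - 1
    out = []
    for x in reference_coords:
        s = to_source_index(x)
        if s < 0 or s > last:
            out.append(0.0)  # outside the source image: fill with a default value
            continue
        lo = int(math.floor(s))
        hi = min(lo + 1, last)
        w = s - lo  # fractional distance from the lower neighbour
        out.append((1 - w) * source[lo] + w * source[hi])
    return out


source = [0.0, 10.0, 20.0, 30.0]  # samples every 2 mm


def mm_to_index(x_mm):
    return x_mm / 2


coarse_reference = [0, 2, 4, 6]           # 2 mm output grid
fine_reference = [0, 1, 2, 3, 4, 5, 6]    # 1 mm output grid

print(resample_1d(source, coarse_reference, mm_to_index))
# [0.0, 10.0, 20.0, 30.0]
print(resample_1d(source, fine_reference, mm_to_index))
# [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
```

The same source and the same transform give 4 samples on one reference grid and 7 on the other: the output size follows the reference, not the input.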

You might start to see why you can get artifacts: some translated coordinates might land in the middle of a voxel, and the retrieved value will be the value at that voxel; others might land on a corner, and the result will be the mean of the 8 adjoining voxels.
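The centre-versus-corner contrast can be made concrete with trilinear interpolation, used here only as a simple stand-in for whatever interpolator is actually chosen. In this sketch the function name trilinear is illustrative, and integer coordinates are assumed to sit at voxel centres, so half-integer coordinates sit on corners.

```python
import math
from itertools import product


def trilinear(volume, x, y, z):
    """Interpolate volume[i][j][k] at a fractional voxel coordinate."""
    i0, j0, k0 = int(math.floor(x)), int(math.floor(y)), int(math.floor(z))
    fx, fy, fz = x - i0, y - j0, z - k0
    value = 0.0
    for di, dj, dk in product((0, 1), repeat=3):
        weight = (
            (fx if di else 1 - fx)
            * (fy if dj else 1 - fy)
            * (fz if dk else 1 - fz)
        )
        if weight:  # skip zero-weight neighbours (also avoids stepping past the edge)
            value += weight * volume[i0 + di][j0 + dj][k0 + dk]
    return value


# A 2 x 2 x 2 toy volume holding the values 0..7.
volume = [[[4 * i + 2 * j + k for k in range(2)] for j in range(2)] for i in range(2)]

print(trilinear(volume, 1, 1, 1))        # 7.0, exactly the voxel value
print(trilinear(volume, 0.5, 0.5, 0.5))  # 3.5, the mean of all 8 voxels
```

A coordinate on a voxel centre reproduces the data untouched, while one on a corner returns a smoothed average, so the amount of blurring varies from voxel to voxel across the output image.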

For more on this process, see Nitransforms’ ISBI 2020 submission: https://github.com/poldracklab/nitransforms/blob/master/docs/notebooks/NiTransforms%20-%20ISBI%202020.ipynb

It is legitimate. The only issue is that to get the BOLD series resampled to the T1w, you’ve already applied one interpolation; to then resample that to MNI, you have no alternative but to apply another interpolation, which means you have the potential to compound artifacts. This is why we work to do it in a single shot.
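A toy 1D sketch of that compounding, under simplifying assumptions: linear interpolation instead of a windowed sinc, and two half-voxel shifts standing in for the two resampling steps (BOLD to T1w, then T1w to MNI). The function name shift_linear is hypothetical.

```python
import math


def shift_linear(signal, shift):
    """Resample so that out[i] = signal(i + shift), using linear interpolation."""
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        s = i + shift
        if s < 0 or s > last:
            out.append(0.0)  # outside the input: fill with zero
            continue
        lo = int(math.floor(s))
        hi = min(lo + 1, last)
        w = s - lo
        out.append((1 - w) * signal[lo] + w * signal[hi])
    return out


signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# One interpolation with the combined transform (a shift of one whole voxel).
single_shot = shift_linear(signal, 1.0)
# Two interpolations, half a voxel each, composed after the fact.
two_steps = shift_linear(shift_linear(signal, 0.5), 0.5)

print(single_shot)  # [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(two_steps)    # [0.0, 0.25, 0.5, 0.25, 0.0, 0.0]
```

Both routes move the peak to the same place, but the two-step route halves it and spreads it over its neighbours. Composing the transforms first and interpolating once avoids that extra smoothing.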


@effigies Thank you for the enlightenment with detailed explanations. This thread helped me a lot.