Loading subsets of NIfTI data in Python (spm_get_data equivalent)

spm_get_data is a very useful function in SPM that uses compiled routines to quickly load time series data for a set of voxel indices into a matrix. I use it all the time for ROI analysis, and I really miss it in Python. Is there something comparable? As far as I can tell, most packages (e.g., nitime, nilearn) load the entire 4D time series into memory and then index the resulting array, which won’t be as fast, and risks memory issues for large datasets.

nibabel supports loading slices of datasets by indexing the dataobj, but that approach doesn’t work with fancy indexing. I suppose I could work out what slices would cover my ROI coordinates, load that sub-matrix and then index it, but that’s a lot of trouble and risks negating any performance gains.
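For reference, here is a minimal sketch of that bounding-box idea. The function name load_voxel_timeseries is made up for illustration, and it assumes 0-based voxel indices in a (3, n) array and a 4D (x, y, z, t) layout. With nibabel you would pass nib.load(path).dataobj as the first argument. The demo below uses a plain NumPy array as a stand-in so it runs without any image file.

```python
import numpy as np


def load_voxel_timeseries(dataobj, ijk):
    """Return an (n_vol, n_voxels) array for 0-based voxel indices ijk of shape (3, n).

    dataobj is anything that supports basic slicing on a 4D (x, y, z, t) array,
    e.g. nibabel's img.dataobj (assumption: hypothetical helper, not a nibabel API).
    """
    ijk = np.asarray(ijk, dtype=int)
    lo = ijk.min(axis=1).tolist()                # lower corner of the bounding box
    hi = (ijk.max(axis=1) + 1).tolist()          # upper corner (exclusive)
    # Basic slicing only: this is the step an array proxy can serve lazily.
    block = np.asarray(dataobj[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2], :])
    # Fancy indexing is applied to the small in-memory block, not the file.
    i, j, k = ijk - np.array(lo)[:, None]
    return block[i, j, k, :].T


# Stand-in for img.dataobj: a small random 4D array with 5 time points.
rng = np.random.default_rng(0)
demo = rng.normal(size=(6, 7, 8, 5))
ijk = np.array([[1, 4, 2],    # x indices
                [3, 3, 6],    # y indices
                [0, 7, 5]])   # z indices
ts = load_voxel_timeseries(demo, ijk)
```

The catch is the one already mentioned: if the ROI voxels are scattered across the volume, the bounding box approaches the full image and the benefit disappears.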

Hi Johan

Did you find an answer to your question? I am currently seeking the same, implemented in either fsl, ants or freesurfer, that I can call through nipype.


Sorry, no progress to report here. Do report back if you find a way!

I don’t know the SPM function you mention, so I’m not sure I fully understand what needs to be reproduced.

To keep only some images from a 4D niimg series imgs, the most straightforward thing to do with nilearn is:

import numpy as np
from nilearn.image import index_img

# imgs is a 4D Niimg-like object (a file path or an already loaded image)
indexes_of_interest = np.array([0, 3, 6, 10, 14, 20])
images_subset = index_img(imgs, indexes_of_interest)

images_subset is still a 4D Niimg here. If you want to keep only the signals within an ROI defined by a niimg roi_mask:

# nilearn >= 0.9; older releases expose NiftiMasker in nilearn.input_data
from nilearn.maskers import NiftiMasker
roi_masker = NiftiMasker(mask_img=roi_mask).fit()
voxels_signals = roi_masker.transform(images_subset)

voxels_signals is now a 2D array of shape (number_indexes, number_voxels_in_ROI).

Thanks for that. To clarify, spm_get_data is a compiled routine that takes an spm_vol struct referencing a set of n_vol 3D or 4D nifti volumes (basically, a nifti header) and an [xyz, n] array of n voxel indices, and returns an [n_vol, n] array of time courses. You can see it in use here:

Without knowing the underlying C code, it seems to achieve this without loading the full 4D nifti into memory, and so is far more performant than the alternative route (which in SPM land would be spm_read_vols to load the full 4D matrix into memory followed by indexing in Matlab).
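To illustrate the kind of access pattern I mean (this is not a claim about what SPM does internally), here is a sketch using numpy.memmap. It assumes an uncompressed file holding a Fortran-ordered 4D array, no intensity scaling, and that the shape, dtype and byte offset are already known (in a real nifti file they would come from the header). The function name memmap_voxel_timeseries and the fake 352-byte header are made up for the demo.

```python
import os
import tempfile

import numpy as np


def memmap_voxel_timeseries(path, shape, dtype, offset, ijk):
    """Read (n_vol, n_voxels) time courses from a raw Fortran-ordered 4D array on disk.

    Hypothetical helper: shape, dtype and offset are assumed to be known already.
    """
    vol = np.memmap(path, dtype=dtype, mode="r", offset=offset, shape=shape, order="F")
    i, j, k = np.asarray(ijk, dtype=int)
    # Fancy indexing a memmap copies only the requested elements into memory.
    return np.array(vol[i, j, k, :]).T


# Demo: write a small fake image (placeholder header bytes + raw data) and read two voxels.
shape = (4, 5, 6, 10)                      # (x, y, z, t)
full = np.arange(np.prod(shape), dtype=np.int16).reshape(shape)
ijk = np.array([[0, 3],                    # x indices
                [1, 4],                    # y indices
                [2, 5]])                   # z indices
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fake.img")
    with open(path, "wb") as f:
        f.write(b"\0" * 352)               # stand-in for a header, not a valid one
        f.write(full.tobytes(order="F"))   # voxel data in Fortran (x-fastest) order
    ts = memmap_voxel_timeseries(path, shape, np.int16, 352, ijk)
```

This would not work as-is for gzipped files, and any scaling stored in the header would still have to be applied to the result.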

Reading through the Nilearn code, it looks like it does load the full 4D matrix into memory at this point, and then applies indexing in python:

I suspect that this is not going to be as fast or memory efficient as spm_get_data, although I have not compared directly.