Loading subsets of nifti data in python (spm_get_data equivalent)

Johan · October 19, 2018, 1:26pm

spm_get_data is a very useful function in SPM that uses compiled routines to quickly load time series data for a set of voxel indices into a matrix. I use it all the time for ROI analysis, and I really miss it in Python. Is there something comparable? As far as I can tell, most packages (e.g., nitime, nilearn) load the entire 4D time series into memory and then index the resulting array, which won’t be as fast, and risks memory issues for large datasets.

nibabel supports loading slices of datasets by indexing the dataobj, but that approach doesn’t work with fancy indexing. I suppose I could work out what slices would cover my ROI coordinates, load that sub-matrix and then index it, but that’s a lot of trouble and risks negating any performance gains.

Kelly_Garner · May 20, 2020, 5:01am

Hi Johan

Did you find an answer to your question? I am currently looking for the same thing, implemented in FSL, ANTs, or FreeSurfer, that I can call through nipype.

Cheers
Kelly

Johan · May 21, 2020, 6:24am

Sorry, no progress to report here. Do report back if you find a way!

tbazeille · May 22, 2020, 9:05am

Not knowing the SPM function mentioned, I’m not sure I completely understand how to reproduce it.

To keep only some images from a 4D Niimg series imgs, the most straightforward thing to do with nilearn is:

import numpy as np
from nilearn.image import index_img

indexes_of_interest = np.array([0,3,6,10,14,20]) 
images_subset = index_img(imgs, indexes_of_interest)

images_subset is still a 4D Niimg here. If you want to keep only the signals within an ROI, based on a Niimg roi_mask:

from nilearn.input_data import NiftiMasker
roi_masker = NiftiMasker(mask_img=roi_mask).fit()
voxels_signals = roi_masker.transform(images_subset)

voxels_signals is now a 2D array of shape (number_indexes, number_voxels_in_ROI).

Johan · May 22, 2020, 12:10pm

Thanks for that. To clarify, spm_get_data is a compiled routine that takes an spm_vol struct referencing a set of n_vol 3D or 4D nifti volumes (basically, a nifti header) and a [3, n] array of xyz voxel indices, and returns a [n_vol, n] array of time courses. You can see it in use here:

https://github.com/jooh/pilab/blob/7bc68b4725b2c1296c0dd5abcea1e51c31849eef/containers/loadmaskedvolumes.m

% Convenience function for loading in-mask voxels for a set of volumes
% using spm_get_data.
%
% Inputs:
% paths: cell or char array of file paths to SPM-readable volumes
% mask: 3D matrix in same shape as volumes
%
% If you specify the mask output this function will strip out voxels that
% are 'bad' (ie, NaN or 0 at any time point) and return both cleaned up
% data and a cleaned up mask. If you specify only data output we assume you
% don't want any cleaning of data or mask.
%
% [data,[mask]] = loadmaskedvolumes(paths,mask)
function [data,mask] = loadmaskedvolumes(paths,mask)

% ensure logical
mask = mask > 0;
maskind = find(mask);
[x,y,z] = ind2sub(size(mask),maskind);

This file has been truncated.

I don’t know the underlying C code, but it seems to achieve this without loading the full 4D nifti into memory, and so performs far better than the alternative route (which in SPM land would be spm_read_vols to load the full 4D matrix into memory, followed by indexing in Matlab).

Reading through the Nilearn code, it looks like it does load the full 4D matrix into memory at this point, and then applies indexing in python:
https://github.com/nilearn/nilearn/blob/64faf99202c30fedc666f9846f8af017cecf8efd/nilearn/masking.py#L796

I suspect that this is not going to be as fast or memory efficient as spm_get_data, although I have not compared directly.
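For a rough sense of the memory side, here is a back-of-the-envelope calculation. The dimensions are invented for illustration (a 64 x 64 x 40 volume, 1000 time points, a 500-voxel ROI, float32 storage), so treat the numbers as an order-of-magnitude guide only.

```python
import numpy as np


def array_bytes(shape, dtype=np.float32):
    """Size in bytes of a dense array with the given shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


# Hypothetical dataset: full 4D series versus ROI time courses only.
full = array_bytes((64, 64, 40, 1000))
roi = array_bytes((1000, 500))

print(f"full 4D array:    {full / 2**20:.0f} MiB")  # 625 MiB
print(f"ROI time courses: {roi / 2**20:.1f} MiB")   # 1.9 MiB
```

Under these assumptions the full load needs a few hundred times more memory than the ROI matrix that is actually wanted.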