I am trying to fit NiftiMasker on a large number of files, over 10000 of them. I am doing something like this:
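from nilearn.maskers import NiftiMasker  # nilearn.input_data in older releases
filenames = [f"file_{i}.nii" for i in range(1, 10001)]  # 10000 filenames: file_1.nii, file_2.nii, etc.
masker = NiftiMasker()
masker.fit(filenames)  # computes a single mask from all of the images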
This should create a mask using all 10000 of the images. However, I am having memory issues with this. If I use a much smaller subset of the files, for example 500, I have no issues.
If I understand correctly, your fit operation is done on a per-voxel basis. If you are getting a memory error during fitting, a quick solution might be to divide your mask into multiple pieces (e.g. one mask per hemisphere) and call fit multiple times with the different masks, instead of loading a smaller number of files. You can later merge the results in space.
However, if you are getting the memory error before that (when masking or loading your 10000 files), you could check the precision at which your NIfTI files are loaded into memory. If it makes sense numerically, you can reduce the precision (e.g. floats instead of doubles, or ints instead of floats). Hope it helps.
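To get a rough feel for how much the precision matters, here is a back-of-the-envelope sketch. The volume shape below is only an assumption (a common 2 mm MNI grid), not something taken from the files in question, and stack_size_gib is a hypothetical helper name; substitute your own shape.

```python
import numpy as np

n_files = 10_000
shape = (91, 109, 91)  # ASSUMED volume shape; replace with the shape of your images


def stack_size_gib(n_volumes, shape, dtype):
    """Memory needed to hold n_volumes dense volumes of the given dtype, in GiB."""
    n_voxels = int(np.prod(shape))
    return n_volumes * n_voxels * np.dtype(dtype).itemsize / 2**30


# Compare double precision, single precision and 16-bit integers.
for dtype in (np.float64, np.float32, np.int16):
    size = stack_size_gib(n_files, shape, dtype)
    print(f"{np.dtype(dtype).name}: {size:.1f} GiB")
```

Halving the item size halves the footprint, so going from doubles to floats is worth checking before anything more involved. Whether the lower precision is acceptable depends on your data.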
Is it possible to have multiple mask files, however? For example, one mask for the first 2000 files, another mask for the second 2000 files and so on. And then do a final fit with the five masks that I would have?
I seem to have found a solution (cc @ofgulban) in case anyone else needs this.
I created separate mask images for subsets of the dataset using NiftiMasker. I then used nilearn.masking.intersect_masks to combine them all, taking the union with a threshold of 0. You can also take the intersection with a threshold of 1. Things are much more memory efficient this way.