Examples for decoding unthresholded continuous map

Hi Peng,

decoded_df, _ = decode.continuous.gclda_decode_map(decoder, encoded_img)

This step shouldn’t work. What you need to do is use decoder.transform() instead. gclda_decode_map will only work when the “model” you give it is a GCLDAModel.

You’re right that the code is running very slowly at the fitting stage. This is something we’ve been trying to improve, but it’s a difficult problem: balancing speed against memory usage with large datasets like Neurosynth or NeuroQuery. The necessary meta-analyses can create very large arrays in memory, which we try to avoid by using memory-mapped arrays. Unfortunately, that slows things down considerably.
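As a toy illustration of that trade-off (plain numpy, not NiMARE internals; the array sizes here are arbitrary):

```python
# Toy illustration: the same array can live fully in RAM or be backed by a
# file on disk via numpy's memmap. The memmap keeps peak memory low because
# the OS pages data in from disk on demand, but each access pays I/O cost.
import tempfile

import numpy as np

n_studies, n_voxels = 100, 10_000

# In-memory array: fast, but RAM usage scales with n_studies * n_voxels
in_memory = np.zeros((n_studies, n_voxels), dtype="float64")

# Disk-backed array: same interface, much smaller memory footprint
tmp = tempfile.NamedTemporaryFile(suffix=".dat", delete=False)
on_disk = np.memmap(tmp.name, dtype="float64", mode="w+", shape=(n_studies, n_voxels))

# Both behave like regular ndarrays
on_disk[0, :] = 1.0
print(float(on_disk[0].sum()))  # 10000.0
```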

That’s right. For each of the label-specific meta-analyses, each study needs to create a modeled activation map from its coordinates. To streamline that, we save out the modeled activation maps ahead of time so they can be easily loaded as necessary, rather than created from scratch each time.

It is definitely suboptimal, but one approach that might work better for you right now would be to generate the meta-analytic maps with a script and then perform the correlation directly. Basically, I would propose taking the contents of the CorrelationDecoder and running them directly.

import os

import nimare

# Save meta-analytic maps to an output directory
out_dir = "meta-analyses/"
os.makedirs(out_dir, exist_ok=True)

# `dataset` should be your Neurosynth Dataset, loaded beforehand
# (e.g., with nimare.dataset.Dataset.load())

# Initialize the Estimator
# You could use `low_memory=True` here if you want, but that will slow things down.
meta_estimator = nimare.meta.cbma.mkda.MKDAChi2()

# Pre-generate MA maps to speed things up
kernel_transformer = meta_estimator.kernel_transformer
dataset = kernel_transformer.transform(dataset, return_type="dataset")
dataset.save("neurosynth_with_ma.pkl.gz")

# Get features
labels = dataset.get_labels()
for label in labels:
    print("Processing {}".format(label), flush=True)
    label_positive_ids = dataset.get_studies_by_label(label, 0.001)
    label_negative_ids = list(set(dataset.ids) - set(label_positive_ids))
    # Require at least one study in each sample
    if (len(label_positive_ids) == 0) or (len(label_negative_ids) == 0):
        print("\tSkipping {}".format(label), flush=True)
        continue

    label_positive_dset = dataset.slice(label_positive_ids)
    label_negative_dset = dataset.slice(label_negative_ids)
    meta_result = meta_estimator.fit(label_positive_dset, label_negative_dset)
    meta_result.save_maps(out_dir=out_dir, prefix=label)

Once that has run, you will have a set of maps in the output directory, with filenames like:

meta-analyses/Neurosynth_TFIDF__memory_z_desc-consistency.nii.gz
meta-analyses/Neurosynth_TFIDF__memory_z_desc-specificity.nii.gz
meta-analyses/Neurosynth_TFIDF__pain_z_desc-consistency.nii.gz
meta-analyses/Neurosynth_TFIDF__pain_z_desc-specificity.nii.gz

From there, it should be easy enough to grab all of the files of a given type (the default for CorrelationDecoder is the z_desc-consistency map), then extract the labels based on the filenames, load those maps into memory, and then correlate them with your target map.
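To sketch that last step (the helper names here are my own, not NiMARE functions; this assumes the filename pattern shown above):

```python
# Sketch of the manual decoding step: extract the label from each map's
# filename, then correlate the map with a target image.
import os

import numpy as np


def label_from_filename(fname):
    """Extract the feature label from a meta-analysis map filename."""
    base = os.path.basename(fname)
    return base.split("_z_desc-")[0]


def correlate_maps(target, meta_map):
    """Pearson correlation between two flattened map arrays."""
    return float(np.corrcoef(target.ravel(), meta_map.ravel())[0, 1])


# Example of the filename parsing
print(label_from_filename("meta-analyses/Neurosynth_TFIDF__memory_z_desc-consistency.nii.gz"))
# Neurosynth_TFIDF__memory

# In practice, you would loop over the consistency maps, load each with
# nibabel, and correlate it with your target map's data, e.g.:
# from glob import glob
# import nibabel as nib
# for fname in sorted(glob("meta-analyses/*_z_desc-consistency.nii.gz")):
#     meta_data = nib.load(fname).get_fdata()
#     r = correlate_maps(target_data, meta_data)
```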

I cannot guarantee that this will be faster than NiMARE’s current decoder, and the arrays may still be large enough to trigger out-of-memory errors in numpy, but it may well work. Until we have a better solution in NiMARE, this may be the best method.

I hope that helps.

Best,
Taylor