Examples for decoding unthresholded continuous map

Hi Peng,

decoded_df, _ = decode.continuous.gclda_decode_map(decoder, encoded_img)

This step shouldn’t work. What you need to do is use decoder.transform() instead. gclda_decode_map will only work when the “model” you give it is a GCLDAModel.

You’re right that the code is running very slowly at the fitting stage. This is something we’ve been trying to improve, but it’s a difficult problem: balancing speed against memory usage with large datasets like Neurosynth or NeuroQuery. The necessary meta-analyses can create very large arrays in memory, which we try to avoid by using memory-mapped arrays. Unfortunately, that slows things down considerably.
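As a toy illustration of that trade-off (plain numpy, not NiMARE internals; the array sizes here are arbitrary):

```python
# Toy illustration: the same array can live fully in RAM or be backed by a
# file on disk via numpy's memmap. The memmap keeps peak memory low because
# the OS pages data in from disk on demand, but each access pays I/O cost.
import tempfile

import numpy as np

n_studies, n_voxels = 100, 10_000

# In-memory array: fast, but RAM usage scales with n_studies * n_voxels
in_memory = np.zeros((n_studies, n_voxels), dtype="float64")

# Disk-backed array: same interface, much smaller memory footprint
tmp = tempfile.NamedTemporaryFile(suffix=".dat", delete=False)
on_disk = np.memmap(tmp.name, dtype="float64", mode="w+", shape=(n_studies, n_voxels))

# Both behave like regular ndarrays
on_disk[0, :] = 1.0
print(float(on_disk[0].sum()))  # 10000.0
```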

That’s right. For each of the label-specific meta-analyses, each study needs to create a modeled activation map from its coordinates. To streamline that, we save out the modeled activation maps ahead of time so they can be easily loaded as necessary, rather than created from scratch each time.

It is definitely suboptimal, but one approach that might work better for you right now would be to generate the meta-analytic maps with a script and then perform the correlation directly. Basically, I would propose taking the contents of the CorrelationDecoder and running them directly.

import os

import nimare

# Save meta-analytic maps to an output directory
out_dir = "meta-analyses/"
os.makedirs(out_dir, exist_ok=True)

# `dataset` should be your Neurosynth Dataset, loaded beforehand
# (e.g., with nimare.dataset.Dataset.load())

# Initialize the Estimator
# You could use `low_memory=True` here if you want, but that will slow things down.
meta_estimator = nimare.meta.cbma.mkda.MKDAChi2()

# Pre-generate MA maps to speed things up
kernel_transformer = meta_estimator.kernel_transformer
dataset = kernel_transformer.transform(dataset, return_type="dataset")
dataset.save("neurosynth_with_ma.pkl.gz")

# Get features
labels = dataset.get_labels()
for label in labels:
    print("Processing {}".format(label), flush=True)
    label_positive_ids = dataset.get_studies_by_label(label, 0.001)
    label_negative_ids = list(set(dataset.ids) - set(label_positive_ids))
    # Require at least one study in each sample
    if (len(label_positive_ids) == 0) or (len(label_negative_ids) == 0):
        print("\tSkipping {}".format(label), flush=True)
        continue

    label_positive_dset = dataset.slice(label_positive_ids)
    label_negative_dset = dataset.slice(label_negative_ids)
    meta_result = meta_estimator.fit(label_positive_dset, label_negative_dset)
    meta_result.save_maps(out_dir=out_dir, prefix=label)

Once that has run, you will have a set of maps in the output directory, with filenames like:

meta-analyses/Neurosynth_TFIDF__memory_z_desc-consistency.nii.gz
meta-analyses/Neurosynth_TFIDF__memory_z_desc-specificity.nii.gz
meta-analyses/Neurosynth_TFIDF__pain_z_desc-consistency.nii.gz
meta-analyses/Neurosynth_TFIDF__pain_z_desc-specificity.nii.gz

From there, it should be easy enough to grab all of the files of a given type (the default for CorrelationDecoder is the z_desc-consistency map), then extract the labels based on the filenames, load those maps into memory, and then correlate them with your target map.
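To sketch that last step (the helper names here are my own, not NiMARE functions; this assumes the filename pattern shown above):

```python
# Sketch of the manual decoding step: extract the label from each map's
# filename, then correlate the map with a target image.
import os

import numpy as np


def label_from_filename(fname):
    """Extract the feature label from a meta-analysis map filename."""
    base = os.path.basename(fname)
    return base.split("_z_desc-")[0]


def correlate_maps(target, meta_map):
    """Pearson correlation between two flattened map arrays."""
    return float(np.corrcoef(target.ravel(), meta_map.ravel())[0, 1])


# Example of the filename parsing
print(label_from_filename("meta-analyses/Neurosynth_TFIDF__memory_z_desc-consistency.nii.gz"))
# Neurosynth_TFIDF__memory

# In practice, you would loop over the consistency maps, load each with
# nibabel, and correlate it with your target map's data, e.g.:
# from glob import glob
# import nibabel as nib
# for fname in sorted(glob("meta-analyses/*_z_desc-consistency.nii.gz")):
#     meta_data = nib.load(fname).get_fdata()
#     r = correlate_maps(target_data, meta_data)
```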

I cannot guarantee that this will be faster than NiMARE’s current decoder, and the arrays may still be large enough to trigger out-of-memory errors in numpy, but it may well work. Until we have a better solution in NiMARE, this may be the best method.

I hope that helps.

Best,
Taylor