How to interpret results of nimare.decode.discrete.ROIAssociationDecoder

I have a parcellated ROI mask (i.e. not voxel-wise values, but region-wise values of either 1 or 0) that I would like to decode using nimare. I am a little unsure about how to interpret the results of nimare.decode.discrete.ROIAssociationDecoder. There is some documentation on it, but I was hoping someone from the NiMARE team could explain once more what the resulting r-values mean, since I don't want to do something wrong here.

Here's some example code that just uses the dorsal attention network of the Yeo atlas as input:

import nimare
import numpy as np
import pandas as pd
from nilearn import datasets
from nilearn.image import load_img, new_img_like
from nilearn.plotting import plot_roi

# load the schaefer atlas
atlas_schaefer = datasets.fetch_atlas_schaefer_2018()
atlas_labels = [byte.decode('utf-8') for byte in atlas_schaefer['labels']]
atlas_img = load_img(atlas_schaefer['maps'])

# generate an artificial (parcellated) ROI mask by subsetting to only one Yeo-Network 
atlas_labels_df = pd.DataFrame({'label':atlas_labels})
atlas_labels_df['label_idx'] = range(1, len(atlas_labels_df) + 1)
roi_idxs = atlas_labels_df.loc[atlas_labels_df['label'].str.contains('DorsAttn'), 'label_idx']
rois = np.isin(atlas_img.get_fdata(), roi_idxs).astype(int)
roi_img = new_img_like(atlas_img, rois)
plot_roi(roi_img)

# get neurosynth data (Note: This can take a while!)
databases = nimare.extract.fetch_neurosynth(data_dir='../data')[0]

# convert to NiMARE dataset (Note: This can take a while!)
ds = nimare.io.convert_neurosynth_to_dataset(
    coordinates_file=databases['coordinates'],
    metadata_file=databases['metadata'],
    annotations_files=databases['features']
    )

# decode ROI image (Note: This can take a while!)
# See: https://nimare.readthedocs.io/en/latest/decoding.html#discrete-decoding
decoder = nimare.decode.discrete.ROIAssociationDecoder(roi_img)
decoder.fit(ds)
decoded_df = decoder.transform()
print(decoded_df.iloc[60:80,:].to_string())

The output from the last line of code gives us:

                                            r
feature                                      
terms_abstract_tfidf__aberrant      -0.038360
terms_abstract_tfidf__abilities      0.011698
terms_abstract_tfidf__ability        0.007748
terms_abstract_tfidf__able           0.013689
terms_abstract_tfidf__abnormal      -0.053229
terms_abstract_tfidf__abnormalities -0.063950
terms_abstract_tfidf__abnormality   -0.025151
terms_abstract_tfidf__absence       -0.010446
terms_abstract_tfidf__absent         0.018008
terms_abstract_tfidf__abstract       0.030507
terms_abstract_tfidf__abuse         -0.019701
terms_abstract_tfidf__acc           -0.026439
terms_abstract_tfidf__access         0.002581
terms_abstract_tfidf__accompanied   -0.001844
terms_abstract_tfidf__accordance    -0.011592
terms_abstract_tfidf__according      0.016201
terms_abstract_tfidf__accordingly   -0.002407
terms_abstract_tfidf__account       -0.006491
terms_abstract_tfidf__accounted     -0.011223
terms_abstract_tfidf__accounts      -0.009894

How would I intuitively interpret the correlation values? Is there a reason why nimare doesn’t output corresponding p-values?


Hi @JohannesWiesner,

Correlation value interpretation: this is the correlation measuring the strength of the relationship between a study reporting a coordinate in/near the ROI and the weighting of the term in that study's abstract.

EXAMPLE
Using the term ability: a correlation of 0.0 would suggest that an abstract mentioning ability has no relationship with a coordinate being reported in the ROI. If the correlation were 1.0, a study would report a coordinate in the ROI whenever ability is mentioned in the abstract.
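
A toy numerical sketch of that idea (made-up numbers, purely for intuition; this is not NiMARE's internal implementation):

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_studies = 1000

# 1 = the study reports a coordinate in/near the ROI, 0 = it does not (hypothetical)
roi_hit = rng.integers(0, 2, size=n_studies)

# hypothetical tf-idf weights of a term in each study's abstract
unrelated_term = rng.random(n_studies)                  # no relationship to the ROI
related_term = roi_hit + 0.01 * rng.random(n_studies)   # weight tracks the ROI indicator

print(pearsonr(roi_hit, unrelated_term))  # r near 0
print(pearsonr(roi_hit, related_term))    # r near 1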

This does not control for baseline rates the way a chi-square test would, so some terms will have higher correlations across ROIs than other terms (terms with little specificity, like "account"), and some ROIs will have higher correlations across terms than other ROIs (like the dACC/insula).
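
For contrast, here is a rough sketch (with made-up counts, not NiMARE code) of how a chi-square test would account for base rates, by comparing how often a term appears among ROI-related studies versus the rest of the dataset:

import numpy as np
from scipy.stats import chi2_contingency

# rows: studies with / without a coordinate in the ROI
# columns: term present / term absent in the abstract (counts are invented)
contingency = np.array([
    [120, 880],     # ROI studies
    [900, 13100],   # non-ROI studies
])

chi2, p, dof, expected = chi2_contingency(contingency)
print(chi2, p)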

p-values are not included since they are mostly non-sensible. The Neurosynth dataset contains roughly 15,000 studies, so each r is a correlation between two ~15,000-element vectors. When the vectors are that large, any non-zero correlation will come out as statistically significant; for every value you posted, you could place an infinitesimally small p-value next to the correlation.
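
A quick simulation (random data, not the actual Neurosynth corpus) shows why: with ~15,000 observations, even a trivially small correlation is "significant":

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_studies = 15000

x = rng.normal(size=n_studies)
y = 0.05 * x + rng.normal(size=n_studies)  # true correlation of roughly 0.05

r, p = pearsonr(x, y)
print(r, p)  # r is tiny, but p is vanishingly small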

Hope this helps!
James
