How to use RSA on fMRI data

orko · May 17, 2020, 9:14am

Hi,
I have a dataset of n subjects, with 347 stimuli and activity in 8 regions. I have a representation (embedding) of the stimuli, and I want to apply RSA to check the relation between neural pattern similarity and the representation in each of the regions (to find the region whose activity patterns most resemble the embeddings).
My data is in the form of a pandas DataFrame with the columns:
subject stimuli_embedding region1_activity region2_activity... region6_activity

What would be the best way to apply RSA (with Python) to these data?

bthirion · May 17, 2020, 5:49pm

Hi,
Is what you call region1_activity, region2_activity the average activity within each region, or a set of values for all voxels in the region?

If it is the average value, you cannot run RSA. The best thing you can do is to create a classifier to predict the class from the activity within all ROIs.
If it contains voxel values, then you can, for example, compute the accuracy of a classifier that predicts the class given voxel activity, using e.g.
http://nilearn.github.io/auto_examples/02_decoding/plot_haxby_searchlight.html#sphx-glr-auto-examples-02-decoding-plot-haxby-searchlight-py
Otherwise, you can use the following toolbox:
https://github.com/Charestlab/pyrsa
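
For the voxel-level case, the core RSA computation can be sketched with NumPy and SciPy alone (this does not use the toolbox above). Everything here is an assumption for illustration: the data are synthetic, `rsa_score` is a hypothetical helper, and the choice of correlation distance for the neural patterns, cosine distance for the embeddings, and Spearman correlation for the comparison is only one common option.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli, n_voxels, n_dims = 40, 60, 10

# Hypothetical data: one embedding vector per stimulus, and voxel patterns
# for one region that are a noisy linear mixture of the embedding.
embeddings = rng.normal(size=(n_stimuli, n_dims))
mixing = rng.normal(size=(n_dims, n_voxels))
region_patterns = embeddings @ mixing + rng.normal(size=(n_stimuli, n_voxels))


def rsa_score(patterns, model_features):
    """Spearman correlation between neural and model dissimilarities."""
    # pdist returns the condensed upper triangle of the dissimilarity matrix.
    neural_rdm = pdist(patterns, metric="correlation")
    model_rdm = pdist(model_features, metric="cosine")
    rho, _ = spearmanr(neural_rdm, model_rdm)
    return rho


score_region = rsa_score(region_patterns, embeddings)
# Control: a "region" whose patterns are unrelated to the embedding.
score_noise = rsa_score(rng.normal(size=(n_stimuli, n_voxels)), embeddings)
print(f"region: {score_region:.2f}, pure noise: {score_noise:.2f}")
```

Running the same function once per region gives one score per region, which is the comparison asked about in the first post.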
Best

orko · May 17, 2020, 7:12pm

@bthirion Thanks!
Can you please explain why I can’t run RSA with the average value? Can’t I run it between subjects?

bthirion · May 18, 2020, 6:04pm

You can still run it formally, but it won’t be informative. The information that you keep from each region is just one scalar value, the amount of activity, so this is very little to characterize the pattern of activity of that region and compare it to some metric of stimulus relatedness.

In that case, doing it across subjects does not help.
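
A small sketch of why a single scalar per stimulus carries so little pattern information, assuming synthetic region averages (all names are hypothetical): the dissimilarity between two stimuli reduces to the absolute difference in overall activity level.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n_stimuli = 5

# Hypothetical region-average activity: one scalar per stimulus.
mean_activity = rng.normal(size=(n_stimuli, 1))

# Euclidean distance between one-dimensional "patterns".
rdm = pdist(mean_activity, metric="euclidean")

# The same values, computed directly as absolute differences between pairs,
# in the same pair order that pdist uses (upper triangle, row by row).
i, j = np.triu_indices(n_stimuli, k=1)
abs_diff = np.abs(mean_activity[i, 0] - mean_activity[j, 0])

print(np.allclose(rdm, abs_diff))
```

All pairwise dissimilarities are therefore fixed by one number per stimulus, whereas a voxel-level RDM reflects the full multivariate pattern.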

The only thing that makes sense to me is to use the region-level averages as input to a global classification/regression method.
HTH

orko · May 20, 2020, 7:52am

@bthirion Thanks, can I do a between-subject similarity analysis? For example, between each of the 8 vectors of [1Xn] per-region activity and the stimuli embedding?
That way, for example, I could see that there is a high correlation between region #1 activity and the embedding, but only chance-level correlation between region #2 and the embedding, suggesting that region #1 represents the data better?

bthirion · May 20, 2020, 8:49pm

I think that comparing the similarity of within-region activation across subjects to “stimuli” would make sense if the output variables were not actually stimuli but some subject-related quantity, like a behavioral score, response time or any other individual characteristic.

Here I don’t really understand what you would infer from it. Sorry if I misunderstood.

orko · May 21, 2020, 1:13pm

@bthirion Thank you very much. The idea is to change the parameters of the embedding to see which correlate better with the brain data. Those parameters (e.g. more weight on specific features in the representation) would then possibly better resemble the way these data are cognitively represented, so we can learn something about the representation in the brain.
Does it make sense?

bthirion · May 23, 2020, 9:26am

Yes, but you can get this type of information by regression: which feature(s) best explain(s) brain activity. This is called an “encoding model” in the literature.
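
A minimal sketch of such an encoding model, assuming synthetic data and a hand-written ridge regression (the helper names `ridge_fit` and `cross_validated_r2`, the penalty `alpha`, and the fold scheme are all illustrative choices, not a reference implementation). It predicts one region's activity from the stimulus embedding and scores the prediction on held-out stimuli.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_features = 200, 10

# Hypothetical data: stimulus embeddings and one activity value per stimulus
# for a region, driven here by the first two embedding features only.
X = rng.normal(size=(n_stimuli, n_features))
true_weights = np.zeros(n_features)
true_weights[:2] = [1.0, -0.5]
y = X @ true_weights + rng.normal(scale=0.5, size=n_stimuli)


def ridge_fit(X, y, alpha):
    """Closed-form ridge weights (no intercept; data assumed roughly centred)."""
    penalty = alpha * np.eye(X.shape[1])
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def cross_validated_r2(X, y, alpha=1.0, n_folds=5):
    """Mean held-out R^2 over contiguous folds of stimuli."""
    all_idx = np.arange(len(y))
    scores = []
    for test_idx in np.array_split(all_idx, n_folds):
        train_idx = np.setdiff1d(all_idx, test_idx)
        weights = ridge_fit(X[train_idx], y[train_idx], alpha)
        residual = y[test_idx] - X[test_idx] @ weights
        total = y[test_idx] - y[train_idx].mean()
        scores.append(1.0 - (residual @ residual) / (total @ total))
    return float(np.mean(scores))


r2_full = cross_validated_r2(X, y)
# Dropping a group of features shows how much that group contributes.
r2_without_informative = cross_validated_r2(X[:, 2:], y)
print(f"all features: {r2_full:.2f}, "
      f"informative group removed: {r2_without_informative:.2f}")
```

Comparing held-out scores across embedding variants, or with and without a group of features, is one way to ask which features best explain the activity of a region.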

orko · May 24, 2020, 2:14pm

@bthirion Thank you for your answers.
I am aware of encoding models and have read quite a few of these papers, but:

1. I am failing to understand why this is the same information.
2. Doesn’t the encoding model have many more degrees of freedom that are not present in what I suggested (e.g. the regression model with its parameters, the evaluation metric)?
3. Doesn’t the encoding model assume that the features themselves have meaning, making it unsuitable for embedding methods such as PCA, VAE, BERT (for sentences), etc., in which the representation vector is meaningful but each feature by itself is meaningless?

bthirion · May 24, 2020, 5:44pm

1. Well, the most precise response to the question “what stimulus features explain activity in region XXX?” is given by an encoding model, which explicitly tests that. RSA is a slightly more indirect way to answer the question. Note that the two are related, see e.g.

2. But RSA has even more degrees of freedom (searchlight model, dissimilarity model, comparison statistic, etc.).
3. With encoding you can also test groups of features, if you believe that each of them, taken separately, is not meaningful.

orko · May 25, 2020, 6:37am

Thanks, I get your point on why they are conceptually the same / answer the same research question. But does building an encoding model contradict the similarity analysis or make it redundant? Isn’t it possible to come up with a finding in one but not in the other?

bthirion · May 25, 2020, 7:42am

Hi,
No, my point is rather that these are two formulations of the same problem. Practically, I would not expect strong differences between the results of RSA and encoding analysis.
Best,
Bertrand