Our group wants to run statistics after a within-subject representational similarity analysis. Each subject has 2 conditions, and each condition has 1,000 maps of correlation values. These maps were generated by resampling the data within each subject in a given condition. We want to compare the 2 conditions, and we also want to test whether the correlation maps within each condition are significantly different from zero. We planned to use “nilearn.mass_univariate.permuted_ols”, but we don’t fully understand how it works or whether it applies in the context of our experiment as described above. Finally, does the algorithm produce a map of p-values that is corrected for multiple comparisons across voxels?

Hi, nilearn.mass_univariate.permuted_ols performs a permutation test: it checks whether the difference between the two conditions is larger than the differences you get when the condition labels are shuffled. It has built-in correction for multiple comparisons, using the max-T method (tabulating the distribution of the cross-voxel maximum).
There are two things you should look at in detail:

you may want to z-transform your correlation values (Fisher transform) so that they behave more like Gaussian variables. This makes the mean-difference test more sensitive. However, the permutation test is valid even without the z-transform.

your (2 × 1,000) samples should be independent; it is unclear from your description whether they are.
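As a rough illustration of what such a test does under the hood, here is a minimal numpy sketch of a label-shuffling permutation test with max-T correction, including the Fisher z-transform mentioned above. This is not nilearn's actual implementation; the array sizes and variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 20 independent correlation maps per condition, 50 voxels (invented sizes).
n_samples, n_voxels = 20, 50
cond_a = np.tanh(rng.normal(0.2, 0.3, (n_samples, n_voxels)))  # correlation values
cond_b = np.tanh(rng.normal(0.0, 0.3, (n_samples, n_voxels)))

# Fisher z-transform so the correlations behave more like Gaussian variables.
za, zb = np.arctanh(cond_a), np.arctanh(cond_b)

observed = za.mean(axis=0) - zb.mean(axis=0)

# Permutation test: shuffle condition labels, and for max-T correction record
# the maximum absolute statistic across all voxels in each permutation.
pooled = np.vstack([za, zb])
n_perm = 1000
max_null = np.empty(n_perm)
for i in range(n_perm):
    perm = rng.permutation(2 * n_samples)
    d = pooled[perm[:n_samples]].mean(axis=0) - pooled[perm[n_samples:]].mean(axis=0)
    max_null[i] = np.abs(d).max()

# Corrected p-value per voxel: fraction of permutations whose cross-voxel
# maximum exceeds the observed statistic at that voxel.
p_corrected = (1 + (max_null[:, None] >= np.abs(observed)).sum(axis=0)) / (n_perm + 1)
```

Because the null is built from the cross-voxel maximum, every voxel's p-value is already corrected for multiple comparisons across voxels.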

Condition 1 consists of z-scored correlation results from a model-based RSA analysis performed on trials of condition 1, and condition 2 is the same but done with trials from condition 2. Because we have 2,000 trials per subject, and we shuffle and repeat the analysis 1,000 times with randomly selected samples, we believe the correlations that we get from conditions 1 and 2 are independent, even though they come from data from the same subject. Would you agree?

The second thing we want to do is to compare condition 1 against zero correlation, and likewise for condition 2. Can this be done with the nilearn function?

The danger is that most procedures will assume that the 1,000 or 2,000 available samples are independent, which they are not. IIUC, you’re doing some kind of bootstrap analysis, and indeed you cannot count bootstrap samples as independent (otherwise, you could generate an infinite number of samples and make any effect significant…).

For the comparison against 0, the code does not handle that; this would need to be added to Nilearn (with the null distribution simulated by random sign flips of the values).
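The sign-flip idea can be sketched by hand in numpy. This is an illustration of the principle, not an existing Nilearn function; the data and names are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 20 independent z-scored correlation maps, 50 voxels (invented sizes).
n_samples, n_voxels = 20, 50
maps = rng.normal(0.1, 0.3, (n_samples, n_voxels))

observed = maps.mean(axis=0)

# Under the null hypothesis the values are symmetric around 0, so randomly
# flipping the sign of each sample simulates the null distribution of the mean.
n_perm = 1000
max_null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1.0, 1.0], size=(n_samples, 1))
    max_null[i] = np.abs((signs * maps).mean(axis=0)).max()

# max-T corrected p-values, same logic as in the two-condition case.
p_corrected = (1 + (max_null[:, None] >= np.abs(observed)).sum(axis=0)) / (n_perm + 1)
```

Note that this, too, assumes the samples being sign-flipped are independent.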

Hi Bertrand, thanks for your message. We have a similar problem in a study in which we scanned 7 participants for 6 MRI sessions of 1 h each. One set of analyses involves model-based RSA, and we would like to be able to make inferences at the single-subject level. We did searchlight RSA: for each searchlight, we compute the correlation between a representational dissimilarity matrix (RDM) based on a model and the brain RDM based on the BOLD responses to the experimental stimuli. These RDMs comprise 96 experimental conditions. Because we have 1,728 trials per subject, we resample the fMRI data so that each time we randomly pick 96 fMRI scans corresponding to those 96 experimental conditions, and compute the searchlight RSA for each resampling. We repeat this 500 times, so we have 500 RSA searchlight maps. Then, to assess the statistical significance of these RSA maps, we apply the Fisher transform and run a one-sample non-parametric t-test using randomise in FSL. My question concerns the assumption of independence among the observations entered into the t-test. Based on what you note, we cannot assume each searchlight RSA map derived from each resampling is independent, but at the same time we do not know of any statistical procedure that would meet our goal. Do you have any advice here? I wonder whether it would be useful at all to demonstrate that the pattern of results holds regardless of whether 30, 60, or 90 random resamplings are performed? Your advice is much appreciated!
thanks so much and keep well
best
david

For instance, would it be sensible to compare, within each subject, the distribution of RSA maps produced by resampling (or bootstrapping) with a distribution of “chance” RSA maps derived from shuffling the labels in the model and brain RDMs? Other than this, I cannot think of any way to investigate this issue at the single-subject level.

I would advise performing 18 resamplings on disjoint samples and doing the statistics on these independent samples.
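Concretely (a minimal sketch with invented variable names, using the numbers above: 1,728 trials split into 18 disjoint sets of 96), the trial indices could be partitioned like this:

```python
import numpy as np

rng = np.random.default_rng(0)

n_trials, n_conditions = 1728, 96
n_splits = n_trials // n_conditions  # 18 disjoint resamplings

# Shuffle the trial indices once, then cut them into non-overlapping blocks,
# so that each trial is used in exactly one RSA map.
indices = rng.permutation(n_trials)
splits = indices.reshape(n_splits, n_conditions)

# Each row of `splits` indexes one disjoint sample of 96 scans; the 18
# resulting RSA maps can then be treated as independent observations.
```

Since no trial appears in more than one split, the 18 maps do not share data, unlike bootstrap resamples.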

Note that you can work with dependent samples and report the average accuracy (which remains unbiased). You can also report the number of times the accuracy is greater than chance, which is a measure of effect strength. But AFAIK you cannot attach p-values to these numbers…

Indeed, you can also report that you achieve higher accuracy than a shuffled version of the data in, say, 90% of the samples. That number, 90%, remains meaningful too.
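Computing that summary is straightforward (a toy numpy sketch; the accuracy arrays here are invented placeholders standing in for the real and label-shuffled analyses):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented placeholder scores: accuracy on each of 500 bootstrap samples,
# and on a label-shuffled version of the same samples.
acc_real = rng.normal(0.60, 0.05, 500)
acc_shuffled = rng.normal(0.50, 0.05, 500)

# Fraction of samples where the real analysis beats the shuffled one.
# This is a descriptive effect-strength measure, not a p-value.
fraction_better = (acc_real > acc_shuffled).mean()
```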