I have 18 .nii.gz files (after fmriprep).
Each file is from a different subject.
I have 2 conditions, and each subject belongs to one of them (9 subjects are False and 9 subjects are True).
I want to run a searchlight analysis to classify these files (meaning, have a classifier that predicts a subject’s condition based on its data).
Is this possible? In all the examples I saw, the searchlight is always within subject.
Hi @orko, if I understand correctly, I think your question may be related to this other post: Haxby searchlight example for multiple subjects?
If not, can you give more detail about your data and what you want to do?
@ymzayek Thanks, I’m not sure that this is related to my problem.
To my understanding, in the Haxby dataset, each subject can have various labels per session - house, face, etc.
In my case, the classification is per-subject.
Meaning, subject #1 only saw faces and subject #2 only saw houses. So I have two types of subjects - house or faces.
I want to run a searchlight analysis with a classifier that predicts the type of subject, given the data.
Ok I see. I am not sure this is implemented, but perhaps there is a more appropriate approach for your analysis in nilearn. I will let someone else chime in. Perhaps @bthirion can better answer this?
Hi,
Formally, the problem of inter-subject searchlight analysis is the same as the intra-subject analysis: you have a bunch of fMRI files associated with labels, and you want to measure the statistical associations between images and labels. So you can essentially use the same code.
This being said, there is an outstanding problem: performing such a statistical analysis on n=18 images will not yield any meaningful result, especially because there is a multiple comparisons issue (across all searchlights).
Best,
Bertrand
@bthirion Thanks for the response!
- Actually, in all the examples I saw, the Searchlight object gets one subject’s data and fits the classifier on it. So I am not sure how to use the existing code when there is only a single label per subject. Do you maybe have an example?
- Can you please elaborate on why the results will be meaningless?
- Assuming that you have performed standard preprocessing of your data, the images are samples in the same space, so you can formally act as if they were data from one subject. To put it differently, you can concatenate the individual images into a 4D image (using e.g. Nilearn: Statistical Analysis for NeuroImaging in Python — Machine learning for NeuroImaging), get the corresponding vector of labels, and give both to the searchlight function. I can try and craft an example, but it won’t be much different from what I described here (see the sketch after this list).
- The confidence intervals around the searchlight statistics you obtain will be huge. Indeed, these confidence intervals decrease as 1/sqrt(n_samples). You can e.g. take a look at
Cross-validation failure: Small sample sizes lead to large error bars - ScienceDirect
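A minimal sketch of the concatenation approach, assuming one preprocessed volume per subject; the file and mask names are hypothetical placeholders:

```python
# Inter-subject searchlight: stack the 18 single-subject images into one
# 4D image, so each "time point" is one subject. File names are placeholders.
import numpy as np
from nilearn.image import concat_imgs
from nilearn.decoding import SearchLight
from sklearn.model_selection import StratifiedKFold

subject_files = [f"sub-{i:02d}_one_volume.nii.gz" for i in range(1, 19)]
labels = np.array([0] * 9 + [1] * 9)  # condition of each subject (False/True)

imgs = concat_imgs(subject_files)  # 4D image: one volume per subject

searchlight = SearchLight(
    mask_img="brain_mask.nii.gz",  # hypothetical whole-brain mask in MNI space
    radius=5.6,                    # sphere radius in mm
    estimator="svc",
    cv=StratifiedKFold(n_splits=3),
    n_jobs=-1,
)
searchlight.fit(imgs, labels)
# searchlight.scores_ holds the cross-validated accuracy per voxel
```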
Best,
Bertrand
@bthirion Thanks!
So just to make sure I understand:
You suggest that I take each of the 18 4D images (one per subject) and concatenate them into an 18-item list of 4D data.
This list, together with an 18-item label vector, should be fed into the searchlight. Thus, each subject’s data (~6 minutes) will be “treated” by the searchlight model the way a single TR normally is?
However, this makes the data much smaller - instead of the usual hundreds of samples, I will actually have only 18 samples. Therefore, for each sphere, the error bars will be roughly 1/sqrt(18) ≈ 0.24, making the result meaningless?
I thought that you had one volume per subject, but it looks like that is not the case. What are the 6 minutes of data you have per subject? Are these time courses?
It is unclear to me what per-subject information you actually want to use in the searchlight analysis.
Best,
Bertrand
@bthirion I have 250 TRs per subject.
I want to run a searchlight to discover which voxels are relevant to the classification of a subject’s condition. It is possible to average a subject’s data to a 1×nVoxels vector to get “one volume” per subject, if it helps - so that the data from each subject would be a vector of mean activity per voxel.
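For what it’s worth, that averaging step could look like this (file names are placeholders):

```python
# Reduce each subject's 4D run (250 TRs) to a single mean-activity volume.
# File names are hypothetical placeholders.
from nilearn.image import mean_img

subject_files = [f"sub-{i:02d}_bold.nii.gz" for i in range(1, 19)]
mean_volumes = [mean_img(f) for f in subject_files]  # one 3D image per subject
```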
I don’t think that you can discover much if you don’t have more structure in your data. Do the 250 scans correspond to a synchronized stimulation condition?
The average activity across time is usually not considered a meaningful feature. What matters are synchronized variations in brain activity.
Best,
Bertrand
Yes, during the 250 TRs all subjects hear the same song. I understand that this is sub-optimal, but we were asked to run this searchlight.
Is there a way to do so intra-subject?
Hi @orko,
could you maybe outline the task/paradigm a bit further? Depending on whether participants performed a passive listening paradigm (“just” hearing the song) or conducted a certain task (e.g. “press a button whenever you hear a guitar”, etc.), different modelling/analysis approaches might be feasible. The same holds true for the spatial and temporal alignment of participants:
- did you perform a spatial transformation into template space?
- did participants hear the same part of the song at the “same time” (in terms of TR and stimulus synchronization)?
Cheers, Peer
P.S.: as outlined by @bthirion: independent of your answers to the above questions, whatever results you obtain won’t be meaningful or even statistically valid (worst case), as the low n will prevent obtaining any reliable outcome measure and furthermore won’t allow any assessment of generalization. Isn’t there any room for discussion re the analysis of this dataset? For example, going a bit more into the direction of connectivity analyses?
@PeerHerholz Thank you!
Regarding your questions:
- It was a passive paradigm, no task was given.
- Yes, I transformed them to MNI152.
- Yes, it was synchronised.
- There might be some room for discussion, but I first want to make sure I understand the problem, in order to communicate it correctly forward - I will basically have 1 label per subject, resulting in 18 samples, meaning very low predictive power, and very prone to false positives considering the high number of comparisons in a searchlight - right?
- Would love to hear any ideas you may have re connectivity or any other type of analysis. Overall, the goal is to check whether a certain area (as defined using a specific existing mask we have) is indicative of condition (group of subjects).
Hi @orko,
cool, thx so much for the information!
Gotcha! So this makes “classic analysis approaches” a bit difficult, as e.g. for a GLM you would need somewhat distinct regressors/events (e.g. music onset, tempo change, etc.). Furthermore, depending on the regressors/events and GLM approach, you would most likely get a very limited number of estimated responses, e.g. 1 beta/z map per regressor/event. Other analysis approaches might be more feasible/suitable, please see the response to 5.
Alrighty. In this case and without any further steps you would assume “anatomical feature correspondence” regarding the searchlight, cross-validation, etc., i.e. that a given voxel/signal location in one participant is identical or comparable in another participant. However, while often “ok”, spatial transformation into a reference/template space is of course never 100% perfect, and additionally there’s prominent inter-participant variability regarding voxel/signal and thus feature location. Thus, it’s always a bit “sub-optimal”. Other approaches, e.g. functional alignment, could be interesting to look into here, but please see the response to 5.
This might allow for other analysis approaches, for example also spatio-temporal searchlights, but please see 5. re this.
Yeah, it’s a rather holistic problem: few participants and little data per participant. If your goal is to evaluate whether the response to music can “predict” certain participant groups, then having an appropriate amount of data (including SNR, etc.) and participants (inter- vs. intra-variability) is crucial for obtaining reliable estimates, certainty, generalization, etc. As mentioned by @bthirion, the confidence intervals you would most likely obtain will be huge and, together with the other outlined problems, render the performance and interpretation of your model and results drastically limited at best. Are there potentially other, comparable datasets out there that you could maybe utilize for training a model and then testing it on your data? Then again, the question would be if this is somewhat meaningful/feasible/suitable.
If you already have an ROI that you want to evaluate, a searchlight analysis might not be the approach you want to take, as it tries to obtain insights on “where” information re a given classification task (and thus potentially a cognitive process) is located. It’s thus rather “explorative” in spatial terms. However, if your ROI is rather “big”, i.e. the temporal lobe, you could utilize it as a searchlight mask, i.e. an ROI within which the searchlight is run, to evaluate “where” in the ROI certain information is entailed. Otherwise, you might just want to use your ROI in a rather “classic” decoding approach, i.e. extract signal/patterns (whatever these mean atm) and evaluate if this entails information re your question, that is, if it can distinguish groups. Then again, the same problems as for the searchlight re the amount of data are present.
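A minimal sketch of the ROI-restricted searchlight idea, assuming a nilearn workflow; the mask file names are hypothetical placeholders:

```python
# Run the searchlight only within an existing ROI by passing it as
# process_mask_img. File names are hypothetical placeholders.
from nilearn.decoding import SearchLight
from sklearn.model_selection import StratifiedKFold

searchlight = SearchLight(
    mask_img="brain_mask.nii.gz",           # voxels that may enter a sphere
    process_mask_img="reward_mask.nii.gz",  # only centre spheres inside the ROI
    radius=5.6,                             # sphere radius in mm
    estimator="svc",
    cv=StratifiedKFold(n_splits=3),
    n_jobs=-1,
)
# imgs: 4D image with one volume per subject; labels: the 18 group labels
# searchlight.fit(imgs, labels)
```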
As mentioned before, there might be other analyses you could look into, for example connectivity, encoding and/or functional alignment/SRM.
Re the first, you could employ a functional connectivity analysis, computing the correlation of ROI time series, and then compare that between groups via a non-parametric test, maybe with permutations, to address the amount and distribution of the data.
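A rough sketch of that group comparison, assuming per-subject ROI correlations have already been computed (the values below are random placeholders):

```python
# Permutation test on the group difference in ROI-to-ROI correlation.
# conn would hold one correlation value per subject, e.g. computed from
# time series extracted with nilearn's maskers; placeholders here.
import numpy as np

rng = np.random.default_rng(0)
conn = rng.normal(size=18)            # placeholder: per-subject ROI correlations
groups = np.array([0] * 9 + [1] * 9)  # the two subject groups

observed = conn[groups == 0].mean() - conn[groups == 1].mean()

# Null distribution: shuffle the group labels
null = np.empty(10000)
for p in range(null.size):
    perm = rng.permutation(groups)
    null[p] = conn[perm == 0].mean() - conn[perm == 1].mean()

p_value = np.mean(np.abs(null) >= np.abs(observed))
```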
Re the second, you could extract stimulus features of the song, i.e. chroma, pitch, timbre, valence, etc., and utilize those in a regression analysis, predicting brain responses, either for certain ROIs or for voxels. The resulting maps, or the performance of the model, could then be compared between groups.
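A hedged sketch of such an encoding analysis; the stimulus feature matrix and ROI signal are assumed to exist as files (placeholders):

```python
# Encoding sketch: predict a subject's ROI time course from stimulus
# features of the song. Feature extraction (e.g. with an audio toolbox)
# is assumed to have happened beforehand; file names are hypothetical.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold, cross_val_score

features = np.load("song_features.npy")  # shape (250, n_features), hypothetical
roi_signal = np.load("sub-01_roi.npy")   # shape (250,), hypothetical

model = RidgeCV(alphas=np.logspace(-2, 3, 10))
scores = cross_val_score(model, features, roi_signal,
                         cv=KFold(n_splits=5), scoring="r2")
# The per-subject mean R² could then be compared between the two groups.
```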
Re the third, you could treat your data as originating from a (somewhat) naturalistic paradigm and go into the direction of functional alignment/shared response modelling/intersubject correlation/etc.
HTH, cheers, Peer
@PeerHerholz Thank you very much for this very detailed answer!
My ROI is pretty big (the reward system) and using a “classic” decoding approach we did find it to be significant. However, we were advised to run a whole-brain searchlight analysis to ensure these results are unique to this area and not a fluke. Do you have any other suggestions as to how to do it?
Re your suggestions:
A) Do you maybe have an example tutorial for the connectivity analysis you propose?
B) We did try an encoding model - but the results were inconclusive.
C) I did see the option of ISC and we will probably try it. It should be done within my ROI, and then I should assess statistical significance by comparing to a null distribution? I read about the option of functional alignment and SRM, but to my understanding they are more about reducing data dimensions than reaching a certain conclusion about the data? Maybe I missed something?
Thanks!
Ok, thanks for the information! It’s not my research field, thus please excuse the questions, but does this “reward system” ROI entail one big cohesive ROI or multiple ROIs at different locations that are not connected? Also: would you mind sharing how this ROI was derived (sorry if I missed this)?
Instead of running a whole-brain searchlight, you could also utilize a whole-brain atlas based on functional markers/aspects, for example DiFuMo, and run one decoding analysis for each ROI. In its “highest form” DiFuMo contains 1024 ROIs (IIRC), which would result in 1024 decoding analyses - notably less than e.g. a whole-brain searchlight over 70-100k voxels, concerning both computational resources and the number of comparisons you have to correct for. Additionally, through its characteristics of being a whole-brain atlas/parcellation with a high number of ROIs, you won’t lose a lot of insight re spatial information, and you already have some labeling, which eases up statistical inference. However, for all the things mentioned above, the same shortcomings as for the searchlight apply, i.e. the small number of participants, etc.
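A sketch of what the per-component decoding could look like with nilearn; the feature choice and CV scheme are assumptions, and file names are placeholders:

```python
# DiFuMo-based alternative to a whole-brain searchlight: one small decoding
# analysis per component. File names and analysis choices are hypothetical.
import numpy as np
from nilearn.datasets import fetch_atlas_difumo
from nilearn.maskers import NiftiMapsMasker
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import LinearSVC

difumo = fetch_atlas_difumo(dimension=1024)
masker = NiftiMapsMasker(maps_img=difumo.maps, standardize=True)

subject_files = [f"sub-{i:02d}_bold.nii.gz" for i in range(1, 19)]
X = np.stack([masker.fit_transform(f) for f in subject_files])  # (18, 250, 1024)
y = np.array([0] * 9 + [1] * 9)

cv = StratifiedKFold(n_splits=3)
scores = [
    cross_val_score(LinearSVC(), X[:, :, k], y, cv=cv).mean()
    for k in range(X.shape[2])
]
# One accuracy per component; the 1024 tests still need multiple-comparison
# correction, and the n=18 caveats still apply.
```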
There are multiple options to do this, I think. Most commonly it might be beta series correlations or DCM. Re the first, you could have a look at NiBetaSeries and a respective tutorial (nb: I’m biased here because I worked on this package, sorry). Re the latter, this resource might be helpful. Please note that both traditionally assume a different type of experimental design/paradigm and thus some modifications might be required.
Kk!
Re ISC and comparable approaches, I would suggest checking out Brainiak, e.g. this tutorial.
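For a feel of the computation, a minimal leave-one-out ISC in plain numpy (Brainiak’s ISC functions cover this and more); the data array is a placeholder:

```python
# Leave-one-out ISC: correlate each subject's ROI time course with the
# average time course of all other subjects. Placeholder data here.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(18, 250))  # placeholder: per-subject ROI time courses

isc = np.empty(len(data))
for i in range(len(data)):
    others = np.delete(data, i, axis=0).mean(axis=0)  # mean of all other subjects
    isc[i] = np.corrcoef(data[i], others)[0, 1]
# Group differences in ISC could then be assessed with a permutation test,
# as in the connectivity sketch above.
```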
I think I wouldn’t agree with these approaches being mainly used for data dimensionality reduction rather than analyses. In fact, the opposite might be the case: you might drastically increase the dimensions of shared information and actually enable certain analysis approaches that wouldn’t be possible otherwise, as well as enhance existing ones. However, all of that of course heavily depends on the data one has and the hypotheses/ideas one wants to test. That being said: do you have additional data from your participants other than the 6-min song? I’m asking because that might not be enough data to implement the above-mentioned approaches in a feasible/suitable manner.
Cheers, Peer
@PeerHerholz Thank you very much for the elaborated response!
- OK, I think I will just skip the searchlight analysis altogether.
- Is it common to run intra-subject beta series analysis?
- No other data. 6 minutes should be enough data for ISC? Is there any reference for this, so I will be able to justify it if I don’t run it?