Convert beta estimates to t-statistics for whole brain searchlight MVPA - is it necessary for my task?

Hello all!

I have a fast event-related task for which I would like to conduct a whole brain searchlight MVPA analysis. From what I’ve read, it seems like I should use single-trial estimates; however, some papers seem to use the raw beta parameter estimates, and others, à la Misaki et al. (2010), use t-statistics.

At this point, I’ve generated a 4D beta series using the LS-S procedure (Mumford et al., 2012) for each participant, but I want to make sure the searchlight I run has the best possible chance of success - I’ll be using a linear SVM classifier.

What I would like to know is how necessary it is to transform my beta-series estimates into t-statistics, and if I need to do so, what the best way to perform that transformation might be. For example, I know you divide each beta estimate by its standard error to get the t-statistic, but I’m not sure how that standard error should be calculated (trial-wise? voxel-wise?). It might be worth noting that I still plan to detrend and normalize (z-score - this IS different from using the t-statistic, right?) the data prior to conducting the searchlight, and that I’m using Python/PyMVPA to do so.
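To make the z-score vs. t-statistic distinction concrete, here is a small sketch on synthetic data (plain NumPy, not PyMVPA’s API; all names and sizes here are illustrative): z-scoring standardizes each voxel’s betas across trials, whereas a t-statistic divides each single-trial beta by that estimate’s own standard error from the GLM fit.

```python
# Illustrative sketch (synthetic data): voxel-wise z-scoring of a
# (trials x voxels) beta series. This is NOT the same as computing
# t-statistics, which divide each beta by its GLM standard error.
import numpy as np

rng = np.random.default_rng(0)
beta_series = rng.normal(loc=2.0, scale=1.5, size=(40, 100))  # 40 trials, 100 voxels

# z-scoring: standardize each voxel's betas across trials
z = (beta_series - beta_series.mean(axis=0)) / beta_series.std(axis=0)

# t-statistic, by contrast: t_i = beta_i / SE(beta_i), where SE(beta_i)
# comes from the residual variance and design matrix of the GLM that
# produced beta_i - a per-estimate quantity, not an across-trial spread.
```

So the two operations use different denominators: the across-trial standard deviation for z-scoring, versus the GLM’s estimate-specific standard error for the t-statistic.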

Any specific help or general advice is appreciated. I really love how supportive the community has been, so far - it’s been an awesome help - and I’ll take whatever I can get! This is my first foray into MVPA, and the amount of “experimenter degrees of freedom” is pretty overwhelming!!

As a bonus question: should I have used the method from Turner et al. (2012) to generate my parameter estimates? Reading other papers has provided mixed advice on which method to choose. If so, I’m a little unclear on what the model looks like. For the LSS method I used, there’s a regressor for 1) the current trial, 2) all other trials of that type, 3) all trials of the other type, and 4) nuisance variables such as motion parameters. I did the LS-S modeling in SPM, if that makes a difference.
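For concreteness, here is a rough sketch of the LS-S loop described above, run on synthetic data for a single voxel (the regressors are random stand-ins for HRF-convolved onsets; nothing here is SPM-specific, and all names/sizes are made up):

```python
# Hedged sketch of the LS-S loop: one GLM per trial, keeping only the
# current trial's beta. Synthetic regressors stand in for convolved onsets.
import numpy as np

rng = np.random.default_rng(1)
n_scans, n_trials = 200, 20
trial_regs = rng.normal(size=(n_scans, n_trials))   # stand-ins for single-trial regressors
condition = rng.integers(0, 2, size=n_trials)       # two trial types
nuisance = rng.normal(size=(n_scans, 6))            # e.g. motion parameters
y = rng.normal(size=n_scans)                        # one voxel's time series

betas = np.empty(n_trials)
for i in range(n_trials):
    same = [j for j in range(n_trials) if j != i and condition[j] == condition[i]]
    other = [j for j in range(n_trials) if condition[j] != condition[i]]
    X = np.column_stack([
        trial_regs[:, i],                   # 1) the current trial
        trial_regs[:, same].sum(axis=1),    # 2) all other trials of that type
        trial_regs[:, other].sum(axis=1),   # 3) all trials of the other type
        nuisance,                           # 4) nuisance regressors
        np.ones(n_scans),                   # intercept
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas[i] = beta[0]  # keep only the current-trial estimate
```

The result is one beta per trial, which is what gets stacked into the 4D beta series for the searchlight.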

Thanks again, everybody!

As far as I know, you don’t need to use z-scores: there is no literature showing that it must be done.
However, I recommend doing it: z-scoring is a way to normalize the signal - across runs/sessions in particular - removing some unwanted variability.
The best way to do it is to rely on a GLM for each run that gives each trial its own column of the design matrix. This can provide a z-score for each trial, corresponding to the effect standardized with respect to noise-induced uncertainty.
If the GLM is well-posed, the t-stat will have lots of degrees of freedom, making the difference with respect to the z-statistic unnoticeable.
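To illustrate that last point (a sketch using scipy only): mapping a t-statistic to the z-score with the same tail probability shows the two become nearly indistinguishable as the degrees of freedom grow.

```python
# Sketch: map a t-statistic to the z-score with the same one-sided
# p-value; with many degrees of freedom the two nearly coincide.
from scipy import stats

def t_to_z(t_value, dof):
    p = stats.t.sf(t_value, dof)   # one-sided tail probability of the t-stat
    return stats.norm.isf(p)       # z-score with the same tail probability

# small dof: noticeable gap; large dof: nearly identical
gap_small = abs(t_to_z(2.5, 10) - 2.5)
gap_large = abs(t_to_z(2.5, 400) - 2.5)
```

With only 10 degrees of freedom the t-distribution’s heavier tails pull the equivalent z-score well below the t-value; with hundreds of degrees of freedom the gap is negligible.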

Could you go into a bit more detail on why you’re recommending what Mumford et al. (2012) and Turner et al. (2012) call the least-squares-all (LSA) approach instead of the least-squares-separate (LSS) approach those papers recommend? In Abdulrahman and Henson (2016), I believe they found that the ratio of trial-to-trial variability to scan noise was an important variable in deciding between LSA and LSS, but I don’t think there was a clear winner.

Hey bthirion,

Thanks for your reply! I do recall a paper (the citation escapes me…) which suggested that z-scoring didn’t make a difference in the results of their decoding analysis, but as you describe it, it makes sense to do so.

I’m familiar with the method you describe, and I am performing a GLM on each trial. Like tsalo, I’m also curious why you recommend what is sometimes referred to as LSA instead of LSS - I was under the impression LSS might be better suited to my data!

Additionally, do you mean that the results of my GLM are already in “z-score” units? I was under the impression the output was in beta values. I’m probably just confused, though! It sounds like you’re saying that if I use the output of a GLM, I don’t need to convert it to a t-stat - do I understand correctly? Thanks again for your response - your feedback is really appreciated!!


I don’t know what you have as a GLM output: it may be beta, t-stat, or z-score images… it depends on the software you use and exactly how you call the function. You may want to copy-paste some code on a gist, for instance. If it is from some Python library, I should be able to help you :slight_smile:

This is an important point: theoretically, the LSS method is not well-grounded, as it makes some rather extreme simplifications of the model. Yet in practice it performs well, most likely because it offers a favorable bias/variance tradeoff.
In truth, it is hard to know which one should be the winner. Maybe some day we should run a systematic benchmark on several datasets to get an answer, or at least provide well-grounded guidelines.
Note also the work we did here, in which we benchmarked LSS against alternatives:


I used SPM12 to build the models, with just the basic “Specify 1st-level” function called through a batch script - I didn’t change any of the default options. I did some extra digging, and I’m fairly certain the output is a beta image. Based on your previous comment, I think that means I should apply some sort of transformation (as opposed to using the raw beta values)?

Although I’ll be doing the searchlight and post-GLM processing in Python, I haven’t fully switched to it for pre-processing/basic GLM analysis yet (although I plan to!)

That paper you shared looks interesting - every time I think I’m getting a handle on the right steps to take, there turn out to be so many extra layers!!

To get t-stats (spmT maps) out of SPM, you need to specify a list of contrasts of the type [0 … 0 1 0 … 0], where the 1 in the i-th position corresponds to the i-th column of the design matrix, i.e. the i-th trial in your analysis (assuming an LSA approach).
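In Python terms, each of those contrasts is just a one-hot vector over the design-matrix columns (a hypothetical sketch; the column counts here are made up, not from your design):

```python
# Hypothetical sketch: one-hot contrast vectors for an LSA design with
# n_trials single-trial columns followed by nuisance columns.
import numpy as np

n_trials, n_nuisance = 40, 6      # made-up counts
n_columns = n_trials + n_nuisance

contrasts = []
for i in range(n_trials):
    c = np.zeros(n_columns)
    c[i] = 1.0                    # the 1 picks out the i-th trial's column
    contrasts.append(c)
```

Each vector tests a single trial’s regressor against baseline, which yields one spmT map per trial.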

Thank you for the information!! I really appreciate your comments and help on this.

Hi Kade,

I have been working with PyMVPA to analyze dog fMRI data for some time now. I’m not an expert, but I have tried many different approaches and variables, so I have some “on the ground” experience. If you want, we can chat, and maybe there are some details I can help you with.