TDT input data scaling: per run?


I was wondering whether the scaling is done across the run-specific samples per voxel (dimension), or across all samples in the training set. Say I have fMRI BOLD estimates, one beta per condition per run, with 2 conditions and 3 runs. In a leave-one-run-out cross-validated SVM analysis, my training data then consist of 4 samples (training data = 2 runs × 2 conditions). Should I scale the data independently within each run (which, with only 2 samples per run, yields just two possible values, -0.7071 and +0.7071, in every dimension), or should I scale across all 4 training samples? (Assuming that I don’t scale my test data.) Any general comments or suggestions on scaling for SVM are highly welcome as well!
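To illustrate the degenerate case described above (a hypothetical NumPy sketch, not TDT code): z-scoring separately per run, with only 2 samples per run, collapses every voxel to exactly ±1/√2 ≈ ±0.7071, regardless of the underlying data.

```python
import numpy as np

# Hypothetical setup: 2 training runs x 2 conditions = 4 samples, 5 voxels
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))          # rows: samples (betas), cols: voxels
runs = np.array([1, 1, 2, 2])        # run label for each sample

# Per-run z-scoring (ddof=1): for any pair (a, b), the z-scores are
# (a - mean) / std = ±(a - b)/2 / (|a - b|/sqrt(2)) = ±1/sqrt(2)
X_per_run = X.copy()
for r in np.unique(runs):
    idx = runs == r
    X_per_run[idx] = (X[idx] - X[idx].mean(axis=0)) / X[idx].std(axis=0, ddof=1)

print(np.unique(np.round(X_per_run, 4)))   # → [-0.7071  0.7071]
```

So per-run scaling with 2 samples per run throws away all amplitude information and leaves only the sign of the within-run difference per voxel.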

Many thanks!

Hi Yinan,

You can choose what kind of scaling you would like to use, but only ‘separate’ would scale by chunk (i.e. run).

I only use scaling if I feed in unscaled data (betas are usually standardized, i.e. scaled, anyway) or if decoding is for some reason really slow. Otherwise I haven’t seen much of a benefit myself.
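One practical note on the “assuming I don’t scale my test data” part of the question: the usual convention is to estimate the scaling parameters on the training samples only and then apply those same parameters to the held-out run, so no test information leaks into the scaling. A minimal sketch (assumed toy data, not TDT code):

```python
import numpy as np

# Toy data: 4 training samples x 2 voxels, plus one held-out test sample
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test  = np.array([[2.5, 25.0]])

mu = X_train.mean(axis=0)            # per-voxel mean from training data only
sd = X_train.std(axis=0, ddof=1)     # per-voxel std from training data only

X_train_z = (X_train - mu) / sd
X_test_z  = (X_test - mu) / sd       # test data reuses the training parameters

print(X_test_z)                      # → [[0. 0.]]
```

The test sample here happens to equal the training mean in both voxels, so it maps to zero; the point is simply that `mu` and `sd` come from the training set alone.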


Thank you so much for your prompt reply, Martin! I just checked with real data: scaling doesn’t change the decoding results much; if anything, raw betas actually work better.