fMRI decoding in block design

Hi everyone,

I’m a beginner in fMRI decoding analysis and would really appreciate some advice.

In my experiment, the TR is 1.35 s, and the design is block-based.

  • Main task: 4 conditions × 5 runs
  • Localiser task: 2 conditions (U and N letters) × 2 runs, used as the training dataset for decoding
    I’m mainly interested in the neural representations in V1 and V2.

My question is about how to prepare the input data for decoding.
Should I:

  1. Model each trial as a separate regressor in the first-level GLM (i.e., a beta-series approach) and use those beta estimates as decoding inputs,
    or
  2. Use a different method more suitable for block design?

I’ve noticed that many studies use the first (single-trial beta) approach, but when I tried it, the results didn’t seem appropriate for my block-design data.

So I’m wondering — what is the recommended way to extract decoding features for a block design like mine?
Should I average the time series within each block, or perform some other preprocessing before decoding?

Any suggestions, references, or example workflows would be really appreciated!

Thanks in advance,
Shanshan

Dear Shanshan,
In the case of blocks, you may want to model each block as one long event, or do you want to distinguish individual events within the blocks? Distinguishing events within a block is ill-posed due to the sluggish HRF.
So the best idea is likely to model each block as one "big" event.
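As a rough sketch of what "each block as one big event" means for the design matrix (made-up onsets and durations, and a canonical double-gamma HRF; not a full GLM), the regressor is just a block-long boxcar convolved with the HRF:

```python
# Sketch: one regressor where each block is modeled as ONE long event,
# i.e. a boxcar spanning the whole block convolved with a canonical HRF,
# instead of separate per-trial events. Onsets/durations are made up.
import numpy as np
from scipy.stats import gamma

t_r = 1.35
n_vols = 200
t = np.arange(n_vols) * t_r

# Boxcar: 1 during each block, 0 elsewhere.
block_onsets = [10.0, 60.0, 110.0]   # seconds (placeholder values)
block_duration = 16.0                # whole block = one event
boxcar = np.zeros(n_vols)
for onset in block_onsets:
    boxcar[(t >= onset) & (t < onset + block_duration)] = 1.0

# Canonical double-gamma HRF sampled at the TR.
hrf_t = np.arange(0, 32, t_r)
hrf = gamma.pdf(hrf_t, 6) - 0.35 * gamma.pdf(hrf_t, 16)
hrf /= hrf.sum()

# The block regressor: zero before the first block, then a sustained
# response over each block rather than transient per-trial bumps.
regressor = np.convolve(boxcar, hrf)[:n_vols]
```

In Nilearn this corresponds to giving `FirstLevelModel` an events table with one row per block, where `duration` covers the full block.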
Does that make sense in your context?
Best,
Bertrand

Dear Bertrand,

Thank you very much for your helpful reply!

After reading your comment, I also looked into how other block-design decoding studies have handled this. For example, one study described the following approach:

“For the multivariate analyses, spatially non-smoothed, motion-corrected, high-pass filtered (128 s) data were obtained for each ROI. Data were temporally filtered using a third-order Savitzky–Golay low-pass filter (window length 21) and z-scored for each run separately. Resulting time courses were shifted by three TRs (i.e. 4.2 s) to compensate for HRF lag, averaged over trials, and null-trials discarded. For each participant, this resulted in 18 samples per class for the localiser and 96 samples per condition for the main runs. For classification, a logistic regression classifier (scikit-learn, default settings) was trained on the time-averaged localiser data and tested on the time-averaged experimental data.”
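To check my understanding of those steps, here is my rough attempt at reproducing them in Python (scipy + scikit-learn, on random placeholder data with made-up trial boundaries and an arbitrary train/test split, so please correct me if I have misread the quote):

```python
# Rough re-implementation of the quoted steps on placeholder data.
import numpy as np
from scipy.signal import savgol_filter
from scipy.stats import zscore
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_vols, n_voxels, trial_len = 300, 500, 10   # all made-up sizes
run = rng.standard_normal((n_vols, n_voxels))          # fake ROI time courses
labels = np.repeat(np.tile([0, 1], n_vols // (2 * trial_len)), trial_len)

# 1. Savitzky-Golay low-pass filter (third order, window length 21)
#    along time, then z-score each voxel within the run, as in the quote.
run = savgol_filter(run, window_length=21, polyorder=3, axis=0)
run = zscore(run, axis=0)

# 2. Shift by 3 TRs to compensate for HRF lag: drop the first 3 volumes
#    and the last 3 labels so each volume lines up with the condition
#    shown ~3 TRs earlier.
shift = 3
data, cond = run[shift:], labels[:-shift]

# 3. Average volumes within each trial to get one sample per trial.
n_trials = len(data) // trial_len
X = data[: n_trials * trial_len].reshape(n_trials, trial_len, -1).mean(axis=1)
y = cond[: n_trials * trial_len].reshape(n_trials, trial_len)[:, 0]

# 4. Train on one set of time-averaged samples, test on another (here an
#    arbitrary split; in the real analysis, train = localiser runs and
#    test = main-task runs). On random data accuracy should be ~chance.
clf = LogisticRegression(max_iter=1000).fit(X[:15], y[:15])
acc = clf.score(X[15:], y[15:])
```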

Would you say this is generally considered good practice for such designs? Also, could a similar workflow be implemented in Nilearn?

Thank you again for your time and advice! I really appreciate your guidance as I’m learning to design and analyze decoding studies properly.

Best regards,
Shanshan

This design is possible, but there is a caveat: successive volumes are not independent. They may be used jointly for training, but make sure you don't put consecutive volumes on opposite sides of the train/test split. Normally, test data should come from a different run to ensure they are independent from the training data.
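One way to enforce this is a leave-one-run-out split with scikit-learn's `LeaveOneGroupOut`, which keeps each run entirely in train or test (synthetic placeholder data here; run labels are the "groups"):

```python
# Leave-one-run-out cross-validation: each fold trains on 4 runs and
# tests on the held-out run, so no temporally correlated consecutive
# volumes cross the train/test boundary. Data sizes are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_samples, n_voxels, n_runs = 90, 200, 5
X = rng.standard_normal((n_samples, n_voxels))   # fake trial-wise samples
y = np.tile([0, 1], n_samples // 2)              # fake condition labels
runs = np.repeat(np.arange(n_runs), n_samples // n_runs)  # run id per sample

scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y,
    groups=runs, cv=LeaveOneGroupOut(),
)
# scores: one accuracy per held-out run (~chance on random data).
```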
HTH,
Bertrand
