Decoding on raw data

Hi neurostars,
my question is: can I run a decoding analysis on single-trial raw data (preprocessed, detrended, standardized, but not first-level modeled)?
Background: Most published decoding studies seem to work on beta maps derived by fitting first-level models. The rationale is to reduce noise. However, there seem to be a few studies that just decode on more or less raw data, e.g. Tom Mitchell’s work appears to use a simple difference-from-baseline score and a fixed time shift (they do use some averaging of the single trials and volumes, though). Also, the nilearn Haxby examples seem to include no first-level modeling (I also searched in PyMVPA, where a bit of documentation exists on the dataset). In a block design it may not matter so much anyway, but I was wondering if it would be legit to do in a rather slow event-related design, trying to decode e.g. single images of faces vs. houses or so. I would find this approach i) much simpler (I am lazy and don’t love the GLM as much as others) and ii) possibly applicable online (e.g. neurofeedback/BCI-style approaches where you don’t have the full dataset, only incoming data).
Also, I was wondering if there aren’t ways one could accomplish similar denoising/signal improvement, e.g. why not use the +/-5 volumes around stimulus onset, essentially incorporating some temporal aspects into the decoding? That would, of course, create a much wider feature matrix, but would that hurt very much? Any tips on this issue (i.e. how bad is it going to be if I shortcut the first-level modeling) and on temporal decoding more broadly would be highly appreciated.
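
To make the +/-5 volumes idea concrete, here is a minimal NumPy sketch. The function name `window_features` and its parameters are hypothetical, and it assumes the data are already an (n_volumes, n_voxels) array with 0-based onset indices:

```python
import numpy as np

def window_features(data, onset_vols, before=5, after=5):
    """Stack the volumes around each onset into one wide feature row.

    data       : array (n_volumes, n_voxels), preprocessed timeseries (assumed layout)
    onset_vols : 0-based volume index of each stimulus onset
    Returns an array (n_events, (before + after + 1) * n_voxels).
    """
    data = np.asarray(data, dtype=float)
    onsets = np.asarray(onset_vols, dtype=int)
    offsets = np.arange(-before, after + 1)
    idx = onsets[:, None] + offsets[None, :]      # (n_events, n_timepoints)
    if idx.min() < 0 or idx.max() >= data.shape[0]:
        raise ValueError("a window runs past the start or end of the run")
    windows = data[idx]                           # (n_events, n_timepoints, n_voxels)
    # Flatten time and voxels into a single feature axis (time-major order).
    return windows.reshape(len(onsets), -1)

# Toy example: 60 volumes, 4 voxels, two events.
rng = np.random.default_rng(0)
toy = rng.standard_normal((60, 4))
X = window_features(toy, [10, 30])
print(X.shape)
```

With the default window of 11 volumes the feature matrix is 11 times wider than a single-volume one, so with few trials the amount of regularization in the classifier probably matters more than it would otherwise.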

Best, Ralf


Quick answer: if your experiment is not a block design, there is no simple way to run decoding without a first level. The challenge is that each TR cannot be reliably assigned to a condition.

We did develop an approach for time-domain decoding, but it is more involved than running the standard first level.


Thanks for the super fast reply! The paper looks great!

> if your experiment is not a block design, there is no simple way to run a decoding without a first level. The challenge is that each Tr cannot be reliably assigned to a condition.

One quick follow-up: why, i.e. how is event-related fundamentally different from block here? In a block design you know e.g. that stuff happens from TR 10-24; in an event-related design you could know that stimulus onset occurs e.g. at the 3rd, 20th, and 42nd volume, so with a TR of 2 s and an assumed delay of 6 s to the HRF peak, you could e.g. use the 6th, 23rd, and 45th volumes’ values (and subtract a baseline). I get that this approach would be a bit crude, but has anyone tested how much accuracy suffers?
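
In code, what I have in mind is roughly the following. This is just a sketch: the function name `peak_minus_baseline` is made up, and the baseline definition (mean of the onset volume and the one before it) is an assumption on my part, not something taken from a paper:

```python
import numpy as np

def peak_minus_baseline(data, onset_vols, tr=2.0, delay_s=6.0, n_baseline=2):
    """One pattern per event: volume at the assumed HRF peak minus a baseline.

    data       : array (n_volumes, n_voxels), assumed layout
    onset_vols : 0-based volume index of each stimulus onset
    delay_s    : assumed onset-to-peak delay in seconds
    n_baseline : number of volumes, ending at the onset volume, averaged as baseline
    """
    data = np.asarray(data, dtype=float)
    onsets = np.asarray(onset_vols, dtype=int)
    shift = int(round(delay_s / tr))              # 6 s / 2 s = 3 volumes
    peak_idx = onsets + shift
    if onsets.min() - n_baseline + 1 < 0 or peak_idx.max() >= data.shape[0]:
        raise ValueError("an event is too close to the start or end of the run")
    base_offsets = np.arange(-(n_baseline - 1), 1)
    baseline = data[onsets[:, None] + base_offsets[None, :]].mean(axis=1)
    return data[peak_idx] - baseline

# Onsets at the 3rd, 20th and 42nd volume (1-based) -> 0-based indices.
onsets_0based = np.array([3, 20, 42]) - 1
rng = np.random.default_rng(0)
run = rng.standard_normal((60, 500))
patterns = peak_minus_baseline(run, onsets_0based)
print(patterns.shape)
```

The rows of `patterns` would then go straight into the classifier as single-trial examples.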
Best, Ralf

Hi @rschmaelzle. You definitely can perform an MVPA analysis on unmodeled events by using volumes acquired during the expected peak response. And, if the overlap of hemodynamic responses to consecutive events is trivial (you say you’re using a slow design), then this approach seems appropriate, even if it loses the denoising aspects of the GLM.

I haven’t done a rigorous comparison, but in some pilot data with about a 10s interval between events, I found very little difference in results at an individual level. I cannot speak at all to how this would translate to group-level results and have not considered the impact on interpretability of results.

If you’re interested in doing a comparison yourself (or if you do have non-trivial HRF overlap between events), I would highly recommend modeling individual events with some variant of Mumford et al. 2012’s least squares-separate approach (LS-S in the paper). In Markiewicz and Bohland 2016, we estimated all events of a given class (which were constrained to have non-overlapping HRFs in the experimental design) in a single GLM, reducing the cost of estimating so many models.
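
For orientation, the basic LS-S idea (one GLM per event, with one regressor for that event and one for all other events combined) can be sketched in plain NumPy as below. This is an illustration only, not the code from either paper: the function names are hypothetical, the double-gamma HRF shape is an assumed stand-in, and onsets are assumed to fall exactly on volume boundaries.

```python
import math
import numpy as np

def double_gamma_hrf(tr, duration=32.0):
    """Assumed double-gamma HRF shape sampled every `tr` seconds, peak scaled to 1."""
    t = np.arange(0.0, duration, tr)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    h = peak - undershoot / 6.0
    return h / h.max()

def lss_betas(data, onset_vols, tr=2.0):
    """LS-S style single-event estimates.

    data       : array (n_volumes, n_voxels), assumed layout
    onset_vols : 0-based volume index of each event onset
    Returns an array (n_events, n_voxels) of per-event betas.
    """
    data = np.asarray(data, dtype=float)
    n_vols = data.shape[0]
    onsets = np.asarray(onset_vols, dtype=int)
    kernel = double_gamma_hrf(tr)
    # One HRF-convolved regressor per event.
    regs = np.zeros((n_vols, len(onsets)))
    for j, onset in enumerate(onsets):
        stick = np.zeros(n_vols)
        stick[onset] = 1.0
        regs[:, j] = np.convolve(stick, kernel)[:n_vols]
    betas = np.empty((len(onsets), data.shape[1]))
    for j in range(len(onsets)):
        # Separate model per event: this event, all other events summed, intercept.
        others = np.delete(regs, j, axis=1).sum(axis=1)
        design = np.column_stack([regs[:, j], others, np.ones(n_vols)])
        coef, *_ = np.linalg.lstsq(design, data, rcond=None)
        betas[j] = coef[0]
    return betas
```

A real analysis would also want nuisance regressors (motion, drift) in each design matrix, and the paper discusses variants that split the "other events" regressor by condition.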


+1 to the Mumford reference, which is a very good one.

+1 also to the point that if the ISIs are large enough, it doesn’t matter much :). I would still prefer a beta-series regression in such a case, as it is slightly more principled.

Yes, you can do MVPA without fitting a GLM (or other 1st level model); I’ve generally found the results quite similar. It’s dated now, but I described a few comparisons in doi:10.1016/j.neuroimage.2010.08.050. I’ve worked with individual frames (e.g., only use the 4th frame after stimulus onset) and averaging for temporal compression.

@jaetzel @GaelVaroquaux @effigies Thanks to all of you for the replies. All very helpful and reassuring!

I worked on decoding data from a control condition, left/right finger tapping, that I included to serve as a sanity check. I found excellent results with a baseline-subtracted peak estimate for individual tapping trials, and plotting the weights gives very crisp results in motor cortex. By contrast, when I created a conventional GLM and looked at the contrast z-map for left vs. right, it also looked right but far noisier (OK, it’s a different analysis, but still I was surprised). I then extracted the timeseries from left/right motor cortex and compared them against the modeled timecourses. This revealed that although there was a rough fit, the model did not fit as well as I had assumed (correlation of about 0.23, with some temporal mismatch evident). Hence I started to go down the rabbit hole of whether GLM models of single trials are necessarily always better (“denoise”) or can also lead astray. But you’re right, I could run beta-series and compare formally.

Finger tapping as a control warms my heart; I have found checks like that incredibly useful when starting new analyses, but few people seem to run them. (Aside: in our datasets there is not usually a dedicated control condition, but it is pretty much always possible to find one, such as using the response button presses or whether a visual stimulus is present or not. The key is activity that must be present in a predictable area.)

Lately, when setting up for a non-GLM-based MVPA I’ve been using AFNI’s 3dDetrend. I’d guess that this is not exactly the same as what you did, if you explicitly defined a baseline and subtracted it, but it is related: centering the timeseries with 3dDetrend (or similar) allows checking whether the BOLD after the events is higher than in the rest of the run.
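
As a rough illustration of that check (a NumPy stand-in for the detrending step, not 3dDetrend itself; the function names, polynomial order, and lags are assumptions for the example):

```python
import numpy as np

def detrend_poly(ts, order=1):
    """Remove a least-squares polynomial trend along the time axis.

    ts : array (n_volumes,) or (n_volumes, n_voxels)
    """
    ts = np.asarray(ts, dtype=float)
    x = np.linspace(-1.0, 1.0, ts.shape[0])
    basis = np.vander(x, order + 1)               # columns x**order ... x**0
    coef, *_ = np.linalg.lstsq(basis, ts, rcond=None)
    return ts - basis @ coef

def post_event_contrast(ts, onset_vols, lags=(2, 3, 4)):
    """Mean signal at the given lags after each onset minus the mean elsewhere."""
    ts = np.asarray(ts, dtype=float)
    onsets = np.asarray(onset_vols, dtype=int)
    idx = (onsets[:, None] + np.asarray(lags)[None, :]).ravel()
    idx = np.unique(idx[idx < ts.shape[0]])
    mask = np.zeros(ts.shape[0], dtype=bool)
    mask[idx] = True
    return ts[mask].mean(axis=0) - ts[~mask].mean(axis=0)

# Toy ROI timeseries: linear drift plus a response 2-4 volumes after each event.
t = np.arange(40.0)
response = np.zeros(40)
response[[12, 13, 14, 27, 28, 29]] = 1.0
clean = detrend_poly(0.5 * t + 10.0 + response)
contrast = post_event_contrast(clean, [10, 25])
print(contrast > 0)
```

A clearly positive contrast in an area that must respond (e.g. motor cortex for button presses) is the kind of sanity check I mean.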

It sounds like you did quite a few comparisons; perhaps write them up, at least as a little conference paper or blog post, as they will likely be useful for others. Your guess of the GLM being a poor fit for the particular dataset is quite plausible to me; any model/technique has assumptions and properties that will work better with some datasets than others.


I am working on an ABIDE dataset classification problem using fMRI. I want to fetch the (timeseries, voxel) data and apply smoothing to it. Are the rois_cc200.1D files the timeseries to use directly? Can they be smoothed further using masking?
It would be great if someone could help me with how to extract the timeseries data and smooth it.