Confounds list - how to choose which confounds to include as nuisance regressors in first level analysis GLM?


After finishing the preprocessing of my task fMRI data, I would like to select the most relevant confounds from the "desc-confounds_timeseries.tsv" file and include them as nuisance regressors in the first-level GLM that I will run with FSL FEAT.

After reading some threads in this forum and some articles, I see this is a highly debated topic, and cannot figure out a strategy by which to choose the best confounds to include.

I would be very happy to receive help regarding this issue.

Thank you very much,


Hi @Uri_Shinitsky, and welcome to Neurostars!

As the articles and threads point out, there really isn’t a “best” method, especially for task-based GLMs, which have not been benchmarked as extensively as resting-state connectivity has.

A denoising scheme from fmriprep confounds may look like:

  1. to control for motion: some number of head motion parameters (6 is the most basic set, but you could also include the derivative and squared expansions for up to 24 terms). Alternatively, the edge regressors (e.g., This paper) introduced in recent versions look promising, but have yet to be rigorously evaluated.

  2. for physiological noise: some number of acompcor regressors (either a fixed number or enough components to account for 50% of the variance) or the mean tissue-class signals. Note that if you use acompcor components you should also include the cosine columns from the confounds file to high-pass filter your data, because the components were estimated from high-pass filtered data.

  3. volume censoring: removing motion outliers and/or non-steady-state volumes
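
As a rough illustration of the scheme above, here is a minimal Python sketch that picks column names for one such strategy. The `select_confounds` helper and its defaults are hypothetical, and the column-name patterns assume fMRIPrep's usual naming (`trans_x`, `a_comp_cor_00`, `cosine00`, `motion_outlier00`, and so on), so check them against the header of your own confounds file.

```python
import re

# Naming assumed from fMRIPrep's confounds TSV; verify against your own header.
MOTION_6 = [f"{kind}_{axis}" for kind in ("trans", "rot") for axis in "xyz"]
MOTION_SUFFIXES = ["", "_derivative1", "_power2", "_derivative1_power2"]


def numbered(columns, prefix):
    """Columns named <prefix><digits>, sorted by their numeric suffix."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    hits = [(int(m.group(1)), c) for c in columns if (m := pattern.fullmatch(c))]
    return [c for _, c in sorted(hits)]


def select_confounds(columns, n_motion=24, n_acompcor=5, censor=True):
    """Return confound column names for one illustrative denoising strategy."""
    if n_motion not in (6, 24):
        raise ValueError("n_motion must be 6 or 24")
    suffixes = MOTION_SUFFIXES if n_motion == 24 else [""]
    wanted = [base + suffix for suffix in suffixes for base in MOTION_6]

    # First n aCompCor components, plus every cosine column that goes with them.
    wanted += numbered(columns, "a_comp_cor_")[:n_acompcor]
    if n_acompcor > 0:
        wanted += numbered(columns, "cosine")

    # One-hot censoring columns: motion outliers and non-steady-state volumes.
    if censor:
        wanted += numbered(columns, "motion_outlier")
        wanted += numbered(columns, "non_steady_state_outlier")

    missing = [c for c in wanted if c not in columns]
    if missing:
        raise KeyError(f"columns not found in confounds file: {missing}")
    return wanted
```

The returned names can then be handed to whatever reads the TSV, for example `pandas.read_csv(path, sep="\t", usecols=names)`.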

The choice of what you go with should be motivated by the quality and amount of data you have and what you want to analyze.

If you have lots of within-subject data (short TR, long acquisition) then maybe you will be comfortable with a greater loss of temporal degrees of freedom (DOF) from more aggressive (i.e., more terms) pipelines.
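
To make the DOF trade-off concrete, here is a back-of-the-envelope sketch. `residual_dof` is a hypothetical helper and the run length and regressor counts are made-up numbers, not anyone's data. It also ignores temporal autocorrelation, so the effective DOF will be lower than what it reports.

```python
def residual_dof(n_volumes, n_task, n_nuisance, n_censored=0):
    """Nominal residual degrees of freedom of a first-level GLM.

    Each censored volume costs one degree of freedom, just like a regressor.
    """
    dof = n_volumes - n_censored - n_task - n_nuisance - 1  # -1 for the mean
    if dof <= 0:
        raise ValueError("model has more regressors than usable volumes")
    return dof


# Hypothetical run: 400 volumes, 3 task regressors.
lean = residual_dof(400, n_task=3, n_nuisance=6 + 5 + 6)     # 6 motion, 5 aCompCor, 6 cosine
heavy = residual_dof(400, n_task=3, n_nuisance=24 + 50 + 6)  # 24 motion, 50 aCompCor, 6 cosine
print(lean, heavy)
```

With a long run the heavier model still leaves plenty of DOF, but the same 80 nuisance terms would take up most of a 100-volume run.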

If you think the task itself will cause motion, consider not scrubbing motion outliers, since censoring task-correlated volumes may bias the estimates against your task.

Global signal regression is always hotly debated. Do or do not do at your own risk :man_shrugging:

White matter signal could actually be informative during tasks (Functional MRI and resting state connectivity in white matter - a mini-review - PubMed), so if that’s something important for you, don’t regress it out.

The best thing to do is to find some independent but similar data, and try out different denoising pipelines there, to figure out what to apply to the data of interest. This will avoid the urge to try different denoising strategies on your own data until one comes out that happens to confirm your hypothesis…

For mathematical interpretations of the biases that different regressors may introduce, the articles will spell it out better than we could here.



Thanks a lot for your help. Outstanding forum and support



After giving it some thought, I decided to include in my GLM all the “rot” and “trans” columns from the confounds list, as well as the framewise displacement column, in order to account for the noise generated by motion.

Does that seem reasonable to you?


Hi @Uri_Shinitsky,

I don’t know anything about your data or hypotheses, so I cannot say what may be reasonable. But I’m not sure I see the added value of including FD if you already have rotation and translation parameters. I would also still have some kind of physiological regression strategy (unless you are focused on BOLD signal in WM or CSF).
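
One practical detail whichever columns you pick: in fMRIPrep's TSV the first row of `framewise_displacement` and of the derivative columns is `n/a`, which a plain numeric confound file cannot hold. Below is a minimal sketch of turning selected columns into a headerless text matrix. It assumes FEAT's confound EV option takes a whitespace-delimited file with one row per volume (please check the FEAT documentation), the helper names are hypothetical, and filling `n/a` with 0 is only one common choice.

```python
import csv
import io

MISSING = ("n/a", "", "NaN", "nan")


def confounds_to_matrix(tsv_text, columns, fill=0.0):
    """Parse confounds TSV text and return the chosen columns as rows of floats."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        [fill if record[c] in MISSING else float(record[c]) for c in columns]
        for record in reader
    ]


def matrix_to_text(rows):
    """Format rows as space-separated numbers, one volume per line, no header."""
    return "\n".join(" ".join(f"{v:.10g}" for v in row) for row in rows) + "\n"


# Tiny made-up example with two volumes.
example = (
    "trans_x\ttrans_x_derivative1\tframewise_displacement\n"
    "0.1\tn/a\tn/a\n"
    "0.2\t0.1\t0.15\n"
)
rows = confounds_to_matrix(example, ["trans_x", "framewise_displacement"])
print(matrix_to_text(rows), end="")
```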



I appreciate your help! This is my first time analyzing any MRI data so your answers are very helpful.

I am currently analyzing a task with a fast event-related design, in which participants heard stimuli of three different classes (auditory stimuli, 5 seconds long) with 6-8 silence intervals.

Total of 453 seconds, TR of 1 s. 5T Siemens machine, CMRR protocol.

We are interested in examining the neuronal activity in response to each stimulus type, contrasting different pairs of the three EVs.

Regarding the physiological noise - I am leaning towards including the acompcor confounds, as they seem better supported by the literature than the global signals, if I understand correctly.

My question is - how many of the acompcor confounds should I include?
Perhaps only the first five? But for which tissue?

And then how many of the cosine confounds should I include?

In this case, it means I will have to turn off any high pass filter in the first level FEAT, as well as turn off mcflirt motion correction, correct?

Indeed this is a heavy matter at hand.



Global signal regression certainly has a lot of literature behind it; it is just hotly contested. Here is a neat recent preprint (Bolt et al., Under Review). The findings (as well as other papers cited in the intro/discussion) should be some good reads if you are interested in this.

There are conventions for using the first 5 or 6 components, or all components needed to explain 50% of the variance. How many components it takes to reach 50% can vary quite a bit from run to run. If that ends up being a lot of terms, maybe go with the first 5/6 components. Otherwise, up to you. Again, testing on an independent (but ideally similar) data set is the best thing to do.

In theory, either the combined WM+CSF mask or the separate WM/CSF masks should provide similar results, but I tend to use the combined mask to reduce the number of components in the model. Certainly if you were interested in WM or CSF in particular you would avoid using that mask.
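
For what it's worth, both choices (which mask, and how many components) can be made from the JSON sidecar that sits next to the confounds TSV. The sketch below is only an illustration: `pick_acompcor` and `load_sidecar` are hypothetical helpers, and the key names (`Method`, `Mask`, `Retained`, `CumulativeVarianceExplained`) and mask labels (`combined`, `CSF`, `WM`) are my understanding of what fMRIPrep writes, so please check them against your own file.

```python
import json


def load_sidecar(path):
    """Read the confounds JSON sidecar into a dict keyed by column name."""
    with open(path) as handle:
        return json.load(handle)


def pick_acompcor(meta, mask="combined", n_components=None, variance=None):
    """Pick aCompCor columns for one mask from sidecar metadata.

    Pass either n_components (the first n) or variance (the smallest leading
    set whose cumulative explained variance reaches the threshold).
    """
    if (n_components is None) == (variance is None):
        raise ValueError("pass exactly one of n_components or variance")
    comps = sorted(
        (info["CumulativeVarianceExplained"], name)
        for name, info in meta.items()
        if info.get("Method") == "aCompCor"
        and info.get("Mask") == mask
        and info.get("Retained", True)
    )  # cumulative variance grows with component order
    if n_components is not None:
        return [name for _, name in comps[:n_components]]
    picked = []
    for cumulative, name in comps:
        picked.append(name)
        if cumulative >= variance:
            break
    return picked
```

For example, `pick_acompcor(load_sidecar(json_path), mask="combined", n_components=5)` would give the first five combined-mask components, where `json_path` is a placeholder for your own sidecar file.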

Include all of the cosine columns, regardless of how many acompcor components you include. The cosine basis functions, collectively, act to high-pass filter the data at 1/128 Hz. And yes, you probably wouldn’t want to filter twice. But if you high-pass filter at a frequency sufficiently lower than 1/128 Hz, then in theory that filter wouldn’t do anything and it wouldn’t be a big deal. Still, safer to not filter twice. I run all of my GLMs through FitLins so I would not know if/how to disable that in FEAT.
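
To show why the cosine columns only make sense as a complete set, here is a small sketch of a discrete cosine high-pass basis with a 128 s cutoff. It follows my understanding of the convention in which the constant term counts toward the order, so the exact number of columns may differ by one between implementations, and the count in your own TSV is the one to trust. The function name is hypothetical, and the 453 volumes assume the 453 s run at TR = 1 s mentioned above.

```python
import math


def dct_highpass_basis(n_volumes, tr, cutoff_period=128.0):
    """Slow discrete cosine regressors for a high-pass filter (constant excluded)."""
    order = max(int(math.floor(2 * n_volumes * tr / cutoff_period)), 1)
    scale = math.sqrt(2.0 / n_volumes)
    return [
        [scale * math.cos(math.pi / n_volumes * (t + 0.5) * k) for t in range(n_volumes)]
        for k in range(1, order)  # k-th regressor has period 2 * n_volumes * tr / k
    ]


basis = dct_highpass_basis(n_volumes=453, tr=1.0)
print(len(basis))  # number of cosine regressors for this run length
```

Together the regressors span the drifts slower than the cutoff, so dropping some of them would leave part of that low-frequency band unfiltered.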




Hi, I have a follow up question about the a_comp_cor components:

176 a_comp_cor components are needed to explain 50% of the variance. Is it reasonable to use that many, or is it safer to go with the first 5/6?

I ran a first-level analysis with both options. Is there a way to compare the two outputs in order to decide which one is better?

Thanks again,


Are there any subjects you have processed but do not plan on analyzing? You can run a second-level model on them and see if you can resolve an effect that you expect to see with either/both of the denoising pipelines. E.g., is there a canonical auditory vs. rest effect that you can or cannot resolve with either of the pipelines?

