After fmriprep, should I demean/scale my motion parameters before glm?

Aisa2 · December 11, 2024, 5:49pm

Question is in the title. I have pre-processed data with fmriprep in preparation for a task-based analysis. I am now trying to finish out the processing with AFNI (mask, blur, scale, regress, etc). Looking at the confounds tsv file, I see that a number of the confound regressors have a mean of zero (likely because of how they are calculated, e.g. with PCA/ICA), but the motion parameters do not have a mean of zero.

AFNI has an option to demean motion regressors, but it seems a little obscure and I’m not certain it’s necessary. For script flexibility reasons, it would be easier not to demean the motion regressors, but maybe I’m missing something important. Can anyone shed light on this situation?

Steven · December 11, 2024, 6:01pm

Hi @Aisa2,

In theory, demeaning/scaling the regressors shouldn’t matter for GLM calculations (both are linear operations, so as long as the model includes an intercept the fit itself is unchanged). But some GLM software libraries just tend to play nicer with data that are demeaned and scaled. Keep in mind that scaling a regressor will of course also rescale its beta coefficient, if that is important to you. Demeaning, on the other hand, will change the intercept. But neither impacts the statistical significance of the regressors. I would recommend doing both, personally.
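A quick way to convince yourself of this is a toy fit with a single regressor. The sketch below uses made-up numbers and a hand-rolled least-squares line (the helper `fit_line` is hypothetical, not part of any GLM package): demeaning leaves the slope alone and moves the intercept to the mean of the data, while multiplying the regressor by 10 divides its beta by 10.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Made-up numbers: one motion-like regressor and one voxel time series.
x = [0.10, 0.12, 0.15, 0.11, 0.20, 0.18]
y = [100.2, 100.1, 100.9, 99.8, 101.5, 101.1]

b0_raw, b1_raw = fit_line(x, y)

# Demeaned regressor: same slope, intercept becomes mean(y).
x_dm = [xi - mean(x) for xi in x]
b0_dm, b1_dm = fit_line(x_dm, y)

# Demeaned and scaled by 10: slope shrinks by 10, intercept unchanged.
x_sc = [10 * xi for xi in x_dm]
b0_sc, b1_sc = fit_line(x_sc, y)

print(f"raw:      intercept={b0_raw:.4f} slope={b1_raw:.4f}")
print(f"demeaned: intercept={b0_dm:.4f} slope={b1_dm:.4f}")
print(f"scaled:   intercept={b0_sc:.4f} slope={b1_sc:.4f}")
```

With more regressors in the model the algebra is the same, but this one-regressor case is the easiest to check by hand.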

Best,
Steven

ptaylor · December 11, 2024, 6:09pm

Hi-

I agree with @Steven's points, and will just add a little more AFNI-specific detail, since that is the processing software mentioned.

This is a snippet from the afni_proc.py help option for how motion regressors would be applied, which should help clarify (“basic” are the motion estimates just from 3dvolreg directly, and “demean” refers to the demeaned ones):

    -regress_apply_mot_types TYPE1 ... : specify motion regressors

...

     ** Note also that basic and demean will give the same results, except
        for the betas of the constant drift parameters (and subject to
        computational precision).

     ** A small side effect of de-meaning motion parameters is that the
        constant drift terms should evaluate to the mean baseline.

We generally do demean regression parameters for these reasons. This article might also provide further description of afni_proc.py options and processing blocks for the modeling and other parts.

Scaling considerations are important for being able to interpret the data, as described here:

Chen G, Taylor PA, Cox RW (2017). Is the statistic value all we should care about in neuroimaging? Neuroimage. 147:952-959. doi:10.1016/j.neuroimage.2016.09.066

Hope those are helpful.

–pt

Aisa2 · December 11, 2024, 9:21pm

Thank you both for your quick responses! I have indeed come across the article @ptaylor mentions; it’s been very useful. And thanks for the pointer to the explanation under -regress_apply_mot_types.

For context, I am trying to write a pipeline with various denoising options, and the presence/number of motion regressors can vary depending on the denoising choice. So, I would prefer to add all my confound regressors, motion and otherwise, using the -regress_extra_ortvec option. However, this may mean that I need to do some extra steps manually to replicate what AFNI would normally do with motion regressors (e.g. demeaning), since the afni_proc.py help for -regress_extra_ortvec says:

These files should be in 1D format, columns of regressors in text files. They are not modified by the program, and should match the length of the final regression.

… I think that I will follow @Steven’s advice and scale/demean these regressors myself just to be on the safe side.
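For the manual route, a minimal sketch of the demeaning step might look like the following. It assumes the standard fMRIPrep motion column names (`trans_x` ... `rot_z`) and writes plain whitespace-separated columns with no header, which is the 1D layout described in the help text above; the helper names and the tiny inline table are made up for illustration. Note that fMRIPrep's derivative columns contain `n/a` in the first row, which this sketch deliberately does not handle (the `float()` call will raise).

```python
import csv
import io
from statistics import mean

# Standard fMRIPrep names for the six rigid-body motion parameters.
MOTION_COLS = ["trans_x", "trans_y", "trans_z", "rot_x", "rot_y", "rot_z"]

def demeaned_columns(tsv_file, names=MOTION_COLS):
    """Read the named columns from a confounds TSV and subtract each column's mean."""
    rows = list(csv.DictReader(tsv_file, delimiter="\t"))
    cols = []
    for name in names:
        values = [float(r[name]) for r in rows]  # raises ValueError on 'n/a'
        m = mean(values)
        cols.append([v - m for v in values])
    return cols

def to_1d_text(cols):
    """Format columns as whitespace-separated rows without a header line."""
    lines = (" ".join(f"{v:.6f}" for v in row) for row in zip(*cols))
    return "\n".join(lines) + "\n"

# Tiny stand-in for a confounds file (values are invented).
demo = io.StringIO(
    "trans_x\trot_x\tcsf\n"
    "0.1\t0.01\t5\n"
    "0.3\t0.03\t6\n"
    "0.2\t0.02\t7\n"
)
cols = demeaned_columns(demo, ["trans_x", "rot_x"])
text = to_1d_text(cols)
print(text)
```

For a real file, pass an open file handle instead of the `StringIO` object and write `text` to a `.1D` file; for multi-run data the demeaning would need to be done per run.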

The scaling of the BOLD data itself is actually a separate can of worms that I do have questions about, but that will be for another day and another thread.

ptaylor · December 11, 2024, 9:56pm

OK, you might want to follow the way that the proc script generated by afni_proc.py does scaling and/or other regressor creation. Because the proc script is commented, it makes the individual steps clearer, so you don’t have to guess about the processing.

For example, the Bootcamp example script s05* runs afni_proc.py and generates a proc script with this recipe for scaling (there are 3 runs of data in the example):

# ================================= scale ==================================
# scale each voxel time series to have a mean of 100
# (be sure no negatives creep in)
# (subject to a range of [0,200])
foreach run ( $runs )
    3dTstat -prefix rm.mean_r$run pb03.$subj.r$run.blur+tlrc
    3dcalc -a pb03.$subj.r$run.blur+tlrc -b rm.mean_r$run+tlrc \
           -c mask_epi_extents+tlrc                            \
           -expr 'c * min(200, a/b*100)*step(a)*step(b)'       \
           -prefix pb04.$subj.r$run.scale
end
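To make the 3dcalc expression concrete, here is a rough Python rendering of what it does to a single voxel's time series. It is only a sketch of the arithmetic, not AFNI code: `scale_voxel` is a hypothetical helper, `in_mask` plays the role of the `c` dataset (1 inside the extents mask, 0 outside), and the explicit guard against a non-positive mean is there because plain Python would otherwise divide by zero.

```python
def step(x):
    """Step function as used in the expression: 1 for positive values, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def scale_voxel(series, in_mask=1.0):
    """Scale one voxel's time series to a mean of 100, capped at 200, zeroed outside the mask."""
    b = sum(series) / len(series)  # voxelwise mean over time (the 3dTstat output)
    if b <= 0:
        # step(b) would zero these anyway; guard avoids dividing by zero
        return [0.0] * len(series)
    return [in_mask * min(200.0, a / b * 100.0) * step(a) * step(b) for a in series]

scaled = scale_voxel([900.0, 1000.0, 1100.0])
print(scaled)  # roughly 90, 100, 110: percent of the voxel's own mean
```

After this step a change of 1 unit in the data corresponds to about 1% of the voxel's mean signal, which is what makes the betas interpretable as percent signal change.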

At the start of the regress block, the script demeans the motion parameters, computes their derivatives, and estimates the censoring time series:

# ================================ regress =================================

# compute de-meaned motion parameters (for use in regression)
1d_tool.py -infile dfile_rall.1D -set_nruns 3                            \
           -demean -write motion_demean.1D

# compute motion parameter derivatives (just to have)
1d_tool.py -infile dfile_rall.1D -set_nruns 3                            \
           -derivative -demean -write motion_deriv.1D

# convert motion parameters for per-run regression
1d_tool.py -infile motion_demean.1D -set_nruns 3                         \
           -split_into_pad_runs mot_demean

# create censor file motion_${subj}_censor.1D, for censoring motion 
1d_tool.py -infile dfile_rall.1D -set_nruns 3                            \
    -show_censor_count -censor_prev_TR                                   \
    -censor_motion 0.3 motion_${subj}

# combine multiple censor files
1deval -a motion_${subj}_censor.1D -b outcount_${subj}_censor.1D         \
       -expr "a*b" > censor_${subj}_combined_2.1D
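The censoring logic can be sketched in a few lines of Python. This is an approximation written under assumptions, not a reimplementation of 1d_tool.py: it treats the data as a single run (the real tool respects run boundaries via -set_nruns), it takes the motion measure to be the Euclidean norm of the TR-to-TR differences, and it assumes a TR is censored when that norm exceeds the limit. All function names are hypothetical.

```python
from math import sqrt

def enorm_of_differences(motion_rows):
    """Euclidean norm of the TR-to-TR change in motion parameters (0 at the first TR)."""
    out = [0.0]
    for prev, cur in zip(motion_rows, motion_rows[1:]):
        out.append(sqrt(sum((c - p) ** 2 for p, c in zip(prev, cur))))
    return out

def censor_from_enorm(enorm, limit=0.3, censor_prev=True):
    """Return 1 = keep, 0 = censor; optionally also censor the TR before each flagged TR."""
    keep = [0 if e > limit else 1 for e in enorm]
    if censor_prev:
        flagged = [i for i, k in enumerate(keep) if k == 0]  # fixed before modifying keep
        for i in flagged:
            if i > 0:
                keep[i - 1] = 0
    return keep

def combine_censors(a, b):
    """Elementwise product, like the 1deval 'a*b' expression: keep only if both keep."""
    return [x * y for x, y in zip(a, b)]

# Invented two-parameter motion trace with one large jump at the third TR.
motion = [[0.0, 0.0], [0.1, 0.0], [0.5, 0.3], [0.5, 0.3]]
motion_censor = censor_from_enorm(enorm_of_differences(motion))
outlier_censor = [1, 1, 1, 0]  # invented stand-in for the outlier-based censor file
combined = combine_censors(motion_censor, outlier_censor)
print(motion_censor, combined)
```

The product form works because each censor file uses 1 for "keep" and 0 for "drop", so a TR survives only if every file keeps it.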

–pt

Mkassaie · December 21, 2024, 10:44pm

I’m wondering what scale would be sensible for motion params? Z-scoring or perhaps scaling by 100? (if by 100, is it due to concerns for rounding errors or compatibility with the task regressors)?

Also, do you recommend the same demeaning/scaling for tissue signals (global, wm, csf)?

(I’m doing all of this for a task-based GLM analysis in FSL FEAT, if that matters)

Steven · December 21, 2024, 10:52pm

Z-scoring is fine for everything.
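For concreteness, z-scoring a single confound column could be sketched like this. The choice of the population standard deviation and the handling of a constant column (returned as zeros rather than dividing by zero) are assumptions of this sketch, and `zscore` is a hypothetical helper rather than an FSL function.

```python
from statistics import mean, pstdev

def zscore(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # Constant column: nothing to scale, so return zeros instead of dividing by zero.
        return [0.0] * len(values)
    return [(v - m) / s for v in values]

z = zscore([1.0, 2.0, 3.0, 4.0])
print(z)
```

The same function applies equally to motion parameters and to tissue signals such as the global, WM, or CSF time series.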

Best,
Steven