Preprocessing for atlas based connectivity matrices

I used fsl_regfilt to remove ICA components. As far as I know, it will include all regressor in MELODICmix.tsv and subtract the effect of noise regressors you given from input data. So I could plug nuisance regressors into this file and specify its column number in fsl_regfilt to do a one step denoise. Is that correct?

For MP2RAGE image, I tried BET and AFNI 3dSkullStrip both on origin image. None of them given good result. Then I followed this note create UNIxINV2 image (with/without N4BiasFieldCorrection) and do skullstrip on it. AFNI works ok but some gyrus in parietal cortex was missing. Bet still works poorly. For now I use AFNI with manually adjusted brainmask.

It just follow the FSL’s FEAT convention to make the preprocessed data looks like my old data. I have no idea why FSL do that.

I would love this feature. For task fMRI activation analysis, smoothing is necessary. And I have a general sense that applies smoothing on ‘raw’ data is better than smoothing a first-level/participant-level statistical map. For MVPA/RSA analysis, people usually used unsmoothed data (Although some paper says a little smooth actually have a little benefit for classification).
So I’m thinking of an --smooth-fwhm argument that could take a list fwhm for smoothing. For example, if I specify --smooth-fwhm 0.0 6.0, the output will contain a smoothed data for activation analysis and a unsmoothed data for MVPA analysis (also with ica denoised version). Neat!

Sorry - I misread your original message. To do non-agressive denoising two separate regressions are required so those steps cannot be combined. If you wanted to do aggressive denoising all confounds could be combined in one design matrix. Mind that FMRIPREP already outputs the non-aggressively denoised variant of the data in MNI space so you don’t have to run it yourself.

PS I’ll be on the lookout for good MP2RAGE skullstripping methods (CBS Tools might be worth checking out).

I only do it manually when I use a different smooth FWHM other than 6mm. That’s why I think a smooth argument is super helpful.

I will check out the CBS tools. Thanks for the suggestions.

Thanks a lot for these suggestions @ChrisGorgolewski. I am slightly confused by the subsequent answers to @Zhifang. I would like to implement your suggestions in order to obtained denoised data in T1 space at subject level (I have resting state data for which I would like to get the associated connectivity matrices).

  1. Run FMRIPREP with --use-aroma option

In this case, I assume I would need to run --use-aroma --output-space T1w template as suggested earlier by @effigies

  1. Take the *bold_space-MNI152NLin2009cAsym_preproc.nii.gz outputs.
  2. Either discard 4 first volumes, or as many volumes as the FMRIPREP steady non-state detector recommends as indicated by the number of NonSteadyStateOutlier. Alternatively you can use these columns in the denoising design matrix (see below).

I couldn’t find the NonSteadyStateOutlier information. I looked into the sub-<subject>_task-rest_bold_confounds.tsv file but could only find the columns WhiteMatter GlobalSignal stdDVARS non-stdDVARS vx-wisestdDVARS FramewiseDisplacement tCompCor0* aCompCor* X Y Z RotX RotY RotZ AROMAAggrComp0*

  1. Smooth the data with 8mm (or whatever is appropriate for your data). Alternatively you can opt for not smoothing the 4D data and smoothing only the first level statistical maps prior to doing group level.

From the subsequent posts related to the questions of @Zhifang it seems that a 6mm smoothing kernel is applied before the ICA-AROMA calculations. Are these smoothed data only temporary and discarded? That is, the 8mm smoothing kernel is to be applied to sub-<subject_label>_task-<task_id>_bold_space-T1w_preproc.nii.gz which is unsmoothed, is that correct?

  1. Create a denoising design matrix that would include the following columns from the _confounds.tsv file: AROMAAggrComp* (those are the noise components from ICA AROMA), WhiteMatter, and aCompCor01 (first component of the anatomical compcor is a good approximation of the mean signal in WM and CSF). In addition to achieve the linear detrending you should add a column of ascending numbers (1,2,3,4 etc).

I have a few questions regarding this point:

  • Is the associated command as below (confounds file transposed to be row-wise for fsl_regfit ):
    fsl_regfilt -i sub-<subject_label>_task-<task_id>_bold_space-T1w_preproc_smooth8mm.nii.gz -f <col nums for WhiteMatter, aCompCor01 and AROMAAggrComp*> -d sub-<subject_label>_task-<task_id>_bold_confounds_transpose.tsv -o sub-<subject_label>_task-<task_id>_bold_space-<space>_T1w_preproc_smooth8mm_Denoised.nii.gz
  • Linear detrending is done in the reference pipeline I suggested before the ICA-AROMA calculations, I assume this is not currently possible with fMRIprep? The fallback solution you proposed is doing it at the noise regression step by adding a row “1, 2, …, size of 4th dim” to sub-<subject_label>_task-<task_id>_bold_confounds_transpose.tsv if I understood correctly. I am not sure of the exact differences it creates compared to pre-AROMA calculations detrending.
  • Regarding WM and CSF regressors, from a statistical point of view, wouldn’t the correlation between WhiteMatter and aCompCor01 be a problem for the stability of the GLM (since aCompCor01 also includes WhiteMatter)? Maybe applying the solution proposed below would be helpful?

PPS The reason why we output only WM but not CSF is that for participants with small ventricles we cannot reliably estimate CSF ROIs without running into partial volume issues. I would love to hear suggestions how to work around this.

What about using a robust CSF estimation (e.g. tissue probability maps with 99% threshold, eroded CSF mask) and ignoring CSF signal if the number of voxels is below what you have in mind when you mentioned “small ventricles”? Could ignoring the signal be implemented as n/a values in the confounds file (i saw such n/a values in the first row of these files for the columns stdDVARS and non-stdDVARS )? This would allow to get a distinct CSF regressor column in the confounds file.

Thanks for the useful reference @oesteban. This point was actually addressed in the manuscript proposing the pipeline I referenced, where they tested despiking both before and after denoising, and did not see any significant difference in any of their pipeline performance metrics.

Considering it would be difficult at the moment to introduce despiking early on in the pipeline with fmriprep, it is then probably better (easier) to do it at the end.

Those columns are only added if the non-steady state was detected. If it wasn’t (probably because your scanner acquired dummy scans to deal with it before starting the sequence) no such columns will be added to the _confounds.tsv file. In such case I don’t see why would you need to discard any volumes, but it would not hurt (unless your sequence is very short).

The 6mm smoothed data is used to create the _variant-smoothAROMAnonaggr_preproc.nii.gz. The other _preproc.nii.gz file is unsmoothed.

Possibly - you would have to check if fsl_regfilt is not going to get stuck on headers (if it does it would be worth pinging FSL people if this could be fixed- selecting columns by names rather than numbers seems less error prone)

I don’t expect a huge difference. If you don’t detrent prior to ICA one of the components will most likely pick on the trend anyway.

Yes many of those regressors will be colinear, but since we only care about the residuals of this model this is not a problem. Colinearity will lead to variance in parameter estimates, but will not negatively impact the residuals.

Good idea.

Awesome, thanks @ChrisGorgolewski. Just two small remaining issues (let me know if I should write this somewhere else):

  1. I do not have any visualization output of the ICA-AROMA analysis, as described and shown in the fmriprep documentation

  2. You mentioned that:

The 6mm smoothed data is used to create the _variant-smoothAROMAnonaggr_preproc.nii.gz. The other _preproc.nii.gz file is unsmoothed.

But I do not have the smoothed file. In the func directory I have the following files:


And in the anat directory, I have (excluding surfaces):


This looks like a bug. Could you open an issue on github sharing an HTML report and the command line invocation you used?

Hi @michael. Sorry if I was unable to find it, but what version of fmriprep are you using? The outputs you’re missing were both added in 1.0.0-rc1 (a.k.a. 0.6.1), so if you’re using 0.6.0 or earlier, the issue is that you need to upgrade.

If not, I would encourage you to upgrade to the latest, and re-run. This should result in better error reporting, and make it easier to address this issue.

It would also be good to open a new issue on GitHub (or just a new thread in NeuroStars) to keep this thread from getting further off-topic.

Thank you @effigies, I now have to complement the pipeline discussed above with additional steps on the fmriprep outputs I already have, but for the next task I will certainly upgrade fmriprep to the latest version as you suggested (and if required open a new issue on github at that point).

RE: median 10000 normalization

I haven’t looked at the code to see the reason why, but FSL’s FILM_GLS does not like negative values or even values close to zero I believe, the performance of that algorithm depends on high(ish) voxel intensities, so making the median normalization makes it so that no voxels are in danger of being too close to zero or negative.

@ChrisGorgolewski Did you mean indeed aCompCor01 or aCompCor00? Also I have on average 30 +/- 10 AROMAAggrComp components per subject. Does that seem reasonable?

Sorry I meant aCompCor00. The number of components seems reasonable.

1 Like

I could not used the suggestion you gave as

raised the error:

ERROR: no registration file specified

However, what about replacing the two commands you mentioned with the command below? It seems to work in the single case I tested:

mri_convert -rl /preproc/freesurfer/sub-<subject_label>/mri/rawavg.mgz -rt nearest /preproc/freesurfer/sub-<subject_label>/mri/aparc+aseg.mgz aparc+aseg.nii.gz

This command is discussed here for example.

Sorry, I forgot the --regheader flag. I’ve updated the command in my original reply.

The command you gave may work just as well. Good find!

Thanks @effigies. Actually my command creates issues as for some reason the headers of the atlas and T1 volumes are not exactly the same (sform and qform are different around the 10^-6 digits), so yours works better. I still got the same error though, and had to add --reg dummy.dat for your command to work, was that the right thing to do?

An important question: I thought fmriprep/sub-<subject_label>/func/sub-<subject_label>_task-rest_bold_space-T1w_preproc.nii.gz was in the same space as the T1, but they have different voxel size (~ 3mm isotropic for the first, and ~ 1mm for the latter). Then:

  • Does the fmriprep pipeline align the bold data with the T1 data (...T1w_preproc.nii.gz being the result) but without resampling to T1 space?
  • To compute the connectivity matrices, I would need the bold data exactly in the same space as the aparc+aseg in native T1 space (that includes same voxel size). Would you have any suggestion on how to do that? (Having aparc+aseg with the same low resolution as the bold data would be fine and maybe preferable to avoid further processing of the time series)

I think you actually need to add --regheader to mri_vol2vol as well. Sorry for not catching that, earlier. If that works, let me know, and I’ll update the command above.

When we say the BOLD and T1w images are in the same space, what we mean is that they have been coregistered such that the same RAS coordinates in either get you the same brain structure. It is very rare that someone would want to upsample their BOLD data into T1w resolution, I think.


Ah, yes. You’ll still need to down-sample. You can probably do this directly by switching from using T1w_preproc.nii.gz to one of your BOLD outputs, in both the tkregister2 and mri_vol2vol commands, but you should really verify this…

Thanks for all the answers @effigies. In the case of the atlas I am using I prefer indeed to use native BOLD resolution, and I resampled the atlas to that resolution.

I got the same error in the first command (tkregister2) even when using --regheader. I could get rid of the error by adding --reg dummy.dat to save the transform (i.e. adding --regheader --reg dummy.dat to the original command ). No change was needed to mri_vol2vol (I did not need to add --regheader).

Hello @oesteban and @ChrisGorgolewski , I was not sure where else to ask about this, but do you have any updates on the possiblity of using despiking with fmriprep? I read the linked github post, and the Power (2017) paper, and from what I understand, the main concern is “masking” underlying motion artifacts with lower motion values. However its not clear whether this applies to both of the approaches 3dDespike uses (residual SD attenuation, and a pre and post timepoint average interpolation), or whether using a lower FD threshold would largely overcome the problem.

Also, I have come across at least three papers here, and in this recent review which suggest carrying out (presumably attenuation-based) despiking first, showing improvements (not necessarily related to order). So my question was, have your views changed at all on using 3dDespike BEFORE running fmriprep (since running at the end may be problematic in relation to high pass filters)? I have run a sample dataset with and without despiking, and as expected, FD values and DIVARS are attenuated, with fewer motion censoring resgressors. However, I’m still unsure whether we will use those due to concerns about reducing the experimental space/unbalanced conditions, so I’m trying to weigh pros and cons of this approach.

Also, since we used the despike option in MRIQC, it seems to make sense to use the same thresholds and values in fmriprep. Any updated thoughts on this would be greatly appreciated!!

Short answer is fMRIPrep will not produce “despiked” outputs, as it is a destructive operation of the original data.

That doesn’t mean we cannot integrate 3dDespike in the following ways:

  • Generate a new confound timeseries indicating whether 3dDespike identified a frame as outlier. I would need to learn more about 3dDespike, but this new regressor could be more sophisticated than just a flag -e.g., if 3dDespike only overwrites outlying voxels, provide the rate of outlying voxels w.r.t. the brain mask or similar- (@jdkent, any insights?).
  • Execute head-motion estimation with 3dDespiked signals, which in principle should lead to lower head-motion estimates?

As we promote in our paper (, MRIQC and fMRIPrep are (and should be kept) different tools for different purposes. So, although it seems logical to use same thresholds and values, this is not a best practice. For example, MRIQC wants to evidence quality problems - so the design decisions will typically try to exacerbate such problems so they become as obvious as possible to the researcher’s eyes. Conversely, fMRIPrep wants to minimally massage data so that they can statistically analyzed. As objectives are different, implementations are different.