Preprocessing for atlas based connectivity matrices

Glad to find this discussion here. Recently I worked on a analysis pipeline using FMRIPREP. And here’s what I did in preprocess after FMRIPREP for later modelling.

  1. Grab *bold_space-MNI152NLin2009cAsym_preproc.nii.gz outputs and normalized its median intensity to 10000 (follow FSL’s convention) .
  2. Use SUSAN smoothed the data with x mm FWHM.
  3. Optional Applied non-aggressive ICA-AROMA denoise as FMRIPREP document mentioned.
  4. Optional Use fsl_glm to regress out nuisance like WM from _confounds.tsv and taken the residuals images.
  5. Applied temporal filtering use fslmath -bptf (I think if you applied a highpass filter, it will taken care of the linear detrend.)
  6. Add back the mean image derived from 2 or 3 to make the filtered image a firm dataset.

I have several question regrading to this pipeline.

  1. As you mentioned,

Create a denoising design matrix that would include the following columns from the _confounds.tsv file: AROMAAggrComp* (those are the noise components from ICA AROMA), WhiteMatter, and aCompCor01 (first component of the anatomical compcor is a good approximation of the mean signal in WM and CSF). In addition to achieve the linear detrending you should add a column of ascending numbers (1,2,3,4 etc).

should I combined 3 and 4 into one step use fsl_glm? Is that an aggressive way to remove nuisance variables?

  1. I’m aware that the in FMRIPREP, ICA-AROMA was calculated on smoothed data with 6mm FWHM. Could I apply the derived regressors on smoothed data with different FWHM (or unsmoothed data for MVPA analysis)?

  2. Could I use these ICA-AROMA regressors on data in T1w space? Or some modification are needed?

Another question not quite related to this thread. If it’s inappropriate here I could open a new thread later.

I obtained several 7T MRI data as a benchmark for later study. For structure image, I got a MP2RAGE image with 0.7 mm resolution. After some tweaks, I still couldn’t get a good skullstrip result from FMRIPREP due to the image contains noise voxels around the brain with similar intensity. Is it possible to modifies anat workflow to handle MP2RAGE image?
Also, the partial FOV bold image corregistration was failed in this dataset (maybe due to problematic anat image). Should I scan a whole brain reference EPI image with same parameters (_sbref ?) to help registration? Could FMRIPREP support it?


Combining these steps does not make the denoising more or less aggressive, but will reduce the number of intermediate files you create.

In theory I don’t see anything wrong with that, but I have not looked into this issue too much.

Again - I don’t see anything wrong with this theoretically.

We would love to add support for MP2RAGE into FMRIPREP. What do you normally use for skullstripping?

We currently do not use sbref images, but we have seen successful coregistration with other FOV datasets. I would focus on issues with anat first.

Some more general questions:

  1. Why are you doing the median 10000 normalization? Would you prefer FMRIPREP to output such data by default?
  2. It seems that everyone is doing some sort of smoothing. Would it make sense to add an option in FMRIPREP to output smoothed data?

I used fsl_regfilt to remove ICA components. As far as I know, it will include all regressor in MELODICmix.tsv and subtract the effect of noise regressors you given from input data. So I could plug nuisance regressors into this file and specify its column number in fsl_regfilt to do a one step denoise. Is that correct?

For MP2RAGE image, I tried BET and AFNI 3dSkullStrip both on origin image. None of them given good result. Then I followed this note create UNIxINV2 image (with/without N4BiasFieldCorrection) and do skullstrip on it. AFNI works ok but some gyrus in parietal cortex was missing. Bet still works poorly. For now I use AFNI with manually adjusted brainmask.

It just follow the FSL’s FEAT convention to make the preprocessed data looks like my old data. I have no idea why FSL do that.

I would love this feature. For task fMRI activation analysis, smoothing is necessary. And I have a general sense that applies smoothing on ‘raw’ data is better than smoothing a first-level/participant-level statistical map. For MVPA/RSA analysis, people usually used unsmoothed data (Although some paper says a little smooth actually have a little benefit for classification).
So I’m thinking of an --smooth-fwhm argument that could take a list fwhm for smoothing. For example, if I specify --smooth-fwhm 0.0 6.0, the output will contain a smoothed data for activation analysis and a unsmoothed data for MVPA analysis (also with ica denoised version). Neat!

Sorry - I misread your original message. To do non-agressive denoising two separate regressions are required so those steps cannot be combined. If you wanted to do aggressive denoising all confounds could be combined in one design matrix. Mind that FMRIPREP already outputs the non-aggressively denoised variant of the data in MNI space so you don’t have to run it yourself.

PS I’ll be on the lookout for good MP2RAGE skullstripping methods (CBS Tools might be worth checking out).

I only do it manually when I use a different smooth FWHM other than 6mm. That’s why I think a smooth argument is super helpful.

I will check out the CBS tools. Thanks for the suggestions.

Thanks a lot for these suggestions @ChrisGorgolewski. I am slightly confused by the subsequent answers to @Zhifang. I would like to implement your suggestions in order to obtained denoised data in T1 space at subject level (I have resting state data for which I would like to get the associated connectivity matrices).

  1. Run FMRIPREP with --use-aroma option

In this case, I assume I would need to run --use-aroma --output-space T1w template as suggested earlier by @effigies

  1. Take the *bold_space-MNI152NLin2009cAsym_preproc.nii.gz outputs.
  2. Either discard 4 first volumes, or as many volumes as the FMRIPREP steady non-state detector recommends as indicated by the number of NonSteadyStateOutlier. Alternatively you can use these columns in the denoising design matrix (see below).

I couldn’t find the NonSteadyStateOutlier information. I looked into the sub-<subject>_task-rest_bold_confounds.tsv file but could only find the columns WhiteMatter GlobalSignal stdDVARS non-stdDVARS vx-wisestdDVARS FramewiseDisplacement tCompCor0* aCompCor* X Y Z RotX RotY RotZ AROMAAggrComp0*

  1. Smooth the data with 8mm (or whatever is appropriate for your data). Alternatively you can opt for not smoothing the 4D data and smoothing only the first level statistical maps prior to doing group level.

From the subsequent posts related to the questions of @Zhifang it seems that a 6mm smoothing kernel is applied before the ICA-AROMA calculations. Are these smoothed data only temporary and discarded? That is, the 8mm smoothing kernel is to be applied to sub-<subject_label>_task-<task_id>_bold_space-T1w_preproc.nii.gz which is unsmoothed, is that correct?

  1. Create a denoising design matrix that would include the following columns from the _confounds.tsv file: AROMAAggrComp* (those are the noise components from ICA AROMA), WhiteMatter, and aCompCor01 (first component of the anatomical compcor is a good approximation of the mean signal in WM and CSF). In addition to achieve the linear detrending you should add a column of ascending numbers (1,2,3,4 etc).

I have a few questions regarding this point:

  • Is the associated command as below (confounds file transposed to be row-wise for fsl_regfit ):
    fsl_regfilt -i sub-<subject_label>_task-<task_id>_bold_space-T1w_preproc_smooth8mm.nii.gz -f <col nums for WhiteMatter, aCompCor01 and AROMAAggrComp*> -d sub-<subject_label>_task-<task_id>_bold_confounds_transpose.tsv -o sub-<subject_label>_task-<task_id>_bold_space-<space>_T1w_preproc_smooth8mm_Denoised.nii.gz
  • Linear detrending is done in the reference pipeline I suggested before the ICA-AROMA calculations, I assume this is not currently possible with fMRIprep? The fallback solution you proposed is doing it at the noise regression step by adding a row “1, 2, …, size of 4th dim” to sub-<subject_label>_task-<task_id>_bold_confounds_transpose.tsv if I understood correctly. I am not sure of the exact differences it creates compared to pre-AROMA calculations detrending.
  • Regarding WM and CSF regressors, from a statistical point of view, wouldn’t the correlation between WhiteMatter and aCompCor01 be a problem for the stability of the GLM (since aCompCor01 also includes WhiteMatter)? Maybe applying the solution proposed below would be helpful?

PPS The reason why we output only WM but not CSF is that for participants with small ventricles we cannot reliably estimate CSF ROIs without running into partial volume issues. I would love to hear suggestions how to work around this.

What about using a robust CSF estimation (e.g. tissue probability maps with 99% threshold, eroded CSF mask) and ignoring CSF signal if the number of voxels is below what you have in mind when you mentioned “small ventricles”? Could ignoring the signal be implemented as n/a values in the confounds file (i saw such n/a values in the first row of these files for the columns stdDVARS and non-stdDVARS )? This would allow to get a distinct CSF regressor column in the confounds file.

Thanks for the useful reference @oesteban. This point was actually addressed in the manuscript proposing the pipeline I referenced, where they tested despiking both before and after denoising, and did not see any significant difference in any of their pipeline performance metrics.

Considering it would be difficult at the moment to introduce despiking early on in the pipeline with fmriprep, it is then probably better (easier) to do it at the end.

Those columns are only added if the non-steady state was detected. If it wasn’t (probably because your scanner acquired dummy scans to deal with it before starting the sequence) no such columns will be added to the _confounds.tsv file. In such case I don’t see why would you need to discard any volumes, but it would not hurt (unless your sequence is very short).

The 6mm smoothed data is used to create the _variant-smoothAROMAnonaggr_preproc.nii.gz. The other _preproc.nii.gz file is unsmoothed.

Possibly - you would have to check if fsl_regfilt is not going to get stuck on headers (if it does it would be worth pinging FSL people if this could be fixed- selecting columns by names rather than numbers seems less error prone)

I don’t expect a huge difference. If you don’t detrent prior to ICA one of the components will most likely pick on the trend anyway.

Yes many of those regressors will be colinear, but since we only care about the residuals of this model this is not a problem. Colinearity will lead to variance in parameter estimates, but will not negatively impact the residuals.

Good idea.

Awesome, thanks @ChrisGorgolewski. Just two small remaining issues (let me know if I should write this somewhere else):

  1. I do not have any visualization output of the ICA-AROMA analysis, as described and shown in the fmriprep documentation

  2. You mentioned that:

The 6mm smoothed data is used to create the _variant-smoothAROMAnonaggr_preproc.nii.gz. The other _preproc.nii.gz file is unsmoothed.

But I do not have the smoothed file. In the func directory I have the following files:


And in the anat directory, I have (excluding surfaces):


This looks like a bug. Could you open an issue on github sharing an HTML report and the command line invocation you used?

Hi @michael. Sorry if I was unable to find it, but what version of fmriprep are you using? The outputs you’re missing were both added in 1.0.0-rc1 (a.k.a. 0.6.1), so if you’re using 0.6.0 or earlier, the issue is that you need to upgrade.

If not, I would encourage you to upgrade to the latest, and re-run. This should result in better error reporting, and make it easier to address this issue.

It would also be good to open a new issue on GitHub (or just a new thread in NeuroStars) to keep this thread from getting further off-topic.

Thank you @effigies, I now have to complement the pipeline discussed above with additional steps on the fmriprep outputs I already have, but for the next task I will certainly upgrade fmriprep to the latest version as you suggested (and if required open a new issue on github at that point).

RE: median 10000 normalization

I haven’t looked at the code to see the reason why, but FSL’s FILM_GLS does not like negative values or even values close to zero I believe, the performance of that algorithm depends on high(ish) voxel intensities, so making the median normalization makes it so that no voxels are in danger of being too close to zero or negative.

@ChrisGorgolewski Did you mean indeed aCompCor01 or aCompCor00? Also I have on average 30 +/- 10 AROMAAggrComp components per subject. Does that seem reasonable?

Sorry I meant aCompCor00. The number of components seems reasonable.

1 Like

I could not used the suggestion you gave as

raised the error:

ERROR: no registration file specified

However, what about replacing the two commands you mentioned with the command below? It seems to work in the single case I tested:

mri_convert -rl /preproc/freesurfer/sub-<subject_label>/mri/rawavg.mgz -rt nearest /preproc/freesurfer/sub-<subject_label>/mri/aparc+aseg.mgz aparc+aseg.nii.gz

This command is discussed here for example.

Sorry, I forgot the --regheader flag. I’ve updated the command in my original reply.

The command you gave may work just as well. Good find!

Thanks @effigies. Actually my command creates issues as for some reason the headers of the atlas and T1 volumes are not exactly the same (sform and qform are different around the 10^-6 digits), so yours works better. I still got the same error though, and had to add --reg dummy.dat for your command to work, was that the right thing to do?

An important question: I thought fmriprep/sub-<subject_label>/func/sub-<subject_label>_task-rest_bold_space-T1w_preproc.nii.gz was in the same space as the T1, but they have different voxel size (~ 3mm isotropic for the first, and ~ 1mm for the latter). Then:

  • Does the fmriprep pipeline align the bold data with the T1 data (...T1w_preproc.nii.gz being the result) but without resampling to T1 space?
  • To compute the connectivity matrices, I would need the bold data exactly in the same space as the aparc+aseg in native T1 space (that includes same voxel size). Would you have any suggestion on how to do that? (Having aparc+aseg with the same low resolution as the bold data would be fine and maybe preferable to avoid further processing of the time series)

I think you actually need to add --regheader to mri_vol2vol as well. Sorry for not catching that, earlier. If that works, let me know, and I’ll update the command above.

When we say the BOLD and T1w images are in the same space, what we mean is that they have been coregistered such that the same RAS coordinates in either get you the same brain structure. It is very rare that someone would want to upsample their BOLD data into T1w resolution, I think.


Ah, yes. You’ll still need to down-sample. You can probably do this directly by switching from using T1w_preproc.nii.gz to one of your BOLD outputs, in both the tkregister2 and mri_vol2vol commands, but you should really verify this…

Thanks for all the answers @effigies. In the case of the atlas I am using I prefer indeed to use native BOLD resolution, and I resampled the atlas to that resolution.

I got the same error in the first command (tkregister2) even when using --regheader. I could get rid of the error by adding --reg dummy.dat to save the transform (i.e. adding --regheader --reg dummy.dat to the original command ). No change was needed to mri_vol2vol (I did not need to add --regheader).