Getting started using fmriprep's ICA-AROMA outputs

Hi All,

I am a research assistant charged with incorporating ICA-AROMA outputs into our analysis pipeline. We’ve just completed running fmirprep with version 20.

I’ve been looking over threads in NeuroStars such as @rastko’s comments here and here (since he is apparently a leader in denoising), other threads here and here, and some linked notebooks and github issues, but I’m still struggling to wrap my head around what to do. Our original pipeline uses the original preprocessed outputs (e.g. sub-s001_ses-1_task-stop_run-1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz , and to add in confounds following this function:

def process_confounds(confounds_file, a_comp_cor=True):
    scrubbing for TASK
    remove TRs where FD>.5, stdDVARS (that relates to DVARS>.5)
    regressors to use
    junk regressor: errors, ommissions, maybe very fast RTs (less than 50 ms)
    confounds_df = pd.read_csv(confounds_file, sep = '\t', 
    excessive_movement = (confounds_df.framewise_displacement>.5) & \
    excessive_movement_TRs = excessive_movement[excessive_movement].index
    excessive_movement_regressors = np.zeros([confounds_df.shape[0], 
    for i,TR in enumerate(excessive_movement_TRs):
        excessive_movement_regressors[TR,i] = 1
    excessive_movement_regressor_names = ['rejectTR_%d' % TR for TR in 
    # get movement regressors
    movement_regressor_names = ['trans_x','trans_y','trans_z','rot_x','rot_y','rot_z']
    movement_regressors = confounds_df.loc[:,movement_regressor_names]
    movement_regressor_names += [i+'td' for i in movement_regressor_names]
    movement_regressors = np.hstack((movement_regressors, np.gradient(movement_regressors,axis=0)))
    # add square
    movement_regressor_names += [i+'_sq' for i in movement_regressor_names]
    movement_regressors = np.hstack((movement_regressors, movement_regressors**2))
    # add additional relevant regressors
    add_regressor_names = ['framewise_displacement'] 
    if a_comp_cor: 
        add_regressor_names += [i for i in confounds_df.columns if 'a_comp_cor' in i][:8]
    additional_regressors = confounds_df.loc[:,add_regressor_names].values
    regressors = np.hstack((movement_regressors,
    # concatenate regressor names
    regressor_names = movement_regressor_names + add_regressor_names + \
    return regressors, regressor_names

As I understand it, there are two main options for incorporating ICA aroma outputs.

  1. sub out sub-s001_ses-1_task-stop_run-1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz for sub-s001_ses-1_task-stop_run-1_space-MNI152NLin6Asym_desc-smoothAROMAnonaggr_bold.nii.gz
  2. add the first n components of sub-s001_ses-1_task-stop_run-1_AROMAnoiseICs.csv to our confounds matrix (guidance welcome on the number of components, preliminary skimmings suggest 100-150)

It seems like in both cases, dropping the realignment parameters is recommended.Are these two options correct interpretations of the forums? One thing that has made me nervous is Alex Fornito’s comment that AROMA doesn’t jive with multiband sequences, which we did use to collect the data.

Any guidance would be incredibly appreciated. Thanks in advance for your time and your patience.



On the whole, that is right — you can either use the nonaggressively denoised image produced by fMRIPrep or you can select and fit the regressors yourself — if you do your own fit, you will want to examine the metadata to ensure that you’re using only the regressors classified as artefact.

To answer your other question — for ICA-AROMA, rather than selecting the first n components as you would in a PCA-based approach like CompCor, you would select all aroma_motion~ components with the key-value pair "MotionNoise": true in the metadata file ~desc-confounds_regressors.json as confound regressors. The discussion in the other thread I believe refers to limiting the dimension of the ICA decomposition itself, which you can cap in the 100-150 range using fMRIPrep’s --aroma-melodic-dimensionality option. (You will likely have to re-run the ICA-AROMA nodes of fMRIPrep to do this!) The exact decision could depend not only on whether your data are single-band vs. multiband but also on the number of TRs you have per functional scan, but the recommendation from the other thread should be a good starting point.

And yes, I’d currently recommend dropping the realignment parameters because (i) the original paper from Pruim and colleagues did not include them (and their use has accordingly not been validated in this context) and (ii) one of the features that ICA-AROMA uses in order to classify a component as noise is the correlation of the component time series with realignment parameters and derived values — so the thought is that the components themselves capture salient motion-related variance (and also including the realignment parameters might be unnecessary and potentially detrimental to effective data volume).

Including the mean WM and CSF signals as per Pruim and colleagues is strongly recommended. On the other hand, the merits of expanding ICA-AROMA to also include global signal regression (GSR: regressing the mean time series across the entire brain) are less certain — both our study and the Parkes et al. study from the Fornito group found that it helped to mitigate motion-related effects, but there is a vocal subset of the community that vehemently opposes it and argues that GSR discards important effects of interest. Pragmatically, there is always some chance that reviewers will ask you to do it both ways to assess robustness.

Also note, however, that there are some further nuances to denoising the data yourself if you choose that route rather than using the nonaggressively denoised data provided by fMRIPrep. In particular, the ICA-AROMA program uses fsl_regfilt, which in the nonaggressive case requires some extra command-line arguments and, importantly, is not the same as an ordinary least squares fit. I’m happy to provide further details if you choose this approach.

I haven’t had experience using ICA-AROMA on multiband data myself, but I’ve heard reports in both directions (i.e., that there are substantial problems with performance generalization to this setting, and that it works reasonably well). I’m not aware of specific publications in this vein yet (Aug 2020). You could always try it and run a QC-FC analysis to assess residual motion effects, for which I’d be happy to advise, look over code, etc. There are also some software tools (denoiser, fMRIDenoise, XCP, etc.) that can automate the evaluation step for you.

Looking at your code, as a minor note, the latest version of fMRIPrep should be able to compute regression spikes for you using the --fd-spike-threshold .5 and --dvars-spike-threshold 1.2 options. It will also provide an expanded model that includes temporal derivatives and quadratic terms.

Please let me know if I can clarify anything!



I’ve been thinking of switching to this route, but it sounds like it’s not quite as simple as applying the aroma_motionXX confounds on the *desc-preproc_bold.nii.gz data. Could you elaborate on what these nuances are?