Hi everyone,

I’m new to tedana and multi-echo BOLD data, and I had a few questions about how to evaluate the tedana outputs for the purpose of quality control.

I’m working with a multi-echo resting-state fMRI dataset (2.5 mm isotropic, TR = 1670 ms, TEs = 15.60 ms, 38.20 ms, 60.80 ms, 83.40 ms, GRAPPA acceleration factor 2, multiband factor 4) in depressed but otherwise healthy adults. Each participant has two runs (AP and PA) of 300 volumes each. The data were pre-processed using fMRIPrep 23.2.3 with the `--me-output-echos` flag.

I ran a preliminary test of tedana 24.0.1 on 50 participants (100 runs) after removing dummy volumes, following the example here. I used the options `tedpca="aic"`, `fittype="curvefit"`, and `tedort=True`, with the intention of using the rejected components as custom confounds in XCP-D.
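For reference, the command-line invocation corresponding to those options would look roughly like this; the echo filenames and output directory are hypothetical placeholders, not paths from my data:

```shell
# Hypothetical invocation sketch; the echo images come from fMRIPrep's
# --me-output-echos outputs, and the filenames here are placeholders.
tedana \
  -d sub-01_echo-1_bold.nii.gz sub-01_echo-2_bold.nii.gz \
     sub-01_echo-3_bold.nii.gz sub-01_echo-4_bold.nii.gz \
  -e 15.60 38.20 60.80 83.40 \
  --tedpca aic \
  --fittype curvefit \
  --tedort \
  --out-dir tedana_out/sub-01
```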

For the PCA step, the number of components retained using AIC is on average 44.6 (range 21 to 113), and these components explain 63% of the variance in the data on average (range 41% to 87%). Considering that a single run has 300 volumes, 45 PCA components feels low. I have seen general guidelines on these and other forums that the number of components should fall between one-fifth and one-half of the number of volumes.
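As a quick sanity check, that rule of thumb can be written out directly; note the one-fifth and one-half fractions below come from the forum guideline, not from tedana itself:

```python
# Rule-of-thumb bounds on the PCA component count, using the
# "between 1/5 and 1/2 of the number of volumes" forum guideline.
n_volumes = 300          # volumes per run in this dataset

lower = n_volumes // 5   # one-fifth of volumes -> 60
upper = n_volumes // 2   # one-half of volumes -> 150

mean_components = 44.6   # observed mean under tedpca="aic"
print(lower <= mean_components <= upper)  # → False
```

By this heuristic the observed mean of 44.6 sits just below the lower bound of 60, which matches my sense that the count feels low.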

After the ICA step, there are on average 29 accepted components (range 11 to 70) and 15.5 rejected components (range 4 to 43), and the variance explained by the ICA decomposition is 85.99% on average (range 65.42% to 95.30%).

To summarize my questions:

- For this dataset, does the number of components or the variance explained after PCA seem too low considering 300 volumes? If so, should I try `tedpca="kundu"`?
- Similarly, is the variance explained by ICA acceptable?
- Are there other metrics that are useful in quality control?
- In general, are there acceptable ranges for these metrics (for example, variance explained by decomposition should be > 80%), or thresholds where a participant should be excluded (either hard thresholds, or based on standard deviation)?

I know there often aren’t hard cutoffs for metrics like these, and every dataset is different, but any guidance would be appreciated. We plan to manually inspect the components from a subset of participants, but we will have up to 600 resting state runs, so I’d like to come up with some guidelines for data-driven or automated quality control. Thanks for the help!
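For the data-driven pass, one option I am considering is to flag runs whose summary metrics are statistical outliers relative to the full sample, rather than using hard cutoffs. A minimal sketch, assuming hypothetical per-run summaries collected from each run's tedana outputs and an arbitrary 1-SD flagging threshold (both would need tuning):

```python
# Sketch of sample-relative QC flagging. The per-run values and the
# 1-SD threshold are illustrative placeholders, not real data.
from statistics import mean, stdev

runs = [
    {"run": "sub-01_run-1", "n_accepted": 29, "varexpl_ica": 85.9},
    {"run": "sub-02_run-1", "n_accepted": 11, "varexpl_ica": 65.4},
    {"run": "sub-03_run-1", "n_accepted": 40, "varexpl_ica": 94.1},
]

def flag_low(runs, key, n_sd=1.0):
    """Return run IDs more than n_sd SDs below the sample mean on key."""
    values = [r[key] for r in runs]
    cutoff = mean(values) - n_sd * stdev(values)
    return [r["run"] for r in runs if r[key] < cutoff]

# Flag a run if it is a low outlier on either metric.
flagged = set(flag_low(runs, "varexpl_ica")) | set(flag_low(runs, "n_accepted"))
print(sorted(flagged))  # → ['sub-02_run-1']
```

The same pattern would extend to other metrics (e.g., number of rejected components, or head-motion summaries from fMRIPrep), with flagged runs routed to manual inspection rather than excluded outright.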

Sincerely,

Keith Jones