I’m currently working with the neuroCombat package in RStudio (GitHub: Jfortin1/neuroCombat_Rpackage, the neuroCombat R package). I have a more general question and would appreciate any opinions and insights on it.
I work on a multi-site study, and we have to apply ComBat to account for scanner effects (six different scanners are currently included in the study, two at each site).
Currently, I am preserving the variance associated with all of the following covariates across scanners (19 variables in total):
modcombat <- model.matrix(~ group + sex + age +
    rmet_total + er40_total +
    tasit1_total + tasit2_sin + tasit2_ssar + tasit2_psar + tasit3_lies + tasit3_sar +
    mccb_pspeed + mccb_atvig + mccb_wmem + mccb_verblearn + mccb_vislearn +
    mccb_reasonps + mccb_scog + bsfs_total, data = validation_data)
In the past, we have only preserved the variance for the main variables: group (diagnostic group), sex, and age. In the example above, I am also preserving variance in our cognitive variables of interest. Does including this many covariates affect the harmonization, and if so, how many is too many?
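To make the question more concrete, here is a minimal sketch of the sanity checks I could run on the design matrix before harmonizing. It uses only base R and simulated data: demo_data, cog1, cog2, and the scanner column are hypothetical stand-ins for validation_data and its variables, not our real data. The idea is that ComBat fits the covariates in a linear model alongside the scanner terms, so it seems worth checking how many rows survive missing values, whether the design is full rank, and how many subjects each scanner has relative to the number of covariate columns.

```r
# Hypothetical stand-in for validation_data: 120 subjects on 6 scanners.
set.seed(1)
n <- 120
demo_data <- data.frame(
  scanner = factor(rep(paste0("scanner", 1:6), each = n / 6)),
  group   = factor(sample(c("control", "patient"), n, replace = TRUE)),
  sex     = factor(sample(c("F", "M"), n, replace = TRUE)),
  age     = rnorm(n, mean = 30, sd = 8),
  cog1    = rnorm(n),  # placeholder cognitive score
  cog2    = rnorm(n)   # placeholder cognitive score
)
demo_data$cog2[c(3, 50)] <- NA  # simulate two subjects with a missing score

# Same pattern as modcombat, with fewer covariates.
mod <- model.matrix(~ group + sex + age + cog1 + cog2, data = demo_data)

# With the default na.action, model.matrix() silently drops rows that have
# a missing value in any covariate, so the matrix can end up with fewer
# rows than there are subjects.
n_dropped <- nrow(demo_data) - nrow(mod)

# Keep the scanner labels aligned with the rows that survived.
kept  <- as.integer(rownames(mod))
batch <- droplevels(demo_data$scanner[kept])

# Is the covariate design full rank on its own?
full_rank <- qr(mod)$rank == ncol(mod)

# Is it still full rank once scanner indicators are added?
design <- cbind(model.matrix(~ 0 + batch), mod[, -1, drop = FALSE])
design_full_rank <- qr(design)$rank == ncol(design)

# Subjects per scanner versus number of covariate columns to estimate.
per_scanner      <- table(batch)
n_covariate_cols <- ncol(mod) - 1  # excluding the intercept

print(n_dropped)
print(per_scanner)
print(c(full_rank = full_rank, design_full_rank = design_full_rank))

# Assumed call, not run here: 'features' would be a hypothetical
# features-by-subjects matrix restricted to the kept subjects.
# harmonized <- neuroCombat(dat = features, batch = batch, mod = mod)
```

None of these checks says how many covariates is too many, but a rank-deficient design, or a scanner with few subjects relative to n_covariate_cols, would be an obvious warning sign.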