ABCD family nested within site or scanner ID? How to deal with inaccuracies?

Hi all,

After project week I started working on the ABCD data. For our multilevel model, we would like to include random effects for either 1) family nested within site (site_id_l) or 2) family nested within scanner ID (mri_info_deviceserialnumber). However, both of these variables seem to contain some inaccuracies.

  1. How is it possible that site_id_l has 22 unique values, while only 21 sites participate in the ABCD study?
  2. My frequency table of site by scanner also shows inconsistencies. After quality control, site 12, for example, reports 6 different scanner IDs with 15 (""), 42 ("HASH31ce566d"), 1 ("HASH3935c89e"), 2 ("HASHc3bf3d9c"), 1 ("HASHdb2589d4"), and 484 ("HASHe4f6957a") subjects, respectively.
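
For anyone who wants to reproduce this kind of check, here is a minimal pandas sketch of the site-by-scanner frequency table. The column names are the ones mentioned above. The toy data, the helper name `pair_counts`, the "(missing)" label, and the `min_n` cutoff are all made up for illustration and are not part of the ABCD release.

```python
import pandas as pd

# Toy data for illustration only; the values are invented.
df = pd.DataFrame({
    "site_id_l": ["site12", "site12", "site12", "site12", "site13", "site13"],
    "mri_info_deviceserialnumber": [
        "HASHe4f6957a", "HASHe4f6957a", "", "HASH31ce566d",
        "HASHaaaa0001", "HASHaaaa0001",
    ],
})

def pair_counts(data, site_col="site_id_l",
                scanner_col="mri_info_deviceserialnumber"):
    """Count subjects per (site, scanner) pair (hypothetical helper)."""
    # Treat NaN and empty strings as one explicit "(missing)" category
    scanner = data[scanner_col].fillna("").replace("", "(missing)")
    return data.groupby([data[site_col], scanner]).size()

counts = pair_counts(df)

# Wide frequency table: one row per site, one column per scanner ID
table = counts.unstack(fill_value=0)

# Site/scanner pairs with fewer than min_n subjects (arbitrary cutoff)
min_n = 2
rare = counts[counts < min_n]

print(df["site_id_l"].nunique(), "unique site codes")
print(table)
print(rare)
```

Listing the rare pairs separately makes it easier to decide whether to recode them by hand or leave them as they are.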

Any suggestions about how to deal with this? Manually correct? Or just accept these ‘small’ errors as they most likely won’t have a large impact on the final results?



Hi! For anyone who has the same question, I got the following answer via Slack from Angie Laird:

During the first year of the study, some sites were added and others were removed due to recruitment success and challenges, respectively. So while there are only 21 sites actively engaged as data collection sites now, at one point there were 22.