After project week I started working on the ABCD data. For our multi-level model we would like to use random effects for either 1) family nested within site (site_id_l) or 2) family nested within scanner ID (mri_info_deviceserialnumber). However, both of these variables seem to contain some inconsistencies.
- How can site_id_l have 22 unique values when only 21 sites participate in the ABCD study?
- My frequency table of site by scanner also shows inconsistencies. After quality control, site 12, for example, reports 6 different scanner IDs: "" (an empty ID, 15 subjects), "HASH31ce566d" (42), "HASH3935c89e" (1), "HASHc3bf3d9c" (2), "HASHdb2589d4" (1), and "HASHe4f6957a" (484).
Any suggestions on how to deal with this? Should we correct these entries manually, or just accept these ‘small’ errors, since they most likely won’t have a large impact on the final results?
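
For anyone who wants to reproduce the check, here is a minimal pandas sketch of the site-by-scanner frequency table. It assumes the data are already loaded into a DataFrame containing the two columns named above. The function name `site_scanner_counts`, the `<missing>` label for empty IDs, the `"site12"` site label, and the cutoff of 5 subjects are illustrative choices of mine, not anything defined by ABCD. The toy data only replays the site 12 counts listed above.

```python
import pandas as pd


def site_scanner_counts(df, site_col="site_id_l",
                        scanner_col="mri_info_deviceserialnumber"):
    """Count subjects per (site, scanner) pair and each pair's share of its site."""
    tmp = df[[site_col, scanner_col]].copy()
    # Keep empty or missing scanner IDs visible instead of silently dropping them.
    tmp[scanner_col] = tmp[scanner_col].fillna("").replace("", "<missing>")
    counts = (
        tmp.groupby([site_col, scanner_col])
        .size()
        .rename("n")
        .reset_index()
    )
    # Fraction of the site's subjects that were scanned on each device.
    site_totals = counts.groupby(site_col)["n"].transform("sum")
    counts["share_of_site"] = counts["n"] / site_totals
    return counts


# Toy data: the site 12 counts from the post ("site12" is an assumed label).
site12 = {
    "": 15,
    "HASH31ce566d": 42,
    "HASH3935c89e": 1,
    "HASHc3bf3d9c": 2,
    "HASHdb2589d4": 1,
    "HASHe4f6957a": 484,
}
rows = [("site12", scanner) for scanner, n in site12.items() for _ in range(n)]
df = pd.DataFrame(rows, columns=["site_id_l", "mri_info_deviceserialnumber"])

counts = site_scanner_counts(df)
# Hypothetical cutoff: pairs with fewer than 5 subjects are candidates for review.
rare = counts[counts["n"] < 5]
print(counts)
print(rare)
```

Listing the rare pairs this way at least makes it explicit how many subjects a manual correction or an exclusion would affect before deciding between the two options.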