BIDS Validator error code: 51 - PHENOTYPE_SUBJECTS_MISSING

BIDS Validators,

The official BIDS Validator (version 1.8.2) is yelling at me for having any participant_id in a phenotype/ .tsv file without a matching sub-<participant_id>/ data folder at the top level of the BIDS data set root folder. I beg to differ.

The majority of this dataset we are preparing to share on OpenNeuro is surveys/questionnaires/blood tests/etc. Only participants who passed certain criteria got to come in for MRI and MEG data (which is also to be shared in the same BIDS data set). I feel if the participant_id is present in the participants.tsv file the BIDS Validator should not require an otherwise empty sub-<participant_id>/ folder.
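
For anyone who wants to see which IDs trigger the error, here is a minimal sketch of the kind of check involved. It is not the validator's actual implementation; the function name `phenotype_ids_without_folder` is hypothetical, and it assumes every phenotype/ .tsv file has a `participant_id` column with values like `sub-01`.

```python
import csv
from pathlib import Path


def phenotype_ids_without_folder(bids_root):
    """Map each phenotype/*.tsv file name to the participant_ids
    that have no matching sub-*/ folder at the dataset root."""
    root = Path(bids_root)
    # Subject folders sit directly under the dataset root, e.g. sub-01/
    subject_dirs = {p.name for p in root.glob("sub-*") if p.is_dir()}
    missing = {}
    for tsv in sorted((root / "phenotype").glob("*.tsv")):
        with tsv.open(newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            ids = {row["participant_id"] for row in reader}
        # IDs present in the phenotype file but absent as folders
        orphans = sorted(ids - subject_dirs)
        if orphans:
            missing[tsv.name] = orphans
    return missing
```

Calling `phenotype_ids_without_folder("/path/to/dataset")` returns an empty dict when every phenotype participant also has a data folder.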

Is this thought unreasonable, or have there just been fewer data sets preparing phenotype/ .tsv files before?

Thanks everybody!

Eric Earl
NIMH Data Science & Sharing Team
eric.earl@nih.gov


Hum… I’d be tempted to suggest putting the full phenotype/ .tsv files in a sourcedata subfolder.

This could help make it clear that the participants with MRI are a subset of the whole pool.

Not sure. Curious to see what others suggest.

Something like this:

└── dataset
    ├── sub-01
    ├── ...
    ├── participants.tsv
    ├── dataset_description.json
    └── sourcedata
        ├── participants.tsv
        └── phenotype
            ├── ...
            └── something.tsv

@Remi-Gau Thanks for the quick reply! I don’t think using sourcedata would be appropriate from the phrasing in the standard:

source data, which is defined as data before harmonization, reconstruction, and/or file format conversion (for example, E-Prime event logs or DICOM files).

But maybe I’m misunderstanding the meaning. I’m open to ideas. Thanks!

True. But nothing prevents you from doing it either. :slight_smile:

BIDS has almost no requirements on sourcedata, so I was more trying to find a way to stay within BIDS to accommodate this use case. (FYI I have done similar things in some of my stuff: keep the results of all the screened participants in sourcedata, but only the included ones are in the raw dataset.)

Another possibility (more far-fetched): could you store some of those phenotype files (surveys, questionnaires…) as events.tsv files in a beh folder?
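
A rough sketch of that sourcedata approach, in case it helps. The helper name `split_phenotype` is made up for illustration, and it assumes the file has a `participant_id` column with values like `sub-01`. It copies the full table to sourcedata/phenotype/ and rewrites the raw phenotype/ file with only the participants that have a sub-*/ folder.

```python
import csv
import shutil
from pathlib import Path


def split_phenotype(bids_root, tsv_name):
    """Keep the full table under sourcedata/phenotype/ and leave only
    participants with a sub-*/ folder in the raw phenotype/ file."""
    root = Path(bids_root)
    raw_tsv = root / "phenotype" / tsv_name
    full_copy = root / "sourcedata" / "phenotype" / tsv_name
    full_copy.parent.mkdir(parents=True, exist_ok=True)
    # Preserve every screened participant before filtering anything
    shutil.copy2(raw_tsv, full_copy)

    subject_dirs = {p.name for p in root.glob("sub-*") if p.is_dir()}
    with full_copy.open(newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        fieldnames = reader.fieldnames
        kept = [row for row in reader if row["participant_id"] in subject_dirs]

    # Overwrite the raw file with the included participants only
    with raw_tsv.open("w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(kept)
    return len(kept)
```

The function returns the number of rows kept in the raw file; the copy under sourcedata/ is left untouched, so nothing is lost.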

@Remi-Gau since no one else is jumping up to provide a suggestion (and I don’t think utilizing the beh functionality is the right move here), do you think this edge case is a BIDS Standard problem? Where would I go to propose a change to the BIDS Standard’s Modality Agnostic files these days? Should I propose an Issue on the BIDS standard repo or just jump straight to Pull Request?


In general it is good to open an issue on the BIDS standard repo to get the discussion rolling in “the official venue” (not everyone watches Neurostars).

FYI I will be in touch soon about a phenotype issue that you might have opinions about.

Thanks @Remi-Gau! Happy to provide phenotype/ advice!