Behavior Metadata without (tsv) Event Data Related to Neuroimaging Data

alexrockhill · May 12, 2020, 9:27pm

Hmm, they also have values for each of these medication fields that are specific to the session (i.e., Parkinson’s disease patients on and off medications), so that doubles the number of fields and, more importantly, makes it ambiguous which session the phenotype data applies to. Unfortunately, I don’t think that solution works in this instance.

franklin · May 12, 2020, 9:42pm

Would it work to split along these two groups (on and off meds within your sample)? In that case there would be 2 phenotype files. Is there further hierarchy built into your design, say within the on-meds group? Though it appears your dataset has 1 session (ses-hc)?

I may be missing something. It may be worth taking a step back and clarifying your experimental design with respect to how the groups were designed. This will help in thinking through how to build in and maintain this hierarchy, and will ease reusability of your dataset.

alexrockhill · May 12, 2020, 9:45pm

I’m not sure what you mean. The on-meds and off-meds groups are the same subjects recorded in two sessions: on meds and off meds. I’m not sure that splitting those up would make sense.

The healthy controls (hc) don’t have meds so they only do one session which I called hc to disambiguate and head off any misunderstandings.

franklin · May 12, 2020, 10:44pm

That was what I was interested in: insights into how this experiment was designed.

If I am interpreting correctly (the behavioral portion) -
You have 2 groups: PD and Controls
Controls had 1 session
PD had 2 sessions: on and off meds
Within a session there was behavioral testing

Please correct me if I misunderstood

In this case, to account for the two different sessions and address the potential ambiguity between them, it seems reasonable to split. The files will be in the same phenotype folder, with the filename clearly differentiating the two sessions. You’ll be using the same subject identifiers across the two files. This would also help remedy the sparse matrix concern (still sparse, but not as much as if they were pooled together). Controls will also have their own file. The human readability of this data comes from the sidecar JSON. Downstream scripts can extract and collect the pertinent data for modeling.
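A minimal sketch of the split described above, using only the Python standard library. The file stem `medication`, the `_ses-<label>` suffix in the file name, the subject IDs, and the `example_score` column are all hypothetical placeholders; the session suffix in particular is just an illustrative naming convention, not something the BIDS specification defines for phenotype files.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical per-session rows; IDs, column name, and values are made up.
ROWS = {
    "on": [{"participant_id": "sub-pd01", "example_score": "12"},
           {"participant_id": "sub-pd02", "example_score": "15"}],
    "off": [{"participant_id": "sub-pd01", "example_score": "20"},
            {"participant_id": "sub-pd02", "example_score": "24"}],
    "hc": [{"participant_id": "sub-hc01", "example_score": "3"}],
}

# Sidecar content that makes the column human readable.
SIDECAR = {"example_score": {"Description": "Hypothetical assessment score"}}


def write_phenotype(root, name, rows, sidecar):
    """Write phenotype/<name>.tsv plus a JSON sidecar describing its columns."""
    pheno = Path(root) / "phenotype"
    pheno.mkdir(parents=True, exist_ok=True)
    tsv = pheno / f"{name}.tsv"
    with tsv.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]), delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    (pheno / f"{name}.json").write_text(json.dumps(sidecar, indent=2))
    return tsv


def collect(root, prefix):
    """Downstream step: pool the split files into one long table,
    tagging every row with the session label taken from the file name."""
    table = []
    for tsv in sorted((Path(root) / "phenotype").glob(f"{prefix}_ses-*.tsv")):
        session = tsv.stem.split("_ses-")[1]
        with tsv.open(newline="") as f:
            for row in csv.DictReader(f, delimiter="\t"):
                table.append({**row, "session": session})
    return table


root = tempfile.mkdtemp()  # stand-in for the dataset root
for ses, rows in ROWS.items():
    write_phenotype(root, f"medication_ses-{ses}", rows, SIDECAR)

table = collect(root, "medication")
```

Because the same `participant_id` values appear in the on and off files, the pooled `table` can be grouped by subject and session for modeling.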

This may come down to preference and style for representing your dataset. Perhaps others in our community have addressed this coding previously.

alexrockhill · May 13, 2020, 12:09am

Ok, great, I think we’re on the same page. My concern is that I don’t see how phenotype data can be linked specifically to sessions from the documentation here: https://bids-specification.readthedocs.io/en/latest/03-modality-agnostic-files.html#phenotypic-and-assessment-data. You could include ‘ses-{on|off|hc}’ in the file name, but that is unstructured and not in the specification, so it seems likely to cause ambiguity.

What I like about the way it is structured now is that the data relevant only to the session of interest is in that particular session directory. In my interpretation, that is a general principle of BIDS: data/metadata ideally goes in the most upstream directory for which it applies to everything downstream. For example, it wouldn’t make sense to me to have a bids_root/task-rest_ses-on-beh.json file, because it doesn’t apply to the other sessions, which are also downstream of the root directory. In that case it would be better to have a sidecar for each behavior file in the session folder it applies to, even if that’s a bit redundant, for clarity. Maybe that’s not essential, but I think it’s better to err on the side of putting metadata in the same directory as the data, paired with the behavior file it describes, rather than putting everything upstream and relying on someone to sort out which downstream structures it applies to.
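To make the per-session layout concrete, here is a rough sketch that writes one behavior file and its sidecar into each session directory. The subject labels, the `example_score` column, and the helper name `write_session_beh` are hypothetical; the `sub-<label>/ses-<label>/beh/` path pattern is my reading of the BIDS behavioral layout, so check it against the specification before relying on it.

```python
import json
import tempfile
from pathlib import Path


def write_session_beh(root, sub, ses, task, header, rows, sidecar):
    """Write a behavior TSV and a matching JSON sidecar side by side
    in the session directory they belong to."""
    beh_dir = Path(root) / f"sub-{sub}" / f"ses-{ses}" / "beh"
    beh_dir.mkdir(parents=True, exist_ok=True)
    stem = f"sub-{sub}_ses-{ses}_task-{task}_beh"
    tsv = beh_dir / f"{stem}.tsv"
    lines = ["\t".join(header)] + ["\t".join(map(str, r)) for r in rows]
    tsv.write_text("\n".join(lines) + "\n")
    # One sidecar per file: redundant across sessions, but unambiguous.
    (beh_dir / f"{stem}.json").write_text(json.dumps(sidecar, indent=2))
    return tsv


root = Path(tempfile.mkdtemp())  # stand-in for bids_root
sidecar = {"example_score": {"Description": "Hypothetical behavioral measure"}}

# PD subjects get two sessions (on, off); controls get a single hc session.
written = [
    write_session_beh(root, sub, ses, "rest", ["example_score"], [[1]], sidecar)
    for sub, ses in [("pd01", "on"), ("pd01", "off"), ("hc01", "hc")]
]
```

With this layout, nothing at the dataset root has to be matched back to a session: each sidecar sits next to the one file it describes.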

franklin · May 13, 2020, 12:38am

Good to hear we’re on the same page! I understand and that works - sounds good!