Longitudinal BIDS session naming with multiple fMRI paradigms

kslays · July 23, 2019, 8:16pm

We are working on using heudiconv to convert our DICOM files to BIDS for all the scans our lab has ever acquired. For most participants, we scanned them multiple times over the years. We have 6 fMRI project cohorts over about 18 years of scanning with over 300 subjects and a lot of demographic, behavioral, and physiological data. We’re in the process of seeing if we can make it public.

We are confused about session numbering in this longitudinal context. We could use absolute session numbering, which would be pretty straightforward. That way each participant will have multiple subfolders, one for each session, with session numbers ordered by date of acquisition. However, that might make fMRI analysis a bit confusing, for example if participant MOB101 did the stroop-task on session 1, but MOB120 did the stroop-task scan on session 3 (stroop-task visit 1) and again on session 4 (stroop-task visit 2).

In other words, we have two session counts: absolute session, and fMRI task session. Both are useful information.

When running fMRIPrep or other software, how would you specify which sessions to use?

We could specify this with a single BIDS session tag: ‘ses-02-stroop-02’. Or, we could specify it with two tags: ‘ses-02’ and ‘cohortses-stroop-02’. However, there is no ‘cohortses’ tag, so this seems like a bad idea.

So, the files could take one of the following forms (absolute session only, followed by the two tagged variants):
BOLD:
sub-MOB238_ses-03_task-stroop_run-01_bold.nii.gz
sub-MOB238_ses-03-stroop-02_task-stroop_run-01_bold.nii.gz
sub-MOB238_ses-03_cohortses-stroop-02_task-stroop_run-01_bold.nii.gz
T1w:
sub-MOB238_ses-03_run-01_T1w.nii.gz
sub-MOB238_ses-03-stroop-02_run-01_T1w.nii.gz
sub-MOB238_ses-03_cohortses-stroop-02_run-01_T1w.nii.gz

If we just used absolute session and didn’t tag fMRI task sessions, then I guess we could make a list of which absolute session the relevant fMRI task could be found in, but that sounds like a confusing mess. I think that is the approach suggested in this thread, with a sessions.tsv file and a sessions.json data dictionary:

The “type” column in the sessions.tsv could specify the fMRI task cohort (aka wave). During analysis, you could use that type column to filter, so it would be clear if MOB101 did stroop-task on ses-01 and MOB120 did stroop-task on ses-03 and again on ses-04.

Basically, my question is this: Should we just use “ses-01” in our filenames, or should we add information about the fMRI cohort (like “ses-01-stroop-01” or “ses-01_cohortses-01”)? (As an aside, are multiple hyphens allowed in a BIDS tag?)

franklin · July 23, 2019, 10:16pm

Hi @kslays

Thank you for your message! Great to hear you are converting your data over and plan to share!

Regarding the session count - I think absolute session numbering is the more straightforward approach and will help with sharing and reuse down the line. By chance, are the sessions unique? Session labels can also be strings rather than just numbers (see the description of longitudinal studies in the BIDS specification).

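Regarding your proposed example - a hyphen in BIDS separates the key from the value within a single key-value pair (as in ses-02), and underscores separate the pairs from one another. Having multiple hyphens in one tag is not part of the current standard, so absolute session numbering is probably the approach to take.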
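
As a rough illustration of that rule, here is a minimal Python sketch that checks whether a candidate session label uses only letters and digits, which is how the BIDS specification describes labels. The helper name is_valid_label is hypothetical and not part of any BIDS tool; the BIDS validator remains the authoritative check.

```python
import re

# Assumption: a BIDS label may contain only letters and digits.
LABEL_PATTERN = re.compile(r"[A-Za-z0-9]+")


def is_valid_label(label: str) -> bool:
    """Return True if the label consists solely of letters and digits."""
    return LABEL_PATTERN.fullmatch(label) is not None


# Candidate values for the part after "ses-" taken from this thread.
for candidate in ["03", "03stroop02", "03-stroop-02"]:
    print(candidate, is_valid_label(candidate))
```

Under this check a label such as 03stroop02 would pass, while 03-stroop-02 would not because of the extra hyphens.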

I think the additional metadata explaining the organization would be beneficial. Your proposed solution of using the type column to query across subjects can be helpful!

I would also recommend checking with the BIDS validator after converting a few subjects, to confirm that the structure conforms before expanding to the whole dataset.

Thank you,
Franklin

kslays · July 24, 2019, 8:16pm

Hi @franklin, thanks so much for your answers. The sessions are not entirely unique, because sometimes there were issues with the participant or the scanner equipment and we needed to collect multiple sessions to get all of our T1, fMRI, diffusion, etc. scans for a given project. So most of the time subjects will just have one session for a given project, but there are several subjects who have multiple sessions.

Does this look like a good plan?

Project outline

├── sessions.json
├── sub-MOB1001
│   ├── sub-MOB1001_sessions.tsv
│   ├── ses-01
│   └── ses-02
├── sub-MOB1002
│   ├── sub-MOB1002_sessions.tsv
│   ├── ses-01
│   ├── ses-02
│   ├── ses-03
│   └── ses-04
└── sub-MOB1003
    ├── sub-MOB1003_sessions.tsv
    ├── ses-01
    └── ses-02

Sessions file: /sub-MOB1001/sub-MOB1001_sessions.tsv

session_id    project
ses-01    stroop
ses-02    faces

Sessions file: /sub-MOB1002/sub-MOB1002_sessions.tsv

session_id    project
ses-01    stroop
ses-02    MID
ses-03    slots
ses-04    slots

Sessions file: /sub-MOB1003/sub-MOB1003_sessions.tsv

session_id    project
ses-01    slots
ses-01    MID
ses-02    faces

Data dictionary: /sessions.json

{  
   "project":{  
      "Description":"Project cohort",
      "Levels":{  
         "stroop":"Stroop project",
         "faces":"Emotional facial expressions project",
         "MID":"Monetary incentive delay project",
         "slots":"Slot machine project"
      }
   }
}

Then, when we want to analyze a particular fMRI project, we would have to figure out how to extract the following sub/ses combinations from the *sessions.tsv files. For example, here’s slots:

/sub-MOB1002/ses-03
/sub-MOB1002/ses-04
/sub-MOB1003/ses-01

The faces project would be:

/sub-MOB1001/ses-02
/sub-MOB1003/ses-02
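
One way to do that extraction is sketched below in Python, using only the standard library. It assumes each sub-*_sessions.tsv is tab-delimited with the session_id and project columns shown above; the function name sessions_for_project is hypothetical, and the demo rebuilds the three example tables in a temporary folder rather than reading a real dataset.

```python
import csv
import tempfile
from pathlib import Path


def sessions_for_project(bids_root, project):
    """Return sorted (subject, session) pairs whose 'project' column matches."""
    matches = set()
    for tsv in Path(bids_root).glob("sub-*/sub-*_sessions.tsv"):
        subject = tsv.parent.name
        with tsv.open(newline="") as handle:
            # Assumption: the file is tab-delimited with a header row.
            for row in csv.DictReader(handle, delimiter="\t"):
                if row["project"] == project:
                    matches.add((subject, row["session_id"]))
    return sorted(matches)


# Demo data copied from the example sessions.tsv tables above.
example = {
    "sub-MOB1001": [("ses-01", "stroop"), ("ses-02", "faces")],
    "sub-MOB1002": [("ses-01", "stroop"), ("ses-02", "MID"),
                    ("ses-03", "slots"), ("ses-04", "slots")],
    "sub-MOB1003": [("ses-01", "slots"), ("ses-01", "MID"), ("ses-02", "faces")],
}

with tempfile.TemporaryDirectory() as root:
    for subject, rows in example.items():
        subject_dir = Path(root) / subject
        subject_dir.mkdir()
        lines = ["session_id\tproject"] + [f"{ses}\t{proj}" for ses, proj in rows]
        (subject_dir / f"{subject}_sessions.tsv").write_text("\n".join(lines) + "\n")
    slots = sessions_for_project(root, "slots")
    faces = sessions_for_project(root, "faces")

for subject, session in slots:
    print(f"/{subject}/{session}")
```

The resulting list of pairs could then be turned into whatever per-subject input a given analysis tool expects.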

Do you know if this can be specified in fMRIPrep and other BIDS apps?

Alternatively, we could add a string onto the end of the ses number. For brevity, I’ll just show MOB1002 from the example above:

└── sub-MOB1002
    ├──sub-MOB1002_sessions.tsv
    ├── ses-01stroop01
    ├── ses-02MID01
    ├── ses-03slots01
    └── ses-04slots02
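
With this variant, analysis code would have to split each folder name back into its parts. A minimal Python sketch of that step follows, assuming the label is always an absolute session number, then a project name made of letters only, then a visit number. The name parse_composite_session is hypothetical.

```python
import re

# Assumption: "ses-" + digits (absolute session) + letters (project) + digits (visit).
COMPOSITE = re.compile(r"ses-(?P<absolute>\d+)(?P<project>[A-Za-z]+)(?P<visit>\d+)")


def parse_composite_session(name):
    """Split a name like 'ses-03slots01' into (absolute, project, visit)."""
    match = COMPOSITE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a composite session label: {name!r}")
    return int(match["absolute"]), match["project"], int(match["visit"])


for name in ["ses-01stroop01", "ses-02MID01", "ses-03slots01", "ses-04slots02"]:
    print(name, parse_composite_session(name))
```

Note that a project name containing digits would make this split ambiguous, which is one more argument for keeping the project in sessions.tsv instead.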

This does look pretty messy though. Do you have a preference?

These project sessions are also from multiple sites (Boston VA and Martinos) and scanners (Bay 4 and Bay 8), which we’re planning on addressing with four columns in participants.tsv: site, magnetmodel, bay, and headcoil.

We’ll use the BIDS validator, thanks for the suggestion.

franklin · July 24, 2019, 11:04pm

Hi @kslays

Thank you for sending over the structure! I think that looks good and makes sense! I think the first proposed structure looks better and is more easily queryable. The additional columns in participants.tsv sound good! It would be good to have a subject or two organized and validated first.

Regarding whether this can be specified in analysis packages like fMRIPrep - I do not believe so. fMRIPrep in particular runs at the single-subject level, so you would just pass it the data to analyze and should not have to worry about matching sessions at that stage.

Thank you,
Franklin