Debugging duplicate study sessions in DICOM to BIDS conversion

Hi,
I have been given a set of DICOM data and I am using heudiconv to convert them to NIfTI and BIDS. Unfortunately, I have very limited experience converting DICOM data.

I have a custom heuristic file that I have developed based on the reproin example available in the heudiconv repository. The data I have contains T1w data, dMRI data (AP and PA) and SE field maps.

I am using a singularity image built on Oct 29 from the latest version. My call to heudiconv is simple:

singularity exec \
  --bind ${data_dirname} \
  ${heudiconv_singularity_fname} \
  heudiconv \
    -f ${config_fname} \
    --bids \
    -o ${out_dirname} \
    --files ${in_dirname}

where the variables have been appropriately set to the folders of interest.

The DICOM files are located in four separate folders (T1, DWI_AP, DWI_PA, Spin_Echo_Maps), each containing one subfolder per participant with all of that participant's DICOM files.

When running the conversion, heudiconv raises an assertion error: when it checks that the study session being parsed is not already in the list of study sessions, it finds that the session is already there.

The data belongs to the same study (at least for my purposes), but the StudySessionInfo instances show that they have no session information, and the locator (study) is different, e.g.

StudySessionInfo(locator='Investigators/MyStudySZPain10820', session=None, subject='092743')
(...)
StudySessionInfo(locator='Investigators/MyStudySZPain8232021', session=None, subject='197668')
(...)

There seem to be 6 different locators (MyStudy is a substitute for the real name):

Investigators/MyStudy
Investigators/MyStudyFDNeuro08242021
Investigators/MyStudyFDPain08242021
Investigators/MyStudySZPain
Investigators/MyStudySZPain10820
Investigators/MyStudySZPain8232021
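
One way to check where these locators come from, and which participants trip the assertion, is to tabulate a few DICOM header fields. The following is a minimal sketch, not a definitive diagnosis. It assumes pydicom is installed, and it assumes (as the reproin heuristic appears to do) that the locator is built from the StudyDescription field. The function names `find_repeated_studies` and `read_records` are hypothetical. A participant whose (PatientID, StudyDescription) pair shows up under more than one StudyInstanceUID would yield the same StudySessionInfo twice when no session is set.

```python
from collections import defaultdict
from pathlib import Path


def find_repeated_studies(records):
    """Return the (patient_id, study_description) pairs seen under several study UIDs.

    `records` is an iterable of (patient_id, study_description, study_uid) tuples.
    The result maps each offending pair to its sorted list of study UIDs.
    """
    uids = defaultdict(set)
    for patient_id, study_description, study_uid in records:
        uids[(patient_id, study_description)].add(study_uid)
    return {key: sorted(values) for key, values in uids.items() if len(values) > 1}


def read_records(top_dir):
    """Yield (PatientID, StudyDescription, StudyInstanceUID) for each DICOM file.

    Assumption: pydicom is available. It is imported here so that the grouping
    function above can be used without it. Reading every file is slow but simple.
    """
    import pydicom
    from pydicom.errors import InvalidDicomError

    for path in Path(top_dir).rglob("*"):
        if not path.is_file():
            continue
        try:
            ds = pydicom.dcmread(path, stop_before_pixels=True)
        except InvalidDicomError:
            continue  # not a DICOM file, skip it
        yield (
            str(ds.get("PatientID", "")),
            str(ds.get("StudyDescription", "")),
            str(ds.get("StudyInstanceUID", "")),
        )
```

Calling `find_repeated_studies(read_records(in_dirname))`, with `in_dirname` pointing at the input folder, should list the participants that were scanned in more than one study under the same description.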

If I keep only the DWI folders in my in_dirname, heudiconv is able to complete the conversion. It organizes the data into a main Investigators folder whose MyStudy* subfolders each contain a subset of the 68 participants in the dataset.

If I keep the T1 and DWI_AP folder in the in_dirname, I get the error mentioned above.

I found this PR

ENH: grouping by mgxd · Pull Request #359 · nipy/heudiconv · GitHub

and thought that maybe the --grouping all flag could be helpful to solve this. When using the flag I get an assertion error from my config file, because the seq infos are not unique (as said, inherited from the reproin example):

I do not know how to go about this or how to debug this.

Any help is highly appreciated.

Thanks.

Your data was probably acquired during different MRI sessions. If so, I think you need to convert them separately and save them under different ses folders in a final bids-valid folder. For example, in my study, I had two sessions per participant and I converted each session separately using the code provided below:

for subj_id in $subjects; do
  subj_dir="${project_path}/raw_data/sub-${subj_id}/"
  sess_i=1

  # Check if the subject directory exists
  if [ -d "$subj_dir" ]; then
    # Session folders are named sess-<id>
    sess_ids=$(ls "$subj_dir")

    for full_sess_id in $sess_ids; do
      # Strip the 5-character "sess-" prefix
      sess_id="${full_sess_id:5}"

      # {subject} is filled in by heudiconv from -s; the template is quoted
      # so that the shell does not try to expand the globs itself
      docker run --rm -v "${PWD}":/base nipy/heudiconv:latest \
        -d "/base/raw_data/sub-{subject}/sess-${sess_id}/*/*.dcm" \
        -o /base/bids_data/ \
        -f "/base/analysis/heuristic_sess0${sess_i}.py" \
        -s "$subj_id" -ss "00${sess_i}" \
        -c dcm2niix -b

      sess_i=$((sess_i+1))
    done
  else
    echo "Directory $subj_dir does not exist."
  fi
done

I hope it helps!

Thanks for the answer @egor.levchenko.

I applied heudiconv to incremental subsets of the data, starting from the first participant, until I was able to locate where heudiconv would error, telling me that the current study session had already been analyzed. I removed that participant and heudiconv ran without any apparent errors. I am not sure if that is the appropriate strategy, but I cannot think of other ways to make heudiconv proceed.

I had tried cloning the heudiconv source code and debugging it, but it was failing on simple things such as importing the __version__, the queue module shadowing some other system module, etc., so I did not follow that path.

> Your data was probably acquired during different MRI sessions.

They were definitely acquired on different days but using the same protocol (in theory), and there is no functional data, so I am not sure how a “session” would be defined here.
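
For what it is worth, a BIDS session is usually read as one visit to the scanner, regardless of which modalities were acquired. Under that reading, a participant scanned on two different days would have two sessions. A minimal sketch of that idea, assuming the study date (in the DICOM `YYYYMMDD` form) is a good enough proxy for a visit; the function name `assign_sessions` and the dates in the comments are hypothetical:

```python
from collections import defaultdict


def assign_sessions(visits):
    """Map each (subject, study_date) pair to a zero-padded session label.

    `visits` is an iterable of (subject, study_date) tuples, where study_date
    is a 'YYYYMMDD' string. For each subject, dates are ranked chronologically:
    the earliest becomes '01', the next '02', and so on.
    """
    dates = defaultdict(set)
    for subject, study_date in visits:
        dates[subject].add(study_date)

    labels = {}
    for subject, subject_dates in dates.items():
        # 'YYYYMMDD' strings sort lexicographically in chronological order
        for index, study_date in enumerate(sorted(subject_dates), start=1):
            labels[(subject, study_date)] = f"{index:02d}"
    return labels
```

A participant with a single scan date simply gets session '01', so this does not force a multi-session layout on participants who were scanned once.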

After heudiconv has finished the conversion (after I had removed the allegedly duplicate participant), participants are distributed across 6 folders/study names:

Investigators/MyStudy
Investigators/MyStudyFDNeuro08242021
Investigators/MyStudyFDPain08242021
Investigators/MyStudySZPain
Investigators/MyStudySZPain10820
Investigators/MyStudySZPain8232021

each having its own BIDS structure files (CHANGES, dataset_description.json, participants.json, participants.tsv, README, etc.).

> If so, I think you need to convert them separately and save them under different ses folders in a final bids-valid folder.

Thanks for the snippet, but I do not know which participants were acquired in which session. I was given all the DICOM files split by modality/acquisition (T1, DWI_AP, DWI_PA, Spin_Echo_Maps); within each of these, every participant has a folder whose name is an arbitrary identifier that does not match the participant ID in the DICOM data.

So unless I am missing something, there is no way for me to distinguish session folders.
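
One possible way around the folder names is to take the subject and session from the DICOM headers inside the heuristic itself. heudiconv heuristics can define an `infotoids(seqinfos, outdir)` function that returns the locator, session and subject for a group of series. The sketch below is untested against this dataset and makes several assumptions: that each seqinfo exposes `patient_id` and `date` attributes (as the SeqInfo tuples in recent heudiconv versions appear to), that the acquisition date is an acceptable session label, and that a single fixed locator is wanted. `STUDY_LOCATOR` is a hypothetical name.

```python
# Hypothetical fixed locator, so that all participants land in one dataset
STUDY_LOCATOR = "MyStudy"


def infotoids(seqinfos, outdir):
    """Derive locator, session and subject for one group of series.

    Assumes every seqinfo in the group has `patient_id` and `date` attributes
    and that the group belongs to one participant scanned on one day.
    """
    seqinfo_list = list(seqinfos)
    subjects = {s.patient_id for s in seqinfo_list}
    dates = {s.date for s in seqinfo_list}

    # Fail loudly instead of silently mixing participants or visits
    if len(subjects) != 1 or len(dates) != 1:
        raise ValueError(
            f"Expected one subject and one date, got {subjects} and {dates}"
        )

    return {
        "locator": STUDY_LOCATOR,
        "session": dates.pop(),  # e.g. '20210824', used as the session label
        "subject": subjects.pop(),
    }
```

With a session set this way, two visits of the same participant should no longer collapse into the same StudySessionInfo, and the fixed locator would keep everything under a single study folder. Patient IDs containing characters that BIDS does not allow in labels would still need cleaning.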