Heudiconv file mapping from dicom dirs to nifti files

I am using Heudiconv to convert my dicom dataset to nifti. I want to know whether there is a way to capture the mapping from directories in the dicom dataset to the nifti files produced.

I have taken a look at dicominfo.tsv and other files in .heudiconv (such as filegroup.json and auto.txt), but believe the information available there does not suffice to track conversion from the source dicom directories to the resultant nifti files. E.g. four sub-directories in my dicom dataset are converted to the same nifti directory (Nifti/sub-001/func) with file names:

  • sub-001_task-rest_dir-LR_run-01_bold.nii.gz
  • sub-001_task-rest_dir-LR_run-02_bold.nii.gz
  • sub-001_task-rest_dir-LR_run-03_bold.nii.gz
  • sub-001_task-rest_dir-LR_run-04_bold.nii.gz

Similarly, I have a bunch of Na data files being converted, resulting in files (in directory Nifti/sub-001/func) with names:

  • sub-001_acq-23Na_echo-01.nii.gz
  • sub-001_acq-23Na_echo-02.nii.gz
  • sub-001_acq-23Na_echo-16.nii.gz

I wish to identify which of the above Nifti files were generated from which dicom sub-directory. I presume the conversion might follow the alphabetical order of the dicom sub-directories, but I was curious to know whether there is a more robust way to capture this mapping.

To a degree, this is what the reproin heuristic does: see e.g. the "Reproin" page in the heudiconv 1.1.0 documentation, and, for how to remap, e.g. "heuristic option: sequence names remappings and other control" (Issue #18, ReproNim/reproin on GitHub).

But note that the “dicom sub-directory” is not something heudiconv relies on: it relies on metadata from the dicoms, and usually the dicom sub-directory name is based on that metadata.

Thanks for pointing me to Reproin. I have started exploring it to see how it can help resolve the mapping issue.

But note that the “dicom sub-directory” is not something heudiconv relies on: it relies on metadata from the dicoms, and usually the dicom sub-directory name is based on that metadata.

True. For example, one of the issues I am facing with the file mappings is that I have a dicom sub-directory which contains files:

  • original-primary-m-nd_e01_0001.dcm
  • original-primary-m-nd_e01_0060.dcm
  • original-primary-m-nd_e02_0001.dcm
  • original-primary-m-nd_e02_0060.dcm

I am using a custom heuristic file that is already in use at our lab.
In dicominfo.tsv, the above files appear as a single entry:

  • 41-gre_fieldmap_sag_3mm → 120 files

The heuristic also does not differentiate between the 120 files and treats them as a single entity. This is reflected in the filegroup.json generated by heudiconv, where all 120 files are assigned to the same key (41-gre_fieldmap_sag_3mm). But the Nifti conversion of this directory produces two files:

  • sub-001_acq-gre_run-01_magnitude1.nii.gz
  • sub-001_acq-gre_run-01_magnitude2.nii.gz
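A rough sketch of how the 120 files could be split per echo from their names alone. This assumes the `_eNN_` token in the file names shown above really encodes the echo number, and that the two magnitude files correspond to e01 and e02; neither is confirmed by heudiconv or dcm2niix. `group_by_echo` is a hypothetical helper, not part of any library.

```python
import re
from collections import defaultdict

# Assumption: names look like "<prefix>_e01_0001.dcm", where "e01" is the echo.
ECHO_TOKEN = re.compile(r"_e(\d+)_\d+\.dcm$")


def group_by_echo(filenames):
    """Group DICOM file names by the echo number parsed from the name.

    Files whose name does not match the assumed pattern go under key None.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = ECHO_TOKEN.search(name)
        key = int(match.group(1)) if match else None
        groups[key].append(name)
    return dict(groups)


files = [
    "original-primary-m-nd_e01_0001.dcm",
    "original-primary-m-nd_e01_0060.dcm",
    "original-primary-m-nd_e02_0001.dcm",
    "original-primary-m-nd_e02_0060.dcm",
]
groups = group_by_echo(files)
for echo, names in sorted(groups.items()):
    print(echo, names)
```

Reading the echo number or echo time from the DICOM headers would be more robust than trusting file names, if the scanner export does not follow this naming.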

In general, I was hoping to store the file mappings for provenance purposes, so that the nifti files can later be tracked back to their dicom sources. Something such as:
[file1_1.dcm, file1_2.dcm, ..., file1_N.dcm] -> file1.nii.gz
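A minimal sketch of what such a provenance table could look like at the series level. It assumes you have already loaded two dictionaries: `filegroup` (series id to list of dicom files, as described above for filegroup.json) and `conversion_table` (output name to list of series ids, as decided by the heuristic). The function name, the dictionary shapes, and the example paths are illustrative assumptions, not heudiconv API.

```python
import json


def build_provenance(conversion_table, filegroup):
    """Join 'output -> series ids' with 'series id -> dicom files'.

    Returns a dict mapping each output name to the sorted list of dicom
    files of all series that fed into it. Series ids missing from
    `filegroup` contribute no files.
    """
    mapping = {}
    for output, series_ids in conversion_table.items():
        files = []
        for series_id in series_ids:
            files.extend(filegroup.get(series_id, []))
        mapping[output] = sorted(files)
    return mapping


# Hypothetical inputs, shaped like the example in this thread.
filegroup = {
    "41-gre_fieldmap_sag_3mm": [
        "dicom/41/original-primary-m-nd_e02_0001.dcm",
        "dicom/41/original-primary-m-nd_e01_0001.dcm",
    ],
}
conversion_table = {
    "sub-001/fmap/sub-001_acq-gre_run-01_magnitude": ["41-gre_fieldmap_sag_3mm"],
}

provenance = build_provenance(conversion_table, filegroup)
print(json.dumps(provenance, indent=2))
```

This only resolves the mapping per series. It cannot tell which files of a series ended up in which of several outputs (e.g. magnitude1 vs magnitude2), since that split is not recorded at this level.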

(I was curious to test out --with-prov for seeing the available provenance data, but encountered this previously reported issue)

For that, the files under .heudiconv/ contain all the gory details of that particular subject/session conversion, but indeed they might not account for some of the renames heudiconv does for _echo- and similar suffixes.

Also, in the reproin heuristic we simply bundle the original dicoms under similar filenames under sourcedata/; see e.g. DataLad Repository for an example dataset converted using the reproin heuristic.

Yeah, that’s what I am finding: .heudiconv does not contain all the info required to identify the source. Having gone through the heudiconv internals now, I believe that some of the required info lies within dcm2niix and is not available via heudiconv (e.g. the handling of multi-echo data, i.e. which specific subset of files produced a given nifti echo file).

Thank you for starting this post.
I had a similar question re this line:

Is there any way to construct a custom heuristic that can access the metadata of the dicom files - or are we limited to just sorting files via column attributes in the dicominfo.tsv?

Do you mean some additional metadata from the dicoms besides what is already extracted? Then yes: since recent heudiconv you can add custom_seqinfo to your heuristic, which gives you access to the full dicom, so you can extract the metadata of interest to be serialized/added to that file. See e.g. heudiconv/heudiconv/heuristics/convertall_custom.py at 7caff37cf040d0ba12967b41b0385548ce6a51bd · nipy/heudiconv · GitHub
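A hedged sketch of what such a hook might look like in a heuristic file. It assumes heudiconv calls `custom_seqinfo` with the keyword arguments `wrapper` (a nibabel DICOM wrapper exposing the loaded dataset as `dcm_data`) and `series_files` (the list of files in the series), as in the linked convertall_custom.py; please check that file for the exact signature in your heudiconv version. The choice of `EchoTime` and the fake wrapper used for the demo are purely illustrative.

```python
from types import SimpleNamespace


def custom_seqinfo(wrapper, series_files, **kw):
    """Return extra per-series metadata for heudiconv to record.

    Assumption: `wrapper.dcm_data` is the loaded DICOM dataset and supports
    `.get(keyword)`. A tuple of strings is returned so the value stays
    hashable and easy to serialize.
    """
    echo_time = wrapper.dcm_data.get("EchoTime")
    # Keep one sample path so the series can be traced back to its source.
    return (str(echo_time), series_files[0])


# Stand-in for the real wrapper, only to show the call shape.
fake_wrapper = SimpleNamespace(dcm_data={"EchoTime": 4.92})
result = custom_seqinfo(
    wrapper=fake_wrapper,
    series_files=["original-primary-m-nd_e01_0001.dcm"],
)
print(result)
```

Note that the hook runs per series, so on its own it would not separate the two echoes in the fieldmap example above if they are grouped into one series.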