GSOC 2020 project idea 28: DIPY support for DICOM files

malin · February 5, 2020, 5:40pm

Magnetic resonance imaging (MRI) data is often stored in Digital Imaging and Communications in Medicine (DICOM) format. We want to give DIPY users support for reading and writing DICOM files, with an additional option to convert from DICOM to the NIfTI file format.

Understand how DICOM file format works.
Add support in DIPY to read and write DICOM files.
Understand and create a new DIPY command-line interface (Workflow).
Implement a robust DICOM to NIFTI conversion method.
(Optional) Create a workflow to connect and get data from a PACS server.

Difficulty: Intermediate

Skills required : Python/Cython, medical imaging.

Mentors : @bramsh Bramsh Chandio (bqchandi@iu.edu), @garyfallidis Eleftherios Garyfallidis (elef@indiana.edu)

snack0verflow · February 8, 2020, 11:21am

Hello Mentors!

I am Abid Abdullah, a CS undergrad at BITS-Hyderabad. I am interested in this project and want to learn more. I have university and industry experience in Python programming. I don’t know much about medical imaging, but I will put in 100% to learn what is needed for this project.
If you could help me get started and give me some pointers, that would be great.
Thanks

bramsh · February 9, 2020, 10:45pm

Hello Abid @snack0verflow,

Nice to e-meet you!

You can start by making a small enhancement/bugfix/documentation fix/etc to DIPY. It can help you get some idea about how things would work during the GSoC. The fix does not need to be related to your proposal.

You should look at DICOM files and how the format differs from NIfTI. You can try reading DICOM files in Python.
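As a first look at how the two formats differ on disk, here is a minimal sketch that tells them apart using only the standard library. It relies on two header facts: a DICOM Part 10 file carries a 128-byte preamble followed by the ASCII marker `DICM`, and a NIfTI-1 file starts with a fixed 348-byte header whose first integer is 348 and whose last four bytes are a magic string. The function names are made up for illustration. Gzip-compressed `.nii.gz` files and DICOM files written without the preamble will come back as unknown.

```python
import struct


def sniff_header(head):
    """Classify the first bytes of a file as 'dicom', 'nifti1', or 'unknown'."""
    # DICOM Part 10: 128-byte preamble, then the 4-byte marker "DICM".
    if len(head) >= 132 and head[128:132] == b"DICM":
        return "dicom"

    # NIfTI-1: 348-byte header, sizeof_hdr == 348 at offset 0, and a magic
    # string at offset 344: "n+1\0" (single .nii) or "ni1\0" (.hdr/.img pair).
    if len(head) >= 348 and head[344:348] in (b"n+1\x00", b"ni1\x00"):
        for byte_order in ("<", ">"):  # the header may be in either endianness
            if struct.unpack(byte_order + "i", head[:4])[0] == 348:
                return "nifti1"

    return "unknown"


def sniff_file(path):
    """Read just enough of a file to classify it."""
    with open(path, "rb") as f:
        return sniff_header(f.read(348))
```

For actually reading tags and pixel data you would use a library such as pydicom rather than parsing bytes by hand; the point here is only that the two containers are laid out very differently.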

snack0verflow · February 10, 2020, 12:48am

Thank you for your reply @bramsh
Could you please provide me with links to the relevant code repositories for DIPY? Any documentation that helps me get started would also be greatly appreciated. Thanks!

bramsh · February 10, 2020, 1:35am

@snack0verflow this will help you get a start:

Getting started with DIPY tutorial https://dipy.org/documentation/1.1.1./examples_built/quick_start/#example-quick-start

DIPY GSoC 2020 GitHub page https://github.com/dipy/dipy/wiki/Google-Summer-of-Code-2020

DIPY utils https://github.com/dipy/dipy/tree/master/dipy/io

Chris_Rorden · February 11, 2020, 2:36pm

I would suggest you define what a good solution would look like.

Develop a Python DICOM-to-NIfTI converter to replace dcm2niix. This would use the FSL convention for the b-values and b-vectors. If this is your goal, I would look at extending the excellent dicom2nifti. This already does a great job, but does not handle gradients.
Modify DIPY to directly read DICOM data without using NIfTI as an intermediate format. This would be more challenging, as different vendors use different coordinate systems for gradients, and use different storage types (mosaics vs XYZT vs XYTZ), transfer syntaxes, etc.
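To make the FSL convention mentioned in the first option concrete, here is a minimal sketch of the text layout only: the b-values go on a single whitespace-separated row, and the b-vectors go in three rows (x, y, z) with one column per volume. The function name and the number formatting are illustrative assumptions. The sketch deliberately does not attempt the hard part, which is getting the vendor-specific gradient directions into the right coordinate frame.

```python
def format_fsl_gradients(bvals, bvecs):
    """Return (bval_text, bvec_text) laid out the way FSL expects.

    bvals: sequence of N b-values.
    bvecs: sequence of N (x, y, z) gradient directions.
    """
    if len(bvals) != len(bvecs):
        raise ValueError("need exactly one gradient direction per b-value")
    if any(len(vec) != 3 for vec in bvecs):
        raise ValueError("each gradient direction needs 3 components")

    # One row holding all N b-values.
    bval_text = " ".join(f"{b:g}" for b in bvals) + "\n"

    # Transpose N x 3 into three rows: all x, then all y, then all z.
    rows = zip(*bvecs)
    bvec_text = "".join(
        " ".join(f"{component:.6f}" for component in row) + "\n" for row in rows
    )
    return bval_text, bvec_text


if __name__ == "__main__":
    bval_text, bvec_text = format_fsl_gradients(
        [0, 1000, 1000], [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
    )
    print(bval_text, bvec_text, sep="")
```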

Here is a repository of validation diffusion across scanners.

I would also consider exactly what problem you are trying to solve. DICOM formats are constantly being reinterpreted by vendors, so there will be a constant need to maintain code (e.g. recent enhanced DICOM for Philips and Siemens XA). There are many existing tools (see the ‘alternatives’ section here), so it may be more useful to extend an existing tool than to create a new one. Also, if you want to support the full gamut of transfer syntaxes, you will want to leverage non-Python code where libraries exist and where the data structures fit more naturally into other languages (e.g. I believe OpenJPEG is the only library that can read JPEG2000 with the 16-bit precision used by DICOM). dcm2niix is written in C, can be built statically with the libraries that support transfer syntaxes, and compiles on all the major operating systems. It might be worth just using that, or alternatively extending Python bindings, in the same way that divest lets R users combine all the nice features of the R language with the performance of C code for decoding images.

mrrichardchou · March 11, 2020, 7:02pm

Dear Mentors,

I am currently a first-year CS graduate student at the University of Florida. I think I’m a good candidate for this project: I have always wanted to participate in GSoC, and before this I worked for several years as a reconstruction engineer at a medical imaging company, where I dealt with DICOM files frequently.

Based on my experience, general read/write support for DICOM is not very difficult, but for advanced applications like DTI it can be quite challenging to understand the manufacturers’ proprietary tags and extract useful information from them. So I agree with what Chris_Rorden posted above: we should first define clearly what we are trying to solve.

Best Wishes
Richard Chou

Chris_Rorden · March 12, 2020, 1:55pm

@mrrichardchou while converting a single DICOM file can be simple, the challenge is to robustly handle the different conventions, transfer syntaxes, and errors encountered in the wild. There are several good tools out there, so I would suggest investigating them rather than re-inventing the wheel. In particular, I would suggest you consider dcm2niix (written in C) or dicom2nifti (Python, though depends on gdcmconv for compressed transfer syntaxes).

For evaluating performance, I would examine real world datasets where many series are jumbled together. This will show you how the tool scales. I would look at how performance scales both in terms of time to convert and maximum memory used to convert. In my experience, some conversion tools that are terrific for small datasets exceed the RAM available in modern computers when asked to convert large real world datasets such as the HCP sequences.
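A rough way to collect both numbers for any command-line converter is sketched below, using only the standard library. The helper name is made up for illustration. It relies on the Unix-only `resource` module, and the units of `ru_maxrss` depend on the platform (kilobytes on Linux, bytes on macOS). The value is a high-water mark across all child processes the current interpreter has waited for, so each tool should be measured from a fresh Python process.

```python
import resource
import subprocess
import sys
import time


def measure_command(cmd):
    """Run cmd and return (elapsed_seconds, peak_rss, return_code).

    peak_rss is the largest resident set size among child processes
    waited for so far (kilobytes on Linux, bytes on macOS).
    """
    start = time.perf_counter()
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    elapsed = time.perf_counter() - start
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_rss, completed.returncode


if __name__ == "__main__":
    # Placeholder command; swap in the converter invocation under test.
    elapsed, peak_rss, code = measure_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.3f} s, peak RSS {peak_rss}, exit code {code}")
```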

A good place to start is a modestly large dataset like this 428 MB DICOM DWI dataset. In my testing, this dataset was converted 10x faster and with 1/8th the RAM using dcm2niix versus dicom2nifti. My sense is that dicom2nifti attempts to store all input images in RAM during conversion, while dcm2niix reads all the headers and then loads and unloads each DICOM image's data as required. I do think dicom2nifti does a great job for small datasets, and is therefore a terrific foundation for future work. If you want to pursue a Python-based solution, I would start with dicom2nifti and see if the conversion could be modified to handle large DWI datasets.

mrrichardchou · March 13, 2020, 6:05pm

@Chris_Rorden Thanks for your suggestions; they are really helpful. I’m new to DIPY, and here is my understanding of this project so far:

Understand how DICOM file format works.
– DICOM is a big standard that covers a lot. I have some familiarity with the MRI-related parts (file formats, C-STORE, C-FIND, etc.) but have never dealt with things like Consistent Presentation of Images. I hope this will be sufficient for this project, and I will learn whatever else is necessary.
Add support in DIPY to read and write DICOM files.
– Libraries like pydicom will do this well. Here are some data type mappings I know of so far:

[Image: Annotation 2020-03-13 130640 (screenshot of the data type mappings)]
Understand and create a new DIPY command-line interface (Workflow).
– Not sure what kind of command-line interface is needed here so far.
Implement a robust DICOM to NIFTI conversion method.
– Will definitely try out dcm2niix. For the scope of this project, I’m currently thinking of providing a wrapper around something like dcm2niix. I hadn’t thought much about performance and RAM usage until you mentioned it, but you are right: we cannot assume everyone using DIPY has a beefy workstation with tons of RAM.
(Optional) Create a workflow to connect and get data from a PACS server.
– I have done this kind of task before. Full compatibility with every existing PACS server would be challenging, but support for well-behaved PACS servers should still be quite doable.
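For the read/write item, one kind of data type mapping that comes up when going from DICOM to NIfTI is pixel storage. The sketch below maps the DICOM `BitsAllocated` and `PixelRepresentation` attributes to NIfTI-1 datatype codes. The dictionary and function names are made up for illustration, and the mapping is a simplification: it ignores rescale slope and intercept, which in practice may call for a floating-point output type.

```python
# NIfTI-1 datatype codes, as defined in the NIfTI-1 header specification.
NIFTI_UINT8, NIFTI_INT16, NIFTI_INT32 = 2, 4, 8
NIFTI_INT8, NIFTI_UINT16, NIFTI_UINT32 = 256, 512, 768

# (BitsAllocated, PixelRepresentation) -> NIfTI-1 datatype code.
# PixelRepresentation: 0 = unsigned, 1 = two's complement signed.
DICOM_TO_NIFTI_DTYPE = {
    (8, 0): NIFTI_UINT8,
    (8, 1): NIFTI_INT8,
    (16, 0): NIFTI_UINT16,
    (16, 1): NIFTI_INT16,
    (32, 0): NIFTI_UINT32,
    (32, 1): NIFTI_INT32,
}


def nifti_datatype(bits_allocated, pixel_representation):
    """Return the NIfTI-1 datatype code for a DICOM integer pixel layout."""
    key = (bits_allocated, pixel_representation)
    try:
        return DICOM_TO_NIFTI_DTYPE[key]
    except KeyError:
        raise ValueError(f"unsupported pixel layout: {key}") from None


if __name__ == "__main__":
    # Signed 16-bit pixels, a common layout for MR images.
    print(nifti_datatype(16, 1))
```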

Chris_Rorden · March 14, 2020, 1:25pm

If you do decide to use dcm2niix, be aware that by default it converts ALL the series in a given directory to NIfTI (or NRRD). If DIPY wants to load a single series at a time, you can run dcm2niix once with the -n -1 parameter to list all the files in the directory and their corresponding CRC. You then run dcm2niix a second time with -n X where X is the CRC number of the series you want to convert.

FSLeyes provides Python code for this method when you choose the File/AddFromDICOM menu item. MRIcroGL also uses this method when a user drops a DICOM folder onto the application.

Here is a sample from the command line for the dcm_qa_philips dataset. The first call lists the available DICOM series, the second call converts just one series.

$ dcm2niix -n -1 ~/src/dcm_qa_philips
Chris Rorden’s dcm2niiX version v1.0.20200312 Clang11.0.0 (64-bit MacOS)
Found 2748 DICOM file(s)
619821335 /dcm_qa_philips/dcm_qa_philips_dti_0_1101
/dcm_qa_philips/In/Rosetta_2013/1101/dti.dcm
319436261 /dcm_qa_philips/dcm_qa_philips_WIP_dti_ax_20180517113239_601
/dcm_qa_philips/In/Vanderbilt_2018/601_WIP_dti_ax/0016.dcm
2544378125 /dcm_qa_philips/dcm_qa_philips_DT_HIGH_32DIR_SENSE_20081029124138_1201
/dcm_qa_philips/In/Bangalore_2008/dwi_no-phi/IM_0001
3957178676 /dcm_qa_philips/dcm_qa_philips_EPI_asc_CLEAR_20140214090057_201
/dcm_qa_philips/In/Magdeburg_2014/fmri/201_EPI_asc_CLEAR_0001_01.dcm
Conversion required 1.125741 seconds (1.123404 for core code).

$ dcm2niix -n 619821335 ~/src/dcm_qa_philips
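From Python, the two-step pattern above could be wrapped roughly as follows. This is a sketch only: the helper names are made up for illustration, and the parser assumes the listing looks like the transcript above (a CRC number followed by a series name on the same line), which may differ across dcm2niix versions. Only the `-n -1` and `-n X` options described in this post are used.

```python
import os
import re
import subprocess

# A listing line is assumed to be: <CRC number> <whitespace> <series name>.
_SERIES_LINE = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def parse_series_listing(text):
    """Map each series CRC (int) to the series name printed beside it."""
    series = {}
    for line in text.splitlines():
        match = _SERIES_LINE.match(line)
        if match:
            series[int(match.group(1))] = match.group(2).strip()
    return series


def list_series(dicom_dir):
    """First pass: ask dcm2niix to list the series found in dicom_dir."""
    # The exit status is not checked here because this sketch does not
    # assume what the listing mode returns.
    result = subprocess.run(
        ["dcm2niix", "-n", "-1", os.path.expanduser(dicom_dir)],
        capture_output=True,
        text=True,
    )
    return parse_series_listing(result.stdout)


def convert_series(dicom_dir, crc):
    """Second pass: convert only the series with the given CRC."""
    subprocess.run(
        ["dcm2niix", "-n", str(crc), os.path.expanduser(dicom_dir)],
        check=True,
    )
```

A caller would pick one key from `list_series("~/src/dcm_qa_philips")` and pass it to `convert_series`.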

mrrichardchou · March 15, 2020, 7:22am

Hi Bramsh,

I’m currently working on some proof-of-concept code that uses pydicom to read and write DICOM files in DIPY. Meanwhile, I’m also trying to finish the proposal for the final GSoC application. I have a question regarding the following sub-tasks:

Understand and create a new DIPY command-line interface (Workflow).
Implement a robust DICOM to NIFTI conversion method.

Where can I get more detailed requirements for these two tasks? Am I supposed to figure them out myself by reading the code, or by discussing them here or on Gitter?

Best Wishes
Richard Chou

Siddharth_Kapoor · March 31, 2020, 9:27am

@mentors
I have put together a proposal for this project. Apologies for posting it so late. Any suggestions or advice would be appreciated.