Reading CSA Series Header Info from a pydicom object

Fidel · October 23, 2018, 6:22pm

Hi there,

Does anybody know how to properly read/store the array of bytes that comes in the ‘CSA Series Header Info’ field ((0029, 1020)) of a pydicom object read from a SIEMENS DICOM file?

I made the mistake of converting every field to a string and the result makes no sense at all (it is probably an encoding problem, but I have not found much documentation on this).

Thanks

Chris_Rorden · October 24, 2018, 12:32pm

Fidel,
The nibabel documentation describes these briefly, and you may want to look at test_csareader.py.
Basically, the Siemens CSA data is stored in two segments: the organized Image Header Info (0029,1010), which uses a series of tags and null-terminated strings, and the less organized Series Header Info (0029,1020). I would generally be cautious of parsing the latter, as it seems pretty unstable.

If you are willing to delve into C, you can modify the dcm2niix function siemensCsaAscii() which attempts to parse many attributes from this section.

If you want to do this in Python, the trick to reading 0029,1020 is to only look at the ASCII text in between the flags

   ### ASCCONV BEGIN
  ... 
   ### ASCCONV END

This section includes strings with Unix line termination (0x0A). Each string starts with a tag name, white space and a tag value. For example

sPat.lAccelFactPE                        = 2

suggests that the iPAT factor (SENSE/GRAPPA) was x2.
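
As a rough Python sketch of the approach described above: pull out the text between the ASCCONV flags and split each line at the first "=". The function name parse_ascconv is made up for illustration, values are left as strings, and the pattern assumes the BEGIN flag may be followed by extra text on the same line. If a header contains more than one ASCCONV block, this only looks at the first.

```python
import re

# Text between the flags; the BEGIN line may carry extra text (assumption).
ASCCONV_RE = re.compile(rb"### ASCCONV BEGIN[^\n]*\n(.*?)### ASCCONV END", re.DOTALL)


def parse_ascconv(raw: bytes) -> dict:
    """Return {tag name: value string} from the first ASCCONV block in raw bytes."""
    match = ASCCONV_RE.search(raw)
    if match is None:
        return {}
    text = match.group(1).decode("ascii", errors="replace")
    params = {}
    for line in text.split("\n"):  # Unix line termination (0x0A)
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines or anything without "name = value"
        params[name.strip()] = value.strip().strip('"')
    return params


# With pydicom, the raw bytes would come from the private tag, e.g.:
#   ds = pydicom.dcmread("/some/siemens/image.dcm")
#   raw = ds[0x0029, 0x1020].value
# Here a tiny hand-made byte string stands in for a real header.
example = (
    b"\x00\x01binary junk### ASCCONV BEGIN ###\n"
    b"sPat.lAccelFactPE                        = 2\n"
    b"### ASCCONV END ###\x00"
)
params = parse_ascconv(example)
print(params["sPat.lAccelFactPE"])
```

Everything outside the flags is ignored, which is the point: the surrounding binary is what makes a naive bytes-to-string conversion look like nonsense.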

raamana · January 12, 2022, 3:00pm

Hi Chris, is this the only reader you are aware of? I wonder why pydicom or dicom_parser do not offer these readers? Or perhaps they do and we just need to dig them out of their libraries?

Chris_Rorden · January 13, 2022, 3:30pm

I am not aware of any other Python implementation that reads the Siemens CSA header. Even dicom2nifti uses DICOM tags. The CSA header is an undocumented format of binary and ASCII information dumped into a private tag, so reading this seems to be beyond the scope of a typical DICOM reader: it does not use the DICOM format.

The Siemens XA* systems use conventional DICOM tags to store meta data, and dispense with the proprietary CSA header seen in VA-VE systems. As VE scanners are getting upgraded to XA30, the CSA format will soon only be an issue for archival studies.

Both dcm2niix (written in C) and dicm2nii (Matlab) decode the CSA header, and could be translated to Python.

effigies · January 13, 2022, 9:48pm

I believe https://github.com/open-dicom/dicom_parser does read the CSA headers, though @zvibaratz can say more.

ZviBaratz · January 14, 2022, 11:40am

@effigies you are correct. I think there are quite a few unpublished fixes, but if you install dicom_parser from master you should be able to get a dictionary of the parsed values, e.g.:

>>> from dicom_parser import Header

>>> dcm_path = "/some/siemens/image.dcm"
>>> header = Header(dcm_path)
>>> csa = header.get(("0029", "1020"))
>>> csa.keys()
dict_keys(['UsedPatientWeight', 'NumberOfPrescans', 'TransmitterCalibration', 'PhaseGradientAmplitude', 'ReadoutGradientAmplitude', 'SelectionGradientAmplitude', 'GradientDelayTime', 'RfWatchdogMask', 'RfPowerErrorIndicator', 'SarWholeBody', 'Sed', 'SequenceFileOwner', 'Stim_mon_mode', 'Operation_mode_flag', 'dBdt_max', 't_puls_max', 'dBdt_thresh', 'dBdt_limit', 'SW_korr_faktor', 'Stim_max_online', 'Stim_max_ges_norm_online', 'Stim_lim', 'Stim_faktor', 'CoilForGradient', 'CoilForGradient2', 'CoilTuningReflection', 'CoilId', 'MiscSequenceParam', 'MrProtocolVersion', 'DataFileName', 'RepresentativeImage', 'PositivePCSDirections', 'RelTablePosition', 'ReadoutOS', 'LongModelName', 'SliceArrayConcatenations', 'SliceResolution', 'AbsTablePosition', 'AutoAlignMatrix', 'MeasurementIndex', 'CoilString', 'PATModeText', 'PatReinPattern', 'ProtocolChangeHistory', 'Isocentered', 'MrPhoenixProtocol', 'GradientMode', 'FlowCompensation', 'PostProcProtocol', 'RFSWDOperationMode', 'RFSWDMostCriticalAspect', 'SARMostCriticalAspect', 'TablePositionOrigin', 'MrProtocol', 'MrEvaProtocol', 'VFModelInfo', 'VFSettings', 'AutoAlignData', 'FmriModelParameters', 'FmriModelInfo', 'FmriExternalParameters', 'FmriExternalInfo', 'FmriAcquisitionDescriptionSequence', 'B1rms', 'B1rmsSupervision', 'TalesReferencePower', 'PhaseSliceOversampling', 'SafetyStandard', 'DICOMImageFlavor', 'DICOMAcquisitionContrast', 'EchoTrainLength', 'RFEchoTrainLength', 'GradientEchoTrainLength', 'Laterality4MF', 'ArterialSpinLabelingContrast', 'ConfigFileInfo', 'UserDefinedSeries', 'AASpineModelVerificationStatus', 'AASpineModelData'])
>>> csa.get("MrProtocolVersion")
{'index': 28, 'VR': 'IS', 'VM': 1, 'value': 51130001}
>>> nested = csa.get("MrPhoenixProtocol")["value"]
>>> nested.keys()
dict_keys(['Version', 'SequenceFileName', 'ProtocolName', 'EVAProt', 'ReferenceImage0', 'ReferenceImage1', 'ReferenceImage2', 'ScanRegionPosTra', 'ScanRegionPosValid', 'PtabAbsStartPosZ', 'PtabAbsStartPosZValid', 'TablePositioningMode', 'EnableNoiseAdjust', 'Contrasts', 'CombinedEchoes', 'DisableChangeStoreImages', 'AAMode', 'AARegionMode', 'AARefMode', 'ReconstructionMode', 'OneSeriesForAllMeas', 'PHAPSMode', 'WrapUpMagn', 'Averages', 'AveragesDouble', 'Repetitions', 'ScanTimeSec', 'TotalScanTimeSec', 'RefSNR', 'RefSNR_VOI', 'MotionCorr', 'ParadigmPeriodicity', 'CineMode', 'SequenceType', 'CoilCombineMode', 'FlipAngleMode', 'TOM', 'ProtID', 'SequenceID', 'ReadOutMode', 'Bold3dPace', 'ForcePositioningOnNDIS', 'InteractiveRealtime', 'InternalTablePosValid', 'TmapB0Correction', 'TmapEval', 'TmapImageType', 'OrganUnderExamination', 'TissueT1', 'TissueT2', 'InvContrasts', 'ReaquisitionMode', 'ProtConsistencyInfo', 'GRADSPEC', 'TXSPEC', 'RXSPEC', 'AdjData', 'TR', 'TR[0]', 'TI', 'TI[0]', 'TD', 'TE', 'TE[0]', 'FlowComp', 'FlowComp[0]', 'SliceArray', 'GroupArray', 'RSatArray', 'NavigatorArray', 'AutoAlign', 'NavigatorPara', 'BladePara', 'PrepPulses', 'KSpace', 'FastImaging', 'PhysioImaging', 'SpecPara', 'Diffusion', 'Angio', 'PreScanNormalizeFilter', 'DistortionCorrFilter', 'NoiseDecorrData', 'Pat', 'Mds', 'AAInitialOffset', 'RepetitionsDelayTimeMs', 'FlipAngleDegree', 'FlipAngleDegree[0]', 'ServicePara', 'PerProxy2Skip', 'CoilSelectMeas', 'CoilSelectUI', 'EFISPEC', 'WipMemBlock', 'BOLDParadigmArray', 'BOLDParadigmArray[0]', 'ParametricMapping', 'IR', 'Asl', 'InversionArray', 'Workflow', 'DynDistCorrFilter', 'ChannelMatrix', 'PTXData', 'InlineCardioEval', 'Interactive', 'DixonData', 'DynmicAdjustVolumes', 'SliceAcceleration'])
>>> nested["SliceArray"]["Mode"]
4

Current implementation is far from perfect, but it is entirely functional. @Chris_Rorden I hope this is what you meant, in any case I would be very grateful for any comments.
HTH

raamana · January 18, 2022, 2:02pm

thanks everyone! I will take a look. I did come across dicom_parser and dcmstack as I dig through various scripts and libraries and am learning about and learning to use them.

DICOMs are quite a FUN format, eh?

also, what are the best practices to specify and handle the file paths for DICOM files? it’s awkward to search for a single .dcm file within a folder containing a large number of files (without any criteria) to refer to a single-subject, single-session image. If you are aware of a best practices guide/tips doc somewhere, let me know.

Chris_Rorden · January 18, 2022, 3:22pm

DICOM filenames are typically based on their UIDs, so they are not very meaningful. I often use the re-naming feature (-r y) of dcm2niix to give my DICOMs meaningful names. A safe option is to generate hierarchical folders based on date/time, series number and protocol name, with the filename based on instance number and the mediaObjectInstanceUID (0002,0003).

Based on the dcm2niix filenaming structure this becomes:

dcm2niix -r y -f %t/%s_%p/%4r_%o.dcm -o /new/DICOMs /path/to/DICOMs

If you are certain that your manufacturer is not Siemens (e.g. GE and Philips) you can typically omit the mediaObjectInstanceUID. However, you need to be aware that the DICOM standard does not require the instance number to be unique, and Siemens field maps will use the same instance number for different echos (so different images will have name conflicts if you omit the UID). While Siemens and GE typically generate meaningful instance numbers (e.g. based on temporal order), Philips instance numbers are often assigned with little regard to the slice or temporal order.
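
To illustrate the naming scheme in Python, here is a minimal sketch that builds the same kind of path as the dcm2niix pattern above. The function organized_path and its arguments are hypothetical, the values would have to be read from each DICOM header, and the character clean-up is my own simplification rather than what dcm2niix actually does.

```python
import re
from pathlib import PurePosixPath


def organized_path(study_datetime, series_number, protocol_name,
                   instance_number, media_uid):
    """Mimic %t/%s_%p/%4r_%o.dcm: datetime / series_protocol / instance_uid.dcm"""
    # Replace characters that are awkward in file names (simplified assumption).
    safe_protocol = re.sub(r"[^A-Za-z0-9._-]+", "_", protocol_name).strip("_")
    folder = f"{series_number}_{safe_protocol}"
    filename = f"{instance_number:04d}_{media_uid}.dcm"
    return PurePosixPath(study_datetime) / folder / filename


# Two field map echoes sharing instance number 1 (made-up UIDs):
echo1 = organized_path("20220118152200", 5, "gre field mapping", 1, "1.2.3.1")
echo2 = organized_path("20220118152200", 5, "gre field mapping", 1, "1.2.3.2")
print(echo1)
print(echo2)
```

Without the UID in the filename both echoes would map to the same path, which is the name conflict described above.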

One reason to organize DICOMs in this manner is that it avoids a strange file saving behavior of some PACS systems. For example, some systems limit the number of files in a folder to 1000. If you have a session where the first series was a T1 scan with 200 slices, and the second series was an fMRI series with 2000 slices, the first folder created will include all the T1 slices and the first 800 slices of the fMRI series, with additional folders created for the remaining slices of the fMRI series.

raamana · January 19, 2022, 1:44pm

thanks Chris - these are helpful notes/tips.

I am trying to work off projects/folders coming directly from scanner via XNAT, and would ideally prefer not to run any renaming pipelines if I don’t have to, unless I can configure XNAT to output them in a more meaningful or easy to reuse manner (with some tips you already note).

also, back to the question i was asking: how would you refer to an MR image in DICOM format with a single file path, with or without renaming them as you suggest?

also this seems strange and bad – why would they mix up two different modalities altogether into the same folder? that’s some bad negligence, no?

raamana · January 19, 2022, 2:41pm

as you probably can see already, i (and many others) are used to the single-3d-image-in-a-single-file-on-disk convenience offered by NIfTI/MGH formats etc, and am trying to apply that to DICOM. The DICOM won’t be a single file but I want to be able to unambiguously refer to a 3D MRI scan from a single modality with a file path to a single disk entity. I would prefer to use a folder path (containing all the .dcm files, assuming they don’t mix up multiple sessions or modalities) rather than a file path to one of the .dcms

i also need to understand how all the dcms are linked to one another in the DICOM standard

raamana · January 19, 2022, 2:44pm

I actually started a new thread to keep it clean: Best practices in handling DICOM files

Chris_Rorden · January 20, 2022, 1:04pm

@raamana be aware that the multi-frame (enhanced) DICOM format will save all 2D slices from a series as a single file on disk. Usage will depend on vendor:

Philips was the pioneer: they provided the option for enhanced DICOM many years ago. Unfortunately, their implementation contains so much redundant information that the DICOM header often dwarfs the size of the image data. This choice bogs down PACS systems, wastes disk space, yields slow conversions, and can overwhelm DICOM viewers like Horos or Osirix if the user attempts to view the meta data. Due to these reasons, many Philips users prefer to export to classic DICOM.
Canon has implemented a lean enhanced DICOM. I know that V6.0SP2001* did not correctly specify TemporalPositionIndex (0020,9128) for fMRI sequences (though this was reported and may be patched).
Siemens introduced a lean enhanced DICOM with XA10. The original implementation was a bit too lean, lacking required meta data. However, the more recent XA20 and XA30 seem very solid. Siemens is actively upgrading VE11 users (e.g. the Prismas popular with researchers) to XA30.

Therefore, I think this situation will evolve rapidly in the near future (at least for new acquisitions). Returning to the original topic of this post, the Siemens XA enhanced images do not use the private CSA header format, using traditional DICOM compatible tags to store sequence details.

As an aside, deciding that all images in a folder are from a single series would not conform to a single BIDS format file. For example, a single fMRI series can acquire multiple echoes. The BIDS standard requires that each echo be saved as a separate 4D file, rather than as a single 5D file.
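
A small Python sketch of that last point, assuming classic (one slice per file) DICOM and that each file's SeriesInstanceUID (0020,000E) and EchoNumbers (0018,0086) have already been read, for example with pydicom's dcmread(path, stop_before_pixels=True). The records below are made-up dictionaries and group_by_series_and_echo is a hypothetical helper. Grouping on the header values rather than on the folder gives one group per series and echo.

```python
from collections import defaultdict


def group_by_series_and_echo(records):
    """Map (SeriesInstanceUID, echo number) -> sorted list of file paths."""
    groups = defaultdict(list)
    for rec in records:
        # Treat a missing echo number as a single-echo acquisition (assumption).
        key = (rec["SeriesInstanceUID"], rec.get("EchoNumbers", 1))
        groups[key].append(rec["path"])
    return {key: sorted(paths) for key, paths in groups.items()}


# One folder holding a T1 series and a two-echo fMRI series (made-up values).
records = [
    {"path": "a.dcm", "SeriesInstanceUID": "1.2.3.10"},
    {"path": "b.dcm", "SeriesInstanceUID": "1.2.3.20", "EchoNumbers": 1},
    {"path": "c.dcm", "SeriesInstanceUID": "1.2.3.20", "EchoNumbers": 2},
    {"path": "d.dcm", "SeriesInstanceUID": "1.2.3.20", "EchoNumbers": 1},
]
for key, paths in group_by_series_and_echo(records).items():
    print(key, paths)
```

Here the single folder yields three groups: one for the T1 and one per echo of the fMRI series.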