Parsing EXAR files

Does anyone know how to read EXAR protocol files (the export format for protocols set up on a Siemens scanner) from Python, so that the different steps and sequences inside them can be parsed programmatically? Thanks.

cc @Chris_Rorden @tsalo @yarikoptic

I do not think these files are documented or designed for public consumption; perhaps someone else knows the details. They are simply SQLite databases, so reading the attributes is not hard, but interpreting them looks challenging. Here is a skeleton for a Python reader:

import sqlite3
con = sqlite3.connect('TERRA_20ch.exar1')
cur = con.cursor()
cur.execute("SELECT name FROM sqlite_master WHERE type='table';")
print(cur.fetchall())
cur.execute("SELECT * FROM Instance")
rows = cur.fetchall()
for row in rows:
    print(row)
con.close()
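
Before dumping rows, it may help to see which columns each table has. The following is a minimal sketch using only the standard `sqlite3` module; `describe_tables` is a hypothetical helper name, and nothing here assumes which tables a given EXAR file contains.

```python
import sqlite3


def describe_tables(path):
    """Return {table_name: [column names]} for every table in an SQLite file.

    Hypothetical helper for exploring an unknown database layout.
    """
    con = sqlite3.connect(path)
    try:
        cur = con.cursor()
        cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
        tables = [r[0] for r in cur.fetchall()]
        schema = {}
        for t in tables:
            # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
            cur.execute(f'PRAGMA table_info("{t}")')
            schema[t] = [row[1] for row in cur.fetchall()]
        return schema
    finally:
        con.close()
```

Calling `describe_tables('TERRA_20ch.exar1')` would then give a quick overview of the layout before any `SELECT *` is attempted.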

Thanks Chris, I will give this a try! I was hoping someone here had already tried to do that :slight_smile:

@Chris_Rorden Can you try this on your side?

#!/usr/bin/env python3

#
# extract-exar1.py
#
# A small script to extract structures from SIEMENS exar1 file
# It will generate one JSON for each entry in the Content table.
# And sometimes an XML if json['Data'] attribute is found.
#
# Usage: python3 extract-exar1.py "filename.exar1"
#

# https://neurostars.org/t/parsing-exar-files/20237
import sqlite3
import sys
import zlib
import json

headers = [
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfAddInConfigContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfDecisionStepContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfDirectoryContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfInteractionStepContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfJoinStepContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfMeasurementStepContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfPauseStepContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfProgramContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfProtocolContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfSplitStepContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfStringContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfStructureContent;',
    'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfWorkflowStepContent;'
]
filename = sys.argv[1]
table = 'Content'

con = sqlite3.connect(filename)
cur = con.cursor()
cur.execute(f"SELECT * FROM {table}")
rows = cur.fetchall()
for row in rows:
    name = row[0]  # unique hash ?
    deflate_data = row[1]
    assert deflate_data
    assert row[2] == 'DS'  # wotsit ?
    unzipped = zlib.decompress(deflate_data, -zlib.MAX_WBITS)
    str_data = unzipped.decode("utf-8")
    # skip first line:
    # 'EDF V1: ContentType=syngo.MR.ExamDataFoundation.Data.EdfAddInConfigContent;'
    lines = str_data.splitlines(keepends=True)
    header = lines[0].strip()
    assert header in headers
    json_str = ''.join(lines[1:])
    # make sure this is JSON before writing it:
    json_data = json.loads(json_str)
    with open(name + '.json', 'w') as f1:
        f1.write(json_str)
    # the XML write does not need the JSON file to stay open
    if 'Data' in json_data:
        xml_data = json_data['Data']
        # Always write with XML file extension, even for XProtocol
        with open(name + '.xml', 'w') as f2:
            f2.write(xml_data)
con.close()

@malaterre thanks: your script does extract a lot of sequence information into JSON and XML files.
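
For anyone who wants to reuse the decoding step from the script above without writing files, here is a minimal sketch that isolates it into a function. It assumes, as the script does, that each blob is a raw deflate stream whose first line is the `EDF V1: ContentType=...;` header and whose remainder is JSON. The names `decode_content_blob` and `encode_content_blob` are hypothetical; the encoder exists only to build test input and is not a claim about how Siemens writes these files.

```python
import json
import zlib


def decode_content_blob(blob):
    """Split one raw-deflate blob into (header_line, parsed_json)."""
    # Negative wbits means a raw deflate stream with no zlib header or checksum
    text = zlib.decompress(blob, -zlib.MAX_WBITS).decode("utf-8")
    header, _, body = text.partition("\n")
    return header.strip(), json.loads(body)


def encode_content_blob(header, payload):
    """Hypothetical inverse of decode_content_blob, for round-trip testing only."""
    text = header + "\n" + json.dumps(payload)
    comp = zlib.compressobj(-1, zlib.DEFLATED, -zlib.MAX_WBITS)
    return comp.compress(text.encode("utf-8")) + comp.flush()
```

With this, each row of the `Content` table could be passed through `decode_content_blob(row[1])`, and the returned dictionary inspected directly (for example for a `'Data'` key) instead of reading the JSON back from disk.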