BIDS Validator error: TR mismatch between NIFTI header and JSON file, and BIDS Validator is somehow finding TR=0. Best solution?

zreagh · June 7, 2018, 5:03pm

Hi all,

I have a rather specific issue that I’m hoping to get some input on. I got some helpful feedback on Twitter, but Chris (effigies) suggested I post here for a more formal back-and-forth.

I am planning to use fmriprep to analyze a very large existing dataset (Cam-CAN, for those who are familiar). After requesting access, you are granted a server login and can download the data from ~700 participants. Importantly, you get NIFTIs, but not the raw DICOMs.

I had to rework their file directory structure a bit to make it fully BIDS-compatible, but this is what I did (notably, each subject has JSON files in the same directory as the corresponding NIFTI files, but I didn’t see a need to make master JSON files in the parent directory):

CamCan/
    dataset_description.json
    participants.tsv
    each participant folder/
        anat/
            t1w NIFTIs & JSONs
            t2w NIFTIs & JSONs
        fmap/
            phasediff NIFTIs & JSONs
            magnitude 1 NIFTIs (no JSONs)
            magnitude 2 NIFTIs (no JSONs)
        func/
            rest NIFTIs & JSONs
            movie NIFTIs & JSONs (BOLD during film viewing)

When running the BIDS Validator, I get the following errors:

Error: 3
Repetition time did not match between the scan’s header and the associated JSON metadata file. (1246 files)
sub-CC110045_task-Movie_bold.nii.gz (27256.459 KB | application/x-gzip)
Location:
CamCan/sub-CC110045/func/sub-CC110045_task-Movie_bold.nii.gz

Reason:
Repetition time defined in the JSON (2.47 sec.) did not match the one defined in the NIFTI header (0 sec.)

sub-CC110045_task-Rest_bold.nii.gz (36890.833 KB | application/x-gzip)
Location:
CamCan/sub-CC110045/func/sub-CC110045_task-Rest_bold.nii.gz

Reason:
Repetition time defined in the JSON (1.97 sec.) did not match the one defined in the NIFTI header (0 sec.)

The TRs read from the JSON files (2.47 for movie, 1.97 for rest) are indeed the desired values, but the validator is seemingly reading TR=0 from the NIFTI header. To further complicate matters, this isn’t what the NIFTI header appears to contain: if one runs fslhd on a given file, the TR appears to be 1 sec (see pixdim4):

fslhd sub-CC723395_task-Movie_bold.nii.gz
filename sub-CC723395_task-Movie_bold.nii.gz

sizeof_hdr 348
data_type UINT16
dim0 4
dim1 64
dim2 64
dim3 32
dim4 193
dim5 1
dim6 1
dim7 1
vox_units mm
time_units s
datatype 512
nbyper 2
bitpix 16
pixdim0 0.000000
pixdim1 3.000000
pixdim2 2.999999
pixdim3 4.440000
pixdim4 1.000000
pixdim5 0.000000
pixdim6 0.000000
pixdim7 0.000000
vox_offset 352
cal_max 0.0000
cal_min 0.0000
scl_slope 1.000000
scl_inter 0.000000
phase_dim 0
freq_dim 0
slice_dim 0
slice_name Unknown
slice_code 0
slice_start 0
slice_end 0
slice_duration 0.000000
time_offset 0.000000
intent Unknown
intent_code 0
intent_name
intent_p1 0.000000
intent_p2 0.000000
intent_p3 0.000000
qform_name Aligned Anat
qform_code 2
qto_xyz:1 -2.993574 0.000000 -0.290462 95.590027
qto_xyz:2 0.103233 2.551443 -2.330460 -50.001572
qto_xyz:3 -0.166914 1.578015 3.768048 -120.494217
qto_xyz:4 0.000000 0.000000 0.000000 1.000000
qform_xorient Right-to-Left
qform_yorient Posterior-to-Anterior
qform_zorient Inferior-to-Superior
sform_name Aligned Anat
sform_code 2
sto_xyz:1 -2.993574 0.000000 -0.290466 95.590027
sto_xyz:2 0.103233 2.551443 -2.330460 -50.001572
sto_xyz:3 -0.166917 1.578015 3.768048 -120.494217
sto_xyz:4 0.000000 0.000000 0.000000 1.000000
sform_xorient Right-to-Left
sform_yorient Posterior-to-Anterior
sform_zorient Inferior-to-Superior
file_type NIFTI-1+
file_code 1
descrip 4D image
aux_file

To confirm this using another tool, running AFNI’s 3dinfo on that same file indicates “Time step = 1.00000s”.

So I have two issues. (1) My NIFTI headers do not contain the proper TR values. Unfortunately, the root of the problem is beyond my control since the data aren’t mine and I’m having to start from these NIFTIs, so I likely need to manually edit the headers. Any input on the optimal way to go about this would be greatly appreciated. (2) The BIDS Validator is nonetheless failing to capture the erroneous TR=1 in the NIFTI header, instead finding TR=0. Thus, I’m concerned that even if I fix the TR values in the NIFTI headers, the BIDS validator (and therefore fmriprep) will probably fail anyway. Should I be concerned with these NIFTIs beyond the TR situation? (If so, that might be a problem of broad interest since a number of publications have been released based on this very dataset…)

Thanks very much in advance.

effigies · June 7, 2018, 6:01pm

The first thing I’ll note is that I don’t think you’ll have trouble running fMRIPrep on this, as is. We strictly use the TR defined in the JSON sidecar, and ignore that in the header. The validator is basically warning you that two BIDS apps that use different strategies could produce different results. So it’s still worth fixing.

I would approach this by using pybids to find all files, check if their TRs mismatch, and update individually:

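from bids import grabbids
import nibabel as nb

def set_tr(img, tr):
    # Copy the header and overwrite only the 4th zoom (the TR); spatial zooms are kept.
    header = img.header.copy()
    zooms = header.get_zooms()[:3] + (tr,)
    header.set_zooms(zooms)
    # Pull the data into memory so the source file can be overwritten safely.
    return img.__class__(img.get_data().copy(), img.affine, header)

def sync_tr(bids_root):
    layout = grabbids.BIDSLayout(bids_root)
    for nii in layout.get(extensions=['.nii', '.nii.gz']):
        metadata = layout.get_metadata(nii.filename)
        # Only files whose metadata defines RepetitionTime are considered.
        if 'RepetitionTime' in metadata:
            img = nb.load(nii.filename)
            # Skip 3D images (e.g. anatomicals): they have no time axis to set.
            if len(img.shape) < 4:
                continue
            if img.header.get_zooms()[3:] != (metadata['RepetitionTime'],):
                fixed_img = set_tr(img, metadata['RepetitionTime'])
                fixed_img.to_filename(nii.filename)  # overwrites the file in place

sync_tr('CamCan')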

(I would definitely make a copy of the dataset before trying this… I haven’t tested this code.)
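
One detail worth keeping in mind when checking for mismatches: pixdim values are stored as 32-bit floats, so a TR such as 2.47 read back from a header and widened to a Python float is not bit-identical to the value in the JSON. Whether the exact comparison in sync_tr then flags the file again depends on how the installed libraries compare the two types. A tolerance-based check sidesteps the question. The following is only a sketch using the standard library; `as_float32`, `tr_matches`, and the 1 ms tolerance are hypothetical choices for illustration, not something taken from pybids, nibabel, or the BIDS Validator.

```python
import math
import struct


def as_float32(value):
    """Round a Python float to the nearest float32, as a NIfTI-1 header stores it."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def tr_matches(header_tr, sidecar_tr, tol=1e-3):
    """Return True if the two TRs agree to within `tol` seconds.

    `tol` is an illustrative assumption, not a value used by any BIDS tool.
    """
    return math.isclose(header_tr, sidecar_tr, rel_tol=0.0, abs_tol=tol)


stored = as_float32(2.47)          # what pixdim[4] holds after 2.47 is written
print(stored == 2.47)              # exact comparison of the widened value
print(tr_matches(stored, 2.47))    # tolerant comparison
print(tr_matches(0.0, 2.47))       # the case in this thread: header TR of 0
```

In sync_tr, a check like this could stand in for the `!=` test so that files already fixed are left alone on a second run.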

ChrisGorgolewski · June 7, 2018, 6:09pm

In addition to applying the fix @effigies proposed I would also encourage you to get in touch with Cam-CAN and let them know that: a) the data they are distributing have incorrect headers b) you are creating a BIDS version of the dataset that they might be interested in distributing for the benefit of other users.

zreagh · June 7, 2018, 6:10pm

Thank you! I will apply this fix and report back. I will also get in touch with the Cam-CAN folks and inform them of the situation.

effigies · June 7, 2018, 6:18pm

Speaking of incorrect headers: pixdim[0] should be either 1 or -1. A value of 0 is such a common error that the standard actually explains how to interpret it (0 is interpreted as 1).

Also, I’m a bit curious about the qform/sform. On the plus side, they match, but code 2 indicates that they are already in alignment with another image (presumably anatomical). How much preprocessing has already been done? fMRIPrep is really designed to work on data that has not been preprocessed. For example, slice timing correction and susceptibility distortion correction aren’t going to work properly if the slice direction and phase encoding direction aren’t the same as the data axes.

zreagh · June 7, 2018, 6:23pm

Huh…I didn’t notice that. Viewing the images, it doesn’t appear the EPIs are aligned to the T1w or T2w anatomical volumes. They describe the data as being “unprocessed,” though I may need to ask to be absolutely sure. Thank you for pointing that out.

effigies · June 7, 2018, 6:29pm

Great. They should have qform_code == sform_code == 1, then. I don’t think it will change anything with regard to fMRIPrep, but it’s best to keep these codes meaningful, and there may be some software that does treat them differently.

I would also ask if they’ve de-obliqued the data. As that involves resampling, it can also affect STC/SDC. If possible, it would be best not to have de-obliqued data.
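
For reference, the qform/sform codes discussed here are two 16-bit integers in the NIfTI-1 header. Below is a minimal sketch, assuming you already have the first 348 header bytes in memory and know the byte order; the function name `xform_codes` and the short labels in `XFORM_NAMES` are made up for illustration (the standard calls them NIFTI_XFORM_UNKNOWN, NIFTI_XFORM_SCANNER_ANAT, and so on).

```python
import struct

# Code values defined by the NIfTI-1 standard; the labels are shortened here.
XFORM_NAMES = {
    0: "unknown",
    1: "scanner_anat",
    2: "aligned_anat",
    3: "talairach",
    4: "mni_152",
}

QFORM_CODE_OFFSET = 252  # byte offset of qform_code; sform_code follows at 254


def xform_codes(header, byte_order="<"):
    """Return (qform label, sform label) from raw NIfTI-1 header bytes."""
    qcode, scode = struct.unpack_from(byte_order + "2h", header, QFORM_CODE_OFFSET)
    return XFORM_NAMES.get(qcode, "other"), XFORM_NAMES.get(scode, "other")
```

For the Cam-CAN file above this would report "aligned_anat" for both, whereas unprocessed scanner data would normally be labeled "scanner_anat" (code 1).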

effigies · June 7, 2018, 8:41pm

Regarding the 0/1 issue in the validator, it is possible that FSL is lying to you about pixdim4, and automatically replacing a 0 with a 1. If you have nibabel installed, can you run:

nib-ls -H pixdim sub-CC723395_task-Movie_bold.nii.gz

zreagh · June 7, 2018, 8:49pm

Ah, I see. Just tried that, here’s the output:

sub-CC723395_task-Movie_bold.nii.gz uint16 [ 64,  64,  32, 193] 3.00x3.00x4.44x0.00   [-1.         3.0000002  2.999999   4.44       0.         0.
  0.         0.       ] sform

So if I’m reading this properly, it appears you’re correct - somehow, both FSL and AFNI have been reading TR=1 when in fact it’s 0.
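
If nibabel is not available, the stored values can also be read straight from the file, since NIfTI-1 keeps pixdim as eight 32-bit floats at a fixed position in a 348-byte header. This is a rough standard-library sketch rather than a replacement for a proper reader: `read_raw_pixdim` is a hypothetical helper, it assumes a single-file NIfTI-1 image (.nii or .nii.gz), and it ignores NIfTI-2.

```python
import gzip
import struct

NIFTI1_HEADER_SIZE = 348   # bytes, fixed by the NIfTI-1 format
PIXDIM_OFFSET = 76         # byte offset of pixdim[0..7] (eight float32 values)


def read_raw_pixdim(path):
    """Return the eight pixdim values exactly as stored in a NIfTI-1 header."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(NIFTI1_HEADER_SIZE)
    if len(header) < NIFTI1_HEADER_SIZE:
        raise ValueError("file is too short to hold a NIfTI-1 header")
    # sizeof_hdr (the first 4 bytes) must equal 348; use it to detect byte order.
    for order in ("<", ">"):
        if struct.unpack(order + "i", header[:4])[0] == NIFTI1_HEADER_SIZE:
            return struct.unpack_from(order + "8f", header, PIXDIM_OFFSET)
    raise ValueError("not a NIfTI-1 header (sizeof_hdr != 348)")
```

For a 4D BOLD file, `read_raw_pixdim(path)[4]` is the stored TR, with no defaults substituted by the tool doing the reading.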

zreagh · June 7, 2018, 8:50pm

The pybids solution worked, by the way. Thanks so much for the suggestion.

ChrisGorgolewski · June 7, 2018, 8:50pm

I can confirm this - the TR seems to be displayed incorrectly in fslhd if set to zero:

me@christop ~/Downloads $ nib-ls -H pixdim zerotr.nii.gz
zerotr.nii.gz int16 [ 64,  64,  21, 200] 3.12x3.12x6.00x0.00   [1.    3.125 3.125 6.    0.    1.    1.    1.   ]

me@christop ~/Downloads $ docker run -ti --rm -v 'C:\:/c' bids/base_fsl
root@9495a8535534:/# fslhd /c/Users/me/Downloads/zerotr.nii.gz
filename       /c/Users/me/Downloads/zerotr.nii.gz

sizeof_hdr     348
data_type      INT16
dim0           4
dim1           64
dim2           64
dim3           21
dim4           200
dim5           1
dim6           1
dim7           1
vox_units      mm
time_units     s
datatype       4
nbyper         2
bitpix         16
pixdim0        0.000000
pixdim1        3.125000
pixdim2        3.125000
pixdim3        6.000000
pixdim4        1.000000
pixdim5        1.000000
pixdim6        1.000000
pixdim7        1.000000
vox_offset     352
cal_max        0.0000
cal_min        0.0000
scl_slope      1.000000
scl_inter      0.000000
phase_dim      0
freq_dim       0
slice_dim      0
slice_name     Unknown
slice_code     0
slice_start    0
slice_end      0
slice_duration 0.000000
time_offset    0.000000
intent         Unknown
intent_code    0
intent_name
intent_p1      0.000000
intent_p2      0.000000
intent_p3      0.000000
qform_name     Scanner Anat
qform_code     1
qto_xyz:1      3.125000  0.000000  0.000000  -98.039986
qto_xyz:2      0.000000  3.125000  0.000000  -105.085075
qto_xyz:3      0.000000  0.000000  6.000000  -57.780876
qto_xyz:4      0.000000  0.000000  0.000000  1.000000
qform_xorient  Left-to-Right
qform_yorient  Posterior-to-Anterior
qform_zorient  Inferior-to-Superior
sform_name     Scanner Anat
sform_code     1
sto_xyz:1      3.125000  0.000000  0.000000  -98.039986
sto_xyz:2      0.000000  3.125000  0.000000  -105.085075
sto_xyz:3      0.000000  0.000000  6.000000  -57.780876
sto_xyz:4      0.000000  0.000000  0.000000  1.000000
sform_xorient  Left-to-Right
sform_yorient  Posterior-to-Anterior
sform_zorient  Inferior-to-Superior
file_type      NIFTI-1+
file_code      1
descrip        FreeSurfer May 22 2011
aux_file

effigies · June 7, 2018, 8:57pm

Well, glad we settled that. I’m guessing that what fslhd is actually doing is populating their own internal data structure, making some necessary defaults and manipulations, and then displaying that, rather than the header per se.

I’ll note that fslhd is also lying about pixdim0, which is -1, but displayed as 0. So that actually isn’t an error!
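
To make the pixdim[0] rule concrete: the NIfTI-1 standard uses pixdim[0] as the "qfac" sign for the qform, and a stored 0 is read as +1. A tiny sketch of that interpretation follows; the function names are hypothetical and only illustrate the rule.

```python
def effective_qfac(pixdim0):
    """Interpret pixdim[0] as the NIfTI-1 qfac: -1 if negative, otherwise +1.

    A stored 0 is treated as +1, which is the "0 is interpreted as 1" rule.
    """
    return -1.0 if pixdim0 < 0 else 1.0


def is_strictly_valid_qfac(pixdim0):
    """True only for the two values the standard expects to find on disk."""
    return pixdim0 in (1.0, -1.0)
```

With the values from this thread, the stored -1 is strictly valid, while the 0 that fslhd displays would only be tolerated and read as +1.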

zreagh · June 8, 2018, 12:09am

Took a day of battling, but this was a good feeling =)

Thanks again.

ChrisGorgolewski · June 8, 2018, 12:37am

Congrats! Nice work!

dprice80 · June 18, 2018, 4:59pm

Hi Guys.

Firstly, thanks to everyone who has helped so far.

With regards to the folder structure: we reluctantly separated the modalities because of the enormous quantity of data we are trying to provide to hundreds of researchers. For example, if a researcher requests T1 data, it does not make sense to provide them with all other modalities, which can run into multiple TB for the whole dataset. It is also against our data sharing policy to provide access to data that has not been requested. Do any of you know a way to achieve this without breaking BIDS format?

We plan to remove any reference to BIDS until the issue of how to share separate modalities is resolved, although we will continue to provide data using the same folder structure as before. The benefit is that it is handy to have data gzipped, and the .json file provides another source of information (this has already been useful in identifying the TR issue).

For those who already have the data, here is a brief explanation of the folder structure.
We took the following BIDS folders generated by our aa (automatic analysis) BIDS function:

BIDS/
    sub-CC321154/
          anat/
          dwi/
          ... and so on

and separated the modalities while maintaining the folder structure.

BIDSsep/
     anat/
           sub-CC321154/
                anat/

     dwi/
           sub-CC321154/
                dwi/

Therefore, recombining the data should be a simple case of copying those folders back into the original folder structure. If the functions we used to generate the original BIDS folders were working properly, then the result should be BIDS compliant.

If anyone can suggest a better way to do this then we would be willing to listen, but remember that we need to grant specific access to each modality, so sharing datasets at the subject level is not an option.

I presume that if BIDS is intended to be a format for sharing data, then this is an important issue to be resolved. It seems like it would be useful to be able to download different modalities / scans from the same subject and combine them at the target location. This would not only allow granular access to large datasets but also aid in sharing data from multi-site studies.

To briefly answer the question of the qform/sform problem: this appears to be a bug in SPM. We are working on a solution with the developers. However, rest assured that as long as the value is >0 then the coordinate system will be numerically correct, but incorrectly labeled as being coregistered to another scan.

Darren

ChrisGorgolewski · June 19, 2018, 1:10am

You should check out http://schizconnect.org - they designed a very cool solution where a new BIDS compatible zip archive is created dynamically based on the particular query/request (selecting modalities and participants) the researcher made.

Changing the folder hierarchy the way you proposed will indeed break compatibility. I cannot think right now of a convenient solution (other than what SchizConnect is doing), but I would love to hear your suggestions. Please post them to https://groups.google.com/forum/#!forum/bids-discussion where all BIDS related discussions are taking place.

ChrisGorgolewski · June 19, 2018, 10:58am

Hi,
I read your post more carefully and I have to admit that I made a mistake and in fact your solution of splitting modalities is very reasonable and perfectly compatible (as long as BIDSsep/anat and BIDSsep/dwi pass the BIDS Validator). As you stated users can simply combine the datasets by copying data after download. Sorry for the confusion!
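
For users who download several modalities, the "copy the folders back together" step could be scripted roughly as below. This is a sketch under the assumption that the download looks like the BIDSsep/ layout shown earlier (one folder per modality, each mirroring the BIDS tree); `recombine_modalities` is a hypothetical helper, and it needs Python 3.8+ for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def recombine_modalities(separated_root, combined_root):
    """Merge BIDSsep/<modality>/sub-*/<modality>/ trees into one BIDS tree."""
    separated_root = Path(separated_root)
    combined_root = Path(combined_root)
    combined_root.mkdir(parents=True, exist_ok=True)
    for modality_dir in sorted(p for p in separated_root.iterdir() if p.is_dir()):
        # Each modality folder mirrors the BIDS layout, so its contents can be
        # copied straight into the combined root; existing folders are merged.
        shutil.copytree(modality_dir, combined_root, dirs_exist_ok=True)
    return combined_root
```

Files that share a name across modality folders (for example a top-level dataset_description.json, if each folder carries one) are overwritten by whichever folder is copied last, so it is worth running the BIDS Validator on the combined result.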

Ursula_Tooley · January 29, 2019, 5:21pm

I’m trying to use the same code (super handy!) to sync headers from ABCD data, which have erroneous TR information, with the accompanying .json files, which are correct. However, when I try to run it on a test BIDS directory with sync_tr('test_bids'), I get the error below.

/Users/utooley/miniconda2/lib/python2.7/site-packages/grabbit/core.py:410: UserWarning: No valid root directory found for domain 'derivatives'. Falling back on the Layout's root directory. If this isn't the intended behavior, make sure the config file for this domain includes a 'root' key.
  "'root' key." % config['name'])

Running it with sync_tr(os.getcwd()) just hangs.

effigies · January 29, 2019, 7:23pm

Hmm. This may need updating with more recent pybids (>=0.7). Try:

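from bids import BIDSLayout
import nibabel as nb

def set_tr(img, tr):
    # Copy the header and overwrite only the 4th zoom (the TR); spatial zooms are kept.
    header = img.header.copy()
    zooms = header.get_zooms()[:3] + (tr,)
    header.set_zooms(zooms)
    # Pull the data into memory so the source file can be overwritten safely.
    return img.__class__(img.get_data().copy(), img.affine, header)

def sync_tr(bids_root):
    layout = BIDSLayout(bids_root)
    for nii in layout.get(extensions=['.nii', '.nii.gz']):
        metadata = layout.get_metadata(nii.path)
        # Only files whose metadata defines RepetitionTime are considered.
        if 'RepetitionTime' in metadata:
            img = nb.load(nii.path)
            # Skip 3D images (e.g. anatomicals): they have no time axis to set.
            if len(img.shape) < 4:
                continue
            if img.header.get_zooms()[3:] != (metadata['RepetitionTime'],):
                fixed_img = set_tr(img, metadata['RepetitionTime'])
                fixed_img.to_filename(nii.path)  # overwrites the file in place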

Also, if this is a large dataset, I would expect it to take a while. Indexing large datasets is currently pretty slow.

Ursula_Tooley · January 29, 2019, 11:32pm

Ahhh, great! The original code actually did work in the end, you were right, it was just taking longer than I expected (and it did mess up a file when it was interrupted halfway through by network disconnection).
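
Since sync_tr overwrites each image in place, an interruption mid-write can leave a truncated file behind. One common way to reduce that risk is to write to a temporary file next to the original and swap it in only after the write has finished. The sketch below uses only the standard library; `write_atomically` is a hypothetical helper, and the swap is only atomic when both files sit on the same filesystem (behavior on network filesystems may differ).

```python
import os
from pathlib import Path


def write_atomically(path, write_func):
    """Write via a temporary sibling file, then swap it into place.

    `write_func` is any callable that takes a filename and writes the file,
    for example `fixed_img.to_filename` in the sync_tr code.
    """
    path = Path(path)
    # Keep the original name at the end so extension-based writers still
    # see ".nii" or ".nii.gz".
    tmp_path = path.with_name(".tmp-" + path.name)
    try:
        write_func(str(tmp_path))
        os.replace(tmp_path, path)  # the original is untouched until this point
    finally:
        # Remove the partial temporary file if the write or the swap failed.
        if tmp_path.exists():
            tmp_path.unlink()
```

In sync_tr this would replace the direct `to_filename` call with something like `write_atomically(nii.filename, fixed_img.to_filename)`. Working on a copy of the dataset is still the safer option.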

It turns out I’m using pybids < 0.7 (it’s 0.5.1, installed through conda from aramislab), so that wasn’t the issue. I still get the warning below, but the NIfTI headers now read the right TR, so it doesn’t seem to affect anything (and may be an HPC issue).

Thanks so much!

data/picsl/mackey_group/Ursula/miniconda2/envs/python3/lib/python3.6/site-packages/grabbit/core.py:410: UserWarning: No valid root directory found for domain 'derivatives'. Falling back on the Layout's root directory. If this isn't the intended behavior, make sure the config file for this domain includes a 'root' key.
  "'root' key." % config['name'])