I was recently asked to help some researchers process the UK Biobank imaging data with fMRIPrep and other BIDS-Apps, and I am wondering whether anyone has attempted, or succeeded in, converting this data to BIDS. I assume it cannot be made a perfect match to the BIDS format, but any help toward this conversion would be greatly appreciated.
@Fidel might be able to help you
@Fidel, I just found your GitLab Python script for this and will definitely try it. Thanks @ChrisGorgolewski.
That is the latest publicly available version of the script. Some time ago I started rewriting the whole pipeline in Python (with the option of working entirely in BIDS), but that project got sidetracked.
I hope it helps!
@Fidel, it worked great! I was able to convert the data and run it with a BIDS-App pipeline perfectly. Thank you very much for the software!
Really happy to hear that
@Fidel, a quick question. When converting to BIDS, I noticed that some of the UKB subjects have scanner metadata and some do not, which makes the latter challenging to use. Do you have any insight into this? Is there some default scanner metadata that I am not aware of? I opened the NIfTI files and could not find anything like SliceTiming in the headers.
I am not really sure what you are talking about. Can you give me more details about the metadata you are talking about?
I will try to be clearer. When I download the fMRI imaging data from UKB, the directory for each subject contains a file called rfMRI.nii.gz, which of course is the image data. For some subjects I also get an rfMRI.json, which holds the metadata about the scan, such as which scanner it was run on, the SliceTiming, etc. Other subjects do not have this json file, so I am wondering what to do about them. How do I find that information for the subjects without a json file?
Does that make sense?
To better understand what may have happened, can you tell me how many subjects you downloaded, how many have a json and how many do not?
I have downloaded 14932 subjects and 3983 of them have the json.
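For anyone wanting to produce the same kind of count on their own download, here is a minimal sketch. It assumes one directory per subject directly under a root folder, with rfMRI.nii.gz and rfMRI.json sitting side by side as described above; the function name and the layout are my assumptions, not something the UKB tooling provides.

```python
from pathlib import Path


def split_by_sidecar(root, image_name="rfMRI.nii.gz", sidecar_name="rfMRI.json"):
    """Return (with_json, without_json) lists of subject directory names.

    Assumes (hypothetically) one directory per subject directly under
    `root`, with the image and its JSON sidecar stored side by side.
    """
    with_json, without_json = [], []
    for subject_dir in sorted(Path(root).iterdir()):
        # Only consider directories that actually contain the image.
        if not subject_dir.is_dir() or not (subject_dir / image_name).is_file():
            continue
        if (subject_dir / sidecar_name).is_file():
            with_json.append(subject_dir.name)
        else:
            without_json.append(subject_dir.name)
    return with_json, without_json
```

Calling `split_by_sidecar("/path/to/download")` and taking `len()` of each list gives the two counts; the second list is also the set of subjects that would need metadata from somewhere else.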
I am sorry, but I am afraid we only started including the json files after the first 10,000 subjects (that is, from the 3rd public data release onward). The latest ~12,000 publicly available subjects should have the json files, so you may want to download the last 7,000 subjects you are missing (4th public release) and use that data.
Eventually json files for all subjects will get added in, one way or another, but not in the immediate future. Also, if I am not mistaken, the slice timing should be the same for all subjects.
OK, thanks. The existing jsons show variation in this value, so it may need to be checked to confirm that this is really the case. I am downloading what you suggested and will take a look at those. If there is some way to verify the actual slice timing for every subject, that would be great. I realize we may not be able to share that here in this public forum, so let me know if there is a way we can work toward this.
Thank you for your very kind help and support.
Thanks a lot for the help - just wanted to make sure that what is below (from Lex) is correct. Please confirm that we should use the same slice timing (8x acceleration) for all participants.
Some of the slice timing info shows sequential acquisition, others have the (expected) 8 blocks of acquisition times (123-123-123 etc.).
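One way to tell the two patterns apart automatically is to count how many slices share each acquisition time in the SliceTiming list. The sketch below is only an illustration of that idea: the function name and the rounding tolerance are my own choices, and it assumes SliceTiming is a flat list of times in seconds, one per slice.

```python
from collections import Counter


def multiband_factor(slice_timing, decimals=3):
    """Estimate how many slices share each acquisition time.

    Returns 1 for a sequential pattern (every slice has its own time),
    N when every distinct time is shared by exactly N slices, and None
    when the list is empty or the times are shared unevenly.
    """
    # Round so tiny floating-point differences do not create
    # spurious extra "distinct" acquisition times.
    counts = Counter(round(t, decimals) for t in slice_timing)
    slices_per_time = set(counts.values())
    if len(slices_per_time) != 1:
        return None
    return slices_per_time.pop()
```

A sidecar reporting sequential acquisition would come back as 1, while one with eight slices acquired at each time would come back as 8, so subjects can be sorted into the two groups before deciding what to overwrite.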
The error is either with (1) the acquisition, or (2) the metadata. I suspect that it’s an issue with the metadata; somebody would have noticed that something was wrong if the acquisition took 8x as long as prescribed.
Since the protocol was the same for all subjects, I suggest taking one of the correct slice timings (8x acceleration) and overwriting the ones reporting sequential acquisition.
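As a rough sketch of that overwrite, assuming each sidecar is a plain JSON file with a SliceTiming list and that you have already picked one trusted timing list as the reference (the function name and the length check are my additions, not part of any UKB or BIDS tool):

```python
import json
from pathlib import Path


def overwrite_slice_timing(sidecar_path, reference_timing):
    """Replace SliceTiming in one JSON sidecar; return True if it changed."""
    path = Path(sidecar_path)
    reference = list(reference_timing)
    metadata = json.loads(path.read_text())
    existing = metadata.get("SliceTiming")
    # A different number of slices suggests a different acquisition,
    # so refuse to overwrite rather than guess.
    if existing is not None and len(existing) != len(reference):
        raise ValueError(f"{path}: slice count differs from the reference")
    if existing == reference:
        return False
    metadata["SliceTiming"] = reference
    path.write_text(json.dumps(metadata, indent=2) + "\n")
    return True
```

Every other field in the sidecar is left as it was. It is probably worth keeping a backup of the original json files before running something like this.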
I’m just now coming across discrepancies in the slice timing in the json metadata for both rest and task. As @jbpoline noted, some seem to be sequential while others are blocked.
I have chosen to create a template json with the following fields and copy it into the subject directories.
I am using the same json for both task and rest because it appears that the only protocol difference is in the number of timepoints.
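The copying step can be sketched roughly like this. It assumes one directory per subject under a root folder; the function name, the `overwrite` flag and the sidecar file names passed in are assumptions for illustration, and the template contents are whatever fields you have decided on.

```python
import shutil
from pathlib import Path


def copy_template(template_path, root, sidecar_names, overwrite=False):
    """Copy one template JSON to each named sidecar in every subject directory.

    Returns the list of files written.
    """
    written = []
    for subject_dir in sorted(Path(root).iterdir()):
        if not subject_dir.is_dir():
            continue
        for name in sidecar_names:
            target = subject_dir / name
            if target.exists() and not overwrite:
                continue  # leave existing metadata untouched
            shutil.copyfile(template_path, target)
            written.append(target)
    return written
```

For example, `copy_template("template.json", "/path/to/subjects", ["rfMRI.json", "tfMRI.json"], overwrite=True)` would replace both sidecars for every subject, where "tfMRI.json" is my guess at the task file name. With the default `overwrite=False`, only subjects missing a sidecar receive the template.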
I’m posting here just as a sanity check and to see if anyone sees anything problematic with this approach.