Finding CAT12-produced subdirectories in a BIDS-conform dataset

1.) I have a dataset in BIDS format containing structural images. Some images of certain subjects were preprocessed using CAT12 which created three subfolders in their respective /anat directory called /mri, /label and /report. So not all subjects have these three folders in their /anat folder. CAT12 has also added the prefix ‘mwp1’ (with underscore) to the files to tag them as preprocessed. Is there a way to find these files (e.g. using some regex methods) using pybids?

2.) I guess a better structure would have been to put the preprocessed files in a subdirectory project/derivatives/cat12/sub-01/…, but does anyone know if CAT12 offers this solution?

project/
    sub-01/session-20201001/anat/sub-01_ses-20201001_acq-mprage_run-01_T1w.nii.gz
    sub-01/session-20201001/anat/mri/mwp1sub-01_ses-20201001_acq-mprage_run-01_T1w.nii.gz
    sub-01/session-20201001/anat/label/...
    sub-01/session-20201001/anat/report/...

Ahoi hoi @JohannesWiesner,

1.) As you are aware, pybids is built on the BIDS specification and therefore expects BIDS-conform datasets and files. Even though most derivatives are still WIP (except common derivatives), those output files appear to be a bit far away from the intended principles (e.g. provenance, etc.). Adding a prefix instead of a desc-<label> key-value pair might not be the best idea (the same holds true for not adding space-<space>, assuming the data was transformed). Nevertheless, you could give it a try directly via BIDSLayout with the derivatives parameter set either to True (find all derivatives) or to a str (indicating specific derivatives). The other option would be to add them manually via add_derivatives(). However, for this to work at all, please check answer 2.).

2.) Indeed, if you want to stay BIDS conform, data obtained through a processing pipeline/workflow/etc. should go to /derivatives/<pipeline_name>, in your case: /derivatives/cat12/sub-01/anat/..... I never worked with CAT12, but IIRC BIDS support at this end of the spectrum of processing pipelines is rather limited. That being said, I think SPM has some support for reading (and maybe writing?) BIDS datasets that maybe could be adapted for your purpose. @Guillaume do you have any insights here?

HTH, cheers, Peer

Thanks Peer! That made it a lot more clear to me. So to sum it up: As long as CAT12 doesn’t stick to the convention of putting preprocessed files in a separate folder under /derivatives/ and following a BIDS-conform folder structure, pybids will not be able to find those files in the /mri folders. In order to find those files, CAT12 would have to output those files something like this (?):

Original:

project/source/sub-01/session-20201001/anat/sub-01_ses-20201001_acq-mprage_run-01_T1w.nii.gz

CAT12-preprocessed:

project/derivatives/cat12/sub-01/session-20201001/anat/sub-01_ses-20201001_acq-mprage_run-01_T1w_cat12-mwp1.nii.gz

See the CAT12 manual (p. 57) for more information concerning the tags added by CAT12.

I don’t know if Christian Gaser and his colleagues are planning to make CAT12 BIDS-conform, maybe it would make sense to write them? I also saw that there’s an open issue on GitHub that goes in that direction, where CAT12 is used via the nipype interface.

Ahoi hoi @JohannesWiesner,

no biggie, I’m glad my answer was helpful.

Yeah, I think your proposed path/filename goes in the right direction. The preprocessed images should definitely go into /derivatives and not the BIDS root. The key-value pair is a different thing and also depends on the further work of the /derivatives gang, especially on anatomical MRI derivatives. For now, trying to keep everything as BIDS-y as possible might be a good idea, as outlined in my previous answer. That means adding _space-<space> and _desc-<label> instead of _cat12-mwp1. I’ve no idea how the CAT12 folks are thinking wrt these things. AFAIK CAT12 and other SPM toolboxes follow SPM conventions, that is, adding a prefix, etc. Making those things more BIDS compatible would require at least a parallel structure, I think (e.g. a make-output-BIDS-compatible function). While this is possible and shouldn’t be too complicated, none of these tools is developed openly or has a GitHub repository, thus I can’t tell how this would work other than by asking the folks directly. The nipype way is definitely one you could go, and in case you don’t want to add an interface, maybe just a function that makes the output compatible. Also, please be aware that you need a json file for your pipeline in order to guarantee compliance.
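A minimal sketch of such a "make the output compatible" function could look like the following. It is not part of CAT12, SPM or pybids: the function name to_derivative_path, the regular expression, and the choice to keep the original T1w suffix are assumptions made for illustration, and the space label is left to the caller because the thread does not say which template applies. The sketch also assumes BIDS-style session directories (ses-<label>).

```python
import re
from pathlib import PurePosixPath

# Assumption: CAT12/SPM prepend lowercase letters and digits (e.g. "mwp1")
# to an otherwise BIDS-style file name such as "sub-01_..._T1w.nii.gz".
CAT12_NAME = re.compile(
    r"^(?P<prefix>[a-z0-9]+?)"      # SPM/CAT12 prefix, e.g. "mwp1"
    r"(?P<bids>sub-[^/]+)"          # BIDS entities: sub-01_ses-..._run-01
    r"_(?P<suffix>[A-Za-z0-9]+)"    # BIDS suffix, e.g. "T1w"
    r"(?P<ext>\.nii(?:\.gz)?)$"     # NIfTI extension
)


def to_derivative_path(cat12_name, space=None, pipeline="cat12"):
    """Map a prefixed CAT12 file name to a path below derivatives/<pipeline>.

    Hypothetical helper: the prefix becomes a desc-<label> entity and an
    optional space-<label> entity is added in front of it.
    """
    match = CAT12_NAME.match(cat12_name)
    if match is None:
        raise ValueError(f"not a prefixed BIDS-style NIfTI name: {cat12_name}")

    entities = match["bids"].split("_")
    subject = entities[0]
    session = next((e for e in entities if e.startswith("ses-")), None)

    if space is not None:
        entities.append(f"space-{space}")
    entities.append(f"desc-{match['prefix']}")

    filename = "_".join(entities + [match["suffix"]]) + match["ext"]
    parts = ["derivatives", pipeline, subject]
    if session is not None:
        parts.append(session)
    parts += ["anat", filename]
    return str(PurePosixPath(*parts))


print(to_derivative_path("mwp1sub-01_ses-20201001_acq-mprage_run-01_T1w.nii.gz"))
# derivatives/cat12/sub-01/ses-20201001/anat/sub-01_ses-20201001_acq-mprage_run-01_desc-mwp1_T1w.nii.gz
```

Whether a tissue map should keep the T1w suffix or use a different one is a separate question that this sketch does not try to answer.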

HTH, cheers Peer

Hi,

while developing CAT12 I had to make a decision about which naming convention to use. Should I follow the naming convention of SPM12, or should I rather adopt a standard like BIDS (btw an idea which I really appreciate)? The file selection tool and almost all functions in SPM12 are adapted to the SPM naming convention, where characters and/or numbers are prepended to the file name. Thus, any other naming scheme would break most SPM functions and would also be confusing for SPM users. The reason why SPM still sticks to that naming convention is probably that it has been in use for more than 25 years now.
The additional mri/report/label/surf folders were introduced in CAT12 because there were simply too many processed files if you have a large data set with hundreds or thousands of subjects. The SPM file selection tool is then overloaded, because it is easier to put all data into one folder and select them all. If you have a subject-specific subfolder it is more complicated to select the files. Of course, I know that this can be easily scripted, but the typical SPM user sticks to the GUI and more than 2/3 are using Windows computers.
Sorry for the long explanation, but maybe you can now better understand why it would be difficult to change to a BIDS-conform naming scheme in SPM12 or CAT12. For CAT12 I suggest the following:

  1. Use a script that links BIDS conform data to one processing folder to ease file selection for SPM12/CAT12
  2. Use a script that renames and copies SPM12/CAT12 outputs back to BIDS folder structure
  3. Optionally, change cat_defaults.m to prevent subfolders from being used:
    cat.extopts.subfolders = 0; % use subfolders such as mri, surf, report, and label to organize your data

This would probably be the easiest way to combine the advantages and preferences of both worlds. A shell script would be easy to write, but Windows users could not use it, thus a Matlab script might be the best option. I will see what I can do, it’s just a time issue…
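Step 1 of the list above could be sketched roughly as follows, here in Python rather than Matlab. The function name stage_for_spm and the glob pattern are assumptions for illustration and not part of SPM12 or CAT12. Because BIDS file names already carry the subject (and session) labels, they stay unique when collected in a single folder. Symbolic links are tried first; if the platform refuses them (as can happen on Windows), the file is copied instead.

```python
import os
import shutil
from pathlib import Path


def stage_for_spm(bids_root, work_dir, pattern="sub-*/**/anat/sub-*_T1w.nii*"):
    """Link (or copy) all raw T1w images of a BIDS dataset into one flat folder.

    Hypothetical helper: returns the list of newly staged files.
    """
    bids_root = Path(bids_root)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    staged = []
    for src in sorted(bids_root.glob(pattern)):
        dst = work_dir / src.name
        if dst.exists() or dst.is_symlink():
            continue  # already staged in an earlier run
        try:
            os.symlink(src.resolve(), dst)
        except OSError:
            # Symlinks may be unavailable or need extra privileges: copy instead.
            shutil.copy2(src, dst)
        staged.append(dst)
    return staged
```

After processing, the outputs in the flat folder would still have to be renamed and copied back (step 2 of the list).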

Best,

Christian

Hi Christian, thanks for the quick and comprehensive reply! I wrote a little repository for finding non-BIDS-conform NIfTI files in a directory (it simply crawls through the folder and looks for files that match the CAT12 prefix or any other prefix). This can be used as a workaround until CAT12 & SPM12 eventually become BIDS-conform. I will close this thread for now and mark your answer as the solution. Looking forward to the BIDS standard in SPM12 and CAT12!
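The repository itself is not reproduced in this thread. As a rough illustration of the idea only (the function name find_prefixed_niftis and its parameters are hypothetical, not taken from that repository), such a crawler could be sketched with the standard library like this:

```python
import re
from pathlib import Path


def find_prefixed_niftis(root, prefixes=("mwp1",)):
    """Return all NIfTI files below `root` whose name starts with a given prefix.

    Hypothetical helper: the prefixes are escaped and combined into one regex.
    """
    alternatives = "|".join(re.escape(prefix) for prefix in prefixes)
    pattern = re.compile(rf"^(?:{alternatives}).*\.nii(?:\.gz)?$")
    return sorted(
        path
        for path in Path(root).rglob("*.nii*")
        if path.is_file() and pattern.match(path.name)
    )
```

Calling find_prefixed_niftis("project") would then return the files in the mri subfolders while skipping the raw images, since those start with "sub-" rather than with a prefix.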

Ahoi hoi @Christian_Gaser,

thank you very much for the detailed explanation and for joining Neurostars!

The SPM prevalence, user base and their most common setup are definitely important and valuable reasons. IMHO we all (the BIDS, Python neuroimaging and SPM/Matlab communities) could do more regarding better integration and interoperability. Don’t get me wrong, I don’t want to blame anyone, I understand that it wasn’t a focus on either side and didn’t evolve as “naturally” as other things/integrations. However, given the recent support of BIDS within the SPM ecosystem (BIDS apps, read/write functions, etc.), it might be a good time for some folks to get together and discuss these things. Maybe this could entail short/long term goals for a better collaboration and what to adapt, among other things. For example, the parallel structure I was referring to in my previous post (supporting both the SPM and BIDS structure) is obviously a major software development nightmare, but could enable SPM ecosystem users to use BIDS if they want/need to without the necessity to code it (given the restrictions you mentioned). This should be possible through the steps you mentioned and should already be implemented in one or the other BIDS app. I also saw that you support an MCR version of CAT12. With that, a CAT12 BIDS app would be possible as well (if it doesn’t exist already). Regarding the time and work load, I guess some folks would be interested in helping out. However, discussing this (as it would change the way CAT12 and other SPM software is developed) could/should happen somewhere else than here. I’ll tag some folks I know work with BIDS and SPM/Matlab (sorry for that gang): @Guillaume, @Remi-Gau & @StephanHeunis. I’m pretty sure that we can find a way to work on these things.

Thanks again, cheers, Peer

Thanks for the tag @PeerHerholz.

Excellent timing for this thread, since we are having a first meeting for the bids-matlab repo on Thursday (will do my best to post the minutes of the meeting here afterwards).

I think that the problems highlighted by @Christian_Gaser here are definitely among those we need to keep in mind.

Ideally, it would be very convenient to have bids-matlab help get the output of the different software packages out there to conform better to BIDS derivatives.

I don’t think there will be a silver-bullet, out-of-the-box blanket solution, but I think that we can definitely create some code base that can help make the life of the developers of those different toolboxes way easier when it comes to joining the BIDS crowd: I doubt the CAT12 team is the only one to have to ask those questions and make those choices.

I also suspect that some of those changes will require some gradual change of the users’ habits over time (but this is a long term issue).

I completely agree with @PeerHerholz and @Remi-Gau that SPM12 and CAT12 should integrate BIDS. As a short term goal, I will implement a BIDS export function that links or copies CAT12 output to the BIDS structure. However, with regard to the long-term goal of integrating BIDS more deeply, I am still dependent on the SPM developers. The file selection via GUI is a central function that is used in so many steps, and currently it would be a nightmare to (recursively) select data in subject-specific folders for databases with hundreds or thousands of subjects (e.g. UK Biobank). Again, a meaningful interim step could be to use a BIDS importer (which already exists in SPM as the spm_BIDS function) and, after processing, to optionally export data to BIDS.
Since version 12.7, CAT12 also supports an MCR version that is scriptable at the shell and that could easily be extended by BIDS support. See here for an example of a CAT12 singularity container:


A flag could then be used to control BIDS or SPM-conform output to satisfy both worlds.
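To make the export idea more concrete, here is a small Python sketch. It is not the planned CAT12 function: the name export_to_bids, the mapping argument and the link flag are assumptions for illustration. The caller supplies a mapping from each CAT12 output file to its target path below the derivatives folder, and the flag switches between copying and linking. The sketch also writes the dataset_description.json that BIDS expects in a derivatives dataset; the BIDSVersion value is a placeholder to adjust.

```python
import json
import os
import shutil
from pathlib import Path


def export_to_bids(mapping, derivatives_dir, pipeline="cat12", link=False):
    """Copy or link processed files into <derivatives_dir>/<pipeline>.

    Hypothetical helper. `mapping` maps each source file to its target path
    relative to the pipeline folder, e.g. "sub-01/anat/<new name>.nii.gz".
    """
    out_root = Path(derivatives_dir) / pipeline
    out_root.mkdir(parents=True, exist_ok=True)

    # Minimal description of a derivative dataset (placeholder values).
    description = {
        "Name": f"{pipeline} outputs",
        "BIDSVersion": "1.8.0",
        "DatasetType": "derivative",
        "GeneratedBy": [{"Name": pipeline}],
    }
    (out_root / "dataset_description.json").write_text(
        json.dumps(description, indent=2)
    )

    exported = []
    for src, rel_path in mapping.items():
        dst = out_root / rel_path
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists() or dst.is_symlink():
            continue  # do not overwrite earlier exports
        if link:
            os.symlink(Path(src).resolve(), dst)
        else:
            shutil.copy2(src, dst)
        exported.append(dst)
    return exported
```

Keeping the renaming rule outside of this function means the same export step could serve other SPM toolboxes with different prefixes.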

Best,

Christian

An update on the bids-matlab front.

@Christian_Gaser feel free to join us and/or make suggestions on the repository to see if we can make your life easier.

We are moving towards using a channel on the Brainhack mattermost to communicate:

https://mattermost.brainhack.org/brainhack/channels/bids-matlab

The next meeting should be next week:

We will try to make an inventory of:

  • what the “community needs” from bids-matlab: see this issue
  • what’s already out there that we can reuse: see this issue

Setting up a proposal for the “governance” of the repo for the next meeting:

Contributing: adapt from that of the bids specs


minutes from our first meeting
