A collaborator has recently sent us a BIDS dataset with 3 diffusion series. Unfortunately, we cannot process the third, due to reduced FOV:
```
sub-01_dwi.nii.gz
sub-01_acq-single_dwi.nii.gz
sub-01_acq-multib1000b2000_dwi.nii.gz
```
I would like to keep all 3 in the raw BIDS dataset, but exclude the 3rd scan from processing (QSIPrep).
I have tried the following BIDS filter files, but they either fail to exclude the 3rd series or select only the second. How might I improve these to select only the 1st and 2nd series, ideally with regex (I'm still new to regex)?
I thought a negative lookahead would work here, but it selected all 3 series.
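The filter file itself isn't shown, but the usual lookahead pitfall can be demonstrated with plain Python `re` against the three filenames above (a sketch only; note that, as I understand it, BIDS filter-file queries are passed to `BIDSLayout.get`, which matches entity values rather than whole filenames):

```python
import re

names = [
    "sub-01_dwi.nii.gz",
    "sub-01_acq-single_dwi.nii.gz",
    "sub-01_acq-multib1000b2000_dwi.nii.gz",
]

# A bare lookahead only checks the characters at its own position.
# Since none of these names *begin* with "multi", the lookahead
# succeeds somewhere in every name, so search() matches all 3.
loose = re.compile(r"(?!multi)")
matched_loose = [n for n in names if loose.search(n)]   # all 3

# Anchor the lookahead and let it scan the whole name with ".*":
# now anything containing "multi" anywhere is rejected.
strict = re.compile(r"^(?!.*multi).*dwi\.nii\.gz$")
matched_strict = [n for n in names if strict.match(n)]  # first two only
```

The key change is `^(?!.*multi)` instead of `(?!multi)`: the `.*` inside the lookahead makes it a "does not contain" test rather than a "does not start with" test at one position.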
Create a pybids database with `pybids layout <bids_root> <database_dir> --no-validate --index-metadata`, making sure the .bidsignore from before, which tells pybids to ignore those multi scans, is still present
Pass the database_dir into QSIPrep with the --bids-database-dir argument
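As a concrete sketch of that .bidsignore (the exact glob is an assumption about your layout; .bidsignore follows gitignore-style syntax), something like this would hide the multi-shell series from indexing:

```
sub-*/dwi/*acq-multi*
```

Because the `.nii.gz`, `.bval`, `.bvec`, and `.json` sidecars all share the `acq-multi...` entity in their names, one pattern covers them all.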
I’m not totally sure why those files aren’t being excluded, but I can tell you how I’d go about solving this problem.
In our group we typically process each subject on their own and combine the results after all the subjects have finished. The BIDS data exists somewhere on a network drive; when a job is sent out to the scheduler, the job copies a single subject to a local scratch directory and runs qsiprep on the local copy. If you do something like this, you can just delete the *multi* scans from your local BIDS copy before you run qsiprep. This also avoids the huge time drain of letting pybids index everything at the beginning, because qsiprep will see a BIDS input with a single subject.
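That staging step can be sketched in a few lines of Python (a hypothetical helper, not the group's actual script; the function name, paths, and exclusion pattern are all assumptions):

```python
import re
import shutil
from pathlib import Path

def stage_subject(bids_root, scratch, subject, exclude=r"acq-multi"):
    """Copy one subject into a throwaway BIDS directory on scratch,
    then delete any files matching the exclusion pattern."""
    bids_root, scratch = Path(bids_root), Path(scratch)
    local = scratch / bids_root.name
    local.mkdir(parents=True, exist_ok=True)
    # Top-level metadata files (dataset_description.json, etc.)
    for f in bids_root.iterdir():
        if f.is_file():
            shutil.copy2(f, local / f.name)
    # The single subject's raw data
    shutil.copytree(bids_root / subject, local / subject)
    # Drop the series qsiprep should never see
    pattern = re.compile(exclude)
    for f in local.rglob("*"):
        if f.is_file() and pattern.search(f.name):
            f.unlink()
    return local
```

You would then point qsiprep at the returned directory; since it contains one subject and no excluded series, no filter file or .bidsignore is needed at all.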
I was hoping to follow up on your comment about copying subjects to a local scratch directory for processing. Currently, we create a temporary BIDS directory with a symlink pointing to the subject's raw directory. Regarding derivative datasets, I realize that some pipelines keep data files in the derivatives tree that they use for some of their operations (e.g. pyAFQ creating a tract-profiles spreadsheet).
I was wondering whether your group has come across any solutions you could recommend for making sure the temporary BIDS directory you create for each subject also points to the relevant derivative datasets?
My naive approach is:
Recreate the base directory tree of derivatives
Symlink subject-specific directory for each pipeline directory
Symlink any file or directory whose name does not match '^sub-.*$'
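For what it's worth, the three steps above can be sketched as follows (a minimal sketch under the assumptions that derivatives live in `derivatives/<pipeline>/` and that anything not named `sub-*` is shared, non-subject-specific data; the function name is hypothetical):

```python
import re
from pathlib import Path

SUBJECT_RE = re.compile(r"^sub-.*$")

def link_derivatives(derivatives_root, temp_bids, subject):
    """Mirror each pipeline directory under the temporary BIDS tree,
    symlinking the one subject's folder plus any entry that is not
    subject-specific (e.g. dataset_description.json, group files)."""
    derivatives_root = Path(derivatives_root)
    temp_deriv = Path(temp_bids) / "derivatives"
    for pipeline in derivatives_root.iterdir():
        if not pipeline.is_dir():
            continue
        # Step 1: recreate the base directory tree
        target = temp_deriv / pipeline.name
        target.mkdir(parents=True, exist_ok=True)
        for entry in pipeline.iterdir():
            # Steps 2 and 3: link this subject's directory, plus
            # anything whose name is not subject-specific at all.
            if entry.name == subject or not SUBJECT_RE.match(entry.name):
                (target / entry.name).symlink_to(entry.resolve())
```

One caveat with this approach: pipelines that *write* into shared files (like a group-level spreadsheet) would write through the symlink into the network copy, which may or may not be what you want under concurrent jobs.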
This is a bit vague for me to understand completely, but…
You can create a derivatives folder in the temporary BIDS folder that contains just the preprocessed derivatives for that subject. I have done this with qsiprep/pyAFQ, as well as with post-processing of fMRIPrep data, for what it's worth. pyAFQ in particular has a ParticipantAFQ API that is meant for single-subject processing.