Hi @dvsmith, I think we should probably write up a FAQ about exactly how fMRIPrep scheduling works (by default, anyway).
The short answer is:
--n-cpus will default to the number of CPUs on your system, and
--mem-gb will default to 90% of your memory, to make an allowance for OS consumption or underestimates of fMRIPrep component usage. Component jobs, each tagged with some number of processors and some quantity of memory, are run when three conditions are met: 1) all prerequisite jobs have finished (i.e., all data needed to run the step exists); 2) enough memory is available; 3) enough cores are available. The longer answer has some caveats.
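In pseudocode, the dispatch rule amounts to something like the sketch below. This is not fMRIPrep's actual implementation (the real scheduling happens inside nipype's execution plugins); the `Job`, `runnable`, and `schedule` names are hypothetical, just to make the three conditions concrete:

```python
# Hypothetical sketch of the dispatch rule described above; not fMRIPrep's
# actual code, which delegates scheduling to nipype.
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    n_procs: int                            # cores the job is tagged as needing
    mem_gb: float                           # memory the job is tagged as needing
    deps: set = field(default_factory=set)  # names of prerequisite jobs

def runnable(job, done, free_cores, free_mem_gb):
    """A job may start only when all three conditions hold."""
    return (job.deps <= done                # 1) all prerequisites have finished
            and job.mem_gb <= free_mem_gb   # 2) enough memory is available
            and job.n_procs <= free_cores)  # 3) enough cores are available

def schedule(jobs, total_cores=8, total_mem_gb=16):
    """Greedy passes: each pass starts every job that currently fits."""
    done, order = set(), []
    while len(done) < len(jobs):
        free_cores, free_mem = total_cores, total_mem_gb
        started = []
        for job in jobs:
            if job.name not in done and runnable(job, done, free_cores, free_mem):
                started.append(job.name)
                free_cores -= job.n_procs
                free_mem -= job.mem_gb
        if not started:
            raise RuntimeError("nothing fits: jobs exceed available resources")
        order.append(started)               # these jobs run concurrently
        done |= set(started)                # assume the wave finishes
    return order
```

So with 4 cores and 4 GB, a 4-core job that depends on a 2-core job simply waits for both the dependency and the cores to free up.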
As to CPU consumption per run, it really doesn’t work like that. It might average to that, but many steps are parallelized within subjects. For example, when applying transformations to a BOLD series, each volume gets an independent process, run in parallel as cores are available.
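That per-volume fan-out looks roughly like this sketch. The `apply_transform` function is a placeholder (the real resampling is done by ANTs processes managed through nipype), and I use threads here only to keep the sketch portable; in practice each volume gets its own process:

```python
# Illustrative per-volume fan-out; apply_transform is a placeholder for
# resampling one 3D volume of the BOLD series.
from concurrent.futures import ThreadPoolExecutor

def apply_transform(volume):
    # stand-in for applying the combined transforms to one volume
    return volume * 2

def transform_series(volumes, max_workers=4):
    # each volume is an independent task, run in parallel as cores allow
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(apply_transform, volumes))
```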
As to advice, our general suggestion is to run each subject in an independent process, but I’ve actually never tried running with 96 cores and 384GB of RAM… You can definitely run multiple subjects simultaneously, but whether you want to run 6 separate processes or one process with six simultaneous subjects is up to you. The end result will be a different effective scheduler, but I can’t say which will be better. (I’m assuming here that you might want to commit 16 cores per subject, which is about the limit before additional cores stop yielding improvements, but it might be more efficient to assume 8 cores per subject and run 12 at a time.)
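The back-of-envelope arithmetic for those configurations, assuming your 96-core/384GB machine and a purely hypothetical 16GB-per-subject memory budget (tune that to what your data actually uses):

```python
# Concurrency is capped by whichever resource runs out first.
TOTAL_CORES, TOTAL_MEM_GB = 96, 384

def concurrency(cores_per_subject, mem_gb_per_subject=16):
    # mem_gb_per_subject=16 is an assumed figure, not a measured one
    return min(TOTAL_CORES // cores_per_subject,
               TOTAL_MEM_GB // mem_gb_per_subject)
```

With these numbers, 16 cores per subject gives 6 concurrent subjects and 8 cores per subject gives 12, which is where the 6-vs-12 trade-off above comes from.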
You can also try 48 subjects at a time. I don’t really know how the scheduler will perform under those conditions. If you’re feeling up to it, it might be interesting to take advantage of a 96-subject dataset to try several different strategies for running subsets and compare their performance.