Run fmriprep on 7 participants simultaneously

docker
fmriprep

#1

Hello!
I'm running fmriprep from Docker with the following command:

    docker run --cpuset-cpus="10-39" --rm -v /home/yarkin/BIDS_LA5study:/bids -v 
    /home/yarkin/fmriprep_LA5study:/out -v 
    /home/yarkin/credentials/license.txt:/opt/freesurfer/license.txt poldracklab/fmriprep:latest 
    /bids /out participant --fs-no-reconall --force-bbr --participant_label {60001..60008}

I want to preprocess 265 patients. Since fmriprep is a memory-consuming process, I'm trying to run it for 7 patients simultaneously, manually setting the --participant_label flag (counting off 7 patients and setting a range). It's a bit annoying to rerun the command for every next group of 7 participants, manually changing the --participant_label parameter each time.
Is there any way to solve this, either with fmriprep flags or with a bash command such as xargs?
It would be very helpful! Thanks.


#2

How about:

ls -d sub-*/ \
    | sed -e 's/.*sub-\(.*\)\//\1/' \
    | split -l 7 - participants_

Then you can run

docker run ... --participant-label $(cat participants_aa)
docker run ... --participant-label $(cat participants_ab)
docker run ... --participant-label $(cat participants_ac)
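If typing out one command per chunk file gets tedious, the lines above can be folded into a loop. A sketch: the chunk files here are simulated stand-ins for the output of split, the docker arguments are elided as "..." like in the rest of this thread, and the echo makes it a dry run so you can review the commands before removing it:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Simulated chunk files, standing in for the output of the split command above.
printf '60001\n60002\n' > participants_aa
printf '60003\n60004\n' > participants_ab

# One docker run per chunk file; the echo is a dry run -- drop it
# (and fill in the real arguments) to actually launch the containers.
for chunk in participants_*; do
    echo docker run ... --participant-label $(cat "$chunk")
done
```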

#3

Thanks @effigies, but what I meant is not having to call docker run again each time the previous patients finish with fmriprep. As an idea: a flag would be needed to limit the number of patients processed simultaneously when fmriprep is run on the whole group, without --participant-label.


#4

Hmm. That certainly can't be achieved with the built-in nipype scheduler. But if parallel is available inside the Docker container, you could run:

docker run ... --entrypoint=bash $IMAGE -c \
    'for LABEL in {A..B}; do
         echo fmriprep ... --participant-label $LABEL
     done | parallel -j 7'

You'll obviously want to adjust your mem_gb and num-procs options to account for the other processes consuming resources. The downside is that when some runs are down to their last one-core task, the idle cores can't be reassigned to another run that could safely use them.
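Since the original question mentioned xargs: the same fan-out can also be done from the host, one container per participant, without entering the image at all. A sketch (the "..." again stands for the full set of docker arguments, and the echo makes this a dry run):

```shell
#!/usr/bin/env bash
set -euo pipefail

# xargs keeps at most 7 commands running at once (-P 7); -I{} substitutes
# one participant label per command. Remove the echo to actually run.
printf '%s\n' 600{01..08} |
    xargs -P 7 -I{} echo docker run ... --participant-label {}
```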

A more ambitious alternative would be to augment nipype with a smarter ordering algorithm (e.g., making the topological sort linearize one fully connected sub-graph before starting another), which would utilize resources much better.


#5

To jump in with some other ideas: do you have access to a cluster where you can run these jobs? There, you could follow @effigies' suggestion and just submit each subset of participants as a separate job!

Running them all on one machine is a bit more constrained; I think the idea of executing them sequentially is probably the easiest. If you wanted to automate this sequential execution, it might be worth writing a wrapper script that uses "wait", see this example!
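A minimal sketch of that wait-based batching (the label range and the docker arguments are illustrative placeholders, and the echo makes each launch a dry run):

```shell
#!/usr/bin/env bash
# Launch up to 7 background jobs, then wait for the whole batch to
# finish before starting the next one.
labels=(600{01..14})   # illustrative participant labels
batch=7

for ((i = 0; i < ${#labels[@]}; i += batch)); do
    for label in "${labels[@]:i:batch}"; do
        echo docker run ... --participant-label "$label" &  # dry run
    done
    wait  # block here until every job in this batch has exited
done
```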


#6

I use the qbatch utility (https://github.com/pipitone/qbatch), written in Python, to do this.

There's an example script here: https://github.com/edickie/bids-on-scinet/blob/master/examples/qbatch_fmriprep1.1.2_anat_p08.sh
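For context, qbatch consumes a plain text file with one command per line and chunks it into cluster jobs. Generating that job list looks roughly like this (a sketch: the docker arguments are elided as elsewhere in the thread, and the qbatch submission line is left commented out because its options depend on your scheduler):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a one-command-per-line job list for qbatch; the participant
# labels here are illustrative placeholders.
for label in 600{01..08}; do
    echo "docker run ... --participant-label $label"
done > fmriprep_jobs.txt

# Then submit, with options appropriate to your cluster (see the qbatch README):
# qbatch fmriprep_jobs.txt
```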