Running several fMRIPrep Docker containers in parallel gets stuck

Summary of what happened:

I launched several fMRIPrep Docker containers in parallel (one per subject), and the processes got stuck.

Command used (and if a helper script was used, a link to the helper script or the command generated):

docker run  --rm --name fmriprep_sub-1001 \
  -v /mnt/d/zy/RawData/test/license.txt:/opt/freesurfer/license.txt:ro \
  -v /mnt/d/zy/RawData/Package_1208126/HCP_BIDS:/data:ro \
  -v /mnt/e/fmriprepout/sub-1001:/out \
  nipreps/fmriprep:latest \
  /data /out participant \
  --participant_label sub-1001 \
  --n-cpus 4 \
  --omp-nthreads 4
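
For reference, the parallel setup was launched roughly like the sketch below, with one container per subject (this is a reconstruction of the approach; the subject list beyond sub-1001 is illustrative):

# Sketch: launch one fMRIPrep container per subject, detached (-d).
# Paths and image tag are copied from the command above; subject IDs 1002/1003 are examples.
for sub in 1001 1002 1003; do
  docker run -d --rm --name fmriprep_sub-${sub} \
    -v /mnt/d/zy/RawData/test/license.txt:/opt/freesurfer/license.txt:ro \
    -v /mnt/d/zy/RawData/Package_1208126/HCP_BIDS:/data:ro \
    -v /mnt/e/fmriprepout/sub-${sub}:/out \
    nipreps/fmriprep:latest \
    /data /out participant \
    --participant_label sub-${sub} \
    --n-cpus 4 \
    --omp-nthreads 4
done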

Version:

fmriprep 23.0.2

Environment (Docker, Singularity, custom installation):

docker

Data formatted according to a validatable standard? Please provide the output of the validator:

Passed BIDS validation
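
For reference, one way to run the validator against this dataset (assuming the bids/validator Docker image) is something like:

# Run the BIDS validator against the mounted dataset; the /data mount matches the fMRIPrep command above.
docker run -ti --rm -v /mnt/d/zy/RawData/Package_1208126/HCP_BIDS:/data:ro bids/validator /data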

Relevant log outputs (up to 20 lines):

I am launching several Docker containers to run fMRIPrep in parallel, but I found that after the pipeline reaches this point in the log:

 230601-08:42:18,919 nipype.workflow INFO:
         [Node] Executing "autorecon2_vol" <smriprep.interfaces.freesurfer.ReconAll>
230601-08:42:18,957 nipype.interfaces INFO:
         resume recon-all : recon-all -autorecon2-volonly -openmp 4 -subjid sub-1002 -sd /out/sourcedata/freesurfer -nogcareg
230601-08:42:19,2 nipype.interface INFO:
         resume recon-all : recon-all -autorecon2-volonly -openmp 4 -subjid sub-1002 -sd /out/sourcedata/freesurfer -nogcareg

the processes get stuck and their CPU usage drops to almost nothing:

19357 root      20   0  903600 476524  57300 S   1.3   0.4  16:14.51 fmriprep
22977 root      20   0  903856 476496  57068 S   1.3   0.4  15:45.23 fmriprep
  197 root      20   0 2245708  52692  20668 S   1.0   0.0   0:24.16 containerd
  907 root      20   0  904360 478044  57264 S   1.0   0.4  13:25.78 fmriprep
28969 root      20   0   12400   5348   3180 R   1.0   0.0   7:17.69 top
  419 root      20   0  903588 478240  56628 S   0.7   0.4  14:23.82 fmriprep
30629 root      20   0  906156 479848  57188 S   0.7   0.4  14:52.28 fmriprep
48306 root      20   0  906664 474864  57336 S   0.7   0.4  12:59.42 fmriprep
  575 root      20   0  906636 475900  57664 S   0.3   0.4  12:27.80 fmriprep
 6359 root      20   0  905632 473512  57548 S   0.3   0.4  12:50.54 fmriprep
 7884 root      20   0  906940 480872  57432 S   0.3   0.4  15:42.01 fmriprep
49086 root      20   0  905896 473420  57404 S   0.3   0.4  12:47.65 fmriprep
    1 root      20   0    2324   1712   1600 S   0.0   0.0   0:00.06 init(Ubuntu)
    4 root      20   0    2324      4      0 S   0.0   0.0   0:00.00 init
   71 root      20   0    2340    112      0 S   0.0   0.0   0:00.00 SessionLeader
   72 root      20   0    2340    120      0 S   0.0   0.0   0:00.12 Relay(73)
   73 moye      20   0    9176   5272   3512 S   0.0   0.0   0:00.03 bash
   86 root      20   0   11164   5036   4268 S   0.0   0.0   0:00.26 sudo
   87 root      20   0   11164    860      0 S   0.0   0.0   0:00.00 sudo
   88 root      20   0    9864   3752   3276 S   0.0   0.0   0:00.00 su
   89 root      20   0    8132   4152   3472 S   0.0   0.0   0:00.03 bash
  177 root      20   0 2944092  97212  48576 S   0.0   0.1   0:36.32 dockerd
  345 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 fs_time
  346 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 grep
  349 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 grep
  352 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 uptime
  382 root      20   0    7760   3312   3064 S   0.0   0.0   0:00.00 bash
  383 root      20   0 2437008  41260  16820 S   0.0   0.0   0:05.98 docker
  399 root      20   0  720760  15652   5340 S   0.0   0.0   0:05.62 containerd-shim
  443 root      20   0    2340    112      0 S   0.0   0.0   0:00.00 SessionLeader
  444 root      20   0    2340    120      0 S   0.0   0.0   0:00.05 Relay(445)
  445 moye      20   0    9176   5128   3368 S   0.0   0.0   0:00.03 bash
  458 root      20   0   11132   5096   4300 S   0.0   0.0   0:00.10 sudo
  459 root      20   0   11132    864      0 S   0.0   0.0   0:00.00 sudo
  460 root      20   0    9864   3664   3192 S   0.0   0.0   0:00.00 su
  461 root      20   0    8124   4088   3424 S   0.0   0.0   0:00.00 bash
  468 root      20   0    2340    112      0 S   0.0   0.0   0:00.00 SessionLeader
  469 root      20   0    2340    120      0 S   0.0   0.0   0:03.65 Relay(470)
  470 moye      20   0    9176   5260   3500 S   0.0   0.0   0:00.05 bash
  483 root      20   0   11132   5124   4332 S   0.0   0.0   0:05.24 sudo
  484 root      20   0   11132    860      0 S   0.0   0.0   0:00.00 sudo
  485 root      20   0    9864   3816   3340 S   0.0   0.0   0:00.00 su
  486 root      20   0    8024   4188   3520 S   0.0   0.0   0:00.01 bash
  487 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 mri_motion_corr
  488 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 grep
  521 root      20   0   13488  11008   5592 S   0.0   0.0   0:00.04 python
  522 root      20   0   14452  12156   6184 S   0.0   0.0   0:00.08 python
  527 root      20   0       0      0      0 Z   0.0   0.0   0:00.00 fs_time

Screenshots / relevant information:

WSL installation of Ubuntu. Basic machine specs: 24 cores, 256 GB RAM.

Hi @yongzhang,

How much RAM are you giving Docker specifically? If not specified, it might be using the default, which is probably too low. You might want to bump it from the default to something like 16GB.
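
A quick way to check how much memory the Docker daemon actually sees (run from the WSL Ubuntu shell) is something like:

# Prints the total memory visible to dockerd, i.e. the RAM available to the WSL VM.
docker info | grep -i "total memory"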

Best,
Steven

Hi Steven,
I used the default RAM. From the information I consulted: by default, Docker doesn't set any resource constraints on a container; it allows the container to access all of the system's resources.
Best.
yongzhang

Hi @yongzhang.

Just in case, can you try adding

--memory=16g

to the docker run command?
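
For example, the original command with the memory cap added would look roughly like this (same paths, image, and options as above):

docker run --rm --memory=16g --name fmriprep_sub-1001 \
  -v /mnt/d/zy/RawData/test/license.txt:/opt/freesurfer/license.txt:ro \
  -v /mnt/d/zy/RawData/Package_1208126/HCP_BIDS:/data:ro \
  -v /mnt/e/fmriprepout/sub-1001:/out \
  nipreps/fmriprep:latest \
  /data /out participant \
  --participant_label sub-1001 \
  --n-cpus 4 \
  --omp-nthreads 4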

Best,
Steven

Hi Steven,
Following your suggestion, I limited each container to 16 GB of memory; however, when I open two or more Docker containers, the processes still get stuck.
Best,
yongzhang

Can you explain what you mean by this? fMRIPrep already runs its internal processes in parallel, and you can process multiple subjects in parallel by specifying more subjects with --participant_label, so why are you running multiple containers?

I am using multiple Docker containers to achieve parallelism, with one Docker container running one subject.

fMRIPrep will take care of parallelizing multiple runs within a participant too.

My thought is that with --participant-label sub1,sub2,sub3, if one of the three subjects hits an error, it may cause all three to fail. If I run one container per subject, that cannot happen.

I think that only happens if you include --stop-on-first-crash
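
For example, a single container covering several subjects would look roughly like this (the container name, output mount, memory cap, CPU counts, and subject labels are illustrative; --participant-label accepts a space-separated list):

# Illustrative sketch: one container processing three subjects in parallel.
docker run --rm --memory=48g --name fmriprep_multi \
  -v /mnt/d/zy/RawData/test/license.txt:/opt/freesurfer/license.txt:ro \
  -v /mnt/d/zy/RawData/Package_1208126/HCP_BIDS:/data:ro \
  -v /mnt/e/fmriprepout:/out \
  nipreps/fmriprep:latest \
  /data /out participant \
  --participant-label 1001 1002 1003 \
  --n-cpus 12 \
  --omp-nthreads 4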

Thank you very much, I'll try that next. I have two ideas now: one is to launch more Docker containers on a native Ubuntu machine to see whether the issue is the system (WSL), and the other is to open a single Docker container as you said, with --participant-label sub1 sub2 sub3. Please look forward to my good news!