Hi again,
TractoFlow has produced good results for me in ~70 datasets with single-shell data (b=750). Now I am aiming to process a subset of 41 subjects with multishell data. Each dataset has 8 b0 volumes, 32 b=750 volumes, and 60 b=2000 volumes (92 weighted directions in total).
I plan to run this on my local HPC server, but I’m running into memory issues when I reach the Eddy step. Here’s an example command and my process so far:
nextflow -c /home/blgeerae/Programs/tractoflow-2.0.0/singularity.conf \
    run /home/blgeerae/Programs/tractoflow-2.0.0/main.nf \
    --root /home/blgeerae/Data/tractoflow/Multishell \
    --dti_shells "0 750" --fodf_shells "0 750 2000" \
    -with-singularity /home/blgeerae/Programs/tractoflow-2.0.0/tractoflow_2.0.0_8b39aee_2019_04_26.img \
    -resume
First, I tried running TractoFlow on the full cohort (41 datasets, 100 volumes each) as a single job, using the command above and allocating 40 CPUs and 185 GB of memory. The run stalled with insufficient memory, and I cancelled it after 6 days of processing.
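For reference, here is a sketch of the kind of submission script I used (SLURM-style syntax as an illustration; the job name and launch directory are placeholders, and the resource lines match what I allocated):

#!/bin/bash
#SBATCH --job-name=tractoflow-full      # placeholder name
#SBATCH --cpus-per-task=40
#SBATCH --mem=185G
#SBATCH --time=7-00:00:00

cd /home/blgeerae/Data/tractoflow       # launch directory (placeholder)
nextflow -c /home/blgeerae/Programs/tractoflow-2.0.0/singularity.conf \
    run /home/blgeerae/Programs/tractoflow-2.0.0/main.nf \
    --root /home/blgeerae/Data/tractoflow/Multishell \
    --dti_shells "0 750" --fodf_shells "0 750 2000" \
    -with-singularity /home/blgeerae/Programs/tractoflow-2.0.0/tractoflow_2.0.0_8b39aee_2019_04_26.img \
    -resume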
Next, I tried running subjects in parallel by submitting 41 separate jobs. To set this up, I moved each subject's data into its own folder and pointed each job's script at a different --root directory. All scripts are executed from the same directory, which I hoped would mean all results get deposited into the same /results/ folder (though it also means all jobs share the same /work/ folder). For these jobs I allocated 8 CPUs and 32 GB of memory and added the --processes 8 or --processes 4 flag, but no job has completed successfully yet.
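Here is a sketch of how those per-subject submissions look (again SLURM-style syntax as an illustration; the split-folder layout and job names are hypothetical stand-ins for my actual setup, and note that every job launches from the same directory):

# every job launches (and writes results) from the same directory
cd /home/blgeerae/Data/tractoflow
TF=/home/blgeerae/Programs/tractoflow-2.0.0
for subj_dir in /home/blgeerae/Data/tractoflow/Multishell_split/*/; do  # hypothetical layout: one subject per folder
    sbatch --job-name="tf-$(basename "$subj_dir")" --cpus-per-task=8 --mem=32G \
        --wrap "nextflow -c $TF/singularity.conf run $TF/main.nf \
            --root $subj_dir --dti_shells '0 750' --fodf_shells '0 750 2000' \
            --processes 8 \
            -with-singularity $TF/tractoflow_2.0.0_8b39aee_2019_04_26.img -resume"
done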
A couple of different errors have popped up:

Process requirement exceed available CPUs -- req: 8; avail: 4

Oddly, this appeared on a job with 8 CPUs allocated and the --processes 4 flag.

ERROR ~ Unable to acquire lock on session with ID 48e8c351-9ab9-46b1-ae63-8anaanf61503

This one (with varying IDs) has appeared on many different jobs. The remaining jobs have simply run until they hit the time limit; none has completed, and none has successfully finished the eddy step.
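If the shared directories are indeed the culprit, this is the fallback I'm considering: give each run its own launch directory (for the hidden .nextflow session files) and its own work directory via Nextflow's -w flag. A sketch, with hypothetical paths:

TF=/home/blgeerae/Programs/tractoflow-2.0.0
for subj_dir in /home/blgeerae/Data/tractoflow/Multishell_split/*/; do
    # per-run launch dir, so each run keeps its own .nextflow/ session files
    launch_dir="/home/blgeerae/Data/tractoflow/runs/$(basename "$subj_dir")"
    mkdir -p "$launch_dir"
    sbatch --job-name="tf-$(basename "$subj_dir")" --cpus-per-task=8 --mem=32G \
        --chdir="$launch_dir" \
        --wrap "nextflow -c $TF/singularity.conf run $TF/main.nf \
            --root $subj_dir --dti_shells '0 750' --fodf_shells '0 750 2000' \
            --processes 8 \
            -w $launch_dir/work \
            -with-singularity $TF/tractoflow_2.0.0_8b39aee_2019_04_26.img -resume"
done

The obvious downside is that results then land in per-run folders that I'd have to consolidate afterwards, which is exactly what I was hoping to avoid.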
So, a couple questions:
- Can anyone comment on how much memory I should allocate to successfully process multishell data with 92 directions and 8 b0s? Either for one unified job or for a separate job per subject. (I've sketched a per-process memory cap right after these questions, in case that's the right lever.)
- Is this ‘unable to acquire lock…’ error caused by multiple TractoFlow jobs running in the same directory and accessing the same /work/ and /results/ folders? I had hoped this setup would be simpler than creating a separate folder for each job and consolidating results later.
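On the memory question, if the answer turns out to be "cap it per process", here's how I'd try it: an extra Nextflow config file passed with a second -c flag (Nextflow merges config files, as far as I know). This is only a sketch; my_resources.config is a hypothetical name, and I haven't confirmed the exact name of TractoFlow's eddy process (it shows up in the console output):

// my_resources.config (hypothetical) -- added with an extra -c flag
process {
    memory = '16 GB'           // default cap for every process
    withName: 'Eddy' {         // assuming the eddy process is literally named 'Eddy'
        cpus = 8
        memory = '32 GB'       // give eddy the whole per-job allocation
    }
}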
I know I can skip the eddy step, but if possible I'd prefer to keep it so that the methods stay consistent between my single-shell and multishell processing.
Thank you for your time!