Missing something to enable XCP-D processing in parallel?

Trevor_Day · February 18, 2025, 3:13pm

Summary of what happened:

I am trying to process the HCP data using XCP-D, but XCP-D is only recruiting one core at a time.
I have set XCP-D’s flag --nprocs 27 (I have 28 cores), as well as the apptainer run flag --cpus 27. According to this Apptainer help page, my Ubuntu 22.04 system should be correctly configured to allow Apptainer to use multiple cores.

I have also experimented with Apptainer’s --cpuset-cpus and XCP-D’s --omp-nthreads, but neither enabled parallelization.

Am I missing something?

Command used (and if a helper script was used, a link to the helper script or the command generated):

#!/bin/bash

hcpya=/mnt/astrid_remote/HCPYA/
fmri_dir=${hcpya}/src/
output_dir=${hcpya}/clean/

participants="$(echo "${1}" | tr '[:space:]' ' ')"

# symlink to latest xcpd version
xcpd=/home/tkmday/trevor/xcpd.sif

mkdir -p /tmp/xcpd_workdir/
now=$(date '+%y%m%d_%H%M')

# $participants is meant to be word-split into separate labels, so it is left unquoted
# shellcheck disable=SC2086
apptainer run                                   \
    -B ${fmri_dir}:/fmri_dir/:ro                \
    -B ${output_dir}:/output/                   \
    -B /tmp/xcpd_workdir/:/wkdir/               \
    -B /tmp:/scrth                              \
    --cpus 27                                   \
    ${xcpd}                                     \
        /fmri_dir/ /output/ participant         \
        --work-dir              /wkdir/         \
        --clean-workdir                         \
        --participant-label     ${participants} \
        --mode                  none            \
        --file-format           cifti           \
        --input-type            hcp             \
        --nuisance-regressors   gsr_only        \
        --motion-filter-type    none            \
        --despike               y               \
        --fd-thresh             0.2             \
        --atlases               Glasser         \
        --min-coverage          0.5             \
        --min-time              60              \
        --combine-runs          n               \
        --output-type           censored        \
        --warp-surfaces-native2std n            \
        --smoothing             6               \
        --abcc-qc n --linc-qc n                 \
        --nprocs                27              \
        | tee -a "date-${now}_log.txt"

Version:

XCP-D version v0.10.5
Ubuntu release: 22.04
Apptainer version: 1.3.4

Environment (Docker, Singularity / Apptainer, custom installation):

Apptainer

Data formatted according to a validatable standard? Please provide the output of the validator:

Using XCP-D’s built-in HCP to BIDS function.

Screenshots / relevant information:

I concluded that it was only recruiting one core by watching the Ubuntu System Monitor GUI.
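
For a command-line cross-check of the System Monitor observation, here is a minimal sketch that counts how many distinct cores are hosting a busy thread. The function name count_busy_cores and the 50 %CPU threshold are illustrative choices, not part of XCP-D or Apptainer. It assumes the procps version of ps, where psr is the core a thread last ran on and pcpu is averaged over the thread's lifetime, so the result is a rough snapshot rather than an instantaneous reading.

```bash
#!/bin/bash
# count_busy_cores [THRESHOLD]  (hypothetical helper, default threshold 50)
# Reads "core_id percent_cpu" pairs on stdin, one thread per line, and prints
# how many distinct cores host at least one thread at or above THRESHOLD %CPU.
count_busy_cores() {
    local threshold="${1:-50}"
    awk -v t="${threshold}" '
        # Remember every core that has at least one busy thread.
        NF >= 2 && ($2 + 0) >= (t + 0) { busy[$1] = 1 }
        END { n = 0; for (c in busy) n++; print n }
    '
}

# Example call (assumes procps ps; -L lists one line per thread):
#   ps -eLo psr=,pcpu= | count_busy_cores 50
```

A result of 1 while XCP-D is running would be consistent with only one core being recruited; a larger number only shows that several cores are busy system-wide, not that XCP-D is the process using them.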

tsalo · February 18, 2025, 3:18pm

The main XCP-D workflow should use those allotted cores, but the ingression step that creates a pseudo-BIDS-Derivatives dataset from the HCP derivatives is only set up to use one thread. We eventually want to improve that ingression step (see PennLINC/xcp_d issue #1247, “Move ingression functions into workflows to match QSIRecon”), but we haven’t had a chance to work on it yet.

EDIT: And just to be clear, the ingression step can take a long time to run.

Trevor_Day · February 18, 2025, 4:02pm

Ah ha, I see.

So these steps are the ingression steps, and will always be sequential?

250218-10:21:28,174 nipype.utils INFO:
	 Processing rfMRI_REST1_LR
250218-10:24:45,556 nipype.utils INFO:
	 Finished rfMRI_REST1_LR
250218-10:24:45,556 nipype.utils INFO:
	 Processing rfMRI_REST1_RL
250218-10:27:58,115 nipype.utils INFO:
	 Finished rfMRI_REST1_RL
250218-10:27:58,115 nipype.utils INFO:
	 Processing rfMRI_REST2_LR
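
To put numbers on how long each run spends in this step, here is a minimal sketch that pairs the "Processing" and "Finished" messages and prints the elapsed seconds per run. The function name ingression_durations is hypothetical. It assumes the log layout shown above (a timestamp line of the form YYMMDD-HH:MM:SS,mmm followed by an indented message line) and that no run spans midnight.

```bash
#!/bin/bash
# ingression_durations  (hypothetical helper)
# Reads a nipype-style log on stdin and prints "<run> <seconds>" for every
# "Processing <run>" message that is later matched by "Finished <run>".
ingression_durations() {
    awk '
        # Timestamp line: store the time of day in whole seconds.
        /^[0-9]+-[0-9]+:[0-9]+:[0-9]+,/ {
            split($1, stamp, /[-:,]/)   # stamp[2..4] = HH, MM, SS
            now = stamp[2] * 3600 + stamp[3] * 60 + stamp[4]
            next
        }
        # Message lines: leading whitespace is ignored by field splitting.
        $1 == "Processing"                { start[$2] = now }
        $1 == "Finished" && ($2 in start) { print $2, now - start[$2] }
    '
}

# Example call (the log file name is a placeholder):
#   ingression_durations < xcpd_log.txt
```

Applied to the excerpt above, this reports 197 seconds for rfMRI_REST1_LR and 193 seconds for rfMRI_REST1_RL; rfMRI_REST2_LR is skipped because it has no "Finished" message yet.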

tsalo · February 18, 2025, 4:09pm

That is correct, unfortunately.

Trevor_Day · February 19, 2025, 2:21pm

Let me ask you a related question, if you don’t mind. I am focusing on the four resting-state scans, so I cut the number of cores down to four for benchmarking.

However, this seems to have made ALFF run in sequence, which is a major slowdown.
As far as I can tell, it was still only recruiting one core based on System Monitor.

Any changes I need to make there?

tsalo · February 19, 2025, 2:52pm

I think that’s something @mattcieslak fixed in the current unstable version. We’re planning to make a new release soon, so hopefully once we do that you can use the new version and it will calculate ALFF in parallel.

Trevor_Day · February 19, 2025, 3:03pm

Ah okay, I thought ALFF was running in parallel before.

Edit: It doesn’t look like the unstable version ran ALFF in parallel. I will await the next release.
As always, thanks for your hard work and quick replies.