XCP-D crashes, missing font

Trevor_Day · August 25, 2025, 4:25pm

Keeping track in case someone else has a similar issue and sees this.

I tested multiple participants.
The only change I could come up with was that I had moved the working directory to an external hard drive, but putting it back on the main volume didn’t help.
I rebooted.
I changed the output directory to a clean one on the main volume.
Tried on a completely new dataset, which happens to have MBME RS scans.
Tried switching from --output-type censored to interpolate because the error was in Node censor_report.

I’m baffled because I didn’t change anything. I had successfully tested one participant and was just about to start the batch run when this error started happening.

Also (from the container)

Apptainer> python
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib.font_manager
>>> matplotlib.font_manager.findSystemFonts()
['/usr/share/fonts/truetype/dejavu/DejaVuSerif-Bold.ttf', 
 '/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf', 
 '/usr/share/fonts/truetype/dejavu/DejaVuSerif.ttf', 
 '/usr/share/fonts/truetype/dejavu/DejaVuSansMono-Bold.ttf', 
 '/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf',
 '/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf']

Trevor_Day · September 2, 2025, 4:36pm

I was eventually able to get this to work by emptying my ~/.cache, which I then emulated by creating an empty cache for each run. I’m assuming the issue was with ~/.cache/fontconfig. I had to keep a cache around to store the TemplateFlow files.

That worked for one participant, but then crashed on the next.

However, @Steven it now crashes on the regress_and_filter_bold step, without writing to the output log file at all. I’m baffled again. I’ve rebooted, and made sure there is room on the drive where $wkdir is. I’ve also added --mem-mb and --low-mem flags, but I’m not getting a message from the OOM handler anyway.

250902-15:09:19,683 nipype.workflow INFO:
		 [Node] Executing "regress_and_filter_bold" <xcp_d.interfaces.nilearn.DenoiseNifti>

Current call:

apptainer run --compat \
    -B ${fmri_dir}:/fmri_dir/:ro                \
    -B ${output_dir}:/output/                   \
    -B "${wkdir}":/wkdir/                       \
    -B "${cache}":/home/tkmday/.cache           \
    ${xcpd}                                     \
        /fmri_dir/ /output/ participant         \

What’s odder is that it kills my terminal, and I can’t imagine why. Any ideas?

The only other piece of information I can come up with is this is an older rev of Ubuntu, 22.04.

Steven · September 2, 2025, 4:49pm

Are you running this directly in a terminal or through a scheduler (e.g. SLURM via sbatch)? If you aren’t using a scheduler, I’d recommend using one.

Trevor_Day · September 2, 2025, 4:50pm

I don’t have access to a scheduler.

Steven · September 2, 2025, 4:56pm

Can you start an interactive session in your terminal with srun?

Trevor_Day · September 2, 2025, 4:57pm

No, I’d have to install SLURM on this single machine. If you think that’ll help, I can.

Steven · September 2, 2025, 4:58pm

Ah I didn’t realize you were not on a cluster. Is it possible to move to a cluster?

Trevor_Day · September 2, 2025, 4:58pm

No, not really. (If it was, I would have.)

I’ve run XCP-D on this machine before, and I’m really confused why it’s not working now.

Steven · September 2, 2025, 7:31pm

Can you have the command print the output to a text file? You can redirect it with > output.txt or something like that.
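
One caveat with a plain > redirect is that it only captures stdout, so anything written to stderr is lost unless 2>&1 is added. As a rough sketch of the same idea from Python, the helper below (run_and_log is a hypothetical name, and the demo command is a stand-in for the real apptainer call) merges both streams into one log file and returns the exit code:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(cmd, log_path):
    """Run cmd, writing stdout and stderr to log_path; return the exit code."""
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT merges both streams, like `> file 2>&1`.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Stand-in command for illustration only; swap in the real apptainer call.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]
log_file = Path(tempfile.gettempdir()) / "xcpd_output.txt"
code = run_and_log(demo_cmd, log_file)
print(code)
print(log_file.read_text())
```

On Linux, subprocess reports a negative return code when the child was terminated by a signal (for example -9 for SIGKILL), which can help tell a kill apart from an ordinary error exit.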

Trevor_Day · September 2, 2025, 7:36pm

Here you go:

250902-15:09:19,679 nipype.workflow DEBUG:
	 output: confounds_tsv
250902-15:09:19,679 nipype.workflow DEBUG:
	 [Node] regress_and_filter_bold - setting input confounds_tsv = /wkdir/xcp_d_0_11_wf/sub_CLB00008_ses__ses__wf/postprocess_0_wf/prepare_confounds_wf/generate_confounds/desc-confounds_timeseries.tsv
250902-15:09:19,679 nipype.workflow DEBUG:
	 output: confounds_images
250902-15:09:19,679 nipype.workflow DEBUG:
	 [Node] regress_and_filter_bold - setting input confounds_images = []
250902-15:09:19,679 nipype.utils DEBUG:
	 Loading pkl: /wkdir/xcp_d_0_11_wf/sub_CLB00008_ses__ses__wf/postprocess_0_wf/prepare_confounds_wf/process_motion/result_process_motion.pklz
250902-15:09:19,680 nipype.workflow DEBUG:
	 Resolving paths in outputs loaded from results file.
250902-15:09:19,680 nipype.workflow DEBUG:
	 output: temporal_mask
250902-15:09:19,680 nipype.workflow DEBUG:
	 [Node] regress_and_filter_bold - setting input temporal_mask = /wkdir/xcp_d_0_11_wf/sub_CLB00008_ses__ses__wf/postprocess_0_wf/prepare_confounds_wf/process_motion/desc-fd_outliers.tsv
250902-15:09:19,680 nipype.utils DEBUG:
	 Loading pkl: /wkdir/xcp_d_0_11_wf/sub_CLB00008_ses__ses__wf/postprocess_0_wf/downcast_data/result_downcast_data.pklz
250902-15:09:19,681 nipype.workflow DEBUG:
	 Resolving paths in outputs loaded from results file.
250902-15:09:19,681 nipype.workflow DEBUG:
	 output: bold_mask
250902-15:09:19,681 nipype.workflow DEBUG:
	 [Node] regress_and_filter_bold - setting input mask = /fmri_dir/sub-CLB00008/func/sub-CLB00008_task-rest_space-MNI152NLin6Asym_res-4_desc-brain_mask.nii.gz
250902-15:09:19,682 nipype.utils DEBUG:
	 Removing contents of /wkdir/xcp_d_0_11_wf/sub_CLB00008_ses__ses__wf/postprocess_0_wf/denoise_bold_wf/regress_and_filter_bold
250902-15:09:19,682 nipype.workflow DEBUG:
	 [Node] Writing pre-exec report to "/wkdir/xcp_d_0_11_wf/sub_CLB00008_ses__ses__wf/postprocess_0_wf/denoise_bold_wf/regress_and_filter_bold/_report/report.rst"
250902-15:09:19,683 nipype.workflow INFO:
	 [Node] Executing "regress_and_filter_bold" <xcp_d.interfaces.nilearn.DenoiseNifti>

Steven · September 2, 2025, 7:39pm

Is there a resource monitor on your machine you can use to track memory/cpu usage?
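
If no graphical monitor is handy, a small script can log available memory while the job runs. This is only a sketch that assumes a Linux machine exposing /proc/meminfo; parse_meminfo and log_available_memory are hypothetical helper names, and the sample text is made up for illustration:

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value} (most values are in kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def log_available_memory(log_path, interval_s=1.0, samples=5,
                         meminfo_path="/proc/meminfo"):
    """Append timestamped MemAvailable readings (in MB) to log_path."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            with open(meminfo_path) as f:
                info = parse_meminfo(f.read())
            available_mb = info.get("MemAvailable", 0) // 1024
            log.write(f"{time.strftime('%H:%M:%S')} MemAvailable={available_mb} MB\n")
            # Push each reading out of Python's buffer so it is not lost
            # if this process is killed along with the session.
            log.flush()
            time.sleep(interval_s)


# Made-up sample text to show the parsing step without touching /proc.
sample = "MemTotal:       32780000 kB\nMemAvailable:    1024000 kB\n"
print(parse_meminfo(sample)["MemAvailable"] // 1024, "MB available")
```

Running log_available_memory in a second terminal, with a larger samples value, leaves a record on disk of how memory was trending right before a crash.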

Trevor_Day · September 9, 2025, 9:35pm

Can confirm the new issue isn’t related to the old fonts issue.

I did end up solving the fonts issue by creating a fake empty cache for every run, which duplicates a little download time, but it’s not the end of the world. I think you could bind the templateflow cache back over the fake cache.
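
A minimal sketch of that idea, untested against a real container: build the -B arguments so each run gets a fresh empty cache, with a persistent TemplateFlow directory bound on top of it. The helper name cache_bind_args and the host path are hypothetical, and it assumes TemplateFlow inside the container reads from .cache/templateflow under the home directory used in the call above:

```python
import tempfile
from pathlib import Path


def cache_bind_args(templateflow_dir, container_home="/home/tkmday"):
    """Return apptainer -B arguments for a fresh cache plus TemplateFlow."""
    fresh_cache = Path(tempfile.mkdtemp(prefix="xcpd_cache_"))
    # Create the mount point inside the fresh cache so the nested bind
    # has a target directory to attach to.
    (fresh_cache / "templateflow").mkdir()
    return [
        "-B", f"{fresh_cache}:{container_home}/.cache",
        "-B", f"{templateflow_dir}:{container_home}/.cache/templateflow",
    ]


# Hypothetical host path; point this at wherever the templates are kept.
bind_args = cache_bind_args("/path/to/persistent/templateflow")
print(bind_args)
```

The two pairs would replace the single "${cache}" bind in the call, so the font cache starts empty on every run while the templates are downloaded only once.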

It turns out that MNI152NLin6Asym res-4 is 0.7 mm isotropic, unlike MNI152NLin2009cAsym res-4, which is 4 mm. So when I switched templates, my BOLD files were blowing up in size and causing a memory problem.
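
To put rough numbers on that, going from 4 mm to 0.7 mm voxels multiplies the voxel count by about (4 / 0.7)^3, or roughly 187. The sketch below estimates the uncompressed size of a 4D series at each resolution; the field of view, the number of volumes, and the float32 data type are assumptions chosen for illustration, not values from my data:

```python
import math


def voxels_per_volume(fov_mm, voxel_mm):
    """Number of voxels in a grid covering fov_mm at isotropic voxel_mm."""
    count = 1
    for extent in fov_mm:
        count *= math.ceil(extent / voxel_mm)
    return count


def series_size_gb(fov_mm, voxel_mm, n_volumes, bytes_per_value=4):
    """Uncompressed size in GB of a 4D series (float32 by default)."""
    n_values = voxels_per_volume(fov_mm, voxel_mm) * n_volumes
    return n_values * bytes_per_value / 1e9


fov = (180, 216, 180)  # assumed field of view in mm, for illustration
n_vols = 300           # assumed number of volumes
for size in (4.0, 0.7):
    print(f"{size} mm: {series_size_gb(fov, size, n_vols):.2f} GB")
print(f"voxel-count ratio: {(4.0 / 0.7) ** 3:.0f}x")
```

Under these assumptions a series that takes a fraction of a gigabyte at 4 mm needs tens of gigabytes at 0.7 mm, before counting any intermediate copies made during denoising.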

Everything is working now.