I have recently run the XCP-D pipeline (stable release 0.4) over a set of preprocessed RS fMRI scans (via fMRIPrep v22.1.1). Main XCP-D arguments used:
No bandpass filtering settings were explicitly set. I’ve just used the default filtering parameters (i.e. bandpass range 0.01-0.08 Hz with a order 2 Butterworth filter).
I have finally had some time to put on this issue to do some more tests.
Apparently, it looks like -as you @tsalo suggested- it might be related to the bandpass filtering step.
The high-motion volumes identified to be excluded from the BOLD data as outliers in the temporal censoring stage, are replaced by interpolations of the BOLD signal at the bandpass filtering step.
This interpolation procedure might be in my opinion what produces such bad outcomes for certain cases. Particularly, cases where a substantial amount of adjacent volumes are being removed close to the beginning/end of the whole time-series.
If I run xcp-d disabling the band-pass filter for those affected cases, the results look much closer to the expected ones but of course I have unfiltered data.
I wonder what would you recommend at this stage, perhaps find a better and more resilient interpolation approach or simply discard the bandpass filtering step of the BOLD signal?
Thanks for the detailed report. So it seems like XCP-D should just not perform interpolation for censored volumes at the beginning and end of the time series. Does that sound right to you?
If you want to try out different bandpass filtering options, you can run XCP-D 0.5.0 with --dcan-qc and --disable-bandpass-filter, which will produce desc-interpolated_bold files that include interpolated data, but without bandpass filtering. Then you can do what you want to the data before applying a temporal filter.
It definitely looks like when the interpolation goes wrong, there are censored volumes at the beginning/end of the timeseries (even if there’s only one single volume).
Minimally understanding the process underneath this (cubic spline) interpolation procedure I suspect that, for those troubled cases, the interpolation becomes an extrapolation process instead. That is, asking to predict values outside the range of data available (instead of finding intermediate values within as expected).
My guess is (disclaimer: I haven’t look at the interpolation code) that cubic splines are not suited to extrapolate beyond the data set, the spline might not follow the trend of the points as seen in our use-case were it goes bananas.
Hopefully this issue can be fixed in the XCP-D codebase adding a more robust interpolation method or improving the former one. That would be the ideal solution to us, otherwise we will search ways to workaround it. One being what you suggested about working with the desc-interpolated_bold files. Another option we internally discussed could be adding cosine-related regressors (available via fMRIPrep) to the list of nuisance regressors and skip temporal filtering and, hence, the interpolation.
Rather than support extrapolation, I think the best/easiest thing would be to just not interpolate high-motion volumes if they’re at the beginning/end of the run.
Luckily, there is -in my opinion- an easy fix following your approach. CubicSpline class has a parameter for enabling extrapolation set by default to True which I think could be set to False:
from scipy.interpolate import CubicSpline
cubic_spline_fitter = CubicSpline(remained_vol, remained_x, extrapolate=False)
Some quick example, showing an original interpolation of first 5 volumes VERSUS an interpolation with extrapolate parameter to False,
Original data,
Interpolation as is (with extrapolation param. enabled by default),
Interpolation with extrapolation param. disabled,
Perhaps I am getting too technical, I can place this post in the Github issue you just opened if you prefer.