In nilearn, one can cross-validate over hyperparameters of the underlying estimator by setting the `param_grid` argument. For instance, here's a brief example trying different regularization parameters for the SVR regressor:
```python
from nilearn.decoding import DecoderRegressor

regressor = DecoderRegressor(
    estimator='svr',
    param_grid={'C': [0.04, 0.2, 1]},
    standardize=True,
    cv=cv_inner,
    scoring='r2',
)
regressor.fit(X=train_X, y=train_y, groups=train_groups)
```
If you want to do nested cross-validation, then you can wrap this within a train/test split like:
```python
test_scores = []
for train_idx, test_idx in cv.split(X, y, groups=groups):
    train_X, test_X = X[train_idx], X[test_idx]
    train_y, test_y = y[train_idx], y[test_idx]
    regressor.fit(X=train_X, y=train_y, groups=groups[train_idx])
    test_score = regressor.score(test_X, test_y)
    test_scores.append(test_score)
```
This gives an unbiased test estimate across the whole dataset while also not double-dipping when selecting the right parameter. (I may need to add one extra fit at the end, across the combined training and test data, with the best selected `C`, to produce the final model.)
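To check that this pattern is what I mean, here is a runnable sketch using plain scikit-learn as a stand-in for `DecoderRegressor` (the `GridSearchCV`-over-`SVR` setup and the synthetic data are just illustrative; the nested-CV logic is the same):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=60)

# Inner CV: picks C using splits of the training portion only
inner = GridSearchCV(SVR(), param_grid={'C': [0.04, 0.2, 1]},
                     cv=KFold(3), scoring='r2')

# Outer CV: each fold's test data is touched exactly once, for scoring
test_scores = []
for train_idx, test_idx in KFold(5).split(X):
    inner.fit(X[train_idx], y[train_idx])
    test_scores.append(inner.score(X[test_idx], y[test_idx]))

# One extra fit on all the data to get the final model to actually use
inner.fit(X, y)
final_model = inner.best_estimator_
```

The mean of `test_scores` estimates the performance of the whole procedure (including the inner selection of `C`), which is why the outer test folds are never used to pick hyperparameters.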
`DecoderRegressor` is a wrapper for estimators, but `DecoderRegressor` itself has parameters that can be tuned, for instance `screening_percentile`. These cannot simply be passed through `param_grid`, because `param_grid` values are forwarded to the underlying estimator and are not used by `DecoderRegressor` for its own parameters. To search hyperparameters of `DecoderRegressor` itself, one typically does something like this Haxby example: Nilearn: Statistical Analysis for NeuroImaging in Python — Machine learning for NeuroImaging
Applying that example to my pseudocode above might look like:
```python
test_scores = []
for train_idx, test_idx in cv.split(X, y, groups=groups):
    train_X, test_X = X[train_idx], X[test_idx]
    train_y, test_y = y[train_idx], y[test_idx]
    for sp in [1, 10, 100]:
        # screening_percentile is a constructor argument, not a fit() argument
        regressor = DecoderRegressor(
            estimator='svr', param_grid={'C': [0.04, 0.2, 1]},
            screening_percentile=sp,
            standardize=True, cv=cv_inner, scoring='r2',
        )
        regressor.fit(X=train_X, y=train_y, groups=groups[train_idx])
        test_score = regressor.score(test_X, test_y)
        test_scores.append(test_score)
```
My question is: how can these two approaches be integrated so that I'm not doing more cross-validation than necessary?
The above is now optimizing over both `C` and `screening_percentile`, but not in a consistent manner. The `regressor` object as defined above already includes the search over three different values of `C`, and when passed through the above loop it also tries three different values of `sp`.
But I'm not treating the two consistently. I get a new `test_score` on the outer test data for each value of `sp`. In contrast, what happens to `C` is 'under the hood', so to speak: it doesn't use the outer-loop test group to choose its value. Rather, if I understand `DecoderRegressor` correctly, there's an inner train/test split being used to compare the different values. `screening_percentile` should be treated the same way, but I don't know how to do that without adding a third level of train/test split. So `C` and `sp` are handled by different approaches, and both can't be right.
Or possibly my understanding of what goes on inside `DecoderRegressor` is wrong. If it iterated through the values of `param_grid` and, within each, fit across the entire training set, then perhaps the treatment of `screening_percentile` and `C` would be equivalent. But I don't think that's right, because there is some CV inside `DecoderRegressor` as well.
Is there anything I'm missing here? And if what I've done above is not correct, what is the correct way to do this?