In nilearn, one can do cross-validation on parameters for the estimator by setting the `param_grid` parameter.

For instance, here’s a brief example trying different regularization parameters for the SVR regressor:

```
from nilearn.decoding import DecoderRegressor

regressor = DecoderRegressor(
    estimator="svr",
    param_grid={"C": [0.04, 0.2, 1]},
    standardize=True,
    cv=cv_inner,
    scoring="r2",
)
regressor.fit(X=train_X, y=train_y, groups=train_groups)
```

If you want to do nested cross-validation, then you can wrap this within a train/test split like:

```
test_scores = []
for train_idx, test_idx in cv.split(X, y, groups=groups):
    train_X, test_X = X[train_idx], X[test_idx]
    train_y, test_y = y[train_idx], y[test_idx]
    regressor.fit(X=train_X, y=train_y, groups=groups[train_idx])
    test_scores.append(regressor.score(test_X, test_y))
```
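
For comparison, here is the same outer-loop pattern sketched in plain scikit-learn, with `GridSearchCV` standing in for the inner search that `DecoderRegressor` runs over `param_grid` (the synthetic data and the plain `SVR` are my assumptions, just to make the sketch self-contained and runnable):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in data (assumption, only so the sketch runs).
X, y = make_regression(n_samples=60, n_features=20, noise=0.5, random_state=0)

outer_cv = KFold(n_splits=3, shuffle=True, random_state=0)
inner_cv = KFold(n_splits=3, shuffle=True, random_state=1)

# Inner search over C, analogous to DecoderRegressor's param_grid.
regressor = GridSearchCV(
    SVR(),
    param_grid={"C": [0.04, 0.2, 1]},
    cv=inner_cv,
    scoring="r2",
)

test_scores = []
for train_idx, test_idx in outer_cv.split(X):
    # The inner CV picks C using only the outer-fold training data.
    regressor.fit(X[train_idx], y[train_idx])
    # The held-out outer fold gives an unbiased score for this fold.
    test_scores.append(regressor.score(X[test_idx], y[test_idx]))

print(len(test_scores))  # one score per outer fold
```

A final `regressor.fit(X, y)` on the full dataset would then rerun the inner search once more to pick a `C` for the final model.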

This will help us get an unbiased test estimate across the whole dataset while also not double-dipping when selecting the right parameter (I may need to add one extra fit at the end across the training and test data, with the best selected `C`, to get a final unbiased estimate).

`DecoderRegressor` is a wrapper for estimators, but `DecoderRegressor` itself also has parameters that could be tuned, for instance `screening_percentile`. These cannot simply be passed to `param_grid`, because `param_grid` values are passed on to the underlying estimator itself and are not used by `DecoderRegressor` for its own parameters.

To search hyperparameters for `DecoderRegressor` itself, one can typically do something like this Haxby example: Nilearn: Statistical Analysis for NeuroImaging in Python — Machine learning for NeuroImaging.

Applying that example to my pseudocode above might look like:

```
test_scores = []
for train_idx, test_idx in cv.split(X, y, groups=groups):
    train_X, test_X = X[train_idx], X[test_idx]
    train_y, test_y = y[train_idx], y[test_idx]
    for sp in [1, 10, 100]:
        # screening_percentile is a constructor parameter of
        # DecoderRegressor, so set it on the object rather than
        # passing it to fit()
        regressor.set_params(screening_percentile=sp)
        regressor.fit(X=train_X, y=train_y, groups=groups[train_idx])
        test_scores.append(regressor.score(test_X, test_y))
```

My question is, **how to integrate these two approaches so you’re not doing more cross-validation than necessary**?
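
For reference, here is how the consistent treatment might look in plain scikit-learn: both hyperparameters go into a single inner grid, and the outer loop only measures generalization. I'm using `SelectPercentile` as a stand-in for `screening_percentile` (that substitution is my assumption, not nilearn's actual implementation), and the data are synthetic:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectPercentile, f_regression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# Synthetic stand-in data (assumption, only so the sketch runs).
X, y = make_regression(n_samples=60, n_features=20, noise=0.5, random_state=0)

pipe = Pipeline([
    ("screen", SelectPercentile(f_regression)),  # stand-in for screening_percentile
    ("svr", SVR()),
])

# Both hyperparameters are searched by the same inner CV.
grid = GridSearchCV(
    pipe,
    param_grid={
        "screen__percentile": [10, 50, 100],
        "svr__C": [0.04, 0.2, 1],
    },
    cv=KFold(n_splits=3, shuffle=True, random_state=1),
    scoring="r2",
)

# The outer loop never influences the hyperparameter choice;
# it only scores the whole fitted procedure on held-out folds.
test_scores = cross_val_score(
    grid, X, y, cv=KFold(n_splits=3, shuffle=True, random_state=0)
)
print(test_scores.shape)  # (3,)
```

This gives exactly two levels of splitting: the inner CV chooses both parameters jointly, and the outer CV estimates the test score of that selection procedure.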

Now the above is optimizing over both `C` and `screening_percentile`, but not in a consistent manner. The `regressor` object as defined above includes the search over three different values of `C`, and if passed through the above loop, it also tries three different values of `sp`.

But I'm not treating them entirely consistently. I'm getting a new `test_score` on the test data for each `sp`. In contrast, what's happening to `C` is 'under the hood', so to speak: it doesn't use the outer-loop test group to estimate its value. Rather, if I understand `DecoderRegressor` correctly, there's an inner train/test split that is used to compare different values. `screening_percentile` should be treated the same way, but I don't know how to do that without adding a *third* level of train/test split. So there are inconsistent approaches for `C` and for `sp`, but both can't be right.

Or possibly my understanding of what's going on inside `DecoderRegressor` is wrong. If it iterates through the different values in `param_grid` and, within each, fits across the entire training set, then perhaps the treatment of `screening_percentile` and `C` would be equivalent. But I don't think that's right, because there's some CV inside `DecoderRegressor` as well.

Is there anything I'm missing here? And if what I've done above is not correct, what is the correct way to do this?