Hello all!
I am working with the new Nilearn decoding.Decoder object, and I cannot figure out when the ANOVA feature selection is applied. Is the ANOVA run once on the full data set, BEFORE the cross-validation split, or is it run separately within each cross-validation fold? My understanding is that running it on all the data up front would introduce a peeking bias and inflate the accuracy, whereas running it within each fold would avoid looking at the held-out data. Thanks!
Hello, it uses SelectPercentile from scikit-learn. You are entirely right that performing the feature selection on the whole data set would be a mistake: the held-out data would influence which features are kept, so the cross-validated scores would be optimistically biased. The Decoder object performs the feature selection separately for each fold, using only the training data.
The feature selection happens in the _parallel_fit function, and as you can see there, the selector is fitted using only X_train and y_train. The selector is a sklearn.feature_selection.SelectPercentile, and its score_func parameter is sklearn.feature_selection.f_classif or sklearn.feature_selection.f_regression, depending on whether the decoding task is a classification or a regression problem.
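To make the difference concrete, here is a minimal sketch using plain scikit-learn, not Nilearn's actual code. It compares selecting features once on all the data with selecting them inside each fold from the training samples only. The synthetic pure-noise data, the LogisticRegression classifier, the 1% percentile, and the helper name cv_accuracy are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Pure-noise data: there is no real signal, so an unbiased
# accuracy estimate should hover around chance (0.5).
rng = np.random.RandomState(0)
X = rng.randn(40, 2000)
y = np.repeat([0, 1], 20)


def cv_accuracy(X, y, select_inside_folds, percentile=1):
    """Hypothetical helper: mean CV accuracy with ANOVA feature selection."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    if not select_inside_folds:
        # Leaky variant: the selector sees every sample, including
        # those that will later be used as test data.
        X = SelectPercentile(f_classif, percentile=percentile).fit_transform(X, y)
    scores = []
    for train, test in cv.split(X, y):
        X_train, X_test = X[train], X[test]
        y_train, y_test = y[train], y[test]
        if select_inside_folds:
            # Per-fold variant: fit the selector on the training split only,
            # then apply the same feature mask to the test split.
            selector = SelectPercentile(f_classif, percentile=percentile)
            selector.fit(X_train, y_train)
            X_train = selector.transform(X_train)
            X_test = selector.transform(X_test)
        clf = LogisticRegression().fit(X_train, y_train)
        scores.append(clf.score(X_test, y_test))
    return float(np.mean(scores))


leaky_score = cv_accuracy(X, y, select_inside_folds=False)
clean_score = cv_accuracy(X, y, select_inside_folds=True)
print("selection on all data:", leaky_score)
print("selection per fold:   ", clean_score)
```

On noise like this, the leaky variant should report an accuracy well above chance, while the per-fold variant should stay close to it. The per-fold variant is the pattern described above, where the selector is fitted on X_train and y_train only.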
Perfect, thank you so much!