Regularized Ensemble of models

Where can I find the code for this article? :sleepy: @GaelVaroquaux @bthirion
I ask because I don’t understand the algorithm.
I have a matrix (nb_subject x nb_parcel) and a vector Y.
I want to apply the method used in the article to fit my model with the ensembling method.
Thank you very much.

Hi, this is now partly implemented in the Nilearn Decoder object (only partly, because it does not have the clustering part, if I’m not mistaken).
It is in Nilearn master, but not yet released.



Thank you very much. But how can I build the feature-grouping matrix Φ(j) using FeatureAgglomeration?
FeatureAgglomeration returns a matrix of shape nb_subjects * n_parcels, but the feature-grouping matrix must be of shape nb_voxels * nb_parcels …

How can I do this, please?
Thank you.

I tried to implement the function presented in the article. Can you please tell me whether it is correct?

X_red : is the output of FeatureAgglomeration, X_red of shape (nb_subjects *
X : is the data of shape (nb_subjects * nb_voxels)

def _ensembling(X, X_red, y):
    Φ = X.T @ X_red  # to get the feature-grouping matrix
    Φ = preprocessing.normalize(Φ, norm='l2')

    ridge = BayesianRidge()
    feature_selection = SelectPercentile(f_regression)
    anova_LR = Pipeline([('anova', feature_selection), ('ridge', ridge)])

    grid = GridSearchCV(anova_LR, param_grid={'anova__percentile': [5, 10, 20]},
                        verbose=1, cv=3, n_jobs=1)
    grid.fit(X_red, y)  # set the best parameters
    coef_ = grid.best_estimator_.steps[-1][1].coef_
    w_best_ = grid.best_estimator_.steps[0][1].inverse_transform(coef_.reshape(1, -1))
    w_aprox = w_best_ @ Φ.T
    return w_aprox

Each call returns w_aprox, of shape (1, nb_voxels). I then call this function b times and collect the results in liste, which at the end is of shape (b, nb_voxels).

Finally, I compute the average over the b estimators.

So the output is an array of shape (1, nb_voxels).
Is this correct?
thank you very much @bthirion

To generate Φ, I’d rather rely on the labeling of the voxels: expand the labels into a one-hot encoding matrix, and then normalize the columns. Your formula will be inaccurate if the number of samples is small.

Also, I’d use RidgeCV rather than BayesianRidge, as it is more reliable numerically.
Otherwise, I think you got the point.
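
As a rough illustration of the one-hot construction described above, here is a minimal sketch on toy data. The array sizes, the random data and the variable names are assumptions made for the example. Dividing each column by the parcel size is one possible reading of "normalize the columns"; it makes X @ Phi return parcel means.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.default_rng(0)
n_subjects, n_voxels, n_parcels = 20, 50, 5   # toy sizes (assumption)
X = rng.standard_normal((n_subjects, n_voxels))

# Cluster the voxels (features); labels_ holds one parcel label per voxel.
agglo = FeatureAgglomeration(n_clusters=n_parcels).fit(X)
labels = agglo.labels_                         # shape (n_voxels,)

# One-hot encoding: Phi[v, k] = 1 if voxel v belongs to parcel k.
Phi = np.zeros((n_voxels, n_parcels))
Phi[np.arange(n_voxels), labels] = 1.0

# Normalize each column by the parcel size (assumed normalization).
Phi /= Phi.sum(axis=0, keepdims=True)

# Reduced data: one mean value per parcel and subject.
X_red = X @ Phi                                # shape (n_subjects, n_parcels)
```

With this normalization, X_red should match what FeatureAgglomeration.transform returns with its default mean pooling, while Phi itself does not depend on the number of samples.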

We should expose such a method in Nilearn, but this is not going to happen in the near future.


Thank you very, very much.
I tried to do this:
label is a list of n_parcellations arrays: label[0] contains the array of labels of the first parcellation, label[1] contains the labels of the second parcellation … (each array has shape (1, 170006))

def Φ():
    Phi = []
    for i in range(n_parcellations):
        a = label[i]  # "a" of shape (1, 170006)
        b = np.zeros((a.size, a.max() + 1))
        b[np.arange(a.size), a] = 1
        Phi.append(b)  # to save the Φ of each parcellation
    return Phi

But I immediately get a memory error when I execute this function (in the first iteration).
Where is the problem?
Thank you @bthirion

You should use a sparse matrix. Here is some code borrowed from

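import numpy as np
from scipy.sparse import coo_matrix, dia_matrix

# labels: 1-D integer array with one cluster label per voxel
# n_components: number of clusters in the parcellation
n_voxels = len(labels)

# sparse (n_components, n_voxels) matrix with a 1 where a voxel belongs to a cluster
incidence = coo_matrix(
        (np.ones(n_voxels), (labels, np.arange(n_voxels))),
        shape=(n_components, n_voxels), dtype=np.float32).tocsc()

# diagonal matrix holding 1 / (number of voxels in each cluster)
inv_sum_col = dia_matrix(
        (np.array(1. / incidence.sum(axis=1)).squeeze(), 0),
        shape=(n_components, n_components))

# divide each row by its cluster size, so that each row sums to 1
incidence = inv_sum_col * incidence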

The incidence matrix is Phi.
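
A minimal sketch of how this sparse Phi might be used, on a toy labeling of 6 voxels in 3 parcels. The labels, the data X and the parcel-level weights w_red are made up for illustration. Here Phi has shape (n_components, n_voxels), so it is applied as Phi @ X.T to reduce the data and as Phi.T to map weights back to voxels.

```python
import numpy as np
from scipy.sparse import coo_matrix, dia_matrix

labels = np.array([0, 0, 1, 2, 2, 2])          # toy labeling (assumption)
n_voxels = len(labels)
n_components = labels.max() + 1                # labels assumed to run from 0

# Same construction as in the snippet above.
incidence = coo_matrix(
    (np.ones(n_voxels), (labels, np.arange(n_voxels))),
    shape=(n_components, n_voxels), dtype=np.float32).tocsc()
inv_sum_col = dia_matrix(
    (np.array(1. / incidence.sum(axis=1)).squeeze(), 0),
    shape=(n_components, n_components))
Phi = inv_sum_col * incidence                  # sparse, rows sum to 1

# Reduce the data: each column of X_red is the mean signal of one parcel.
X = np.arange(12, dtype=float).reshape(2, 6)   # 2 subjects, 6 voxels
X_red = (Phi @ X.T).T                          # shape (2, 3)

# Map hypothetical parcel-level weights back to voxel space.
w_red = np.array([[1.0, 2.0, 3.0]])
w_voxels = (Phi.T @ w_red.T).T                 # shape (1, 6)
```

Because only one entry per voxel is stored, the memory cost grows with the number of voxels rather than with n_voxels times n_components.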

Thank you very much.
One question, please: can RidgeCV(alphas=(…)) replace lines 7, 8, and 9 to find the best model?
Thank you.
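
For reference, a hedged sketch of what an ensemble built on RidgeCV could look like. This is not the article's code: the helper names one_hot_phi and fit_ensemble, the alpha grid, the toy data and the random parcellations are all assumptions. RidgeCV picks the ridge penalty among the given alphas by its built-in cross-validation, so no separate grid search over alpha is needed; it does not perform the ANOVA percentile selection, which would still need its own search if kept.

```python
import numpy as np
from sklearn.linear_model import RidgeCV


def one_hot_phi(labels, n_parcels):
    """Dense (n_voxels, n_parcels) grouping matrix whose columns sum to 1."""
    Phi = np.zeros((labels.size, n_parcels))
    Phi[np.arange(labels.size), labels] = 1.0
    return Phi / Phi.sum(axis=0, keepdims=True)


def fit_ensemble(X, y, Phis, alphas=(0.1, 1.0, 10.0)):
    """Average voxel-level weights over several parcellations."""
    weights = []
    for Phi in Phis:
        X_red = X @ Phi                              # parcel means
        model = RidgeCV(alphas=alphas).fit(X_red, y)  # alpha chosen internally
        weights.append(Phi @ model.coef_)            # back to voxel space
    return np.mean(weights, axis=0)


# Toy data; random labelings stand in for real parcellations (assumption).
rng = np.random.default_rng(0)
n_subjects, n_voxels, n_parcels, b = 30, 40, 8, 5
X = rng.standard_normal((n_subjects, n_voxels))
y = X[:, :4].sum(axis=1) + 0.1 * rng.standard_normal(n_subjects)

Phis = [one_hot_phi(rng.permutation(np.arange(n_voxels) % n_parcels), n_parcels)
        for _ in range(b)]
w = fit_ensemble(X, y, Phis)
print(w.shape)  # (40,)
```

The dense Phi is only for readability at toy size; with about 170000 voxels a sparse matrix would be needed instead.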