The Natural Scenes Dataset provides raw data as well as beta coefficients from a ridge regression. For a given individual in this dataset, the array of beta coefficients has size `(V, N)`, where `V` is the number of voxels and `N` the number of images. Each of the `N` images was presented several times to a given individual, so several trials can be leveraged to compute these coefficients.

I don’t understand how these coefficients were obtained.

As stated in their research paper, there are 3 ingredients:

- as a first step, each voxel is associated with a hemodynamic response function (HRF) selected from a set of 20 possible HRFs. Explicitly, how does one determine which HRF best fits the activity of a given voxel?
- as a second step, GLMdenoise is used to compute beta coefficients; however, in my understanding, GLMdenoise takes a design matrix as input and outputs the shape of the HRF as well as beta coefficients. Since the first step already determines the best-fitting HRF for each voxel, why use GLMdenoise instead of a classical approach? More generally, I don’t understand how one can solve this regression problem when the HRF is different for each voxel; doesn’t it lead to extremely large matrices?
- they use a different solver for the ridge regression problem
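
To make the question concrete, here is a minimal sketch of how I picture the first step. It is only my guess, not the dataset's actual pipeline: I assume each candidate HRF is used to build one design matrix shared by all voxels, an ordinary least-squares fit is run for every voxel at once, and each voxel keeps the HRF with the smallest residual error. The function names, the gamma-like HRF library, and the simulated data are all hypothetical.

```python
import numpy as np

def make_hrf_library(n_hrfs=20, length=30):
    """Hypothetical stand-in for an HRF library: gamma-like curves with different peak times."""
    t = np.arange(length, dtype=float)
    peaks = np.linspace(4.0, 9.0, n_hrfs)
    lib = np.stack([(t ** p) * np.exp(-t) for p in peaks])  # each curve peaks near t = p
    return lib / lib.max(axis=1, keepdims=True)             # normalize peak height to 1

def convolve_design(onsets, hrf):
    """Convolve each onset column of a (T, C) matrix with one HRF, truncated to T samples."""
    T = onsets.shape[0]
    return np.stack([np.convolve(col, hrf)[:T] for col in onsets.T], axis=1)

def select_hrf_per_voxel(data, onsets, library):
    """data: (T, V), onsets: (T, C), library: (H, L).

    Returns the index of the best HRF per voxel, shape (V,), and the
    corresponding betas, shape (V, C).
    """
    T, V = data.shape
    H, C = library.shape[0], onsets.shape[1]
    sse = np.empty((H, V))
    betas = np.empty((H, C, V))
    for h, hrf in enumerate(library):
        X = convolve_design(onsets, hrf)                  # one (T, C) design per HRF
        b, *_ = np.linalg.lstsq(X, data, rcond=None)      # fits all V voxels at once
        betas[h] = b
        sse[h] = ((data - X @ b) ** 2).sum(axis=0)        # residual error per voxel
    best = sse.argmin(axis=0)                             # best HRF index per voxel
    return best, betas[best, :, np.arange(V)]

# Noiseless simulation: each voxel is generated with its own HRF from the library.
rng = np.random.default_rng(0)
T, C, V = 300, 5, 8
library = make_hrf_library()
onsets = np.zeros((T, C))
for c in range(C):
    onsets[rng.choice(T - 40, size=6, replace=False), c] = 1.0
true_idx = rng.integers(0, len(library), size=V)
true_betas = rng.normal(size=(V, C))
data = np.stack(
    [convolve_design(onsets, library[i]) @ b for i, b in zip(true_idx, true_betas)],
    axis=1,
)
best, est = select_hrf_per_voxel(data, onsets, library)
```

If the first step works roughly like this, only 20 design matrices of size `(T, C)` are ever built, each shared across voxels, so nothing grows with `V` except the data and the betas. I am not sure this is what was actually done, hence the question.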

Thank you a lot for your help!