NMA provided Kay dataset questions

sean.byrne · July 22, 2020, 5:10pm

My students had the following questions. Can anybody help us out?

In order to apply the CNN and other neural nets to this data, we need to know whether the ‘responses’ and ‘responses_test’ are ‘super-cleaned’ in the sense that they represent the z-scored BOLD responses, per voxel, for each picture stimulus only, for each ROI voxel, i.e.,
a) the time series information has been removed, including the delays between stimulus presentations, so that this data is just the z-scored BOLD specific to each image?
b) the data represents the ‘ensemble’ voxel response to a specific image (i.e., for that image over all the runs)
c) is the data an average between S1 and S2, or do we have separate sets of data for S1 and S2?
d) Since the NMA Kay data includes the LatOccipital Cortex and V4, and subdivisions of V3A and V3B, and the Kay et al 2008 paper reported data only from V1, V2, and V3, was the NMA Kay data collected at the same time as the data reported in the Kay et al 2008 paper?
e) it would be helpful to have the MNI anatomical coordinates of the ROIs. From the Kay et al 2008 paper it appears that vertices were used (perhaps from Freesurfer parcellations?). For getting our Neural Net model as precise as possible, knowing the exact anatomical masks for these ROIs would be very helpful.
f) following on (e), it is not perfectly clear what LOC refers to here, since the definition of Lateral Occipital Cortex can be a bit fluid between different authors and data sources. Again, if we know exactly what LOC ROI is included, that will help us in our project.

michaelwaskom · July 22, 2020, 6:22pm

Quick answers (some of these are answered in the data loader notebook and video, too).

I would also refer you to the data descriptor on https://crcns.org/data-sets/vc/vim-1

In order to apply the CNN and other neural nets to this data, we need to know whether the ‘responses’ and ‘responses_test’ are ‘super-cleaned’ in the sense that they represent the z-scored BOLD responses, per voxel, for each picture stimulus only, for each ROI voxel, i.e.,
a) the time series information has been removed, including the delays between stimulus presentations, so that this data is just the z-scored BOLD specific to each image?
b) the data represents the ‘ensemble’ voxel response to a specific image (i.e., for that image over all the runs)

Yes, the data matrix has the z-scored estimated response amplitude averaged over all stimulus presentations.
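If it helps your students, here is a minimal sketch of what that description implies for the shape and statistics of the matrix. It uses synthetic numbers, not the real dataset. The array sizes, the variable names, and the order of operations (average over repeats first, then z-score each voxel) are my assumptions for illustration, not something confirmed by the data descriptor. The same two checks at the end can be run on the real ‘responses’ array once it is loaded.

```python
import numpy as np

def zscore_per_voxel(amplitudes):
    """Z-score each voxel (column) across stimuli (rows)."""
    mean = amplitudes.mean(axis=0, keepdims=True)
    std = amplitudes.std(axis=0, keepdims=True)
    return (amplitudes - mean) / std

# Synthetic stand-in for estimated response amplitudes.
# Sizes are hypothetical and much smaller than the real data.
rng = np.random.default_rng(0)
n_repeats, n_stimuli, n_voxels = 2, 200, 50
raw = rng.normal(loc=3.0, scale=2.0, size=(n_repeats, n_stimuli, n_voxels))

# Assumed order of operations: average over presentations, then z-score.
averaged = raw.mean(axis=0)             # shape: (n_stimuli, n_voxels)
responses = zscore_per_voxel(averaged)  # one row per image, no time axis

# Checks that can also be run on the real matrix:
print(responses.shape)
print(np.allclose(responses.mean(axis=0), 0.0, atol=1e-8))
print(np.allclose(responses.std(axis=0), 1.0))
```

If the real matrix was z-scored before averaging, or z-scored within runs, the per-voxel standard deviation will not be exactly 1, so a failed check there does not by itself mean the data are not cleaned.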

c) is the data an average between S1 and S2, or do we have separate sets of data for S1 and S2?

This is data from S1.

d) Since the NMA Kay data includes the LatOccipital Cortex and V4, and subdivisions of V3A and V3B, and the Kay et al 2008 paper reported data only from V1, V2, and V3, was the NMA Kay data collected at the same time as the data reported in the Kay et al 2008 paper?

As far as I can tell, yes. I think they just focused on V1–3 in the Nature paper because their model didn’t fit well in intermediate regions.

e) it would be helpful to have the MNI anatomical coordinates of the ROIs. From the Kay et al 2008 paper it appears that vertices were used (perhaps from Freesurfer parcellations?). For getting our Neural Net model as precise as possible, knowing the exact anatomical masks for these ROIs would be very helpful.

I don’t know that we have this information, unfortunately. It’s possible the full original dataset has some ROI masks in it, but I’m sure the original analyses were done in native space. I’m curious why MNI coordinates would be necessary for good Neural Net modeling?

f) following on (e), it is not perfectly clear what LOC refers to here, since the definition of Lateral Occipital Cortex can be a bit fluid between different authors and data sources. Again, if we know exactly what LOC ROI is included, that will help us in our project.

I think you’d need to track this information down in the CRCNS description or from other Gallant lab papers that have used this dataset.