Visual attention selects only a tiny fraction of visual input information for further processing. Selection starts in the primary visual cortex (V1), which creates a bottom-up saliency map to guide the fovea to selected visual locations via gaze shifts. This motivates a new framework that views vision as consisting of encoding, selection, and decoding stages, placing selection on center stage. It suggests a massive loss of non-selected information from V1 downstream along the visual pathway. Hence, feedback from downstream visual cortical areas to V1 for better decoding (recognition), through analysis-by- synthesis, should query for additional information and be mainly directed at the foveal region. Accordingly, non-foveal vision is not only poorer in spatial resolution, but also more susceptible to many illusions.
CNS*2020 Keynote 2: A new computational framework for understanding vision in our brain - Zhaoping Li