CSAIL Publications and Digital Archive


Research Abstracts - 2006

Investigating Shape Representation in Intermediate Areas of Visual Cortex

M. Kouh, C. Cadieu (at UC Berkeley) & T. Poggio

The Problem

The ventral pathway of the primate cortex is thought to mediate object recognition. Neurons within this pathway are classified in a hierarchical structure and exhibit selectivity to increasingly complex visual stimuli. The computational mechanisms that produce neural selectivity within this hierarchy are not well understood. This study investigates the computational mechanisms that determine selectivity in the intermediate areas of the ventral pathway, specifically area V4.

Area V4, located in the middle of the ventral pathway between areas V1 and IT, plays a critical role in the recognition of complex objects, and a lesion in this area typically results in the impairment of object discrimination [13]. Just as V4 is situated physically at an intermediate position between V1 and IT, its representation is also intermediate between these areas in terms of both selectivity and invariance. Studies have found that the receptive field sizes of V4 neurons are on average 4-7 times larger than those of V1, but smaller than those of IT neurons [1, 6]. The preferred stimuli of typical V4 neurons are not simple oriented bars, but are composed of more complex features [1, 2, 3, 4, 8, 9, 11]. Overall, the findings indicate that the sizes of receptive fields, the degree of selectivity, and the range of invariance in V4 are all intermediate between those of V1 and IT.


Understanding how the neuronal population represents shape information is one of the main objectives of visual neuroscience. It is, however, not well understood how the progression of feature selectivity (from simple to complex) and invariance range (from small to large) is achieved along the ventral pathway. Many experimental explorations of ventral pathway representation have been fruitful, determining basic preferences within many areas of macaque ventral cortex. The next question is how one visual representation is transformed into a more complex and invariant visual representation. Plausible models of this transformation have been proposed for V1 [5] and for IT [10]. Conspicuously, intermediate areas have received little attention in this line of work. Therefore, we sought to address the plausibility of a simple feedforward transformational architecture as an explanation for the representations that have been described in intermediate area V4. A computational theory can provide a unifying framework for analyzing experimental work and producing testable hypotheses to motivate further experiments.

Previous Work

The standard model of object recognition in cortex is presented in [12, 14]. This model is composed of a hierarchy of feedforward layers of neuron-like units, performing either (1) a tuning computation (such as weighted linear sum or Gaussian template matching) to increase feature complexity or (2) a nonlinear pooling operation based on a maximum operation to increase response invariance to translation and scaling.
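The two operations named above can be illustrated with a minimal sketch. This is not the implementation used in [12, 14]; the function names, the `sigma` parameter, and the toy inputs are assumptions made only for illustration.

```python
import math

def gaussian_tuning(inputs, template, sigma=1.0):
    """Tuning operation: response is 1.0 when inputs equal the template
    and falls off with squared distance from it."""
    sq_dist = sum((x - w) ** 2 for x, w in zip(inputs, template))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def weighted_sum_tuning(inputs, weights):
    """Alternative tuning operation: weighted linear sum of afferents."""
    return sum(x * w for x, w in zip(inputs, weights))

def max_pooling(responses):
    """Invariance operation: keep only the strongest afferent response."""
    return max(responses)

# Toy example (hypothetical values): the same template is applied at three
# positions, then a pooling unit takes the maximum over positions.
template = [0.9, 0.1, 0.8]
patches = [
    [0.1, 0.1, 0.1],
    [0.9, 0.1, 0.8],  # the preferred pattern appears at the second position
    [0.5, 0.5, 0.5],
]
tuned = [gaussian_tuning(p, template, sigma=0.5) for p in patches]
pooled = max_pooling(tuned)

# Moving the preferred pattern to another position changes which tuned unit
# fires most strongly, but leaves the pooled response unchanged.
shifted = max_pooling(list(reversed(tuned)))
```

In this sketch the tuning stage determines what pattern a unit prefers, while the maximum stage makes the response insensitive to where that pattern appears among the pooled positions.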

Simulations using grating stimuli, with model units built by combining simple, oriented, V1-like filters, reveal that the experimental data [3] can be explained by the model as long as several conditions (including the separation of the subunits) are met [7]. More recently, the work described in [14] examined the physiological study by Pasupathy and Connor, which used contour stimuli and proposed object-centered, position-specific curvature tuning in a population of V4 neurons [9]. Our results show that the model can exhibit selectivity and invariance properties that correspond to the responses of the V4 cells described in [9]. These results suggest how complex shape tuning of V4 cells may arise from combinations of simpler V1 cell responses. These studies have shown that two parameters in the model have an especially large impact on the shape tuning: the spatial configuration of the subunits within the receptive field and the tuning of a cell to the pattern of subunit activation.
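To make the role of the subunit configuration concrete, the following is a hedged sketch of a model unit that combines orientation-tuned subunits at fixed grid positions. The grid encoding of the stimulus, the bandwidth, and all names are hypothetical and are not taken from [7] or [14].

```python
import math

def orientation_response(stimulus_deg, preferred_deg, bandwidth_deg=30.0):
    """V1-like subunit: Gaussian tuning to orientation (period of 180 degrees)."""
    diff = abs(stimulus_deg - preferred_deg) % 180.0
    diff = min(diff, 180.0 - diff)
    return math.exp(-diff ** 2 / (2.0 * bandwidth_deg ** 2))

def v4_unit_response(stimulus, subunits, target_pattern, sigma=0.5):
    """Model unit: Gaussian tuning over the activation pattern of its subunits.

    stimulus: dict mapping (row, col) positions to local orientation in degrees
    subunits: list of ((row, col), preferred_orientation_deg) pairs
    target_pattern: preferred activation level of each subunit
    """
    activations = []
    for position, preferred in subunits:
        if position in stimulus:
            activations.append(orientation_response(stimulus[position], preferred))
        else:
            activations.append(0.0)  # nothing is shown at this position
    sq_dist = sum((a - t) ** 2 for a, t in zip(activations, target_pattern))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical unit with two subunits at different receptive-field positions.
subunits = [((0, 0), 45.0), ((0, 2), 135.0)]
target = [1.0, 1.0]

preferred = {(0, 0): 45.0, (0, 2): 135.0}
swapped = {(0, 0): 135.0, (0, 2): 45.0}  # same parts, different arrangement

r_preferred = v4_unit_response(preferred, subunits, target)
r_swapped = v4_unit_response(swapped, subunits, target)
```

Under these assumptions the unit responds maximally to its preferred arrangement and only weakly when the same oriented parts swap positions, which is the sense in which spatial configuration shapes the tuning.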


Recently, Freiwald et al. have recorded from area V4 using gratings as well as sparse 2-spot stimuli. Our preliminary simulations with the model have shown some promising results [14]: (1) the grating selectivity of a V4 neuron can be fitted well by a combination of V1-like filters, (2) the model units show patterns of within-receptive-field interaction in response to sparse 2-spot stimuli similar to those of the V4 neurons, and (3) the model unit can predict the response to the 2-spot stimuli based on the responses to the gratings. We will verify and extend our preliminary simulations by comparing them with the responses of more V4 neurons, in close collaboration with Freiwald et al. We will analyze the prototypical representations of model V4 units, constraining the intermediate layer of the model. We will also compare the model V4 units found across different experimental paradigms (e.g., gratings, 2-spot stimuli, and boundary conformation stimulus sets).
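The logic of fitting a model unit on one stimulus set and testing its predictions on another can be sketched as follows. This is a toy illustration only: the data are synthetic and noise-free, the weighted-sum unit and gradient-descent fit are assumptions, and none of the names or numbers come from the recordings or from [14].

```python
import math

def predict(weights, afferents):
    """Weighted linear sum of afferent (V1-like) activations."""
    return sum(w * a for w, a in zip(weights, afferents))

def fit_weights(stimuli, responses, steps=2000, lr=0.1):
    """Least-squares fit of the weights by plain batch gradient descent."""
    n = len(stimuli[0])
    weights = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for afferents, target in zip(stimuli, responses):
            err = predict(weights, afferents) - target
            for i, a in enumerate(afferents):
                grad[i] += err * a
        weights = [w - lr * g / len(stimuli) for w, g in zip(weights, grad)]
    return weights

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic stand-in for a recorded neuron (NOT real data).
true_weights = [0.8, -0.3, 0.5]

# Set A plays the role of the fitting stimuli (afferent activations per stimulus).
set_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0.5, 0.5, 1]]
resp_a = [predict(true_weights, s) for s in set_a]

# Set B plays the role of a different stimulus set, held out from the fit.
set_b = [[1, 0, 1], [0, 1, 1], [1, 1, 1], [0.2, 0.8, 0]]
resp_b = [predict(true_weights, s) for s in set_b]

fitted = fit_weights(set_a, resp_a)
predicted_b = [predict(fitted, s) for s in set_b]
r = pearson(predicted_b, resp_b)  # agreement on the held-out set
```

With real recordings the responses would be noisy and the agreement on the held-out set would be the quantity of interest; here it is near perfect by construction.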


The previous work and the current project explain the complex shape selectivity of the V4 neurons with a feedforward combination of oriented filters, in the style of Hubel and Wiesel's model of simple and complex cells in V1. Thus, this study serves as a plausibility proof for a simple neuronal architecture that can produce complex shape tuning and invariance properties found in V4.

Future Work

We will also investigate whether the intermediate neural representations found from the above analysis are compatible with the ones learned from visual experience. Preliminary results show that the model units obtained from a simple, biologically plausible learning mechanism exhibit population statistics similar to those reported in several experimental studies [14].

Research Support

This report describes research done at the Center for Biological & Computational Learning, which is in the McGovern Institute for Brain Research at MIT, as well as in the Dept. of Brain & Cognitive Sciences, and which is affiliated with the Computer Science & Artificial Intelligence Laboratory (CSAIL). This research was sponsored by grants from: Office of Naval Research (DARPA) Contract No. MDA972-04-1-0037, Office of Naval Research (DARPA) Contract No. N00014-02-1-0915, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218506, and National Institutes of Health (Conte) Contract No. 1 P20 MH66239-01A1. Additional support was provided by: Central Research Institute of Electric Power Industry (CRIEPI), Daimler-Chrysler AG, Eastman Kodak Company, Honda Research Institute USA, Inc., Komatsu Ltd., Merrill-Lynch, NEC Fund, Oxygen, Siemens Corporate Research, Inc., Sony, Sumitomo Metal Industries, Toyota Motor Corporation, and the Eugene McDermott Foundation.


[1] R. Desimone and S. Schein. Visual properties of neurons in area V4 of the macaque: Sensitivity to stimulus form. Journal of Neurophysiology, 57:835--868, 1987.

[2] W. A. Freiwald, D. Y. Tsao, R. B. H. Tootell, and M. S. Livingstone. Complex and dynamic receptive field structure in macaque cortical area V4d. Journal of Vision, 4(8):184a, 2005.

[3] J. Gallant, C. Connor, S. Rakshit, J. Lewis, and D. van Essen. Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. Journal of Neurophysiology, 76:2718--2739, 1996.

[4] T. Gawne and J. Martin. Responses of primate visual cortical V4 neurons to simultaneously presented stimuli. Journal of Neurophysiology, 88:1128--1135, 2002.

[5] D. Hubel and T. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology, 160:106--154, 1962.

[6] E. Kobatake and K. Tanaka. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. Journal of Neurophysiology, 71:856--867, 1994.

[7] M. Kouh and M. Riesenhuber. Investigating shape representation in area V4 with HMAX: Orientation and grating selectivities. AI Memo 021, MIT, 2003.

[8] A. Pasupathy and C. Connor. Responses to contour features in macaque area V4. Journal of Neurophysiology, 82:2490--2502, 1999.

[9] A. Pasupathy and C. Connor. Shape representation in area V4: Position-specific tuning for boundary conformation. Journal of Neurophysiology, 86:2505--2519, 2001.

[10] D. Perrett and M. Oram. Neurophysiology of shape processing. Img. Vis. Comput., 11:317--333, 1993.

[11] D. Pollen, A. Przybyszewski, M. Rubin, and W. Foote. Spatial receptive field organization of macaque V4 neurons. Cereb. Cortex, 12(6):601--616, 2002.

[12] M. Riesenhuber and T. Poggio. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2:1019--1025, 1999.

[13] P. Schiller. The effects of V4 and middle temporal (MT) area lesions on visual performance in the rhesus monkey. Vis Neurosci., 10(4):717--746, 1993.

[14] T. Serre, M. Kouh, C. Cadieu, U. Knoblich, G. Kreiman, and T. Poggio. A theory of object recognition: Computations and circuits in the feedforward path of the ventral stream in primate visual cortex. AI Memo 036, MIT, 2005.



Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu