CSAIL Research Abstracts - 2005

Modeling Shape Representation in Visual Cortex Area V4

Charles Cadieu, Minjoon Kouh, Maximilian Riesenhuber & Tomaso Poggio

The Problem

The ventral pathway of the primate cortex is thought to mediate object recognition.  Neurons within this pathway are classified in a hierarchical structure and exhibit increasing selectivity to complex visual stimuli and increasing invariance to translation and scale.  The computational mechanisms that produce neural selectivity within this hierarchy are not well understood.  This study attempts to model the computational mechanisms that determine selectivity and invariance in the intermediate areas of the ventral pathway, specifically V4.  


Understanding how the neuronal population represents shape information is one of the main objectives of visual neuroscience.  A computational theory can provide a unifying framework for analyzing experimental work and producing testable hypotheses to motivate further experiments.

Previous Work

The standard model of object recognition in cortex is presented in (Riesenhuber & Poggio 1999).  This model is composed of a hierarchy of feed-forward layers of neuron-like units, each performing one of two operations: (1) a tuning computation that increases feature complexity, for which we use the normalized dot product described in (Kouh 2005, Kouh 2004), or (2) a nonlinear pooling operation, based on a maximum operation, that achieves invariance to translation and scaling.
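The two operations can be illustrated with a minimal sketch.  It assumes a simplified form of the normalized dot product (the dot product of inputs and weights divided by the input norm); the exact form used in (Kouh 2005, Kouh 2004) may differ.  The function names, the constant k, and the one-dimensional sliding window are hypothetical choices made for illustration only.

```python
import math


def normalized_dot_product(inputs, weights, k=1e-6):
    """Tuning operation (assumed simplified form).

    The response is largest when the pattern of inputs matches the
    pattern of weights; k is a small constant that avoids division by zero.
    """
    dot = sum(x * w for x, w in zip(inputs, weights))
    norm = math.sqrt(sum(x * x for x in inputs))
    return dot / (k + norm)


def max_pool(responses):
    """Pooling operation: keep only the strongest afferent response."""
    return max(responses)


def pooled_response(signal, weights):
    """Apply the same tuned unit at every position, then pool with max.

    Shifting the preferred pattern within the signal leaves the pooled
    response unchanged, which illustrates translation invariance.
    """
    width = len(weights)
    tuned = [
        normalized_dot_product(signal[i:i + width], weights)
        for i in range(len(signal) - width + 1)
    ]
    return max_pool(tuned)
```

For example, with weights [0.6, 0.8], the signals [0, 0.6, 0.8, 0, 0] and [0, 0, 0.6, 0.8, 0] give the same pooled response, because the maximum is taken over all positions of the tuned unit.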

Simulations that combine simple, oriented, V1-like filters and use grating stimuli have revealed that V4 cell orientation and grating selectivity can be explained by the model under several conditions (Kouh 2003).  More recently, (Cadieu 2004) examined a physiological study by Pasupathy and Connor that proposed object-centered, position-specific curvature tuning in a population of V4 neurons (Pasupathy & Connor 2001).  That work shows that the standard model can exhibit selectivity and invariance properties that correspond to physiological evidence, and it suggests how complex shape tuning of V4 cells may arise from combinations of simpler V1 cell responses (see Figure 1 for a comparison of V4 cell and model unit response).  These studies have shown that the spatial configuration of the subunits within the receptive field and the tuning of a cell to the pattern of subunit activation are necessary to model V4 cell responses.

Fig. 1    A V4 cell response (left), adapted from (Pasupathy & Connor 2001), and a model unit response (middle) show similar selectivity profiles (each small square shows a stimulus with contrast scaled by the level of response).  The responses to each stimulus for the V4 cell and the model unit are plotted against each other (right) and have a correlation coefficient of 0.77.

Current Work

First, we will develop a methodology using techniques from machine learning to fit mean firing-rate V4 neural responses from any rapid serial visual presentation experiment to standard model units.  Such a technique can then be used to quickly model V4 neurons from various experimental groups independently of the stimulus sets they have used. Modeling populations of V4 neurons will provide a basis for standardizing the description of V4 neuron selectivity.
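One simple way to picture such a fit is sketched below.  It is not the fitting methodology itself, which is not specified here; it assumes that responses of several candidate model units to the experimental stimuli have already been computed, and it selects the candidate whose responses correlate best with the measured mean firing rates.  The function names and the dictionary layout are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant response carries no tuning information
    return cov / math.sqrt(var_x * var_y)


def fit_best_unit(firing_rates, candidate_responses):
    """Return (name, correlation) of the best-matching candidate unit.

    firing_rates: mean firing rate of the neuron for each stimulus.
    candidate_responses: maps a candidate unit name to its list of
    responses to the same stimuli, in the same order.
    """
    scores = {
        name: pearson(firing_rates, responses)
        for name, responses in candidate_responses.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Because the score depends only on the responses to whatever stimuli were shown, the same procedure applies regardless of the stimulus set used by a given experimental group.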

Second, we will develop experimental tools, inspired by our models, that will help electrophysiologists further explore V4 selectivity and invariance.  We will start by designing a stimulus set that will optimally sample the input space of our model units.  Experiments using this stimulus set will be useful for validating or invalidating the model and will also produce population statistics of model parameters, useful for formulating a standardized model of V4 selectivity.

Another possible application of the model in electrophysiology will be to integrate the model into the stimulus presentation loop during experimentation.  Experimental methodologies are limited by short time windows for recording and a lack of knowledge about the inputs to a neuron.  If a model of the neuron is built during the experiment, the fitted model can be used to intelligently select useful stimuli for subsequent presentations.  A number of issues may be addressed in this way, such as finding an optimal stimulus for a neuron, exploring invariance properties, or investigating sub-feature interactions.
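The loop described above might look like the following sketch, here aimed at finding a strongly driving stimulus.  Everything in it is an assumption made for illustration: stimuli are represented as tuples of feature values, the model is a single normalized-dot-product unit, refitting is a response-weighted average of the stimuli shown so far, and record stands in for the actual recording of the neuron.

```python
import math


def predict(weights, stimulus, k=1e-6):
    """Model prediction: assumed simplified normalized dot product."""
    dot = sum(w * s for w, s in zip(weights, stimulus))
    norm = math.sqrt(sum(s * s for s in stimulus))
    return dot / (k + norm)


def refit(history):
    """Estimate weights as the response-weighted average of past stimuli."""
    dim = len(history[0][0])
    total = sum(response for _, response in history)
    if total == 0:
        return [0.0] * dim
    return [
        sum(response * stimulus[i] for stimulus, response in history) / total
        for i in range(dim)
    ]


def closed_loop(record, pool, first, n_trials):
    """Alternate between recording, refitting, and choosing the next stimulus.

    record: hypothetical callable returning the neuron's response to a stimulus.
    pool: candidate stimuli; each is presented at most once.
    Returns the list of (stimulus, response) pairs in presentation order.
    """
    history = [(first, record(first))]
    remaining = [s for s in pool if s != first]
    for _ in range(n_trials - 1):
        if not remaining:
            break
        weights = refit(history)
        # Choose the unseen stimulus the current model predicts is most effective.
        chosen = max(remaining, key=lambda s: predict(weights, s))
        remaining.remove(chosen)
        history.append((chosen, record(chosen)))
    return history
```

Other goals, such as probing invariance or sub-feature interactions, would need a different selection rule in place of the maximum predicted response.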

Third, we hope to show that V4-like selectivity can be learned from exposure to natural images.  Thomas Serre and Lior Wolf, also at CBCL, are developing novel, biologically inspired object recognition techniques that learn useful selectivities from input images; see (Serre 2005).  The learned units can be tested against populations of V4 neurons by simulating the same experimental procedures used by physiologists.


Our work provides an explanation for the complex shape selectivity and invariance properties of V4 neurons with a feedforward combination of oriented filters, in the style of Hubel and Wiesel's model of simple and complex V1 cells.  Thus, this study serves as a plausibility proof for a simple neuronal architecture that can produce complex shape tuning properties while maintaining invariance to translation.  After forming a standardized description of V4 selectivity and invariance, we hope to extend our techniques to investigate higher levels of the ventral stream, such as areas in inferotemporal cortex.


This report describes research done at the Center for Biological & Computational Learning, which is in the McGovern Institute for Brain Research at MIT, as well as in the Dept. of Brain & Cognitive Sciences, and which is affiliated with the Computer Science & Artificial Intelligence Laboratory (CSAIL).

This research was sponsored by grants from: Office of Naval Research (DARPA) Contract No. MDA972-04-1-0037, Office of Naval Research (DARPA) Contract No. N00014-02-1-0915, National Science Foundation (ITR/SYS) Contract No. IIS-0112991, National Science Foundation (ITR) Contract No. IIS-0209289, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218693, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218506, and National Institutes of Health (Conte) Contract No. 1 P20 MH66239-01A1.

Additional support was provided by: Central Research Institute of Electric Power Industry (CRIEPI), Daimler-Chrysler AG, Compaq/Digital Equipment Corporation, Eastman Kodak Company, Honda R&D Co., Ltd., Industrial Technology Research Institute (ITRI), Komatsu Ltd., Eugene McDermott Foundation, Merrill-Lynch, NEC Fund, Oxygen, Siemens Corporate Research, Inc., Sony, Sumitomo Metal Industries, and Toyota Motor Corporation.


[1] C. Cadieu, M. Kouh, M. Riesenhuber, and T. Poggio. Shape representation in V4: Investigating position-specific tuning for boundary conformation with the standard model of object recognition. In CBCL Paper #241/CSAIL Memo #2004-024, Massachusetts Institute of Technology, Cambridge, MA, November 2004.

[2] A. Pasupathy and C. Connor. Shape representation in Area V4: Position-specific tuning for boundary conformation. In J. Neurophysiology, 86:2505-2519, 2001.

[3] M. Riesenhuber and T. Poggio. Hierarchical models of object recognition in cortex. In Nature Neuroscience, 2:1019-1025, 1999.

[4] M. Kouh and M. Riesenhuber. Investigating shape representation in Area V4 with HMAX: Orientation and grating selectivities. In CBCL Paper #231/AIM #2003-021, Massachusetts Institute of Technology, Cambridge, MA, March 2003.

[5] M. Kouh and T. Poggio. A general mechanism for tuning: Gain control circuits and synapses underlie tuning of cortical neurons. In CBCL Paper #245/CSAIL Memo #2004-031, Massachusetts Institute of Technology, Cambridge, MA, December 2004.

[6] M. Kouh and T. Poggio. Gain control with normalization in the standard model. CSAIL Research Abstract, Massachusetts Institute of Technology, Cambridge, MA, April 2005.

[7] T. Serre, L. Wolf and T. Poggio. A new biologically motivated framework for robust object recognition. CSAIL Research Abstract, Massachusetts Institute of Technology, Cambridge,
