Research Abstracts - 2007
Visuospatial Reasoning
Sajit Rao

Introduction
Consider the following examples of everyday tasks:
(a) Picking out the shortest checkout line in a supermarket.
(b) Looking at the hands of your watch to tell the time.
(c) Map understanding: You learn that the Khyber Pass in the Hindu Kush mountains is a strategically important natural gateway used by Alexander's armies to pass from the northwest frontier of Afghanistan into the plains of India in 326 B.C., and you look up a map of the region to see and understand this for yourself.
(d) Language understanding and common-sense inference: You hear a news report that the road leading to the marketplace of a town has been blocked by insurgents, and conclude that a vehicle that needs to reach the marketplace must either clear the blockage (and possibly face resistance) or find an alternative route.
One might perform hundreds of such tasks in a single day without conscious effort, even though each task may itself involve dozens of lower-level operations. We also leverage our visuospatial representations by projecting even non-spatial domains, such as time or graph structures (e.g. org charts and process flows), into diagrams that make certain inferences "obvious" or make them "pop out". The ease with which even a child solves or understands such problems hints at the sophistication of the underlying visuospatial analysis machinery we must have. Evidence from fMRI and infant development studies [1][2] shows that in humans, visual processes and representations are involved not only in perception and action but also in more "abstract" cognitive tasks such as doing math or making an inference. Vision thus appears to be part of our "thinking machinery" as well.

Goal
Our goal is to build a system in which visuospatial perceptual mechanisms are used for perception as well as abstract inference.
We expect robust visuospatial reasoning to emerge from three complementary, interacting competencies:

Progress
For the spatial analysis component we are working on a real-time implementation of the Visual Routines architecture described in [4]. Vision modules run in parallel on multiple machines to execute a visual routine. For the learning component we are initially testing the system's ability to index and learn from a test set of simulated blocks-world events. Learned spatial patterns form the templates for both analysis and imagination.

Research Support
This project is funded by a seedling grant from DARPA.

References
[1] S. Dehaene, E. Spelke, P. Pinel, R. Stanescu, and S. Tsivkin. Sources of Mathematical Thinking: Behavioral and Brain-Imaging Evidence. In Science, vol. 284, May 1999.
[2] S. Carey. Bootstrapping and the Origin of Concepts. In Daedalus, Winter 2004.
[3] S. Ullman. Visual Routines. In Cognition, vol. 18, 1984.
[4] S. Rao. Visual Routines and Attention. MIT EECS Ph.D. Thesis, 1998.