CSAIL Publications and Digital Archive

Research Abstracts - 2007

Non-Metrical Navigation Through Visual Path Control

Albert S. Huang & Seth Teller

Figure 1. Features detected on an omnidirectional image


Existing wide-area motion planning methods assume that a global, metrical coordinate system and map are available for the environment of interest. Algorithms based on such models have difficulty achieving wide-area navigation in spaces where these assumptions are often unmet, for example in extended indoor regions.

We describe a new method for wide-area, non-metrical robot navigation which enables useful, purposeful motion indoors. Our method has two phases: a training phase, in which a human user directs a wheeled robot with an attached camera through an environment while occasionally supplying textual place names; and a navigation phase in which the user specifies goal place names (again as text), and the robot issues low-level motion control in order to move to the specified place.

Our method uses an omnidirectional camera, requires only approximate intrinsic and extrinsic camera calibration, performs no feature tracking, and is capable of effective motion control within an extended, minimally prepared building environment. We give results for deployment on a single building floor with 7 rooms, 6 corridor segments, and 15 distinct place names.


Figure 2. The test environment

Effective navigation in general environments presents a fundamental challenge for humans, attracting attention since antiquity. In modern times, the classical formulation of robot motion planning [1] makes three strong assumptions:

  • That a precise metrical map of the robot's workspace is available either explicitly as a freespace description or implicitly through an obstacle-probing mechanism;
  • That the user or high-level planner can express the robot's goal pose numerically, in the coordinates of the metrical map assumed above; and
  • That the robot can localize itself precisely within the metrical map.

These assumptions are reasonable for structured settings in which, for example, the robot is bolted to a factory floor; materials to be manipulated are delivered to the robot by a conveyor belt; humans are kept away for safety reasons; and the environment is otherwise unchanging. Motion planning methods for courier robots (e.g., in hospitals) intended to move along marked paths also make these assumptions. However, both cases require the environment to be prepared extensively (or "structured") beforehand by human engineers, and maintained in that structured state for the duration of robot operation.

It is misguided, however, to attempt to extend these assumptions to tasks requiring robots to move about within an extended environment that cannot be observed in total from any single sensor position, and for which no high-precision global metrical model is available -- for example a large household or workplace. In fact, none of these three assumptions generally hold in such an environment; nor can they be made to hold in extended workspaces without significant effort, both to construct a high-fidelity as-built CAD model of the environment and to establish fine-grained localization capability within it.

The key insight underlying our approach is that a metrical map is not only difficult to acquire, but it is also overkill, i.e., it is not needed for applications in which a physical, inertial agent moves non-destructively through the environment using only local sensing. Indeed, humans commonly perform analogous complex navigation tasks in extended environments, without relying upon global sensing or externally-provided metrical coordinate systems.

Our robot navigation method has two phases: a training phase, in which a human user directs a wheeled robot with an attached camera through an environment while occasionally supplying textual place names; and a navigation phase in which the user specifies goal place names (again as text), and the robot issues low-level motion control in order to move to the specified place.

During training, the robot constructs a graph of labeled nodes and edges. Each graph node corresponds to a place named by the user; each node label is supplied by the user as a place name. Each graph edge corresponds to a path traversed by the robot between two named places. Each edge is associated with a set of descriptors for visual features observed by the robot as it moved along the path.
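The graph built during training can be pictured with a small data structure. The following is only a minimal sketch: the class names `PlaceGraph` and `PathEdge`, the method names, and the representation of a descriptor as a tuple of floats are all hypothetical choices made for illustration, not details of the implementation described here.

```python
from dataclasses import dataclass, field


@dataclass
class PathEdge:
    """A path driven between two named places during training."""
    start: str
    end: str
    # One entry per visual feature observed along the path. The descriptor
    # format (a tuple of floats) is an assumption made for illustration.
    descriptors: list = field(default_factory=list)


@dataclass
class PlaceGraph:
    """Graph whose nodes are place names supplied by the user."""
    places: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_place(self, name):
        self.places.add(name)

    def add_path(self, start, end, descriptors):
        # Both endpoints must already have been named by the user.
        if start not in self.places or end not in self.places:
            raise KeyError("both endpoints must be named places")
        edge = PathEdge(start, end, list(descriptors))
        self.edges.append(edge)
        return edge

    def paths_from(self, name):
        """Return every trained path that starts at the given place."""
        return [e for e in self.edges if e.start == name]


# Hypothetical training run: two named places joined by one driven path.
graph = PlaceGraph()
graph.add_place("lobby")
graph.add_place("kitchen")
graph.add_path("lobby", "kitchen", [(0.1, 0.9), (0.4, 0.2)])
```

Storing descriptors on the edge rather than on the nodes mirrors the description above, in which features belong to the path traversed between two places.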

Many features observed during training are re-observed during navigation (Figure 1). We have found that differences in the visual-field locations of features matched across training and navigation can be used to construct a simple and robust control rule that guides the robot onto and along the training motion path, reproducing the training path at any desired speed.
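The exact control rule is not given here, so the following is only one plausible illustration of the idea, under stated assumptions: each matched feature is reduced to an azimuth (bearing) in the omnidirectional image, bearings and turn rates are both measured counterclockwise, and the turn rate is proportional to the median bearing difference. The function names and the `gain` parameter are hypothetical.

```python
import math
from statistics import median


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def steering_command(train_bearings, nav_bearings, gain=0.5):
    """Turn rate computed from bearing differences of matched features.

    train_bearings[i] and nav_bearings[i] are the azimuths (radians) of the
    same matched feature in the training and navigation images.
    """
    if len(train_bearings) != len(nav_bearings):
        raise ValueError("bearing lists must be matched pairwise")
    if not train_bearings:
        return 0.0  # no matches: issue no turn
    diffs = [wrap_angle(n - t) for t, n in zip(train_bearings, nav_bearings)]
    # The median limits the influence of a few incorrect matches.
    return gain * median(diffs)


# Hypothetical matches: every feature appears 0.1 rad further counterclockwise
# than in training, so the sketch commands a counterclockwise turn.
turn_rate = steering_command([0.0, 1.0, 2.0], [0.1, 1.1, 2.1])
```

Under these assumptions, a robot rotated away from its training heading sees all features shifted the opposite way, and turning in the direction of the shift reduces the error. A rule of this kind needs no metrical map and no feature tracking, only pairwise matches.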


Figure 3. Our robot platform.

We have evaluated our method on a complex, spatially extended floorplan with many offices, open spaces, and branching corridors (Figure 2). Our robot platform was custom-built with a two-wheel rear axle, front and rear single casters, and one on-board laptop providing about 2 GHz of processing power (Figure 3). The omnidirectional camera was purchased from Point Grey Research, which also supplied rough intrinsic calibration parameters. We configured the camera to output 6 images per second at 1024x768, of which we discarded one (the top camera) and decimated the others to 128x96.
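Reducing 1024x768 to 128x96 corresponds to a factor of 8 along each axis. The sketch below shows that arithmetic and the simplest form of decimation, keeping every eighth pixel. Whether the original pipeline filtered the image before subsampling is not stated, so the absence of an anti-aliasing step is an assumption, as are the function names.

```python
def decimation_factor(src_size, dst_size):
    """Return the integer factor mapping src (width, height) to dst."""
    fx, rx = divmod(src_size[0], dst_size[0])
    fy, ry = divmod(src_size[1], dst_size[1])
    if rx or ry or fx != fy:
        raise ValueError("sizes are not related by one integer factor")
    return fx


def decimate(image, factor):
    """Keep every factor-th pixel of an image stored as a list of rows."""
    return [row[::factor] for row in image[::factor]]


factor = decimation_factor((1024, 768), (128, 96))
# Synthetic 1024x768 test image (768 rows of 1024 pixels each).
image = [[(x + y) % 256 for x in range(1024)] for y in range(768)]
small = decimate(image, factor)
```

Each decimated image holds 64 times fewer pixels than the original, which reduces the per-frame processing load on the single on-board laptop.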

The user trained the robot by moving it through the environment using a simple joystick controller (and a 10-meter long cable, so as not to subtend a large angle in the camera's visual field). Each time a node was visited during the training phase, the user typed the name of the node on a portable device, which then transmitted the name to the robot. Once training is complete, the user can operate the robot in navigation mode, sending it on a mission by providing the names of a start node and a goal node.
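Given a start and goal name, a route through the trained graph has to be chosen before any motion control is issued. How the route is selected is not specified here; the sketch below assumes a plain breadth-first search over an adjacency mapping of place names, which returns a route with the fewest trained path segments. The function name `plan_route` and the example place names are hypothetical.

```python
from collections import deque


def plan_route(adjacency, start, goal):
    """Return a list of place names from start to goal, or None."""
    if start not in adjacency or goal not in adjacency:
        raise KeyError("unknown place name")
    parent = {start: None}
    queue = deque([start])
    while queue:
        place = queue.popleft()
        if place == goal:
            # Walk the parent links back to the start, then reverse.
            route = []
            while place is not None:
                route.append(place)
                place = parent[place]
            return route[::-1]
        for neighbor in adjacency.get(place, []):
            if neighbor not in parent:
                parent[neighbor] = place
                queue.append(neighbor)
    return None  # goal is not reachable along trained paths


# Hypothetical trained graph, with paths usable in both directions.
adjacency = {
    "lobby": ["corridor"],
    "corridor": ["lobby", "kitchen", "office"],
    "kitchen": ["corridor"],
    "office": ["corridor"],
}
route = plan_route(adjacency, "lobby", "office")
```

Each consecutive pair of names in the returned route identifies one trained path whose stored features would then drive the motion control.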


[1] J-C. Latombe. Robot Motion Planning. Boston: Kluwer Academic Publishers, 1991.

[2] David G. Lowe. Object recognition from local scale-invariant features. In Proc. of the International Conference on Computer Vision (ICCV), pages 1150-1157, 1999.

[3] R. Smith, M. Self, and P. Cheeseman. Estimating uncertain spatial relationships in robotics. In I. Cox and G. Wilfong, editors, Autonomous Robot Vehicles, pages 167-193. Springer-Verlag, 1990.

[4] J. Wolf, W. Burgard, and H. Burkhardt. Using an image retrieval system for vision-based mobile robot localization. In Proc. of the International Conference on Image and Video Retrieval (CIVR), 2002.

[5] Anthony Remazeilles, Francois Chaumette, and Patrick Gros. Robot motion control from a visual memory. In Proc. of the International Conference on Robotics & Automation (ICRA), 2004.



MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu