CSAIL Publications and Digital Archive header
bullet Technical Reports bullet Work Products bullet Historical Collections bullet

link to publications.csail.mit.edu link to www.csail.mit.edu horizontal line

CSAIL Digital Archive - Artificial Intelligence Laboratory Series
Publications 1300 through 1399

AI PublicationsLast update Sun Mar 19 05:05:02 2006


Author[s]: David M. Siegel

Pose Determination of a Grasped Object Using Limited Sensing

May 1991



This report explores methods for determining the pose of a grasped object using only limited sensor information. The problem of pose determination is to find the position of an object relative to the hand. The information is useful when grasped objects are being manipulated. The problem is hard because of the large space of grasp configurations and the large amount of uncertainty inherent in dexterous hand control. By studying limited sensing approaches, the problem's inherent constraints can be better understood. This understanding helps to show how additional sensor data can be used to make recognition methods more effective and robust.


Author[s]: Yael Moses and Shimon Ullman

Limitations of Non Model-Based Recognition Schemes

May 1991



Different approaches to visual object recognition can be divided into two general classes: model-based vs. non model-based schemes. In this paper we establish some limitation on the class of non model-based recognition schemes. We show that every function that is invariant to viewing position of all objects is the trivial (constant) function. It follows that every consistent recognition scheme for recognizing all 3-D objects must in general be model based. The result is extended to recognition schemes that are imperfect (allowed to make mistakes) or restricted to certain classes of objects.


Author[s]: Feng Zhao and Richard Thorton

Automatic Design of a Maglev Controller in State Space

December 1991



We describe the automatic synthesis of a global nonlinear controller for stabilizing a magnetic levitation system. The synthesized control system can stabilize the maglev vehicle with large initial displacements from an equilibrium, and possesses a much larger operating region than the classical linear feedback design for the same system. The controller is automatically synthesized by a suite of computational tools. This work demonstrates that the difficult control synthesis task can be automated, using programs that actively exploit knowledge of nonlinear dynamics and state space and combine powerful numerical and symbolic computations with spatial-reasoning techniques.


Author[s]: Michael A. Eisenberg

The Kineticist's Workbench: Combining Symbolic and Numerical Methods in the Simulation of Chemical Reaction 

May 1991



The Kineticist's Workbench is a program that simulates chemical reaction mechanisms by predicting, generating, and interpreting numerical data. Prior to simulation, it analyzes a given mechanism to predict that mechanism's behavior; it then simulates the mechanism numerically; and afterward, it interprets and summarizes the data it has generated. In performing these tasks, the Workbench uses a variety of techniques: graph- theoretic algorithms (for analyzing mechanisms), traditional numerical simulation methods, and algorithms that examine simulation results and reinterpret them in qualitative terms. The Workbench thus serves as a prototype for a new class of scientific computational tools---tools that provide symbiotic collaborations between qualitative and quantitative methods.


Author[s]: David T. Clemens

Region-Based Feature Interpretation for Recognizing 3D Models in 2D Images

June 1991



In model-based vision, there are a huge number of possible ways to match model features to image features. In addition to model shape constraints, there are important match-independent constraints that can efficiently reduce the search without the combinatorics of matching. I demonstrate two specific modules in the context of a complete recognition system, Reggie. The first is a region-based grouping mechanism to find groups of image features that are likely to come from a single object. The second is an interpretive matching scheme to make explicit hypotheses about occlusion and instabilities in the image features.


Author[s]: Pegor Papazian

Principles, Opportunism and Seeing in Design: A Computational Approach

June 1991



This thesis introduces elements of a theory of design activity and a computational framework for developing design systems. The theory stresses the opportunistic nature of designing and the complementary roles of focus and distraction, the interdependence of evaluation and generation, the multiplicity of ways of seeing over the history of a design session versus the exclusivity of a given way of seeing over an arbitrarily short period, and the incommensurability of criteria used to evaluate a design. The thesis argues for a principle based rather than rule based approach to designing documents. The Discursive Generator is presented as a computational framework for implementing specific design systems, and a simple system for arranging blocks according to a set of formal principles is developed by way of illustration. Both shape grammars and constraint based systems are used to contrast current trends in design automation with the discursive approach advocated in the thesis. The Discursive Generator is shown to have some important properties lacking in other types of systems, such as dynamism, robustness and the ability to deal with partial designs. When studied in terms of a search metaphor, the Discursive Generator is shown to exhibit behavior which is radically different from some traditional search techniques, and to avoid some of the well-known difficulties associated with them.


Author[s]: Shimon Ullman

Sequence-Seeking and Counter Streams: A Model for Information Processing in the Cortex

December 1991



This paper presents a model for the general flow in the neocortex. The basic process, called "sequence-seeking," is a search for a sequence of mappings or transformations, linking source and target representations. The search is bi-directional, "bottom-up" as well as "top-down," and it explores in parallel a large numbe rof alternative sequences. This operation is implemented in a structure termed "counter streams," in which multiple sequences are explored along two separate, complementary pathways which seeking to meet. The first part of the paper discusses the general sequence-seeking scheme and a number of related processes, such as the learning of successful sequences, context effects, and the use of "express lines" and partial matches. The second part discusses biological implications of the model in terms of connections within and between cortical areas. The model is compared with existing data, and a number of new predictions are proposed.


Author[s]: Daphna Weinshall

The Matching of Doubly Ambiguous Stereograms

July 1991



I have previously described psychophysical experiments that involved the perception of many transparent layers, corresponding to multiple matching, in doubly ambiguous random dot stereograms. Additional experiments are described in the first part of this paper. In one experiment, subjects were required to report the density of dots on each transparent layer. In another experiment, the minimal density of dots on each layer, which is required for the subjects to perceive it as a distinct transparent layer, was measured. The difficulties encountered by stereo matching algorithms, when applied to doubly ambiguous stereograms, are described in the second part of this paper. Algorithms that can be modified to perform consistently with human perception, and the constraints imposed on their parameters by human perception, are discussed.


Author[s]: Ellen C. Hildreth, Hiroshi Ando, Richard Anderson and Stefan Treue

Recovering Three-Dimensional Structure from Motion with Surface Reconstruction

December 1991



We address the computational role that the construction of a complete surface representation may play in the recovery of 3--D structure from motion. We present a model that combines a feature--based structure-- from- -motion algorithm with smooth surface interpolation. This model can represent multiple surfaces in a given viewing direction, incorporates surface constraints from object boundaries, and groups image features using their 2--D image motion. Computer simulations relate the model's behavior to perceptual observations. In a companion paper, we discuss further perceptual experiments regarding the role of surface reconstruction in the human recovery of 3--D structure from motion.


Author[s]: Ellen Spertus

Why are There so Few Female Computer Scientists?

August 1991



This report examines why women pursue careers in computer science and related fields far less frequently than men do. In 1990, only 13% of PhDs in computer science went to women, and only 7.8% of computer science professors were female. Causes include the different ways in which boys and girls are raised, the stereotypes of female engineers, subtle biases that females face, problems resulting from working in predominantly male environments, and sexual biases in language. A theme of the report is that women's underrepresentation is not primarily due to direct discrimination but to subconscious behavior that perpetuates the status quo.


Author[s]: Lynn Andrea Stein

Resolving Ambiguity in Nonmonotonic Inheritance Hierarchies

August 1991



This paper describes a theory of inheritance theories. We present an original theory of inheritance in nonmonotonic hierarchies. The structures on which this theory is based delineate a framework that subsumes most inheritance theories in the literature, providing a new foundation for inheritance. * Our path- based theory is sound and complete w.r.t. a direct model-theoretic semantics. * Both the credulous and the skeptical conclusions of this theory are polynomial-time computable. * We prove that true skeptical inheritance is not contained in the language of path-based inheritance. Because our techniques are modular w.r.t. the definition of specificity, they generalize to provide a unified framework for a broad class of inheritance theories. By describing multiple inheritance theories in the same “language” of credulous extensions, we make principled comparisons rather than the ad-hoc examination of specific examples makes up most of the comparative inheritance work.


Author[s]: J. Brian Subirana-Vilanova and Kah Kay Sung

Multi-Scale Vector-Ridge-Detection for Perceptual Organization Without Edges

December 1992



We present a novel ridge detector that finds ridges on vector fields. It is designed to automatically find the right scale of a ridge even in the presence of noise, multiple steps and narrow valleys. One of the key features of such ridge detector is that it has a zero response at discontinuities. The ridge detector can be applied to scalar and vector quantities such as color. We also present a parallel perceptual organization scheme based on such ridge detector that works without edges; in addition to perceptual groups, the scheme computes potential focus of attention points at which to direct future processing. The relation to human perception and several theoretical findings supporting the scheme are presented. We also show results of a Connection Machine implementation of the scheme for perceptual organization (without edges) using color.


Author[s]: P.A. Skordos and W.H. Zurek

Maxwell's Demon, Rectifiers, and the Second Law: Computer Simulation of Smoluchowski's Trapdoor

September 1991



We have simulated numerically an automated Maxwell's demon inspired by Smoluchowski's ideas of 1912. Two gas chambers of equal area are connected via an opening that is covered by a trapdoor. The trapdoor can open to the left but not to the right, and is intended to rectify naturally occurring variations in density between the two chambers. Our results confirm that though the trapdoor behaves as a rectifier when large density differences are imposed by external means, it can not extract useful work from the thermal motion of the molecules when left on its own.


Author[s]: Lisa Dron

The Multi-Scale Veto Model: A Two-Stage Analog Network for Edge Detection and Image Reconstruction

March 1992



This paper presents the theory behind a model for a two-stage analog network for edge detection and image reconstruction to be implemented in VLSI. Edges are detected in the first stage using the multi-scale veto rule, which eliminates candidates that do not pass a threshold test at each of a set of different spatial scales. The image is reconstructed in the second stage from the brightness values adjacent to edge locations. The MSV rule allows good localization and efficient noise removal. Since the reconstructed images are visually similar to the originals, the possibility exists of achieving significant bandwidth compression.


Author[s]: Waldemar Horwat

Concurrent Smalltalk on the Message-Driven Processor

September 1991



Concurrent Smalltalk is the primary language used for programming the J- Machine, a MIMD message-passing computer containing thousands of 36-bit processors connected by a very low latency network. This thesis describes in detail Concurrent Smalltalk and its implementation on the J-Machine, including the Optimist II global optimizing compiler and Cosmos fine-grain parallel operating system. Quantitative and qualitative results are presented.


Author[s]: Maja Mataric

A Comparative Analysis of Reinforcement Learning Methods

October 1991



This paper analyzes the suitability of reinforcement learning (RL) for both programming and adapting situated agents. We discuss two RL algorithms: Q-learning and the Bucket Brigade. We introduce a special case of the Bucket Brigade, and analyze and compare its performance to Q in a number of experiments. Next we discuss the key problems of RL: time and space complexity, input generalization, sensitivity to parameter values, and selection of the reinforcement function. We address the tradeoffs between the built-in and learned knowledge and the number of training examples required by a learning algorithm. Finally, we suggest directions for future research.


Author[s]: Elizabeth Bradley

A Control Algorithm for Chaotic Physical Systems

October 1991



Control algorithms which exploit the unique properties of chaos can vastly improve the design and performance of many practical and useful systems. The program Perfect Moment is built around such an algorithm. Given two points in the system's state space, it autonomously maps the space, chooses a set of trajectory segments from the maps, uses them to construct a composite path between the points, then causes the system to follow that path. This program is illustrated with two practical examples: the driven single pendulum and its electronic analog, the phase-locked loop. Strange attractor bridges, which alter the reachability of different state space points, can be used to increase the capture range of the circuit.


Author[s]: Michael Eisenberg

Programmable Applications: Interpreter Meets Interface

October 1991



Current fashion in "user-friendly'' software design tends to place an overreliance on direct manipulation interfaces. To be truly expressive (and thus truly user-friendly), applications need both learnable interfaces and domain-enriched languages that are accessible to the user. This paper discusses some of the design issues that arise in the creation of such programmable applications. As an example, we present "SchemePaint", a graphics application that combines a MacPaint-like interface with an interpreter for (a "graphics-enriched'') Scheme.


Author[s]: Ivan A. Bachelder

Contour Matching Using Local Affine Transformations

April 1992



Partial constraints are often available in visual processing tasks requiring the matching of contours in two images. We propose a non- iterative scheme to determine contour matches using locally affine transformations. The method assumes that contours are approximated by the orthographic projection of planar patches within oriented neighborhoods of varying size. For degenerate cases, a minimal matching solution is chosen closest to the minimal pure translation. Performance on noisy synthetic and natural contour imagery is reported.


Author[s]: Amnon Shashua

Correspondence and Affine Shape from Two Orthographic Views: Motion and Recognition

December 1991



The paper presents a simple model for recovering affine shape and correspondence from two orthographic views of a 3D object. It is shown that four corresponding points along two orthographic views, taken under similar illumination conditions, determine affine shape and correspondence for all other points. The scheme is useful for purposes of visual recognition by generating novel views of an object given two model views. It is also shown that the scheme can handle objects with smooth boundaries, to a good approximation, without introducing any modifications or additional model views.


Author[s]: Randall Davis

Intellectual Property and Software: The Assumptions are Broken

November 1991



In March 1991 the World Intellectual Property Organization held an international symposium attended primarily by lawyers, to discuss the questions that artificial intelligence poses for intellectual property law (i.e., copyright and patents). This is an edited version of a talk presented there, which argues that AI poses few problems in the near term and that almost all the truly challenging issues arise instead from software in general. The talk was an attempt to bridge the gap between the legal community and the software community, to explain why existing concepts and categories in intellectual property law present such difficult problems for software, and why software as a technology breaks several important assumptions underlying intellectual property law.


Author[s]: Harold Abelson, Andrew A. Berlin, Jacob Katzenelson, William H. McAllister, Guillermo J. Rozas, Gerald Jay Sussman and Jack Wisdom

The Supercomputer Toolkit: A General Framework for Special-purpose Computing

November 1991



The Toolkit is a family of hardware modules (processors, memory, interconnect, and input- output devices) and a collection of software modules (compilers, simulators, scientific libraries, and high-level front ends) from which high-performance special-purpose computers can be easily configured and programmed. The hardware modules are intended to be standard, reusable parts. These are combined by means of a user- reconfigurable, static interconnect technology. T he Toolkit's software support, based on n ovel compilation techniques, produces e xtremely high- performance numerical code from high-level language input, and will eventually automatically configure hardware modules for particular applications.


Author[s]: David L. Brock

Review of Artificial Muscle Based on Contractile Polymers

November 1991



An artificial muscle with strength and speed equal to that of a human muscle may soon be possible. Polymer gels exhibit abrubt volume changes in response to variations in their external conditions -- shrinking or swelling up to 1000 times their original volume. Through the conversion of chemical or electrical energy into mechanical work, a number of devices have already been constructed which produce forces up to 100N/cm2 and contraction rates on the order of a second. Through the promise of an artificial muscle is real, many fundamental physical and engineering questions remain before the extent or limit of these devices is known.


Author[s]: David L. Brock

Dynamic Model and Control of an Artificial Muscle Based on Contractile Polymers

November 1991



A dynamic model and control system of an artificial muscle is presented. The artificial muscle is based on a contractile polymer gel which undergoes abrupt volume changes in response to variations in external conditions. The device uses an acid-base reaction to directly convert chemical to mechanical energy. A nonlinear sliding mode control system is proposed to track desired joint trajectories of a single link controlled by two antagonist muscles. Both the model and controller were implemented and produced acceptable tracking performance at 2Hz.


Author[s]: Ronen Basri

The Alignment of Objects With Smooth Surfaces: Error Analysis of the Curvature Method

November 1991



The recognition of objects with smooth bounding surfaces from their contour images is considerably more complicated than that of objects with sharp edges, since in the former case the set of object points that generates the silhouette contours changes from one view to another. The "curvature method", developed by Basri and Ullman [1988], provides a method to approximate the appearance of such objects from different viewpoints. In this paper we analyze the curvature method. We apply the method to ellipsoidal objects and compute analytically the error obtained for different rotations of the objects. The error depends on the exact shape of the ellipsoid (namely, the relative lengths of its axes), and it increases a sthe ellipsoid becomes "deep" (elongated in the Z-direction). We show that the errors are usually small, and that, in general, a small number of models is required to predict the appearance of an ellipsoid from all possible views. Finally, we show experimentally that the curvature method applies as well to objects with hyperbolic surface patches.


Author[s]: Ronen Basri

On The Uniqueness of Correspondence Under Orthographic and Perspective Projections

December 1991



The task of shape recovery from a motion sequence requires the establishment of correspondence between image points. The two processes, the matching process and the shape recovery one, are traditionally viewed as independent. Yet, information obtained during the process of shape recovery can be used to guide the matching process. This paper discusses the mutual relationship between the two processes. The paper is divided into two parts. In the first part we review the constraints imposed on the correspondence by rigid transformations and extend them to objects that undergo general affine (non rigid) transformation (including stretch and shear), as well as to rigid objects with smooth surfaces. In all these cases corresponding points lie along epipolar lines, and these lines can be recovered from a small set of corresponding points. In the second part of the paper we discuss the potential use of epipolar lines in the matching process. We present an algorithm that recovers the correspondence from three contour images. The algorithm was implemented and used to construct object models for recognition. In addition we discuss how epipolar lines can be used to solve the aperture problem.


Author[s]: M. Ali Taalebinezhaad

Towards Autonomous Motion Vision

April 1992



Earlier, we introduced a direct method called fixation for the recovery of shape and motion in the general case. The method uses neither feature correspondence nor optical flow. Instead, it directly employs the spatiotemporal gradients of image brightness. This work reports the experimental results of applying some of our fixation algorithms to a sequence of real images where the motion is a combination of translation and rotation. These results show that parameters such as the fization patch size have crucial effects on the estimation of some motion parameters. Some of the critical issues involved in the implementaion of our autonomous motion vision system are also discussed here. Among those are the criteria for automatic choice of an optimum size for the fixation patch, and an appropriate location for the fixation point which result in good estimates for important motion parameters. Finally, a calibration method is described for identifying the real location of the rotation axis in imaging systems.


Author[s]: Tomaso Poggio, Manfred Fahle and Shimon Edelman

Fast Perceptual Learning in Visual Hyperacuity

December 1991



In many different spatial discrimination tasks, such as in determining the sign of the offset in a vernier stimulus, the human visual system exhibits hyperacuity-level performance by evaluating spatial relations with the precision of a fraction of a photoreceptor"s diameter. We propose that this impressive performance depends in part on a fast learning process that uses relatively few examples and occurs at an early processing stage in the visual pathway. We show that this hypothesis is plausible by demonstrating that it is possible to synthesize, from a small number of examples of a given task, a simple (HyperBF) network that attains the required performance level. We then verify with psychophysical experiments some of the key predictions of our conjecture. In particular, we show that fast timulus-specific learning indeed takes place in the human visual system and that this learning does not transfer between two slightly different hyperacuity tasks.


Author[s]: Patrick G. Sobalvarro

Calculation of Blocking Probabilities in Multistage Interconnection Networks with Redundant Paths

December 1991



The blocking probability of a network is a common measure of its performance. There exist means of quickly calculating the blocking probabilities of Banyan networks; however, because Banyan networks have no redundant paths, they are not inherently fault-tolerant, and so their use in large-scale multiprocessors is problematic. Unfortunately, the addition of multiple paths between message sources and sinks in a network complicates the calculation of blocking probabilities. A methodology for exact calculation of blocking probabilities for small networks with redundant paths is presented here, with some discussion of its potential use in approximating blocking probabilities for large networks with redundant paths.


Author[s]: Lynn Andrea Stein and Leora Morgenstern

Motivated Action Theory: A Formal Theory of Causal Reasoning

December 1991



When we reason about change over time, causation provides an implicit preference: we prefer sequences of situations in which one situation leads causally to the next, rather than sequences in which one situation follows another at random and without causal connections. In this paper, we explore the problem of temporal reasoning --- reasoning about change over time --- and the crucial role that causation plays in our intuitions. We examine previous approaches to temporal reasoning, and their shortcomings, in light of this analysis. We propose a new system for causal reasoning, motivated action theory, which builds upon causation as a crucial preference creterion. Motivated action theory solves the traditional problems of both forward and backward reasoning, and additionally provides a basis for a new theory of explanation.


Author[s]: David McAllester and David Rosenblatt

Systematic Nonlinear Planning

December 1991



This paper presents a simple, sound, complete, and systematic algorithm for domain independent STRIPS planning. Simplicity is achieved by starting with a ground procedure and then applying a general and independently verifiable, lifting transformation. Previous planners have been designed directly as lifted procedures. Our ground procedure is a ground version of Tate's NONLIN procedure. In Tate's procedure one is not required to determine whether a prerequisite of a step in an unfinished plan is guarnateed to hold in all linearizations. This allows Tate"s procedure to avoid the use of Chapman"s modal truth criterion. Systematicity is the property that the same plan, or partial plan, is never examined more than once. Systematicity is achieved through a simple modification of Tate's procedure.


Author[s]: David McAllester

Observations on Cognitive Judgments

December 1991



It is obvious to anyone familiar with the rules of the game of chess that a king on an empty board can reach every square. It is true, but not obvious, that a knight can reach every square. Why is the first fact obvious but the second fact not? This paper presents an analytic theory of a class of obviousness judgments of this type. Whether or not the specifics of this analysis are correct, it seems that the study of obviousness judgments can be used to construct integrated theories of linguistics, knowledge representation, and inference.


Author[s]: Robert Givan, David McAllester and Sameer Shalaby

Natural Language Based Inference Procedures Applied to Schubert's Steamroller

December 1991



We have previously argued that the syntactic structure of natural language can be exploited to construct powerful polynomial time inference procedures. This paper supports the earlier arguments by demonstrating that a natural language based polynomial time procedure can solve Schubert's steamroller in a single step.


Author[s]: David McAllester

Grammar Rewriting

December 1991



We present a term rewriting procedure based on congruence closure that can be used with arbitrary equational theories. This procedure is motivated by the pragmatic need to prove equations in equational theories where confluence can not be achieved. The procedure uses context free grammars to represent equivalence classes of terms. The procedure rewrites grammars rather than terms and uses congruence closure to maintain certain congruence properties of the grammar. Grammars provide concise representations of large term sets. Infinite term sets can be represented with finite grammars and exponentially large term sets can be represented with linear sized grammars.


Author[s]: David McAllester and Jeffrey Siskind

Lifting Transformations

December 1991



Lifting is a well known technique in resolution theorem proving, logic programming, and term rewriting. In this paper we formulate lifting as an efficiency-motivated program transformation applicable to a wide variety of nondeterministic procedures. This formulation allows the immediate lifting of complex procedures, such as the Davis- Putnam algorithm, which are otherwise difficult to lift. We treat both classical lifting, which is based on unification, and various closely related program transformations which we also call lifting transformations. These nonclassical lifting transformations are closely related to constraint techniques in logic programming, resolution, and term rewriting.


Author[s]: Robert Givan and David McAllester

Tractable Inference Relations

December 1991



We consider the concept of local sets of inference rules. Locality is a syntactic condition on rule sets which guarantees that the inference relation defined by those rules is polynomial time decidable. Unfortunately, determining whether a given rule set is local can be difficult. In this paper we define inductive locality, a strengthening of locality. We also give a procedure which can automatically recognize the locality of any inductively local rule set. Inductive locality seems to be more useful that the earlier concept of strong locality. We show that locality, as a property of rule sets, is undecidable in general.


Author[s]: Michael de la Maza and Bruce Tidor

Boltzmannn Weighted Selection Improves Performance of Genetic Algorithms

December 1991



Modifiable Boltzmann selective pressure is investigated as a tool to control variability in optimizations using genetic algorithms. An implementation of variable selective pressure, modeled after the use of temperature as a parameter in simulated annealing approaches, is described. The convergence behavior of optimization runs is illustrated as a function of selective pressure; the method is compared to a genetic algorithm lacking this control feature and is shown to exhibit superior convergence properties on a small set of test problems. An analysis is presented that compares the selective pressure of this algorithm to a standard selection procedure.


Author[s]: P. A. Skordos

Time-Reversible Maxwell's Demon

September 1992



A time-reversible Maxwell's demon is demonstrated which creates a density difference between two chambers initialized to have equal density. The density difference is estimated theoretically and confirmed by computer simulations. It is found that the reversible Maxwell's demon compresses phase space volume even though its dynamics are time reversible. The significance of phase space volume compression in operating a microscopic heat engine is also discussed.


Author[s]: Tomaso Poggio and Thomas Vetter

Recognition and Structure from One 2D Model View: Observations on Prototypes, Object Classes and Symmetries

February 1992



In this note we discuss how recognition can be achieved from a single 2D model view exploiting prior knowledge of an object's structure (e.g. symmetry). We prove that for any bilaterally symmetric 3D object one non- accidental 2D model view is sufficient for recognition. Symmetries of higher order allow the recovery of structure from one 2D view. Linear transformations can be learned exactly from a small set of examples in the case of "linear object classes" and used to produce new views of an object from a single view.


Author[s]: Shimon Edelman, Heinrich Bulthoff, and Erik Sklar

Task and Object Learning in Visual Recognition

January 1991



Human performance in object recognition changes with practice, even in the absence of feedback to the subject. The nature of the change can reveal important properties of the process of recognition. We report an experiment designed to distinguish between non-specific task learning and object- specific practice effects. The results of the experiment support the notion that learning through modification of object representations can be separated from less interesting effects of practice, if appropriate response measures (specifically, the coefficient of variation of response time over views of an object) are used. Furthermore, the results, obtained with computer-generated amoeba-like objects, corroborate previous findings regarding the development of canonical views and related phenomena with practice.


Author[s]: Kah-Kay Sung

A Vector Signal Processing Approach to Color

January 1992



Surface (Lambertain) color is a useful visual cue for analyzing material composition of scenes. This thesis adopts a signal processing approach to color vision. It represents color images as fields of 3D vectors, from which we extract region and boundary information. The first problem we face is one of secondary imaging effects that makes image color different from surface color. We demonstrate a simple but effective polarization based technique that corrects for these effects. We then propose a systematic approach of scalarizing color, that allows us to augment classical image processing tools and concepts for multi-dimensional color signals.


Author[s]: Lyle J. Borg-Graham

On Directional Selectivity in Vertebrate Retina: An Experimental and Computational Study

January 1992



This thesis describes an investigation of retinal directional selectivity. We show intracellular (whole-cell patch) recordings in turtle retina which indicate that this computation occurs prior to the ganglion cell, and we describe a pre-ganglionic circuit model to account for this and other findings which places the non-linear spatio-temporal filter at individual, oriented amacrine cell dendrites. The key non-linearity is provided by interactions between excitatory and inhibitory synaptic inputs onto the dendrites, and their distal tips provide directionally selective excitatory outputs onto ganglion cells. Detailed simulations of putative cells support this model, given reasonable parameter constraints. The performance of the model also suggests that this computational substructure may be relevant within the dendritic trees of CNS neurons in general.


Author[s]: David W. Jacobs

Space Efficient 3D Model Indexing

February 1992



We show that we can optimally represent the set of 2D images produced by the point features of a rigid 3D model as two lines in two high-dimensional spaces. We then decribe a working recognition system in which we represent these spaces discretely in a hash table. We can access this table at run time to find all the groups of model features that could match a group of image features, accounting for the effects of sensing error. We also use this representation of a model's images to demonstrate significant new limitations of two other approaches to recognition: invariants, and non- accidental properties.


Author[s]: Tomaso Poggio and Roberto Brunelli

A Novel Approach to Graphics

February 1992



sWe show that we can optimally represent the set of 2D images producedsby the point features of a rigid 3D model as two lines in twoshigh-dimensional spaces. We then decribe a working recognition systemsin which we represent these spaces discretely in a hash table. We cansaccess this table at run time to find all the groups of model featuressthat could match a group of image features, accounting for the effectssof sensing error. We also use this representation of a model's imagessto demonstrate significant new limitations of two other approaches tosrecognition: invariants, and non-accidental properties.


Author[s]: Stephen W. Keckler

A Coupled Multi-ALU Processing Node for a Highly Parallel Computer

September 1992



This report describes Processor Coupling, a mechanism for controlling multiple ALUs on a single integrated circuit to exploit both instruction-level and inter-thread parallelism. A compiler statically schedules individual threads to discover available intra-thread instruction-level parallelism. The runtime scheduling mechanism interleaves threads, exploiting inter-thread parallelism to maintain high ALU utilization. ALUs are assigned to threads on a cycle byscycle basis, and several threads can be active concurrently. Simulation results show that Processor Coupling performs well both on single threaded and multi-threaded applications. The experiments address the effects of memory latencies, function unit latencies, and communication bandwidth between function units.


Author[s]: W. Richards and A. Jepson

What Makes a Good Feature?

April 1992



Using a Bayesian framework, we place bounds on just what features are worth computing if inferences about the world properties are to be made from image data. Previously others have proposed that useful features reflect "non-accidental'' or "suspicious'' configurations (such as parallel or colinear lines). We make these notions more precise and show them to be context sensitive.


Author[s]: Lynne E. Parker

Local Versus Global Control Laws for Cooperative Agent Teams

March 1992



The design of the control laws governing the behavior of individual agents is crucial for the successful development of cooperative agent teams. These control laws may utilize a combination of local and/or global knowledge to achieve the resulting group behavior. A key difficulty in this development is deciding the proper balance between local and global control required to achieve the desired emergent group behavior. This paper addresses this issue by presenting some general guidelines and principles for determining the appropriate level of global versus local control. These principles are illustrated and implemented in a "keep formation'' cooperative task case study.


Author[s]: Linda M. Wills

Automated Program Recognition by Graph Parsing

July 1992



Recognizing standard computational structures (cliches) in a program can help an experienced programmer understand the program. We develop a graph parsing approach to automating program recognition in which programs and cliches are represented in an attributed graph grammar formalism and recognition is achieved by graph parsing. In studying this approach, we evaluate our representation's ability to suppress many common forms of variation which hinder recognition. We investigate the expressiveness of our graph grammar formalism for capturing programming cliches. We empirically and analytically study the computational cost of our recognition approach with respect to two medium-sized, real-world simulator programs.


Author[s]: Gerald J. Sussman and Jack Wisdom

Chaotic Evolution of the Solar System

March 1992



The evolution of the entire planetary system has been numerically integrated for a time span of nearly 100 million years. This calculation confirms that the evolution of the solar system as a whole is chaotic, with a time scale of exponential divergence of about 4 million years. Additional numerical experiments indicate that the Jovian planet subsystem is chaotic, although some small variations in the model can yield quasiperiodic motion. The motion of Pluto is independently and robustly chaotic.


Author[s]: W. Eric Grimson, Daniel P. Huttenlocher and T. D. Alter

Recognizing 3D Ojbects of 2D Images: An Error Analysis

July 1992



Many object recognition systems use a small number of pairings of data and model features to compute the 3D transformation from a model coordinate frame into the sensor coordinate system. With perfect image data, these systems work well. With uncertain image data, however, their performance is less clear. We examine the effects of 2D sensor uncertainty on the computation of 3D model transformations. We use this analysis to bound the uncertainty in the transformation parameters, and the uncertainty associated with transforming other model features into the image. We also examine the impact of the such transformation uncertainty on recognition methods.


Author[s]: Amnon Shashua

Projective Structure from Two Uncalibrated Images: Structure from Motion and RecRecognition

September 1992



This paper addresses the problem of recovering relative structure, in the form of an invariant, referred to as projective structure, from two views of a 3D scene. The invariant structure is computed without any prior knowledge of camera geometry, or internal calibration, and with the property that perspective and orthographic projections are treated alike, namely, the system makes no assumption regarding the existence of perspective distortions in the input images.


Author[s]: Patrick G. Sobalvarro

Probabilistic Analysis of Multistage Interconnection Network Performance

April 1992



We present methods of calculating the value of two performance parameters for multipath, multistage interconnection networks: the normalized throughput and the probability of successful message transmission. We develop a set of exact equations for the loading probability mass functions of network channels and a program for solving them exactly. We also develop a Monte Carlo method for approxmiate solution of the equations, and show that the resulting approximation method will always calculate the values of the performance parameters more quickly than direct simulation.


Author[s]: Timothy D. Tuttle

Understanding and Modeling the Behavior of a Harmonic Drive Gear Transmission

May 1992



In my research, I have performed an extensive experimental investigation of harmonic-drive properties such as stiffness, friction, and kinematic error. From my experimental results, I have found that these properties can be sharply non-linear and highly dependent on operating conditions. Due to the complex interaction of these poorly behaved transmission properties, dynamic response measurements showed surprisingly agitated behavior, especially around system resonance. Theoretical models developed to mimic the observed response illustrated that non-linear frictional effects cannot be ignored in any accurate harmonic-drive representation. Additionally, if behavior around system resonance must be replicated, kinematic error and transmission compliance as well as frictional dissipation from gear- tooth rubbing must all be incorporated into the model.


Author[s]: Thomas Marill

Why Do We See Three-dimensional Objects?

June 1992



When we look at certain line-drawings, we see three-dimensional objects. The question is why; why not just see two-dimensional images? We theorize that we see objects rather than images because the objects we see are, in a certain mathematical sense, less complex than the images; and that furthermore the particular objects we see will be the least complex of the available alternatives. Experimental data supporting the theory is reported. The work is based on ideas of Solomonoff, Kolmogorov, and the "minimum description length'' concepts of Rissanen.


Author[s]: Kenneth W. Chang

Shaping Inputs to Reduce Vibration in Flexible Space Structures

June 1992



Future NASA plans to launch large space strucutres solicit the need for effective vibration control schemes which can solve the unique problems associated with unwanted residual vibration in flexible spacecraft. In this work, a unique method of input command shaping called impulse shaping is examined. A theoretical background is presented along with some insight into the methdos of calculating multiple mode sequences. The Middeck Active Control Experiment (MACE) is then described as the testbed for hardware experiments. These results are shown and some of the difficulties of dealing with nonlinearities are discussed. The paper is concluded with some conclusions about calculating and implementing impulse shaping in complex nonlinear systems.


Author[s]: Michael D. Ernst (Editor)

Intellectual Property in Computing: (How) Should Software Be Protected? An Industry Perspective

May 1992



The future of the software industry is today being shaped in the courtroom. Most discussions of intellectual property to date, however, have been frames as debates about how the existing law --- promulgated long before the computer revolution --- should be applied to software. This memo is a transcript of a panel discussion on what forms of legal protection should apply to software to best serve both the industry and society in general. After addressing that question we can consider what laws would bring this about.


Author[s]: Vijay Balasubramanian

Equivalence and Reduction of Hidden Markov Models

January 1993



This report studies when and why two Hidden Markov Models (HMMs) may represent the same stochastic process. HMMs are characterized in terms of equivalence classes whose elements represent identical stochastic processes. This characterization yields polynomial time algorithms to detect equivalent HMMs. We also find fast algorithms to reduce HMMs to essentially unique and minimal canonical representations. The reduction to a canonical form leads to the definition of 'Generalized Markov Models' which are essentially HMMs without the positivity constraint on their parameters. We discuss how this generalization can yield more parsimonious representations of stochastic processes at the cost of the probabilistic interpretation of the model parameters.


Author[s]: B. Whitney Rappole, Jr.

Minimizing Residual Vibrations in Flexible Systems

June 1992



Residual vibrations degrade the performance of many systems. Due to the lightweight and flexible nature of space structures, controlling residual vibrations is especially difficult. Also, systems such as the Space Shuttle remote Manipulator System have frequencies that vary significantly based upon configuration and loading. Recently, a technique of minimizing vibrations in flexible structures by command input shaping was developed. This document presents research completed in developing a simple, closed- form method of calculating input shaping sequences for two-mode systems and a system to adapt the command input shaping technique to known changes in system frequency about the workspace. The new techniques were tested on a three-link, flexible manipulator.


Author[s]: Ronen Basri and Daphna Weinshall

Distance Metric Between 3D Models and 3D Images for Recognition and Classification

July 1992



Similarity measurements between 3D objects and 2D images are useful for the tasks of object recognition and classification. We distinguish between two types of similarity metrics: metrics computed in image-space (image metrics) and metrics computed in transformation-space (transformation metrics). Existing methods typically use image and the nearest view of the object. Example for such a measure is the Euclidean distance between feature points in the image and corresponding points in the nearest view. (Computing this measure is equivalent to solving the exterior orientation calibration problem.) In this paper we introduce a different type of metrics: transformation metrics. These metrics penalize for the deformatoins applied to the object to produce the observed image. We present a transformation metric that optimally penalizes for "affine deformations" under weak-perspective. A closed-form solution, together with the nearest view according to this metric, are derived. The metric is shown to be equivalent to the Euclidean image metric, in the sense that they bound each other from both above and below. For Euclidean image metric we offier a sub-optimal closed-form solution and an iterative scheme to compute the exact solution.


Author[s]: Thomas M. Breuel

Geometric Aspects of Visual Object Recognition

May 1992



This thesis presents there important results in visual object recognition based on shape. (1) A new algorithm (RAST; Recognition by Adaptive Sudivisions of Tranformation space) is presented that has lower average-case complexity than any known recognition algorithm. (2) It is shown, both theoretically and empirically, that representing 3D objects as collections of 2D views (the "View-Based Approximation") is feasible and affects the reliability of 3D recognition systems no more than other commonly made approximations. (3) The problem of recognition in cluttered scenes is considered from a Bayesian perspective; the commonly-used "bounded- error errorsmeasure" is demonstrated to correspond to an independence assumption. It is shown that by modeling the statistical properties of real-scenes better, objects can be recognized more reliably.


Author[s]: Nicola Ancona and Tomaso Poggio

Optical Flow From 1D Correlation: Application to a Simple Time-To-Crash Detector

October 1993



In the first part of this paper we show that a new technique exploiting 1D correlation of 2D or even 1D patches between successive frames may be sufficient to compute a satisfactory estimation of the optical flow field. The algorithm is well-suited to VLSI implementations. The sparse measurements provided by the technique can be used to compute qualitative properties of the flow for a number of different visual tsks. In particular, the second part of the paper shows how to combine our 1D correlation technique with a scheme for detecting expansion or rotation ([5]) in a simple algorithm which also suggests interesting biological implications. The algorithm provides a rough estimate of time-to-crash. It was tested on real image sequences. We show its performance and compare the results to previous approaches.


Author[s]: Ehud Rivlin and Ronen Basri

Localization and Positioning Using Combinations of Model Views

September 1992



A method for localization and positioning in an indoor environment is presented. The method is based on representing the scene as a set of 2D views and predicting the appearances of novel views by linear combinations of the model views. The method is accurate under weak perspective projection. Analysis of this projection as well as experimental results demonstrate that in many cases it is sufficient to accurately describe the scene. When weak perspective approximation is invalid, an iterative solution to account for the perspective distortions can be employed. A simple algorithm for repositioning, the task of returning to a previously visited position defined by a single view, is derived from this method.


Author[s]: Rajeev Surati

A Parallelizing Compiler Based on Partial Evaluation

July 1993



We constructed a parallelizing compiler that utilizes partial evaluation to achieve efficient parallel object code from very high-level data independent source programs. On several important scientific applications, the compiler attains parallel performance equivalent to or better than the best observed results from the manual restructuring of code. This is the first attempt to capitalize on partial evaluation's ability to expose low-level parallelism. New static scheduling techniques are used to utilize the fine-grained parallelism of the computations. The compiler maps the computation graph resulting from partial evaluation onto the Supercomputer Toolkit, an eight VLIW processor parallel computer.


Author[s]: T.D. Alter

3D Pose from Three Corresponding Points Under Weak-Perspective Projection

July 1992



Model-based object recognition commonly involves using a minimal set of matched model and image points to compute the pose of the model in image coordinates. Furthermore, recognition systems often rely on the "weak-perspective" imaging model in place of the perspective imaging model. This paper discusses computing the pose of a model from three corresponding points under weak-perspective projection. A new solution to the problem is proposed which, like previous solutins, involves solving a biquadratic equation. Here the biquadratic is motivate geometrically and its solutions, comprised of an actual and a false solution, are interpreted graphically. The final equations take a new form, which lead to a simple expression for the image position of any unmatched model point.


Author[s]: M. Ali Taalebinezhaad

Visual Tracking

October 1992



A typical robot vision scenario might involve a vehicle moving with an unknown 3D motion (translation and rotation) while taking intensity images of an arbitrary environment. This paper describes the theory and implementation issues of tracking any desired point in the environment. This method is performed completely in software without any need to mechanically move the camera relative to the vehicle. This tracking technique is simple an inexpensive. Furthermore, it does not use either optical flow or feature correspondence. Instead, the spatio-temporal gradients of the input intensity images are used directly. The experimental results presented support the idea of tracking in software. The final result is a sequence of tracked images where the desired point is kept stationary in the images independent of the nature of the relative motion. Finally, the quality of these tracked images are examined using spatio-temporal gradient maps.


Author[s]: M. Ali Taalebinezhaad

Robot Motion Vision by Fixation

September 1992



In many motion-vision scenarios, a camera (mounted on a moving vehicle) takes images of an environment to find the "motion'' and shape. We introduce a direct-method called fixation for solving this motion-vision problem in its general case. Fixation uses neither feature-correspondence nor optical-flow. Instead, spatio-temporal brightness gradients are used directly. In contrast to previous direct methods, fixation does not restrict the motion or the environment. Moreover, fixation method neither requires tracked images as its input nor uses mechanical tracking for obtaining fixated images. The experimental results on real images are presented and the implementation issues and techniques are discussed.


Author[s]: Feng Zhao

Automatic Analysis and Synthesis of Controllers for Dynamical Systems Based On P

September 1992



I present a novel design methodology for the synthesis of automatic controllers, together with a computational environment---the Control Engineer's Workbench---integrating a suite of programs that automatically analyze and design controllers for high-performance, global control of nonlinear systems. This work demonstrates that difficult control synthesis tasks can be automated, using programs that actively exploit and efficiently represent knowledge of nonlinear dynamics and phase space and effectively use the representation to guide and perform the control design. The Control Engineer's Workbench combines powerful numerical and symbolic computations with artificial intelligence reasoning techniques. As a demonstration, the Workbench automatically designed a high-quality maglev controller that outperforms a previous linear design by a factor of 20.


Author[s]: Elizabeth Bradley

Taming Chaotic Circuits

September 1992



Control algorithms that exploit chaotic behavior can vastly improve the performance of many practical and useful systems. The program Perfect Moment is built around a collection of such techniques. It autonomously explores a dynamical system's behavior, using rules embodying theorems and definitions from nonlinear dynamics to zero in on interesting and useful parameter ranges and state-space regions. It then constructs a reference trajectory based on that information and causes the system to follow it. This program and its results are illustrated with several examples, among them the phase- locked loop, where sections of chaotic attractors are used to increase the capture range of the circuit.


Author[s]: Jose L. Marroquin and Federico Girosi

Some Extensions of the K-Means Algorithm for Image Segmentation and Pattern Classification

January 1993



In this paper we present some extensions to the k-means algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorithm, it is possible to find the representative centers fo the lower dimensional maniforlds that define the boundaries between classes, for clouds of multi-dimensional, mult-class data; this permits one, for example, to find class boundaries directly from sparse data (e.g., in image segmentation tasks) or to efficiently place centers for pattern classification (e.g., with local Gaussian classifiers). The same state variables can be used to define algorithms for determining adaptively the optimal number of centers for clouds of data with space-varying density. Some examples of the applicatin of these extensions are also given.


Author[s]: Ronen Basri

Recognition by Prototypes

December 1992



A scheme for recognizing 3D objects from single 2D images is introduced. The scheme proceeds in two stages. In the first stage, the categorization stage, the image is compared to prototype objects. For each prototype, the view that most resembles the image is recovered, and, if the view is found to be similar to the image, the class identity of the object is determined. In the second stage, the identification stage, the observed object is compared to the individual models of its class, where classes are expected to contain objects with relatively similar shapes. For each model, a view that matches the image is sought. If such a view is found, the object's specific identity is determined. The advantage of categorizing the object before it is identified is twofold. First, the image is compared to a smaller number of models, since only models that belong to the object's class need to be considered. Second, the cost of comparing the image to each model in a classis very low, because correspondence is computed once for the whoel class. More specifically, the correspondence and object pose computed in the categorization stage to align the prototype with the image are reused in the identification stage to align the individual models with the image. As a result, identification is reduced to a series fo simple template comparisons. The paper concludes with an algorithm for constructing optimal prototypes for classes of objects.


Author[s]: Ronald D. Chaney

Analytical Representation of Contours

October 1992



The interpretation and recognition of noisy contours, such as silhouettes, have proven to be difficult. One obstacle to the solution of these problems has been the lack of a robust representation for contours. The contour is represented by a set of pairwise tangent circular arcs. The advantage of such an approach is that mathematical properties such as orientation and curvature are explicityly represented. We introduce a smoothing criterion for the contour tht optimizes the tradeoff between the complexity of the contour and proximity of the data points. The complexity measure is the number of extrema of curvature present in the contour. The smoothing criterion leads us to a true scale-space for contours. We describe the computation of the contour representation as well as the computation of relevant properties of the contour. We consider the potential application of the representation, the smoothing paradigm, and the scale-space to contour interpretation and recognition.


Author[s]: Karen B. Sarachik

Limitations of Geometric Hashing in the Presence of Gaussian Noise

October 1992



This paper presents a detailed error analysis of geometric hashing for 2D object recogition. We analytically derive the probability of false positives and negatives as a function of the number of model and image, features and occlusion, using a 2D Gaussian noise model. The results are presented in the form of ROC (receiver-operating characteristic) curves, which demonstrate that the 2D Gaussian error model always has better performance than that of the bounded uniform model. They also directly indicate the optimal performance that can be achieved for a given clutter and occlusion rate, and how to choose the thresholds to achieve these rates.


Author[s]: Michael J. Jones

Using Recurrent Networks for Dimensionality Reduction

September 1992



This report explores how recurrent neural networks can be exploited for learning high- dimensional mappings. Since recurrent networks are as powerful as Turing machines, an interesting question is how recurrent networks can be used to simplify the problem of learning from examples. The main problem with learning high-dimensional functions is the curse of dimensionality which roughly states that the number of examples needed to learn a function increases exponentially with input dimension. This thesis proposes a way of avoiding this problem by using a recurrent network to decompose a high-dimensional function into many lower dimensional functions connected in a feedback loop.


Author[s]: Ronald Chaney

Complexity as a Sclae-Space for the Medial Axis Transform

January 1993



The medial axis skeleton is a thin line graph that preserves the topology of a region. The skeleton has often been cited as a useful representation for shape description, region interpretation, and object recognition. Unfortunately, the computation of the skeleton is extremely sensitive to variations in the bounding contour. In this paper, we describe a robust method for computing the medial axis skeleton across a variety of scales. The resulting scale-space is parametric with the complexity of the skeleton, where the complexity is defined as the number of branches in the skeleton.


Author[s]: William M. Wells III

Statistical Object Recognition

January 1993



Two formulations of model-based object recognition are described. MAP Model Matching evaluates joint hypotheses of match and pose, while Posterior Marginal Pose Estimation evaluates the pose only. Local search in pose space is carried out with the Expectation--Maximization (EM) algorithm. Recognition experiments are described where the EM algorithm is used to refine and evaluate pose hypotheses in 2D and 3D. Initial hypotheses for the 2D experiments were generated by a simple indexing method: Angle Pair Indexing. The Linear Combination of Views method of Ullman and Basri is employed as the projection model in the 3D experiments.


Author[s]: S. Tanveer F. Mahmood

Data and Model-Driven Selection Using Parallel-Line Groups

May 1993



A key problem in model-based object recognition is selection, namely, the problem of isolating regions in an image that are likely to come from a single object. This isolation can be either based solely on image data (data-driven) or can incorporate the knowledge of the model object (model- driven). In this paper we present an approach that exploits the property of closely-spaced parallelism between lines on objects to achieve data and model-driven selection. Specifically, we present a method of identifying groups of closely-spaced parallel lines in images that generates a linear number of small-sized and reliable groups thus meeting several of the desirable requirements of a grouping scheme for recognition. The line groups generated form the basis for data and model-driven selection. Data-driven selection is achieved by selecting salient line groups as judged by a saliency measure that emphasizes the likelihood of the groups coming from single objects. The approach to model-driven selection, on the other hand, uses the description of closely- spaced parallel line groups on the model object to selectively generate line groups in the image that are likely to eb the projections of the model groups under a set of allowable transformations and taking into account the effect of occlusions, illumination changes, and imaging errors. We then discuss the utility of line groups-based selection in the context of reducing the search involved in recognition, both as an independent selection mechanism, and when used in combination with other cues such as color. Finally, we present results that indicate a vast improvement in the performance of a recognition system that is integrated with parallel line groups-based selection.

horizontal line

MIT logo Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu