Research Abstracts - 2006

Intent Recognition

Shuonan Dong & Brian C. Williams


A friend extends his hand, grasps a cup, and brings it to his mouth. I can infer with some confidence that his intent is to drink water. If I knew of a better way to accomplish his intention, I might tell him; for example, I might hand him a straw. Furthermore, knowing his intentions, I might be able to predict his next action and assist him: here, I might predict that my friend will need a napkin and give him one. Intent recognition occurs naturally among humans, especially during collaborative tasks.

Endowing robots or autonomous agents with the capability of intent recognition is a difficult problem. To date, NASA's most advanced humanoid robot, Robonaut (shown in Figure 1), is still tele-operated by a remote human controller.

Figure 1. Robonaut is tele-operated to shake hands with an astronaut.

The goal of this intent recognition research is to create an innovative and appropriate representation for the problem, develop a tractable solution method, and demonstrate it working on a hardware or software platform.


The problem of intent recognition can be broken down into the modules shown in Figure 2. The data processor extracts the basic poses of the observed agent from the sensory data, using machine vision or other methods as appropriate. The plan recognizer then uses the poses and temporal data to determine the agent's intent. The pose data may be abstracted repeatedly to form increasingly abstract actions.
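The two-module decomposition above can be sketched in code. This is only an illustrative skeleton under invented names (`data_processor`, `plan_recognizer`, the pose labels, and the plan library are all assumptions, not the authors' implementation): the data processor stands in for machine vision that maps raw sensor frames to timestamped symbolic poses, and the plan recognizer matches the pose sequence against a library of known plans.

```python
# Hypothetical sketch of the pipeline in Figure 2; all names and the
# pose vocabulary are invented for illustration.

def data_processor(sensor_frames):
    """Stand-in for machine vision: map raw frames to (time, pose) pairs."""
    return [(frame["t"], frame["label"]) for frame in sensor_frames]

def plan_recognizer(pose_sequence, plan_library):
    """Return the first plan whose pose pattern matches the observations."""
    poses = [pose for _, pose in pose_sequence]
    for intent, pattern in plan_library.items():
        if poses == pattern:
            return intent
    return None  # no known plan explains the observations

# A toy plan library: each intent is a sequence of abstract poses.
plan_library = {
    "drink":     ["reach", "grasp", "raise"],
    "hand over": ["reach", "grasp", "extend"],
}

frames = [{"t": 0.0, "label": "reach"},
          {"t": 0.8, "label": "grasp"},
          {"t": 1.5, "label": "raise"}]
print(plan_recognizer(data_processor(frames), plan_library))  # drink
```

A real recognizer would of course tolerate noise and partial matches rather than require exact equality; this sketch only shows how the modules compose.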

Figure 2. Intent recognition process

Previous Work

A good summary of the different techniques for solving the plan recognition problem can be found in [2]. Some researchers have used a Dempster-Shafer approach [1], while others have used Bayesian probabilistic approaches [4]. Some have incorporated plan recognition into human-computer collaboration applications [5], and more recently there has been discussion of multiple-goal recognition [3].

Points of Innovation

We model the agent's behavior as a temporal plan network using a reactive model-based programming language. Temporal constraints are infrequently discussed in current plan recognition research, but it seems reasonable to conjecture that the timing of an agent's actions can be very useful for identifying the agent's plans, especially when the data are incomplete or noisy. For example, given only that a person is reaching toward the sink and the amount of time spent, we might infer that he or she is turning off the faucet if the action was quick, or washing hands if it was slow. We use probabilistic techniques to model uncertainty, and we incorporate learning into the plan recognizer.
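The faucet example above can be made concrete with a minimal probabilistic sketch. Assuming (purely for illustration; the duration models and their parameters are invented, not taken from the authors' system) that each intent has a Gaussian model of how long the action takes, Bayes' rule turns an observed duration into a posterior over intents:

```python
import math

# Assumed Gaussian duration models (mean, std dev in seconds) per intent.
# These parameters are illustrative; in practice they could be learned.
DURATION_MODELS = {
    "turn off faucet": (1.0, 0.5),
    "wash hands":      (20.0, 8.0),
}

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def infer_intent(duration, prior=None):
    """Posterior P(intent | duration) under the assumed duration models."""
    if prior is None:
        prior = {intent: 1.0 / len(DURATION_MODELS) for intent in DURATION_MODELS}
    scores = {intent: prior[intent] * gaussian_pdf(duration, mean, std)
              for intent, (mean, std) in DURATION_MODELS.items()}
    total = sum(scores.values())
    return {intent: score / total for intent, score in scores.items()}

# A 1.2-second reach is far more consistent with turning off the faucet,
# while a 25-second action points to washing hands.
print(infer_intent(1.2))
print(infer_intent(25.0))
```

The same timing signal becomes even more informative when combined with the pose sequence, which is the motivation for reasoning over temporal constraints in the plan network.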


References

[1] Mathias Bauer. A Dempster-Shafer approach to modeling agent preferences for plan recognition. User Modeling and User-Adapted Interaction, 1995.

[2] Sandra Carberry. Techniques for plan recognition. User Modeling and User-Adapted Interaction, 11:31-48, 2001.

[3] Xiaoyong Chai and Qiang Yang. Multiple-goal recognition from low-level signals. In Proceedings of the 20th National Conference on Artificial Intelligence, 2005.

[4] Henry Kautz. A formal theory of plan recognition and its implementation. In J. Allen, H. Kautz, R. Pelavin, and J. Tenenberg, editors, Reasoning about Plans, pages 69-125. Morgan Kaufmann, San Mateo, CA, 1991.

[5] Neal Lesh, Charles Rich, and Candace Sidner. Using plan recognition in human-computer collaboration. In Proceedings of the Seventh International Conference on User Modeling, pages 23-32, 1999.

[6] Don Patterson, Lin Liao, Dieter Fox, and Henry Kautz. Inferring high level behavior from low level sensors. In Fifth Annual Conference on Ubiquitous Computing (UBICOMP 2003), Seattle, WA, 2003.


Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu