CSAIL Publications and Digital Archive

Research Abstracts - 2007

A Tripartite Architecture for Question Answering

Boris Katz, Gary Borchardt, Sue Felshin & Gregory Marton

The Problem

Robust question answering involves a combination of many processing capabilities: natural language understanding and generation, interactive dialog, matching, reasoning, data organization, information extraction, remote interfaces, and more. When constructing question answering systems, it is therefore critically important to choose an appropriate architecture: one that is sufficiently modular and flexible, and that includes abstraction layers in appropriate places.

Three particularly important requirements for such an architecture are:

  • handling diverse information resources, used singly or in combination,
  • facilitating a range of user–system interaction styles, to accommodate different types of situations that arise during question answering sessions, and
  • supporting the use of multiple processing strategies, including linguistic, knowledge-based and statistical techniques.

Approach

We have been elaborating a comprehensive question answering architecture that consists of three broad layers of operation, as illustrated in the following diagram:

[Figure: diagram of the tripartite architecture]

The top layer of this architecture supplies natural language processing functionality. Its responsibilities include analyzing questions, engaging in system–user dialog, and generating responses to the user. This layer also addresses a range of contingent circumstances, including the correction of misspelled words, clarification of ambiguities, handling of elliptical questions, and suggestion of related questions. In presenting responses, this layer is responsible for including source attributions and links to further information, and for providing explanations of fused information.
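Two of the contingencies handled by the top layer, misspelled words and elliptical questions, can be illustrated with a minimal sketch. The function names, the small vocabulary, and the pattern-based treatment of ellipsis are assumptions made for illustration only; they do not describe how START performs these tasks.

```python
import difflib
import re
from typing import List


def correct_spelling(question: str, vocabulary: List[str]) -> str:
    """Replace each unknown word with its closest vocabulary entry, if any."""
    canonical = {word.lower(): word for word in vocabulary}
    corrected = []
    for token in question.split():
        core = token.strip("?.,!")
        key = core.lower()
        if not core or key in canonical:
            corrected.append(token)  # known word or bare punctuation
            continue
        close = difflib.get_close_matches(key, list(canonical), n=1, cutoff=0.75)
        corrected.append(token.replace(core, canonical[close[0]]) if close else token)
    return " ".join(corrected)


def resolve_ellipsis(fragment: str, previous: str) -> str:
    """Expand a follow-up such as 'What about Italy?' using the previous question.

    Hypothetical rule: reuse everything up to the last ' of ' in the previous
    question and substitute the new object.
    """
    follow_up = re.fullmatch(r"(?:And|What about) (.+)\?", fragment.strip())
    frame = re.fullmatch(r"(.+ of )(.+)\?", previous.strip())
    if follow_up is None or frame is None:
        return fragment  # not an elliptical question we recognize
    return f"{frame.group(1)}{follow_up.group(1)}?"


vocabulary = ["What", "is", "the", "capital", "of", "France", "Italy"]
fixed = correct_spelling("What is the captal of France?", vocabulary)
expanded = resolve_ellipsis("What about Italy?", fixed)
```

In this sketch, the corrected first question supplies the frame for the follow-up, so the two steps compose in the order a dialog would present them.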

The bottom layer of the architecture establishes a uniform interface to diverse resources such as multi-media documents, databases, and resources available on the Web. In this capacity, it creates an abstraction layer that supports interaction with these resources while hiding differences in format, access protocols, and other details. In addition, the bottom layer facilitates the integration of different technologies (e.g., statistical and knowledge-based techniques) that might be relevant for different types of resources.
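The idea of a uniform interface over differently formatted resources can be sketched as follows. This is an illustration under stated assumptions, not the Omnibase interface: the class names, the object-attribute style of lookup, and the "object | attribute | value" text format are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class Resource(ABC):
    """Hypothetical uniform interface: look up one attribute of one object."""

    @abstractmethod
    def get(self, obj: str, attribute: str) -> Optional[str]:
        ...


class TableResource(Resource):
    """Wraps database-like rows keyed by object name."""

    def __init__(self, rows: Dict[str, Dict[str, str]]):
        self._rows = rows

    def get(self, obj: str, attribute: str) -> Optional[str]:
        return self._rows.get(obj, {}).get(attribute)


class TextResource(Resource):
    """Wraps free text with lines of the form 'object | attribute | value'."""

    def __init__(self, text: str):
        self._text = text

    def get(self, obj: str, attribute: str) -> Optional[str]:
        for line in self._text.splitlines():
            parts = [part.strip() for part in line.split("|")]
            if len(parts) == 3 and parts[:2] == [obj, attribute]:
                return parts[2]
        return None


class ResourceLayer:
    """Routes a lookup to a named resource, hiding format differences."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}

    def register(self, name: str, resource: Resource) -> None:
        self._resources[name] = resource

    def get(self, name: str, obj: str, attribute: str) -> Optional[str]:
        return self._resources[name].get(obj, attribute)


layer = ResourceLayer()
layer.register("facts", TableResource({"France": {"capital": "Paris"}}))
layer.register("films", TextResource("Gone with the Wind | director | Victor Fleming"))
```

Callers issue the same `get` call regardless of whether the answer comes from a table or from text, which is the kind of format hiding the bottom layer is meant to provide.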

The middle layer bridges the top and bottom layers. It decomposes questions into subquestions, performs reasoning operations, fuses the results obtained for the subquestions, and constructs explanations of the fused information, which it relays to the top layer. Question decomposition occurs on the basis of both syntax, in cases where the question itself contains distinct parts that may be posed as subquestions, and semantics, in cases where domain knowledge and representations of underlying meaning may be applied to partition a question into subquestions.
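A minimal sketch of syntactic decomposition, fusion, and explanation is given below. It handles only nested questions of the form "What is the A of the B of X?", and every name in it (`decompose`, `answer`, the `Lookup` callable) is hypothetical rather than part of IMPACT. The country and its facts are fictional placeholder data. Semantic decomposition, which relies on domain knowledge, is not illustrated.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical signature for the bottom layer: (object, attribute) -> value
Lookup = Callable[[str, str], Optional[str]]


def decompose(question: str) -> Tuple[List[str], str]:
    """Split 'What is the A of the B of X?' into (['B', 'A'], 'X').

    Attributes are returned innermost first, in the order they must be
    resolved. Objects whose names contain ' of ' are not handled.
    """
    match = re.fullmatch(r"What is the (.+) of (.+)\?", question.strip())
    if match is None:
        raise ValueError(f"unsupported question form: {question!r}")
    attributes = match.group(1).split(" of the ")
    return list(reversed(attributes)), match.group(2)


def answer(question: str, lookup: Lookup) -> Tuple[Optional[str], List[str]]:
    """Resolve each subquestion in turn and record an explanation step."""
    attributes, current = decompose(question)
    steps: List[str] = []
    for attribute in attributes:
        value = lookup(current, attribute)
        if value is None:
            return None, steps  # a subquestion failed; report partial steps
        steps.append(f"the {attribute} of {current} is {value}")
        current = value  # the answer becomes the object of the next subquestion
    return current, steps


# Fictional placeholder data, standing in for the bottom layer.
facts = {
    ("Freedonia", "capital"): "Fredville",
    ("Fredville", "population"): "12000",
}
result, explanation = answer(
    "What is the population of the capital of Freedonia?",
    lambda obj, attribute: facts.get((obj, attribute)),
)
```

The list of recorded steps plays the role of the explanation of fused information that the middle layer relays upward alongside the final answer.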

Current Status

We have constructed an initial implementation of the tripartite architecture using several processing components developed in our group. Roughly, our START system [2] [3] makes up the top layer of the architecture, our Omnibase system [5] makes up the bottom layer, and our IMPACT system (based on [1]) makes up the middle layer. The assembled system provides access to information in a number of resources, including the CIA World Factbook, IMDB, Biography.com, Infoplease.com, and the MIPT Terrorism Knowledge Base. An initial description of our use of this architecture to answer questions on the basis of multiple information resources appears in [4].

Research Support

This work is supported in part by the Disruptive Technology Office as part of the AQUAINT Phase 3 research program.

References:

[1] Gary C. Borchardt. Thinking between the Lines: Computers and the Comprehension of Causal Descriptions. Cambridge, Massachusetts, 1994.

[2] Boris Katz. Using English for Indexing and Retrieving. In Artificial Intelligence at MIT: Expanding Frontiers, v. 1; Cambridge, MA, 1990.

[3] Boris Katz. Annotating the World Wide Web Using Natural Language. In Proceedings of the 5th RIAO Conference on Computer Assisted Information Searching on the Internet (RIAO '97), Montreal, Canada, 1997.

[4] Boris Katz, Gary Borchardt, and Sue Felshin. Syntactic and Semantic Decomposition Strategies for Question Answering from Multiple Resources. In Proceedings of the AAAI 2005 Workshop on Inference for Textual Question Answering, pp. 35–41, 2005.

[5] Boris Katz, Sue Felshin, Deniz Yuret, Ali Ibrahim, Jimmy Lin, Gregory Marton, Alton Jerome McFarland, and Baris Temelkuran. Omnibase: Uniform Access to Heterogeneous Data for Question Answering. In Proc. of the 7th Int. Workshop on Applications of Natural Language to Information Systems (NLDB '02), Stockholm, Sweden, June 2002.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu