Research Abstracts - 2006

Open Domain Question Answering on Newspaper Text

Boris Katz, Gregory Marton, Sue Felshin, Daniel Loreto & Federico Mora

The Problem

NIST's annual TREC Question Answering Track supports the research community by providing shared evaluations on a number of important tasks. Factoid and list questions test systems on their ability to answer factual questions exactly, based on a corpus of English newspaper text. The definition and relationship tasks evaluate systems on their ability to locate the most important facts about a single subject, or about the complex relationships between two subjects.

Motivation

Information systems aimed at humans often return documents, paragraphs, snippets of documents, or sentences, and the user must find the actual answer in the returned text. While the surrounding context is often useful to humans, knowing the exact answer helps automatic systems answer more complex questions. For example, if a system can find that "the fifth largest country in Africa" is "Chad", then it becomes possible to answer follow-up questions such as "How many people live there?" and more complex nested questions such as "How many people live in Africa's fifth largest country?".
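
Purely as an illustration of this idea (hypothetical code, not the system described below), a nested question can be answered by substituting the exact answer to the inner question back into the outer question. The answer_factoid stand-in and its hard-coded answers are assumptions made only for this example:

    # Hypothetical sketch: answer a nested question by answering the inner
    # question first and substituting its exact answer into the outer question.
    def answer_factoid(question):
        """Toy stand-in for a factoid QA system, hard-coded for illustration."""
        known = {
            "What is the fifth largest country in Africa?": "Chad",
            "How many people live in Chad?": "about 10 million",
        }
        return known.get(question, "unknown")

    def answer_nested(outer_template, inner_question):
        """Answer the inner question, then ask the outer question about the result."""
        inner_answer = answer_factoid(inner_question)           # e.g. "Chad"
        outer_question = outer_template.format(x=inner_answer)  # "How many people live in Chad?"
        return answer_factoid(outer_question)

    print(answer_nested("How many people live in {x}?",
                        "What is the fifth largest country in Africa?"))
    # -> "about 10 million"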

Previous Work

Since 1992, the TREC evaluations [7] have provided a way to compare systems on a common data set and task. In 1999, a question answering "track" was introduced. A set of "open domain" news articles is provided in advance; at evaluation time, questions pertaining to the text are given and must be answered automatically. Human assessors then pool the automatic results and decide which answers are correct.

Previous innovations fall into a few categories:

  • Using online resources such as Google and Wikipedia to find related keywords and query refinements. These methods help find the right documents in the corpus and then help extract the best exact answers, on the assumption that the correct answers are the ones most prominently found online.
  • Understanding the corpus better before questions are asked: for example, resolving names, noun phrase references ("the senator"), and pronouns brings the right keywords into close textual proximity to possible answers for a proximity-based keyword search. Textual cues for definitional contexts (for example, apposition) indicate good answers to questions like "Who is X?": "X, a leading Z, ..." (a pattern-matching sketch follows this list).
  • The best systems use reasoning to check that the sentences offered in support of an answer can justifiably be said to answer the question. For many answers, this requires substantial world knowledge.
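
To make the appositive cue above concrete, here is a minimal, hypothetical pattern-matching sketch in Python; the regular expression and the example sentences are illustrative assumptions, not the rules used by any particular system:

    # Hypothetical sketch of the appositive cue "X, a leading Z, ...":
    # a short phrase in apposition to a name is kept as a candidate nugget
    # for definitional questions such as "Who is X?".
    import re

    def appositive_nuggets(name, sentences):
        # Match "<name>, a/an/the <short phrase>," and keep the phrase.
        pattern = re.compile(re.escape(name) + r",\s+((?:a|an|the)\s+[^,]+),")
        return [m.group(1) for s in sentences for m in pattern.finditer(s)]

    sentences = [
        "Aaron Copland, a leading American composer, was born in Brooklyn.",
        "Critics praised Aaron Copland, the dean of American music, for his ballets.",
    ]
    print(appositive_nuggets("Aaron Copland", sentences))
    # -> ['a leading American composer', 'the dean of American music']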

Approach

Our system [3] can be decomposed into five components: data indexing, question analysis, passage retrieval, candidate generation, and answer selection. In the data indexing and passage retrieval components, we have explored several query expansion and document retrieval methods, but their effect on question answering performance is still unclear. In the question analysis component, we identify an expected answer type and possible paraphrases of the question. During candidate generation, we look for structures that indicate a possible answer. During answer selection, we look for support for each candidate (e.g., from the Web) and choose the set of answers to return. A minimal sketch of this pipeline appears at the end of this section.

The question answering track is separated into four kinds of questions:

  • factoid questions, with an exact named-entity answer ("Who shot Kennedy?")
  • list questions, with a set of exact named-entity answers ("What countries did he visit?")
  • 'other' questions, asking for important facts not yet explored in the conversation ("What else do you know about John F. Kennedy?")
  • relationship questions, that is, questions on complex topics, with a set of complex answers which may be spread across documents ("What communications or conflicts linked Kennedy with Nikita Khrushchev?")

We have fielded separate systems for each kind of question and are working to integrate our various strategies into a comprehensive package and into the START system [1][2].
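
As a rough illustration of the decomposition above, the following is a minimal, hypothetical skeleton of the five components in Python. Every step is a toy placeholder (word-overlap retrieval, capitalized tokens standing in for named entities, frequency-based selection); none of it reflects the actual methods used in [3].

    # Hypothetical skeleton of the five components: data indexing, question
    # analysis, passage retrieval, candidate generation, and answer selection.
    # Every step is a toy placeholder, not the actual system.
    from collections import Counter

    def build_index(documents):
        """Data indexing (toy): map each word to the documents containing it."""
        index = {}
        for doc_id, text in enumerate(documents):
            for word in set(text.lower().split()):
                index.setdefault(word, set()).add(doc_id)
        return index

    def analyze_question(question):
        """Question analysis (toy): guess an expected answer type and keywords."""
        answer_type = "PERSON" if question.lower().startswith("who") else "OTHER"
        keywords = [w.strip("?").lower() for w in question.split()[1:]]
        return {"type": answer_type, "keywords": keywords}

    def retrieve_passages(index, documents, analysis, k=5):
        """Passage retrieval (toy): rank documents by keyword overlap."""
        scores = Counter()
        for word in analysis["keywords"]:
            for doc_id in index.get(word, ()):
                scores[doc_id] += 1
        return [documents[doc_id] for doc_id, _ in scores.most_common(k)]

    def generate_candidates(passages):
        """Candidate generation (toy): capitalized, non-sentence-initial tokens
        stand in for named entities."""
        return [w.strip(".,") for p in passages
                for w in p.split()[1:] if w[0].isupper()]

    def select_answer(candidates):
        """Answer selection (toy): prefer the most frequently seen candidate."""
        return Counter(candidates).most_common(1)[0][0] if candidates else None

    def answer(documents, question):
        index = build_index(documents)
        analysis = analyze_question(question)
        passages = retrieve_passages(index, documents, analysis)
        return select_answer(generate_candidates(passages))

    docs = ["Lee Harvey Oswald shot President Kennedy in Dallas in 1963.",
            "Kennedy visited Berlin in 1963."]
    print(answer(docs, "Who shot Kennedy?"))
    # Prints a capitalized token from the best-matching passage; a real system
    # needs named-entity recognition, answer typing, and Web-based support.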

Research Support

This work is supported in part by the Advanced Research and Development Activity as part of the AQUAINT Phase II research program.

References:

[1] Boris Katz. Using English for Indexing and Retrieving. In Proceedings of the 1st RIAO Conference on User-Oriented Content-Based Text and Image Handling (RIAO '88), 1988.

[2] Boris Katz. Annotating the World Wide Web using Natural Language. In Proceedings of the RIAO '97 Conference on Computer-Assisted Information Searching on the Internet, 1997.

[3] Boris Katz, Gregory Marton, Gary Borchardt, Alexis Brownell, Sue Felshin, Daniel Loreto, Jesse Louis-Rosenberg, Ben Lu, Federico Mora, Stephan Stiller, Ozlem Uzuner, and Angela Wilcox. External Knowledge for Question Answering. In Proceedings of the Fourteenth Text REtrieval Conference (TREC 2005), November 2005.

[4] Jimmy Lin and Boris Katz. Question Answering from the Web using Knowledge Annotation and Knowledge Mining Techniques. In Proceedings of the 12th International Conference on Information and Knowledge Management (CIKM 2003), November 2003.

[5] Gregory A. Marton. Nuggeteer: Automatic Nugget-Based Evaluation using Descriptions and Judgements. In Proceedings of NAACL/HLT, 2006.

[6] Gregory A. Marton. Nuggeteer: Automatic Nugget-Based Evaluation using Descriptions and Judgements. Technical Report 1721.1/30604, 2006.

[7] Ellen Voorhees. Overview of the TREC 2005 Question Answering Track. In Proceedings of the Fourteenth Text REtrieval Conference (TREC 2005), NIST, 2005.
