Research Abstracts - 2007
Knowledge Fusion for Question Answering

Gary Borchardt, Boris Katz & Sue Felshin


Question answering systems can provide "one-stop shopping" for information from diverse sources such as databases, websites, and interactive web resources. While some questions can be answered using information from a single source, other questions require the combination of information from multiple sources. Particularly in cases where the question itself does not indicate possible subquestions to be answered, a problem arises of determining which components of knowledge, drawn from which sources, can be combined in which ways to arrive at an answer.


We are investigating two general approaches to question answering over multiple sources. The first approach, for answering complex questions, addresses the case where a question can be syntactically decomposed into subquestions that can each be answered using information from an individual source. The second approach, described here, applies where the question is syntactically simple, yet domain knowledge or a representation of underlying meaning can be used to identify and combine suitable components of information from multiple available sources.

One strategy for handling these situations involves the use of domain-motivated decomposition rules to answer key questions identified by system designers and users. These rules are expressed in terms of "knowledge templates", which combine fixed English phrasing with argument slots for variable values. For example, individual rows in a database table of terrorist incidents might be described using a knowledge template of the form "On [a date], [a terrorist group] used [a weapon type] to perform [an attack type] against [a target type] in [a country]." Using templates of this sort, we have constructed decomposition rules that answer particular questions by posing and answering related subquestions and then performing constraint propagation over sets of returned values [3].
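The knowledge-template idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the IMPACT implementation: the function names (`slot_names`, `fill`, `match`, `propagate`) and the row values are hypothetical placeholders, and constraint propagation is reduced here to its simplest form, intersecting the value sets returned by subquestions.

```python
import re

# Template text taken from the example above; slots are the bracketed phrases.
TEMPLATE = ("On [a date], [a terrorist group] used [a weapon type] to perform "
            "[an attack type] against [a target type] in [a country].")

# A slot is "[a ...]" or "[an ...]"; the captured name drops the article.
SLOT = re.compile(r"\[(?:an? )?([^\]]+)\]")


def slot_names(template):
    """Return the slot names of a template, in order of appearance."""
    return SLOT.findall(template)


def fill(template, row):
    """Describe one database row in English by filling every slot."""
    return SLOT.sub(lambda m: str(row[m.group(1)]), template)


def match(template, sentence):
    """Recover slot values from a sentence, or None if it does not fit."""
    parts, pos = [], 0
    for m in SLOT.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))  # fixed phrasing
        parts.append("(.+?)")                             # variable value
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    found = re.fullmatch("".join(parts), sentence)
    if found is None:
        return None
    return dict(zip(slot_names(template), found.groups()))


def propagate(*value_sets):
    """Simplified constraint propagation: keep only the values that are
    consistent with every subquestion's returned set."""
    result = set(value_sets[0])
    for values in value_sets[1:]:
        result &= set(values)
    return result


# Hypothetical placeholder row, not real data.
row = {
    "date": "2000-01-01",
    "terrorist group": "Group A",
    "weapon type": "weapon W",
    "attack type": "attack T",
    "target type": "target G",
    "country": "Country X",
}
sentence = fill(TEMPLATE, row)
```

Because the fixed phrasing is matched literally, `match(TEMPLATE, sentence)` returns the original row, so the same template serves both to describe source content and to pose subquestions against it.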

The strategy of using explicit semantic decomposition rules works best when there are relatively few key domain questions and available sources. However, when there are many targeted question types and many sources, it becomes increasingly difficult to identify all of the potential semantic interactions between available content in different sources.

For this situation, we focus specifically on described event occurrences, with the intent of decomposing these occurrences into collections of lower-level assertions that model what happens during the events. In this way, we hope to automatically identify a number of inter-event relationships, such as when the occurrence of one or more events implies or contradicts the occurrence of another event. This effort builds on our earlier work on the transition space representation [1] [2].

In this approach, the temporal unfolding of various events is modeled by sets of language-based statements that specify, in particular, changes in the values of key attributes of event participants. These statements concerning changes are then further decomposed into statements regarding momentary presence and absence of attributes, and, ultimately, a lowest level of statements that specify whether one quantity, such as a timestamped attribute value, is equal to, not equal to, greater than, or not greater than another quantity. Inference can be carried out on the elaborated lower-level assertions, and the resulting base of assertions can be used to detect instances of support or conflict for other event occurrences.
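The decomposition into lowest-level comparisons can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the intermediate level of presence and absence statements is skipped, quantities are plain strings, and the names `Comparison`, `quantity_at`, `decompose_change`, and `conflicts` are hypothetical rather than part of the transition space representation itself.

```python
from dataclasses import dataclass

# The four lowest-level relations named in the text.
RELATIONS = ("equal", "not equal", "greater", "not greater")

# Pairs of relations that cannot both hold between the same two quantities.
CONTRADICTORY = {
    frozenset(["equal", "not equal"]),
    frozenset(["greater", "not greater"]),
    frozenset(["greater", "equal"]),
}


@dataclass(frozen=True)
class Comparison:
    left: str
    relation: str
    right: str


def quantity_at(quantity, time):
    """Name a timestamped attribute value, e.g. 'the affinity ... at t1'."""
    return f"{quantity} at {time}"


def decompose_change(kind, quantity, t1, t2):
    """Turn a change statement over [t1, t2] into lowest-level comparisons."""
    before, after = quantity_at(quantity, t1), quantity_at(quantity, t2)
    if kind == "increases":
        return [Comparison(after, "greater", before)]
    if kind == "decreases":
        return [Comparison(before, "greater", after)]
    if kind == "does not change":
        return [Comparison(after, "equal", before)]
    if kind == "changes":
        return [Comparison(after, "not equal", before)]
    raise ValueError(f"unknown kind of change: {kind}")


def conflicts(a, b):
    """Two comparisons conflict if they relate the same quantities in
    mutually exclusive ways."""
    same_quantities = (a.left, a.right) == (b.left, b.right)
    return same_quantities and frozenset([a.relation, b.relation]) in CONTRADICTORY


q = "the affinity between PIJ and Hezbollah"
rises = decompose_change("increases", q, "t1", "t2")
steady = decompose_change("does not change", q, "t1", "t2")
clash = conflicts(rises[0], steady[0])  # True: "greater" excludes "equal"
```

Detecting support or conflict between event descriptions then reduces, in this sketch, to comparing their generated assertions pairwise.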

The following are examples of language-based statements that serve as a grounding for this representation:

  • The affinity between PIJ and Hezbollah increases.
  • The supreme leader of al-Saiqa does not change.
  • The PLF becomes a part of the PFLP-GC.
  • Jordan ceases to be a base of operations for al-Fatah.
  • Khalid al-Hasan is a leader of al-Fatah.
  • The supreme leader of PLO in 1970 equals Yasser Arafat.

The strategies described above have been implemented within our IMPACT reasoning system, which operates in conjunction with the START and Omnibase systems to implement our three-layered question answering architecture. A preliminary description of this work appears in [3].

We have applied the strategy of using decomposition rules to a set of information contained in the Monterey Weapons of Mass Destruction Terrorism Database and the MIPT Terrorism Knowledge Base. These rules enable IMPACT, START and Omnibase to answer a range of key questions about terrorist group activities, characteristics and capabilities.

We are currently applying the strategy of representing events in terms of lower-level assertions to a subset of information contained in the MIPT Terrorism Knowledge Base, concerning terrorist group formation, merging, splitting, and related events. For each targeted event type, we have constructed an "event model" that depicts its underlying conditions and changes in terms of lower-level assertions. We then instantiate the event models in the context of particular reported events. Figure 1 illustrates an instantiated event model for the event of one terrorist group joining another terrorist group.

Figure 1: An instantiated event model of one terrorist group joining another terrorist group.
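Instantiating an event model can be sketched as filling a reusable list of assertion patterns with the participants and time points of a reported event. The assertions in `JOIN_MODEL` below are illustrative guesses at what such a model might contain, not a transcription of Figure 1; the names `JOIN_MODEL` and `instantiate` and the symbolic time points `t1` and `t2` are assumptions.

```python
# Hypothetical event model for "one group joins another": each entry is a
# lower-level assertion pattern with named slots for participants and times.
JOIN_MODEL = [
    "{joiner} is not a part of {host} at {t1}.",
    "{joiner} becomes a part of {host} between {t1} and {t2}.",
    "{joiner} is a part of {host} at {t2}.",
]


def instantiate(model, **bindings):
    """Bind every slot of an event model to the details of a reported event."""
    return [pattern.format(**bindings) for pattern in model]


# Participants taken from the example statements listed earlier;
# the time points are left symbolic.
assertions = instantiate(
    JOIN_MODEL, joiner="The PLF", host="the PFLP-GC", t1="t1", t2="t2"
)
for line in assertions:
    print(line)
```

Keeping the model separate from its bindings means one model per event type can be reused for every reported event of that type.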

We then use the sets of lower-level assertions generated by the instantiated event models, plus persistence calculations performed by our IMPACT system, to enable START and IMPACT to provide explanatory answers to questions about conditions and changes at particular times, as illustrated in Figure 2.

Figure 2: START responding to a question about a time-specific condition.
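One simple reading of a persistence calculation is that an attribute keeps its most recently asserted value until a later assertion changes it. The Python sketch below implements only that reading; the actual calculations performed by IMPACT are not specified here, and the function name `value_at`, the integer time points, and the placeholder values are all assumptions.

```python
from bisect import bisect_right


def value_at(history, time):
    """Look up an attribute value at a given time.

    `history` is a list of (time, value) pairs sorted by time. Returns
    (value, basis), where basis says whether the value was asserted at
    exactly that time or assumed to persist from an earlier assertion.
    Returns None when nothing has been asserted at or before `time`.
    """
    times = [t for t, _ in history]
    i = bisect_right(times, time)  # number of assertions at or before `time`
    if i == 0:
        return None
    asserted_at, value = history[i - 1]
    if asserted_at == time:
        basis = "asserted"
    else:
        basis = f"persisted from {asserted_at}"
    return value, basis


# Hypothetical history for a "supreme leader" attribute; placeholder values.
history = [(10, "leader A"), (20, "leader B")]
```

Returning the basis alongside the value is what would let an answer be explanatory: the system can state not only the value at the queried time but also which earlier assertion it rests on.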


This work is supported in part by the Disruptive Technology Office as part of the AQUAINT Phase 3 research program.


[1] Gary C. Borchardt. Understanding Causal Descriptions of Physical Systems. In Proceedings of the AAAI Tenth National Conference on Artificial Intelligence, pp. 2–8, 1992.

[2] Gary C. Borchardt. Thinking between the Lines: Computers and the Comprehension of Causal Descriptions. MIT Press, Cambridge, Massachusetts, 1994.

[3] Boris Katz, Gary Borchardt, and Sue Felshin. Syntactic and Semantic Decomposition Strategies for Question Answering from Multiple Resources. In Proceedings of the AAAI 2005 Workshop on Inference for Textual Question Answering, pp. 35–41, 2005.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu