Abstracts - 2006
Learning to Order Temporal Segments
Phillip Bramsen, Pawan Deshpande, Yoong Keok Lee & Regina Barzilay
We introduce a new method for temporal ordering. The unit of our analysis is a temporal segment, a fragment of text that maintains temporal coherence. The strength of our approach lies in its ability to simultaneously optimize pairwise ordering preferences and global constraints on the graph topology. Our learning method achieves 83% F-measure in temporal segmentation, and 85% accuracy in inferring temporal relations between two segments.
Understanding the temporal flow of discourse is a significant aspect of text comprehension. Not surprisingly, temporal analysis has been a focus of linguistic research for quite some time. Besides its linguistic significance, temporal analysis has important practical implications. In multidocument summarization, knowledge about the temporal order of events can enhance both the content selection and the summary generation processes . In question answering, temporal analysis is needed to determine when a particular event occurs, and how events relate to each other. Typical temporal analysis applications include extraction of temporal expressions , time stamping of event clauses , and temporal ordering of events   .
We consider a temporal segment to be a fragment of text that does not exhibit abrupt changes in temporal focus . A segment may contain more than one event or state, but the key requirement is that segment elements maintain temporal coherence. For instance, a medical discharge summary may contain segments describing a patient's admission, his previous hospital visit, and original symptoms' onset. Each of these segments corresponds to a different time frame, and is clearly delineated as such in a text.
Our ultimate goal is to automatically construct a graph that encodes ordering between temporal segments. The key premise is that in a coherent document, temporal progression is reflected in a wide range of linguistic features and contextual dependencies. For instance, given a pair of adjacent segments, the temporal adverb "next day" in the second segment is a strong predictor of precedence relation. In other cases, we can predict the right order between a pair of segments by analyzing their relation to other segments in the text. The interaction between pairwise ordering decisions can be easily formalized in terms of constraints on the graph topology. One most obvious example of such constraint is prohibiting cycles in the ordering graph. We show how these complementary sources of information can be incorporated in a learning model.
Our contributions are twofold:
Temporal Segmentation - We propose a fully automatic, linguistically rich model for temporal segmentation. Most work on temporal analysis is done on a finer granularity than proposed here. However, the granularity of this representation facilitates temporal analysis, and is especially suitable for domains with sparse temporal anchors.
Segment Ordering - We introduce a new method for learning temporal ordering. The strength of the proposed model lies in its ability to simultaneously optimize pairwise ordering preferences and global constraints on graph topology. While the algorithm has been applied at the segment level, it can be used with other temporal annotation schemes.
We evaluate our temporal ordering and segmentation algorithms on a corpus of medical discharge summaries. Our work complements the existing body of research in two key aspects. First, our focus is on globally optimal temporal inference. While the importance of global constraints has been previously validated in symbolic systems for temporal analysis , existing corpus-based approaches operate at the local level. Second, we investigate temporal ordering in medical discharge summaries, which differ from standard newspaper collections. Temporal analysis in this domain is challenging in several respects: a typical summary exhibits no significant tense and aspect variations, and contains few absolute time markers.
Representation and Methods
We view text as a linear sequence of temporal segments. Temporal focus is retained within a segment, but radically changes in between segments. The length of a segment can range from a single clause to a sequence of adjacent sentences. For example, a single segment describing the examination of a patient may include several events and states which belong to the same time frame. To automatically predict shifts in temporal focus, this assessment is performed in a supervised discriminative framework using a boosting classifier  to predict segment boundaries using a feature set drawn from lexical, syntax, and positional information.
We represent ordering of events as a temporal directed acyclic graph (TDAG). Edges in a TDAG capture temporal precedence relations between segments. Because the graph encodes an order, cycles are prohibited. We do not require the graph to be fully connected - if the precedence relation between two nodes is not specified in the text, the corresponding nodes will not be connected.
Our next goal is to automatically construct a graph that encodes ordering relations between temporal segments. One possible approach is to cast graph construction as a standard binary classification task: predict an ordering for each pair of distinct segments based on their attributes alone. If a pair contains a temporal marker, like "later", then accurate prediction is feasible. In fact, this method is commonly used in event ordering   . However, many segment pairs lack temporal markers and other explicit cues for ordering. Determining their relation out of context can be difficult, even for humans. Moreover, by treating each segment pair in isolation, we cannot guarantee that all the pairwise assignments are consistent with each other, and yield a valid TDAG. Rather than ordering each pair separately, our ordering model relies on global inference. Given the pairwise ordering predictions of a local classifier that incorporates a variety of lexical, temporal, and positional features, our model finds a globally optimal assignment. In essence, the algorithm constructs a graph that is maximally consistent with individual ordering preferences of each segment pair, and at the same time satisfies graph-level constraints, such as acyclicity and transitivity. For this task, we employ three global inferences strategies - two greedy approaches and one that uses integer linear programming (ILP). These strategies, which vary significantly in terms of linguistic motivation and computational complexity, are described below:
Greedy Inference in Natural Reading Order (NRO) - The simplest way to construct a consistent TDAG is by adding segments in the order of their appearance in a text. Intuitively speaking, this technique processes segments in the same order as a reader of the text. The motivation underlying this approach is that the reader incrementally builds temporal interpretation of a text; when a new piece of information is introduced, the reader knows how to relate it to already processed text.
Greedy Best-first Inference (BF) - Our second inference strategy is also greedy. It aims to optimize the score of the graph. The score of the graph is computed by summing the scores of its edges. While this greedy strategy is not guaranteed to find the optimalsolution, it finds a reasonable approximation .
Integer Linear Programming (ILP) - We can cast the task of constructing a globally optimal TDAG as an optimization problem. In contrast to the previous approaches, the method is not greedy. It computes the optimal solution within the Integer Linear Programming (ILP) framework. We apply constraints on the TDAG that ensure acyclicity, transitivity and that each node is connected to at least one other node.
Corpus and Annotations
We applied our method for temporal segmentation and ordering to a corpus of medical discharge summaries. The medical domain has been a popular testbed for methods for automatic temporal analyzers . The appeal is partly due to rich temporal structure of these documents, and the practical need to parse this structure for meaningful processing of medical data. The summaries do not follow a chronological order. The ordering of information in this domain is guided by stylistic conventions (i.e., symptoms are presented before treatment) and the relevance of information to the current conditions (i.e., previous onset of the same disease is summarized before the description of other diseases).
Our approach for temporal segmentation requires annotated data for supervised training. On the temporal segmentation task, there is a high inter-annotator agreement with an observed Kappa value of 0.71 confirming our hypothesis about reliability of temporal segmentation. There is similarly also a high inter-annotator agreement on the ordering task.
We evaluate temporal segmentation using leave-one-out cross-validation on our corpus of 60 summaries. The segmentation algorithm achieves a performance of 83% F-measure, with a recall of 78% and a precision of 89%.
To evaluate segment ordering, we employ leave-one-out cross-validation on 29 annotated TDAGs. In addition to the three global inference algorithms, we include a majority baseline that classifies all edges as forward. It is a natural baseline as it assumes chronological order. The baseline classifies 72.2% of the edges correctly, while NRO achieves an accuracy of 76.1%. Best first greedy has an accuracy of 79.9% As expected, the ILP strategy, which supports exact global inference, achieves the best performance - 85.3%. These results demonstrate that all learned ordering methods outperform the majority baseline by a wide margin.
 R. Barzilay, N. Elhadad and K. McKeown. Inferring strategies for sentences ordering in multidocument news summarization. In Journal of Artificial Intelligence, 17:35-55. 2002.
 G. Wilson, I. Main, B. Sundheim and L. Ferro. A multilingual approach to annotating and extracting temporal information. In Proceedings of the ACL 2001 Workshop on Temporal and Spatial Information Prcoessing, pp. 81-87. 2001.
 E. Filatova and E. Hovy. Assigning time-stamps to event-clauses. In In Proceedings of the ACL 2001 Workshop on Temporal and Spatial Information Processing, pp. 104-111. 2001.
 I. Mani, B. Schiffman and J. Zhang. Inferring temporal ordering of events in news. In Proceeding of HLT-NAACL. 2003.
 M. Lapata and A. Lascarides. Inferring sentence-internal temporal relations. In Proceedings of HLT-NAACL, pp. 153-160. 2004.
 B. Boguraev and R. K. Ando. TimeML-compliant text analysis for temporal reasoning. In Proceedings of IJCAI, pp. 997-1003. 2005.
 B. Webber. Tense as a discourse anaphor. Computational Linguistics, 14(2):61-73. 1998.
 L. Zhou, C. Friedman, S. Parsons, G. Hripcsak. System architecture for temporal information extraction, representation and reasoning in clinical narrative reports. In Proceedings of AMIA, pp. 869-873. 2005.
 R Schapire and Y Singer. Boostexter: A boosting-based system for text categorization. In Machine Learning, pp. 135-168, 2000.
 W. Cohen, R. Schapire, and Yoram Singer. Learning to order things. Journal of Artificial Intelligence, 10;243-270. 1999.
 C. Combi and Y. Shahar. Temporal reasoning and temporal data maintenance in medicine: Issues and challenges. Computers in Biology and Medicine, 27(5):353-368. 1997.