CSAIL Research Abstracts - 2005

Automatic Processing of Spoken Dialogue in the Home Hemodialysis Domain

Ronilda Lacson & Regina Barzilay

Abstract

Spoken medical dialogue is a valuable source of information. Automatic processing of spoken dialogue can enhance information dissemination, clinical care and research. However, understanding even a perfect transcript of spoken dialogue is problematic because of its lack of structure. We describe a method for automatically acquiring the structure of a spoken dialogue by classifying dialogue turns into four semantic types: Clinical, Technical, Backchannel and Miscellaneous. The semantic taxonomy was determined by our application: recorded, transcribed telephone consultations between nurses and dialysis patients. We use shallow features of a dialogue turn, augmented by lexico-semantic features, to develop a predictive model that automatically classifies dialogue turns into their corresponding semantic types using machine-learning techniques. The best results show an accuracy of 73%, compared to 33% (P < 0.01) when every segment is labeled with the most frequent class. This result highlights the feasibility of automatically structuring spoken medical dialogue.

Introduction

We describe a technique for automatically acquiring the structure of a transcribed medical dialogue by identifying the semantic type of its turns. Our method operates as part of a system that analyzes telephone consultations between nurses and dialysis patients in the home hemodialysis program at Lynchburg Nephrology, the largest such program in the United States. By identifying the type of a turn - Clinical, Technical, Backchannel or Miscellaneous - we are able to render the transcript into a structured format amenable to automatic analysis. The Clinical category represents the patient's health, the Technical category encompasses problems with operating dialysis machines, the Miscellaneous category includes mostly scheduling and social concerns, while Backchannels capture greetings and acknowledgments. This classification allows a provider to distill the portions of the dialogue that support medical reasoning and are of primary interest to clinicians, as opposed to technical or scheduling concerns, which are typically routed elsewhere. In the long run, knowing the distribution of patient requests can improve the allocation of resources, and ultimately provide better quality of health care.

We present a machine learning algorithm for semantic type classification of medical dialogue based on a shallow meaning representation encoded as simple lexical and contextual features. The lack of world knowledge in this representation is compensated for by a large number of manually annotated training examples. We show how to enhance our machine learning algorithm with background knowledge. Prior to text classification, we employ a feature generator that maps words of a transcript into semantic concepts, augmenting our initial shallow utterance representation with new, more informative features. We explore two sources of background knowledge: a manually crafted, large-scale domain ontology and word clusters automatically computed from raw text using hierarchical distributional clustering.[1]

Method
Basic Model

Our basic model relies on three features that can be easily extracted from the transcript: the words of a dialogue turn, its length and the words of the previous turn. Each object in the training set is represented as a vector of features and its corresponding class. We learn the weights of the rules in a supervised framework using BoosTexter,[2] a state-of-the-art boosting classifier.
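The three shallow features can be sketched as follows. This is a minimal illustration of the turn representation only, not the authors' implementation; the sample turns and the feature-name conventions (`w=`, `prev=`, `len=`) are our own assumptions.

```python
def turn_features(turns, index):
    """Encode one dialogue turn as the three shallow features described
    above: its words, its length in words, and the previous turn's words."""
    words = turns[index].lower().split()
    prev = turns[index - 1].lower().split() if index > 0 else []
    feats = {f"w={w}" for w in words}          # words of the current turn
    feats |= {f"prev={w}" for w in prev}       # words of the previous turn
    feats.add(f"len={len(words)}")             # turn length in words
    return feats

# Illustrative (invented) consultation turns:
turns = [
    "Hi, this is the dialysis unit.",
    "My blood pressure was 150 over 90 this morning.",
    "Okay, let me pull up your chart.",
]
fv = turn_features(turns, 1)
```

Each feature vector, paired with its annotated semantic type, would then be handed to the boosting classifier.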

Data Augmentation with Background Knowledge

Our first approach builds on a large-scale, human-crafted resource, the Unified Medical Language System (UMLS). This resource is widely used in medical informatics and has been shown to be beneficial in a variety of applications.[3,4] The degree of generalization we can achieve is determined by the size and structure of the ontology. For our experiments, we used the 2003 version of UMLS, which consists of 203 semantic types. Each term that is listed in UMLS is substituted with its corresponding semantic category.
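The substitution step can be sketched as a dictionary lookup. The tiny term-to-type table below is a made-up stand-in for the actual UMLS release; the entries are only plausible examples, not real UMLS records.

```python
# Hypothetical fragment of a term -> UMLS semantic type table.
UMLS_TYPE = {
    "heparin": "Pharmacologic Substance",
    "fistula": "Anatomical Abnormality",
    "dialyzer": "Medical Device",
}

def substitute_semantic_types(tokens):
    """Replace each token found in the table with its semantic category,
    leaving out-of-vocabulary tokens unchanged."""
    return [UMLS_TYPE.get(t.lower(), t) for t in tokens]

out = substitute_semantic_types("The dialyzer alarm went off".split())
# → ['The', 'Medical Device', 'alarm', 'went', 'off']
```

Replacing specific terms with their categories lets turns that mention different drugs or devices share features.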

As a source of automatically computed background knowledge, our second approach uses clusters of words with similar semantic properties. Being automatically constructed, clusters are noisier than UMLS, but at the same time have several potential advantages. First, we can easily control the degree of abstraction by changing the desired number of clusters, while in UMLS the number of semantic classes is fixed. Second, clustering provides an easy and robust solution to the problem of coverage as we can always select a large and stylistically appropriate corpus for cluster induction. This is especially important for our application, since patients often use colloquial language and jargon, which may not be covered by UMLS.
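Hierarchical distributional clustering of the kind in [1] assigns each word a bit-string path in a binary merge tree, and truncating that path to fewer bits coarsens the clusters. The sketch below illustrates how the degree of abstraction can be tuned this way; the word-to-path table is invented for illustration, not the output of an actual clustering run.

```python
# Hypothetical bit-string cluster paths, as produced by hierarchical
# (Brown-style) clustering. Words with a shared prefix are semantically close.
CLUSTER_PATH = {
    "cramping": "0110",
    "dizzy":    "0111",
    "alarm":    "1010",
    "beeping":  "1011",
}

def cluster_feature(word, prefix_len):
    """Return a cluster feature for `word`, truncated to `prefix_len` bits;
    shorter prefixes merge more words into the same cluster."""
    path = CLUSTER_PATH.get(word)
    return f"cl={path[:prefix_len]}" if path else None

# At 4 bits "cramping" and "dizzy" are distinct; at 3 bits they merge.
coarse = cluster_feature("cramping", 3)
```

Because the clusters are induced from a corpus we choose, colloquial words and jargon used by patients receive cluster features even when they have no UMLS entry.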

Results

The basic model, the UMLS-augmented model and the cluster-based model all significantly outperform the 33.4% accuracy (P < 0.01) of a baseline model in which every turn is assigned to the most frequent class (Clinical). The best model, which combines lexical, turn-length and contextual features and is augmented with background information obtained through statistical clustering, achieves an accuracy of 73%.
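The majority-class baseline referred to above is simply the relative frequency of the most common label. A small sketch, using an invented toy label distribution in which Clinical accounts for a third of turns, as in the corpus:

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Toy distribution (illustrative only, not the real corpus counts):
labels = (["Clinical"] * 4 + ["Technical"] * 3
          + ["Backchannel"] * 3 + ["Miscellaneous"] * 2)
baseline = majority_baseline_accuracy(labels)  # 4/12 ≈ 0.333
```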

The contributions of this paper are threefold. First, we propose a framework for rendering transcripts of patient-caregiver consultations into a structured representation amenable to automatic processing. We show that our annotation scheme can be applied reliably by human annotators, and thus forms a solid basis for training the learning algorithm. Second, we present a fully-implemented machine-learning method that accurately identifies the semantic type of each utterance. Our emphasis on spoken medical discourse sets our work apart from efforts to interpret written medical text.[3,5] Third, we explore a novel way to automatically incorporate medical knowledge into a dialogue classification algorithm.

References:

[1] PF Brown, PV DeSouza, R Mercer, VJ Della Pietra and JC Lai. Class-based n-gram models of natural language. Computational Linguistics, pp. 467-479, 1992.

[2] R Schapire and Y Singer. BoosTexter: A boosting-based system for text categorization. Machine Learning, pp. 135-168, 2000.

[3] Y Hsieh, GA Hardardottir and PF Brennan. Linguistic analysis: Terms and phrases used by patients in e-mail messages to nurses. In Medinfo, pp. 511-515, 2004.

[4] W Chapman, M Fiszman, JN Dowling, BE Chapman and TC Rindflesch. Identifying respiratory findings in emergency department reports for biosurveillance using MetaMap. In Medinfo, pp. 487-491, 2004.

[5] AT McCray, AR Aronson, AC Browne, TC Rindflesch, A Razi and S Srinivasan. UMLS knowledge for biomedical language processing. Bull Med Libr Assoc, pp. 184-194, 1993.
