MIT CSAIL Research Abstracts

Research Abstracts - 2006

Natural Language Processing of Spoken Diet Records (SDRs)

Ronilda Lacson & William Long

Abstract

Dietary assessment is a fundamental aspect of nutritional evaluation that is essential for management of obesity as well as for assessing dietary impact on chronic diseases. Various methods have been used for dietary assessment including written records, 24-hour recalls, and food frequency questionnaires.[1,2] The use of mobile phones to provide real-time dietary records provides potential advantages for accessibility, ease of use and automated documentation. However, understanding even a perfect transcript of spoken dietary records (SDRs) is challenging for people. This work presents a first step towards automatic analysis of SDRs. Our approach consists of four steps – identification of food items, identification of food quantifiers, classification of food quantifiers and temporal annotation. Our method enables automatic extraction of dietary information from SDRs, which in turn allows automated mapping to a Diet History Questionnaire dietary database.[3] Our model has an accuracy of 90%. This work demonstrates the feasibility of automatically processing SDRs.

Introduction

Using mobile phones to record diet in real time offers potential advantages in accessibility, ease of use, near-instantaneous transfer of information, and automated documentation. Mobile phones are becoming ubiquitous in the general population and tend to be within a person's reach at most times. The spoken dietary record (SDR) forms the first phase of this approach. Beyond it, newer phone capabilities such as photo and video could be used to send pictures or short recordings of food and beverage intake, both before and after the meal. The rapid evolution of mobile phone technology, coupled with continued advances in automatic image recognition, provides an additional layer of information that may help researchers estimate true dietary intake more precisely.

Realizing this overall vision has to begin with a critical basic component of the new modality: automatic processing of SDRs. The goals of this paper are twofold: (1) we propose a framework for acquiring the structure of SDRs; and (2) we present and evaluate a four-step algorithm that automatically extracts dietary information from these records, which in turn allows automated mapping to the Diet History Questionnaire dietary/nutrient database, a commonly used dietary assessment instrument based on 4,200 individual foods reported by adults in the 1994-1996 US Department of Agriculture Continuing Survey of Food Intakes by Individuals (CSFII).[3]

Method

We recorded SDRs over a period of 20 days, with no prior training and no restrictions on word usage or on the timing of the recordings. A Nokia 6600 mobile phone served as the voice recording device for the descriptions of dietary information, and it automatically stored each SDR with its corresponding date/time stamp. After the 20 days, all SDRs were downloaded to a computer and manually transcribed by the investigator, maintaining the delineations between individual recordings. The data were then divided into training and testing sets according to the chronological order in which they were received.
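A chronological split of this kind can be sketched in a few lines of Python. This is only an illustration: the timestamps, the transcripts, and the `train_fraction` value are hypothetical, since the abstract does not give the actual data or the split ratio.

```python
from datetime import datetime

# Hypothetical records as (timestamp, transcript) pairs; the content is invented.
records = [
    (datetime(2006, 1, 3, 12, 30), "a bowl of rice and one apple"),
    (datetime(2006, 1, 1, 8, 5), "two slices of toast with butter"),
    (datetime(2006, 1, 2, 19, 0), "a cup of coffee"),
    (datetime(2006, 1, 4, 7, 45), "one glass of orange juice"),
]


def chronological_split(records, train_fraction=0.75):
    """Sort by timestamp and put the earliest records in the training set.

    train_fraction is an assumed parameter, not a value from the study.
    """
    ordered = sorted(records, key=lambda record: record[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


train, test = chronological_split(records)
```

Splitting by time rather than at random means every test recording was made after every training recording, which mirrors how such a system would be used on new records.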

The automatic processing of the data is divided into four steps: Identification of Food Items, Quantification of Food Items, Classification of Food Quantifiers, and Temporal Annotation. The basic model relies on supervised classification at the word level. Each object in the training set is represented as a vector of features and its corresponding class. We learn the weights of the rules in the supervised framework using BoosTexter,[4] a state-of-the-art boosting classifier.
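To make word-level supervised classification with boosting concrete, here is a minimal sketch for the first step (food item identification). It is not the authors' system: it swaps in plain discrete AdaBoost over single-feature decision stumps in place of BoosTexter, and the feature set (current, previous, and next word), the sentences, and the labels are all assumptions made for illustration.

```python
import math


def word_features(tokens, i):
    """Binary features for the token at position i (an assumed feature set)."""
    prev_word = tokens[i - 1] if i > 0 else "<s>"
    next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return {f"word={tokens[i]}", f"prev={prev_word}", f"next={next_word}"}


def stump(feat, polarity, feats):
    """Predict `polarity` when the feature is present, otherwise its opposite."""
    return polarity if feat in feats else -polarity


def train_adaboost(examples, rounds=5):
    """Discrete AdaBoost; examples are (feature_set, label) with labels +1/-1."""
    n = len(examples)
    weights = [1.0 / n] * n
    vocabulary = sorted(set().union(*(feats for feats, _ in examples)))
    model = []  # list of (feature, polarity, alpha)
    for _ in range(rounds):
        best = None
        for feat in vocabulary:
            for polarity in (1, -1):
                # Weighted error of this stump under the current weights.
                err = sum(w for w, (feats, y) in zip(weights, examples)
                          if stump(feat, polarity, feats) != y)
                if best is None or err < best[0]:
                    best = (err, feat, polarity)
        err, feat, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((feat, polarity, alpha))
        # Up-weight misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(feat, polarity, feats))
                   for w, (feats, y) in zip(weights, examples)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, feats):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump(feat, polarity, feats)
                for feat, polarity, alpha in model)
    return 1 if score >= 0 else -1


# Hypothetical training sentences with their food words (+1 = food, -1 = other).
sentences = [
    ("i had a bowl of rice today".split(), {"rice"}),
    ("then a cup of coffee".split(), {"coffee"}),
    ("i ate one slice of toast".split(), {"toast"}),
]
examples = [(word_features(tokens, i), 1 if tokens[i] in foods else -1)
            for tokens, foods in sentences for i in range(len(tokens))]
model = train_adaboost(examples)

new_tokens = "a glass of milk".split()
labels = [predict(model, word_features(new_tokens, i))
          for i in range(len(new_tokens))]
```

On this toy data the booster picks the `prev=of` feature, so the unseen word "milk" is labeled as a food because it follows "of". Contextual features let the classifier generalize beyond the food vocabulary seen in training, which is the point of representing each word as a feature vector.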

Results

We show that at each step, our methods for automatically processing SDRs significantly outperform the majority baseline, which classifies each item with the most frequent class. The accuracies of the models at the four steps are 0.95, 0.98, 0.92, and 1.0, respectively. Proceeding stepwise, we obtain a combined accuracy of 0.90 in identifying all food items with their appropriate food quantifiers and accurate temporal annotation in the test data set.
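The majority baseline mentioned above is simple to state in code. The following sketch uses invented label sequences (the class names and counts are hypothetical, not the study's data) to show how the baseline and its accuracy would be computed.

```python
from collections import Counter


def majority_class(train_labels):
    """Return the most frequent class in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical word-level labels for illustration only.
train_labels = ["other", "other", "food", "other", "quantifier", "other"]
test_labels = ["other", "food", "other", "other"]

baseline_label = majority_class(train_labels)
baseline_predictions = [baseline_label] * len(test_labels)
baseline_accuracy = accuracy(baseline_predictions, test_labels)
```

Because most words in a transcript belong to a single class, this baseline can score well on accuracy alone, which is why a learned model is compared against it at each step.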

Discussion

We present a fully implemented stepwise algorithm for automatically extracting key dietary information. We believe this is a significant and innovative approach to dietary assessment that imposes minimal burden on both the consumer and the health care provider. The use of natural language enhances the usability of the system, and the wide availability of mobile phones makes them an ideal medium for collecting this information. The current algorithm focuses on natural language processing techniques that enable automatic processing of SDRs.

References

[1] SN Zulkifli, SM Yu. The food frequency method for dietary assessment. In Journal of the American Dietetic Association, pp. 681-685, 1992.

[2] R Klesges, L Eck, J Ray. Who underreports dietary intake in a dietary recall? Evidence from the Second National Health and Nutrition Examination Survey. In Journal of Consulting and Clinical Psychology, pp. 438-444, 1995.

[3] A Flood, A Subar, S Hull, T Zimmerman, D Jenkins, A Schatzkin. Methodology for adding glycemic load values to the National Cancer Institute diet history questionnaire database. In Journal of the American Dietetic Association, vol. 106(3), pp. 393-402, 2006.

[4] R Schapire, Y Singer. BoosTexter: A boosting-based system for text categorization. In Machine Learning, pp. 135-168, 2000.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu