MIT CSAIL Research Abstracts

Research Abstracts - 2007

Getting to Know You Gradually: Personal Lifetime User Modeling (PLUM)

Max Van Kleek, Michael Bernstein, Robin Stewart, David E. Karger, MC Schraefel & Howard E. Shrobe

Introduction

Progress in user modeling over recent years has demonstrated that models learned from observing users' actions can boost ease and efficiency of application use, improve interaction quality, and save users time and effort. Yet, despite progress in the field, relatively few applications on the desktop today employ user modeling techniques to adapt to users' needs. The field's most visible successes have instead been in recommender systems for online retailers and content providers, which gain leverage by simultaneously amassing profiles of hundreds, thousands, or millions of users. While this approach has been successful for online businesses and marketplaces, it is not easily applied to desktop applications, which have one primary user, and where information may be much more personal and sensitive in nature. One of the primary obstacles to user modeling on the desktop has been the complexity of developing application-specific user modeling systems that learn from user actions. Another is the bootstrapping problem: very little is known about the user when the application is first installed on the user's system.

Our belief is that some of these desktop modeling challenges can be mitigated by decoupling user modeling components from applications, so that models can be shared across applications. In addition to reducing the bootstrapping problem, an advantage to this approach is that it becomes possible to capture task-related contextual connections among applications such as an e-mail client, web browser, and a text editor, which would otherwise be missed by application-centered modeling techniques.

This project seeks to advance techniques in personal lifetime user modeling (PLUM), the idea of building and incrementally refining models of a user through passive observations of their behavior, captured over the long term using their own digital devices. Our overall goals with this project are the following: 1) to perform primary investigative work in PLUM, specifically to identify the major technical challenges and risks for PLUM modeling approaches as a whole; 2) to establish standard practices, i.e., to identify a set of techniques and algorithms that are appropriate for capturing, deriving, or using PLUM models; and 3) to compare PLUM models with other, more common approaches such as collaborative filtering. In addition, we hope to focus on the following problems and application domains, which are unique to PLUM modeling:

Distributed, heterogeneous user activity capture and learning - Our first goal is to address the technical challenges of building rich logs of the user's activity and environment, and to show how these logs can be captured unobtrusively and securely across the user's personal devices, under the user's full discretion and control. Following this, the second feasibility challenge will be to demonstrate how continuous, unsupervised model learning techniques can generate predictive user models from these rich logs. A key challenge is in effectively handling differences among representations of observations, for both the same and unlike phenomena. Diversity in representations for a single phenomenon could arise from the use of different types of sensors for the same phenomenon (e.g., GPS vs. WiFi for location detection) at different times, the selective availability of certain types of observations, or noise-corrupted sensing processes.
Memory prosthesis applications - Once the capture and representational issues are resolved, we wish to investigate the value of the rich activity capture logs themselves -- without deriving sophisticated models -- towards helping people keep track of their past activities. Studies have attributed information re-finding failures, and the inability to understand previously written notes, to memory failures of various types [7], and have found that people misremember what they did in the past far more often than they realize [9]. A main impediment to the use of such logs, however, is the comparative effort required to search and find information of interest. Thus, we will seek ways to make it possible for users to access information kept in these logs efficiently at critical moments when this information is needed.
Situational awareness (user context disambiguation and prediction) - Next, we wish to determine the feasibility of using PLUM logs to derive an individual's "life profile" -- specifically, the people, places, and topics/things that make up the background contexts for the user's usual information manipulation activities. We then wish to see how this profile could make it possible to predict the user's information needs in new scenarios, and to "interpret" the significance of certain contexts for certain types of activities over others. This has several immediate uses, such as prediction, i.e., being able to tell what actions/state the user will likely assume given a particular context, and interpretation, i.e., the ability to tell what makes a particular situation significant, similar to, or different from other situations in the past, and how much a particular situation deviates from situations in the user's normal routine.
Task understanding - Finally, we wish to determine to what extent the system can be made to "understand" the different high-level tasks/activities the user is performing, without the user's intervention. We specifically refer to the system being able to associate specific user actions and resources (such as documents) with a specific, small set of "general activities" that map roughly to the user's own notion of their tasks. Then, we wish to determine whether these models can understand properties of these activities, ranging from the "topics" that characterize them, and the physical context(s) in which they typically occur, to the relative priorities of these activities for the user at a particular moment. An additional challenge we wish to address is mapping from these activities to textual descriptions of "to-do" and project items written by the user. This understanding can then be applied to task tracking, and to allow the system to provide proactive assistance for services or resources that are likely to be needed by the user.
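To make the representation and prediction problems above more concrete, the following is a minimal Python sketch, not PLUM's actual method. It assumes that WiFi and GPS location observations can each be mapped to a shared symbolic place label, and that "prediction" and "deviation from routine" can be approximated by simple frequency counts over (place, hour) contexts. The lookup tables, coordinates, and all class and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical lookup tables; a real system would learn or configure these.
WIFI_PLACES = {"00:11:22:33:44:55": "office", "66:77:88:99:aa:bb": "home"}
GPS_PLACES = [  # (name, latitude, longitude, radius in degrees), made-up values
    ("office", 42.3616, -71.0906, 0.001),
    ("home", 42.3736, -71.1097, 0.001),
]


def place_from_observation(obs):
    """Map a WiFi or GPS observation onto a shared symbolic place label."""
    if obs["sensor"] == "wifi":
        return WIFI_PLACES.get(obs["bssid"], "unknown")
    if obs["sensor"] == "gps":
        for name, lat, lon, radius in GPS_PLACES:
            if abs(obs["lat"] - lat) <= radius and abs(obs["lon"] - lon) <= radius:
                return name
    return "unknown"


class ContextModel:
    """Counts how often each activity was seen in each (place, hour) context."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, obs, hour, activity):
        self.counts[(place_from_observation(obs), hour)][activity] += 1

    def predict(self, obs, hour):
        """Return the most frequent activity for this context, or None if unseen."""
        seen = self.counts.get((place_from_observation(obs), hour))
        return seen.most_common(1)[0][0] if seen else None

    def surprise(self, obs, hour, activity):
        """1 minus the activity's relative frequency here; 1.0 for a new context."""
        seen = self.counts.get((place_from_observation(obs), hour))
        if not seen:
            return 1.0
        return 1.0 - seen[activity] / sum(seen.values())
```

Because both sensor types reduce to the same place label, observations from either one update and query the same counts. A learned model would replace the hand-written tables and the raw frequency estimate.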

Related work

The examination of ways to capture personal interaction histories to serve as personal memory prostheses has led us to the closely related field of automatic, continuous archival of personal experiences (i.e., life logging). Early work in this space using mobile/wearable devices was done by Lamming et al. [7], whose system built an "automatic diary" based upon a user's physical movement in the workplace; and by Rhodes et al. [8], who demonstrated how sensed location and other physical context (such as the identity of nearby persons) could be used to retrieve notes created in similar contexts. Microsoft Research's more recent MyLifeBits project has sought to build a lifetime personal information store that exploits spatial and temporal correlations among resources for retrieval tasks, such as searching for documents based on co-occurring events, co-located items, and time of access [4].

With respect to examining users' information manipulation behaviors at the desktop, the CALO IRIS project [2] has produced an instrumented desktop environment capable of recording every action the user performs. Additionally, IRIS uses a rich semantic grounding for observations, which served as inspiration for our knowledge representation (KR). PLUM differs from IRIS in that it attempts to instrument the user's desktop applications without modifying them in ways noticeable to the user, which has a considerable adoption (and long-term maintenance) advantage over re-implementing an entire suite of desktop applications and forcing users to switch to a foreign interface.

On the modeling frontier, a new open-source toolkit by Fogarty and Hudson called SUBTLE [3] seeks to simplify the process of building statistical models of users' activities. We plan to investigate SUBTLE's capabilities and techniques, and may consider integrating it with PLUM when the system is released.

Status

With respect to context capture, we have demonstrated a working prototype of Chron, PLUM's capture framework, written in Java, with a set of knowledge sources designed for the Apple MacBook [10]. The current set of knowledge sources monitors a user's interactions with their desktop applications on Mac OS X (via integration with AppleScript), filesystem activity, the user's location via WiFi, and accelerometer readings from the Sudden Motion Sensor, and takes periodic images of the user using the integrated iSight. Observations are represented in RDF and written via the Jena RDF API. We have demonstrated that Chron works effectively with under 5 percent CPU utilisation (combined with MySQL) when set to 3 Hz with 10 knowledge sources. (On the same machine, iTunes consistently consumes 6-12 percent CPU while playing mp3s.)
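The capture loop described above can be illustrated with a short Python sketch. It is not Chron's code (Chron is written in Java and writes RDF through Jena); it only assumes that each knowledge source can be polled for property/value pairs at a fixed rate and that each reading is written out as RDF triples, here serialized as N-Triples strings. The namespace, the KnowledgeSource interface, and the property names are hypothetical.

```python
import time
from datetime import datetime, timezone

EX = "http://example.org/plum#"  # hypothetical namespace, not Chron's vocabulary


class KnowledgeSource:
    """Hypothetical interface: a named sensor polled for property/value pairs."""

    def __init__(self, name, read):
        self.name = name
        self.read = read  # callable returning a dict of property -> value


def literal(value):
    """Format a value as a plain N-Triples string literal with basic escaping."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{text}"'


def to_ntriples(source, reading, when, seq):
    """Serialize one reading as N-Triples lines sharing a blank-node subject."""
    subject = f"_:obs{seq}"
    lines = [
        f"{subject} <{EX}source> {literal(source.name)} .",
        f"{subject} <{EX}observedAt> {literal(when.isoformat())} .",
    ]
    for prop, value in sorted(reading.items()):
        lines.append(f"{subject} <{EX}{prop}> {literal(value)} .")
    return lines


def poll(sources, rate_hz, cycles, sink, clock=time.monotonic, sleep=time.sleep):
    """Poll every source once per cycle at roughly rate_hz, passing triples to sink."""
    period = 1.0 / rate_hz
    seq = 0
    for _ in range(cycles):
        started = clock()
        for source in sources:
            now = datetime.now(timezone.utc)
            for line in to_ntriples(source, source.read(), now, seq):
                sink(line)
            seq += 1
        # Sleep for whatever remains of this cycle's time budget.
        sleep(max(0.0, period - (clock() - started)))
```

For example, calling poll(sources, rate_hz=3, cycles=10, sink=print) with a list of KnowledgeSource objects would print ten rounds of observations. Passing the sink as a parameter keeps storage (a file, a database, a triple store) separate from capture.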

With respect to our second goal, of demonstrating the value of captured activity logs, we are currently prototyping an augmented personal journal application known as jourKnow, which establishes correspondences between the time each piece of text in the journal was created and the user's environment and activities surrounding that moment. We hope to evaluate whether such correspondences can be used to facilitate re-finding within the journal, reduce the effect of memory decay on identifying the meaning of journal entries, and help users fill in missing bits of their journal. See [1] for a preliminary discussion.
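One simple way to establish such correspondences is to join journal entries to logged observations by timestamp. The Python sketch below is illustrative only and is not jourKnow's implementation. It assumes observations are stored as time-sorted (timestamp, description) pairs, and that the context of an entry is everything logged within a fixed window around its creation time. The function names and the 300-second default are hypothetical.

```python
from bisect import bisect_left, bisect_right


def context_for_entry(entry_time, observations, window=300):
    """Return observations logged within `window` seconds of a journal entry.

    `observations` is a list of (timestamp, description) tuples sorted by
    timestamp; timestamps are seconds since an arbitrary epoch.
    """
    times = [t for t, _ in observations]
    lo = bisect_left(times, entry_time - window)
    hi = bisect_right(times, entry_time + window)
    return observations[lo:hi]


def entries_near_context(entries, observations, query, window=300):
    """Re-finding: return journal entries whose surrounding context mentions `query`."""
    return [
        text
        for when, text in entries
        if any(query in desc for _, desc in context_for_entry(when, observations, window))
    ]
```

With this join in place, a user who remembers only where they were when writing a note could search on the place name rather than on the note's text.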

For more information

Please see PLUM's web site for code releases and project updates.

Acknowledgements

PLUM is part of the ConnectingME project, and is supported by CSAIL and the MIT-Nokia collaboration.

References:

[1] Michael Bernstein, Max Van Kleek, mc schraefel, and David Karger. "Managing personal information scraps". Work in progress, Proceedings of CHI 2007, San Jose, CA, April 2007.

[2] Adam Cheyer, Jack Park and Richard Giuli. "IRIS: Integrate. Relate. Infer. Share." Workshop on the Semantic Desktop. Proceedings of the International Semantic Web Conference, Galway, Ireland, 2005.

[3] James Fogarty and Scott E. Hudson. "Toolkit for Developing and Deploying Sensor-Based Statistical Models of Human Situations". In Proceedings of CHI 2007, San Jose, CA, USA, April 2007.

[4] Jim Gemmell, Gordon Bell, Roger Lueder, Steven Drucker and Curtis Wong. "MyLifeBits: Fulfilling the MEMEX Vision". In Proceedings of ACM Multimedia, 2002.

[5] Rosco Hill and James "Bo" Begole. "Activity rhythm detection and modeling". In Proceedings of CHI 2003, Ft. Lauderdale, Florida, USA, 2003.

[6] Vaiva Kalnikaite and Steve Whittaker. "Software or Wetware? Discovering When and Why People Use Digital Prosthetic Memory". In Proceedings of CHI 2007, San Jose, CA, USA, April 2007.

[7] Mik Lamming. "Forget-me-not: intimate computing in support of human memory". In Proceedings of FRIEND21 Symposium on Next Generation Human Interfaces, Tokyo, Japan, 1994.

[8] Bradley Rhodes. "The wearable remembrance agent: A System for Augmented Memory". In Proceedings of the First International Symposium on Wearable Computers (ISWC '97), Cambridge, MA, 1997.

[9] Daniel L. Schacter. The Seven Sins of Memory: How the Mind Forgets and Remembers. Houghton-Mifflin, 2002.

[10] Max Van Kleek and Howard E. Shrobe. "A Practical Activity Capture Framework for Personal, Lifetime User Modeling". In Proceedings of 11th International Conference on User Modeling (UM 2007), Corfu, Greece, June 2007.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu