MIT CSAIL Research Abstracts

Scene Modeling for Far-Field Activity Analysis

Tomas Izo & Eric Grimson

Fig. 1: Example scene model: a sample of ~3000 moving object tracks (left) grouped into ~100 clusters (right). Tracks of different colors belong to different clusters (colors are reused).

Introduction

A key goal of any visual surveillance system is to automatically determine when an observed scene contains unusual or unexpected activity. In the past, this task was performed by a human expert: someone familiar with the scene who was able to recognize when something out of the ordinary occurred. Today, a typical surveillance site may have so many sensors in different locations that it is no longer feasible for a person to monitor all of them, so machine vision systems are needed to mine the collected data for potentially interesting activity. This need has fostered a new area of machine vision research, aimed at building statistical models of the usual patterns of activity in scenes.

Methods

We define an activity as the movement of a single object, such as a car, a person, or a group of people, through the scene. We refer to activities as tracks; the observed aspects of a track include the path the object takes, its size, its velocity, and its direction of motion, among other features. We propose a method for unsupervised, multi-feature learning of a statistical model of activities that can make use of a very large data set for learning, yet enables fast reasoning about new tracks, both partial and complete. We use a versatile similarity measure that groups tracks together only when they are similar along all of the observed aspects. We group tracks using spectral clustering and estimate the spectral embedding efficiently from a sample of tracks using the Nyström approximation [1]. Clusters are modeled as Gaussians in the embedding space; new tracks are projected into the embedding space and matched against the cluster models to detect anomalies. The ability to reason about partial tracks makes it possible to detect surprising moments, which occur when the belief distribution for a given track changes suddenly.
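
The embedding step can be illustrated with a minimal sketch. It assumes each track has already been summarized as a fixed-length feature vector and uses a Gaussian kernel as a stand-in for the similarity measure; both are assumptions made for illustration, as are the function names and the `sigma` parameter. The sketch also omits the degree normalization and orthogonalization that a full normalized spectral clustering would require, so it shows only the core idea of extending eigenvectors computed on a sample to other tracks.

```python
import numpy as np


def gaussian_affinity(X, Y, sigma=1.0):
    """Pairwise affinities exp(-||x - y||^2 / (2 sigma^2)) between rows of X and Y.

    The Gaussian kernel is an illustrative assumption, not the similarity
    measure used in the abstract.
    """
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def nystrom_fit(landmarks, k, sigma=1.0):
    """Eigendecompose the affinity matrix of the sampled tracks; keep the top-k pairs."""
    A = gaussian_affinity(landmarks, landmarks, sigma)
    vals, vecs = np.linalg.eigh(A)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]         # indices of the k largest eigenvalues
    return vals[top], vecs[:, top]


def nystrom_embed(points, landmarks, vals, vecs, sigma=1.0):
    """Nystrom extension: embed points using only their affinities to the sample."""
    B = gaussian_affinity(points, landmarks, sigma)   # shape (m, n_landmarks)
    return B @ vecs / vals                            # shape (m, k)
```

Because `nystrom_embed` needs only the affinities between a track and the sampled tracks, a new track can be projected into the embedding space without recomputing the eigendecomposition, provided its features can be compared with those of the sample.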

Results

To learn a statistical model of activity in the scene, we use a set of approximately 40,000 moving object tracks, corresponding to an entire week of activity in a busy urban outdoor scene. Figure 2 shows two examples of unusual activities, detected by thresholding the likelihood of each activity under the learned model. Beyond such qualitative results, we demonstrate the validity of our scene modeling and anomaly detection framework by collecting human judgments for a sample of several hundred examples from the data set and showing that they are correlated with the likelihoods of those examples under the model.
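
Likelihood thresholding against Gaussian cluster models can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mixture weighting by cluster size, the covariance regularizer `reg`, the function names, and the way the threshold is chosen are all hypothetical, and each cluster is assumed to contain at least two embedded tracks.

```python
import numpy as np


def fit_cluster_gaussians(embedded, labels, reg=1e-6):
    """Fit one (weight, mean, covariance) triple per cluster of embedded tracks."""
    models = []
    for c in np.unique(labels):
        pts = embedded[labels == c]
        mean = pts.mean(axis=0)
        # Small diagonal term keeps the covariance invertible (assumed regularizer).
        cov = np.atleast_2d(np.cov(pts, rowvar=False)) + reg * np.eye(pts.shape[1])
        models.append((len(pts) / len(embedded), mean, cov))
    return models


def log_likelihood(x, models):
    """Log-likelihood of one embedded track under the mixture of cluster Gaussians."""
    d = x.shape[0]
    terms = []
    for weight, mean, cov in models:
        diff = x - mean
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)   # squared Mahalanobis distance
        terms.append(np.log(weight) - 0.5 * (d * np.log(2.0 * np.pi) + logdet + maha))
    return np.logaddexp.reduce(terms)              # log of the summed densities


def is_unusual(x, models, threshold):
    """Flag a track whose log-likelihood falls below the chosen threshold."""
    return log_likelihood(x, models) < threshold
```

One possible way to pick the threshold, again an assumption, is a low percentile of the log-likelihoods of the training tracks, for example `np.percentile(train_ll, 1)`, so that only tracks less likely than nearly all of the training data are flagged.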

Fig. 2: Examples of detected unusual activity. Left: A person walking along an unusual path and dropping off a large object (suddenly changing size). Right: A car driving into the pedestrian zone and turning around.

References:

[1] Charles F. Fowlkes, Serge Belongie, Fan Chung, and Jitendra Malik. Spectral grouping using the Nyström method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):214-225, 2004.