Abstracts - 2007
WaveScope: A Wireless Sensor Computing System for High Data Rate Applications
Lewis Girod, Yuan Mei, Ryan Newton, Stan Rost, Arvind Thiagarajan, Hari Balakrishnan & Sam Madden
Collaborators: Kevin Amaratunga (Metis), Ivan Stoianov (Intel), Charles E. Taylor (UCLA), Daniel T. Blumstein (UCLA), Travis Collier (UCLA), Andreas Ali (UCLA), Eugene Shih (CSAIL)
The WaveScope project is developing a software platform to make it easy to develop, deploy, and operate wireless sensor networks that exhibit high data rates. In contrast to the "first generation" of wireless sensor networks that are characterized by relatively low sensor sampling rates, there are several important emerging applications in which high rates of hundreds to tens of thousands of sensor samples per second are common. These include:
Our target applications use many embedded vibration, seismic, pressure, magnetic, acoustic, and image sensors that sample at fine resolutions and high rates. For example, pressure transient monitoring of water transmission pipelines requires sampling rates of approximately 1-2 kHz [1,2], vibration monitoring of airplane wings and industrial equipment requires sampling at tens of kHz, etc.
These applications are generally continuously vigilant, processing the data streams in real time, and triggering more sophisticated processing or actuation when particular events are detected.
Questions and analyses applied in a WaveScope system might include:
Answering these questions requires both signal processing and relational database operations over "real-time" data samples as well as data from the recent past. They also require efficient wireless data delivery protocols for large "blobs" of signal data.
Our research focuses on two novel directions:
WaveScope will enable users and developers to compose systems from modular building blocks that analyze the source data to produce information and to ask questions at different levels of abstraction. One of the implications is to allow deployment of monitoring applications without requiring the user to write code for an embedded platform. Our declarative programming system and optimized network protocols will free application writers from these gory details (which are hard to get right).
We are developing a high-level, declarative application description language that enables automated design-time optimizations through compilation to an intermediate representation. This language and its optimization algorithms will enable developers to compose systems from high-level operators while ensuring good performance. Through compilation, WaveScope can perform computational optimizations including common subexpression elimination, factoring, and elimination of redundant operators. This language will also enable WaveScope to automate the process of distributing the processing components into a network of sensor devices. By pushing processing towards the data, it is possible to substantially reduce network overhead.
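As a rough illustration of common subexpression elimination on a dataflow graph, the sketch below merges operators that apply the same operation to the same inputs. The graph encoding (a dict from node name to an `(operator, inputs)` pair), the function name, and the operator names are all assumptions made for this example; they are not WaveScope's intermediate representation. The sketch also assumes that operators are deterministic, so two nodes with identical operator and inputs always produce identical output.

```python
def eliminate_common_subexpressions(graph, order):
    """Merge nodes that apply the same operator to the same inputs.

    graph: dict mapping node name -> (operator_name, tuple_of_input_names)
    order: node names in topological order (inputs before consumers)
    Returns (new_graph, alias), where alias maps every original node name
    to the surviving node that computes its value.
    """
    alias = {}      # original node name -> surviving node name
    seen = {}       # (operator, rewritten inputs) -> surviving node name
    new_graph = {}
    for name in order:
        operator, inputs = graph[name]
        # Point inputs at surviving nodes so that duplicates further
        # down the graph become recognizable as duplicates too.
        rewritten = tuple(alias[i] for i in inputs)
        key = (operator, rewritten)
        if inputs and key in seen:
            alias[name] = seen[key]          # duplicate: reuse earlier node
            continue
        if inputs:                           # never merge distinct sources
            seen[key] = name
        alias[name] = name
        new_graph[name] = (operator, rewritten)
    return new_graph, alias


# Hypothetical application: two queries over one pressure stream that
# each declare their own FFT and band-pass stage.
graph = {
    "pressure": ("source", ()),
    "fft_a": ("fft", ("pressure",)),
    "fft_b": ("fft", ("pressure",)),
    "band_a": ("bandpass", ("fft_a",)),
    "band_b": ("bandpass", ("fft_b",)),
    "peaks": ("peak_detect", ("band_a",)),
    "energy": ("sum_energy", ("band_b",)),
}
order = ["pressure", "fft_a", "fft_b", "band_a", "band_b", "peaks", "energy"]
new_graph, alias = eliminate_common_subexpressions(graph, order)
```

In this example the second FFT and the second band-pass stage are dropped, and `energy` reads from `band_a`, so the shared prefix of the two queries is computed once.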
WaveScope's goal is to provide a clean and readily understood model of operation, to enable new users to get started quickly and run it out-of-the-box on the most common platforms. As a result, we aim to identify and include the most commonly used signal processing operations (such as the Fourier and Wavelet Transforms and Filtering) as a complete set of primitive operators. Where possible, we are working to integrate external packages and signal processing code (e.g., Matlab, Octave, and Ptolemy) into WaveScope.
Experience with Hydraulic Transients: Implementations in Matlab, Java Simulator, and the Borealis stream processing system
We have experimented with leak localization on a water transmission pipeline by analysis of hydraulic transients. Generated by a sudden change in water velocity, a hydraulic transient is a pressure wave that reflects from leaks. We have outlined a set of operators needed for analyzing hydraulic transients in pressure data.
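Because the transient reflects from a leak, the delay between the initial wavefront and its reflection at a sensor indicates how far away the leak is. The sketch below shows that arithmetic under simplifying assumptions that are ours, not the project's: the wave speed is constant along the pipe, and the reflection returns along the same path, so the distance is half the round trip. The function names and the 1000 m/s wave speed are illustrative only.

```python
def delay_from_samples(n_samples, sample_rate_hz):
    """Convert a delay measured in samples to seconds."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return n_samples / sample_rate_hz


def leak_distance_m(delay_s, wave_speed_m_per_s):
    """Distance from sensor to reflector, given the round-trip delay.

    The wave travels to the leak and back, so the one-way distance is
    half of (wave speed x delay).
    """
    if delay_s < 0 or wave_speed_m_per_s <= 0:
        raise ValueError("delay must be >= 0 and wave speed must be > 0")
    return wave_speed_m_per_s * delay_s / 2.0


# Illustrative numbers: a 2 kHz pressure sensor sees the reflection
# 500 samples after the initial wavefront; assumed wave speed 1000 m/s.
delay = delay_from_samples(500, 2000.0)      # 0.25 s
distance = leak_distance_m(delay, 1000.0)    # 125.0 m
```

Under the same assumptions, one sample period at 2 kHz corresponds to 0.25 m of distance, which is one way to see why the sampling rate bounds the achievable localization resolution.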
We have also devised a distributed algorithm for in-network leak localization. The distributed approach combines locations of leak signatures from several sensors to create an improved estimate of the leak location. We also found that in some cases, selecting the sensors to be included in this estimation process increased the accuracy of the final leak location estimate.
Besides Matlab and Java-based simulations, we have also implemented leak detection using the Borealis stream processing system. In this evaluation, we added two new operators to the original Borealis system: a wavelet transformation operator, and a range selection operator for peak extraction.
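To give a feel for what such a pair of operators might compute, here is a minimal sketch. It assumes a single-level Haar wavelet and a range selection defined on coefficient magnitude; the text does not say which wavelet or which selection rule the Borealis operators used, so both choices and both function names are hypothetical.

```python
import math


def haar_detail(samples):
    """Detail coefficients of one level of the Haar wavelet transform.

    Each coefficient is the scaled difference of one non-overlapping
    pair of samples, so it is large where the signal changes abruptly.
    """
    if len(samples) % 2:
        raise ValueError("need an even number of samples")
    scale = math.sqrt(2.0)
    return [(samples[i] - samples[i + 1]) / scale
            for i in range(0, len(samples), 2)]


def select_range(values, low, high):
    """Return (index, value) pairs whose magnitude lies in [low, high]."""
    return [(i, v) for i, v in enumerate(values) if low <= abs(v) <= high]


# Illustrative pressure trace with a sudden drop after the fifth sample.
pressure = [5.0, 5.0, 5.0, 5.0, 5.0, 3.0, 3.0, 3.0]
details = haar_detail(pressure)
peaks = select_range(details, 1.0, 10.0)
```

Here only the third coefficient is selected, which marks the pair of samples containing the drop. A single decimated level misses a step that falls exactly between two pairs, so a practical operator would likely use several levels or an undecimated transform.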
Our initial evaluation of Borealis suggested two limitations that make traditional stream processors (like Borealis) an inappropriate starting place for WaveScope:
The initial experiences mentioned above, including Borealis and leak detection, and other signal-processing sensor systems like acoustic localization, bird chirp identification, and audio-based tracking, have led us towards our current goal of designing a high-level data processing language: the Wavescope HDL. This language is designed for a system that supports high data rates (hundreds of kHz), includes a library of signal processing operators, and will permit more interesting constructs for data flow including cycles, delays, and other logic not available in existing stream processors like Borealis.
We have designed new protocols for disseminating and processing large application data units, corresponding to thousands of samples collected at high rates. The main challenge is dealing with variable wireless channel conditions and packet losses. To perform efficient "in-the-net" processing, nodes create state. Unfortunately, packet losses and topology changes make it hard to keep this state consistent. We are developing algorithms to address these problems.
The main difference between our protocols and previous work is that most previous work has dealt with small chunks of application data, while our domain mandates large data chunks that are considerably larger than the size of a network packet.
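To make the contrast concrete, the sketch below splits one large application data unit into packet-sized fragments and lets the receiver report which fragments were lost so that only those are resent. It is a generic illustration, not the WaveScope protocol: the header fields `(blob_id, seq, total)`, the class name, and the 100-byte payload size are assumptions, and the receiver handles a single blob at a time.

```python
def fragment(blob, blob_id, payload_size):
    """Split a blob into (blob_id, seq, total, payload) packets."""
    total = (len(blob) + payload_size - 1) // payload_size
    return [(blob_id, seq, total,
             blob[seq * payload_size:(seq + 1) * payload_size])
            for seq in range(total)]


class Reassembler:
    """Collects the fragments of one blob and tracks what is missing."""

    def __init__(self):
        self.parts = {}     # sequence number -> payload
        self.total = None   # fragment count, learned from any packet

    def receive(self, packet):
        _, seq, total, payload = packet
        self.total = total
        self.parts[seq] = payload   # duplicates simply overwrite

    def missing(self):
        """Sequence numbers not yet received (None if nothing arrived)."""
        if self.total is None:
            return None
        return [s for s in range(self.total) if s not in self.parts]

    def assemble(self):
        """Return the full blob, or None while fragments are missing."""
        if self.total is None or self.missing():
            return None
        return b"".join(self.parts[s] for s in range(self.total))


# A 10240-byte blob sent in 100-byte payloads; fragments 7 and 50 are lost.
blob = bytes(range(256)) * 40
packets = fragment(blob, blob_id=1, payload_size=100)
rx = Reassembler()
for pkt in packets:
    if pkt[1] not in (7, 50):
        rx.receive(pkt)
lost = rx.missing()             # the receiver asks only for these
for seq in lost:
    rx.receive(packets[seq])    # selective retransmission
result = rx.assemble()
```

The per-blob state held by `Reassembler` is a small instance of the state that becomes hard to keep consistent when packets are lost or the topology changes.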
Initial progress on a prototype Wavescope system
After some early work implementing Wavescope applications within Borealis, we began work on a new system that would be better suited for the many applications we envision. Progress on this system has been rapid and it is already functional enough to be used for some of our applications.
We are developing Wavescope in two tracks: one track developing the engine in C++, and implementing some applications directly in C++, and a second track developing the Wavescript language and compiler, which will eventually compile to C++ and interface to the engine. Currently some of our applications are coded directly in C++, and some are coded in Wavescript. As Wavescript becomes more feature-complete, we expect to move all application code over to that system.
We have published a CIDR paper that describes the high-level design and goals of Wavescope and Wavescript. This paper solidifies many of the language concepts and provides an example of an application implemented in Wavescript.
Our progress on the engine has led to very good performance results when compared with other streaming database systems, even commercial ones. It is not unusual to see performance improvements of three orders of magnitude. This is largely the result of a number of optimizations in the engine and the memory manager that reduce scheduling overhead and memory footprint for high-rate data. We have a VLDB paper under submission that describes the performance characteristics of the Wavescript engine and details some of these optimizations.
There is much future work in tuning the engine, in particular tuning for higher performance on multi-core processors. We also plan to implement many compiler optimizations within the Wavescript compiler.
Experience with a bio-acoustic wildlife localization application
We have been pursuing a wildlife monitoring application, in collaboration with several biologists from UCLA. In this work, we are developing software in Wavescope to enable real-time localization of animal calls. The localization system works by continuously monitoring the audio channel, searching for a profile matching the target animal. This is typically done by summing the signal energy over a selected band and detecting sudden increases in energy. When a possible call is detected, the direction of arrival of the signal is determined by an approximate maximum-likelihood (AML) algorithm. These estimates are then combined to determine the most likely source location of the animal call. These algorithms are described in detail in a recent IPSN publication.
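The detection step (summing energy over a selected band and flagging sudden increases) can be sketched as follows. This is a simplified illustration rather than the detector used in the project: the frame length, the threshold ratio, the smoothing constant, and the function names are all assumptions, and a naive DFT stands in for an optimized FFT so that the example needs only the standard library.

```python
import cmath
import math


def band_energy(frame, sample_rate_hz, low_hz, high_hz):
    """Sum of squared DFT magnitudes for bins inside [low_hz, high_hz]."""
    n = len(frame)
    energy = 0.0
    for k in range(n // 2 + 1):
        if low_hz <= k * sample_rate_hz / n <= high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(frame))
            energy += abs(coeff) ** 2
    return energy


def detect_onsets(signal, sample_rate_hz, frame_len, low_hz, high_hz,
                  ratio=5.0, smoothing=0.9, floor=1e-9):
    """Return indices of frames whose band energy jumps above background.

    The background is a running average of the band energy that is only
    updated on frames that are not flagged, so a long call does not
    raise its own threshold.
    """
    onsets = []
    background = None
    for idx in range(len(signal) // frame_len):
        frame = signal[idx * frame_len:(idx + 1) * frame_len]
        energy = band_energy(frame, sample_rate_hz, low_hz, high_hz)
        if background is None:
            background = energy              # first frame seeds the estimate
        elif energy > ratio * max(background, floor):
            onsets.append(idx)               # sudden increase: possible call
        else:
            background = (smoothing * background
                          + (1.0 - smoothing) * energy)
    return onsets


# Synthetic test signal: a faint 2 kHz hum throughout, plus a loud
# 2 kHz "call" during frames 5 and 6 (8 kHz sampling, 64-sample frames).
RATE, FRAME = 8000, 64
signal = []
for t in range(10 * FRAME):
    amplitude = 0.01
    if 5 * FRAME <= t < 7 * FRAME:
        amplitude += 1.0
    signal.append(amplitude * math.sin(2 * math.pi * 2000 * t / RATE))

onsets = detect_onsets(signal, RATE, FRAME, 1500.0, 2500.0)
print(onsets)  # [5, 6]
```

In a full system, each flagged frame would be handed to the more expensive direction-of-arrival stage, which is why a cheap always-on detector in front of it matters.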
We have participated in some of the UCLA field work (July 2006) in order to better understand their requirements and to help gather data. In this work we have used an embedded acoustic platform that was originally developed at UCLA. We see this platform as an ideal platform for Wavescope systems, and in fact we have already ported the current version of Wavescope to this platform.
Although the field work in 2006 did not use Wavescope, we were able to collect large amounts of test data, which we could later use to further develop and test our Wavescope software with realistic application requirements. These initial efforts have produced some hard performance data and have been quite useful in better understanding the behavior of the system.
Experience with a seizure detection application
Our first external Wavescope user is Eugene Shih, who is investigating power saving schemes in an embedded epileptic seizure monitor. He has translated an existing seizure detection algorithm into Wavescope and has been continuing to develop it there to add adaptive power saving algorithms. So far his adaptive scheme has been implemented and he has successfully ported it to his ARM-based embedded platform. The limited resources of the embedded platform did not pose a problem, and the Wavescope version of this algorithm uses only about 30% of the CPU. He has found Wavescope to be helpful, and has provided good feedback about the aspects that are not so easy to use. As Wavescript matures, he will be able to transition to Wavescript and will be able to provide feedback about that layer of the system as well.
Ptolemy II [4,5] is a Java-based framework for dataflow software and modeling that has been under development at UC Berkeley in various forms since 1987. Using Ptolemy, dataflow systems are implemented through a combination of reusable Java components and an XML-based high-level description language that defines how the components are linked together. Ptolemy supports a wide variety of different dataflow models, called "Domains", including finite state machines, synchronous signal processing, circuit emulation, and hybrids that combine elements of different domains. This framework has been used to support various forms of optimization as well as automated synthesis of embedded signal processing systems.
Relative to Ptolemy, WaveScope's framework targets the embedded sensor network more directly, and focuses on providing a framework for implementing systems, whereas Ptolemy is more commonly used as a modeling or instructional tool. WaveScope directly addresses the problem of distributing a workflow into a network, distributing some computation to the sensors while centralizing other computation at the server. WaveScope also places greater emphasis on enabling automatic optimization of computational, buffering, and network resources. Finally, many WaveScope applications involve forms of conditional processing that are not conveniently implemented using the existing Ptolemy domains.
Borealis is an event-driven stream processing engine designed to support streams of data from sensor networks, following on from work on the previous systems Aurora and Medusa. Borealis is configured using a high-level description language based on XML that links together modules implemented in C++. It supports distribution of processing throughout the network by an RPC-like mechanism that enables local-remote transparency for connections between modules. Relative to other stream processing engines, Borealis provides better support for dynamics: query results can be revised dynamically in the event that corrections are made to the input stream, and the queries themselves can be dynamically selected based on detected events. We have used Borealis to build some initial WaveScope prototypes, although it is still not clear whether it will be a good fit in the long run.
TelegraphCQ is one of the first engines to process data continuously as it arrives instead of storing it first. In TelegraphCQ, users express queries in SQL, except that each query has an extra clause defining the input windows over which the results should be computed. The main emphasis of TelegraphCQ is on adaptive processing, in which tuples can be routed to different modules based on changing conditions. This adaptivity enables better performance in the face of variable conditions during query execution, but introduces greater overhead.
The STREAM project explores several aspects of stream processing: a new data model and query language for streams, resource management, and some distributed operation. STREAM processes data as it arrives, converting the stream to a relation that changes over time. These relations are processed and converted back to an output stream. This arrangement allows STREAM applications to be composed of a mix of operations on streams and relations.
Relative to both STREAM and TelegraphCQ, WaveScope proposes more sophisticated network capabilities, especially the ability to push signal processing components close to the sensor to limit network overhead. WaveScope also proposes a more sophisticated buffer management scheme that enables the system to store data that may be needed later as a result of conditional processing.
LabView and Matlab
LabView is a commercial product that is similar in functionality to Ptolemy, but is noted for its extensive compatibility with industrial equipment. Matlab serves as a source of inspiration for WaveScope development, because of its wide acceptance and high degree of expressiveness.
WaveScope intends to provide users with an interface that supports tinkering and exploration of data streams with similar ease to Matlab and LabView.
 Stoianov, I., Dellow, D., Maksimovic, C. and Graham, N.J.D. Field Validation of the Application of Hydraulic Transients for Leak Detection in Transmission Pipelines. In The Proceedings of CCWI 2003 Advances in Water Supply Management Conference, London, UK, September 2003.
 Stoianov, I., Maksimovic, C. and Graham, N.J.D. Designing a Continuous Monitoring System for Transmission Pipelines. In The Proceedings of CCWI 2003 Advances in Water Supply Management Conference, London, UK, September 2003.
 Buck, J.T., Ha, S., Lee, E.A., and Messerschmitt, D.G., Ptolemy: A Framework for Simulating and Prototyping Heterogeneous Systems. In Int. Journal of Computer Simulation, special issue on Simulation Software Development, vol. 4, pp. 155-182, April, 1994.
 Ha, S. and Lee, E.A., Compile-Time Scheduling of Dynamic Constructs in Dataflow Program Graphs. In IEEE Trans. on Computers, Vol. 46, No. 7, July 1997.
 Bhattacharyya, S.S., Murthy, P.K., and Lee, E.A., Synthesis of Embedded Software from Synchronous Dataflow Specifications. In Journal of VLSI Signal Processing Systems, Vol. 21, No. 2, June 1999.
 Abadi, D.J., Ahmad, Y., Balazinska, M., Cetintemel, U., Cherniack, M., Hwang, J-H., Lindner, W., Maskey, A.S., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., and Zdonik, S. The Design of the Borealis Stream Processing Engine. In CIDR 2005 - Second Biennial Conference on Innovative Data Systems Research, Asilomar, California, January 2005.
 Arasu, A., Babcock, B., Babu, S., Cieslewicz, J., Datar, M., Ito, K., Motwani, R., Srivastava, U., and Widom, J. STREAM: The Stanford Data Stream Management System. To appear in a book on data stream management edited by Garofalakis, Gehrke, and Rastogi.
 Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., and Shah, M. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proc. of the First Biennial Conference on Innovative Data Systems Research (CIDR), January 2003.
 Ali, A., Collier, T., Girod, L., Yao, K., Taylor, C.E., Blumstein, D.T. An Empirical Study of Collaborative Acoustic Source Localization. In Proc. of Information Processing in Sensor Networks (IPSN07), Cambridge, MA, April 2007.
 Girod, L., Lukac, M., Trifa, V., and Estrin, D. The Design and Implementation of a Self-calibrating Acoustic Sensing Platform. In Proc. of the ACM Conference on Embedded Networked Sensor Systems (SenSys 2006), Boulder, CO, November 2006.
 Girod, L., Jamieson, K., Mei, Y., Newton, R., Rost, S., Thiagarajan, A., Balakrishnan, H., and Madden, S. The Case for WaveScope: A Signal-Oriented Data Stream Management System. In Proc. of the Third Biennial Conference on Innovative Data Systems Research (CIDR), Monterrey, CA, January 2007.
This work is funded by the National Science Foundation under Award Number CNS-0520032.