CSAIL Publications and Digital Archive header
bullet Technical Reports bullet Work Products bullet Research Abstracts bullet Historical Collections bullet

link to publications.csail.mit.edu link to www.csail.mit.edu horizontal line

 

Research Abstracts - 2006
horizontal line

horizontal line

vertical line
vertical line

Communication-Efficient Probabilistic Quorum Systems for Sensor Networks

Gregory Chockler, Seth Gilbert & Boaz Patt-Shamir

Abstract

Communication-efficiency is of key importance when constructing robust services in limited bandwidth environments, such as sensor networks. We focus on communication-efficiency in the context of quorum systems, which are useful primitives for building reliable distributed systems. To this end, we exhibit a new probabilistic quorum construction in which every node transmits at most O(log^2 n) bits per quorum access, where n is the number of nodes in the system. Our implementation, in addition to being communication efficient, is also robust in the face of communication failures. In particular, it guarantees consistency (with high probability) in the face of network partitions. To the best of our knowledge, no existing probabilistic quorum systems achieve polylogarithmic communication complexity and are resilient to network partitions.

Introduction

Quorum systems have long been used for improving robustness, efficiency and availability of distributed systems. Among their applications are data replication, mutual exclusion, data dissemination, and many others. In a typical quorum-based distributed system, the servers store data on behalf of clients, which access the data by interacting with subsets of servers, called quorums, such that every pair of quorums intersect.

In this project, we explore the problem of designing efficient quorum systems for environments with limited computational and networking resources, such as sensor networks (sensor nets). We consider a typical sensornet setting where small battery-powered devices (sensors) with wireless connectivity are deployed in an environment without pre-existing networking infrastructure. The sensors may act as servers that collect and store data intended to be accessed by more powerful client machines. They can also act as clients that store and access information at other sensors for better availability and/or load balancing. Due to the ad hoc deployment, the communication network is formed by the devices themselves so that nearby nodes communicate directly, and the nodes residing outside of each other's communication ranges communicate indirectly through other nodes. In addition, due to collisions and other electromagnetic interference, the network can experience frequent message loss and connectivity changes.

It is therefore, a natural question to ask whether the per-node communication overhead can be reduced regardless of the topology of the network, the method of quorum access, and intermittent connectivity. Specifically, is logarithmic or polylogarithmic bit complexity achievable?

Progress

In this project, we answer this question in the affirmative. We present a new, probabilistic quorum system that achieves polylogarithmic communication complexity at the cost of a non-zero---but polynomially small---probability of accessing non-intersecting quorums. The main idea of our construction is to use probabilistic sampling to eliminate the need for accessing too many individual nodes in the system. It works roughly as follows: For each quorum access, a client first chooses a sample of node identifiers of size O(log n), uniformly at random, and then initiates gossip to disseminate this sample---along with the payload for the access---to the nodes in the system. Upon receiving a request, each individual sensor first updates its state if necessary, and then checks whether its identifier is included in the sample. If so, it initiates a gossip to return a response to the client. The client continues to gossip its request until the responses from a subset of the original sample have been received. Once this happens, the quorum access is regarded as completed.

Note that in our protocol, the quorum access can terminate even if the client has not received responses from every node in the sample. Thus, a quorum is effectively any subset---of a sufficient size---of a random subset of size O(log n). The actual size of this subset is determined by the desired failure threshold, which can be polynomially close to n/2.

It might seem possible to tolerate more failures by repeating the sampling step, if too many of the nodes in the original sample appear to have failed. There are two problems with this approach though: First, since reliable failure detection cannot be guaranteed, the client might end up selecting too many nodes, potentially resulting in too many messages being sent. Second, when intermittent connectivity causes the network to become partitioned for sufficiently long, the samples being drawn by the client become increasingly biased towards the connected nodes (due to the reduced sample space) eventually leading to violation of the probabilistic intersection guarantee. Our protocol avoids these problems by employing quorums which are subsets of the initial sample so that failure resilience is achieved without repeating the sampling step.

Of course, if too many nodes in the sample are indeed inaccessible, it might take an arbitrarily long time---at least until the connectivity is restored---for the quorum access to complete. The continuous gossip ensures that this will happen with high probability once the network connectivity is restored. An obvious problem with continuous gossip is that it can result in too many messages being transmitted when the network is unstable. In practice however, the performance can easily be tuned up (e.g., by adjusting timeouts between consecutive gossip rounds) to ensure quick termination under normal network conditions.

To the best of our knowledge, none of the existing probabilistic quorum systems are simultaneously communication-efficient, resilient to network partitions and guarantee quorum intersection with high probability.

Research Support

This work is supported in part by NSF grant CCR-0098305 and NSF ITR Grant 0121277, AFOSR Contract #F49620-00-1-0097, DARPA Contract #F33615-01-C-1896, NSF Grant 64961-CS, and NTT Grant MIT9904-12.

References:

[1] Gregory Chockler, Seth Gilbert, and Boaz Patt-Shamir. Communication-Efficient Probabilistic Quorum Systems for Sensor Networks. In Proceedings of the International Workshop on Foundations and Algorithms for Wireless Networking (FAWN), March, 2006 .

vertical line
vertical line
 
horizontal line

MIT logo Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu