Abstracts - 2007
Hybrid Quorum Byzantine Fault Tolerance
James A. Cowling, Daniel S. Myers & Barbara H. Liskov
The increasing importance of highly available online services mandates protocols to ensure correctness and permanence of service data. Availability and correctness in the face of node failures and malicious attacks, so called Byzantine faults, introduces additional challenges to reliability. These challenges are addressed by Byzantine fault-tolerant replicated state machines, notably the Castro and Liskov BFT protocol .
Two approaches have traditionally been taken to providing Byzantine fault tolerant state machine replication: replica-based agreement approaches, such as BFT, and quorum based approaches such as the Query/Update (Q/U) protocol . In an agreement protocol, a primary replica nominates an ordering for each client operation, with replicas communicating in three-phases of all-to-all communication to agree on the ordering. In the newer Byzantine quorum approaches, this work is pushed to the clients to optimistically order operations, proceeding with one or two phases of communication while avoiding inter-replica communication
Both agreement-based and quorum-based schemes have shortcomings. The quadratic message cost of BFT may saturate a network, with the three phases of communication introducing potentially excessive delay. Q/U was designed to address claimed scalability issues in BFT, and executes writes in one-to-two phases, with low communication overhead. On the other hand it requires a large number of replicas - 5f+1 to tolerate f faults, rather than 3f+1 in BFT. Q/U also suffers significant performance degradation when writes contend (concurrent writes on the same object), due to the inefficient use of randomized backoff.
In this work we leverage the advantages of both agreement and quorum schemes, in a hybrid protocol for Byzantine fault tolerance, HQ Replication. Under normal operation we run a lightweight two-phase quorum scheme, while employing the BFT protocol to resolve instances of write contention. Through this combination we offer the following properties:
Our quorum protocol allows clients to commit write operations in two phases. In the first phase a client asks for a grant from each replica, authorizing it to perform the operation at the given sequence number. If the client receives a quorum (2f+1) matching grants it can use the combination as a certificate to prove this agreement to the replicas in the second phase, committing its operation after a quorum of replies. To ensure liveness we provide writeback operations to enable clients to advance the state of slow replicas, or commit an operation on behalf of another client.
If different replicas have granted the same sequence number to different clients, there has been write contention, and the granted sequence numbers are in an inconsistent state. We then employ the services of BFT to resolve this contention via communication between replicas, committing all contending operations. Further details on the protocol are available in our paper  and technical report .
Our implementation of HQ Replication was found to exhibit good performance. Figure 1 shows an analytical comparison of the HQ and Q/U protocols, along with three different flavors of BFT - the original BFT protocol, along with those we modified to use MACs instead of authenticators (BFT-MACs), and incorporate the MAC optimizations along with the use of preferred quorums  (BFT-opt). Not only does HQ exhibit low communication load, but our modified versions of BFT significantly reduce communication cost over the original BFT implementaion.
Figure 2 illustrates the performance of HQ vs BFT (with BFT batch size of 1) under a real system deployment, in the absence of contention. HQ is found to outperform BFT in throughput, however significant gains were realized in BFT due to our optimizations. Figure 3 shows how HQ reacts to increasing levels of write contention, up to 100% of operations contending at contention factor 1.0. Performance degrades quite gradually in all cases, especially in comparison to existing quorum schemes.
An interesting side-note of this work is that with a few optimizations, BFT was found to perform much better than previously believed. This is further emphasized when request batching is employed, where the system buffers a number client requests and runs the protocol once per batch. This amortizes the cost of agreement over a number of client operations, reducing the cost of the agreement protocol. Batching is only possible in agreement-based protocols where there is a primary replica to buffer requests, and not in quorum protocols where replicas order operations independently. We hence recommend the HQ protocol for use where latency is critical or contention is relatively low, and BFT when contention is high or maximum system throughput is a primary concern.
Our base protocol for Hybrid Quorum Byzantine Fault Tolerance is fully developed as the HQ Replication protocol  . We have implemented experimental versions of both HQ Replication and the BFT protocol on a new codebase for the purposes of performance comparison. We aim to release a final implementation of both these protocols in the near future, to facilitate use in building distributed systems, and provide consistent performance benchmarks for the research community.
We are also commencing the design of a large-scale distributed transactional storage system, that will employ Byzantine Fault Tolerance across the replicated servers at its core. This project will incorporate the new fault tolerance codebase.
This research is supported by NSF ITR grant CNS-0428107 and by T-Party, a joint program between MIT and Quanta Computer Inc., Taiwan.
 J. Cowling, D. Myers, B. Liskov, R. Rodrigues and L. Shrira. HQ Replication: A Hybrid Quorum Protocol for Byzantine Fault Tolerance. In Proceedings of the Seventh Symposium on Operating Systems Design and Implementation (OSDI '06), pp. 177-190, Seattle, WA, USA. November 2006.
 J. Cowling, D. Myers, B. Liskov, R. Rodrigues and L. Shrira. HQ Replication: Properties and Optimizations. MIT Technical Report MIT-CSAIL-TR-2007-009. Cambridge, MA, USA. February 2007.
 M. Castro and B. Liskov. Practical Byzantine Fault Tolerance. In The Proceedings of the 3rd Conference on Operating Systems Design and Implementation (OSDI '99), pp. 173--186, New Orleans, Louisiana, February 1999.
 M. Abd-El-Malek, G. R. Ganger, G. R. Goodson, M. K. Reiter and J. J. Wylie. Fault-Scalable Byzantine Fault-Tolerant Services. In The Proceedings of the 20th ACM Symposium on Operating System Principles (SOSP '05), pp. 59--74, Brighton, United Kingdom, October 2005.