Abstracts - 2006
Proving Atomicity: An Assertional Approach
Gregory Chockler, Nancy Lynch, Sayan Mitra & Joshua Tauber
Many distributed and network-based services can be modeled as shared objects accessible to (possibly remote) clients through well-defined interfaces. Atomicity [6,8] (also known as linearizability) is a desirable property for such objects as it allows clients using the objects to perceive the operations that occur in each run as occurring atomically, in some sequential order. This perception makes it easier to understand the behavior of a system using distributed services, and so, simplifies the task of system design.
Atomic services could be implemented simply on single server machines. However, to achieve high availability in a distributed system and to tolerate failures, atomic services are typically implemented by distributed algorithms. Many distributed algorithms have been proposed for implementing atomic objects.
Although atomic object implementations are abundant, proving that algorithms achieve atomicity has turned out to be a challenging problem. Most existing proofs for such algorithms are long, subtle, and difficult to understand and check. As evidence of the difficulty, we note that several published proofs for implementations of atomic shared read/write memory objects have later been shown to be incorrect.
We believe that a fundamental reason for the difficulty of these proofs is their style: they are based on detailed, not-very-systematic, reasoning about events and their ordering. Useful structure in such proofs is often provided by lemmas about partial orders of operations on objects, for example, Proposition 3 of  (for single-writer read/write objects) and Lemma 13.16 of  (for multi-writer read/write objects). These lemmas provide sufficient conditions for correctness of atomic read/write object implementations, based on a list of properties that a partial ordering of operations must satisfy. However, showing that these properties hold still requires detailed, ad hoc reasoning about events.
In this work, we initiate the study of systematic ways of verifying distributed implementations of atomic objects, beginning with read/write objects (registers). Our general approach is to replace operational reasoning about events and partial orders with assertional reasoning about invariants and simulation relations. The assertional methods differ from the traditional operational arguments in two important ways. First, the system properties are stated precisely in terms of predicates over the system state components. Second, assertional proofs can be checked by examining individual state transitions of the algorithm without reasoning about entire executions. As such they lend themselves to mechanization, i.e., the process of checking a proof can be carried out using automated tools, such as theorem provers.
Our approach to carrying out assertional atomicity proofs is first to define an abstract state machine that captures the atomicity property and then, prove correctness of the object implementations by establishing a simulation mapping between the implementation and the specification automata. The challenge is to find a specification automaton that is general enough to apply to many existing implementations and in the same time sufficiently close to the actual implementations so that the task of finding the mapping becomes easier.
To this end, we define an abstract state machine, which we call the Partial-Order Machine (PO-Machine), which records information about operations and their orders in its state. The PO-Machine expresses the common behavior of many existing atomic register implementations, in which client operation requests are gradually ordered relative to other operation requests until all the necessary ordering constraints are achieved. The ordering constructed is, in the limit, guaranteed to be a partial order of the requested operations that satisfies the conditions listed in Lemma 13.16 of . In fact, the order constructed directly by the PO-Machine satifies slightly weaker constraints, but we prove that these are still sufficient to guarantee atomicity.
We used the PO-Machine as a formal specification for distributed algorithms that implement atomic memory. We demonstrated the generality of this specification by showing that it is implemented by three different read/write register implementations: the message-passing register emulation of  (extended to handle multiple writers), an optimized version of  that takes advantage of synchronized clocks at writers to eliminate the first phase of the write implementation, and the unbounded version of the shared memory construction of a multi-writer/multi-reader register from single-writer/single-reader registers of . We also show that a slight modification of the PO-Machine, called the TO-Machine, can be used to prove atomicity of a general (i.e., not necessarily read/write) object implementation based on the replicated state machine protocol of Lamport .
We specified the PO-Machine and the algorithms formally using the I/O Automata (IOA)  and Timed IOA [3,4] models, in fact, using formal specification languages that have been defined for these models. The IOA/TIOA specification languages leads to very stylized assertional proofs for invariants and simulation relations that can be partially automated using theorem provers. Moreover, the same IOA specifications can be used by the IOA compiler to produce executable Java code. We used the TIOA to PVS translator and TAME library to generate descriptions of the PO-Machine and the ABD algorithm in the language of the Prototype Verification System (PVS). We used PVS to substantially increase the level of detail and assurance of some of our previous hand proofs. In fact, we discovered several gaps and bugs in our hand proofs. Automatic translation enables us to easily tweak the simulation relations and rerun the proof scripts. We demonstrated the usefulness of the state machine style of specification by compiling and executing our IOA automaton for ABD. A single formal representation of the algorithm can be used for specification, verification, and execution.
The results were published in .
Our work with four algorithms so far suggests to us that our PO-Machine (or small variants) may be general enough to capture many of the existing atomic register algorithms. We plan to use these methods to study a wider variety of algorithms, such as bounded-timestamp-based constructions, whose proofs have been notoriously difficult and bug-prone. An interesting challenge will be to extend our framework to capture implementations that are not explicitly based on timestamps, for example, the construction that creates atomic bits from safe bits. Another interesting direction deals with adapting the PO-Machine to capture weaker register semantics, such as safe registers, regular registers, and sequentially consistent registers. There is an increased recent interest in these semantics as they capture the guarantees provided by many Byzantine-resilient storage systems based on Byzantine quorums.
This work is supported by MURI--AFOSR SA2796PO 1-0000243658, USAF--AFRL #FA9550-04-1-0121, NSF Grant CCR-0121277, NSF-Texas Engineering Experiment Station Grant 64961-CS, and DARPA F33615-01-C-1896.
 Hagit Attiya, Amotz Bar-Noy, and Danny Dolev. Sharing memory robustly in message-passing systems. Journal of the ACM, 42(1):124-142, January 1995.
 Gregory Chockler, Nancy Lynch, Sayan Mitra, and Joshua Tauber. Proving Atomicity: An Assertional Approach. In Proceedings of the 19th International Symposium on DIStributed Computing (DISC'2005), September 26-28, 2005, Krakow, Poland. Published by Springer-Verlag GmbH in Lecture Notes in Computer Science, Volume 3724, Oct 2005, Pages 152 - 168
 D. Kaynar, N. Lynch, R. Segala, and F. Vaandrager. The theory of timed I/O automata. Technical Report MIT/LCS/TR-917a, MIT Laboratory for Computer Science, 2004. Available at http://theory.lcs.mit.edu/tds/reflist.html.
 Dilsun K. Kaynar, Nancy Lynch, Roberto Segala, and Frits Vaandrager. Timed I/O automata: A mathematical framework for modeling and analyzing real-time system. In RTSS 2003: The 24th IEEE International Real-Time Systems Symposium, Cancun,Mexico, December 2003.
 Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July 1978.
 Leslie Lamport. On interprocess communication: Part I and II. Dist. Comput., 1:77-101, 1986.
 N. A. Lynch and M.R. Tuttle. An introduction to Input/Output Automata. CWI Quarterly, 2(3):219-246, 1989.
 Nancy Lynch. Distributed Algorithms. Morgan Kaufmann Publishers, Inc., San Mateo, CA, March 1996.
 P.M.B. Vitanyi and B. Awerbuch. Atomic shared register access by asynchronous hardware. In 27th IEEE Annual Symposium on Foundations of Computer Science, pages 233--243, 1986.