Abstracts - 2006
Jim Sukha & Bradley C. Kuszmaul
Most traditional algorithms and data structures are designed for tasks whose working set fits entirely into main memory. In these cases, there are a myriad of textbook algorithms and data structures that can easily be optimized for specific applications. For an application whose data set resides partially on disk, however, the set of usable algorithms and data structures is often much more limited. Programs that access data on disk often explicitly move data in and out of main memory. Because these explicit I/O operations substantially increase code complexity, these programs tend to use only canonical data structures such as hash tables, queues and B-trees.
Programming disk-based data structures that can be accessed concurrently is even more difficult. In addition to managing the movement of data to and from memory, programs must also use a mechanism for concurrency control to avoid data races between concurrent operations. Programmers typically use locks to guarantee that critical sections of code execute atomically. Designing an correct locking protocol for a complex data structure that is free of problems such as deadlock or starvation, while still maintaining good performance, is a challenging task.
We have implemented LibXac , a C library that supports non-durable and durable atomic transactions on memory-mapped files. LibXac is designed to provide a simple programming interface for writing code that concurrently accesses data on disk. This interface simplifies the programming of complex, concurrent, disk-based data structures.
LibXac provides a simple interface for accessing data on disk based
on memory mapping. Memory mapping provides the programmer with the illusion
of a single-level of storage, allowing the programmer to access data in
memory and data on disk in the same way. LibXac also supports atomic transactions
by replacing the normal
The code in Figure 1 illustrates a simple program that sums the first integer on 5 random pages in a file and stores the result on a 6th random page. The code on the left illustrates an incorrect version that uses locks and normal memory-mapping (if two of these programs run concurrently, deadlock is possible). The code on the right illustrates a similar, correct program written using LibXac.
LibXac's implementation is based on memory mapping. Instead of explicitly
managing a buffer pool, LibXac relies on the virtual memory manager of
the operating system to cache pages in memory. The runtime automatically
detects the pages a transaction accesses by manipulating the virtual memory
protections on the page and handling segmentation faults. Our implementation
relies only on the memory-mapping (
We have successfully completed a prototype implementation of LibXac. To demonstrate the ease of programming with LibXac, we have modified serial implementations of a B-tree and a cache-oblivious B-tree to support durable concurrent insertions. In a performance study, LibXac on these search trees provides performance comparable to the Berkeley DB , a high-quality transaction system written using explicit I/O. These promising results suggest that, despite the overheads required for maintaining transaction atomicity, it is possible to efficiently support a simple interface for programming disk-based data structures. We have improved the LibXac prototype to implement group commit. For durable transactions, this feature allows multiple concurrent transactions to log their changes to disk using the same synchronous disk write.
Although our prototype logs enough information to disk to in principle to allow users to recover from program crashes, we have yet to implement the recovery program. We are investigating the possibility of modifying LibXac to provide a shared memory abstraction on a distributed system. In a distributed setting, we hope to find a larger class of test applications for which a system such as LibXac is useful.
This research is supported in part by the Singapore-MIT Alliance and by NSF Grants ACI-0324974 and CNS-0305606.
 Sleepycat Software. The Berkeley Database. http://www.sleepycat.com, 2005.
 Jim Sukha. Memory-Mapped Transactions. Master's Thesis. MIT EECS, May 2005.