Research Abstracts - 2006
# Concurrent Cache-Oblivious B-Trees

Michael A. Bender, Jeremy T. Fineman, Seth Gilbert & Bradley C. Kuszmaul

## Introduction

For over three decades, the B-tree [2] has been the data structure of choice for maintaining searchable, ordered data on disk. Traditional B-trees are effective in large part because they minimize the number of disk blocks accessed during a search. Specifically, for block size B, a B-tree storing N keys performs a search in O(log_B N) block transfers.

In contrast, cache-oblivious (CO) B-trees attain near-optimal memory performance at all levels of the memory hierarchy and for all block sizes (e.g., [3,4,5,10]). A CO B-tree performs a search operation with O(log_B N) memory transfers, even though it does not know the block size B.
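To make the block-transfer bound concrete, here is a toy Python calculation (the values of n and B are illustrative, not from the abstract) comparing how many size-B blocks a plain binary search over a flat sorted array touches against the height of a B-ary tree, which models a B-tree search:

```python
import math

def binary_search_blocks(n, B):
    """Count distinct size-B blocks probed by binary search on a flat
    sorted array of n keys; only probe positions matter, so we pretend
    the target is the last key and always recurse right."""
    lo, hi, touched = 0, n - 1, set()
    while lo <= hi:
        mid = (lo + hi) // 2
        touched.add(mid // B)  # block containing this probe
        lo = mid + 1
    return len(touched)

def btree_blocks(n, B):
    """A B-ary search tree on n keys has height ceil(log_B n): one
    block transfer per level."""
    return max(1, math.ceil(math.log(n, B)))

n, B = 10**6, 1024
print(binary_search_blocks(n, B), btree_blocks(n, B))
```

With a million keys and 1024-key blocks, the binary search scatters its early probes across many blocks, while the B-ary tree needs only a couple of levels; this is the locality that B-trees (and, without knowing B, CO B-trees) exploit.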
One shortcoming of previously described CO B-trees is that they do not support concurrent access by different processes, whereas typical applications, such as databases and file systems, need to access and modify their data structures concurrently. This work describes three concurrent CO B-tree data structures, proves them correct, and presents a performance analysis for a few interesting special cases. The rest of this abstract reviews CO B-trees and explains the concurrency problem.

## Cache-oblivious model

External-memory data structures, such as B-trees, are traditionally analyzed in the disk-access model (DAM) [1], in which internal memory has size M and data is transferred to and from an arbitrarily large external memory in blocks of size B.

The cache-oblivious model [11,16] is like the DAM model in that the objective is to minimize the number of data transfers between two levels. Unlike the DAM model, however, the parameters B and M are unknown to the algorithm or data structure.

## Cache-oblivious B-trees

A CO B-tree [4] achieves nearly optimal locality of reference at every level of the memory hierarchy. Although the first CO B-tree is complicated [4], subsequent designs [6,10,13,17] rival B-trees both in simplicity and performance [6,9,10,13,14,17]. Indeed, preliminary experiments have shown, surprisingly, that CO B-trees can outperform traditional B-trees, sometimes by factors of more than 2 [8,13].

There are two main approaches to implementing serial CO B-trees: one is based on packed-memory arrays, the other on exponential search trees. Both approaches employ a static CO search tree [16] as a building block. A static CO search tree stores a fixed set of N keys and executes searches in O(log_B N) memory transfers.
We leave descriptions of the dynamic CO B-trees to the full paper.

## The concurrency problem

Concurrency introduces a number of challenges. First, a naïve approach would lock segments of the data structure during updates. For example, in a traditional B-tree, each block is locked before being updated. Unfortunately, in the CO model it is difficult to determine the correct granularity at which to acquire locks because the block size is unknown: locking too small a region may result in deadlock, while locking too large a region may result in poor concurrency.

Second, in most database and file-system applications, searches are more common than inserts or deletes, so it is desirable that searches be non-blocking, meaning that each search continues to make progress even if other operations are stalled. Our solutions in this paper do not require search operations to acquire locks.

Another problem arises in maintaining the "balance" of the data structure. Both main approaches to designing serial CO B-trees require careful weight balance of the trees to ensure efficient operations. With concurrent updates, that balance can be difficult to maintain. For example, the amortized analysis of a packed-memory array allows a process to rewrite the data structure completely once in a while. But if the rewrite acquires locks that prevent other processors from accessing the data structure, then the average concurrency drops: while a rewrite is in progress, only the rewriting process makes progress.

We respond to these challenges by developing new serial data structures that are more amenable to parallel accesses, and then we apply concurrency techniques to develop our concurrent CO B-trees.

## Our results

This work includes three concurrent CO B-tree data structures: (1) an exponential CO B-tree (lock-based), (2) a packed-memory CO B-tree (lock-based), and (3) a packed-memory CO B-tree (non-blocking). We show that each data structure is linearizable, meaning that each completed operation (and a subset of the incomplete operations) can be assigned a serialization point; each operation appears, from an external viewer's perspective, as if it occurs exactly at its serialization point (see, e.g., [12,15]). We also show that the lock-based data structures are deadlock-free, i.e., that some operation always eventually completes, and that the non-blocking B-tree is lock-free; that is, even if some processes fail, some operation always completes.

When only one process is executing, the trees gain all the performance
advantages of optimal cache-obliviousness: they execute operations with the same asymptotic performance as the non-concurrent versions and behave well on a multi-level cache hierarchy without any explicit coding for the cache parameters. These data structures also perform nearly as well as the traditional cache-aware B-tree, with an additive insert overhead of roughly O((log^2 N)/B) memory transfers.

## References

[1] Alok Aggarwal and Jeffrey S. Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116--1127, September 1988.

[2] Rudolf Bayer and Edward M. McCreight. Organization and maintenance of large ordered indexes. Acta Informatica, 1(3):173--189, February 1972.

[3] Michael A. Bender, Richard Cole, and Rajeev Raman. Exponential structures for efficient cache-oblivious algorithms. In ICALP 2002, pp. 195--207, 2002.

[4] Michael A. Bender, Erik Demaine, and Martin Farach-Colton. Cache-oblivious B-trees. In FOCS 2000, pp. 399--409, 2000.

[5] Michael A. Bender, Erik Demaine, and Martin Farach-Colton. Cache-oblivious B-trees. SIAM Journal on Computing, 2005.

[6] M. A. Bender, Z. Duan, J. Iacono, and J. Wu. A locality-preserving cache-oblivious dynamic dictionary. In SODA 2002, pp. 29--38, 2002.

[8] Michael A. Bender, Martin Farach-Colton, Bradley C. Kuszmaul, and Jim Sukha. Cache-oblivious B-trees for optimizing disk performance. Manuscript, 2005.

[9] Michael A. Bender, Gerth Stølting Brodal, Rolf Fagerberg, Dongdong Ge, Simai He, Haodong Hu, John Iacono, and Alejandro López-Ortiz. The cost of cache-oblivious searching. In FOCS 2003, pp. 271--282, Cambridge, Massachusetts, October 2003.

[10] Gerth Stølting Brodal, Rolf Fagerberg, and Riko Jacob. Cache oblivious search trees via binary trees of small height. In SODA 2002, pp. 39--48, San Francisco, California, January 2002.

[11] Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In FOCS 1999, pp. 285--297, New York, New York, October 1999.

[12] Maurice P. Herlihy and Jeannette M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3):463--492, 1990.

[13] Zardosht Kasheff. Cache-oblivious dynamic search trees. Master's thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, June 2004.

[14] Richard E. Ladner, Ray Fortna, and Bao-Hoang Nguyen. A comparison of cache aware and cache oblivious static search trees using program instrumentation. In Experimental Algorithms: From Algorithm Design to Robust and Efficient Software, volume 2547 of LNCS, pp. 78--92, 2002.

[15] Nancy Lynch. Distributed Algorithms. Morgan Kaufmann, 1996.

[16] Harald Prokop. Cache-oblivious algorithms. Master's thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, June 1999.

[17] Naila Rahman, Richard Cole, and Rajeev Raman. Optimised predecessor data structures for internal memory. In Proceedings of the 5th International Workshop on Algorithm Engineering, volume 2141 of LNCS, pp. 67--78, Aarhus, Denmark, August 2001.