Research Abstracts - 2006

Cache-Oblivious Data Structures

Erik D. Demaine

Why

The memory hierarchies of modern computers are becoming increasingly "steep," ranging from fast registers and on-chip cache down to relatively slow disks and networks. Because the speed of processors is increasing more quickly than the speed of memory and disk, the disparity between the various levels of the memory hierarchy is growing. Today the access times of L1 cache and disk differ by approximately 6 orders of magnitude. Because of these large variations in magnitude, it is becoming increasingly dangerous to design algorithms assuming a "flat" memory with uniform access times.

The distinguishing feature of multilevel memory hierarchies is that memory transfers are done in blocks in order to amortize the cost of a transfer. This amortization only works when each transfer contains many pieces of data to be used by the CPU. Thus, an important objective is to maintain locality of reference, meaning that memory accesses are clustered in time and space.
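The effect of block transfers can be made concrete with a toy model: a cache holding a single block of size B, where every access outside the cached block costs one transfer. (The function and parameters below are illustrative sketches, not from the abstract.)

```python
import random

def block_transfers(accesses, B):
    """Count memory transfers for an access sequence under a toy model:
    a cache holding one block of size B. Each access to an address
    outside the currently cached block costs one transfer."""
    transfers = 0
    cached_block = None
    for addr in accesses:
        block = addr // B
        if block != cached_block:
            transfers += 1
            cached_block = block
    return transfers

N, B = 1024, 64
sequential = list(range(N))             # a scan with good locality
random.seed(0)
scattered = random.sample(range(N), N)  # same addresses, shuffled
print(block_transfers(sequential, B))   # 16 transfers: N/B, the amortized ideal
print(block_transfers(scattered, B))    # close to N: locality lost, no amortization
```

The sequential scan touches each block's B elements consecutively and pays N/B transfers; the shuffled sequence touches the same addresses but pays roughly one transfer per access, which is why locality of reference is the central objective.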

Data locality is easier to maintain in algorithmic problems having regular and/or static data, because the flow of data is predictable. Examples of such problems include matrix multiplication, fast Fourier transform, and LU decomposition. Even in sorting, there is a fixed and predetermined order in which the data must end up. We focus on the more challenging problem of maintaining data locality in irregular and dynamic problems where by definition the data flow is continually changing and unpredictable, making it difficult to organize data locality a priori. Irregular and dynamic problems often require data-structure solutions because data structures specialize in moving data efficiently throughout memory.

What

A recent direction in the design of algorithms and data structures that are efficient on a memory hierarchy is the notion of cache-obliviousness, introduced by Prof. Leiserson's group in CSAIL in 1999 [FLPR99]. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. More precisely, such algorithms assume that there is a cache of size M and with block size B, but are forbidden from actually exploiting the parameters B and M (except in the analysis). Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures which require knowledge of the cache/memory size and block transfer size.
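A classic example from [FLPR99] is matrix transposition by divide and conquer. The sketch below (illustrative code, not taken from that paper) recursively halves the longer dimension; the recursion eventually produces subproblems fitting in a cache of any size M, so the algorithm is efficient for every B and M without ever naming them:

```python
def transpose(A, out, r0, r1, c0, c1):
    """Cache-obliviously transpose the submatrix A[r0:r1][c0:c1] into out,
    so that out[j][i] == A[i][j]. Recursively splitting the longer
    dimension yields subproblems that fit in cache at every level of the
    hierarchy, with no knowledge of the cache size M or block size B."""
    if r1 - r0 <= 1 and c1 - c0 <= 1:
        for i in range(r0, r1):
            for j in range(c0, c1):
                out[j][i] = A[i][j]
    elif r1 - r0 >= c1 - c0:
        mid = (r0 + r1) // 2       # split the rows
        transpose(A, out, r0, mid, c0, c1)
        transpose(A, out, mid, r1, c0, c1)
    else:
        mid = (c0 + c1) // 2       # split the columns
        transpose(A, out, r0, r1, c0, mid)
        transpose(A, out, r0, r1, mid, c1)

A = [[1, 2, 3], [4, 5, 6]]
out = [[0] * 2 for _ in range(3)]
transpose(A, out, 0, 2, 0, 3)
print(out)  # [[1, 4], [2, 5], [3, 6]]
```

The same recursive structure, rather than any tuned blocking parameter, is what makes the algorithm simultaneously efficient at every level of the hierarchy.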

Progress

In collaboration with Stephen Alstrup (DIKU), Lars Arge (Duke), Michael Bender (Stony Brook), Richard Cole (NYU), Bryan Holland-Minkley (Duke), Martin Farach-Colton (Rutgers), Ian Munro (Waterloo), Theis Rauhe (DIKU), and Mikkel Thorup (AT&T), we have developed cache-oblivious solutions to several data structural problems:

  • Priority queues [ABD+06]. We develop a data structure to maintain a dynamic set of ordered objects subject to insertion, deletion, and delete-min in O((1/B) log_{M/B}(N/B)) amortized memory transfers per operation. This bound is optimal, even without the cache-oblivious restriction. Priority queues are a critical component in many of the best known external-memory graph algorithms, and using our cache-oblivious priority queue we develop several cache-oblivious graph algorithms.
  • Linked lists [BCD+02, BDF06]. We make a systematic study of the effects of the memory model on optimal solutions to the problem of maintaining a dynamic ordered set subject to insertions, deletions, and traversals of k consecutive elements. We consider three main memory models: a two-level (known) memory hierarchy; the cache-oblivious model, which applies to unknown and multi-level memory hierarchies; and the sequential-access model, where sequential block transfers are less expensive than random block transfers. We also consider the effects of whether the elements must be kept sorted, whether the updates are clustered, and whether the bounds can be amortized.
  • Tree layout [ABD+03]. We consider the problem of laying out a tree with fixed parent/child structure in hierarchical memory. The goal is to minimize the expected number of block transfers performed during a search along a root-to-leaf path, subject to a given probability distribution on the leaves. This problem was previously considered by Gil and Itai (1999), who developed optimal algorithms when the block-transfer size B is known. We show how to extend any approximately optimal algorithm to the cache-oblivious setting in which the block-transfer size is unknown to the algorithm. The query performance of the cache-oblivious layout is within a constant factor of the query performance of the optimal known-block-size layout. Computing the cache-oblivious layout requires only logarithmically many calls to the layout algorithm with known block size. Finally, we analyze two greedy strategies, and show that they have a performance ratio between Ω(log B / log log B) and O(log B) when compared to the optimal layout.
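For intuition about cache-oblivious tree layouts, the sketch below computes the van Emde Boas layout of a complete binary tree with nodes numbered 1..2^h − 1 in heap order. This classic layout handles the uniform case (a search touches O(log_B N) blocks for every block size B); the function names here are illustrative, and the papers above treat arbitrary leaf distributions:

```python
def veb_order(root, h):
    """van Emde Boas order of a complete binary tree of height h rooted at
    `root`, where the children of node v are 2v and 2v+1. Split the tree at
    half height: lay out the top subtree first, then each bottom subtree,
    recursively. No block size appears anywhere in the layout."""
    if h == 1:
        return [root]
    top_h = h // 2           # height of the top subtree
    bot_h = h - top_h        # height of each bottom subtree
    order = veb_order(root, top_h)
    frontier = [root]        # descend top_h levels to find the roots
    for _ in range(top_h):   # of the bottom subtrees
        frontier = [c for v in frontier for c in (2 * v, 2 * v + 1)]
    for r in frontier:
        order += veb_order(r, bot_h)
    return order

print(veb_order(1, 3))  # [1, 2, 4, 5, 3, 6, 7]
```

Because every subtree of height about log B ends up contiguous at some level of the recursion, a root-to-leaf search crosses only O(log N / log B) block boundaries, whatever B turns out to be.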

Future

Many data structural problems remain to be explored in the cache-oblivious setting, and this remains an active area of research.

Research Support

This research was supported by NSF ITR grant EIA-0112849.

References
[ABD+03]
Stephen Alstrup, Michael A. Bender, Erik D. Demaine, Martin Farach-Colton, J. Ian Munro, Theis Rauhe, and Mikkel Thorup. Efficient tree layout in a multilevel memory hierarchy. arXiv:cs.DS/0211010, November 2003.
[ABD+06]
Lars Arge, Michael A. Bender, Erik D. Demaine, Bryan E. Holland-Minkley, and J. Ian Munro. An optimal cache-oblivious priority queue and its application to graph algorithms. SIAM Journal on Computing, to appear.
[BCD+02]
Michael A. Bender, Richard Cole, Erik D. Demaine, and Martin Farach-Colton. Scanning and traversing: Maintaining data for traversals in a memory hierarchy. In Proceedings of the 10th Annual European Symposium on Algorithms, LNCS 2461, pp. 139–151, Rome, Italy, September 2002.
[BDF06]
Michael A. Bender, Erik D. Demaine, and Martin Farach-Colton. Cache-oblivious B-trees. SIAM Journal on Computing, to appear.
[FLPR99]
Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, pp. 285–297, New York, October 1999.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu