![]()
|
Research
Abstracts - 2006 |
|
Cache-Oblivious String B-treesMichael A. Bender (SUNY Stony Brook), Martin Farach-Colton (Rutgers) & Bradley C. KuszmaulSummaryTraditional B-trees are designed for uniform key size, and most B-trees implementations employ heuristics to deal with heterogeneous key size. The String B-tree of Ferragina and Grossi [2] is designed to handle variable length keys in the standard cache-aware model where the block transfer size in the memory hierarchy is known. We developed static and dynamic Cache-Oblivious String B-trees (COSB-trees). Our data structures also employ a provably good version of front compression, a heuristic version of which is commonly used in practical B-trees. The static COSB-tree represents a set D of N strings, and supports member, predecessor, successor, prefix, and range queries. Determining whether k is in the data structure requires O(1+logB N+s(k)/B) block transfers, where s(k) is the length of key k. Finding the predecessor or successor of k requires O(1+logB N+s(k)/B+s(k')/B) transfers, where k' is the predecessor (resp. successor) of k. The prefix operation on k requires O(1+logB N+(s(k)+c(Q))/B) transfers, where c(Q) is the size, using front compression, of the result set Q. Performing a range query from k to k' requires O(1+logB N+ (s(k)+s(k')+c(Q))/B) transfers. The compressed strings can be decoded for an additional c(Q)/B transfers. The dynamic COSB-tree also supports insertions and deletions. The member, predecessor, successor, prefix, and range prefix operations match the performance of the static tree. To insert or delete k requires O(1+logB N+log2N s(k)/B) memory transfers. The dynamic COSB-tree augmented with a scanning structure achieves the same bounds, except that range queries are amortized, and insertion and deletion are accelerated to O(1+logB N+log2+εlog N s(k)/B) amortized memory transfers, for any ε>0. If range queries need to return only pointers to keys (rather than the bits of the keys), then the updates become O(1+logB N+s(k)/B) amortized memory transfers. All bounds are with high probability. This paper appears in [1]. Research SupportThis research was supported in part by the Singapore-MIT Alliance and by NSF Grants ACI-0324974, EIA-0112849, CNS-0305606 and CCR-0208670. References:[1] Michael A. Bender, Martin Farach-Colton, and Bradley C. Kuszmaul. Cache-Oblivious String B-trees. To appear in the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS06), Chicago, Illinois, Country, June 2006. [2] P. Ferragina and R. Grossi. The String B-Tree: A new data structure for string search in external memory and its applications. Journal of the ACM, 46(2):236--280, Mar 1999. |
![]() ![]() |
||
|