
Research
Abstracts  2006 
Query/Update Tradeoffs for an External Memory Search Tree Data StructureMichael A. Bender (SUNY Stony Brook), Jeremy T. Fineman, Bradley C. Kuszmaul & Jelani NelsonIntroductionIn the dynamic search problem we would like to maintain a set of (key, value) pairs under insertion and deletion, where the keys are members of some ordered set. We would then like to support the following 3 queries: (1) search(k) should return all values associated with key k, (2) predecessor(k) should return the value of an item with the highest key less than k, and (3) rangesearch(j, k) should return all items with keys in the range [j, k]. We look at the dynamic search problem in the case where N, the number of data items, is so large that all items cannot simultaneously fit into main memory. We thus use the external memory model of [1] to analyze our data structures, in which we attempt to minimize the number of block transfers from disk to memory as opposed to the traditional model of analyzing instructions executed. Many external memory data structures already exist to solve the dynamic search problem, such as the Btree which has provably optimal query I/O complexity. However the insertion performance of the Btree is not provably optimal, and one of the things we have shown is how to beat the Btree's asymptotic insertion performance while maintaining the same asymptotic I/O complexity bounds for all other operations. This discovery was also made independently in [2]. We also currently have some partial results toward achieving our results with a cacheoblivious data structure. The External Memory ModelIn the external memory model, we have a disk of unbounded size divided into blocks of size B. Memory is of size M and is divided into M/B cache lines each also of size B. When we wish to read a value, this read is free if the value resides in memory. Else, we pay one block transfer to bring the value along with all the contents of its block on disk into one cache line in memory. Should no free cache lines be available, we must evict some cache line to disk using some cache replacement strategy. A recent trend in external memory algorithms is to develop cacheoblivious algorithms, or algorithms that perform without using any knowledge of the parameters M and B [3]. Such algorithms have the benefits of having proven I/O complexity bounds hold simultaneously across all adjacent levels of the memory hierarchy, as well as being portable across machines with different block and memory sizes. The Buffered Repository B^{ε}treeThis data structure is a modification of the buffered repository tree introduced in [4] and was independently discovered in [2]. Ultimately, the data structure achieves O((log_{B}N)/(ε B^{1ε})) block transfers for insert, O((log_{B}N)/ε + S/B) block transfers for rangesearch, and O((log_{B}N)/ε) block transfers for all other operations. Here S is the number of items output, and ε is a parameter that can be tuned and must be in the range [log_{B}2, 1]. For any fixed constant ε, this data structure outperforms Btree insertions by a factor polynomial in B while only suffering by a constant factor for all other operations. We also have written a memory mapped implementation of the buffered repository B^{ε}tree. Preliminary results indicate that this data structure does in fact perform well in practice. We set up a database with 2^{17} 12byte items on a machine with 4096KB blocks and 128M of memory (that is, the database used over fifteen times the space of available memory). For the setting ε = ½, insertions performed roughly 23x faster than a traditional Btree while searches were only roughly 3x slower. The CacheOblivious Lookahead ArrayFor the case where ε = log_{B}2, we have developed a data structure, which we call the cacheoblivious lookahead array, that matches the bounds of the buffered repository B^{ε}tree. These bounds are identical to those achieved by the cacheaware buffered repository tree from [4]. Specifically, the I/O complexity bounds are O(log(N/B)/B) for insert, O(log(N/B) + S/B) for rangesearch, and O(log(N/B)) for all other operations. To our knowledge, this is the first cacheoblivious data structure to achieve the bounds of the buffered repository tree. Conclusion and Future WorkOur future work includes achieving the full query/update tradeoff of the buffered repository B^{ε}tree cacheobliviously, or proving that no such result is possible. Also, we have yet to implement the cacheoblivious lookahead array to experiment and discover its performance in practice as compared with its cacheaware counterpart. Research SupportThis research was supported in part by the SingaporeMIT Alliance and by NSF Grants ACI0324974, EIA0112849, CNS0305606 and CCR0208670. References[1] Alok Aggarwal and Jeffrey Scott Vitter. The Input/Output Complexity of Sorting and Related Problems. Communications of the ACM 31(9):11161127. 1988. [2] Gerth Stølting Brodal and Rolf Fagerberg. Lower bounds for external memory dictionaries. In Proceedings of the 14th Annual ACMSIAM Symposium on Discrete Algorithms (SODA 2003), pp. 546554, Baltimore, Maryland, January 2003. [3] Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. CacheOblivious Algorithms. In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science (FOCS 1999), pp. 285298, New York City, New York, October 1999. [4] Adam L. Buchsbaum, Michael Goldwasser, Suresh Venkatasubramanian, and Jeffery Westbrook. On external memory graph traversal. In Proceedings of the 11th Annual ACMSIAM Symposium on Discrete Algorithms (SODA 2000), pp. 859860, San Francisco, California, January 2000. 

