MIT CSAIL Research Abstracts

CSAIL Publications and Digital Archive header

Technical Reports

Work Products

Research Abstracts

Historical Collections

horizontal line

Research Abstracts - 2006
horizontal line

horizontal line

Query Execution in C-Store - A Column Oriented DBMS

Daniel J. Abadi, David J. DeWitt, Stavros Harizopoulos, Samuel R. Madden, Daniel S. Myers & Michael R. Stonebraker

C-Store

Most major DBMS vendors implement record-oriented storage systems, where the attributes of a record (or tuple) are placed contiguously in storage. With this row store architecture, a single disk write suffices to push all of the fields of a single record out to disk. Hence, high performance writes are achieved, and a DBMS with a row store architecture is a write-optimized system. These are especially effective on OLTP-style applications.

In contrast, systems oriented toward ad-hoc querying of large amounts of data should be read-optimized. Data warehouses represent one class of read-optimized system, in which periodically a bulk load of new data is performed, followed by a relatively long period of ad-hoc queries. Other read-mostly applications include customer relationship management (CRM) systems, electronic library card catalogs, and other ad-hoc inquiry systems. In such environments, a column store architecture, in which the values for each single column (or attribute) are stored contiguously, can be more efficient.

With a column store architecture, a DBMS need only read the values of columns required for processing a given query, and can avoid bringing into memory irrelevant attributes. In warehouse environments where typical queries involve aggregates performed over large numbers of data items, a column store has a sizeable performance advantage.

C-Store, [1], is a large, multi-university project to build an open source column-oriented DBMS (C-Store Webpage). At CSAIL, we are building the query executer for this project and are looking at the research implications of building such a column-oriented query executer. Two key questions we try to answer using our column oriented query executer include:

How do you integrate data compression into a column oriented query executer?
How does the storage of columns vs. rows on disk affect query executer performance?

These two questions are described in detail below.

Compression

Compression in traditional database systems is known to improve performance significantly: it reduces the size of the data and improves I/O performance by reducing seek times (the data is stored nearer each other), transfer times (less data to transfer) and buffer hit rate (larger fraction of DBMS fits in buffer pool). For queries that are I/O limited, the CPU overhead of decompression is often compensated for by the I/O improvements.

Storing data in columns presents a number of opportunities for improved performance from compression algorithms when compared to row-oriented architectures. In a column-oriented database, compression schemes that encode the differences between the successive entries in the column and multiple values at once are natural. In a row-oriented database, such schemes do not typically work well because adjacent rows of multi-attribute tables are likely to vary significantly in at least some attributes. Still, despite the unnatural nature of attribute-level compression, there have been some successful proposed schemes (e.g. [2][3][4]); however, none of these schemes attempt to compress attributes from more than one tuple at once.

Instead, compression techniques for row-stores often employ dictionary schemes where a dictionary is used to code wide values in the attribute domain into smaller codes. For example, a simple dictionary for a string-typed column of colors might map "blue'' to 0, "yellow'' to 1, "green'' to 2, and so on [2][5][6]. Sometimes these schemes employ prefix-coding based on symbol frequencies (e.g., Huffman encoding [7]) or express values as small differences from some frame of reference and remove leading nulls from them (e.g., [8][4][5]). In addition to these traditional techniques, column-stores are also well-suited to compression schemes that compress values from more than one row at a time. This allows for a larger variety of viable compression algorithms. For example, run-length encoding, where repeats of the same element are expressed as (run-length, value) pairs, is an attractive approach for compressing sorted data in a column-store. Similarly, improvements to traditional compression algorithms that allow basic "symbols'' to span more than one column entry are also possible in a column-store. Finally, compression ratios are generally much higher in column-stores because consecutive entries in a column are often quite similar to each other, whereas adjacent attributes in a tuple are not. Column-stores can also store different columns in different sort-orders, further increasing the potential for compression, since sorted data is usually quite compressible.

In addition to improving the compression ratio, column-oriented compression schemes also improve CPU performance by allowing database operators to to operate directly on compressed data. This is particularly true with compression schemes like run length encoding that refer to multiple entries with the same value in a single record. For example, if a run-length encoded column says the value "42'' appears 1000 times consecutively in a particular column for which we are computing a SUM aggregate, the operator can simply take the product of the value and run-length as the SUM, without having to decompress.

Our research studies a number of alternative compression schemes that are particularly well-suited to column stores, and looks at a framework for how these schemes can be seamlessly integrated into the architecture of a column-oriented query executer

Comparison to Row-Store Executers

Surprisingly, initial observations about column vs. row store performance (column stores are better for reads since they only have to read in relevant columns, row stores are better for writes since it takes only a singe disk write to insert or update a tuple) turn out to not be trivially true. For example, to obtain a benefit over a row store when reading several columns in a table, a column store must ensure that it can read large sequential portions of each column, since the cost of seeking between two fields in the same tuple (stored separately on disk) can be prohibitively expensive. Similarly, writes can actually be made relatively inexpensive as long as many inserts are buffered together so they can be done sequentially.

Thus, we aim to look at the fundamental differences between row-store and column-store performance, studying the basic tradeoffs between the two storage representations. We attempt to answer the following questions:

As the number of columns accessed by a query increase, how does that affect the performance of a column store?
How is performance affected by the use of disk and L2 cache pre-fetching?
On a modern workstation, under what workloads are column and row stores I/O bound?
To what extent is it possible to mask the costs of I/O with processing and memory latency in the two architectures?
How are the relative performance tradeoffs of column and row stores affected by the presence of competition for I/O and memory bandwidth along with CPU cycles from competing queries?

The purpose of our work is not to definitively settle the question of whether a row store is "better" than a column store, but to explore the situations under which one architecture outperforms the other.

References:

[1] M. Stonebraker and D. J. Abadi and A. Batkin and X. Chen and M. Cherniack and M. Ferreira and E. Lau and A. Lin and S. Madden and E. O'Neil and P. O'Neil and A. Rasin and N. Tran and S. Zdonik. C-Store: A Column-oriented DBMS. In VLDB 2005, pp. 553-564.

[2] G.Graefe and L.Shapiro. Data compression and database performance. In ACM/IEEE-CS Symp. On Applied Computing pages 22-27. April 1991.

[3] Gautam Ray and Jayant R. Haritsa and S. Seshadri. Database Compression: A Performance Enhancement Tool.. In COMAD, 1995.

[4] Jonathan Goldstein and Raghu Ramakrishnan and Uri Shaft. Compressing Relations and Indexes. In ICDE '98: Proceedings of the Fourteenth International Conference on Data Engineering, pp. 370--379, 1998.

[5] Mark A. Roth and Scott J. Van Horn. Database compression. In SIGMOD Record, Volume 22, Number 3, pp. 31--39, 1993.

[6] Zhiyuan Chen and Johannes Gehrke and Flip Korn. Query optimization in compressed database systems. In SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data, pp. 271--282, 2001.

[7] D. Huffman. A method for the construction of minimum-redundancy codes. In Proc. IRE, 40(9):1098-1101. September 1952.

[8] Till Westmann and Donald Kossmann and Sven Helmer and Guido Moerkotte. The implementation and performance of compressed databases. In SIGMOD Record, Volume 29, Number 3, pp. 55--67, 2000.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu