Abstract
Brighthouse is a column-oriented data warehouse with an automatically tuned, ultra-low-overhead metadata layer called Knowledge Grid, which is used as an alternative to classical indexes. The advantages of column-oriented data storage and of data compression are already well documented, especially in the context of analytic, decision-support querying. This paper demonstrates additional benefits that Knowledge Grid brings to compressed, column-oriented databases. In particular, we explain how it assists query optimization and execution by minimizing the need for data reads and data decompression.
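To illustrate how a small metadata layer can reduce reads and decompression, here is a minimal Python sketch. It is not taken from the abstract: the `Pack` layout, the choice of per-pack minimum and maximum as the stored statistics, and every function name are assumptions made for illustration only. The idea shown is that a query first consults the metadata and decompresses only those packs whose contents it cannot decide from the metadata alone.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pack:
    """Hypothetical compressed chunk of one column plus its tiny metadata."""
    lo: int      # minimum value in the pack (metadata)
    hi: int      # maximum value in the pack (metadata)
    n: int       # number of values in the pack (metadata)
    blob: bytes  # compressed values


def make_pack(values: List[int]) -> Pack:
    # Serialize as 64-bit integers, then compress.
    raw = struct.pack(f"{len(values)}q", *values)
    return Pack(min(values), max(values), len(values), zlib.compress(raw))


def read_pack(pack: Pack) -> List[int]:
    # The expensive step that the metadata tries to avoid.
    raw = zlib.decompress(pack.blob)
    return list(struct.unpack(f"{pack.n}q", raw))


def count_greater(packs: List[Pack], threshold: int) -> Tuple[int, int]:
    """Count values > threshold; also report how many packs were decompressed."""
    count = 0
    opened = 0
    for p in packs:
        if p.hi <= threshold:
            continue              # no value can match: skip the pack
        if p.lo > threshold:
            count += p.n          # every value matches: metadata suffices
            continue
        opened += 1               # undecided: decompress and scan
        count += sum(v > threshold for v in read_pack(p))
    return count, opened


packs = [make_pack([1, 2, 3]), make_pack([10, 20, 30]), make_pack([4, 15, 6])]
result = count_greater(packs, 5)
print(result)
```

In this toy run only the third pack is decompressed; the first is skipped and the second is answered from its metadata. Richer statistics than a minimum and maximum would let more packs be decided without decompression, at the cost of a larger metadata layer.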
Index Terms
- Brighthouse: an analytic data warehouse for ad-hoc queries