Brighthouse: an analytic data warehouse for ad-hoc queries

Published: 01 August 2008
Abstract

Brighthouse is a column-oriented data warehouse with an automatically tuned, ultra-low-overhead metadata layer called the Knowledge Grid, which is used as an alternative to classical indexes. The advantages of column-oriented data storage and of data compression are already well documented, especially in the context of analytic, decision-support querying. This paper demonstrates additional benefits that the Knowledge Grid brings to compressed, column-oriented databases. In particular, we explain how it assists query optimization and execution by minimizing the need for data reads and data decompression.
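
The abstract does not say what the metadata layer stores, so the following is only an illustrative sketch of the general idea of using small per-block summaries to avoid reading and decompressing data. It assumes, purely for illustration, that each block of a column carries its row count, minimum, and maximum; the names `Block`, `make_blocks`, and `count_greater_than` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical chunk of one column plus its summary metadata."""
    payload: List[int]  # stands in for the compressed values on disk
    rows: int           # metadata: number of values in the block
    lo: int             # metadata: minimum value in the block
    hi: int             # metadata: maximum value in the block


def make_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into blocks and record the summaries at load time."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, len(chunk), min(chunk), max(chunk)))
    return blocks


def count_greater_than(blocks: List[Block], threshold: int) -> Tuple[int, int]:
    """Evaluate COUNT(*) WHERE value > threshold.

    Returns (matching rows, number of blocks whose payload was read).
    """
    matches = 0
    blocks_read = 0
    for block in blocks:
        if block.hi <= threshold:
            # No value in the block can match: skip it without reading.
            continue
        if block.lo > threshold:
            # Every value matches: answer from the metadata alone.
            matches += block.rows
            continue
        # The summaries are inconclusive: only now touch the payload.
        blocks_read += 1
        matches += sum(1 for v in block.payload if v > threshold)
    return matches, blocks_read


if __name__ == "__main__":
    blocks = make_blocks(list(range(100)), block_size=10)
    print(count_greater_than(blocks, 42))  # (57, 1)
```

In this toy run, nine of the ten blocks are resolved from their summaries and only one block has to be read. How much real data can be skipped this way depends on how the values are distributed across blocks.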
