research-article

Brighthouse: an analytic data warehouse for ad-hoc queries

Authors:
Dominik Ślęzak, Infobright Inc., Poland
Jakub Wróblewski, Infobright Inc., Poland
Victoria Eastwood, Infobright Inc., Canada
Piotr Synak, Infobright Inc., Poland

Proceedings of the VLDB Endowment, Volume 1, Issue 2, pp. 1337–1345
https://doi.org/10.14778/1454159.1454174

Published: 01 August 2008

Abstract

Brighthouse is a column-oriented data warehouse with an automatically tuned metadata layer of ultra-small overhead, called the Knowledge Grid, which is used as an alternative to classical indexes. The advantages of column-oriented data storage and of data compression are already well documented, especially in the context of analytic, decision-support querying. This paper demonstrates additional benefits that the Knowledge Grid brings to compressed, column-oriented databases. In particular, we explain how it assists in query optimization and execution by minimizing the need for data reads and data decompression.
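
As a rough illustration of how compact per-block metadata can reduce data reads and decompression, the sketch below keeps only a minimum and a maximum for each compressed block of a column. For a range predicate it uses them to decide which blocks can be skipped, which qualify entirely, and which must actually be decompressed. Everything in it is an assumption made for illustration: the names `Pack`, `build_packs` and `count_in_range`, the block size, and the use of `zlib` are hypothetical. The abstract does not specify what the Knowledge Grid stores or how it is used, so this should not be read as the authors' implementation.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pack:
    """A compressed block of column values plus tiny metadata (hypothetical layout)."""
    lo: int         # minimum value in the block
    hi: int         # maximum value in the block
    count: int      # number of values in the block
    payload: bytes  # zlib-compressed 64-bit integers


def build_packs(values: List[int], pack_size: int = 4) -> List[Pack]:
    """Split a column into fixed-size blocks, compress each, and record min/max."""
    packs = []
    for start in range(0, len(values), pack_size):
        chunk = values[start:start + pack_size]
        raw = struct.pack(f"<{len(chunk)}q", *chunk)
        packs.append(Pack(min(chunk), max(chunk), len(chunk), zlib.compress(raw)))
    return packs


def count_in_range(packs: List[Pack], low: int, high: int) -> Tuple[int, int]:
    """Count values v with low <= v <= high.

    Returns (matching_count, number_of_blocks_decompressed).
    """
    total = 0
    decompressed = 0
    for p in packs:
        if p.hi < low or p.lo > high:
            # No value in this block can match: skip it without reading the payload.
            continue
        if low <= p.lo and p.hi <= high:
            # Every value matches: answer from metadata alone.
            total += p.count
            continue
        # The block may contain both matching and non-matching values: decompress it.
        chunk = struct.unpack(f"<{p.count}q", zlib.decompress(p.payload))
        decompressed += 1
        total += sum(1 for v in chunk if low <= v <= high)
    return total, decompressed


column = [1, 2, 3, 4, 10, 11, 12, 13, 20, 25, 30, 35, 40, 41, 42, 43]
packs = build_packs(column)
print(count_in_range(packs, 11, 30))  # (6, 2): two blocks skipped, two decompressed
print(count_in_range(packs, 10, 35))  # (8, 0): answered from metadata only
```

In this toy setting the second query touches no compressed payload at all, which is the kind of saving the abstract refers to. A real system would hold richer metadata than a min/max pair and would apply it to joins and aggregations as well as filters.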

Published in
Proceedings of the VLDB Endowment, Volume 1, Issue 2
August 2008
461 pages
ISSN: 2150-8097
Editors: Peter Buneman, Beng Chin Ooi, Kenneth Ross, Gerald Weber
Publisher: VLDB Endowment