research-article

BOSS - An Architecture for Database Kernel Composition

Authors:
Hubert Mohr-Daurat (Imperial College London), Xuan Sun (Imperial College London), Holger Pirk (Imperial College London)

Proceedings of the VLDB Endowment, Volume 17, Issue 4, pp. 877–890. https://doi.org/10.14778/3636218.3636239

Published: 05 March 2024

Abstract

Composable database system research has yielded components such as Apache Arrow for storage, Meta's Velox for processing and Apache Calcite for query planning. What is lacking, however, is a design for a general, efficient and easy-to-use architecture to connect them. We propose such an architecture. Our proposal is based on the ideas of partial query evaluation and a carefully designed, unified exchange format for query plans and data. We implement the architecture in a system called BOSS¹ that combines Apache Arrow, the GPU-accelerated compute kernel ArrayFire and the CPU-oriented Velox kernel into a fully featured relational Data Management System (DMS). We demonstrate that the architecture is general enough to incorporate practically any DMS component, easy to use and virtually overhead-free. Based on this architecture, BOSS achieves significant performance improvements over the CPU-only Velox kernel and even outperforms the highly optimized GPU-only DMS HeavyDB for some queries.
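The idea of partial query evaluation over a single exchange format for plans and data can be illustrated with a small sketch. This is not the BOSS implementation or its API: the names `Expr`, `Engine`, `run_pipeline`, the operator names (`Table`, `Select`, `Sum`) and the toy `TABLES` store are all hypothetical, chosen only to show how a chain of engines might each evaluate the parts of a plan they support and pass the rest on unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Expr:
    """One node of the (hypothetical) exchange format: a head symbol plus
    arguments. Arguments may be plain data or further expressions, so the
    same structure carries both query plans and data."""
    head: str
    args: Tuple["Value", ...]


Value = Union[int, float, str, list, Expr]


class Engine:
    """A component that knows how to evaluate only some operators."""

    def __init__(self, name: str, operators: Dict[str, Callable]):
        self.name = name
        self.operators = operators

    def evaluate(self, value: Value) -> Value:
        if not isinstance(value, Expr):
            return value  # plain data passes through untouched
        # Evaluate children first, so supported sub-plans are reduced.
        args = tuple(self.evaluate(a) for a in value.args)
        op = self.operators.get(value.head)
        # Leave the node symbolic if this engine does not support it, or
        # if any argument is still unevaluated.
        if op is None or any(isinstance(a, Expr) for a in args):
            return Expr(value.head, args)
        return op(*args)


def run_pipeline(engines, plan: Value) -> Value:
    """Pass the plan through each engine in order."""
    for engine in engines:
        plan = engine.evaluate(plan)
    return plan


# Toy data and two toy engines (assumptions for illustration only).
TABLES = {"prices": [5, 12, 20, 7]}
storage = Engine("storage", {"Table": lambda name: TABLES[name]})
compute = Engine("compute", {
    "Select": lambda rows, threshold: [r for r in rows if r > threshold],
    "Sum": lambda rows: sum(rows),
})

plan = Expr("Sum", (Expr("Select", (Expr("Table", ("prices",)), 10)),))
print(run_pipeline([storage, compute], plan))  # 32
```

In this sketch the storage engine resolves only the `Table` node and returns a partially evaluated plan; the compute engine then finishes the remaining operators. Reversing the engine order leaves the plan symbolic, which shows why the order of components in the chain matters in such a design.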

Published in
Proceedings of the VLDB Endowment, Volume 17, Issue 4
December 2023
309 pages
ISSN: 2150-8097
Editors: Meihui Zhang (Beijing Institute of Technology), Cyrus Shahabi (University of Southern California)
Publisher: VLDB Endowment
Badges: Artifacts Available / v1.1