ABSTRACT
Hybrid transactional and analytical processing (HTAP) systems such as SAP HANA make it much simpler to manage both operational load and analytical queries without ETL, separate data warehouses, and the like. To represent both transactional and analytical business logic in a single database system, stored procedures are often used to express analytical queries using control flow logic and DML statements. Optimizing these complex procedures requires solid knowledge of both imperative programming languages and the declarative query language. Therefore, unified optimization techniques that combine program optimization and query optimization are essential for achieving optimal query performance. In this paper, we propose a novel unified optimization technique for efficient iterative query processing. We present a notion of query motion that allows SQL queries to be moved into and out of a loop. Additionally, we exploit a new cost model that measures the quality of an execution plan by taking both the queries and the loop iterations into account. We evaluate our technique using both a standard decision support benchmark and real-world workloads. The evaluation shows that our unified optimization technique enumerates plans that are up to an order of magnitude faster than plans generated by the existing loop-invariant code motion technique.
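To make the idea of moving a query out of a loop concrete, the following is a minimal sketch of plain loop-invariant hoisting, the baseline the abstract compares against. It is not the paper's algorithm or cost model. The `sales` table, the function names, and the use of Python's `sqlite3` module in place of SAP HANA stored procedures are all assumptions made for illustration.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("north", 30), ("south", 20), ("south", 40)],
    )
    return conn


def shares_naive(conn, regions):
    # The total does not depend on `region`, yet it is recomputed
    # on every iteration of the loop.
    result = {}
    for region in regions:
        total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
        part = conn.execute(
            "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
        ).fetchone()[0]
        result[region] = part / total
    return result


def shares_hoisted(conn, regions):
    # The loop-invariant query is moved out of the loop and runs once.
    # This is only valid because nothing in the loop modifies `sales`.
    total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
    result = {}
    for region in regions:
        part = conn.execute(
            "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
        ).fetchone()[0]
        result[region] = part / total
    return result


conn = make_db()
regions = ["north", "south"]
naive = shares_naive(conn, regions)
hoisted = shares_hoisted(conn, regions)
```

Both functions return the same shares, but the hoisted version issues one query fewer per iteration. Whether such a move pays off depends on how often the loop actually runs, which is the kind of trade-off a cost model that accounts for loop iterations would have to weigh.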