Interaction-aware scheduling of report-generation workloads

  • Regular Paper
The VLDB Journal

Abstract

The typical workload in a database system consists of a mix of multiple queries of different types that run concurrently. Interactions among the different queries in a query mix can have a significant impact on database performance. Hence, optimizing database performance requires reasoning about query mixes rather than considering queries individually. Current database systems lack the ability to do such reasoning. We propose a new approach based on planning experiments and statistical modeling to capture the impact of query interactions. Our approach requires no prior assumptions about the internal workings of the database system or the nature and cause of query interactions, making it portable across systems. To demonstrate the potential of modeling and exploiting query interactions, we have developed a novel interaction-aware query scheduler for report-generation workloads. Our scheduler, called QShuffler, uses two query scheduling algorithms that leverage models of query interactions. The first algorithm is optimized for workloads where queries are submitted in large batches. The second algorithm targets workloads where queries arrive continuously, and scheduling decisions have to be made online. We report an experimental evaluation of QShuffler using TPC-H workloads running on IBM DB2. The evaluation shows that QShuffler, by modeling and exploiting query interactions, can consistently outperform (up to 4x) query schedulers in current database systems.
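The idea of scheduling by query mix rather than by individual query can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the query types `A` and `B`, the completion times in `SAMPLES`, the fixed multiprogramming level of two, and the greedy pairing rule. A plain lookup table stands in for a statistical model fitted to experiment samples. This is not QShuffler's actual algorithm.

```python
# Hypothetical measurements: a mix of two concurrently running query types
# mapped to the average per-query completion time in seconds.
# These numbers are invented for illustration only.
SAMPLES = {
    ("A", "A"): 10.0,  # two type-A queries slow each other down
    ("A", "B"): 4.0,   # A and B run well together
    ("B", "B"): 9.0,
}


def predicted_time(mix):
    """Return the stand-in model's prediction for a mix of two query types."""
    return SAMPLES[tuple(sorted(mix))]


def pick_next(running, queue):
    """Pick the queued query that yields the lowest predicted time
    when added to the currently running queries."""
    return min(queue, key=lambda q: predicted_time(running + [q]))


def schedule_pairs(queue):
    """Greedily pair queued queries; return the pairs and any leftover."""
    queue = list(queue)
    pairs = []
    while len(queue) >= 2:
        first = queue.pop(0)
        partner = pick_next([first], queue)
        queue.remove(partner)
        pairs.append((first, partner))
    return pairs, queue


if __name__ == "__main__":
    print(schedule_pairs(["A", "A", "B", "B"]))
```

A scheduler that ignores interactions would run the batch in arrival order and pair `A` with `A`. The sketch instead pairs each `A` with a `B` because the table predicts a shorter completion time for that mix.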

Author information

Corresponding author

Correspondence to Mumtaz Ahmad.

About this article

Cite this article

Ahmad, M., Aboulnaga, A., Babu, S. et al. Interaction-aware scheduling of report-generation workloads. The VLDB Journal 20, 589–615 (2011). https://doi.org/10.1007/s00778-011-0217-y

  • DOI: https://doi.org/10.1007/s00778-011-0217-y
