Skip to main content
Log in

Answering queries using materialized views with minimum size

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

In this paper, we study the following problem. Given a database and a set of queries, we want to find a set of views that can compute the answers to the queries, such that the amount of space, in bytes, required to store the viewset is minimum on the given database. (We also handle problem instances where the input has a set of database instances, as described by an oracle that returns the sizes of view relations for given view definitions.) This problem is important for applications such as distributed databases, data warehousing, and data integration. We explore the decidability and complexity of the problem for workloads of conjunctive queries. We show that results differ significantly depending on whether the workload queries have self-joins. Further, for queries without self-joins we describe a very compact search space of views, which contains all views in at least one optimal viewset. We present techniques for finding a minimum-size viewset for a single query without self-joins by using the shape of the query and its constraints, and validate the approach by extensive experiments.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Bernstein, P.A., Goodman, N., Wong, E., Reeve, C.L., Rothnie, Jr., J.B.: Query processing in a system for distributed databases (SDD-1). ACM Trans. Database Syst. (TODS) 6(4), 602–625 (1981)

    Google Scholar 

  2. Ceri, S., Pelagatti, G.: Distributed Databases: Principles and Systems. McGraw-Hill Book Company, New York (1984)

    Google Scholar 

  3. Özsu, M.T., Valduriez, P.: Principles of Distributed Database Systems. Prentice-Hall, Englewood Cliffs, NJ (1999)

    Google Scholar 

  4. Lenzerini, M.: Data integration: A theoretical perspective. PODS pp. 233–246 (2002)

  5. Hacıgümüş, H., Iyer, B., Mehrotra, S.: Providing database as a service. ICDE (2002)

  6. Wiederhold, G.: Mediators in the architecture of future information systems. IEEE Comput. 25(3), 38–49 (1992)

    Google Scholar 

  7. Haas, L.M., Kossmann, D., Wimmers, E.L., Yang, J.: Optimizing queries across diverse data sources. In: Proc. VLDB pp. 276–285 (1997)

  8. Hacıgümüş, H., Iyer, B., Li, C., Mehrotra, S.: Executing SQL over encrypted data in the database-service-provider model. SIGMOD (2002)

  9. Gupta, H.: Selection of views to materialize in a data warehouse. ICDT (1997)

  10. Theodoratos, D., Ligoudistianos, S., Sellis, T.: Designing the global data warehouse with spj views. CAiSE (1999)

  11. Theodoratos, D., Sellis, T.: Data warehouse configuration. In: Proc. VLDB (1997)

  12. Yang, J., Karlapalem, K., Li, Q.: Algorithms for materialized view design in data warehousing environment. In: Proc. VLDB (1997)

  13. Baralis, E., Paraboschi, S., Teniente, E.: Materialized view selection in a multidimensional database. In: Proc. VLDB (1997)

  14. Gupta, H., Harinarayan, V., Rajaraman, A., Ullman, J.: Index selection in olap. ICDE (1997)

  15. Harinarayan, V., Rajaraman, A., Ullman, J.: Implementing data cubes efficiently. SIGMOD (1996)

  16. Bello, R., Dias, K., Downing, A., Feenan, J., Finnerty, J., Norcott, W., Sun, H., Witkowski, A., Ziauddin, M.: Materialized views in Oracle. In: Proc. VLDB pp. 659–664 (1998)

  17. Goldstein, J., Larson, P.-A.: Optimizing queries using materialized views: A practical, scalable solution. SIGMOD 331–342 (2001)

  18. Zaharioudakis, M., Cochrane, R., Lapis, G., Pirahesh, H., Urata, M.: Answering complex SQL queries using automatic summary tables. SIGMOD pp. 105–116 (2000)

  19. Agrawal, S., Chaudhuri, S., Narasayya, V.: Automated selection of materialized views and indexes in Microsoft SQL Server. Proc. VLDB 496–505 (2000)

  20. Chirkova, R.: The view-selection problem has an exponential-time lower bound for conjunctive queries and views. In: PODS pp. 159–168 (2002)

  21. Chirkova, R., Genesereth, M.R.: Linearly bounded reformulations of conjunctive databases. DOOD (2000)

  22. Chirkova, R., Halevy, A.Y., Suciu, D.: A formal perspective on the view selection problem. VLDB J. 11(3), pp. 216–237. (2002)

    Article  Google Scholar 

  23. Afrati, F., Li, C., Ullman, J.D.: Generating efficient plans using views. pp. 319–330, SIGMOD (2001)

  24. Levy, A., Mendelzon, A.O., Sagiv, Y., Srivastava, D.: Answering queries using views. pp. 95–104, PODS (1995)

  25. Halevy, A.: Answering queries using views: A survey. VLDB J. (2001)

  26. Calvanese, D., De Giacomo, G., Lenzerini, M.: Answering queries using views over description logics knowledge bases. PODS pp. 386–391 (2000)

  27. Calvanese, D., De Giacomo, G., Lenzerini, M., Vardi, M.Y.: Answering regular path queries using views. ICDE pp. 389–398 (2000)

  28. Li, C., Bawa, M., Ullman, J.D.: Minimizing view sets without losing query-answering power. ICDT pp. 99–113 (2001)

  29. Chen, Z., Seshadri, P.: An algebraic compression framework for query results. ICDE pp. 177–188 (2000)

  30. TPC-H: http://www.tpc.org/tpch/

  31. Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. STOC pp. 77–90 (1977)

  32. Sagiv, Y., Yannakakis, M.: Equivalences among relational expressions with the union and difference operators. J. ACM 27(4), 633–655 (1980)

    Article  MathSciNet  Google Scholar 

  33. Pottinger, R., Levy, A.: A scalable algorithm for answering queries using views. Proc. VLDB (2000)

  34. Yannakakis, M.: Algorithms for acyclic database schemes. Proc. VLDB, IEEE Computer Society Press 82–94 (1981)

  35. Gupta, A., Sagiv, Y., Ullman, J.D., Widom, J.: Constraint checking with partial information. PODS pp. 45–55 (1994)

  36. Klug, A.: On conjunctive queries containing inequalities. J. ACM 35(1), 146–160 (1988)

    Article  MATH  MathSciNet  Google Scholar 

  37. Selinger, P.G., Astrahan, M.M., Chamberlin, D.D., Lorie, R.A., Price, T.G.: Access path selection in a relational database management system. SIGMOD pp. 23–34 (1979)

  38. Graham, M.: On the universal relation. Technical report, University of Toronto, Canada (1979)

  39. Barsalou, T., Keller, A.M., Siambela, N., Wiederhold, G.: Updating relational databases through object-based views. In: Proceedings of the 1991 ACM SIGMOD International Conference on Management of Data, pp. 248–257 (1991)

  40. Ioannidis, Y.E., Poosala, V.: Balancing histogram optimality and practicality for query result size estimation. In: Proceedings of the SIGMOD Conference, pp. 233–244 (1995)

  41. Gibbons, P.B., Matias, Y.: New sampling-based summary statistics for improving approximate query answers. In: Proceedings of the SIGMOD Conference, pp. 331–342 (1998)

  42. Chaudhuri, S.: An overview of query optimization in relational systems. PODS pp. 34–43 (1998)

  43. Haas, P.J., Naughton, J.F., Seshadri, S., Stokes, L.: Sampling-based estimation of the number of distinct values of an attribute. In: Proc. VLDB pp. 311–322 (1995)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rada Chirkova.

Additional information

Part of this article was published elsewhere [Chirkova, R., Li, C.: Materializing views with minimal size to answer queries. PODS (2003)]. In addition to the prior materials, this article contains new theoretical results, as well as new results on how to efficiently implement the proposed techniques (Sects. 5 and 5.4)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chirkova, R., Li, C. & Li, J. Answering queries using materialized views with minimum size. The VLDB Journal 15, 191–210 (2006). https://doi.org/10.1007/s00778-005-0162-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-005-0162-8

Keywords

Navigation