skip to main content
article
Free Access

Index scans using a finite LRU buffer: a validated I/O model

Published:01 September 1989Publication History
Skip Abstract Section

Abstract

Indexes are commonly employed to retrieve a portion of a file or to retrieve its records in a particular order. An accurate performance model of indexes is essential to the design, analysis, and tuning of file management and database systems, and particularly to database query optimization. Many previous studies have addressed the problem of estimating the number of disk page fetches when randomly accessing k records out of N given records stored on T disk pages. This paper generalizes these results, relaxing two assumptions that usually do not hold in practice: unlimited buffer and unique records for each key value. Experiments show that the performance of an index scan is very sensitive to buffer size limitations and multiple records per key value. A model for these more practical situations is presented and a formula derived for estimating the performance of an index scan. We also give a closed-form approximation that is easy to compute. The theoretical results are validated using the R* distributed relational database system. Although we use database terminology throughout the paper, the model is more generally applicable whenever random accesses are made using keys.

References

  1. 1 ASTRAHAN, M. M., SCHKOLNICK, M., AND KIM, W. Performance of the System R access path selection mechanism. Inf. Process. 80 (1980), 487-491.Google ScholarGoogle Scholar
  2. 2 BITTON, D., AND DEWITT, D.J. Duplicate record elimination in large data files. A CM Trans. Database Syst. 8 (June 1983), 255-265. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3 CARDENAS, A.F. Analysis and performance of inverted data base structures. Commun. A CM 18 (May 1975), 253-263. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. 4 CHEUNG, T.-Y. Estimating block accesses and number of records in file management. Commun. ACM 25 (July 1982), 484-487. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. 5 CHOU, H.-T., AND DEWITT, D.J. An evaluation of buffer management strategies for relational database systems. In Proceedings of Eleventh International Conference on Very Large Data Bases (Stockholm, Sept. 1985), pp. 127-141.Google ScholarGoogle Scholar
  6. 6 CHRISTODOULAKIS, S. Estimating block transfers and join sizes. In Proceedings of the A CM- SIGMOD Conference (San Jose, Calif., May 1983). ACM, New York, 1983, pp. 40-54. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7 CHRISTODOULAKIS, S. Estimating block selectivities. Inf. Syst. 1 (1984), 69-79.Google ScholarGoogle ScholarCross RefCross Ref
  8. 8 CHRISTODOULAKIS, S. Implications of certain assumptions in database performance evaluation. ACM Trans. Database Syst. 9 (June 1984), 163-186. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. 9 COMER, D. The ubiquitous B-tree. ACM Comput. Surv. 11 (June 1979), 121-137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. 10 EFFELSBERG, W., AND HAERDER, T. Principles of database buffer management. ACM Trans. Database Syst. 9 (Dec. 1984), 560-595. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. 11 IJBEMA, A., AND BLANKEN, S. Estimating bucket accesses: A practical approach. In Proceedings of the International Conference on Data Engineering (Los Angeles, Calif., Feb. 1986), pp. 30-37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. 12 LOHMAN, G. M., ET AL. Query processing in R*. In Query Processing in Database Systems, Springer-Verlag, (also available as IBM Res. Rep. RJ4272, April 1984).Google ScholarGoogle Scholar
  13. 13 LUK, W.S. On estimating block accesses in database organizations. Commun. ACM 11 (1983), 945-947. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. 14 MACKERT, L. F., AND LOHMAN, G.M. Index scans using a finite LRU buffer: A validated I/O Model. IBM Res. Rep. RJ4836, San Jose, Calif., Sept. 1985.Google ScholarGoogle Scholar
  15. 15 MACKERT, L. F., AND LOHMAN, G.M. R* optimizer validation and performance evaluation for local queries. In Proceedings of ACM-SIGMOD Conference (Washington, D.C., May 1986), pp. 84-95. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. 16 MACKERT, L. F., AND LOHMAN, G.M. R* optimizer validation and performance evaluation for distributed queries. In Proceedings of Twelfth International Conference on Very Large Data Bases (Kyoto, Aug. 1986), pp. 149-159. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. 17 SACCO, G. M., AND SCHKOLNICK, M. A mechanism for managing the buffer pool in a relational database system using the hot set model. In Proceedings of the Eighth International Conference on Very Large Data Bases (Mexico City, 1982), pp. 257-262. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. 18 SELINGER, P. G., ET AL. Access path selection in a relational database management system, in Proceedings of ACM-SIGMOD Conference (1979). ACM, New York, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. 19 STONEBRAKER, M. Operating system support for database management. Commun. ACM 24 (July 1981), 412-418. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. 20 VANDER ZANDEN, B. T., TAYLOR, H. M., AND BITTON, D. Estimating block accesses when attributes are correlated. In Proceedings of the Twelfth International Conference on Very Large Data Bases (Kyoto, Aug. 1986), pp. 119-127. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. 21 WATERS, S.J. Hit ratios. Computer J. 19 (1976), 21-24.Google ScholarGoogle ScholarCross RefCross Ref
  22. 22 WHANG, K.-Y., WIEDERHOLD, G., AND SAGALOWICZ, D. Estimating block accesses in database organizations: A closed noniterative formula. Commun. Au~ zo (l~ov. 1983), 940-944. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. 23 YAO, S.B. Approximating block accesses in database organizations. Commun. ACM 20 {April 1977), 260-261. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Index scans using a finite LRU buffer: a validated I/O model

                        Recommendations

                        Reviews

                        Fazli Can

                        Most database optimizers try to estimate the I/O cost of a query in order to choose the cheapest access path (index or sequential scan) using information available during compile time. Studies on the I/O cost of random accesses using indexes usually rely on two assumptions: unlimited buffer and unique records for each key value. This paper studies the problem while relaxing these assumptions. The authors first introduce the problem with a good literature review and provide terminology, which improves the readability of the paper. In the main body of the paper, they derive an iterative formula for I/O cost in terms of the number of page fetches from disk to a finite LRU buffer when scanning an unclustered index. Next, they derive two easy-to-compute approximations of this formula that are applicable in the typical situation of low duplication of index keys. Numerous experiments validate the theoretical results. In the experiments, the authors compare their I/O cost estimates with the actual I/O cost using the R* distributed relational DBMS, with the System R estimate, and with another useful estimate. In the conclusion of the paper, the authors provide a list of problems as a pointer to future research. These problems are the effects of (a) data values drawn from a nonuniform distribution, (b) records not randomly placed on pages, and (c) a variable number of records per page. This well-written paper is useful for researchers working on physical database design and performance evaluation. Its results are applicable whenever random accesses are made using indexes.

                        Access critical reviews of Computing literature here

                        Become a reviewer for Computing Reviews.

                        Comments

                        Login options

                        Check if you have access through your login credentials or your institution to get full access on this article.

                        Sign in

                        Full Access

                        • Published in

                          cover image ACM Transactions on Database Systems
                          ACM Transactions on Database Systems  Volume 14, Issue 3
                          Sept. 1989
                          152 pages
                          ISSN:0362-5915
                          EISSN:1557-4644
                          DOI:10.1145/68012
                          • Editor:
                          • Gio Wiederhold
                          Issue’s Table of Contents

                          Copyright © 1989 ACM

                          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                          Publisher

                          Association for Computing Machinery

                          New York, NY, United States

                          Publication History

                          • Published: 1 September 1989
                          Published in tods Volume 14, Issue 3

                          Permissions

                          Request permissions about this article.

                          Request Permissions

                          Check for updates

                          Qualifiers

                          • article

                        PDF Format

                        View or Download as a PDF file.

                        PDF

                        eReader

                        View online with eReader.

                        eReader