Copyright © 2005 Elsevier B.V. All rights reserved.
Improving range-sum query evaluation on data cubes via polynomial approximation
Received 9 November 2004;
Abstract
Inefficient query answering is the main drawback of Decision Support Systems (DSS), owing to the very large size of the multidimensional data stored in the underlying Data Warehouse Server (DWS). Aggregate queries are the most frequent and useful kind of query for such systems, as they support several analyses based on the multidimensionality and multi-resolution of data. Consequently, providing fast answers to aggregate queries (trading off accuracy for efficiency where possible) has become a very important requirement for improving the effectiveness of DSS-based applications. In this paper we present a technique for supporting approximate aggregate query answering in OLAP, which represents the most common application interface for a DWS. The technique is based on an analytical interpretation of multidimensional data and on the well-known least squares approximation (LSA) method. It consists in building data synopses by interpreting the original data distributions as a set of discrete functions. These synopses, called Δ-Syn, are obtained by approximating data with a set of polynomial coefficients, and by storing these coefficients instead of the original data. Queries are issued on the compressed representation, thus reducing the number of disk accesses needed to evaluate the answers.
Keywords: Multidimensional data management; OLAP; Approximate query answering; Data synopses
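The idea summarized in the abstract (store polynomial coefficients instead of the data, then answer range-sum queries from the coefficients) can be illustrated with a minimal one-dimensional sketch. This is not the paper's buildΔSyn algorithm: the function names `build_synopsis` and `approx_range_sum`, the choice to fit the cumulative sums rather than the raw values, and the restriction to a single dimension are all assumptions made here for illustration only.

```python
import numpy as np


def build_synopsis(values, degree=3):
    """Fit a least-squares polynomial to the cumulative sums of a 1-D array.

    Hypothetical helper: only the degree + 1 fitted coefficients would need
    to be stored in place of the original values.
    """
    values = np.asarray(values, dtype=float)
    # cum[i] holds the sum of values[0..i-1], so cum has len(values) + 1 entries.
    cum = np.concatenate(([0.0], np.cumsum(values)))
    x = np.arange(len(cum), dtype=float)
    # Least-squares fit of a polynomial of the requested degree.
    return np.polynomial.Polynomial.fit(x, cum, deg=degree)


def approx_range_sum(poly, lo, hi):
    """Approximate sum(values[lo..hi]) (inclusive) from the polynomial alone."""
    # A range sum is the difference of two cumulative sums.
    return float(poly(hi + 1) - poly(lo))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.integers(0, 100, size=200)
    synopsis = build_synopsis(data, degree=5)
    exact = int(data[20:121].sum())
    approx = approx_range_sum(synopsis, 20, 120)
    print("exact:", exact, "approx:", round(approx, 1))
```

In this sketch a query touches only the stored coefficients, evaluating the polynomial at the two range boundaries, which is the sense in which a compressed representation can reduce data accesses. The accuracy depends on how well a polynomial of the chosen degree fits the data.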
Article Outline
- 1. Introduction
- 1.1. Target range queries
- 1.2. Target data cubes
- 1.3. Summary of contribution
- 1.4. Outline
- 2. Related work
- 3. Foundations
- 3.1. Problem statement
- 3.2. Key intuition and basic definitions
- 3.3. Background on the least squares approximation method
- 3.4. Remarks on the goodness of the LSA method in supporting approximate query answering
- 3.5. Analytical interpretation of data cubes
- 3.6. Improving the approximate query answering technique
- 4. Building and accessing Δ-Syn
- 5. The buildΔSyn algorithm
- 6. Experimental results
- 6.1. Synthetic data cubes
- 6.2. Error analysis on synthetic data cubes
- 6.3. Real data cubes
- 6.4. Concluding remarks: What for highly-dimensional, massive data cubes?
- 7. A reference architecture for improving query performances of OLAP servers
- 8. Conclusions and future work
- Acknowledgements
- Appendix A. Appendix
- A.1. The findIndexOfBestRow procedure
- A.2. The compOccupation procedure
- A.3. The cumDist procedure
- A.4. The linearComb procedure
- References
- Vitae






