Abstract
We propose the SAO index, which approximately answers arbitrary linear optimization queries over a sliding window of a data stream. The index uses a limited amount of memory to maintain the most “important” tuples. At any time, and for any linear optimization query, the approximate top-K tuples in the sliding window can be retrieved almost instantly. The more memory is available, the better the quality of the answers. More importantly, for a given amount of memory, the quality of the answers can be further improved by dynamically allocating a larger portion of that memory to the outer layers of the SAO index.
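To make the query model concrete, the following is a minimal sketch, assuming that each tuple is a numeric vector and that a linear optimization query is a weight vector whose top-K answer consists of the K tuples with the largest weighted sum. It is an exact brute-force baseline that keeps the entire window in memory, not the SAO index itself. The class name `SlidingWindowTopK` and its methods are hypothetical and chosen only for illustration.

```python
import heapq
from collections import deque


class SlidingWindowTopK:
    """Exact baseline (NOT the SAO index): stores every tuple in the window."""

    def __init__(self, window_size):
        # A deque with maxlen evicts the oldest tuple automatically,
        # which models a count-based sliding window.
        self.window = deque(maxlen=window_size)

    def insert(self, tup):
        # Append the newly arrived stream tuple.
        self.window.append(tuple(tup))

    def query(self, weights, k):
        # Linear optimization query: rank tuples by the weighted sum
        # of their attributes and return the k highest-scoring ones.
        def score(t):
            return sum(w * x for w, x in zip(weights, t))

        return heapq.nlargest(k, self.window, key=score)


# Usage: a window of the 3 most recent two-attribute tuples.
index = SlidingWindowTopK(window_size=3)
for t in [(1, 0), (0, 5), (2, 2), (3, 1)]:
    index.insert(t)

top2 = index.query(weights=(1, 0.5), k=2)
```

This baseline scans the whole window on every query and needs memory proportional to the window size. The abstract's approach instead retains only the most “important” tuples under a memory limit and returns approximate answers.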
Cite this article
Luo, G., Wu, KL. & Yu, P.S. Answering linear optimization queries with an approximate stream index. Knowl Inf Syst 20, 95–121 (2009). https://doi.org/10.1007/s10115-008-0157-z
DOI: https://doi.org/10.1007/s10115-008-0157-z