Abstract
An RkNN (reverse k-nearest-neighbor) query returns all objects whose k nearest neighbors include the query object. In this paper, we consider RkNN query processing when the distances between attribute values are not necessarily metric. The dissimilarity between two objects can then be a monotonic aggregate of the dissimilarities between their attribute values, with the aggregation function specified at query time. We outline real-world cases that motivate RkNN processing in such scenarios. We examine the AL-Tree index and its applicability to RkNN query processing, and develop an approach that exploits the group-level reasoning the AL-Tree enables. We evaluate our approach against a naive approach that performs sequential scans on contiguous data and against an improved block-based approach that we provide, using real-world datasets and synthetic data with varying characteristics. This extensive empirical evaluation shows that our approach outperforms existing methods in both computational and disk access costs, leading to significantly better response times.
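To make the query semantics concrete, the following is a minimal Python sketch of a brute-force RkNN evaluation over categorical attributes. It is not the AL-Tree approach described above, only an illustration of the definition in the spirit of a sequential-scan baseline. All names (`attr_dissim`, `dissim`, `naive_rknn`), the lookup tables, and the sample objects are hypothetical. The sketch assumes symmetric per-attribute dissimilarity tables, a zero dissimilarity between identical values, and that ties are resolved in favor of the query object.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Table = Dict[Tuple[str, str], float]
Aggregate = Callable[[List[float]], float]


def attr_dissim(table: Table, x: str, y: str) -> float:
    """Look up the dissimilarity of two attribute values (assumed symmetric)."""
    if x == y:
        return 0.0
    return table[(x, y)] if (x, y) in table else table[(y, x)]


def dissim(a: Sequence[str], b: Sequence[str],
           tables: Sequence[Table], aggregate: Aggregate) -> float:
    """Aggregate per-attribute dissimilarities with a monotonic function."""
    return aggregate([attr_dissim(t, x, y) for t, x, y in zip(tables, a, b)])


def naive_rknn(query: Sequence[str], data: Sequence[Sequence[str]], k: int,
               tables: Sequence[Table], aggregate: Aggregate) -> List[int]:
    """Return indices of objects that have `query` among their k nearest neighbors.

    An object qualifies when fewer than k other data objects are strictly
    closer to it than the query is (tie handling is an assumption).
    """
    result = []
    for i, obj in enumerate(data):
        d_query = dissim(obj, query, tables, aggregate)
        closer = sum(
            1 for j, other in enumerate(data)
            if j != i and dissim(obj, other, tables, aggregate) < d_query
        )
        if closer < k:
            result.append(i)
    return result


# Hypothetical non-metric tables: red-blue (0.9) exceeds
# red-orange (0.1) + orange-blue (0.2), violating the triangle inequality.
COLOR: Table = {("red", "orange"): 0.1, ("orange", "blue"): 0.2,
                ("red", "blue"): 0.9}
SIZE: Table = {("S", "M"): 0.3, ("M", "L"): 0.3, ("S", "L"): 0.8}
TABLES = [COLOR, SIZE]

DATA = [("red", "S"), ("orange", "S"), ("blue", "L"), ("blue", "M")]
QUERY = ("red", "M")

if __name__ == "__main__":
    # The aggregation function is chosen at query time, e.g. sum or max.
    print(naive_rknn(QUERY, DATA, 2, TABLES, sum))
    print(naive_rknn(QUERY, DATA, 2, TABLES, max))
```

Because the aggregate is passed as an argument, the same data can be queried under different monotonic functions without rebuilding anything. The quadratic number of dissimilarity computations in this sketch is the cost that an index-based approach aims to avoid.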
Index Terms
- Efficient RkNN retrieval with arbitrary non-metric similarity measures