Subspace Nearest Neighbor Search - Problem Statement, Approaches, and Discussion

Position Paper

  • Conference paper
  • First Online:
Similarity Search and Applications (SISAP 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9371)

Abstract

Computing the similarity between objects is a central task for many applications in the fields of information retrieval and data mining. To find the k-nearest neighbors, a ranking is typically computed based on a predetermined set of data dimensions and a distance function that remain constant over all possible queries. However, many high-dimensional feature spaces comprise a large number of dimensions, many of which may carry noisy, irrelevant, redundant, or contradictory information. More specifically, the relevance of dimensions may depend on the query object itself, and in general, different dimension sets (subspaces) may be appropriate for different queries. Approaches for feature selection or feature weighting typically provide a global subspace selection, which may not be suitable for all possible queries. In this position paper, we frame a new research problem, called subspace nearest neighbor search, aiming at multiple query-dependent subspaces for nearest neighbor search. We describe relevant problem characteristics, relate to existing approaches, and outline potential research directions.
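To make the distinction concrete, the following is a minimal, self-contained Python sketch (not taken from the paper) that contrasts conventional k-nearest-neighbor search, which ranks all objects with one distance over a fixed, query-independent set of dimensions, with a query-dependent variant that restricts the distance to a subspace chosen per query. The subspace-selection heuristic used here (keeping the dimensions with the smallest variance in the query's full-space neighborhood) and all function names and parameters are illustrative assumptions, not the authors' method.

    # Illustrative sketch only: query-independent vs. query-dependent subspace k-NN.
    import numpy as np

    def knn_in_subspace(data, query, k, dims):
        """Indices of the k nearest neighbors of `query` in `data`, using
        Euclidean distance restricted to the dimensions listed in `dims`."""
        diffs = data[:, dims] - query[dims]
        dists = np.sqrt((diffs ** 2).sum(axis=1))
        return np.argsort(dists)[:k]

    def query_dependent_subspace(data, query, n_dims):
        """Toy heuristic (an assumption, not the paper's method): keep the
        n_dims dimensions whose values vary least within the query's
        full-space neighborhood, as a stand-in for a real per-query
        relevance criterion."""
        all_dims = np.arange(data.shape[1])
        neighborhood = knn_in_subspace(data, query, k=20, dims=all_dims)
        local_variance = data[neighborhood].var(axis=0)
        return np.sort(np.argsort(local_variance)[:n_dims])

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        data = rng.normal(size=(1000, 50))   # 1000 objects, 50 features
        query = data[0]

        # Conventional k-NN: one global, query-independent dimension set.
        global_dims = np.arange(data.shape[1])
        print("global k-NN:   ", knn_in_subspace(data, query, k=5, dims=global_dims))

        # Subspace k-NN: the relevant dimensions depend on the query object itself.
        dims = query_dependent_subspace(data, query, n_dims=10)
        print("query subspace:", dims)
        print("subspace k-NN: ", knn_in_subspace(data, query, k=5, dims=dims))

Any other per-query relevance criterion could be plugged into the subspace-selection step; the point of the sketch is only that the dimension set, and therefore the resulting ranking, changes with the query object.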

Author information

Corresponding author

Correspondence to Michael Hund.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hund, M. et al. (2015). Subspace Nearest Neighbor Search - Problem Statement, Approaches, and Discussion. In: Amato, G., Connor, R., Falchi, F., Gennaro, C. (eds) Similarity Search and Applications. SISAP 2015. Lecture Notes in Computer Science, vol 9371. Springer, Cham. https://doi.org/10.1007/978-3-319-25087-8_29

  • DOI: https://doi.org/10.1007/978-3-319-25087-8_29

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25086-1

  • Online ISBN: 978-3-319-25087-8

  • eBook Packages: Computer Science, Computer Science (R0)
