High-Dimensional Shared Nearest Neighbor Clustering Algorithm

  • Conference paper
Fuzzy Systems and Knowledge Discovery (FSKD 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3614)

Abstract

Clustering results often depend critically on density and similarity, and the complexity of clustering often changes as the dimensionality of the samples increases. In this paper, we build on the classical shared nearest neighbor clustering algorithm (SNN) and propose a high-dimensional shared nearest neighbor clustering algorithm (DSNN). DSNN is evaluated on a freeway traffic data set, and the experimental results show that DSNN addresses many disadvantages of the SNN algorithm, such as those concerning outliers, statistics, core points, and computational complexity, and that it attains better clustering results on multi-dimensional data sets than the SNN algorithm.
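For readers unfamiliar with the classical SNN idea that the abstract builds on, the following is a minimal sketch of shared nearest neighbor similarity and density. It illustrates the general SNN approach only, not the DSNN algorithm of this paper, whose details are not given here. The function names (`knn_indices`, `snn_similarity`, `snn_density`), the parameters `k` and `eps`, the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances (assumed metric)
    np.fill_diagonal(dist, np.inf)             # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def snn_similarity(X, k):
    """Number of shared k-nearest neighbors, counted only for pairs of points
    that appear in each other's k-nearest-neighbor lists."""
    nn = knn_indices(X, k)
    n = X.shape[0]
    neighbor_sets = [set(row.tolist()) for row in nn]
    sim = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in nn[i].tolist():
            if i in neighbor_sets[j]:          # mutual neighbors only
                sim[i, j] = len(neighbor_sets[i] & neighbor_sets[j])
    return sim

def snn_density(sim, eps):
    """For each point, the number of points whose SNN similarity to it is at least eps."""
    return (sim >= eps).sum(axis=1)

# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
sim = snn_similarity(X, k=2)
density = snn_density(sim, eps=1)
```

In SNN-style clustering, points whose density exceeds a chosen threshold are usually treated as core points, and points with very low density are candidates for outliers; how DSNN adapts these steps is described in the full paper rather than on this page.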

This work is supported by the National Natural Science Foundation of China (60205007), Natural Science Foundation of Guangdong Province (031558, 04300462), Research Foundation of National Science and Technology Plan Project (2004BA721A02), Research Foundation of Science and Technology Plan Project in Guangdong Province (2003C50118) and Research Foundation of Science and Technology Plan Project in Guangzhou City (2002Z3-E0017).

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yin, J., Fan, X., Chen, Y., Ren, J. (2005). High-Dimensional Shared Nearest Neighbor Clustering Algorithm. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_60

  • DOI: https://doi.org/10.1007/11540007_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28331-7

  • Online ISBN: 978-3-540-31828-6

  • eBook Packages: Computer Science, Computer Science (R0)
