
Analysis of the Nearest Neighbor Classifiers: A Review

  • Conference paper
Advances in Artificial Intelligence and Data Engineering (AIDE 2019)

Abstract

We are living in a data age, and with the expansion of the 'Internet of Things' platform there is an upsurge in devices connected to the Internet. Everything from smart sensors, smartphones, and tablets to systems installed in manufacturing units, hospitals, vehicles, etc. is generating data. Such developments in the technological world have escalated the generation of data and call for analysis of the raw data to identify patterns. Data mining techniques are deployed extensively to extract information, and they have far-reaching effects on trade and on the lives of the people concerned. Their accuracy and effectiveness in providing better outcomes and cost-effective methods in various domains have been established. In supervised learning, instance-based classifiers such as the k-nearest neighbor (kNN) usually rely on density estimation. In this paper, the regular kNN classifier is compared conceptually with various other classifiers, and ARSkNN, which uses mass estimation, is shown to be commensurate with kNN in accuracy while reducing computation time drastically on the datasets chosen for this analysis. Tenfold cross-validation is used for testing.
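As a point of reference for the comparison described in the abstract, the following is a minimal sketch of the regular distance-based kNN baseline evaluated with tenfold cross-validation. It uses scikit-learn and the Iris dataset purely as illustrative choices (not the datasets or parameters used in the paper), and it does not implement ARSkNN's mass-based similarity measure.

```python
# Minimal sketch: regular distance-based kNN evaluated with tenfold cross-validation.
# Dataset (Iris) and k=5 are illustrative assumptions, not the paper's experimental setup.
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Regular kNN: majority vote among the k closest training points under
# Euclidean distance (the distance/density-based rule the paper compares against).
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")

# Tenfold cross-validation, as used for testing in the paper.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(knn, X, y, cv=cv, scoring="accuracy")
print(f"Mean 10-fold accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```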



Author information

Correspondence to G. Poornalatha.


Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Agarwal, Y., Poornalatha, G. (2021). Analysis of the Nearest Neighbor Classifiers: A Review. In: Chiplunkar, N.N., Fukao, T. (eds) Advances in Artificial Intelligence and Data Engineering. AIDE 2019. Advances in Intelligent Systems and Computing, vol 1133. Springer, Singapore. https://doi.org/10.1007/978-981-15-3514-7_43
