Abstract
We are living in a data age, and with the expansion of the Internet of Things, the number of devices connected to the Internet has surged. Everything from smart sensors to smartphones and tablets, as well as systems installed in manufacturing units, hospitals, and vehicles, generates data. These technological developments have escalated data generation and require analysis of the raw data to identify patterns. Data mining techniques are deployed extensively to extract information, and they have far-reaching effects on trade and on the lives of the people concerned. Their accuracy and effectiveness in providing better outcomes and cost-effective methods across various domains are well established. In supervised learning, instance-based classifiers such as the k-nearest neighbor (kNN) classifier typically rely on density estimation. In this paper, the regular kNN classifier is compared conceptually with various classifiers, and ARSkNN, which uses mass estimation instead, is shown to be commensurate with kNN in accuracy while drastically reducing computation time on the datasets chosen for this analysis. Tenfold cross-validation is used for testing.
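The density-based kNN classification and tenfold cross-validation protocol described above can be sketched as follows. This is a minimal stdlib-only illustration on synthetic data, not the authors' implementation; the names `knn_predict` and `tenfold_accuracy` and the toy dataset are hypothetical.

```python
import random
from collections import Counter

def knn_predict(train, query, k=3):
    # Classify `query` by majority vote among its k nearest training points.
    # `train` is a list of (features, label) pairs; squared Euclidean distance
    # is used, since only the neighbor ordering matters.
    neighbors = sorted(
        train,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def tenfold_accuracy(data, k=3, folds=10):
    # Estimate accuracy by tenfold cross-validation: each point is held out
    # exactly once and classified using the remaining nine folds.
    data = list(data)
    random.Random(0).shuffle(data)
    correct = 0
    for i in range(folds):
        test = data[i::folds]  # indices j with j % folds == i are held out
        train = [p for j, p in enumerate(data) if j % folds != i]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

# Toy two-class dataset: points clustered around (0, 0) and (3, 3).
rng = random.Random(1)
data = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), 0) for _ in range(50)] + \
       [((rng.gauss(3, 0.5), rng.gauss(3, 0.5)), 1) for _ in range(50)]
print(tenfold_accuracy(data, k=3))
```

Because the two clusters are well separated, the cross-validated accuracy here is close to 1.0; the same harness would apply unchanged to a classifier with a different similarity measure, which is the substitution ARSkNN makes with mass estimation.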
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
Cite this paper
Agarwal, Y., Poornalatha, G. (2021). Analysis of the Nearest Neighbor Classifiers: A Review. In: Chiplunkar, N.N., Fukao, T. (eds) Advances in Artificial Intelligence and Data Engineering. AIDE 2019. Advances in Intelligent Systems and Computing, vol 1133. Springer, Singapore. https://doi.org/10.1007/978-981-15-3514-7_43
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3513-0
Online ISBN: 978-981-15-3514-7
eBook Packages: Intelligent Technologies and Robotics