A study of the effect of different types of noise on the precision of supervised learning techniques

Artificial Intelligence Review 33, 275–306 (2010)

Abstract

Machine learning techniques often have to deal with noisy data, which may affect the accuracy of the resulting data models. Therefore, effectively dealing with noise is a key aspect of supervised learning to obtain reliable models from data. Although several authors have studied the effect of noise on particular learners, comparisons of its effect across different learners are lacking. In this paper, we address this issue by systematically comparing how different degrees of noise affect four supervised learners that belong to different paradigms: the Naïve Bayes probabilistic classifier, the C4.5 decision tree, the IBk instance-based learner and the SMO support vector machine. These four methods allow us to contrast different learning paradigms, and all four are considered among the top ten algorithms in data mining (Wu et al. 2007). We test them on a collection of data sets that are perturbed with noise in the input attributes and noise in the output class. As an initial hypothesis, we assign the techniques to two groups, NB with C4.5 and IBk with SMO, based on their expected sensitivity to noise, the first group being the less sensitive. The analysis enables us to extract key observations about the effect of different types and degrees of noise on these learning techniques. In general, Naïve Bayes appears to be the most robust algorithm and SMO the least, relative to the other two techniques. However, the underlying empirical behavior of the techniques is more complex, and varies depending on the noise type and the specific data set being processed. Overall, noise in the training data set is found to give the learners the most difficulty.
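To make the experimental protocol concrete, the sketch below illustrates the kind of noise-injection loop the abstract describes: each data set is perturbed with increasing levels of attribute noise and class noise, and the accuracy of each of the four learners is measured at each level. This is a minimal sketch, not the authors' implementation (the learner names IBk and SMO suggest the WEKA toolkit of Hall et al. 2009); it assumes scikit-learn stand-ins for the four methods, and the data set, noise models and noise levels are illustrative.

```python
# Hypothetical sketch of the noise-injection protocol; the noise models,
# data set and levels are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB          # stand-in for Naive Bayes
from sklearn.tree import DecisionTreeClassifier     # stand-in for C4.5
from sklearn.neighbors import KNeighborsClassifier  # stand-in for IBk
from sklearn.svm import SVC                         # stand-in for SMO

rng = np.random.default_rng(0)

def add_class_noise(y, level):
    """Relabel a random fraction `level` of the examples with a different class."""
    y = y.copy()
    flip = rng.choice(len(y), size=int(level * len(y)), replace=False)
    labels = np.unique(y)
    for i in flip:
        y[i] = rng.choice(labels[labels != y[i]])  # pick any other class
    return y

def add_attribute_noise(X, level):
    """Perturb a random fraction `level` of the attribute values with Gaussian
    noise scaled to each attribute's standard deviation."""
    X = X.copy()
    mask = rng.random(X.shape) < level
    X[mask] += rng.normal(0.0, X.std(axis=0), X.shape)[mask]
    return X

X, y = load_iris(return_X_y=True)
learners = {
    "NB":   GaussianNB(),
    "C4.5": DecisionTreeClassifier(random_state=0),
    "IBk":  KNeighborsClassifier(),
    "SMO":  SVC(),
}

for level in (0.0, 0.1, 0.2, 0.3):  # increasing degrees of noise
    X_noisy = add_attribute_noise(X, level)  # noise in the input attributes
    y_noisy = add_class_noise(y, level)      # noise in the output class
    for name, clf in learners.items():
        acc = cross_val_score(clf, X_noisy, y_noisy, cv=10).mean()
        print(f"noise={level:.1f}  {name}: mean accuracy {acc:.3f}")
```

Under a loop of this shape, the paper's two-group hypothesis would predict the NB and C4.5 accuracy curves to degrade more slowly with the noise level than the IBk and SMO curves; the abstract reports that this pattern holds only partially in practice.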


References

  • Aha DW, Kibler D, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6: 37–66

  • Aha DW (1992) Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms. Int J Man–Mach Stud 36: 267–287

  • Angluin D, Laird P (1988) Learning from noisy examples. Mach Learn 2(4): 343–370

  • Asuncion A, Newman DJ (2007) UCI repository of machine learning databases. University of California, Irvine. Available by anonymous ftp to ics.uci.edu in the pub/machine-learning-databases directory

  • Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32: 675–701

  • Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11: 86–92

  • Fürnkranz J (1997) Noise-tolerant windowing. In: Proceedings of the 15th international joint conference on artificial intelligence (IJCAI-97), Nagoya, Japan. Morgan Kaufmann, pp 852–857

  • Goldman SA, Sloan RH (1995) Can PAC learning algorithms tolerate random attribute noise? Algorithmica 14(1): 70–84

  • Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor 11(1): 10–18

  • Hunt EB, Martin J, Stone P (1966) Experiments in induction. Academic Press, New York

  • John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence, San Mateo, pp 338–345

  • Kearns M (1998) Efficient noise-tolerant learning from statistical queries. J ACM 45(6): 983–1006

  • Meeson S, Blott BH, Killingback ALT (1996) EIT data noise evaluation in the clinical environment. Physiol Meas 17: A33–A38

  • Nelson R (2005) Overcoming noise in data-acquisition systems (webcast). Test Meas World. http://www.tmworld.com/article/319648-Overcomming_noise_in_data_acquisition_systems.php

  • Nettleton D, Torra V (2001) A comparison of active set method and genetic algorithm approaches for learning weighting vectors in some aggregation operators. Int J Intell Syst 16(9): 1069–1083

  • Nettleton D, Muñiz J (2001) Processing and representation of meta-data for sleep apnea diagnosis with an artificial intelligence approach. Int J Med Inform 63(1–2): 77–89

  • Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods: support vector learning, Chap 12. MIT Press, pp 169–185

  • Quinlan JR (1986) Induction of decision trees. Mach Learn 1: 81–106

  • Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo

  • Sloan R (1988) Types of noise in data for concept learning. In: Proceedings of the first annual workshop on computational learning theory (COLT '88). ACM, pp 91–96

  • Sloan RH (1995) Four types of noise in data for PAC learning. Inform Process Lett 54(3): 157–162

  • Torra V (1997) The weighted OWA operator. Int J Intell Syst 12(2): 153–166

  • Vapnik VN (1995) The nature of statistical learning theory. Springer-Verlag, New York

  • Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco

  • Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou ZH, Steinbach M, Hand DJ, Steinberg D (2007) Top 10 algorithms in data mining. Knowl Inform Syst 14(1): 1–37

  • Zhu X, Wu X, Chen Q (2003) Eliminating class noise in large datasets. In: Proceedings of the 20th international conference on machine learning (ICML 2003), Washington, DC, pp 920–927

  • Zhu X, Wu X (2004) Class noise vs. attribute noise: a quantitative study of their impacts. Artif Intell Rev 22: 177–210

Author information

Corresponding author

Correspondence to David F. Nettleton.

About this article

Cite this article

Nettleton, D.F., Orriols-Puig, A. & Fornells, A. A study of the effect of different types of noise on the precision of supervised learning techniques. Artif Intell Rev 33, 275–306 (2010). https://doi.org/10.1007/s10462-010-9156-z
