Abstract
Selecting the most appropriate learning algorithm for a given task has become a crucial research issue since the advent of multi-paradigm data mining tool suites. To address this issue, researchers have tried to extract dataset characteristics which might provide clues as to the most appropriate learning algorithm. We propose to extend this research by extracting inducer profiles, i.e., sets of metalevel features which characterize learning algorithms from the point of view of their representation and functionality, efficiency, practicality, and resilience. Values for these features can be determined on the basis of author specifications, expert consensus or previous case studies. However, there is a need to characterize learning algorithms in more quantitative terms on the basis of extensive, controlled experiments. This paper illustrates the proposed approach and reports empirical findings on one resilience-related characteristic of learning algorithms for classification, namely their tolerance to irrelevant variables in training data.
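The resilience experiment sketched in the abstract can be illustrated with a small simulation. This is a hedged sketch, not the authors' actual protocol: it pads a synthetic classification dataset with varying numbers of random (irrelevant) features and tracks the held-out accuracy of a distance-based learner, whose degradation gives a crude tolerance profile. The dataset sizes, the k-NN learner, and the noise model are illustrative assumptions.

```python
# Sketch: estimate a learner's tolerance to irrelevant variables by
# appending pure-noise features and measuring held-out accuracy.
# All parameter choices below are illustrative, not from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def accuracy_with_noise(n_irrelevant, seed=0):
    # Base task: 5 informative features, no redundant ones.
    X, y = make_classification(n_samples=400, n_features=5,
                               n_informative=5, n_redundant=0,
                               random_state=seed)
    if n_irrelevant:
        # Irrelevant variables: Gaussian noise uncorrelated with y.
        noise = np.random.default_rng(seed).normal(
            size=(X.shape[0], n_irrelevant))
        X = np.hstack([X, noise])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)


if __name__ == "__main__":
    # Accuracy as irrelevant features are added; for k-NN the drop is
    # typically visible because noise dimensions dilute the distances.
    for k in (0, 5, 20):
        print(f"{k:2d} irrelevant features -> accuracy "
              f"{accuracy_with_noise(k):.3f}")
```

Averaging such curves over many datasets and seeds, and repeating for each learner, would yield one quantitative resilience characteristic of the kind the abstract proposes.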
© 2000 Springer-Verlag Berlin Heidelberg
Hilario, M., Kalousis, A. (2000). Quantifying the Resilience of Inductive Classification Algorithms. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science(), vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_11
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7