Abstract
Selecting the most appropriate learning algorithm for a given task has become a crucial research issue since the advent of multi-paradigm data mining tool suites. To address this issue, researchers have tried to extract dataset characteristics which might provide clues as to the most appropriate learning algorithm. We propose to extend this research by extracting inducer profiles, i.e., sets of metalevel features which characterize learning algorithms from the point of view of their representation and functionality, efficiency, practicality, and resilience. Values for these features can be determined on the basis of author specifications, expert consensus or previous case studies. However, there is a need to characterize learning algorithms in more quantitative terms on the basis of extensive, controlled experiments. This paper illustrates the proposed approach and reports empirical findings on one resilience-related characteristic of learning algorithms for classification, namely their tolerance to irrelevant variables in training data.
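The resilience experiment sketched in the abstract can be illustrated with a small simulation. This is a hedged sketch, not the authors' actual protocol: it pads a synthetic classification dataset with varying numbers of random (irrelevant) features and tracks the held-out accuracy of a distance-based learner, whose degradation gives a crude tolerance profile. The dataset sizes, the k-NN learner, and the noise model are illustrative assumptions.

```python
# Sketch: estimate a learner's tolerance to irrelevant variables by
# appending pure-noise features and measuring held-out accuracy.
# All parameter choices below are illustrative, not from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def accuracy_with_noise(n_irrelevant, seed=0):
    # Base task: 5 informative features, no redundant ones.
    X, y = make_classification(n_samples=400, n_features=5,
                               n_informative=5, n_redundant=0,
                               random_state=seed)
    if n_irrelevant:
        # Irrelevant variables: Gaussian noise uncorrelated with y.
        noise = np.random.default_rng(seed).normal(
            size=(X.shape[0], n_irrelevant))
        X = np.hstack([X, noise])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)


if __name__ == "__main__":
    # Accuracy as irrelevant features are added; for k-NN the drop is
    # typically visible because noise dimensions dilute the distances.
    for k in (0, 5, 20):
        print(f"{k:2d} irrelevant features -> accuracy "
              f"{accuracy_with_noise(k):.3f}")
```

Averaging such curves over many datasets and seeds, and repeating for each learner, would yield one quantitative resilience characteristic of the kind the abstract proposes.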
© 2000 Springer-Verlag Berlin Heidelberg
Hilario, M., Kalousis, A. (2000). Quantifying the Resilience of Inductive Classification Algorithms. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science(), vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_11
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7