Abstract
If we lack relevant problem-specific knowledge, cross-validation methods may be used to select a classification method empirically. We examine this idea here to show in what senses cross-validation does and does not solve the selection problem. As illustrated empirically, cross-validation may lead to higher average performance than application of any single classification strategy, and it also cuts the risk of poor performance. On the other hand, cross-validation is no more or less a form of bias than simpler strategies, and applying it appropriately ultimately depends in the same way on prior knowledge. In fact, cross-validation may be seen as a way of applying partial information about the applicability of alternative classification strategies.
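As a rough illustration of the selection procedure the abstract describes, the sketch below estimates the accuracy of each candidate method by k-fold cross-validation and keeps the one with the best estimate. It is a minimal sketch, not the experimental setup of the article: the two candidate methods (a majority-class rule and a 1-nearest-neighbor rule), the toy data sets, and all function names are assumptions made for the example.

```python
from collections import Counter

def majority_classifier(train):
    """Hypothetical baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def nearest_neighbor_classifier(train):
    """Hypothetical 1-nearest-neighbor rule with squared Euclidean distance."""
    def predict(x):
        _, y = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
        return y
    return predict

def cv_accuracy(learner, data, k):
    """Estimate the accuracy of `learner` on `data` by k-fold cross-validation."""
    correct = 0
    for fold in range(k):
        # Example i belongs to fold i % k; train on every other fold.
        test = data[fold::k]
        train = [ex for i, ex in enumerate(data) if i % k != fold]
        predict = learner(train)
        correct += sum(predict(x) == y for x, y in test)
    return correct / len(data)

def select_by_cv(learners, data, k=5):
    """Return the name of the learner with the highest cross-validated accuracy."""
    scores = {name: cv_accuracy(fn, data, k) for name, fn in learners.items()}
    best = max(scores, key=scores.get)
    return best, scores

LEARNERS = {"majority": majority_classifier, "1-nn": nearest_neighbor_classifier}

# Toy problem where the label is determined by the feature.
structured = [((float(i),), int(i >= 10)) for i in range(20)]
# Toy problem where a few isolated labels behave like noise.
noisy = [((float(i),), 0 if i in (2, 9, 16) else 1) for i in range(20)]

best_structured, scores_structured = select_by_cv(LEARNERS, structured)
best_noisy, scores_noisy = select_by_cv(LEARNERS, noisy)
print(best_structured, scores_structured)
print(best_noisy, scores_noisy)
```

On these two toy problems the procedure picks a different method each time, which is the sense in which cross-validation can outperform committing to a single strategy. The choice of candidates and of k still has to come from the user, which reflects the abstract's point that the approach depends on prior knowledge.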
Cite this article
Schaffer, C. Selecting a classification method by cross-validation. Mach Learn 13, 135–143 (1993). https://doi.org/10.1007/BF00993106
DOI: https://doi.org/10.1007/BF00993106