Abstract
For two-class datasets, we provide a method for estimating the generalization error of a bag using out-of-bag estimates. In bagging, each predictor (single hypothesis) is learned from a bootstrap sample of the training examples; the output of a bag (a set of predictors) on an example is determined by voting. The out-of-bag estimate is based on recording the votes of each predictor on those training examples omitted from its bootstrap sample. Because no additional predictors are generated, the out-of-bag estimate requires considerably less time than 10-fold cross-validation. We address the question of how to use the out-of-bag estimate to estimate generalization error on two-class datasets. Our experiments on several datasets show that the out-of-bag estimate and 10-fold cross-validation have similar performance, but are both biased. We can eliminate most of the bias in the out-of-bag estimate and increase accuracy by incorporating a correction based on the distribution of the out-of-bag votes.
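The out-of-bag bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the base learner `train_stump` (a one-dimensional threshold classifier), the function name `oob_error`, the tie-breaking rule, and the toy dataset are all hypothetical choices made for this example. The sketch computes only the uncorrected out-of-bag estimate; the correction based on the distribution of out-of-bag votes is not reproduced here.

```python
import random


def train_stump(sample):
    """Fit a 1-D threshold classifier (hypothetical stand-in for the base learner).

    sample is a list of (x, y) pairs with y in {0, 1}.
    """
    best = None
    for t in sorted({x for x, _ in sample}):
        for sign in (0, 1):
            # Predict `sign` when x >= t, the other class otherwise.
            errors = sum(1 for x, y in sample
                         if (sign if x >= t else 1 - sign) != y)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: sign if x >= t else 1 - sign


def oob_error(data, n_predictors=50, seed=0):
    """Return (out-of-bag error estimate, per-example out-of-bag vote counts)."""
    rng = random.Random(seed)
    n = len(data)
    votes = [[0, 0] for _ in range(n)]  # votes[i][c]: OOB votes for class c
    for _ in range(n_predictors):
        # Bootstrap sample: n indices drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        predictor = train_stump([data[i] for i in idx])
        # Record this predictor's vote only on examples it never saw.
        for i in range(n):
            if i not in in_bag:
                votes[i][predictor(data[i][0])] += 1
    mistakes = counted = 0
    for (_, y), (v0, v1) in zip(data, votes):
        if v0 + v1 == 0:
            continue  # example appeared in every bootstrap sample
        counted += 1
        predicted = 1 if v1 > v0 else 0  # ties go to class 0 (arbitrary)
        mistakes += predicted != y
    return mistakes / counted, votes


# Toy two-class dataset (assumed for illustration): class 1 iff x >= 10.
data = [(x, 1 if x >= 10 else 0) for x in range(20)]
err, votes = oob_error(data)
print(f"out-of-bag error estimate: {err:.3f}")
```

Because every predictor is trained once and merely queried on the examples left out of its bootstrap sample, the estimate comes from the same predictors that make up the bag, which is why no additional training runs are needed.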
Cite this article
Bylander, T. Estimating Generalization Error on Two-Class Datasets Using Out-of-Bag Estimates. Machine Learning 48, 287–297 (2002). https://doi.org/10.1023/A:1013964023376
DOI: https://doi.org/10.1023/A:1013964023376