Abstract
Stacking regressions is a method for forming linear combinations of different predictors to give improved prediction accuracy. The idea is to use cross-validation data and least squares under non-negativity constraints to determine the coefficients in the combination. Its effectiveness is demonstrated in stacking regression trees of different sizes and in a simulation stacking linear subset and ridge regressions. Reasons why this method works are explored. The idea of stacking originated with Wolpert (1992).
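The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation. The level-0 predictors are ridge regressions with three arbitrary penalties, the data are synthetic, and the names `ridge_fit`, `cv_predictions`, `nnls_projected_gradient` and `stack` are hypothetical. The non-negative least squares step uses plain projected gradient descent for brevity; a dedicated solver such as `scipy.optimize.nnls` would normally be preferable.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Fit a ridge regression with intercept; return a prediction function."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]),
                           Xc.T @ (y - y_mean))
    return lambda Xnew: y_mean + (Xnew - x_mean) @ beta

def cv_predictions(X, y, lams, n_folds=10, seed=0):
    """Column k holds the out-of-fold predictions of the k-th predictor."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Z = np.empty((n, len(lams)))
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for k, lam in enumerate(lams):
            Z[test_idx, k] = ridge_fit(X[train], y[train], lam)(X[test_idx])
    return Z

def nnls_projected_gradient(Z, y, n_iter=5000):
    """Approximately minimise ||y - Z a||^2 subject to a >= 0."""
    G, c = Z.T @ Z, Z.T @ y
    step = 1.0 / np.linalg.eigvalsh(G)[-1]  # 1 / largest eigenvalue of G
    a = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        # Gradient step on 0.5 * ||y - Z a||^2, then clip to a >= 0.
        a = np.maximum(0.0, a - step * (G @ a - c))
    return a

def stack(X, y, lams, n_folds=10):
    """Return non-negative stacking weights and the combined predictor."""
    Z = cv_predictions(X, y, lams, n_folds)
    alpha = nnls_projected_gradient(Z, y)
    models = [ridge_fit(X, y, lam) for lam in lams]  # refit on all the data
    predict = lambda Xnew: sum(a * m(Xnew) for a, m in zip(alpha, models))
    return alpha, predict

# Synthetic example (assumed data, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -2.0]) + rng.normal(size=200)
alpha, predict = stack(X, y, lams=[0.1, 10.0, 1000.0])
print("stacking weights:", alpha)
```

The weights are fitted to out-of-fold predictions rather than in-sample fits, so a predictor that merely overfits the training data is not rewarded. The non-negativity constraint keeps the combination from taking large offsetting weights on highly correlated predictors.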
References
Belsley, D.A., Kuh, E. and Welsch, R., “Regression Diagnostics,” 1980, John Wiley and Sons, New York.
Berger, J.O. and Bock, M.E., “Combining independent normal mean estimation problems with unknown variances,” Ann. Statist. 4, 1976, pp. 642–648.
Breiman, L., Friedman, J., Olshen, R. and Stone, C., “Classification and Regression Trees,” 1984, Wadsworth, California.
Breiman, L. and Friedman, J.H., “Estimating Optimal Transformations in Multiple Regression and Correlation (with discussion),” J. Amer. Statist. Assoc., 80, 1985, pp. 580–619.
Breiman, L. and Spector, P., “Submodel Selection and Evaluation-X Random Case,” International Statistical Review, 3, 1992, pp. 291–319.
Efron, B. and Morris, C., “Combining possibly related estimation problems (with discussion),” J. Roy. Statist. Soc. Ser. B, 35, 1973, pp. 379–421.
Green, E.J. and Strawderman, W.E., “A James-Stein type estimator for combining unbiased and possibly biased estimators,” J. Amer. Statist. Assoc., 86, 1991, pp. 1001–1006.
Hoerl, A.E. and Kennard, R.W., “Ridge regression: Biased estimation for nonorthogonal problems,” Technometrics, 12, 1970, pp. 55–67.
Lawson, C. and Hanson, R., “Solving Least Squares Problems,” 1974, Prentice-Hall, New Jersey.
Luenberger, D., “Linear and Nonlinear Programming,” 1984, Addison-Wesley Publishing Co.
Le Blanc, M. and Tibshirani, R., “Combining Estimates in Regression and Classification,” Technical Report 9318, 1993, Dept. of Statistics, University of Toronto.
Perrone, M.P., “General Averaging Results for Convex Optimization,” Proceedings of the 1993 Connectionist Models Summer School, Erlbaum Associates, 1994, pp. 364–371.
Rao, J.N.K. and Subrahmaniam, K., “Combining independent estimators and estimation in linear regression with unequal variances,” Biometrics, 27, 1971, pp. 971–990.
Rubin, D.B. and Weisberg, S., “The variance of a linear combination of independent estimators using estimated weights,” Biometrika, 62, 1975, pp. 708–709.
Wolpert, D., “Stacked Generalization,” Neural Networks, Vol. 5, 1992, pp. 241–259.
Cite this article
Breiman, L. Stacked regressions. Mach Learn 24, 49–64 (1996). https://doi.org/10.1007/BF00117832