
Combining Bagging and Random Subspaces to Create Better Ensembles

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4723)

Abstract

Random forests are among the best performing methods for constructing ensembles. They derive their strength from two aspects: using random subsamples of the training data (as in bagging) and randomizing the algorithm for learning base-level classifiers (decision trees). The randomized base-level algorithm selects a random subset of the features at each step of tree construction and chooses the best feature among these. We propose to combine bagging with the random subspace method to achieve a similar effect. The random subspace method selects a random subset of the features once, at the start, and then applies a deterministic version of the base-level algorithm; this yields randomization comparable to that of the randomized algorithm. Our experiments show that the proposed approach performs comparably to random forests, with the added advantage of being applicable to any base-level algorithm, without the need to randomize that algorithm.
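The scheme lends itself to a simple implementation: draw a bootstrap sample of the training instances, restrict it to a random feature subspace, and train an unmodified base-level learner on the result. As a minimal sketch (not the authors' implementation, and their experimental setup is not reproduced here), scikit-learn's BaggingClassifier can combine both kinds of randomization in one wrapper; the base learner, ensemble size, and the 75% subspace fraction below are illustrative assumptions.

```python
# Sketch of the proposed ensemble: bootstrap sampling of instances (bagging)
# plus a random feature subspace per base learner, with a deterministic
# base-level algorithm. Parameter values are illustrative only.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

ensemble = BaggingClassifier(
    DecisionTreeClassifier(),  # any deterministic base-level algorithm fits here
    n_estimators=100,          # number of ensemble members
    bootstrap=True,            # sample instances with replacement, as in bagging
    max_samples=1.0,           # each bootstrap sample has the training-set size
    max_features=0.75,         # each learner sees a random 75% feature subspace
    bootstrap_features=False,  # features are drawn without replacement
    random_state=0,
)

print("10-fold CV accuracy: %.3f" % cross_val_score(ensemble, X, y, cv=10).mean())
```

Because the feature subset is fixed before training rather than re-drawn at every split, the same wrapper applies unchanged to rule learners, nearest-neighbour methods, or any other base-level algorithm, which is the advantage claimed over random forests.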





Editor information

Michael R. Berthold, John Shawe-Taylor, Nada Lavrač


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Panov, P., Džeroski, S. (2007). Combining Bagging and Random Subspaces to Create Better Ensembles. In: Berthold, M.R., Shawe-Taylor, J., Lavrač, N. (eds.) Advances in Intelligent Data Analysis VII. IDA 2007. Lecture Notes in Computer Science, vol. 4723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74825-0_11


  • DOI: https://doi.org/10.1007/978-3-540-74825-0_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74824-3

  • Online ISBN: 978-3-540-74825-0

  • eBook Packages: Computer Science, Computer Science (R0)
