research-article

Learning regularized Hoeffding trees from data streams

Authors:
Jean Paul Barddal

Pontifícia Universidade Católica do Paraná (PUC-PR), Curitiba, Brazil

Fabrício Enembreck

Pontifícia Universidade Católica do Paraná (PUC-PR), Curitiba, Brazil

SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, April 2019, Pages 574–581, https://doi.org/10.1145/3297280.3297334

Published: 08 April 2019

ABSTRACT

Learning from data streams is an active topic in machine learning that targets the learning and updating of predictive models as data becomes available for both training and querying. Owing to their simplicity and convincing results in a multitude of applications, Hoeffding Trees are, by far, the most widely used family of methods for learning decision trees from streaming data. Despite these strengths, Hoeffding Trees tend to keep growing in number of nodes as new data becomes available, i.e., they eventually split on all available features, and multiple times on the same feature, which leads to unnecessary complexity. With this behavior, Hoeffding Trees stop being human-understandable and computationally efficient. To tackle these issues, we propose a regularization scheme for Hoeffding Trees that (i) uses a penalty factor to control the gain obtained by creating a new split node using a feature that has not been used thus far; and (ii) uses information from previous splits in the current branch to determine whether the gain observed indeed justifies a new split. The proposed scheme is combined with both standard and adaptive variants of Hoeffding Trees. Experiments using real-world data and both stationary and drifting synthetic data show that the proposed method prevents both original and adaptive Hoeffding Trees from growing unnecessarily while maintaining impressive accuracy rates. As a byproduct of the regularization process, significant improvements in processing time, model complexity, and memory consumption were also observed, which shows the effectiveness of the proposed regularization scheme.
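
To make point (i) more concrete, the following is a minimal sketch of how a penalty factor could discount the gain of a feature that has not yet been used before the usual Hoeffding-bound split test is applied. It is not the authors' implementation: the function names, the multiplicative form of the penalty, and the default values of `penalty`, `delta`, and `value_range` are assumptions chosen for illustration, and point (ii) is not modeled here.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1 / delta) / (2 * n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def penalized_gain(gain, feature, used_features, penalty):
    """Discount the gain of a feature that has not been used so far.

    Assumption: the penalty is a multiplicative factor in (0, 1].
    """
    return gain if feature in used_features else penalty * gain


def choose_split(gains, used_features, n, penalty=0.5, delta=1e-7, value_range=1.0):
    """Return the feature to split on, or None if no split is justified.

    gains: dict mapping feature name -> raw split gain observed at a leaf.
    used_features: set of features already used by the tree (hypothetical
    bookkeeping; the abstract does not say how this set is maintained).
    n: number of instances observed at the leaf.
    """
    if len(gains) < 2:
        return None
    # Rank candidates by their penalized gain, best first.
    scored = sorted(
        ((penalized_gain(g, f, used_features, penalty), f) for f, g in gains.items()),
        reverse=True,
    )
    (best_gain, best_feature), (second_gain, _) = scored[0], scored[1]
    # Split only when the best candidate beats the runner-up by more than epsilon.
    epsilon = hoeffding_bound(value_range, delta, n)
    return best_feature if best_gain - second_gain > epsilon else None
```

With `gains = {"a": 0.30, "b": 0.25}`, `used_features = {"b"}`, and `n = 1000`, the penalized gain of the unused feature `a` drops to 0.15, so the already-used feature `b` wins the split test; with `penalty=1.0` (no regularization) the two raw gains are too close for the bound and no split is made.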

Index Terms

Learning regularized Hoeffding trees from data streams
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Learning settings
      1. Online learning settings

Published in
SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
April 2019
2682 pages
ISBN: 9781450359337
DOI: 10.1145/3297280
Conference Chairs:
Chih-Cheng Hung, Kennesaw State University, Marietta, Georgia
George A. Papadopoulos, University of Cyprus, Nicosia, Cyprus
Copyright © 2019 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
concept drift
data stream mining
decision tree
regularization
Acceptance Rates
Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%