Abstract
Textual data streams are common in real-world applications in which online users express their opinions about products. Mining such streams is challenging because the data distribution changes over time, a phenomenon widely known as concept drift. Most existing classification methods incorporate drift detection methods that depend on classification errors. However, these methods are prone to high false-positive or missed-detection rates. Thus, there is a need for more sensitive detection methods that can detect as many drifts in the data stream as possible in order to improve classification accuracy. In this paper, we present a drift detection-based adaptive windowing for ensemble classifiers, an adaptive unsupervised learning algorithm for sentiment classification and opinion mining. The proposed algorithm employs four different dissimilarity measures to quantify the magnitude of concept drift in data streams and thereby improve classification performance. A series of experiments was conducted on real-world datasets, and the results demonstrate the efficiency of the proposed model.
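To illustrate how a dissimilarity measure can quantify drift between two windows of text without using labels, the following is a minimal sketch. It is not the authors' implementation. The abstract does not name the four measures, so the Hellinger distance is an assumption chosen for illustration, and the function names `term_distribution`, `hellinger`, and `drift_score` are hypothetical.

```python
import math
from collections import Counter


def term_distribution(docs, vocab):
    """Relative term frequencies of `docs` over a fixed vocabulary."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    total = sum(counts[t] for t in vocab)
    if total == 0:
        # No observed terms: fall back to a uniform distribution.
        return {t: 1.0 / len(vocab) for t in vocab}
    return {t: counts[t] / total for t in vocab}


def hellinger(p, q):
    """Hellinger distance between two distributions over the same keys.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    squared = sum((math.sqrt(p[t]) - math.sqrt(q[t])) ** 2 for t in p)
    return math.sqrt(squared / 2.0)


def drift_score(reference_docs, current_docs):
    """Dissimilarity between a reference window and the current window.

    Assumption: whitespace tokenisation and a vocabulary built from
    both windows; a real system would use a proper text pipeline.
    """
    vocab = sorted(
        {tok for doc in reference_docs + current_docs for tok in doc.lower().split()}
    )
    p = term_distribution(reference_docs, vocab)
    q = term_distribution(current_docs, vocab)
    return hellinger(p, q)


if __name__ == "__main__":
    old_window = ["great battery life", "great screen"]
    new_window = ["battery drains fast", "screen cracked"]
    print(round(drift_score(old_window, new_window), 3))
```

A detector built on such a score would compare it against a threshold and shrink or reset the window when the threshold is exceeded. How the threshold is chosen and how the window adapts are not specified in the abstract and are left out of this sketch.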
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rabiu, I. et al. (2022). Ensemble Method for Online Sentiment Classification Using Drift Detection-Based Adaptive Window Method. In: Saeed, F., Mohammed, F., Ghaleb, F. (eds) Advances on Intelligent Informatics and Computing. IRICT 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 127. Springer, Cham. https://doi.org/10.1007/978-3-030-98741-1_11
DOI: https://doi.org/10.1007/978-3-030-98741-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98740-4
Online ISBN: 978-3-030-98741-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)