Abstract
Identifying typical user behaviour within a web application is a crucial prerequisite for revealing user characteristics, preferences and habits. Typical and repeating features of a user's behaviour during interaction with a web application can be generalized as behavioural patterns. In this paper, we propose HyBPMine, a novel method for behavioural pattern mining over a data stream. The method combines general global patterns (typical for a large number of users) with specific patterns (typical for dynamically identified groups of similar users). We represent the patterns as frequent closed itemsets of items visited by users in their sessions. Behavioural patterns are often used for personalization, prediction or recommendation. We therefore evaluated the performance of our method indirectly, by applying the discovered patterns in personalized recommendation; in other words, we recommended the next user actions within the current user session. We performed several experiments over data from the e-learning and news domains. Our results show that the proposed method reaches higher precision than its components used individually, as well as higher precision than state-of-the-art approaches. In addition, the inclusion of group patterns adds only a low and constant computational load, which does not significantly increase resource requirements.
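The notions of frequent closed itemsets and of combining global with group patterns can be illustrated with a minimal sketch. This is not the HyBPMine implementation: it enumerates itemsets by brute force over a small batch rather than mining a stream, the user group is picked by hand rather than identified dynamically, and the scoring rule in `recommend` is an assumption made only for illustration. All function names and the toy sessions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(sessions, min_support):
    """Count every non-empty subset of each session (brute force, toy data only)."""
    counts = Counter()
    for session in sessions:
        items = sorted(set(session))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < t and c == ct for t, ct in frequent.items())
    }


def recommend(current, global_patterns, group_patterns, n=2):
    """Assumed scoring rule: each pattern sharing an item with the current
    session votes for its unseen items with a weight equal to its support."""
    current = set(current)
    scores = Counter()
    for patterns in (global_patterns, group_patterns):
        for itemset, support in patterns.items():
            if itemset & current:
                for item in itemset - current:
                    scores[item] += support
    # Sort by descending score, then by name so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Hypothetical sessions; the last two are assumed to belong to one user group.
all_sessions = [
    ["home", "course", "quiz"],
    ["home", "course"],
    ["home", "news"],
    ["course", "quiz", "forum"],
    ["course", "quiz", "forum"],
]
group_sessions = all_sessions[3:]

global_patterns = closed_itemsets(frequent_itemsets(all_sessions, min_support=2))
group_patterns = closed_itemsets(frequent_itemsets(group_sessions, min_support=2))

print(recommend(["course"], global_patterns, group_patterns))  # ['quiz', 'forum']
```

In this toy run the group pattern {course, quiz, forum} raises the score of "forum" above "home", which the global patterns alone would rank equally. That is the intuition behind adding group-specific patterns to global ones.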
Notes
https://moa.cms.waikato.ac.nz/
Acknowledgements
This work was partially supported by the project Human Information Behavior in the Digital Space, funded by the Slovak Research and Development Agency, No. APVV-15-0508; by the project Adaptation of access to information and knowledge artefacts based on interaction and collaboration within web environment, funded by the Scientific Grant Agency of the SR, No. VG 1/0646/15 and No. VG 1/0667/18; and by the Ministry of Education, Science, Research and Sport of the SR within the Research and Development Operational Programme for the project “University Science Park of STU Bratislava”, ITMS 26240220084, co-funded by the ERDF, and within the Research and Development Operational Programme for the project International centre of excellence for research of intelligent and secure information-communication technologies and systems, ITMS 26240120039, co-funded by the ERDF.
About this article
Cite this article
Chovanak, T., Kassak, O., Kompan, M. et al. Fast Streaming Behavioural Pattern Mining. New Gener. Comput. 36, 365–391 (2018). https://doi.org/10.1007/s00354-018-0044-4
DOI: https://doi.org/10.1007/s00354-018-0044-4