
Fast Streaming Behavioural Pattern Mining

  • Research Paper
New Generation Computing

Abstract

Identifying typical user behaviour within a web application is a crucial prerequisite for revealing user characteristics, preferences and habits. Typical, repeating features of users' interaction with a web application can be generalized as behavioural patterns. In this paper, we propose HyBPMine, a novel method for behavioural pattern mining over a data stream. The method combines general global patterns (typical for a high number of users) with specific patterns typical for dynamically identified groups of similar users. We represent the patterns as frequent closed itemsets of items visited by users in their sessions. Behavioural patterns are often used for personalization, prediction or recommendation. We therefore evaluated the method indirectly, by applying the discovered patterns in personalized recommendation: we recommended the next user actions within the current user session. We performed several experiments over data from the e-learning and news domains. Our results show that the proposed method reaches higher precision than its components used individually, as well as higher precision than state-of-the-art approaches. In addition, the inclusion of group patterns brings only a low and constant computational load, which does not significantly increase resource requirements.
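
To make the terms in the abstract concrete, the following minimal Python sketch mines frequent closed itemsets from a handful of sessions and then merges global and group patterns to rank candidate next items. It is an illustration only and not the HyBPMine algorithm: it works on a static batch rather than a stream, enumerates itemsets by brute force, and uses a hypothetical scoring rule (pattern support weighted by overlap with the current session). The session data, the group assignment and all function names are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(sessions, min_support):
    """Brute force: count every itemset and keep those seen in >= min_support sessions."""
    counts = Counter()
    for session in sessions:
        items = sorted(set(session))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def closed_itemsets(frequent):
    """A frequent itemset is closed if no proper superset has the same support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < t and c == ct for t, ct in frequent.items())
    }


def recommend(current_session, global_patterns, group_patterns, k=2):
    """Rank unseen items using both pattern sets (hypothetical scoring rule)."""
    seen = set(current_session)
    scores = Counter()
    for patterns in (global_patterns, group_patterns):
        for pattern, support in patterns.items():
            overlap = len(pattern & seen)
            if overlap == 0:
                continue  # pattern says nothing about this session
            for item in pattern - seen:
                scores[item] += support * overlap / len(pattern)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Invented toy data: five sessions; the last two form a hypothetical user group.
all_sessions = [
    ["home", "course", "quiz"],
    ["home", "course", "quiz"],
    ["home", "course"],
    ["home", "news"],
    ["home", "news", "forum"],
]
group_sessions = all_sessions[3:]

global_patterns = closed_itemsets(frequent_itemsets(all_sessions, min_support=2))
group_patterns = closed_itemsets(frequent_itemsets(group_sessions, min_support=1))

print(recommend(["home"], global_patterns, {}))              # ['course', 'news']
print(recommend(["home"], global_patterns, group_patterns))  # ['news', 'course']
```

In this toy run the global patterns alone rank `course` first, while adding the group patterns moves `news` to the top for a user from the news-reading group. That is the kind of effect the combination of global and group patterns is meant to capture.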

Notes

  1. https://moa.cms.waikato.ac.nz/

Acknowledgements

This work was partially supported by the project Human Information Behavior in the Digital Space, funded by the Slovak Research and Development Agency, No. APVV-15-0508; by the project Adaptation of access to information and knowledge artefacts based on interaction and collaboration within web environment, funded by the Scientific Grant Agency of the SR, No. VG 1/0646/15 and No. VG 1/0667/18; and by the Ministry of Education, Science, Research and Sport of the SR within the Research and Development Operational Programme for the project “University Science Park of STU Bratislava”, ITMS 26240220084, co-funded by the ERDF, and the Research and Development Operational Programme for the project International centre of excellence for research of intelligent and secure information-communication technologies and systems, ITMS 26240120039, co-funded by the ERDF.

Author information

Corresponding author

Correspondence to Michal Kompan.

About this article

Cite this article

Chovanak, T., Kassak, O., Kompan, M. et al. Fast Streaming Behavioural Pattern Mining. New Gener. Comput. 36, 365–391 (2018). https://doi.org/10.1007/s00354-018-0044-4

  • DOI: https://doi.org/10.1007/s00354-018-0044-4
