Abstract
We work on behavioral marketing on the Internet. On the one hand, we observe the behavior of visitors; on the other hand, we trigger (in real time) stimulations intended to alter this behavior. Real-time operation and mass customization are the two challenges that we have to address. In this paper, we present a hybrid approach for clustering visitor sessions, based on a combination of global and local sequence alignments, such as Needleman-Wunsch and Smith-Waterman. Our goal is to define very simple approaches that can segment about 80% of visitor sessions and that can easily be turned into small programs, to be run in parallel in thousands of web browsers.
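To make the two alignment families concrete, here is a minimal sketch of a global (Needleman-Wunsch) and a local (Smith-Waterman) alignment score over sessions represented as lists of page identifiers. The scoring constants, the page names, and in particular the `hybrid_score` weighting are assumptions made for illustration; the abstract does not state how the paper combines the two alignments.

```python
from typing import Sequence

# Illustrative scoring scheme (assumption, not taken from the paper).
MATCH, MISMATCH, GAP = 2, -1, -1


def _sub(x: str, y: str) -> int:
    """Substitution score for aligning page x with page y."""
    return MATCH if x == y else MISMATCH


def global_score(a: Sequence[str], b: Sequence[str]) -> int:
    """Needleman-Wunsch: best score aligning all of a against all of b."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * GAP]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + _sub(a[i - 1], b[j - 1]),  # align both pages
                prev[j] + GAP,                           # gap in b
                curr[j - 1] + GAP,                       # gap in a
            ))
        prev = curr
    return prev[-1]


def local_score(a: Sequence[str], b: Sequence[str]) -> int:
    """Smith-Waterman: best score over any pair of sub-sessions."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            cell = max(
                0,                                       # restart alignment
                prev[j - 1] + _sub(a[i - 1], b[j - 1]),
                prev[j] + GAP,
                curr[j - 1] + GAP,
            )
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def hybrid_score(a: Sequence[str], b: Sequence[str], alpha: float = 0.5) -> float:
    """Hypothetical combination: a weighted mean of global and local scores."""
    return alpha * global_score(a, b) + (1 - alpha) * local_score(a, b)


# Hypothetical sessions: each entry is a visited page.
s1 = ["home", "search", "product", "cart"]
s2 = ["blog", "product", "cart", "help"]
print(global_score(s1, s2), local_score(s1, s2), hybrid_score(s1, s2))
```

Both functions keep only two rows of the dynamic-programming table, so memory grows with the length of one session rather than with the product of both lengths, which suits short scripts running in a browser-like environment.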
References
Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)
Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195–197 (1981)
Wang, W., Zaïane, O.R.: Clustering web sessions by sequence alignment. In: Proceedings of 13th International Workshop on Database and Expert Systems Applications, 2002, pp. 394–398. IEEE (2002)
Li, C., Lu, Y.: Similarity measurement of web sessions based on sequence alignment. Wuhan Univ. J. Nat. Sci. 12(5), 814–818 (2007)
Poornalatha, G., Raghavendra, P.: Alignment based similarity distance measure for better web sessions clustering. Procedia Comput. Sci. 5, 450–457 (2011)
Chordia, B.S., Adhiya, K.P.: Grouping web access sequences using sequence alignment method. Indian J. Comput. Sci. Eng. (IJCSE) 2(3), 308–314 (2011)
Dimopoulos, C., Makris, C., Panagis, Y., Theodoridis, E., Tsakalidis, A.: A web page usage prediction scheme using sequence indexing and clustering techniques. Data Knowl. Eng. 69(4), 371–382 (2010)
Petitjean, F., Forestier, G., Webb, G., Nicholson, A., Chen, Y., Keogh, E.: Dynamic time warping averaging of time series allows faster and more accurate classification. In: IEEE International Conference on Data Mining (2014)
Meesrikamolkul, W., Niennattrakul, V., Ratanamahatana, C.A.: Shape-based clustering for time series data. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012, Part I. LNCS, vol. 7301, pp. 530–541. Springer, Heidelberg (2012)
Nakamura, A., Kudo, M.: Packing alignment: alignment for sequences of various length events. In: Huang, J.Z., Cao, L., Srivastava, J. (eds.) PAKDD 2011, Part II. LNCS, vol. 6635, pp. 234–245. Springer, Heidelberg (2011)
Marascu, A., Khan, S.A., Palpanas, T.: Scalable similarity matching in streaming time series. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012, Part II. LNCS, vol. 7302, pp. 218–230. Springer, Heidelberg (2012)
Chan, A.: An analysis of pairwise sequence alignment algorithm complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and gapped BLAST (2013)
Cooley, R., Mobasher, B., Srivastava, J.: Grouping web page references into transactions for mining world wide web browsing patterns. In: Proceedings of Knowledge and Data Engineering Exchange Workshop, 1997, pp. 2–9. IEEE (1997)
Cooley, R., Mobasher, B., Srivastava, J.: Data preparation for mining world wide web browsing patterns. Knowl. Inf. Syst. 1(1), 5–32 (1999)
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Luu, VT., Forestier, G., Fondement, F., Muller, PA. (2015). Web Site Audience Segmentation Using Hybrid Alignment Techniques. In: Li, XL., Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D. (eds) Trends and Applications in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, vol 9441. Springer, Cham. https://doi.org/10.1007/978-3-319-25660-3_3
DOI: https://doi.org/10.1007/978-3-319-25660-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25659-7
Online ISBN: 978-3-319-25660-3
eBook Packages: Computer Science (R0)