Web Site Audience Segmentation Using Hybrid Alignment Techniques

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9441)

Abstract

We work on behavioral marketing on the Internet. On the one hand, we observe the behavior of visitors; on the other hand, we trigger stimulations, in real time, intended to alter this behavior. Real-time operation and mass customization are the two challenges that we have to address. In this paper, we present a hybrid approach for clustering visitor sessions, based on a combination of global and local sequence alignments, such as Needleman-Wunsch and Smith-Waterman, respectively. Our goal is to define very simple approaches that can handle about 80% of the visitor sessions to be segmented and that can easily be turned into small programs to run in parallel in thousands of web browsers.
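
To make the two alignment families named in the abstract concrete, here is a minimal Python sketch that scores two visitor sessions, each represented as a list of page identifiers, with a global (Needleman-Wunsch) and a local (Smith-Waterman) alignment. The scoring values, the session contents, and in particular the `hybrid_similarity` function, which blends the two scores with a simple weighted average, are assumptions made for illustration; the abstract does not specify how the paper combines the two alignments.

```python
from typing import Sequence

# Assumed scoring scheme (not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -1


def global_score(a: Sequence[str], b: Sequence[str]) -> int:
    """Needleman-Wunsch: score of the best end-to-end alignment of a and b."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * GAP]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            curr.append(max(prev[j - 1] + sub, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


def local_score(a: Sequence[str], b: Sequence[str]) -> int:
    """Smith-Waterman: score of the best-matching pair of sub-sessions."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            # Scores are floored at zero so an alignment can restart anywhere.
            cell = max(0, prev[j - 1] + sub, prev[j] + GAP, curr[j - 1] + GAP)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def hybrid_similarity(a: Sequence[str], b: Sequence[str], weight: float = 0.5) -> float:
    """Hypothetical blend of both scores, each normalized to [0, 1]."""
    longest, shortest = max(len(a), len(b)), min(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sessions are treated as identical
    g = max(global_score(a, b), 0) / (longest * MATCH)
    l = local_score(a, b) / (shortest * MATCH) if shortest else 0.0
    return weight * g + (1 - weight) * l


# Hypothetical sessions: sequences of visited page identifiers.
s1 = ["home", "search", "product", "cart"]
s2 = ["blog", "search", "product", "cart", "checkout"]
similarity = hybrid_similarity(s1, s2)
```

A pairwise similarity of this kind could then feed a standard hierarchical clustering step. Both scoring functions keep only two rows of the dynamic-programming table, which keeps memory use proportional to the shorter dimension, a property that matters if such code is to run in a visitor's browser.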

Notes

  1. https://stat.ethz.ch/R-manual/R-patched/library/stats/html/dendrogram.html

  2. https://code.google.com/p/himmele/source/browse/trunk/Bioinformatics/

  3. https://github.com/lbehnke/hierarchical-clustering-java/tree/master/src

References

  1. Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)

  2. Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195–197 (1981)

  3. Wang, W., Zaïane, O.R.: Clustering web sessions by sequence alignment. In: Proceedings of 13th International Workshop on Database and Expert Systems Applications, 2002, pp. 394–398. IEEE (2002)

  4. Li, C., Lu, Y.: Similarity measurement of web sessions based on sequence alignment. Wuhan Univ. J. Nat. Sci. 12(5), 814–818 (2007)

  5. Poornalatha, G., Raghavendra, P.: Alignment based similarity distance measure for better web sessions clustering. Procedia Comput. Sci. 5, 450–457 (2011)

  6. Chordia, B.S., Adhiya, K.P.: Grouping web access sequences using sequence alignment method. Indian J. Comput. Sci. Eng. (IJCSE) 2(3), 308–314 (2011)

  7. Dimopoulos, C., Makris, C., Panagis, Y., Theodoridis, E., Tsakalidis, A.: A web page usage prediction scheme using sequence indexing and clustering techniques. Data Knowl. Eng. 69(4), 371–382 (2010)

  8. Petitjean, F., Forestier, G., Webb, G., Nicholson, A., Chen, Y., Keogh, E.: Dynamic time warping averaging of time series allows faster and more accurate classification. In: IEEE International Conference on Data Mining (2014)

  9. Meesrikamolkul, W., Niennattrakul, V., Ratanamahatana, C.A.: Shape-based clustering for time series data. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012, Part I. LNCS, vol. 7301, pp. 530–541. Springer, Heidelberg (2012)

  10. Nakamura, A., Kudo, M.: Packing alignment: alignment for sequences of various length events. In: Huang, J.Z., Cao, L., Srivastava, J. (eds.) PAKDD 2011, Part II. LNCS, vol. 6635, pp. 234–245. Springer, Heidelberg (2011)

  11. Marascu, A., Khan, S.A., Palpanas, T.: Scalable similarity matching in streaming time series. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012, Part II. LNCS, vol. 7302, pp. 218–230. Springer, Heidelberg (2012)

  12. Chan, A.: An analysis of pairwise sequence alignment algorithm complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and gapped BLAST (2013)

  13. Cooley, R., Mobasher, B., Srivastava, J.: Grouping web page references into transactions for mining world wide web browsing patterns. In: Proceedings of Knowledge and Data Engineering Exchange Workshop, 1997, pp. 2–9. IEEE (1997)

  14. Cooley, R., Mobasher, B., Srivastava, J.: Data preparation for mining world wide web browsing patterns. Knowl. Inf. Syst. 1(1), 5–32 (1999)

Corresponding author

Correspondence to Vinh-Trung Luu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Luu, VT., Forestier, G., Fondement, F., Muller, PA. (2015). Web Site Audience Segmentation Using Hybrid Alignment Techniques. In: Li, XL., Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D. (eds) Trends and Applications in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, vol 9441. Springer, Cham. https://doi.org/10.1007/978-3-319-25660-3_3

  • DOI: https://doi.org/10.1007/978-3-319-25660-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25659-7

  • Online ISBN: 978-3-319-25660-3

  • eBook Packages: Computer Science, Computer Science (R0)
