Abstract
Complex web information structures prevent search engines from providing satisfactory context-sensitive retrieval. To overcome this obstacle, it is essential to use techniques that recover the web authors’ intentions and superimpose them on the users’ retrieval contexts when summarizing web sites. In this paper, we therefore present a framework for discovering implicit associations among web documents for effective web site summarization. In the proposed framework, associations among web documents are induced by the web structure embedding them, as well as by the contents of the documents and the users’ interests. We analyze the semantics of document associations and describe an algorithm that captures these semantics for enumerating and ranking possible document associations. We then use these associations to create context-sensitive summaries of web neighborhoods.
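The abstract does not spell out the ranking algorithm, so the following is only an illustrative sketch of one plausible way to rank pages that associate a set of seed documents: a random walk with restart over the link graph, optionally biased by a per-page relevance weight standing in for content and user interest. The function name `rank_associations`, the `relevance` weights, the `restart` parameter, and the toy graph are all assumptions made for this example and are not taken from the paper.

```python
def rank_associations(links, seeds, relevance=None, restart=0.15, iters=100):
    """Rank non-seed pages by a random walk with restart from the seed pages.

    links     -- dict mapping each page to the list of pages it links to
                 (every linked page must also appear as a key)
    seeds     -- set of pages whose association we want to explain
    relevance -- hypothetical per-page weights (> 0) modelling content/user
                 interest; pages not listed default to 1.0
    """
    relevance = relevance or {}
    nodes = sorted(links)
    # Restart distribution: uniform over the seed pages.
    base = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(base)
    for _ in range(iters):
        nxt = {n: restart * base[n] for n in nodes}
        for n in nodes:
            nbrs = links[n]
            weights = [relevance.get(m, 1.0) for m in nbrs]
            total = sum(weights)
            if total == 0:
                # No usable out-links: send the walker back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart) * score[n] / len(seeds)
                continue
            # Follow out-links with probability proportional to relevance.
            for m, w in zip(nbrs, weights):
                nxt[m] += (1 - restart) * score[n] * w / total
        score = nxt
    candidates = [(n, s) for n, s in score.items() if n not in seeds]
    return sorted(candidates, key=lambda pair: -pair[1])


# Toy web neighborhood (hypothetical): "hub" connects both seeds.
links = {
    "a": ["hub", "side"],
    "b": ["hub"],
    "hub": ["a", "b", "far"],
    "side": ["a"],
    "far": ["hub"],
}
ranked = rank_associations(links, seeds={"a", "b"})
biased = rank_associations(links, seeds={"a", "b"}, relevance={"side": 50.0})
```

In this toy graph the structure-only walk favors the page linked to both seeds, while a strong relevance weight on another page can change the ordering, which mirrors the abstract's point that structure, content, and user interest jointly induce associations.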
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Candan, K.S., Li, WS. (2001). Discovering Web Document Associations for Web Site Summarization. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_16
DOI: https://doi.org/10.1007/3-540-44801-2_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42553-3
Online ISBN: 978-3-540-44801-3
eBook Packages: Springer Book Archive