Discovering Web Document Associations for Web Site Summarization

Candan, K. Selçuk; Li, Wen-Syan

doi:10.1007/3-540-44801-2_16

K. Selçuk Candan and Wen-Syan Li

Conference paper
First Online: 01 January 2001

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2114)

Abstract

Complex web information structures prevent search engines from providing satisfactory context-sensitive retrieval. To overcome this obstacle, it is essential to use techniques that recover the web authors’ intentions and superimpose them on the users’ retrieval contexts when summarizing web sites. In this paper, we therefore present a framework for discovering implicit associations among web documents for effective web site summarization. In the proposed framework, associations between web documents are induced by the web structure embedding them, as well as by the contents of the documents and the users’ interests. We analyze the semantics of document associations and describe an algorithm that captures these semantics for enumerating and ranking possible document associations. We then use these associations to create context-sensitive summaries of web neighborhoods.

Author information

Authors and Affiliations

Computer Sci. and Eng. Dept, Arizona State University, Tempe, AZ, 85287, USA
K. Selçuk Candan
CCRL, NEC USA, Inc., 110 Rio Robles M/S SJ100, San Jose, CA, 95134, USA
Wen-Syan Li

Editor information

Editors and Affiliations

Kyoto University, Kyoto, 606-8501, Japan
Yahiko Kambayashi
EC3, Siebensterngasse 21/3, 1070, Wien
Werner Winiwarter
Center for Spatial Information Science (CSIS), University of Tokyo, 4-6-1, Komaba Meguro-ku, Tokyo, 153-8904, Japan
Masatoshi Arikawa

About this paper

Cite this paper

Candan, K.S., Li, WS. (2001). Discovering Web Document Associations for Web Site Summarization. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_16

DOI: https://doi.org/10.1007/3-540-44801-2_16
Published: 28 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42553-3
Online ISBN: 978-3-540-44801-3
eBook Packages: Springer Book Archive
