
IEICE Transactions on Information and Systems 2006 E89-D(10):2606-2615; doi:10.1093/ietisy/e89-d.10.2606

Copyright © 2006 The Institute of Electronics, Information and Communication Engineers

Regular Section -- Papers -- Data Mining

Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework

Yasuhito ASANO¹, Takao NISHIZEKI², Masashi TOYODA³ and Masaru KITSUREGAWA³

¹ The author is with the Department of Information Sciences, Faculty of Science and Engineering, Tokyo Denki University, Saitama-ken, 350–0394 Japan. E-mail: asano{at}y.dendai.ac.jp
² The author is with the Graduate School of Information Sciences, Tohoku University, Sendai-shi, 980–8579 Japan.
³ The authors are with the Institute of Industrial Science, The University of Tokyo, Tokyo, 153–8505 Japan.

Several methods mine communities on the Web using hyperlinks. One well-known example is the max-flow based method proposed by Flake et al. That method adopts a page-oriented framework: like other methods, including HITS and trawling, it uses a page on the Web as the unit of information. Recently, Asano et al. built a site-oriented framework, which uses a site as the unit of information, and showed experimentally that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is also effective for mining communities with the max-flow based method. In this paper, we first point out several problems of the max-flow based method, owing mainly to the page-oriented framework, and then propose solutions that exploit several advantages of the site-oriented framework. Computational experiments reveal that our max-flow based method on the site-oriented framework is very effective at mining communities related to the topics of given pages, compared with the original max-flow based method on the page-oriented framework.
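
For readers unfamiliar with the general idea, the following is a minimal sketch of max-flow based community extraction on a link graph, not the authors' implementation. It rests on several assumptions that the abstract does not state: hyperlinks are treated as undirected, a virtual source is joined to the seed nodes with unbounded capacity, every non-seed node is joined to a virtual sink with capacity 1, link capacity defaults to the number of seeds, and the community is read off as the source side of a minimum cut. The function names `max_flow_community` and `collapse_to_sites` are hypothetical, and grouping pages by hostname is only a crude stand-in for a site; the site-oriented framework of Asano et al. may define sites differently.

```python
from collections import defaultdict, deque
from urllib.parse import urlsplit

INF = float("inf")


def collapse_to_sites(page_links):
    """Turn page-level links into site-level links.

    Assumption: a "site" is approximated by the URL hostname.
    Links that stay inside one hostname are dropped.
    """
    site_edges = set()
    for src, dst in page_links:
        a, b = urlsplit(src).hostname, urlsplit(dst).hostname
        if a and b and a != b:
            site_edges.add((a, b))
    return site_edges


def _reachable(cap, source):
    """BFS over edges with positive residual capacity; returns a parent map."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def max_flow_community(edges, seeds, edge_capacity=None):
    """Return the nodes on the source side of a minimum source-sink cut."""
    seeds = set(seeds)
    k = len(seeds) if edge_capacity is None else edge_capacity
    cap = defaultdict(lambda: defaultdict(float))  # residual capacities
    nodes = set(seeds)
    for u, v in edges:
        if u == v:
            continue
        nodes.update((u, v))
        cap[u][v] = k  # assumption: links are treated as undirected
        cap[v][u] = k
    source, sink = object(), object()  # virtual endpoints
    for v in seeds:
        cap[source][v] = INF
    for v in nodes - seeds:
        cap[v][sink] = 1
    # Edmonds-Karp: augment along shortest residual paths until none remain.
    while True:
        parent = _reachable(cap, source)
        if sink not in parent:
            # No augmenting path: nodes still reachable form the community.
            return {v for v in parent if v is not source}
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= bottleneck
            cap[w][u] += bottleneck


# Toy graph: a dense cluster {a, b, c, d} joined to {x, y, z} by one link.
DEMO_EDGES = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "x"),
    ("x", "y"), ("x", "z"), ("y", "z"),
]
```

On the toy graph, `max_flow_community(DEMO_EDGES, {"a", "b"})` separates the dense cluster around the seeds from the rest. To run the same routine on sites rather than pages under the hostname assumption, pass `collapse_to_sites(page_links)` as `edges` and hostnames as `seeds`.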

Key Words: Web, data mining, site, max-flow, site-oriented framework


Manuscript received August 8, 2005. Manuscript revised February 26, 2006.





