ABSTRACT
The PageRank algorithm is used in Web information retrieval to calculate a single list of popularity scores, one for each page in the Web. These popularity scores are used to rank query results when presented to the user. By using the structure of the entire Web to calculate one score per document, we are calculating a general popularity score that is not particular to any community. The PageRank scores are therefore better suited to general queries. In this paper, we introduce a more general form of PageRank, using Web multi-resolution community-based popularity scores, where each document obtains a popularity score dependent on a given Web community. When a query is related to a specific community, we choose the associated set of popularity scores and order the query results accordingly. Using Web-community-based popularity scores, we achieved an 11% increase in precision over PageRank.
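The contrast between one global score list and per-community score lists can be illustrated with a small sketch. The abstract does not say how the community-dependent scores are computed, so the code below is only an assumption: it swaps in a biased teleport vector (in the style of personalized PageRank) that restarts the random walk inside the chosen community. The function names, the toy link graph, the community membership, and the damping factor of 0.85 are all hypothetical and are not taken from the paper.

```python
def pagerank(links, teleport=None, damping=0.85, iterations=100):
    """Power iteration over a dict {page: [pages it links to]}.

    Assumes every link target also appears as a key of `links`.
    """
    pages = sorted(links)
    if teleport is None:
        # Global PageRank: the random surfer restarts uniformly at any page.
        teleport = {p: 1.0 / len(pages) for p in pages}
    scores = dict(teleport)
    for _ in range(iterations):
        # Score held by pages without outlinks is redistributed via teleport.
        dangling = sum(scores[p] for p in pages if not links[p])
        new = {p: (1.0 - damping + damping * dangling) * teleport[p] for p in pages}
        for p in pages:
            for q in links[p]:
                new[q] += damping * scores[p] / len(links[p])
        scores = new
    return scores


def community_scores(links, community, **kwargs):
    """ASSUMPTION: bias the teleport vector toward one community's pages."""
    members = [p for p in links if p in community]
    teleport = {p: (1.0 / len(members) if p in community else 0.0) for p in links}
    return pagerank(links, teleport, **kwargs)


def rank_results(results, scores):
    """Order query results by the selected set of popularity scores."""
    return sorted(results, key=lambda p: scores[p], reverse=True)


# Hypothetical toy Web graph and community.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"], "e": ["d"]}
global_scores = pagerank(links)
hobby_scores = community_scores(links, community={"d", "e"})

# A general query is ordered by the global scores; a query related to the
# community is ordered by that community's scores instead.
general_order = rank_results(["b", "d", "e"], global_scores)
community_order = rank_results(["b", "d", "e"], hobby_scores)
```

In this sketch, pages inside the chosen community receive a larger share of the score mass than they do under the global computation, which is the effect the abstract describes at a high level: the same result set can be ordered differently depending on which community the query is related to.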