poster

Investigating the suboptimality and instability of pseudo-relevance feedback

Authors:
Raghavendra Udupa

Microsoft Research India, Bangalore, India

Microsoft Research India, Bangalore, India
View Profile

,
Abhijit Bhole

Microsoft Research India, Bangalore, India

Microsoft Research India, Bangalore, India
View Profile

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrievalJuly 2010Pages 813–814https://doi.org/10.1145/1835449.1835629

Published:19 July 2010Publication History

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Pages 813–814

ABSTRACT

Although Pseudo-Relevance Feedback (PRF) techniques improve average retrieval performance at the price of high variance, not much is known about their optimality and the reasons for their instability. In this work, we study more than 800 topics from several test collections including the TREC Robust Track and show that PRF techniques are highly suboptimal, i.e. they do not make the fullest utilization of pseudo-relevant documents and under-perform. A careful selection of expansion terms from the pseudo-relevant document with the help of an oracle can actually improve retrieval performance dramatically (by > 60%). Further, we show that instability in PRF techniques is mainly due to wrong selection of expansion terms from the pseudo-relevant documents. Our findings emphasize the need to revisit the problem of term selection to make a break through in PRF.

References

G. Cao, J.-Y. Nie, J. Gao, and S. Robertson. Selecting good expansion terms for pseudo-relevance feedback. In Proceedings of SIGIR '08. Google ScholarDigital Library
R. Udupa, A. Bhole, and P. Bhattacharyya. On selecting a good expansion set in pseudo-relevance feedback. In Proceedings of ICTIR '09. Google ScholarDigital Library
C. Zhai and J. Lafferty. Model-based feedback in the language modeling approach to information retrieval. In Proceedings of CIKM '01. Google ScholarDigital Library
C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to information retrieval. ACM Trans. Inf. Syst.(2) Google ScholarDigital Library
E. N. Efthimiadis. Query expansion. Annual Review of Information Systems and Technology.Google Scholar
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning.Google Scholar

Index Terms

Investigating the suboptimality and instability of pseudo-relevance feedback
1. Information systems
  1. Information retrieval

Recommendations

Query dependent pseudo-relevance feedback based on wikipedia
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval

Pseudo-relevance feedback (PRF) via query-expansion has been proven to be e®ective in many information retrieval (IR) tasks. In most existing work, the top-ranked documents from an initial search are assumed to be relevant and used for PRF. One problem ...
Read More
Document-based and term-based linear methods for pseudo-relevance feedback

Query expansion is a successful approach for improving Information Retrieval effectiveness. This work focuses on pseudo-relevance feedback (PRF) which provides an automatic method for expanding queries without explicit user feedback. These techniques ...
Read More
An incremental approach to efficient pseudo-relevance feedback
SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pseudo-relevance feedback is an important strategy to improve search accuracy. It is often implemented as a two-round retrieval process: the first round is to retrieve an initial set of documents relevant to an original query, and the second round is to ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
July 2010
944 pages
ISBN:9781450301534
DOI:10.1145/1835449
General Chairs:
Fabio Crestani
University of Lugano, CH
,
Stéphane Marchand-Maillet
University of Geneva, CH
,
Program Chairs:
Hsin-Hsi Chen
National Taiwan University, TW
,
Efthimis N. Efthimiadis
University of Washington, USA
,
Jacques Savoy
University of Neuchatel, CH
Copyright © 2010 Copyright is held by the owner/author(s)
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 19 July 2010
Check for updates
Author Tags
pseudo-relevance feedback
Qualifiers
- poster
Conference

Acceptance Rates
SIGIR '10 Paper Acceptance Rate87of520submissions,17%Overall Acceptance Rate792of3,983submissions,20%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 1
  Total Citations
  View Citations
- 233
  Total Downloads
- Downloads (Last 12 months)3
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Investigating the suboptimality and instability of pseudo-relevance feedback

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

ABSTRACT

References

Cited By

Index Terms

Recommendations

Query dependent pseudo-relevance feedback based on wikipedia

Document-based and term-based linear methods for pseudo-relevance feedback

An incremental approach to efficient pseudo-relevance feedback