Abstract
In pseudo relevance feedback (PRF), the document weight, which indicates how important a document is to the PRF model, plays a key role. In this paper, we investigate the smoothness of document weights in PRF. Here, smoothness means that the document weights decrease smoothly (i.e., gradually) down the document ranking list and are smooth (i.e., similar) among topically similar documents. We postulate that a reasonably smooth document-weighting function can benefit PRF performance. We test this hypothesis under a typical PRF model, namely the Relevance Model (RM). We propose a two-step document weight smoothing method whose different instantiations have different effects on weight smoothing. Experiments on three TREC collections show that the instantiations with better smoothing effects generally lead to better PRF performance. In addition, the proposed method can significantly improve the RM's performance and outperform various alternative methods that can also be used to smooth the document weights.
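To make the role of document weights concrete, the following is a minimal sketch, not the two-step method proposed in the paper (which the abstract does not specify). It assumes an RM1-style estimate in which the feedback model is a weighted mixture of the feedback documents' language models. The function names `relevance_model` and `smooth_weights`, the parameter `alpha`, and the neighbour-averaging smoother are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def relevance_model(doc_term_probs, doc_weights):
    """Mix per-document term distributions P(w|d) using normalized document weights.

    doc_term_probs: list of dicts mapping term -> P(w|d), in rank order.
    doc_weights: list of non-negative weights, one per feedback document.
    """
    total = sum(doc_weights)
    model = Counter()
    for probs, weight in zip(doc_term_probs, doc_weights):
        for term, p in probs.items():
            model[term] += (weight / total) * p
    return dict(model)


def smooth_weights(weights, alpha=0.5):
    """Hypothetical smoother (an assumption, not the paper's method).

    Each weight is interpolated with the mean of itself and its immediate
    neighbours in the ranking, which narrows large gaps between adjacent ranks.
    """
    smoothed = []
    for i, w in enumerate(weights):
        neighbours = weights[max(0, i - 1): i + 2]
        local_mean = sum(neighbours) / len(neighbours)
        smoothed.append((1 - alpha) * w + alpha * local_mean)
    return smoothed


# Toy feedback set: three ranked documents with sharply peaked weights.
docs = [{"a": 0.5, "b": 0.5}, {"a": 1.0}, {"b": 1.0}]
raw = [0.9, 0.05, 0.05]
feedback_model = relevance_model(docs, smooth_weights(raw))
```

In this toy setting the top-ranked document dominates the raw mixture, while the smoothed weights let lower-ranked documents contribute more to the feedback model. Whether that helps retrieval is the empirical question the paper addresses.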
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, P., Song, D., Zhao, X., Hou, Y. (2010). A Study of Document Weight Smoothness in Pseudo Relevance Feedback. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_50
DOI: https://doi.org/10.1007/978-3-642-17187-1_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17186-4
Online ISBN: 978-3-642-17187-1
eBook Packages: Computer Science, Computer Science (R0)