Abstract
Because users increasingly express their information needs with ambiguous and imprecise words, it has become necessary to expand the original query with additional terms that better capture the actual user intent. Selecting appropriate expansion terms depends mainly on the degree of relatedness between a candidate expansion term and the query terms. In this paper, we propose two criteria to assess this degree of relatedness: (1) attribute more importance to terms that occur in the largest possible number of documents in which the query keywords appear, and (2) assign more importance to terms that occur at a short distance from the query terms within documents. We employ the strength Pareto fitness assignment to satisfy both criteria simultaneously. Our computational experiments on the OHSUMED test collection show that our approach significantly improves retrieval performance compared to the baseline.
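As an illustration only, the two criteria and the strength Pareto fitness assignment named in the abstract can be sketched as follows. The abstract does not give the exact formulation, so this sketch assumes an SPEA2-style raw fitness (a candidate's strength is the number of candidates it dominates, and its fitness is the sum of the strengths of the candidates that dominate it). The function names, the score layout `(co_occurrence, avg_distance)`, the tie-breaking rule, and the example terms and values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: term -> (co_occurrence, avg_distance)
#   co_occurrence: number of documents containing both the term and the
#                  query keywords (criterion 1, higher is better)
#   avg_distance:  mean distance to the query terms within documents
#                  (criterion 2, lower is better)
Scores = Dict[str, Tuple[float, float]]


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if a is no worse than b on both criteria and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def strength_pareto_fitness(scores: Scores) -> Dict[str, int]:
    """Assumed SPEA2-style raw fitness: lower is better, 0 means non-dominated."""
    terms = list(scores)
    # Strength of a term = number of other candidates it dominates.
    strength = {
        t: sum(dominates(scores[t], scores[u]) for u in terms if u != t)
        for t in terms
    }
    # Raw fitness of a term = sum of the strengths of the candidates dominating it.
    return {
        t: sum(strength[u] for u in terms
               if u != t and dominates(scores[u], scores[t]))
        for t in terms
    }


def select_expansion_terms(scores: Scores, k: int) -> List[str]:
    """Pick the k candidates with the lowest fitness (ties: criterion 1, then 2)."""
    fitness = strength_pareto_fitness(scores)
    ranked = sorted(scores, key=lambda t: (fitness[t], -scores[t][0], scores[t][1]))
    return ranked[:k]


# Made-up candidate scores for illustration only.
candidates: Scores = {
    "hypertension": (12, 3.0),
    "pressure": (10, 2.0),
    "blood": (9, 4.0),
    "patient": (5, 8.0),
}
fitness = strength_pareto_fitness(candidates)
top_terms = select_expansion_terms(candidates, k=2)
```

In this made-up example, "hypertension" and "pressure" do not dominate each other (one co-occurs in more documents, the other lies closer to the query terms), so both receive fitness 0 and are selected. This is the sense in which a Pareto-based assignment can satisfy both criteria at once without collapsing them into a single weighted score.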
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Khennak, I., Drias, H. (2015). Strength Pareto Fitness Assignment for Generating Expansion Features. In: Rocha, A., Correia, A., Costanzo, S., Reis, L. (eds) New Contributions in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol 353. Springer, Cham. https://doi.org/10.1007/978-3-319-16486-1_13
DOI: https://doi.org/10.1007/978-3-319-16486-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16485-4
Online ISBN: 978-3-319-16486-1
eBook Packages: Computer Science, Computer Science (R0)