Abstract
Query term deletion is a commonly used strategy for query rewriting. In this paper, we study the problem of query term deletion using large-scale e-commerce search logs. Specifically, we focus on queries that do not lead to user clicks and aim to predict, by deleting a term, a reduced query that does lead to clicks. Accurate prediction of term deletion can help users recover from poor search results and improve the shopping experience. To achieve this, we use various term-dependent and query-dependent measures as features and build a classifier to predict which term is most likely to be deleted from a given query. Our approach is data-driven. We investigate the large-scale query history and the document collection, verify the usefulness of previously proposed features, and propose incorporating query category information into the term deletion predictors. We observe that training within-category classifiers can result in much better performance than training a unified classifier. We validate our approach using a large collection of query session logs from a leading e-commerce site and demonstrate that it provides promising performance in query term deletion prediction.
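As a rough illustration of the setup described in the abstract, the sketch below scores each term of a query with a few term-dependent and query-dependent features and deletes the highest-scoring term, using a per-category model when one exists and a unified model otherwise. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy titles, the feature set (`idf`, `rel_position`, `query_length`), the category name `phones`, and the hand-set weights in `MODELS` are all hypothetical. In practice the weights would be learned from search logs rather than set by hand.

```python
import math
from collections import Counter

# Hypothetical toy "document collection" of item titles.
TITLES = [
    "apple iphone 5 black 16gb",
    "apple iphone 5 white 32gb",
    "apple iphone 4 case",
    "samsung galaxy s4 case",
    "leather iphone 5 case",
]


def document_frequencies(titles):
    """Count how many titles contain each term."""
    df = Counter()
    for title in titles:
        df.update(set(title.split()))
    return df


def term_features(terms, index, df, n_docs):
    """Illustrative features for the term at `index` (assumed, not from the paper)."""
    term = terms[index]
    return {
        # Term-dependent: smoothed inverse document frequency in the collection.
        "idf": math.log((n_docs + 1) / (df.get(term, 0) + 1)),
        # Term-dependent: relative position, 0.0 = first term, 1.0 = last term.
        "rel_position": index / max(len(terms) - 1, 1),
        # Query-dependent: number of terms in the query.
        "query_length": float(len(terms)),
    }


def score(features, weights):
    """Linear score: higher means 'more likely to be deleted'."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def predict_deletion(query, category, models, df, n_docs):
    """Return (deleted_term, reduced_query) for the highest-scoring term."""
    # Use the within-category model if available, else the unified one.
    weights = models.get(category, models["unified"])
    terms = query.split()
    scores = [score(term_features(terms, i, df, n_docs), weights)
              for i in range(len(terms))]
    best = max(range(len(terms)), key=scores.__getitem__)
    reduced = " ".join(t for i, t in enumerate(terms) if i != best)
    return terms[best], reduced


DF = document_frequencies(TITLES)

# Hypothetical hand-set weights standing in for trained classifiers.
MODELS = {
    "unified": {"idf": 1.0},
    "phones": {"idf": 1.0, "rel_position": 0.5},
}

deleted, reduced = predict_deletion(
    "apple iphone 5 refurbished", "phones", MODELS, DF, len(TITLES)
)
print(deleted, "->", reduced)
```

With these toy weights, the term `refurbished` never appears in the collection, so it receives the highest IDF and is the one selected for deletion. Keeping one weight dictionary per category mirrors the within-category idea in the abstract, and the fallback to `unified` covers queries whose category has no dedicated model.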
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Yang, B., Parikh, N., Singh, G., Sundaresan, N. (2014). A Study of Query Term Deletion Using Large-Scale E-commerce Search Logs. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_20
DOI: https://doi.org/10.1007/978-3-319-06028-6_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06027-9
Online ISBN: 978-3-319-06028-6
eBook Packages: Computer Science, Computer Science (R0)