Skip to main content

Training Deep Ranking Model with Weak Relevance Labels

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10538))

Abstract

Deep neural networks have already achieved great success in a number of fields, for example, computer vision, natural language processing, speech recognition, and etc. However, such advances have not been observed in information retrieval (IR) tasks yet, such as ad-hoc retrieval. A potential explanation is that in a particular IR task, training a document ranker usually needs large amounts of relevance labels which describe the relationship between queries and documents. However, this kind of relevance judgments are usually very expensive to obtain. In this paper, we propose to train deep ranking models with weak relevance labels generated by click model based on real users’ click behavior. We investigate the effectiveness of different weak relevance labels trained based on several major click models, such as DBN, RCM, PSCM, TCM, and UBM. The experimental results indicate that the ranking models trained with weak relevance labels are able to utilize large scale of behavior data and they can get similar performance compared to the ranking model trained based on relevance labels from external assessors, which are supposed to be more accurate. This preliminary finding encourages us to develop deep ranking models with weak supervised data.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    https://github.com/bmitra-msft/NDRM/blob/master/notebooks/Duet.ipynb.

  2. 2.

    http://research.nii.ac.jp/ntcir/tools/ntcireval-en.html.

References

  1. Huang, P.-S., et al.: Learning deep structured semantic models for web search using clickthrough data. In: Proceedings of the 22nd ACM International Conference on Conference on Information and Knowledge Management. ACM (2013)

    Google Scholar 

  2. Mikolov, T., et al.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems (2013)

    Google Scholar 

  3. Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: EMNLP, vol. 14 (2014)

    Google Scholar 

  4. Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML, vol. 14 (2014)

    Google Scholar 

  5. Dehghani, M., Zamani, H., Severyn, A., Kamps, J., Croft, W.B.: Neural ranking models with weak supervision. arXiv preprint arXiv:1704.08803 (2017)

  6. Yin, D., Hu, Y., Tang, J., Daly, T., Zhou, M., Ouyang, H., Chen, J., et al.: Ranking relevance in Yahoo search. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 323–332. ACM (2016)

    Google Scholar 

  7. Salakhutdinov, R., Hinton, G.: Semantic hashing. Int. J. Approx. Reason. 50(7), 969–978 (2009)

    Article  Google Scholar 

  8. Chuklin, A., Markov, I., de Rijke, M.: Click models for web search. Synth. Lect. Inf. Concepts Retr. Serv. 7(3), 1–115 (2015)

    Google Scholar 

  9. Guo, J., Fan, Y., Ai, Q., Croft, W.B.: A deep relevance matching model for ad-hoc retrieval. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 55–64. ACM (2016)

    Google Scholar 

  10. Shen, Y., He, X., Gao, J., Deng, L., Mesnil, G.: Learning semantic representations using convolutional neural networks for web search. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 373–374. ACM (2014)

    Google Scholar 

  11. Hu, B., Lu, Z., Li, H., Chen, Q.: Convolutional neural network architectures for matching natural language sentences. In: Advances in Neural Information Processing Systems, pp. 2042–2050 (2014)

    Google Scholar 

  12. Lu, Z., Li, H.: A deep architecture for matching short texts. In: Advances in Neural Information Processing Systems, pp. 1367–1375 (2013)

    Google Scholar 

  13. Pang, L., Lan, Y., Guo, J., Xu, J., Wan, S., Cheng, X.: Text matching as image recognition. arXiv preprint arXiv:1602.06359 (2016)

  14. Hui, K., Yates, A., Berberich, K., de Melo, G.: A position-aware deep model for relevance matching in information retrieval. arXiv preprint arXiv:1704.03940 (2017)

  15. Mitra, B., Diaz, F., Craswell, N.: Learning to match using local and distributed representations of text for web search. arXiv preprint arXiv:1610.08136 (2016)

  16. Agichtein, E., Brill, E., Dumais, S., Ragno, R.: Learning user interaction models for predicting web search result preferences. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3–10. ACM (2006)

    Google Scholar 

  17. Joachims, T., Granka, L., Pan, B., Hembrooke, H., Gay, G.: Accurately interpreting clickthrough data as implicit feedback. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 154–161. ACM (2005)

    Google Scholar 

  18. Zhang, Y., Chen, W., Wang, D., Yang, Q.: User-click modeling for understanding and predicting search-behavior. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1388–1396. ACM (2011)

    Google Scholar 

  19. Wang, C., Liu, Y., Zhang, M., Ma, S., Zheng, M., Qian, J., Zhang, K.: Incorporating vertical results into search click models. In: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 503–512. ACM, July 2013

    Google Scholar 

  20. Wang, C., Liu, Y., Wang, M., Zhou, K., Nie, J., Ma, S.: Incorporating non-sequential behavior into click models. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 283–292. ACM (2015)

    Google Scholar 

  21. Chapelle, O., Zhang, Y.: A dynamic Bayesian network click model for web search ranking. In: Proceedings of the 18th International Conference on World Wide Web, pp. 1–10. ACM (2009)

    Google Scholar 

  22. Dupret, G.E., Piwowarski, B.: A user browsing model to predict search engine click data from past observations. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 331–338. ACM (2008)

    Google Scholar 

  23. Xu, W., Manavoglu, E., Cantu-Paz, E.: Temporal click model for sponsored search. In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 106–113. ACM (2010)

    Google Scholar 

  24. Craswell, N., Zoeter, O., Taylor, M., Ramsey, B.: An experimental comparison of click position-bias models. In: WSDM 2008, pp. 87–94. ACM (2008)

    Google Scholar 

  25. Luo, C., Zheng, Y., Liu, Y., Wang, X., Xu, J., Zhang, M., Ma, S.: SogouT-16: a new web corpus to embrace IR research. In: The 40th ACM SIGIR International Conference on Research and Development in Information Retrieval, SIGIR 2017 (2017)

    Google Scholar 

Download references

Acknowledgement

This work is supported by Natural Science Foundation of China (Grant No. 61622208, 61732008, 61532011) and National Key Basic Research Program of China (2015CB358700).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yiqun Liu .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Luo, C., Zheng, Y., Mao, J., Liu, Y., Zhang, M., Ma, S. (2017). Training Deep Ranking Model with Weak Relevance Labels. In: Huang, Z., Xiao, X., Cao, X. (eds) Databases Theory and Applications. ADC 2017. Lecture Notes in Computer Science(), vol 10538. Springer, Cham. https://doi.org/10.1007/978-3-319-68155-9_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-68155-9_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68154-2

  • Online ISBN: 978-3-319-68155-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics