Abstract
Unbiased learning to rank aims to produce optimal orderings of candidates from noisy click-through data. To deal with this problem, most models treat biased click labels as combined supervision of relevance and propensity, paying little attention to the uncertainty of implicit user feedback. We propose a semi-supervised framework to address this issue, namely ULTRGAN (Unbiased Learning To Rank with Generative Adversarial Networks). The unified framework regards the task as semi-supervised learning with missing labels and employs adversarial training to debias click-through datasets. In ULTRGAN, the generator samples potential negative examples, which are combined with true positive examples and passed to the discriminator. Meanwhile, the discriminator challenges the generator, pushing it toward better performance. We further incorporate pairwise debiasing to generate unbiased labels that diffuse from the discriminator to the generator. Experimental results on both synthetic and real-world datasets show the effectiveness and robustness of ULTRGAN.
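The sampling step described in the abstract can be illustrated with a minimal sketch. This is not the ULTRGAN implementation: the abstract gives no equations, so the softmax weighting over unclicked documents, the binary cross-entropy objective, and all names (`sample_negatives`, `discriminator_loss`, the toy scores) are assumptions made purely for illustration. Pairwise debiasing and the generator update are left out.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negatives(gen_scores, clicks, k, rng):
    """Sample k unclicked documents, weighted by the generator's scores.

    Unclicked documents are treated as unlabeled rather than irrelevant.
    Weighting by a softmax of generator scores is an assumption here.
    """
    unclicked = [i for i, c in enumerate(clicks) if c == 0]
    probs = softmax([gen_scores[i] for i in unclicked])
    return rng.choices(unclicked, weights=probs, k=k)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_loss(disc_scores, positives, negatives):
    """Mean binary cross-entropy: clicked docs labeled 1, sampled docs labeled 0."""
    loss = 0.0
    for i in positives:
        loss -= math.log(sigmoid(disc_scores[i]))
    for i in negatives:
        loss -= math.log(1.0 - sigmoid(disc_scores[i]))
    return loss / (len(positives) + len(negatives))


# Toy query with five candidate documents (hypothetical values).
rng = random.Random(0)
clicks = [1, 0, 0, 1, 0]
gen_scores = [0.2, 1.5, -0.3, 0.9, 0.1]
disc_scores = [2.0, 0.5, -1.0, 1.2, -0.4]

positives = [i for i, c in enumerate(clicks) if c == 1]
negatives = sample_negatives(gen_scores, clicks, k=2, rng=rng)
loss = discriminator_loss(disc_scores, positives, negatives)
```

In a full adversarial loop, the discriminator would be updated to reduce this loss while the generator is updated from the discriminator's feedback; how ULTRGAN does either is not specified in the abstract.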
Acknowledgements
This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000904. We thank the anonymous reviewers for their careful reading and insightful comments on our manuscript.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Cai, H., Wang, C., He, X. (2020). Debiasing Learning to Rank Models with Generative Adversarial Networks. In: Wang, X., Zhang, R., Lee, YK., Sun, L., Moon, YS. (eds) Web and Big Data. APWeb-WAIM 2020. Lecture Notes in Computer Science, vol 12318. Springer, Cham. https://doi.org/10.1007/978-3-030-60290-1_4
DOI: https://doi.org/10.1007/978-3-030-60290-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60289-5
Online ISBN: 978-3-030-60290-1
eBook Packages: Computer Science, Computer Science (R0)