Abstract
We present a pairwise learning-to-rank approach based on a neural network, called DirectRanker, that generalizes the RankNet architecture. We show mathematically that our model is reflexive, antisymmetric, and transitive, allowing for simplified training and improved performance. Experimental results on the LETOR MSLR-WEB10K, MQ2007, and MQ2008 datasets show that our model outperforms numerous state-of-the-art methods, while being inherently simpler in structure and using a pairwise approach only.
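The core idea of an antisymmetric pairwise ranker can be sketched as follows: a shared feature network is applied to each document of a pair, and the difference of the two feature vectors is passed through a final bias-free output with an odd activation, which makes antisymmetry and reflexivity hold by construction. This is a minimal illustrative sketch, not the paper's exact configuration; the layer sizes, the tanh activations, and the random weights are assumptions for demonstration purposes.

```python
import numpy as np

# Illustrative sketch of an antisymmetric pairwise ranking net.
# Layer sizes and activations are assumed for demonstration only.
rng = np.random.default_rng(0)

# Shared feature network (one hidden layer), applied to each document.
W1 = rng.normal(size=(10, 16))
b1 = rng.normal(size=16)
# Final output layer: no bias, weights only, followed by an odd activation.
w_out = rng.normal(size=16)

def features(x):
    # Shared feature extraction f(x) for a single document's feature vector.
    return np.tanh(x @ W1 + b1)

def rank_score(x1, x2):
    # o(x1, x2) = tanh(w . (f(x1) - f(x2)))
    # Antisymmetric because tanh is odd and the difference flips sign
    # when the inputs are swapped; reflexive because f(x) - f(x) = 0.
    return np.tanh(w_out @ (features(x1) - features(x2)))

a, b = rng.normal(size=10), rng.normal(size=10)
print(np.isclose(rank_score(a, b), -rank_score(b, a)))  # antisymmetry
print(rank_score(a, a) == 0.0)                          # reflexivity
```

With this construction, only the sign of the pairwise output needs to be learned; the ordering properties do not have to be enforced through the training data.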
M. Köppel, A. Segner, and M. Wagener contributed equally to this work.
Notes
1. For our implementation of the model and the tests, see https://github.com/kramerlab/direct-ranker.
Acknowledgement
We would like to thank Dr. Christian Schmitt for his contributions to the work presented in this paper.
We also thank Luiz Frederic Wagner for proofreading the mathematical aspects of our model.
Parts of this research were conducted using the supercomputer Mogon and/or advisory services offered by Johannes Gutenberg University Mainz (hpc.uni-mainz.de), which is a member of the AHRP (Alliance for High Performance Computing in Rhineland Palatinate, www.ahrp.info) and the Gauss Alliance e.V.
The authors gratefully acknowledge the computing time granted on the supercomputer Mogon at Johannes Gutenberg University Mainz (hpc.uni-mainz.de).
This research was partially funded by the Carl Zeiss Foundation Project: ‘Competence Centre for High-Performance-Computing in the Natural Sciences’ at the University of Mainz. Furthermore, Andreas Karwath has been co-funded by the MRC grant MR/S003991/1.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Köppel, M., Segner, A., Wagener, M., Pensel, L., Karwath, A., Kramer, S. (2020). Pairwise Learning to Rank by Neural Networks Revisited: Reconstruction, Theoretical Analysis and Practical Performance. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Lecture Notes in Computer Science(), vol 11908. Springer, Cham. https://doi.org/10.1007/978-3-030-46133-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46132-4
Online ISBN: 978-3-030-46133-1
eBook Packages: Computer Science; Computer Science (R0)