Pre-trained Language Model based Ranking in Baidu Search

ABSTRACT
As the heart of a search engine, the ranking system plays a crucial role in satisfying users' information needs. Recently, neural rankers fine-tuned from pre-trained language models (PLMs) have established state-of-the-art ranking effectiveness. However, directly applying these PLM-based rankers to a large-scale web search system is nontrivial for the following reasons: (1) the prohibitive computational cost of massive neural PLMs, especially on the long texts of web documents, precludes their deployment in an online ranking system that demands extremely low latency; (2) the discrepancy between existing ranking-agnostic pre-training objectives and ad-hoc retrieval scenarios, which demand comprehensive relevance modeling, is another major barrier to improving the online ranking system; (3) a real-world search engine typically involves a committee of ranking components, so the compatibility of an individually fine-tuned ranking model with the other components is critical to a cooperative ranking system. In this work, we contribute a series of successfully deployed techniques for tackling these issues when bringing the state-of-the-art Chinese pre-trained language model, ERNIE, into the online search engine system. We first describe a novel practice that cost-efficiently summarizes each web document and contextualizes the resulting summary with the query using a cheap yet powerful Pyramid-ERNIE architecture. We then present an innovative paradigm that carefully exploits large-scale noisy and biased post-click behavioral data for relevance-oriented pre-training. We also propose a human-anchored fine-tuning strategy tailored to the online ranking system, aiming to stabilize ranking signals across its various components. Extensive offline and online experiments show that the proposed techniques significantly boost the search engine's performance.
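To make the "cheap yet powerful" idea concrete, below is a minimal PyTorch sketch of a pyramid-style ranker. The abstract does not specify Pyramid-ERNIE's exact design, so the two-stream lower encoder, the layer counts, and names such as PyramidRanker are illustrative assumptions: the lower layers encode the query and the document summary independently, and only a short upper stack performs joint query-document interaction.

```python
# Hypothetical sketch of a pyramid-style ranking encoder (not the paper's
# actual Pyramid-ERNIE): separate lower stacks for query and summary,
# followed by a short joint interaction stack. Sizes are illustrative.
import torch
import torch.nn as nn

class PyramidRanker(nn.Module):
    def __init__(self, vocab_size=30000, d_model=256, n_heads=4,
                 n_lower=2, n_upper=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        make_layer = lambda: nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        # Pyramid base: independent stacks, so the (long) summary encoding
        # does not depend on the query and could be precomputed offline.
        self.query_stack = nn.TransformerEncoder(make_layer(), n_lower)
        self.doc_stack = nn.TransformerEncoder(make_layer(), n_lower)
        # Pyramid tip: joint query-document contextualization.
        self.joint_stack = nn.TransformerEncoder(make_layer(), n_upper)
        self.score = nn.Linear(d_model, 1)  # pointwise relevance head

    def forward(self, query_ids, doc_ids):
        q = self.query_stack(self.embed(query_ids))
        d = self.doc_stack(self.embed(doc_ids))
        joint = self.joint_stack(torch.cat([q, d], dim=1))
        return self.score(joint[:, 0])  # read score off the first position

ranker = PyramidRanker()
q = torch.randint(0, 30000, (2, 8))   # batch of 2 queries, 8 tokens each
d = torch.randint(0, 30000, (2, 64))  # summaries truncated to 64 tokens
print(ranker(q, d).shape)             # torch.Size([2, 1])
```

The presumed cost saving is that only the small query stack and the short joint tip must run per query at serving time; the expensive encoding of the document summary is query-independent, which is one plausible reading of how such an architecture meets strict online latency budgets.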