ABSTRACT
We consider context-response matching with multiple types of representations for multi-turn response selection in retrieval-based chatbots. The representations encode the semantics of contexts and responses at the level of words, n-grams, and sub-sequences of utterances, and capture both short-term and long-term dependencies among words. With these representations in hand, we study how to fuse them in a deep neural architecture for matching and how each of them contributes to matching. To this end, we propose a multi-representation fusion network in which the representations can be fused into matching at an early stage, at an intermediate stage, or at the last stage. We empirically compare different representations and fusing strategies on two benchmark data sets. Evaluation results indicate that late fusion is always better than early fusion, and that by fusing the representations at the last stage, our model significantly outperforms existing methods and achieves new state-of-the-art performance on both data sets. Through a thorough ablation study, we demonstrate the effect of each representation on matching, which sheds light on how to select them in practical systems.
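The contrast between early and late fusion can be illustrated with a toy sketch. This is not the paper's network: the vectors, the cosine matching function, and the uniform averaging used for late fusion are all assumptions chosen for illustration, and the function names are hypothetical. Intermediate fusion is not shown.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def early_fusion_score(context_reps, response_reps):
    """Early fusion (illustrative): concatenate every representation
    type into one vector per side, then match once."""
    c = [x for rep in context_reps for x in rep]
    r = [x for rep in response_reps for x in rep]
    return cosine(c, r)


def late_fusion_score(context_reps, response_reps):
    """Late fusion (illustrative): match each representation type
    separately, then combine the per-type scores. Uniform averaging
    is an assumption standing in for a learned aggregation."""
    scores = [cosine(c, r) for c, r in zip(context_reps, response_reps)]
    return sum(scores) / len(scores)


# Toy input: two representation types per side (say, word-level and
# n-gram-level), each a 2-dimensional vector. Values are made up.
context = [[1.0, 0.0], [0.0, 2.0]]
response = [[1.0, 0.0], [2.0, 0.0]]

early = early_fusion_score(context, response)
late = late_fusion_score(context, response)
print(f"early={early:.2f} late={late:.2f}")
```

In this toy case the two strategies give different scores for the same inputs, because early fusion mixes all representation types before matching while late fusion keeps each type's matching signal separate until the end.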
Index Terms
- Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots