short-paper

UserSimCRS: A User Simulation Toolkit for Evaluating Conversational Recommender Systems

Authors:
Jafar Afzali, University of Stavanger, Stavanger, Norway (ORCID: 0000-0001-7822-8222)
Aleksander Mark Drzewiecki, University of Stavanger, Stavanger, Norway (ORCID: 0000-0001-6426-6017)
Krisztian Balog, University of Stavanger, Stavanger, Norway (ORCID: 0000-0003-2762-721X)
Shuo Zhang, Bloomberg, London, United Kingdom (ORCID: 0000-0003-3179-4125)

WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, February 2023, Pages 1160–1163. https://doi.org/10.1145/3539597.3573029

Published: 27 February 2023


ABSTRACT

We present an extensible user simulation toolkit to facilitate automatic evaluation of conversational recommender systems. It builds on an established agenda-based approach and extends it with several novel elements, including user satisfaction prediction, persona and context modeling, and conditional natural language generation. We showcase the toolkit with a pre-existing movie recommender system and demonstrate its ability to simulate dialogues that mimic real conversations, while requiring only a handful of manually annotated dialogues as training data.
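
To make the agenda-based idea mentioned in the abstract concrete, the following is a minimal sketch, not the toolkit's actual API. It assumes the common formulation in which the simulated user keeps a stack of pending intents (the agenda), pops one per turn, and pushes new intents in reaction to the system. The class name `AgendaUser`, the intent labels, and the `simulate` helper are all hypothetical, and the sketch leaves out satisfaction prediction, persona and context modeling, and natural language generation.

```python
from dataclasses import dataclass, field


@dataclass
class AgendaUser:
    """Toy agenda-based simulated user. All names are hypothetical."""

    liked_genres: set
    # The agenda is a stack of pending user intents; the last element is the top.
    agenda: list = field(
        default_factory=lambda: ["COMPLETE", "INQUIRE", "DISCLOSE"]
    )

    def update(self, agent_intent, item_genre=None):
        """Push a follow-up intent in reaction to the agent's last act."""
        if agent_intent == "ELICIT":
            # The agent asked for preferences, so the user plans to disclose one.
            self.agenda.append("DISCLOSE")
        elif agent_intent == "RECOMMEND":
            # Accept only items whose genre matches the user's preferences.
            liked = item_genre in self.liked_genres
            self.agenda.append("ACCEPT" if liked else "REJECT")

    def next_intent(self):
        """Pop the intent on top of the agenda; end the dialogue if empty."""
        return self.agenda.pop() if self.agenda else "COMPLETE"


def simulate(user, agent_turns):
    """Run a scripted agent against the user and return the user's intents."""
    trace = [user.next_intent()]  # the user opens the conversation
    for agent_intent, genre in agent_turns:
        user.update(agent_intent, genre)
        trace.append(user.next_intent())
    return trace


user = AgendaUser(liked_genres={"sci-fi"})
trace = simulate(
    user,
    [("ELICIT", None), ("RECOMMEND", "drama"), ("RECOMMEND", "sci-fi")],
)
print(trace)
```

In this sketch the scripted agent stands in for the recommender system under evaluation. A real setup would replace it with the system's responses and map each user intent to a natural language utterance.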

Index Terms

UserSimCRS: A User Simulation Toolkit for Evaluating Conversational Recommender Systems
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems

Published in
WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining
February 2023
1345 pages
ISBN:9781450394079
DOI:10.1145/3539597
General Chairs: Tat-Seng Chua (National University of Singapore), Hady Lauw (Singapore Management University)
Program Chairs: Luo Si (Salesforce), Evimaria Terzi (Boston University), Panayiotis Tsaparas (University of Ioannina)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
conversational recommender systems
user simulation
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 498 of 2,863 submissions, 17%