Abstract
How should reinforcement learning (RL) agents explain themselves to humans not trained in AI? To gain insights into this question, we conducted a 124-participant, four-treatment experiment to compare participants’ mental models of an RL agent in the context of a simple Real-Time Strategy (RTS) game. The four treatments isolated each of two explanation types alone, both together, and neither. The two explanation types were: (1) saliency maps (an “Input Intelligibility Type” that explains the AI’s focus of attention) and (2) reward-decomposition bars (an “Output Intelligibility Type” that explains the AI’s predictions of future types of rewards). Our results show that the combined explanation, with both saliency maps and reward bars, was needed to achieve a statistically significant improvement in participants’ mental model scores over the no-explanation treatment. However, the combined explanation was far from a panacea: it exacted a disproportionately high cognitive load from the participants who received it. Further, in some situations, participants who saw both explanations predicted the agent’s next action worse than participants in all other treatments.
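To make the reward-decomposition idea concrete, the sketch below (not the paper's implementation; the actions, reward types, and values are hypothetical) shows how an agent's Q-value can be split into per-reward-type components that sum to the overall action value, so an explanation UI can draw one bar per component for each candidate action:

```python
# Hedged sketch: reward decomposition splits an agent's Q-value into
# per-reward-type components. An explanation UI can render one bar per
# component, showing *why* the agent prefers an action.

# Hypothetical decomposed Q-values for one state of a small RTS game:
# each action maps reward types to the agent's predicted future reward.
decomposed_q = {
    "attack_fort": {"damage_dealt": 8.0, "damage_taken": -3.0, "win_bonus": 5.0},
    "retreat":     {"damage_dealt": 0.0, "damage_taken": -0.5, "win_bonus": 1.0},
}

def total_q(components):
    """The overall action value is the sum of its reward-type components."""
    return sum(components.values())

def best_action(q_table):
    """The greedy policy picks the action with the largest summed Q-value."""
    return max(q_table, key=lambda a: total_q(q_table[a]))

# Each action's bars explain its total: attack_fort sums to 10.0, retreat to 0.5.
for action, components in decomposed_q.items():
    print(action, total_q(components), components)

print(best_action(decomposed_q))  # -> attack_fort
```

Here a participant can see not just that the agent attacks, but that it does so because the predicted damage dealt and win bonus outweigh the expected damage taken.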
Index Terms
- Mental Models of Mere Mortals with Explanations of Reinforcement Learning