ABSTRACT
Causal questions ask about causal relationships between events or phenomena. They matter for a variety of use cases, including virtual assistants and search engines. However, many current approaches to causal question answering cannot provide explanations or evidence for their answers. In this paper, we therefore answer causal questions with a causality graph, a large-scale dataset of causal relations between noun phrases together with the relations' provenance data. Inspired by recent successful applications of reinforcement learning to knowledge graph tasks such as link prediction and fact-checking, we explore the application of reinforcement learning on a causality graph for causal question answering. We introduce an Actor-Critic-based agent that learns to search through the graph to answer causal questions. To cope with large action spaces and sparse rewards, we bootstrap the agent with a supervised learning procedure. Our evaluation shows that the agent successfully prunes the search space when answering binary causal questions: it visits fewer than 30 nodes per question, compared to over 3,000 nodes for a naive breadth-first search. Our ablation study indicates that the supervised learning strategy provides a strong foundation upon which the reinforcement learning agent improves. The paths returned by our agent explain the mechanisms by which a cause produces an effect. Moreover, for each edge on a path, the causality graph provides the original source, allowing for easy verification of paths.
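As a rough illustration of the naive breadth-first baseline mentioned in the abstract, the sketch below answers a binary causal question ("does X cause Y?") by searching a toy causality graph and counting how many nodes it visits. The function name `bfs_causal_path`, the `max_hops` limit, and the toy graph are assumptions made for this example; they are not taken from the paper, and the learned agent itself is not shown.

```python
from collections import deque


def bfs_causal_path(graph, cause, effect, max_hops=3):
    """Breadth-first search from `cause` towards `effect`.

    `graph` maps a noun phrase to the list of phrases it is said to cause.
    Returns (path, visited_count); path is None when no causal chain of at
    most `max_hops` edges is found. `max_hops` is an illustrative assumption.
    """
    visited = {cause}
    queue = deque([[cause]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == effect:
            return path, len(visited)
        if len(path) - 1 >= max_hops:
            continue  # do not expand beyond the hop limit
        for successor in graph.get(node, ()):
            if successor not in visited:
                visited.add(successor)
                queue.append(path + [successor])
    return None, len(visited)


# Hypothetical toy graph; a real causality graph is far larger.
toy_graph = {
    "smoking": ["lung cancer", "bad breath"],
    "lung cancer": ["death"],
    "bad breath": [],
}

path, visited = bfs_causal_path(toy_graph, "smoking", "death")
print(path, visited)
```

The returned path doubles as an explanation of the answer, and the visited-node count is the quantity that a learned search policy would aim to reduce compared to this exhaustive expansion.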
Index Terms
- Causal Question Answering with Reinforcement Learning