Abstract
Teaching a computer to understand a natural language question about a given document and answer it correctly is a central, yet unsolved, task in natural language processing. Answering complex questions that require reasoning over multiple sentences is much harder than answering questions that can be resolved by understanding a single sentence. In this paper, we propose a novel global attentional inference (GAI) neural network architecture, which learns useful cues from structural knowledge via a dynamically terminated multi-hop inference mechanism, to answer cloze-style questions. Reinforcement learning is employed to train an inference gate that decides, at each hop, whether to keep accumulating cues or to predict an answer. By exploiting structural knowledge, our model answers complex questions substantially better than the compared methods.
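The abstract's central mechanism is a multi-hop reader whose stopping point is decided by a learned gate rather than a fixed hop count. The sketch below illustrates that control flow only; it is not the authors' model. All names (`multi_hop_inference`, `w_gate`, the state-update rule) are hypothetical, the vectors are plain Python lists, and the gate is a simple logistic function whose stochastic stop decision would, in the paper's setting, be trained with REINFORCE.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def multi_hop_inference(doc, query, w_gate, max_hops=5, rng=None):
    """One forward pass of a dynamically terminated multi-hop reader (sketch).

    doc:    list of token vectors (each a list of floats)
    query:  question vector, used as the initial inference state
    w_gate: hypothetical gate weights for the stop probability
    Returns (attention over tokens, number of hops actually taken).
    """
    rng = rng or random.Random(0)
    state = list(query)
    for hop in range(1, max_hops + 1):
        # Attend to the document with the current inference state as the cue.
        attn = softmax([dot(tok, state) for tok in doc])
        # Accumulate cues: attention-weighted sum of token vectors.
        context = [sum(a * tok[i] for a, tok in zip(attn, doc))
                   for i in range(len(state))]
        # Fold the new cues into the state (a simple averaging update here).
        state = [0.5 * (s + c) for s, c in zip(state, context)]
        # Inference gate: probability of stopping and predicting an answer now.
        p_stop = 1.0 / (1.0 + math.exp(-dot(w_gate, state)))
        if rng.random() < p_stop:  # stochastic decision, trainable via REINFORCE
            break
    return attn, hop
```

Because termination is a non-differentiable sampling step, a policy-gradient estimator such as REINFORCE (Williams, 1992) is the natural way to train the gate, which matches the abstract's description.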
Acknowledgements
This work is supported in part by the 973 Program (No. 2015CB352300), the NSFC (U1611461, 61751209), and the Chinese Knowledge Center of Engineering Science and Technology (CKCEST).
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Song, J., Tang, S., Qian, T., Zhu, W., Wu, F. (2018). Reading Document and Answering Question via Global Attentional Inference. In: Hong, R., Cheng, W.H., Yamasaki, T., Wang, M., Ngo, C.W. (eds.) Advances in Multimedia Information Processing – PCM 2018. Lecture Notes in Computer Science, vol. 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_31
DOI: https://doi.org/10.1007/978-3-030-00776-8_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00775-1
Online ISBN: 978-3-030-00776-8