Algebraic Reinforcement Learning

Neubert, Stefanie; Belzner, Lenz; Wirsing, Martin

doi:10.1007/978-3-319-23165-5_26

Hypothesis Induction for Relational Reinforcement Learning Using Term Generalization

Chapter
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9200)

Abstract

The TG relational reinforcement learning algorithm builds first-order decision trees from perception samples. To this end, it statistically checks the significance of hypotheses about state properties that may be relevant for decision making. Hypothesis generation is restricted by constraints that must be specified manually a priori. In this paper we propose Algebraic Reinforcement Learning (ARL), which removes this requirement by representing states in rewrite theories, so that hypotheses can be induced directly from perception samples via term generalization with the ACUOS system. We compare experimental results for ARL with and without generalization, and show that generalization improves convergence rates and reduces the complexity of the learned trees.

This work has been partially funded by the EU project ASCENS, 257414. Dedicated to José Meseguer.
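
The term generalization mentioned in the abstract can be illustrated with a minimal sketch of purely syntactic anti-unification (Plotkin-style least general generalization). This is not ACUOS, which additionally generalizes modulo associativity, commutativity, and identity and supports order-sorted signatures. The tuple encoding of terms, the function name `generalize`, and the variable names `X0`, `X1`, ... are assumptions made for illustration only.

```python
from itertools import count

# Assumed encoding (not the ARL/ACUOS representation): a term is either a
# string (a constant) or a tuple (function_symbol, arg_1, ..., arg_n).

def generalize(s, t, table=None, fresh=None):
    """Return a syntactic least general generalization of terms s and t."""
    if table is None:
        table = {}       # maps a pair of disagreeing subterms to its variable
    if fresh is None:
        fresh = count()  # supplies indices for fresh variable names
    if s == t:
        return s
    same_head = (
        isinstance(s, tuple) and isinstance(t, tuple)
        and s[0] == t[0] and len(s) == len(t)
    )
    if same_head:
        # Same function symbol and arity: generalize argument by argument.
        args = tuple(generalize(a, b, table, fresh) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    # Disagreement: the same pair of subterms always gets the same variable,
    # which keeps the result as specific as possible.
    if (s, t) not in table:
        table[(s, t)] = f"X{next(fresh)}"
    return table[(s, t)]


# Two hypothetical blocks-world style observations.
obs1 = ("on", "a", ("stack", "b", "c"))
obs2 = ("on", "d", ("stack", "b", "e"))
hypothesis = generalize(obs1, obs2)
```

Here `hypothesis` keeps the structure shared by both observations (`on`, `stack`, and the constant `b`) and replaces the differing positions with variables, which is the kind of candidate a decision tree test could be built from.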

Notes

1. In the original TG algorithm implementation, entailment is decided by Prolog’s SLD-resolution.
2. This is done with a standard statistical F-test for variance comparison, using the statistics stored with the hypothesis; for details, see [6].
3. Note that no domain dynamics (e.g. rewrite rules) are encoded, as these are implicitly learned by the ARL algorithm w.r.t. observed rewards.
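
The significance check described in note 2 can be sketched as follows. This is a hedged illustration, not the test as specified in [6]: it assumes each hypothesis stores a count, a sum, and a sum of squares of observed values for the samples that satisfy it and for those that do not, and it computes a one-way ANOVA style F statistic from those stored statistics. The names `Stats` and `f_statistic` are hypothetical, and the critical value to compare against is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Sufficient statistics for one branch of a candidate split (assumed layout)."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, q):
        """Record one observed value q."""
        self.n += 1
        self.total += q
        self.total_sq += q * q

    def sum_sq_dev(self):
        """Sum of squared deviations from the mean, from the stored sums."""
        if self.n == 0:
            return 0.0
        return self.total_sq - self.total ** 2 / self.n


def f_statistic(pos, neg):
    """F statistic for splitting the samples into the pos and neg groups."""
    n = pos.n + neg.n
    if pos.n == 0 or neg.n == 0 or n <= 2:
        return 0.0  # too little data to judge the split
    merged = Stats(n, pos.total + neg.total, pos.total_sq + neg.total_sq)
    within = pos.sum_sq_dev() + neg.sum_sq_dev()  # variation left after the split
    between = merged.sum_sq_dev() - within        # variation explained by the split
    if within <= 0.0:
        return float("inf") if between > 0.0 else 0.0
    # Two groups: 1 degree of freedom between groups, n - 2 within groups.
    return between / (within / (n - 2))


# Samples satisfying the hypothesis vs. samples that do not (made-up values).
pos, neg = Stats(), Stats()
for q in (0.0, 1.0):
    pos.add(q)
for q in (2.0, 3.0):
    neg.add(q)
f_value = f_statistic(pos, neg)
```

A split would be accepted only if `f_value` exceeds the critical value of the F distribution with 1 and n - 2 degrees of freedom at the chosen significance level. Because only the three running sums are stored per branch, the check can be updated incrementally as new samples arrive.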

References

1. Sutton, R.S., Barto, A.G.: Reinforcement learning: an introduction. IEEE Trans. Neural Netw. 9(5), 1054–1054 (1998)
2. Džeroski, S., De Raedt, L., Driessens, K.: Relational reinforcement learning. Mach. Learn. 43(1–2), 7–52 (2001)
3. Tadepalli, P., Givan, R., Driessens, K.: Relational reinforcement learning: an overview. In: Proceedings of the ICML 2004 Workshop on Relational Reinforcement Learning (2004)
4. Van Otterlo, M.: A survey of reinforcement learning in relational domains (2005)
5. Driessens, K., Ramon, J., Blockeel, H.: Speeding up relational reinforcement learning through the use of an incremental first order decision tree learner. In: Flach, P.A., De Raedt, L. (eds.) ECML 2001. LNCS (LNAI), vol. 2167, pp. 97–108. Springer, Heidelberg (2001)
6. Driessens, K.: Relational reinforcement learning. In: Sammut, C., Webb, G.I. (eds.) Encyclopedia of Machine Learning, pp. 857–862. Springer, New York (2010)
7. Alpuente, M., Escobar, S., Meseguer, J., Ojeda, P.: A modular equational generalization algorithm. In: Hanus, M. (ed.) LOPSTR 2008. LNCS, vol. 5438, pp. 24–39. Springer, Heidelberg (2009)
8. Alpuente, M., Escobar, S., Espert, J., Meseguer, J.: ACUOS: a system for modular ACU generalization with subtyping and inheritance. In: Fermé, E., Leite, J. (eds.) JELIA 2014. LNCS, vol. 8761, pp. 573–581. Springer, Heidelberg (2014)
9. Clavel, M., Durán, F., Eker, S., Lincoln, P., Martí-Oliet, N., Meseguer, J., Talcott, C. (eds.): All About Maude - A High-Performance Logical Framework. LNCS, vol. 4350, pp. 119–129. Springer, Heidelberg (2007)
10. Meseguer, J.: Twenty years of rewriting logic. J. Log. Algebr. Program. 81(7–8), 721–781 (2012)
11. Belzner, L.: Action programming in rewriting logic. TPLP 13(4-5-Online-Supplement) (2013)
12. Belzner, L.: Verifiable decisions in autonomous concurrent systems. In: Kühn, E., Pugliese, R. (eds.) COORDINATION 2014. LNCS, vol. 8459, pp. 17–32. Springer, Heidelberg (2014)
13. Belzner, L.: Value iteration for relational MDPs in rewriting logic. In: Endriss, U., Leite, J. (eds.) STAIRS 2014 - Proceedings of the 7th European Starting AI Researcher Symposium, Prague, Czech Republic, 18–22 August, 2014. Frontiers in Artificial Intelligence and Applications, vol. 264, pp. 61–70. IOS Press, The Netherlands (2014)
14. Wirsing, M., Knapp, A.: A formal approach to object-oriented software engineering. Theor. Comput. Sci. 285(2), 519–560 (2002)
15. Wirsing, M., Denker, G., Talcott, C.L., Poggio, A., Briesemeister, L.: A rewriting logic framework for soft constraints. Electr. Notes Theor. Comput. Sci. 176(4), 181–197 (2007)
16. Belzner, L., De Nicola, R., Vandin, A., Wirsing, M.: Reasoning (on) service component ensembles in rewriting logic. In: Iida, S., Meseguer, J., Ogata, K. (eds.) Specification, Algebra, and Software. LNCS, vol. 8373, pp. 188–211. Springer, Heidelberg (2014)
17. Boronat, A., Knapp, A., Meseguer, J., Wirsing, M.: What is a multi-modeling language? In: Corradini, A., Montanari, U. (eds.) WADT 2008. LNCS, vol. 5486, pp. 71–87. Springer, Heidelberg (2009)
18. Eckhardt, J., Mühlbauer, T., Meseguer, J., Wirsing, M.: Semantics, distributed implementation, and formal analysis of KLAIM models in Maude. Sci. Comput. Program. 99, 24–74 (2015)
19. Eckhardt, J., Mühlbauer, T., AlTurki, M., Meseguer, J., Wirsing, M.: Stable availability under denial of service attacks through formal patterns. In: de Lara, J., Zisman, A. (eds.) FASE 2012. LNCS, vol. 7212, pp. 78–93. Springer, Heidelberg (2012)
20. Blockeel, H., De Raedt, L.: Top-down induction of first-order logical decision trees. Artif. Intell. 101(1–2), 285–297 (1998)
21. Blockeel, H., De Raedt, L.: Lookahead and discretization in ILP. In: Džeroski, S., Lavrač, N. (eds.) ILP 1997. LNCS, vol. 1297, pp. 77–84. Springer, Heidelberg (1997)
22. Castillo, L.P., Wrobel, S.: A comparative study on methods for reducing myopia of hill-climbing search in multirelational learning. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 19. ACM (2004)
23. Russell, S.J., Norvig, P.: Artificial Intelligence - A Modern Approach, 3rd edn. Pearson Education, New York (2010)
24. Neubert, S.: Solving relational reinforcement learning problems with a combination of incremental decision trees and generalization. Master’s thesis, Ludwig-Maximilians-Universität München, Germany (2014)
25. Quinlan, J.R.: C4.5: Programs for Machine Learning, vol. 1. Morgan Kaufmann, San Mateo (1993)
26. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Comput. Commun. Rev. 5(1), 3–55 (2001)
27. Driessens, K., Ramon, J.: Relational instance based regression for relational reinforcement learning. In: ICML, pp. 123–130 (2003)
28. Gärtner, T., Driessens, K., Ramon, J.: Graph kernels and Gaussian processes for relational reinforcement learning. In: Horváth, T., Yamamoto, A. (eds.) ILP 2003. LNCS (LNAI), vol. 2835, pp. 146–163. Springer, Heidelberg (2003)
29. Boutilier, C., Reiter, R., Price, B.: Symbolic dynamic programming for first-order MDPs. In: Nebel, B. (ed.) IJCAI, pp. 690–700. Morgan Kaufmann, Seattle (2001)
30. Wang, C., Joshi, S., Khardon, R.: First order decision diagrams for relational MDPs. J. Artif. Intell. Res. 31, 431–472 (2008)
31. Sanner, S., Kersting, K.: Symbolic dynamic programming for first-order POMDPs (2010)
32. Rodrigues, C., Gérard, P., Rouveirol, C., Soldano, H.: Incremental learning of relational action rules. In: 2010 Ninth International Conference on Machine Learning and Applications (ICMLA), pp. 451–458. IEEE (2010)
33. Khot, T., Natarajan, S., Kersting, K., Shavlik, J.: Learning Markov logic networks via functional gradient boosting. In: 2011 IEEE 11th International Conference on Data Mining (ICDM), pp. 320–329. IEEE (2011)
34. Hölzl, M., Gabor, T.: Reasoning and learning for awareness and adaptation. In: Wirsing, M., Hölzl, M., Koch, N., Mayer, P. (eds.) Software Engineering for Collective Autonomic Systems: Results of the ASCENS Project. LNCS, vol. 8998, pp. 249–290. Springer, Heidelberg (2015)

Author information

Authors and Affiliations

Ludwig-Maximilians-Universität München, Munich, Germany
Stefanie Neubert, Lenz Belzner & Martin Wirsing

Corresponding author

Correspondence to Lenz Belzner.

Editor information

Editors and Affiliations

Universidad Complutense de Madrid, Madrid, Spain
Narciso Martí-Oliet
University of Oslo, Oslo, Norway
Peter Csaba Ölveczky
SRI International, Menlo Park, California, USA
Carolyn Talcott

About this chapter

Cite this chapter

Neubert, S., Belzner, L., Wirsing, M. (2015). Algebraic Reinforcement Learning. In: Martí-Oliet, N., Ölveczky, P., Talcott, C. (eds) Logic, Rewriting, and Concurrency. Lecture Notes in Computer Science, vol 9200. Springer, Cham. https://doi.org/10.1007/978-3-319-23165-5_26

DOI: https://doi.org/10.1007/978-3-319-23165-5_26
Published: 27 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23164-8
Online ISBN: 978-3-319-23165-5
eBook Packages: Computer Science, Computer Science (R0)