Abstract
The Factored Markov Decision Process (FMDP) framework is a standard representation for sequential decision problems under uncertainty in which the state is represented as a collection of random variables. Factored Reinforcement Learning (FRL) is a model-based reinforcement learning approach to FMDPs in which the transition and reward functions of the problem are learned. In this paper, we show how to model, in a theoretically well-founded way, problems where some combinations of state variable values cannot occur, giving rise to impossible states. Furthermore, we propose a new heuristic that treats states that have not been seen so far as impossible. We derive an algorithm whose performance improvement over the standard approach is illustrated through benchmark experiments.
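The heuristic described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, only an illustration of the core idea under simplifying assumptions: a tabular model-based learner over a hypothetical factored state space of two binary variables, which records which states have been observed and treats every never-seen combination of variable values as impossible when queried.

```python
# Minimal sketch (not the paper's algorithm): treat unseen states as
# impossible in a tabular model-based learner over a factored state space.
from itertools import product

# Hypothetical factored state space: two binary state variables.
VARIABLES = [(0, 1), (0, 1)]
ALL_STATES = set(product(*VARIABLES))


class UnseenAsImpossibleModel:
    def __init__(self):
        self.seen = set()        # states observed in experience so far
        self.transitions = {}    # (state, action) -> {next_state: count}

    def observe(self, s, a, s_next):
        """Record one experienced transition (s, a) -> s_next."""
        self.seen.update([s, s_next])
        counts = self.transitions.setdefault((s, a), {})
        counts[s_next] = counts.get(s_next, 0) + 1

    def successors(self, s, a):
        """Empirical successor distribution for (s, a)."""
        counts = self.transitions.get((s, a), {})
        total = sum(counts.values())
        return {s2: c / total for s2, c in counts.items()} if total else {}

    def impossible_states(self):
        """Heuristic: every state never seen so far is deemed impossible."""
        return ALL_STATES - self.seen


model = UnseenAsImpossibleModel()
model.observe((0, 0), "a", (0, 1))
# Only (0, 0) and (0, 1) have been seen; the other two combinations of
# variable values are treated as impossible.
```

A planner built on such a model would simply skip the states returned by `impossible_states()`, shrinking the effective state space it has to reason over.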
© 2009 Springer-Verlag Berlin Heidelberg
Kozlova, O., Sigaud, O., Wuillemin, PH., Meyer, C. (2009). Considering Unseen States as Impossible in Factored Reinforcement Learning. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science(), vol 5781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04180-8_64
Print ISBN: 978-3-642-04179-2
Online ISBN: 978-3-642-04180-8