DOI: 10.1145/3575870.3587117
research-article
Open Access
Results Reproduced / v1.1

Interval Markov Decision Processes with Continuous Action-Spaces

Published: 09 May 2023

ABSTRACT

Interval Markov Decision Processes (IMDPs) are finite-state uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic systems for control synthesis. However, due to the absence of algorithms for synthesis over IMDPs with continuous action-spaces, the action-space is assumed discrete a priori, which is a restrictive assumption for many applications. Motivated by this, we introduce continuous-action IMDPs (caIMDPs), where the bounds on transition probabilities are functions of the action variables, and study value iteration for maximizing expected cumulative rewards. Specifically, we decompose the max-min problem associated with value iteration into |𝒬| max problems, where |𝒬| is the number of states of the caIMDP. Then, exploiting the simple form of these max problems, we identify cases where value iteration over caIMDPs can be solved efficiently (e.g., with linear or convex programming). We also gain other interesting insights: e.g., in certain cases where the action set 𝒜 is a polytope, synthesis over a discrete-action IMDP, where the actions are the vertices of 𝒜, is sufficient for optimality. We demonstrate our results on a numerical example. Finally, we include a short discussion on employing caIMDPs as abstractions for control synthesis.
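For orientation, the max-min structure mentioned in the abstract can be illustrated on the simpler discrete-action case. The sketch below is not the caIMDP algorithm of the paper; it is a minimal Python version of the standard robust Bellman backup for an ordinary IMDP with finitely many actions, where the inner minimization over an interval of distributions is solved by a greedy ordering of successor states. All names (`worst_case_expectation`, `robust_backup`) and the `reward` and `discount` parameters are hypothetical, introduced only for this illustration.

```python
from typing import Sequence


def worst_case_expectation(
    lower: Sequence[float], upper: Sequence[float], values: Sequence[float]
) -> float:
    """Minimise sum_j p[j] * values[j] over lower <= p <= upper, sum(p) == 1."""
    if sum(lower) > 1.0 + 1e-12 or sum(upper) < 1.0 - 1e-12:
        raise ValueError("intervals admit no probability distribution")
    p = list(lower)              # every successor gets at least its lower bound
    budget = 1.0 - sum(lower)    # probability mass still to be distributed
    # The adversary puts the free mass on the lowest-value successors first.
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        extra = min(upper[j] - lower[j], budget)
        p[j] += extra
        budget -= extra
    return sum(pj * vj for pj, vj in zip(p, values))


def robust_backup(
    intervals_by_action: Sequence[tuple[Sequence[float], Sequence[float]]],
    values: Sequence[float],
    reward: float = 0.0,     # assumption: state reward, not taken from the paper
    discount: float = 1.0,   # assumption: discount factor, not taken from the paper
) -> float:
    """One max-min backup for a single state of a discrete-action IMDP."""
    return reward + discount * max(
        worst_case_expectation(lo, up, values) for lo, up in intervals_by_action
    )


if __name__ == "__main__":
    values = [1.0, 0.0, 2.0]
    actions = [
        ([0.1, 0.2, 0.3], [0.5, 0.6, 0.7]),
        ([0.0, 0.0, 0.5], [0.5, 0.5, 1.0]),
    ]
    print(robust_backup(actions, values))
```

In the continuous-action setting described above, the outer `max` would range over a continuous set 𝒜 and the interval bounds would depend on the action, which is why a plain enumeration like this one no longer applies directly.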

