Abstract
In this paper, we consider continuous-time Markov decision processes with (possibly unbounded) transition and cost rates under the average cost criterion. We present a set of conditions weaker than those in [5, 11, 12, 14], and prove the existence of optimal stationary policies via the average-cost optimality inequality. The theory is illustrated with two examples.
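For orientation, the average-cost optimality inequality referred to above can be sketched in standard CTMDP notation; the symbols here (transition rates \(q(j\mid i,a)\), cost rate \(c(i,a)\), constant \(\rho\), relative cost function \(h\)) follow common usage and are not necessarily the exact notation or conditions of the chapter:

```latex
% Average-cost optimality inequality (standard form):
% a constant \rho and a function h on the state space S satisfy
\rho \;\ge\; \inf_{a \in A(i)} \Bigl\{\, c(i,a) \;+\; \sum_{j \in S} q(j \mid i, a)\, h(j) \,\Bigr\},
\qquad i \in S .
```

Under conditions of the type studied in the chapter, a stationary policy \(f\) attaining the infimum on the right-hand side for every state \(i\) is average-cost optimal, with optimal average cost at most \(\rho\); the inequality form is weaker than the optimality equation (with \(\ge\) replaced by \(=\)) and can hold when the equation fails.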
This research was supported in part by the Natural Science Foundation of Guangdong Province, by the Foundation of the Hong Kong Zhongshan University Advanced Research Center, and by the University of Queensland under Grant No. 98/UQNSRG025G.
References
Bather, J. Optimal stationary policies for denumerable Markov chains in continuous time. Adv. Appl. Prob., 8 (1976), 144–158.
Bertsekas, D. P. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Englewood Cliffs, NJ, 1987.
Doshi, B. T. Continuous time control of Markov processes on an arbitrary state space: average return criterion. Stochastic Processes and their Applications, 4 (1976), 55–77.
Howard, R. A. Dynamic Programming and Markov Processes. Wiley, New York, 1960.
Kakumanu, P. Nondiscounted continuous time Markov decision processes with countable state space. SIAM J. Control, 10 (1972), 210–220.
Lippman, S. A. Applying a new device in the optimization of exponential queueing systems. Op. Res., 23 (1975), 687–710.
Miller, R. L. Finite state continuous time Markov decision processes with an infinite planning horizon. J. Math. Anal. Appl., 22 (1968), 552–569.
Puterman, M. L. Markov Decision Processes. John Wiley & Sons, Inc., 1994.
Yushkevich, A. A. and Feinberg, E. A. On homogeneous Markov model with continuous time and finite or countable state space. Theory Prob. Appl., 24 (1979), 156–161.
Walrand, J. An Introduction to Queueing Networks. Prentice-Hall, Englewood Cliffs, NJ, 1988.
Dong, Z. Q. Continuous time Markov decision programming with average reward criterion: countable state and action space. Scientia Sinica, Special Issue (II) (1979), 131–148.
Song, J. S. Continuous time Markov decision programming with non-uniformly bounded transition rate. Scientia Sinica, 12 (1987), 1258–1267.
Zheng, S. H. Continuous time Markov decision programming with average reward criterion and unbounded reward rate. Acta Math. Appl. Sinica, 7 (1991), 6–16.
Guo, X. P. and Liu, K. Optimality inequality for continuous time Markov decision processes with average reward criterion. Preprint, Zhongshan University, Guangzhou, P.R. China, 1998 (submitted).
Sennott, L. I. Another set of conditions for average optimality in Markov decision processes. Systems & Control Letters, 24 (1995), 147–151.
Sennott, L. I. Average cost optimal stationary policies in infinite state Markov decision processes with unbounded cost. Op. Res., 37 (1989), 626–633.
Cavazos-Cadena, R. and Fernández-Gaucherand, E. Value iteration in a class of average controlled Markov chains with unbounded costs: necessary and sufficient conditions for pointwise convergence. J. Appl. Prob., 33 (1996), 986–1002.
Guo, X. P. and Zhu, W. P. Denumerable state continuous time Markov decision processes with unbounded cost and transition rates under discounted cost criterion. To appear in J. Austral. Math. Soc. Ser. B.
Wu, C. B. Continuous time Markov decision processes with unbounded reward and non-uniformly bounded transition rate under discounted criterion. Acta Math. Appl. Sinica, 20 (1997), 196–208.
Filar, J. A. and Vrieze, K. Competitive Markov Decision Processes. Springer-Verlag, New York, 1996.
Chung, K. L. Markov Chains with Stationary Transition Probabilities. Springer-Verlag, Berlin, 1960.
Haviv, M. and Puterman, M. L. Bias optimality in controlled queueing systems. J. Appl. Prob., 35 (1998), 136–150.
Serfozo, R. Optimal control of random walks, birth and death processes, and queues. Adv. Appl. Prob., 13 (1981), 61–83.
Anderson, W. J. Continuous Time Markov Chains. Springer-Verlag, New York, 1991.
Widder, D.V. The Laplace Transform. Princeton University Press. Princeton, NJ, 1946.
Copyright information
© 2002 Kluwer Academic Publishers
Cite this chapter
Guo, X., Zhu, W. (2002). Optimality Conditions for CTMDP with Average Cost Criterion. In: Hou, Z., Filar, J.A., Chen, A. (eds) Markov Processes and Controlled Markov Chains. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0265-0_10
DOI: https://doi.org/10.1007/978-1-4613-0265-0_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7968-3
Online ISBN: 978-1-4613-0265-0