Abstract
In this paper, we consider continuous-time Markov decision processes with (possibly unbounded) transition and cost rates under the average cost criterion. We present a set of conditions weaker than those in [5, 11, 12, 14], and prove the existence of optimal stationary policies via the average-cost optimality inequality. The theory is illustrated with two examples.
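For orientation, the average-cost optimality inequality referred to above can be sketched in standard CTMDP notation; the symbols here (transition rates \(q(j\mid i,a)\), cost rate \(c(i,a)\), constant \(\rho\), relative cost function \(h\)) follow common usage and are not necessarily the exact notation or conditions of the chapter:

```latex
% Average-cost optimality inequality (standard form):
% a constant \rho and a function h on the state space S satisfy
\rho \;\ge\; \inf_{a \in A(i)} \Bigl\{\, c(i,a) \;+\; \sum_{j \in S} q(j \mid i, a)\, h(j) \,\Bigr\},
\qquad i \in S .
```

Under conditions of the type studied in the chapter, a stationary policy \(f\) attaining the infimum on the right-hand side for every state \(i\) is average-cost optimal, with optimal average cost at most \(\rho\); the inequality form is weaker than the optimality equation (with \(\ge\) replaced by \(=\)) and can hold when the equation fails.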
This research was supported in part by the Natural Science Foundation of Guangdong Province, by the Foundation of the Hong Kong Zhongshan University Advanced Research Center, and by the University of Queensland under Grant No. 98/UQNSRG025G.
References
Bather, J. Optimal stationary policies for denumerable Markov chains in continuous time. Adv. Appl. Prob., 8 (1976), 144–158.
Bertsekas, D. P. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Englewood Cliffs, NJ, 1987.
Doshi, B. T. Continuous time control of Markov processes on an arbitrary state space: average return criterion. Stochastic Processes and their Applications, 4 (1976), 55–77.
Howard, R. A. Dynamic Programming and Markov Processes. Wiley, New York, 1960.
Kakumanu, P. Nondiscounted continuous time Markov decision processes with countable state space. SIAM J. Control, 10 (1972), 210–220.
Lippman, S. A. Applying a new device in the optimization of exponential queueing systems. Op. Res., 23 (1975), 687–710.
Miller, R. L. Finite state continuous time Markov decision processes with an infinite planning horizon. J. Math. Anal. Appl., 22 (1968), 552–569.
Puterman, M. L. Markov Decision Processes. John Wiley & Sons, Inc., 1994.
Yushkevich, A. A. and Feinberg, E. A. On homogeneous Markov model with continuous time and finite or countable state space. Theory Prob. Appl., 24 (1979), 156–161.
Walrand, J. An Introduction to Queueing Networks. Prentice-Hall, Englewood Cliffs, NJ, 1988.
Dong, Z. Q. Continuous time Markov decision programming with average reward criterion: countable state and action space. Scientia Sinica, Special Issue (II) (1979), 131–148.
Song, J. S. Continuous time Markov decision programming with non-uniformly bounded transition rate. Scientia Sinica, 12 (1987), 1258–1267.
Zheng, S. H. Continuous time Markov decision programming with average reward criterion and unbounded reward rate. Acta Math. Appl. Sinica, 7 (1991), 6–16.
Guo, X. P. and Liu, K. Optimality inequality for continuous time Markov decision processes with average reward criterion. Preprint, Zhongshan University, Guangzhou, P.R. China, 1998 (submitted).
Sennott, L. I. Another set of conditions for average optimality in Markov decision processes. Systems & Control Letters, 24 (1995), 147–151.
Sennott, L. I. Average cost optimal stationary policies in infinite state Markov decision processes with unbounded cost. Op. Res., 37 (1989), 626–633.
Cavazos-Cadena, R. and Fernández-Gaucherand, E. Value iteration in a class of average controlled Markov chains with unbounded costs: necessary and sufficient conditions for pointwise convergence. J. Appl. Prob., 33 (1996), 986–1002.
Guo, X. P. and Zhu, W. P. Denumerable state continuous time Markov decision processes with unbounded cost and transition rates under discounted cost criterion. To appear in J. Austral. Math. Soc. Ser. B.
Wu, C. B. Continuous time Markov decision processes with unbounded reward and non-uniformly bounded transition rate under discounted criterion. Acta Math. Appl. Sinica, 20 (1997), 196–208.
Filar, J. A. and Vrieze, K. Competitive Markov Decision Processes. Springer-Verlag, New York, 1996.
Chung, K. L. Markov Chains with Stationary Transition Probabilities. Springer-Verlag, Berlin, 1960.
Haviv, M. and Puterman, M. L. Bias optimality in controlled queueing systems. J. Appl. Prob., 35 (1998), 136–150.
Serfozo, R. Optimal control of random walks, birth and death processes, and queues. Adv. Appl. Prob., 13 (1981), 61–83.
Anderson, W. J. Continuous Time Markov Chains. Springer-Verlag, New York, 1991.
Widder, D.V. The Laplace Transform. Princeton University Press. Princeton, NJ, 1946.
Copyright information
© 2002 Kluwer Academic Publishers
Cite this chapter
Guo, X., Zhu, W. (2002). Optimality Conditions for CTMDP with Average Cost Criterion. In: Hou, Z., Filar, J.A., Chen, A. (eds) Markov Processes and Controlled Markov Chains. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0265-0_10
DOI: https://doi.org/10.1007/978-1-4613-0265-0_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7968-3
Online ISBN: 978-1-4613-0265-0