Theoretical Computer Science
Volume 157, Issue 2, 5 May 1996, Pages 161-183
 
doi:10.1016/0304-3975(95)00158-1
Copyright © 1996 Published by Elsevier Science B.V.

On the complexity of partially observed Markov decision processes

Dima Burago (a, b, 1), Michel de Rougemont (c) and Anatol Slissenko (d, e, f, *)

a Laboratory for Theory of Algorithms, SPIIRAN2, St. Petersburg, Russia
b LRI, Université Paris-Sud, France
c Laboratoire de Recherche en Informatique, Université Paris-Sud, Bât. 490, F-91405, Orsay, France
d Université Paris-12, Bât. P3, Informatique, 61, Ave. du Général de Gaulle, 94010, Créteil, France
e LITP, Institut Blaise Pascal, Paris, France
f Laboratory for Theory of Algorithms, SPIIRAN2, St. Petersburg, Russia

Available online 12 February 1999.


Abstract

In this paper we consider the complexity of constructing optimal policies (strategies) for a class of partially observed Markov decision processes. This particular case of the classical problem deals with finite stationary processes and can be represented as constructing optimal strategies to reach target vertices from a starting vertex in a graph with colored vertices and probabilistic deviations from the edge chosen to follow. The colors of the visited vertices are the only information available to a strategy. The complexity of the Markov decision problem in the case of perfect information (bijective coloring of vertices) is known and is briefly surveyed at the beginning of the paper. For the unobservable case (all the colors are equal) we improve the result of Papadimitriou and Tsitsiklis: we show that constructing even a very weak approximation to an optimal strategy is NP-hard. Our main results concern the case of a fixed bound on the multiplicity of coloring, that is, partially observed processes where an upper bound on the unobservability is assumed. We show that the problem of finding an optimal strategy is still NP-hard, but polytime approximations are possible. Some relations of our results to the Max-Word Problem are also indicated.
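
The graph model described in the abstract can be illustrated with a small sketch. The code below is not taken from the paper. It assumes a finite horizon, a hypothetical adjacency-list encoding (`succ`, `color`), and a simple deviation model in which, with probability `deviation`, the walk leaves along one of the other outgoing edges chosen uniformly. A strategy sees only the tuple of colors visited so far, so the same evaluator covers both the perfect-information case (bijective coloring) and the unobservable case (all colors equal).

```python
from collections import defaultdict


def reach_probability(succ, color, deviation, strategy, start, targets, horizon):
    """Probability that `strategy` reaches a target within `horizon` moves.

    succ[v]   : list of out-neighbours of vertex v (hypothetical encoding)
    color[v]  : colour observed when v is visited
    deviation : probability of leaving along an edge other than the chosen one
                (assumed uniform over the other edges; not from the paper)
    strategy  : maps the tuple of colours seen so far to an index into succ[v]
    """
    if start in targets:
        return 1.0
    # Distribution over (current vertex, colour history); targets are absorbing.
    dist = {(start, (color[start],)): 1.0}
    reached = 0.0
    for _ in range(horizon):
        nxt = defaultdict(float)
        for (v, hist), p in dist.items():
            out = succ[v]
            choice = strategy(hist)
            for i, w in enumerate(out):
                if len(out) == 1:
                    q = 1.0  # nowhere to deviate to
                elif i == choice:
                    q = 1.0 - deviation
                else:
                    q = deviation / (len(out) - 1)
                if w in targets:
                    reached += p * q
                else:
                    nxt[(w, hist + (color[w],))] += p * q
        dist = nxt
    return reached


# Toy instance (invented for illustration): vertex 3 is the target.
succ = {0: [1, 2], 1: [3, 0], 2: [0, 3], 3: [3]}

# Perfect information: every vertex has its own colour.
observed = {0: "a", 1: "b", 2: "c", 3: "t"}
# Unobservable case: all colours are equal.
blind = {v: "x" for v in succ}


def informed(hist):
    # Pick the edge that leads to the target from the vertex just identified.
    return 1 if hist[-1] == "c" else 0


def always_first(hist):
    # A blind strategy can depend only on the number of steps taken.
    return 0


p_observed = reach_probability(succ, observed, 0.1, informed, 0, {3}, 2)
p_blind = reach_probability(succ, blind, 0.1, always_first, 0, {3}, 2)
print(f"observed: {p_observed:.2f}  blind: {p_blind:.2f}")
```

In this toy instance the informed strategy reaches the target within two moves with probability 0.9, while the blind strategy shown reaches it with probability 0.82, because it cannot tell whether a deviation occurred on the first move. The evaluator enumerates color histories, so its cost can grow exponentially with the horizon. It only scores a given strategy and makes no attempt to find an optimal one, which is the problem the abstract states is NP-hard.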
