Bayesian Adaptive Stochastic Process Termination
Inchi Hu,
Chi-Wen Jevons Lee
Department of Information and Systems Management, School of Business and Management, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
A. B. Freeman School of Business, Tulane University, New Orleans, Louisiana 70118-5669, USA
imichu{at}ust.hk
jevons.lee{at}tulane.edu
This paper considers the problem of optimally terminating a number of stochastic processes when the time-varying random rewards have distributions belonging to an exponential family with an unknown parameter. The problem is formulated as a Bayesian adaptive control model whose objective is to minimize the difference between the expected reward and the optimal reward attainable when the parameter is known. The paper establishes an asymptotic lower bound on this difference and constructs policies, based on a Kullback-Leibler index, that attain this lower bound. The results are applied to models of tree harvesting and destructive testing. A simulation study shows that these policies are efficient when the number of processes is large.
Key Words: Irrevocable adaptive policy; Kullback-Leibler index; multi-armed bandit problem; sequential test; stopping time
History: Received November 13, 1995; revisions received February 6, 1997; November 14, 1997; May 7, 1999; April 2, 2001; January 17, 2002; July 24, 2002.
Copyright © 2003 by INFORMS.