Abstract
A Monte-Carlo evaluation estimates the value of a position by averaging the outcomes of several random continuations. The method can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework for combining tree search with Monte-Carlo evaluation that does not separate a min-max phase from a Monte-Carlo phase. Instead of backing up the min-max value close to the root and the average value at some depth, it defines a more general backup operator that progressively changes from averaging to min-max as the number of simulations grows. This approach provides fine-grained control of tree growth, at the level of individual simulations, and allows efficient selectivity. The resulting algorithm was implemented in a 9×9 Go-playing program, Crazy Stone, which won the 10th KGS computer-Go tournament.
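The idea of a backup operator that moves from averaging to min-max can be illustrated with a minimal Python sketch. This is not the operator defined in the paper: the function name `mixed_backup`, the constant `k`, and the weighting rule `n / (n + k)` are assumptions chosen only to show the qualitative behaviour at a max node, where few simulations give a value close to the average and many simulations give a value close to the best child.

```python
def mixed_backup(child_means, child_counts, k=16.0):
    """Blend the visit-weighted average of child values with their maximum.

    Illustrative only: `k` and the weight n / (n + k) are assumptions,
    not the formula used by the paper.
    """
    total = sum(child_counts)
    if total == 0:
        raise ValueError("no simulations to back up")

    # Visit-weighted mean: what plain Monte-Carlo averaging would return.
    mean_value = sum(m * n for m, n in zip(child_means, child_counts)) / total

    # Best visited child: what a min-max backup returns at a max node.
    max_value = max(m for m, n in zip(child_means, child_counts) if n > 0)

    # The weight grows from 0 (pure averaging) towards 1 (pure max)
    # as the number of simulations increases.
    w = total / (total + k)
    return (1.0 - w) * mean_value + w * max_value


if __name__ == "__main__":
    means = [0.2, 0.8]
    for counts in ([1, 1], [10, 10], [1000, 1000]):
        print(counts, mixed_backup(means, counts, k=20.0))
```

With two children of mean value 0.2 and 0.8 visited equally often, the backed-up value starts near the average 0.5 and approaches 0.8 as the visit counts grow. At a min node the same sketch would use the minimum instead of the maximum.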
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Coulom, R. (2007). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. In: van den Herik, H.J., Ciancarini, P., Donkers, H.H.L.M. (eds) Computers and Games. CG 2006. Lecture Notes in Computer Science, vol 4630. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75538-8_7
DOI: https://doi.org/10.1007/978-3-540-75538-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75537-1
Online ISBN: 978-3-540-75538-8
eBook Packages: Computer Science (R0)