Abstract
In computer Go, the MCTS algorithm uses confidence intervals to select the most promising moves, which are then evaluated with Monte-Carlo simulations. Smart selection of moves for evaluation has a crucial impact on a program's playing strength. This paper describes the application of confidence intervals for binomially distributed random variables in computer Go. In practice, estimating confidence intervals for the binomial distribution has been difficult and computationally expensive. Thanks to progress in computer technology and to functions offered by many libraries, calculating confidence intervals for the discrete binomial distribution has become an easy task. This research shows that a move-selection strategy that computes exact confidence intervals based on the discrete binomial distribution is much more effective than one based on the normal distribution. The new approach shows its advantages particularly in games played on medium and large boards.
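To make the contrast between the two kinds of interval concrete, the following is a minimal sketch, not the paper's implementation. It assumes that "exact" refers to the Clopper-Pearson construction and that the normal-based interval is the simple Wald approximation; the abstract names neither. The function names (`binom_cdf`, `solve_decreasing`, `exact_interval`, `normal_interval`) are hypothetical, and the exact bounds are found by plain bisection on the binomial CDF rather than by a library routine.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); direct summation, fine for modest n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, iters=60):
    """Bisection root of a function that falls from positive to negative on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(wins, n, conf=0.95):
    """Clopper-Pearson interval (assumed here to be the 'exact' method)."""
    half = (1 - conf) / 2
    # Lower bound: the p at which P(X >= wins) equals alpha / 2.
    lower = 0.0 if wins == 0 else solve_decreasing(
        lambda p: half - (1 - binom_cdf(wins - 1, n, p)))
    # Upper bound: the p at which P(X <= wins) equals alpha / 2.
    upper = 1.0 if wins == n else solve_decreasing(
        lambda p: binom_cdf(wins, n, p) - half)
    return lower, upper


def normal_interval(wins, n, conf=0.95):
    """Wald interval (assumed here to be the normal approximation), clipped to [0, 1]."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = wins / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


if __name__ == "__main__":
    # A move that won 7 of 10 playouts, then one that won all 10.
    for wins in (7, 10):
        print(wins, exact_interval(wins, 10), normal_interval(wins, 10))
```

With few playouts the two intervals differ noticeably. For a move that has won every one of its 10 playouts, the Wald interval collapses to the single point 1.0, whereas the Clopper-Pearson interval keeps a lower bound of about 0.69, so the remaining uncertainty is still visible to the selection step.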
Notes
1. Search strategies (UCT, UCB1-TUNED, MOSS, etc.) are often described using the terminology of the Multi-Armed Bandit (MAB) problem.
2. Implementing the OMC move-selection strategy does not require implementing the back-propagation strategy. The converse does not hold.
3. Gauss error function: \( \operatorname{erf}(x)=\frac{2}{\sqrt{\pi }} \int _0^x \mathrm {e}^{-t^2} \, \mathrm {d}t; \; \operatorname{erfc}(x)= 1-\operatorname{erf}(x)=\frac{2}{\sqrt{\pi }} \int _x^\infty \mathrm {e}^{-t^2} \, \mathrm {d}t .\)
4. In the context of the game of Go: payoffs, that is, the results of simulations (also called playouts or rollouts).
5. Points added to White's score as compensation for playing second.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Śliwa, L.S. (2016). The Confidence Intervals in Computer Go. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2016. Lecture Notes in Computer Science, vol 9693. Springer, Cham. https://doi.org/10.1007/978-3-319-39384-1_51
DOI: https://doi.org/10.1007/978-3-319-39384-1_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39383-4
Online ISBN: 978-3-319-39384-1
eBook Packages: Computer Science, Computer Science (R0)