The Confidence Intervals in Computer Go

Śliwa, Leszek Stanislaw

doi:10.1007/978-3-319-39384-1_51

Leszek Stanislaw Śliwa

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9693)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

In computer Go, confidence intervals are used in the MCTS algorithm to select the most promising moves for evaluation by Monte-Carlo simulations. Smart selection of moves for evaluation has a crucial impact on a program's playing strength. This paper describes the application of confidence intervals for binomially distributed random variables in computer Go. In practice, estimating confidence intervals for the binomial distribution has been difficult and computationally expensive. Thanks to progress in computer technology and the functions offered by many libraries, calculating confidence intervals for the discrete binomial distribution has become an easy task. This research shows that a move-selection strategy that calculates exact confidence intervals based on the discrete binomial distribution is much more effective than one based on the normal distribution. The new approach shows its advantages particularly in games played on medium and large boards.
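The contrast drawn in the abstract between an exact binomial interval and a normal-based one can be illustrated with a minimal sketch. It assumes the exact interval is the Clopper-Pearson interval and the normal-based one is the Wald interval; the abstract names neither, and the function names `exact_interval` and `wald_interval` are hypothetical. Only the Python standard library is used.

```python
import math
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function f that decreases on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(wins, n, alpha=0.05):
    """Clopper-Pearson interval for a win rate (assumed 'exact' method)."""
    # Lower bound: the p at which P(X >= wins) equals alpha / 2.
    if wins == 0:
        lower = 0.0
    else:
        lower = _bisect(lambda p: alpha / 2 - (1 - binom_cdf(wins - 1, n, p)))
    # Upper bound: the p at which P(X <= wins) equals alpha / 2.
    if wins == n:
        upper = 1.0
    else:
        upper = _bisect(lambda p: binom_cdf(wins, n, p) - alpha / 2)
    return lower, upper


def wald_interval(wins, n, alpha=0.05):
    """Normal-approximation (Wald) interval, clipped to [0, 1]."""
    p_hat = wins / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


if __name__ == "__main__":
    # A move that lost all 10 of its playouts, and one that won half of them.
    for wins, n in ((0, 10), (5, 10)):
        print(wins, n, exact_interval(wins, n), wald_interval(wins, n))
```

For a move with zero wins in ten playouts, the Wald interval collapses to a single point, while the exact interval still leaves room for a non-trivial win rate. This is one plausible reason a selection rule built on the normal approximation could discard rarely sampled moves too early.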

Notes

1. Search strategies (UCT, UCB1-TUNED, MOSS, etc.) are often described using Multi-Armed Bandit (MAB) problem terminology.
2. Implementing the OMC move-selection strategy does not require implementing the back-propagation strategy. The converse is not true.
3. Gauss error function: \( \operatorname{erf}(x)=\frac{2}{\sqrt{\pi}} \int_0^x \mathrm{e}^{-t^2} \, \mathrm{d}t; \; \operatorname{erfc}(x)=1-\operatorname{erf}(x)=\frac{2}{\sqrt{\pi}} \int_x^\infty \mathrm{e}^{-t^2} \, \mathrm{d}t. \)
4. In the context of the game of Go: payoffs, i.e. the results of simulations (also called playouts or rollouts).
5. Points added to the score of the white stones as compensation for playing second.
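The error function defined in note 3 is what links a normal-based confidence level to its critical value. The short sketch below relies on two standard identities, \( \Phi(x) = \tfrac{1}{2}\operatorname{erfc}(-x/\sqrt{2}) \) and \( P(|Z| \le z) = \operatorname{erf}(z/\sqrt{2}) \) for a standard normal \( Z \). The helper names `normal_cdf` and `two_sided_coverage` are hypothetical and are not taken from the paper.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, written as Phi(x) = erfc(-x / sqrt(2)) / 2."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def two_sided_coverage(z):
    """P(|Z| <= z) for a standard normal Z, equal to erf(z / sqrt(2))."""
    return math.erf(z / math.sqrt(2))


if __name__ == "__main__":
    # Common critical values and the confidence levels they correspond to.
    for z in (1.0, 1.96, 2.576):
        print(z, round(two_sided_coverage(z), 4), round(normal_cdf(z), 4))
```

Both functions need only `math.erf` and `math.erfc` from the standard library, so a normal-based interval is cheap to evaluate inside a search loop.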

Author information

Authors and Affiliations

Faculty of Electronics and Information Technology, The Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
Leszek Stanislaw Śliwa

Corresponding author

Correspondence to Leszek Stanislaw Śliwa.

Editor information

Editors and Affiliations

Częstochowa University of Technology, Czestochowa, Poland
Leszek Rutkowski
Częstochowa University of Technology, Czestochowa, Poland
Marcin Korytkowski
Częstochowa University of Technology, Czestochowa, Poland
Rafał Scherer
AGH University of Science and Technology, Krakow, Poland
Ryszard Tadeusiewicz
University of California, Berkeley, California, USA
Lotfi A. Zadeh
University of Louisville, Louisville, Kentucky, USA
Jacek M. Zurada

About this paper

Cite this paper

Śliwa, L.S. (2016). The Confidence Intervals in Computer Go. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2016. Lecture Notes in Computer Science, vol 9693. Springer, Cham. https://doi.org/10.1007/978-3-319-39384-1_51

DOI: https://doi.org/10.1007/978-3-319-39384-1_51
Published: 29 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39383-4
Online ISBN: 978-3-319-39384-1
eBook Packages: Computer Science, Computer Science (R0)
