DOI: 10.1145/3534678.3539367

Learning Models of Individual Behavior in Chess

Published: 14 August 2022

ABSTRACT

AI systems that can capture human-like behavior are becoming increasingly useful in situations where humans may want to learn from these systems, collaborate with them, or engage with them as partners for an extended duration. In order to develop human-oriented AI systems, the problem of predicting human actions---as opposed to predicting optimal actions---has received considerable attention. Existing work has focused on capturing human behavior in an aggregate sense, which potentially limits the benefit any particular individual could gain from interaction with these systems. We extend this line of work by developing highly accurate predictive models of individual human behavior in chess. Chess is a rich domain for exploring human-AI interaction because it combines a unique set of properties: AI systems achieved superhuman performance many years ago, and yet humans still interact with them closely, both as opponents and as preparation tools, and there is an enormous corpus of recorded data on individual player games. Starting with Maia, an open-source version of AlphaZero trained on a population of human players, we demonstrate that we can significantly improve prediction accuracy of a particular player's moves by applying a series of fine-tuning methods. Furthermore, our personalized models can be used to perform stylometry---predicting who made a given set of moves---indicating that they capture human decision-making at an individual level. Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people, which could lead to large improvements in human-AI interaction.
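To make the approach in the abstract concrete, the sketch below shows one plausible realization of its two steps: continuing the training of a population-level move-prediction network on a single player's games, and then using the resulting per-player models for stylometry by maximum likelihood. This is a minimal illustration under stated assumptions, not the paper's released code: `TinyPolicyNet` is a toy stand-in for the Maia architecture (a Leela-Chess-style residual CNN), and the names `fine_tune` and `stylometry`, the data loader, and the learning-rate choice are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_MOVES = 1858  # move-vocabulary size used by Leela-Chess-style networks


class TinyPolicyNet(nn.Module):
    """Toy stand-in for a Maia-style policy network (illustration only).

    The real models are deep residual CNNs over 112 input board planes.
    """

    def __init__(self, planes: int = 112):
        super().__init__()
        self.conv = nn.Conv2d(planes, 64, kernel_size=3, padding=1)
        self.head = nn.Linear(64 * 8 * 8, NUM_MOVES)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = F.relu(self.conv(x))          # (B, 64, 8, 8)
        return self.head(h.flatten(1))    # move logits, (B, NUM_MOVES)


def fine_tune(model: nn.Module, loader, epochs: int = 3, lr: float = 1e-4):
    """Personalize a population-level model on one player's games.

    A small learning rate keeps the personalized model close to its
    population-level initialization; this is one plausible recipe, while
    the paper compares several fine-tuning variants.
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for boards, moves in loader:  # boards: (B, 112, 8, 8); moves: (B,)
            loss = F.cross_entropy(model(boards), moves)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


@torch.no_grad()
def stylometry(models, boards, moves) -> int:
    """Attribute a set of (position, move) pairs to a player.

    Returns the index of the personalized model that assigns the observed
    moves the highest total log-likelihood.
    """
    scores = []
    for m in models:
        m.eval()
        logp = F.log_softmax(m(boards), dim=1)
        scores.append(logp[torch.arange(len(moves)), moves].sum().item())
    return max(range(len(models)), key=scores.__getitem__)
```

In this framing, personalization is ordinary transfer learning: the population model supplies the initialization, the target player's recorded moves supply the fine-tuning signal, and stylometry then reduces to an argmax over per-player log-likelihoods.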


Supplemental Material

maia-individual.mp4 (MP4, 63.4 MB)


Published in

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022, 5033 pages
ISBN: 9781450393850
DOI: 10.1145/3534678

    Copyright © 2022 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Qualifiers

    • research-article

    Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)

