ABSTRACT
AI systems that capture human-like behavior are becoming increasingly useful in situations where humans want to learn from these systems, collaborate with them, or engage with them as long-term partners. Developing such human-oriented AI systems requires predicting human actions---as opposed to optimal actions---a problem that has received considerable attention. Existing work, however, has focused on capturing human behavior in the aggregate, which limits the benefit any particular individual can gain from interacting with these systems. We extend this line of work by developing highly accurate predictive models of individual human behavior in chess. Chess is a rich domain for exploring human-AI interaction because it combines a unique set of properties: AI systems achieved superhuman performance many years ago, yet humans still interact with them closely, both as opponents and as preparation tools, and there is an enormous corpus of recorded games by individual players. Starting with Maia, an open-source version of AlphaZero trained on a population of human players, we demonstrate that a series of fine-tuning methods significantly improves the accuracy with which we predict a particular player's moves. Furthermore, our personalized models can be used to perform stylometry---predicting who made a given set of moves---indicating that they capture human decision-making at the level of the individual. Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people, which could lead to large improvements in human-AI interaction.
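The personalization pipeline described above (fine-tune a population-level move-prediction model on one player's games, then attribute unseen moves to whichever personalized model explains them best) can be sketched in miniature. Everything below is illustrative only: `TinyPolicy`, the feature dimensions, and the simulated players are stand-ins invented for this sketch, not the paper's actual Maia architecture, data format, or training code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a pretrained move-prediction policy: one linear layer
# mapping a board-feature vector to logits over a small move vocabulary.
# (The real Maia models are deep residual networks; this is only a sketch.)
N_FEATURES, N_MOVES = 16, 8

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class TinyPolicy:
    def __init__(self, W=None):
        self.W = (rng.normal(scale=0.1, size=(N_FEATURES, N_MOVES))
                  if W is None else W.copy())

    def predict_proba(self, X):
        return softmax(X @ self.W)

    def fine_tune(self, X, y, lr=0.5, epochs=200):
        """Cross-entropy gradient descent on one player's (position, move) pairs."""
        for _ in range(epochs):
            grad = self.predict_proba(X)
            grad[np.arange(len(y)), y] -= 1.0   # dL/dlogits for cross-entropy
            self.W -= lr * (X.T @ grad) / len(y)
        return self

def mean_log_likelihood(model, X, y):
    p = model.predict_proba(X)
    return float(np.log(p[np.arange(len(y)), y] + 1e-12).mean())

def player_games(favorite_move, n=200):
    """Simulate games by a player with a systematic stylistic tendency."""
    X = rng.normal(size=(n, N_FEATURES))
    X[:, -1] = 1.0                               # constant feature acts as a bias term
    y = rng.integers(0, N_MOVES, size=n)
    y[: n // 2] = favorite_move                  # half of this player's moves share a bias
    return X, y

base = TinyPolicy()                              # "population-level" pretrained model
Xa, ya = player_games(favorite_move=2)
Xb, yb = player_games(favorite_move=5)

model_a = TinyPolicy(base.W).fine_tune(Xa, ya)   # personalized model for player A
model_b = TinyPolicy(base.W).fine_tune(Xb, yb)   # personalized model for player B

# Stylometry: attribute a fresh batch of player A's moves to whichever
# personalized model assigns them the highest average log-likelihood.
Xq, yq = player_games(favorite_move=2, n=100)
scores = {"A": mean_log_likelihood(model_a, Xq, yq),
          "B": mean_log_likelihood(model_b, Xq, yq)}
predicted_author = max(scores, key=scores.get)
print(predicted_author)
```

The same comparison extends to any number of candidate players: attribute a set of moves to the arg-max over per-player models, which mirrors the stylometry setup the abstract describes, except that the real models score full games rather than synthetic feature vectors.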