Abstract
We examine learning by artificial agents in repeated play of Cournot duopoly games. Our learning model is simple and cognitively realistic. The model departs from standard reinforcement learning models, as applied to agents in games, in that it credits the agent with a form of conceptual ascent, whereby the agent is able to learn from a consideration set of strategies spanning more than one period of play. The resulting behavior is markedly different from the behavior predicted by classical economics for the single-shot (unrepeated) Cournot duopoly game. In repeated play under our learning regime, agents arrive at a tacit form of collusion and set production levels close to those of a monopolist. We note that Cournot duopoly games are reasonable approximations for many real-world arrangements, including hourly spot markets for electricity.
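To make the contrast between the single-shot prediction and the monopoly outcome concrete, the following minimal sketch computes both benchmarks for a textbook Cournot duopoly. It assumes a linear inverse demand P = a - b(q1 + q2) and a constant unit cost c; the abstract does not state the demand or cost specification used in the chapter, so the functional form, the function names, and the parameter values (a = 400, b = 1, c = 40) are illustrative assumptions only. The sketch does not implement the learning model itself.

```python
from fractions import Fraction


def best_response(a, b, c, q_other):
    """Profit-maximizing quantity against a rival producing q_other.

    Assumes linear inverse demand P = a - b*(q_own + q_other) and
    constant unit cost c (illustrative assumptions, not from the chapter).
    """
    return max(Fraction(0), (a - c - b * q_other) / (2 * b))


def profit(a, b, c, q_own, q_other):
    """Per-period profit of a firm producing q_own."""
    price = a - b * (q_own + q_other)
    return (price - c) * q_own


def cournot_benchmarks(a, b, c):
    """Per-firm quantities and profits at the two benchmark outcomes."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    q_nash = (a - c) / (3 * b)       # single-shot Cournot-Nash quantity
    q_collusive = (a - c) / (4 * b)  # half of the monopoly output
    return {
        "nash_quantity": q_nash,
        "collusive_quantity": q_collusive,
        "nash_profit": profit(a, b, c, q_nash, q_nash),
        "collusive_profit": profit(a, b, c, q_collusive, q_collusive),
        # One-period temptation: best reply to a rival who restricts output.
        "deviation_quantity": best_response(a, b, c, q_collusive),
    }


# Hypothetical parameter values chosen only for illustration.
result = cournot_benchmarks(a=400, b=1, c=40)
for name, value in result.items():
    print(f"{name}: {value}")
```

Under these assumed parameters each firm earns more at the collusive quantity than at the Cournot-Nash quantity, yet the best reply to a rival who restricts output is to produce more. This is why collusion is not an equilibrium of the single-shot game, and why it is notable that learning agents in repeated play settle near the monopoly level.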
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kimbrough, S.O., Lu, M., Murphy, F. (2005). Learning and Tacit Collusion by Artificial Agents in Cournot Duopoly Games. In: Kimbrough, S.O., Wu, D. (eds) Formal Modelling in Electronic Commerce. International Handbooks on Information Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-26989-4_19
DOI: https://doi.org/10.1007/3-540-26989-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21431-1
Online ISBN: 978-3-540-26989-2
eBook Packages: Business and Economics, Business and Management (R0)