A Reinforcement Procedure Leading to Correlated Equilibrium

Economics Essays

Abstract

We consider repeated games where at any period each player knows only his set of actions and the stream of payoffs that he has received in the past. He knows neither his own payoff function nor the characteristics of the other players (how many there are, their strategies and payoffs). In this context, we present an adaptive procedure for play called “modified-regret-matching,” which is interpretable as a stimulus-response or reinforcement procedure and which has the property that any limit point of the empirical distribution of play is a correlated equilibrium of the stage game.
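The informational setting described above can be illustrated with a minimal Python sketch of a payoff-only regret-matching player. The abstract does not spell out the update rule, so everything below is an assumption made for illustration: the regret estimate that reweights observed payoffs by the ratio of play probabilities, the vanishing exploration term, the constants `mu`, `delta` and `gamma`, and the names `ModifiedRegretMatcher` and `empirical_play`. It should not be read as the chapter's exact procedure, and running it proves nothing about convergence.

```python
import random


class ModifiedRegretMatcher:
    """One player who sees only its own actions and realised payoffs."""

    def __init__(self, n_actions, mu, delta=0.1, gamma=0.2, seed=None):
        assert n_actions >= 2 and 0 < delta < 1
        self.m = n_actions
        self.mu = mu        # assumed scale constant, large relative to payoffs
        self.delta = delta  # assumed exploration weight
        self.gamma = gamma  # assumed exploration decay exponent
        self.rng = random.Random(seed)
        self.t = 0
        self.last = None
        self.probs = [1.0 / n_actions] * n_actions
        # est[j][k]: payoffs earned while playing k, reweighted by p(j)/p(k)
        self.est = [[0.0] * n_actions for _ in range(n_actions)]
        # earned[j]: payoffs actually earned while playing j
        self.earned = [0.0] * n_actions

    def act(self):
        """Draw this period's action from the current mixed action."""
        self.last = self.rng.choices(range(self.m), weights=self.probs)[0]
        return self.last

    def observe(self, payoff):
        """Update from the realised payoff only; no payoff matrix is used."""
        j, m = self.last, self.m
        self.t += 1
        for i in range(m):
            self.est[i][j] += payoff * self.probs[i] / self.probs[j]
        self.earned[j] += payoff
        explore = self.delta / self.t ** self.gamma
        nxt = [0.0] * m
        for k in range(m):
            if k != j:
                # Estimated average gain from having played k instead of j.
                regret = (self.est[j][k] - self.earned[j]) / self.t
                switch = min(max(regret, 0.0) / self.mu, 1.0 / (m - 1))
                nxt[k] = (1.0 - explore) * switch + explore / m
        nxt[j] = 1.0 - sum(nxt)  # remaining mass stays on the last action
        self.probs = nxt


def empirical_play(payoffs_a, payoffs_b, periods, seed=0):
    """Let two such players meet and return the empirical joint distribution."""
    rows, cols = len(payoffs_a), len(payoffs_a[0])
    scale = 2.0 * max(abs(x) for mat in (payoffs_a, payoffs_b)
                      for row in mat for x in row) or 1.0
    a = ModifiedRegretMatcher(rows, mu=scale * rows, seed=seed)
    b = ModifiedRegretMatcher(cols, mu=scale * cols, seed=seed + 1)
    counts = [[0] * cols for _ in range(rows)]
    for _ in range(periods):
        i, j = a.act(), b.act()
        a.observe(payoffs_a[i][j])  # each player is told only its own payoff
        b.observe(payoffs_b[i][j])
        counts[i][j] += 1
    return [[c / periods for c in row] for row in counts]
```

For example, `empirical_play([[6, 2], [7, 0]], [[6, 7], [2, 0]], periods=2000)` returns a 2×2 table of play frequencies for a game of chicken (the payoff numbers are illustrative). The matrices are used only by the simulation loop to generate payoffs; neither player object ever reads them, which mirrors the setting in which a player does not know his own payoff function.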

Additional information

Dedicated with great admiration to Werner Hildenbrand on his 65th birthday. Previous versions of these results were included in the Center for Rationality Discussion Papers #126 (December 1996) and #166 (March 1998). We thank Dean Foster for suggesting the use of “modified regrets.” The research is partially supported by grants of the Israel Academy of Sciences and Humanities; the Spanish Ministry of Education; the Generalitat de Catalunya; CREI; and the EU-TMR Research Network.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hart, S., Mas-Colell, A. (2001). A Reinforcement Procedure Leading to Correlated Equilibrium. In: Debreu, G., Neuefeind, W., Trockel, W. (eds) Economics Essays. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04623-4_12

  • DOI: https://doi.org/10.1007/978-3-662-04623-4_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07539-1

  • Online ISBN: 978-3-662-04623-4

  • eBook Packages: Springer Book Archive
