Convergence of reinforcement learning to Nash equilibrium: A search-market experiment

Author

Darmon, Eric
Waldeck, Roger

Abstract

Since the introduction of Reinforcement Learning (RL) in Game Theory, a growing literature has been concerned with the theoretical convergence of RL-driven outcomes towards Nash equilibrium. In this paper, we examine this issue in a search-theoretic framework (a posted-price market) where sellers face a population of imperfectly informed buyers and take one decision per period (the posted price), with no direct interactions between sellers. We focus on three scenarios that differ in buyers’ characteristics. For each scenario, we test quantitatively and qualitatively whether the learned variable (the price strategy) converges to the Nash equilibrium. We also study the impact of the temperature parameter (which defines the exploitation/exploration trade-off) on these results.
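
To illustrate how a temperature parameter can govern the exploitation/exploration trade-off, here is a minimal sketch. The abstract does not state the learning rule used in the paper, so the sketch assumes a standard Boltzmann (logit) choice rule over a discrete price grid. The function names, the price grid, the propensity values and the `step` size are all hypothetical.

```python
import math
import random


def boltzmann_probabilities(propensities, temperature):
    """Logit (softmax) choice probabilities; an assumed rule, not the paper's."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(propensities)
    weights = [math.exp((q - top) / temperature) for q in propensities]
    total = sum(weights)
    return [w / total for w in weights]


def choose_price(prices, propensities, temperature, rng=random):
    """Draw one posted price for the period according to the logit rule."""
    probs = boltzmann_probabilities(propensities, temperature)
    return rng.choices(prices, weights=probs, k=1)[0]


def update_propensity(propensities, index, profit, step=0.1):
    """Move the chosen price's propensity towards the realised profit
    (hypothetical averaging update with an illustrative step size)."""
    propensities[index] += step * (profit - propensities[index])


if __name__ == "__main__":
    prices = [0.2, 0.4, 0.6, 0.8, 1.0]        # hypothetical price grid
    propensities = [0.1, 0.3, 0.5, 0.3, 0.1]  # hypothetical starting values
    for temperature in (0.01, 0.1, 10.0):
        probs = boltzmann_probabilities(propensities, temperature)
        print(temperature, [round(p, 3) for p in probs])
```

Under this assumed rule, a low temperature concentrates probability on the price with the highest propensity (exploitation), while a high temperature pushes the choice towards a uniform draw over the grid (exploration).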

Suggested Citation

Darmon, Eric & Waldeck, Roger, 2005. "Convergence of reinforcement learning to Nash equilibrium: A search-market experiment," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 355(1), pages 119-130.

Handle: RePEc:eee:phsmap:v:355:y:2005:i:1:p:119-130
DOI: 10.1016/j.physa.2005.02.074

References listed on IDEAS

Kirman, Alan P. & Vriend, Nicolaas J., 2001. "Evolving market structure: An ACE model of price dispersion and loyalty," Journal of Economic Dynamics and Control, Elsevier, vol. 25(3-4), pages 459-502, March.
Martin Posch, 1997. "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Economics, Springer, vol. 7(2), pages 193-207.
Erik Brynjolfsson & Michael D. Smith, 2000. "Frictionless Commerce? A Comparison of Internet and Conventional Retailers," Management Science, INFORMS, vol. 46(4), pages 563-585, April.
- Michael Smith & Erik Brynjolfsson, 1999. "Frictionless Commerce? A Comparison of Internet and Conventional Retailers," Computing in Economics and Finance 1999 1022, Society for Computational Economics.
Bell, Ann Maria, 2001. "Reinforcement Learning Rules in a Repeated Game," Computational Economics, Springer;Society for Computational Economics, vol. 18(1), pages 89-110, August.
Erev, Ido & Roth, Alvin E, 1998. "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, American Economic Association, vol. 88(4), pages 848-881, September.
Brenner, Thomas, 2002. "A Behavioural Learning Approach to the Dynamics of Prices," Computational Economics, Springer;Society for Computational Economics, vol. 19(1), pages 67-94, February.

Citations

Cited by:

Fogale, Alberto & Pellizzari, Paolo & Warglien, Massimo, 2007. "Learning and equilibrium selection in a coordination game with heterogeneous agents," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 380(C), pages 519-527.
- Alberto Fogale & Paolo Pellizzari & Massimo Warglien, 2006. "Learning and equilibrium selection in a coordination game with heterogeneous agents," Working Papers 135, Department of Applied Mathematics, Università Ca' Foscari Venezia.
Roger Waldeck & Eric Darmon, 2006. "Can boundedly rational sellers learn to play Nash?," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 1(2), pages 147-169, November.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Roger Waldeck & Eric Darmon, 2006. "Can boundedly rational sellers learn to play Nash?," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 1(2), pages 147-169, November.
Waltman, Ludo & Kaymak, Uzay, 2008. "Q-learning agents in a Cournot oligopoly model," Journal of Economic Dynamics and Control, Elsevier, vol. 32(10), pages 3275-3293, October.
Lamieri, Marco & Bertacchini, Enrico, 2006. "What if Hayek goes shopping in the bazaar?," MPRA Paper 367, University Library of Munich, Germany, revised 21 Jun 2006.
Beggs, A.W., 2005. "On the convergence of reinforcement learning," Journal of Economic Theory, Elsevier, vol. 122(1), pages 1-36, May.
- Alan Beggs, 2002. "On the Convergence of Reinforcement Learning," Economics Series Working Papers 96, University of Oxford, Department of Economics.
Hopkins, Ed, 2007. "Adaptive learning models of consumer behavior," Journal of Economic Behavior & Organization, Elsevier, vol. 64(3-4), pages 348-368.
- Ed Hopkins, 2002. "Adaptive Learning Models of Consumer Behaviour," Edinburgh School of Economics Discussion Paper Series 80, Edinburgh School of Economics, University of Edinburgh.
- Ed Hopkins, 2004. "Adaptive Learning Models of Consumer Behaviour," Edinburgh School of Economics Discussion Paper Series 121, Edinburgh School of Economics, University of Edinburgh, revised Nov 2004.
- Ed Hopkins, 2006. "Adaptive Learning Models of Consumer Behaviour," Levine's Bibliography 122247000000000658, UCLA Department of Economics.
- Ed Hopkins, 2010. "Adaptive Learning Models of Consumer Behaviour," Levine's Working Paper Archive 506439000000000346, David K. Levine.
Ed Hopkins & Robert M. Seymour, 2002. "The Stability of Price Dispersion under Seller and Consumer Learning," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 43(4), pages 1157-1190, November.
- Ed Hopkins & Robert M Seymour, 1999. "The Stability of Price Dispersion under Seller and Consumer Learning," Edinburgh School of Economics Discussion Paper Series 45, Edinburgh School of Economics, University of Edinburgh, revised Dec 2000.
- Ed Hopkins & Robert M. Seymour, 2002. "The Stability of Price Dispersion under Seller and Consumer Learning," Game Theory and Information 0203002, University Library of Munich, Germany.
- Ed Hopkins & Robert M Seymour, 2000. "The Stability of Price Dispersion under Seller and Consumer Learning," Edinburgh School of Economics Discussion Paper Series 52, Edinburgh School of Economics, University of Edinburgh, revised Dec 2000.
Juliette Rouchier, 2013. "The Interest of Having Loyal Buyers in a Perishable Market," Computational Economics, Springer;Society for Computational Economics, vol. 41(2), pages 151-170, February.
Mengel, Friederike, 2012. "Learning across games," Games and Economic Behavior, Elsevier, vol. 74(2), pages 601-619.
- Friederike Mengel, 2007. "Learning Across Games," Working Papers. Serie AD 2007-05, Instituto Valenciano de Investigaciones Económicas, S.A. (Ivie).
R. Cowan & N. Jonard & J. -B. Zimmermann, 2007. "Evolving networks of inventors," Springer Books, in: Uwe Cantner & Franco Malerba (ed.), Innovation, Industrial Dynamics and Structural Transformation, pages 129-148, Springer.
- R. Cowan & N. Jonard & J.-B. Zimmermann, 2006. "Evolving networks of inventors," Journal of Evolutionary Economics, Springer, vol. 16(1), pages 155-174, April.
- Cowan, Robin & Jonard, Nicolas & Zimmermann, J-B, 2004. "Evolving Networks of Inventors," Research Memorandum 018, Maastricht University, Maastricht Economic Research Institute on Innovation and Technology (MERIT).
- Robin Cowan & Nicolas Jonard & Jean-Benoît Zimmermann, 2008. "Evolving Networks of Inventors," Post-Print hal-00279215, HAL.
Ed Hopkins, 2002. "Two Competing Models of How People Learn in Games," Econometrica, Econometric Society, vol. 70(6), pages 2141-2166, November.
- Ed Hopkins, 1999. "Two Competing Models of How People Learn in Games," Edinburgh School of Economics Discussion Paper Series 42, Edinburgh School of Economics, University of Edinburgh, revised Dec 2000.
- Ed Hopkins, 2001. "Two Competing Models of How People Learn in Games," NajEcon Working Paper Reviews 625018000000000226, www.najecon.org.
- Ed Hopkins, 2000. "Two Competing Models of How People Learn in Games," Edinburgh School of Economics Discussion Paper Series 51, Edinburgh School of Economics, University of Edinburgh, revised Dec 2000.
- Ed Hopkins, 2001. "Two Competing Models of How People Learn in Games," Levine's Working Paper Archive 625018000000000226, David K. Levine.
Hopkins, Ed & Posch, Martin, 2005. "Attainability of boundary points under reinforcement learning," Games and Economic Behavior, Elsevier, vol. 53(1), pages 110-125, October.
- Ed Hopkins & Martin Posch, 2003. "Attainability of Boundary Points under Reinforcement Learning," Edinburgh School of Economics Discussion Paper Series 79, Edinburgh School of Economics, University of Edinburgh.
- Ed Hopkins & Martin Posch, 2003. "Attainability of Boundary Points under Reinforcement Learning," Levine's Working Paper Archive 506439000000000350, David K. Levine.
Mario Bravo & Mathieu Faure, 2013. "Reinforcement Learning with Restrictions on the Action Set," AMSE Working Papers 1335, Aix-Marseille School of Economics, France, revised 01 Jul 2013.
- Mario Bravo & Mathieu Faure, 2015. "Reinforcement Learning with Restrictions on the Action Set," Post-Print hal-01457301, HAL.
Laslier, Jean-Francois & Topol, Richard & Walliser, Bernard, 2001. "A Behavioral Learning Process in Games," Games and Economic Behavior, Elsevier, vol. 37(2), pages 340-366, November.
- J.-F. Laslier & R. Topol & B. Walliser, 1999. "A behavioral learning process in games," THEMA Working Papers 99-03, THEMA (THéorie Economique, Modélisation et Applications), Université de Cergy-Pontoise.
- Laslier, J.-F. & Topol, R. & Walliser, B., 1999. "A Behavioral Learning Process in Games," Papers 99-03, Paris X - Nanterre, U.F.R. de Sc. Ec. Gest. Maths Infor..
Leigh Tesfatsion, 2002. "Agent-Based Computational Economics," Computational Economics 0203001, University Library of Munich, Germany, revised 15 Aug 2002.
- Tesfatsion, Leigh, 2007. "Agent-based computational economics," ISU General Staff Papers 200701010800001423, Iowa State University, Department of Economics.
- Tesfatsion, Leigh, 2003. "Agent-Based Computational Economics," ISU General Staff Papers 200301010800001248, Iowa State University, Department of Economics.
Izquierdo, Luis R. & Izquierdo, Segismundo S. & Gotts, Nicholas M. & Polhill, J. Gary, 2007. "Transient and asymptotic dynamics of reinforcement learning in games," Games and Economic Behavior, Elsevier, vol. 61(2), pages 259-276, November.
Ianni, Antonella, 2014. "Learning strict Nash equilibria through reinforcement," Journal of Mathematical Economics, Elsevier, vol. 50(C), pages 148-155.
- Ianni, Antonella, 2011. "Learning Strict Nash Equilibria through Reinforcement," MPRA Paper 33936, University Library of Munich, Germany.
Panayotis Mertikopoulos & William H. Sandholm, 2016. "Learning in Games via Reinforcement and Regularization," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1297-1324, November.
Moulet, Sonia & Rouchier, Juliette, 2008. "The influence of seller learning and time constraints on sequential bargaining in an artificial perishable goods market," Journal of Economic Dynamics and Control, Elsevier, vol. 32(7), pages 2322-2348, July.
- Sonia Moulet & Juliette Rouchier, 2009. "The influence of seller learning and time constraints on sequential bargaining in an artificial perishable goods market," Working Papers halshs-00353505, HAL.
Alan Kirman & Sonia Moulet, 2008. "Impact de l'organisation du marché: Comparaison de la négociation de gré à gré et des enchères descendantes," Working Papers halshs-00349034, HAL.
Stephan Schuster, 2012. "BRA: An Algorithm for Simulating Bounded Rational Agents," Computational Economics, Springer;Society for Computational Economics, vol. 39(1), pages 51-69, January.

Keywords

Reinforcement learning; Nash equilibrium; Search market; Agent-based modeling
