Printed from https://ideas.repec.org/p/arx/papers/2109.00719.html

Multi-agent Bayesian Learning with Best Response Dynamics: Convergence and Stability

Authors

  • Manxi Wu
  • Saurabh Amin
  • Asuman Ozdaglar

Abstract

We study learning dynamics induced by strategic agents who repeatedly play a game with an unknown payoff-relevant parameter. In these dynamics, a belief estimate of the parameter is repeatedly updated by Bayes' rule, given players' strategies and realized payoffs. Players adjust their strategies by accounting for best-response strategies given the belief. We show that, with probability 1, beliefs and strategies converge to a fixed point, where the belief consistently estimates the payoff distribution for the strategy, and the strategy is an equilibrium corresponding to the belief. However, learning may not always identify the unknown parameter because the belief estimate relies on game outcomes that are endogenously generated by players' strategies. We obtain necessary and sufficient conditions under which learning leads to a globally stable fixed point that is a complete-information Nash equilibrium. We also provide sufficient conditions that guarantee local stability of fixed-point beliefs and strategies.
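
The coupled belief and strategy updates described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model or algorithm. It assumes a hypothetical two-player quantity-setting game in which the unknown parameter is a demand intercept s, the observed price is s minus total quantity plus Gaussian noise, and each player's payoff is quantity times price. The names STATES, SIGMA, STEP, bayes_update, best_response, and simulate are all illustrative assumptions.

```python
import math
import random

STATES = (1.0, 2.0)   # hypothetical candidate values of the unknown intercept s
SIGMA = 0.5           # assumed standard deviation of the price noise
STEP = 0.5            # assumed step size toward the best response


def bayes_update(belief, price, total_q):
    """Posterior over STATES after observing one noisy price (Bayes' rule)."""
    # Gaussian likelihood of the price under each candidate s; the common
    # normalizing constant cancels when the weights are renormalized.
    weights = {
        s: b * math.exp(-((price - (s - total_q)) ** 2) / (2 * SIGMA ** 2))
        for s, b in belief.items()
    }
    norm = sum(weights.values())
    return {s: w / norm for s, w in weights.items()}


def best_response(belief, q_other):
    """Quantity maximizing the expected payoff q * (E[s] - q - q_other)."""
    mean_s = sum(s * b for s, b in belief.items())
    return max(0.0, (mean_s - q_other) / 2)


def simulate(true_s=2.0, steps=200, seed=0):
    """Alternate a belief update and a partial step toward best responses."""
    rng = random.Random(seed)
    belief = {s: 1.0 / len(STATES) for s in STATES}  # uniform prior
    q = [0.2, 0.2]                                   # arbitrary initial strategies
    for _ in range(steps):
        total_q = sum(q)
        price = true_s - total_q + rng.gauss(0.0, SIGMA)
        belief = bayes_update(belief, price, total_q)
        # Both players respond to the opponent's previous quantity.
        br = [best_response(belief, q[1]), best_response(belief, q[0])]
        q = [qi + STEP * (bi - qi) for qi, bi in zip(q, br)]
    return belief, q


if __name__ == "__main__":
    final_belief, final_q = simulate()
    print(final_belief, final_q)
```

In this toy game the expected price depends on s at every strategy profile, so the parameter is always identified and the dynamics settle near the complete-information equilibrium quantity s/3 for each player. The sketch therefore does not exhibit the case, noted in the abstract, where endogenously generated outcomes fail to identify the parameter.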

Suggested Citation

  • Manxi Wu & Saurabh Amin & Asuman Ozdaglar, 2021. "Multi-agent Bayesian Learning with Best Response Dynamics: Convergence and Stability," Papers 2109.00719, arXiv.org.
  • Handle: RePEc:arx:papers:2109.00719

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2109.00719
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Darrell Duffie & Semyon Malamud & Gustavo Manso, 2009. "Information Percolation With Equilibrium Search Dynamics," Econometrica, Econometric Society, vol. 77(5), pages 1513-1574, September.
    2. Frank H. Hahn, 1978. "Exercises in Conjectural Equilibria," Palgrave Macmillan Books, in: Steinar Strøm & Lars Werin (ed.), Topics in Disequilibrium Economics, pages 64-80, Palgrave Macmillan.
    3. Wendy W. Moe & Peter S. Fader, 2004. "Dynamic Conversion Behavior at E-Commerce Sites," Management Science, INFORMS, vol. 50(3), pages 326-335, March.
    4. Fudenberg Drew & Kreps David M., 1993. "Learning Mixed Equilibria," Games and Economic Behavior, Elsevier, vol. 5(3), pages 320-367, July.
    5. Fudenberg, Drew & Levine, David K, 1993. "Steady State Learning and Nash Equilibrium," Econometrica, Econometric Society, vol. 61(3), pages 547-573, May.
    6. Cominetti, Roberto & Melo, Emerson & Sorin, Sylvain, 2010. "A payoff-based learning procedure and its application to traffic games," Games and Economic Behavior, Elsevier, vol. 70(1), pages 71-83, September.
    7. Ed Hopkins, 2002. "Two Competing Models of How People Learn in Games," Econometrica, Econometric Society, vol. 70(6), pages 2141-2166, November.
    8. Kalai, Ehud & Lehrer, Ehud, 1993. "Rational Learning Leads to Nash Equilibrium," Econometrica, Econometric Society, vol. 61(5), pages 1019-1045, September.
    9. Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Stability and Robustness in Misspecified Learning Models," Cowles Foundation Discussion Papers 2235, Cowles Foundation for Research in Economics, Yale University.
    10. Acemoglu, Daron & Bimpikis, Kostas & Ozdaglar, Asuman, 2014. "Dynamics of information exchange in endogenous social networks," Theoretical Economics, Econometric Society, vol. 9(1), January.
    11. Samuelson Larry, 1994. "Stochastic Stability in Games with Alternative Best Replies," Journal of Economic Theory, Elsevier, vol. 64(1), pages 35-65, October.
    12. Beggs, A.W., 2005. "On the convergence of reinforcement learning," Journal of Economic Theory, Elsevier, vol. 122(1), pages 1-36, May.
    13. Sergiu Hart & Andreu Mas-Colell, 2013. "Regret-Based Continuous-Time Dynamics," World Scientific Book Chapters, in: Simple Adaptive Strategies From Regret-Matching to Uncoupled Dynamics, chapter 5, pages 99-124, World Scientific Publishing Co. Pte. Ltd.
    14. Manxi Wu & Saurabh Amin & Asuman E. Ozdaglar, 2021. "Value of Information in Bayesian Routing Games," Operations Research, INFORMS, vol. 69(1), pages 148-163, January.
    15. Fudenberg, Drew & Levine, David K, 1993. "Self-Confirming Equilibrium," Econometrica, Econometric Society, vol. 61(3), pages 523-545, May.
    16. Kalai, Ehud & Lehrer, Ehud, 1995. "Subjective games and equilibria," Games and Economic Behavior, Elsevier, vol. 8(1), pages 123-163.
    17. Blume Lawrence E., 1993. "The Statistical Mechanics of Strategic Interaction," Games and Economic Behavior, Elsevier, vol. 5(3), pages 387-424, July.
    18. Kalai, Ehud & Lehrer, Ehud, 1993. "Subjective Equilibrium in Repeated Games," Econometrica, Econometric Society, vol. 61(5), pages 1231-1240, September.
    19. Rothschild, Michael, 1974. "A two-armed bandit theory of market pricing," Journal of Economic Theory, Elsevier, vol. 9(2), pages 185-202, October.
    20. Marden, Jason R. & Shamma, Jeff S., 2012. "Revisiting log-linear learning: Asynchrony, completeness and payoff-based implementation," Games and Economic Behavior, Elsevier, vol. 75(2), pages 788-808.
    21. Ali, S. Nageeb, 2018. "Herding with costly information," Journal of Economic Theory, Elsevier, vol. 175(C), pages 713-729.
    22. Milgrom, Paul & Roberts, John, 1990. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities," Econometrica, Econometric Society, vol. 58(6), pages 1255-1277, November.
    23. Foster, Dean P. & Young, H. Peyton, 2006. "Regret testing: learning to play Nash equilibrium without knowing you have an opponent," Theoretical Economics, Econometric Society, vol. 1(3), pages 341-367, September.
    24. Daron Acemoglu & Ali Makhdoumi & Azarakhsh Malekian & Asuman Ozdaglar, 2017. "Fast and Slow Learning From Reviews," NBER Working Papers 24046, National Bureau of Economic Research, Inc.
    25. Alós-Ferrer, Carlos & Netzer, Nick, 2010. "The logit-response dynamics," Games and Economic Behavior, Elsevier, vol. 68(2), pages 413-427, March.
    26. Samuelson, Larry & Zhang, Jianbo, 1992. "Evolutionary stability in asymmetric games," Journal of Economic Theory, Elsevier, vol. 57(2), pages 363-391, August.
    27. Benaim, Michel & Hirsch, Morris W., 1999. "Mixed Equilibria and Dynamical Systems Arising from Fictitious Play in Perturbed Games," Games and Economic Behavior, Elsevier, vol. 29(1-2), pages 36-72, October.
    28. Hofbauer, Josef & Sandholm, William H., 2009. "Stable games and their dynamics," Journal of Economic Theory, Elsevier, vol. 144(4), pages 1665-1693, July.
    29. Matsui, Akihiko, 1992. "Best response dynamics and socially stable strategies," Journal of Economic Theory, Elsevier, vol. 57(2), pages 343-362, August.
    30. Sachin Adlakha & Ramesh Johari, 2013. "Mean Field Equilibrium in Dynamic Games with Strategic Complementarities," Operations Research, INFORMS, vol. 61(4), pages 971-989, August.
    31. Sandholm, William H., 2010. "Local stability under evolutionary game dynamics," Theoretical Economics, Econometric Society, vol. 5(1), January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jonathan Newton, 2018. "Evolutionary Game Theory: A Renaissance," Games, MDPI, vol. 9(2), pages 1-67, May.
    2. Sobel, Joel, 2000. "Economists' Models of Learning," Journal of Economic Theory, Elsevier, vol. 94(2), pages 241-261, October.
    3. Sandholm, William H., 2015. "Population Games and Deterministic Evolutionary Dynamics," Handbook of Game Theory with Economic Applications, Elsevier.
    4. Fudenberg, Drew & Kreps, David M., 1995. "Learning in extensive-form games I. Self-confirming equilibria," Games and Economic Behavior, Elsevier, vol. 8(1), pages 20-55.
    5. Lagunoff, Roger, 1997. "On the dynamic selection of mechanisms for provision of public projects," Journal of Economic Dynamics and Control, Elsevier, vol. 21(10), pages 1699-1725, August.
    6. Jakub Bielawski & Thiparat Chotibut & Fryderyk Falniowski & Michal Misiurewicz & Georgios Piliouras, 2022. "Unpredictable dynamics in congestion games: memory loss can prevent chaos," Papers 2201.10992, arXiv.org, revised Jan 2022.
    7. Chernov, G. & Susin, I., 2019. "Models of learning in games: An overview," Journal of the New Economic Association, New Economic Association, vol. 44(4), pages 77-125.
    8. Weibull, Jörgen W., 1997. "What have we learned from Evolutionary Game Theory so far?," Working Paper Series 487, Research Institute of Industrial Economics, revised 26 Oct 1998.
    9. Ignacio Esponda & Demian Pouzo, 2014. "Berk-Nash Equilibrium: A Framework for Modeling Agents with Misspecified Models," Papers 1411.1152, arXiv.org, revised Nov 2019.
    10. Hopkins, Ed, 1999. "Learning, Matching, and Aggregation," Games and Economic Behavior, Elsevier, vol. 26(1), pages 79-110, January.
    11. Fudenberg, Drew & Takahashi, Satoru, 2011. "Heterogeneous beliefs and local information in stochastic fictitious play," Games and Economic Behavior, Elsevier, vol. 71(1), pages 100-120, January.
    12. Hofbauer, Josef & Hopkins, Ed, 2005. "Learning in perturbed asymmetric games," Games and Economic Behavior, Elsevier, vol. 52(1), pages 133-152, July.
    13. Naoki Funai, 2019. "Convergence results on stochastic adaptive learning," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 68(4), pages 907-934, November.
    14. Drew Fudenberg & David K Levine, 2006. "An Economist's Perspective on Multi-Agent Learning," Levine's Working Paper Archive 784828000000000683, David K. Levine.
    15. Ignacio Esponda & Demian Pouzo, 2015. "Equilibrium in Misspecified Markov Decision Processes," Papers 1502.06901, arXiv.org, revised May 2016.
    16. Andriy Zapechelnyuk, 2009. "Limit Behavior of No-regret Dynamics," Discussion Papers 21, Kyiv School of Economics.
    17. Mario Bravo, 2016. "An Adjusted Payoff-Based Procedure for Normal Form Games," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1469-1483, November.
    18. Block, Juan I. & Fudenberg, Drew & Levine, David K., 2019. "Learning dynamics with social comparisons and limited memory," Theoretical Economics, Econometric Society, vol. 14(1), January.
    19. Michel Benaïm & Josef Hofbauer & Sylvain Sorin, 2012. "Perturbations of Set-Valued Dynamical Systems, with Applications to Game Theory," Dynamic Games and Applications, Springer, vol. 2(2), pages 195-205, June.
    20. Schipper, Burkhard C., 2021. "Discovery and equilibrium in games with unawareness," Journal of Economic Theory, Elsevier, vol. 198(C).

