Printed from https://ideas.repec.org/p/arx/papers/2407.04521.html

Unified continuous-time q-learning for mean-field game and mean-field control problems

Author

Listed:
  • Xiaoli Wei
  • Xiang Yu
  • Fengyi Yuan

Abstract

This paper studies continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To address the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we obtain the exact parameterization of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance.

Suggested Citation

  • Xiaoli Wei & Xiang Yu & Fengyi Yuan, 2024. "Unified continuous-time q-learning for mean-field game and mean-field control problems," Papers 2407.04521, arXiv.org.
  • Handle: RePEc:arx:papers:2407.04521

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2407.04521
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Lacker & Thaleia Zariphopoulou, 2017. "Mean field and n-agent games for optimal investment under relative performance criteria," Papers 1703.07685, arXiv.org, revised Jun 2018.
    2. Yanwei Jia, 2024. "Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty," Papers 2404.12598, arXiv.org.
    3. Lijun Bo & Yijie Huang & Xiang Yu, 2023. "On optimal tracking portfolio in incomplete markets: The classical control and the reinforcement learning approaches," Papers 2311.14318, arXiv.org.
    4. Jian Yang, 2021. "Analysis of Markovian Competitive Situations Using Nonatomic Games," Dynamic Games and Applications, Springer, vol. 11(1), pages 184-216, March.
    5. Marcel Nutz, 2016. "A Mean Field Game of Optimal Stopping," Papers 1605.09112, arXiv.org, revised Nov 2017.
    6. Jonas Hedlund & Carlos Oyarzun, 2018. "Imitation in heterogeneous populations," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 65(4), pages 937-973, June.
    7. Li, Fei & Song, Yangbo & Zhao, Mofei, 2023. "Global manipulation by local obfuscation," Journal of Economic Theory, Elsevier, vol. 207(C).
    8. Al-Najjar, Nabil I., 2008. "Large games and the law of large numbers," Games and Economic Behavior, Elsevier, vol. 64(1), pages 1-34, September.
    9. Berliant, Marcus & Fujishima, Shota, 2012. "Optimal dynamic nonlinear income taxes: facing an uncertain future with a sluggish government," MPRA Paper 41947, University Library of Munich, Germany.
    10. Pupato, Germán, 2017. "Performance pay, trade and inequality," Journal of Economic Theory, Elsevier, vol. 172(C), pages 478-504.
    11. Luo, Yulei & Young, Eric R., 2016. "Induced uncertainty, market price of risk, and the dynamics of consumption and wealth," Journal of Economic Theory, Elsevier, vol. 163(C), pages 1-41.
    12. Martin Hellwig & Felix Bierbrauer, 2009. "Public Good Provision in a Large Economy," 2009 Meeting Papers 1062, Society for Economic Dynamics.
    13. Philippe Bacchetta & Eric Van Wincoop, 2008. "Higher Order Expectations in Asset Pricing," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 40(5), pages 837-866, August.
    14. Fu, Guanxing & Horst, Ulrich, 2017. "Mean Field Games with Singular Controls," Rationality and Competition Discussion Paper Series 22, CRC TRR 190 Rationality and Competition.
    15. Sun, Yeneng & Yannelis, Nicholas C., 2007. "Perfect competition in asymmetric information economies: compatibility of efficiency and incentives," Journal of Economic Theory, Elsevier, vol. 134(1), pages 175-194, May.
    16. Ryan Donnelly & Zi Li, 2022. "Dynamic Inventory Management with Mean-Field Competition," Papers 2210.17208, arXiv.org.
    17. Lang, Matthias, 2019. "Communicating subjective evaluations," Journal of Economic Theory, Elsevier, vol. 179(C), pages 163-199.
    18. Manuel Amador & Pierre-Olivier Weill, 2010. "Learning from Prices: Public Communication and Welfare," Journal of Political Economy, University of Chicago Press, vol. 118(5), pages 866-907.
    19. William Fuchs & Luis Garicano & Luis Rayo, 2015. "Optimal Contracting and the Organization of Knowledge," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 82(2), pages 632-658.
    20. René Aid & Ofelia Bonesini & Giorgia Callegaro & Luciano Campi, 2021. "A McKean-Vlasov game of commodity production, consumption and trading," Papers 2111.04391, arXiv.org.
