Printed from https://ideas.repec.org/a/inm/ormoor/v50y2025i1p506-536.html

Mean-Field Multiagent Reinforcement Learning: A Decentralized Network Approach

Authors
  • Haotian Gu

    (Department of Mathematics, University of California, Berkeley, Berkeley, California 94720)

  • Xin Guo

    (Department of Industrial Engineering & Operations Research, University of California, Berkeley, Berkeley, California 94720)

  • Xiaoli Wei

    (Tsinghua Shenzhen International Graduate School, Shenzhen 518071, China)

  • Renyuan Xu

    (Industrial & Systems Engineering, University of Southern California, Los Angeles, California 90089)

Abstract

One of the challenges for multiagent reinforcement learning (MARL) is designing efficient learning algorithms for a large system in which each agent has only limited or partial information about the entire system. Although exciting progress has been made in analyzing decentralized MARL with a network of agents, as in social networks and team video games, little is known theoretically about decentralized MARL with a network of states, which models self-driving vehicles, ride-sharing, and data and traffic routing. This paper proposes a framework of localized training and decentralized execution to study MARL with a network of states. Localized training means that agents need to collect only local information from their neighboring states during the training phase; decentralized execution means that agents can afterward execute the learned decentralized policies, which depend only on agents’ current states. The theoretical analysis consists of three key components: the first is the reformulation of the MARL system as a networked Markov decision process with teams of agents, enabling the associated team Q-function to be updated in a localized fashion; the second is the Bellman equation for the value function and the appropriate Q-function on the probability measure space; and the third is the exponential decay property of the team Q-function, which facilitates its approximation with high sample efficiency and controllable error. The theoretical analysis paves the way for a new algorithm, LTDE-Neural-AC, an actor–critic approach with overparameterized neural networks. Its convergence and sample complexity are established and shown to be scalable with respect to the numbers of both agents and states. To the best of our knowledge, this is the first neural network–based MARL algorithm with network structure and a provable convergence guarantee.
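The localized-training/decentralized-execution idea described in the abstract can be illustrated with a much-simplified sketch. The toy below is not the paper's LTDE-Neural-AC (which uses overparameterized neural networks and teams of agents); it is a tabular actor–critic on an invented ring of states, with made-up dynamics and hyperparameters, showing only the two structural points: the critic's update touches just the Q-values of states within a small neighborhood of the current state (the truncation that the exponential decay property would justify), and the learned policy depends only on the agent's current state, so execution is decentralized.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy networked MDP: S states on a ring; one representative agent either
# stays (a=0) or moves right (a=1); reward for reaching state S-1.
# All dynamics and constants here are illustrative, not from the paper.
S, A = 6, 2
KAPPA = 1                  # neighborhood radius used during training
ALPHA, BETA, GAMMA = 0.1, 0.05, 0.9

Q = np.zeros((S, A))       # per-state (team) Q-function
theta = np.zeros((S, A))   # decentralized policy logits: own state only

def policy(s):
    """Softmax policy depending only on the agent's current state."""
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

def step(s, a):
    s2 = (s + a) % S
    r = 1.0 if s2 == S - 1 else 0.0
    return s2, r

for episode in range(2000):
    s = int(rng.integers(S))
    for t in range(20):
        a = int(rng.choice(A, p=policy(s)))
        s2, r = step(s, a)
        # Localized training: the update only reads/writes Q-values of
        # states within KAPPA hops of s (exponential decay of the team
        # Q-function is what justifies truncating to this neighborhood).
        neighborhood = {(s + d) % S for d in range(-KAPPA, KAPPA + 1)}
        assert s2 in neighborhood
        # Critic: one-step TD update of the local Q-value.
        Q[s, a] += ALPHA * (r + GAMMA * Q[s2].max() - Q[s, a])
        # Actor: policy-gradient step using the local Q as critic.
        p = policy(s)
        grad = -p
        grad[a] += 1.0
        theta[s] += BETA * Q[s, a] * grad
        s = s2
```

After training, the policy table can be executed by each agent using its own state alone, with no further communication: the decentralized-execution half of the framework.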

Suggested Citation

  • Haotian Gu & Xin Guo & Xiaoli Wei & Renyuan Xu, 2025. "Mean-Field Multiagent Reinforcement Learning: A Decentralized Network Approach," Mathematics of Operations Research, INFORMS, vol. 50(1), pages 506-536, February.
  • Handle: RePEc:inm:ormoor:v:50:y:2025:i:1:p:506-536
    DOI: 10.1287/moor.2022.0055

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/moor.2022.0055
    Download Restriction: no



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.