Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL

Author

Listed:
  • Andreas A. Haupt
  • Phillip J. K. Christoffersen
  • Mehul Damani
  • Dylan Hadfield-Menell

Abstract

Multi-agent Reinforcement Learning (MARL) is a powerful tool for training autonomous agents acting independently in a common environment. However, it can lead to sub-optimal behavior when individual incentives and group incentives diverge. Humans are remarkably adept at solving these social dilemmas. It is an open problem in MARL to replicate such cooperative behaviors in selfish agents. In this work, we draw upon the idea of formal contracting from economics to overcome diverging incentives between agents in MARL. We propose an augmentation to a Markov game in which agents voluntarily agree to binding transfers of reward under pre-specified conditions. Our contributions are theoretical and empirical. First, we show that this augmentation makes all subgame-perfect equilibria of all Fully Observable Markov Games exhibit socially optimal behavior, given a sufficiently rich space of contracts. Next, we show that for general contract spaces, and even under partial observability, richer contract spaces lead to higher welfare. Hence, contract space design solves an exploration-exploitation tradeoff, sidestepping incentive issues. We complement our theoretical analysis with experiments. Issues of exploration in the contracting augmentation are mitigated using a training methodology inspired by multi-objective reinforcement learning: Multi-Objective Contract Augmentation Learning (MOCA). We test our methodology in static, single-move games, as well as in dynamic domains that simulate traffic, pollution management, and common-pool resource management.
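The idea of a binding reward transfer can be illustrated on a one-shot Prisoner's Dilemma. The sketch below is not the paper's MOCA method or its contract space. The payoff numbers, the `fine` parameter, and the helper names `with_contract`, `pure_nash`, and `welfare` are all assumptions chosen for illustration. The contract makes each defector pay a fixed amount to the other agent, so total welfare in every outcome is unchanged while the individually rational outcome shifts from mutual defection to mutual cooperation.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate, defect

# Illustrative Prisoner's Dilemma payoffs (assumed, not from the paper).
BASE = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def with_contract(payoffs, fine):
    """Hypothetical contract: each defector pays `fine` to the other agent.

    Transfers are zero-sum, so the welfare of every action profile is preserved.
    """
    out = {}
    for (a1, a2), (r1, r2) in payoffs.items():
        # Net transfer from agent 1 to agent 2 (negative means 2 pays 1).
        t = fine * ((a1 == "D") - (a2 == "D"))
        out[(a1, a2)] = (r1 - t, r2 + t)
    return out

def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player matrix game."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        r1, r2 = payoffs[(a1, a2)]
        no_gain_1 = all(r1 >= payoffs[(d, a2)][0] for d in ACTIONS)
        no_gain_2 = all(r2 >= payoffs[(a1, d)][1] for d in ACTIONS)
        if no_gain_1 and no_gain_2:
            equilibria.append((a1, a2))
    return equilibria

def welfare(payoffs, profile):
    """Sum of both agents' rewards at an action profile."""
    return sum(payoffs[profile])

contracted = with_contract(BASE, fine=3)
print("equilibria without contract:", pure_nash(BASE))
print("equilibria with contract:   ", pure_nash(contracted))
```

In this toy game both agents earn more in the contracted equilibrium than in the uncontracted one, so each would accept the contract voluntarily. In the paper's setting the analogous argument is made for subgame-perfect equilibria of Markov games rather than for a single matrix game.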

Suggested Citation

  • Andreas A. Haupt & Phillip J. K. Christoffersen & Mehul Damani & Dylan Hadfield-Menell, 2022. "Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL," Papers 2208.10469, arXiv.org, revised Jan 2024.
  • Handle: RePEc:arx:papers:2208.10469

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2208.10469
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Martin, Simon & Rasch, Alexander, 2022. "Collusion by algorithm: The role of unobserved actions," DICE Discussion Papers 382, Heinrich Heine University Düsseldorf, Düsseldorf Institute for Competition Economics (DICE).
    2. Aniko Öry & Ali Hortaçsu & Kevin Williams, 2022. "Dynamic Price Competition: Theory and Evidence from Airline Markets," Cowles Foundation Discussion Papers 2341R1, Cowles Foundation for Research in Economics, Yale University, revised Apr 2023.
    3. Martin, Simon & Rasch, Alexander, 2024. "Demand forecasting, signal precision, and collusion with hidden actions," International Journal of Industrial Organization, Elsevier, vol. 92(C).
    4. Dolgopolov, Arthur, 2024. "Reinforcement learning in a prisoner's dilemma," Games and Economic Behavior, Elsevier, vol. 144(C), pages 84-103.
    5. Simon Martin & Alexander Rasch, 2022. "Collusion by Algorithm: The Role of Unobserved Actions," CESifo Working Paper Series 9629, CESifo.
    6. Andreas Haupt & Aroon Narayanan, 2022. "Risk Preferences of Learning Algorithms," Papers 2205.04619, arXiv.org, revised Dec 2023.
    7. Shidi Deng & Maximilian Schiffer & Martin Bichler, 2024. "Algorithmic Collusion in Dynamic Pricing with Deep Reinforcement Learning," Papers 2406.02437, arXiv.org.
    8. Battigalli, Pierpaolo & Bonanno, Giacomo, 1997. "The Logic of Belief Persistence," Economics and Philosophy, Cambridge University Press, vol. 13(1), pages 39-59, April.
    9. Szabó, György & Borsos, István & Szombati, Edit, 2019. "Games, graphs and Kirchhoff laws," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 521(C), pages 416-423.
    10. Shi, Ziyi & Xu, Meng & Song, Yancun & Zhu, Zheng, 2024. "Multi-Platform dynamic game and operation of hybrid Bike-Sharing systems based on reinforcement learning," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 181(C).
    11. Shi, Yi & Deng, Yawen & Wang, Guoan & Xu, Jiuping, 2020. "Stackelberg equilibrium-based eco-economic approach for sustainable development of kitchen waste disposal with subsidy policy: A case study from China," Energy, Elsevier, vol. 196(C).
    12. Marc Le Menestrel, 2003. "A one-shot Prisoners’ Dilemma with procedural utility," Economics Working Papers 819, Department of Economics and Business, Universitat Pompeu Fabra.
    13. Cheng‐Kuang Wu & Yi‐Ming Chen & Dachrahn Wu & Ching‐Lin Chi, 2020. "A Game Theory Approach for Assessment of Risk and Deployment of Police Patrols in Response to Criminal Activity in San Francisco," Risk Analysis, John Wiley & Sons, vol. 40(3), pages 534-549, March.
    14. Inkoo Cho & Noah Williams, 2024. "Collusive Outcomes Without Collusion," Papers 2403.07177, arXiv.org.
    15. Nasimeh Heydaribeni & Achilleas Anastasopoulos, 2019. "Linear Equilibria for Dynamic LQG Games with Asymmetric Information and Dependent Types," Papers 1909.04834, arXiv.org.
    16. Müller, Christoph, 2020. "Robust implementation in weakly perfect Bayesian strategies," Journal of Economic Theory, Elsevier, vol. 189(C).
    17. Hitoshi Matsushima, 2019. "Implementation without expected utility: ex-post verifiability," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 53(4), pages 575-585, December.
    18. Dasgupta Utteeyo, 2011. "Are Entry Threats Always Credible?," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 11(1), pages 1-41, December.
    19. Baran Han, 2018. "The role and welfare rationale of secondary sanctions: A theory and a case study of the US sanctions targeting Iran," Conflict Management and Peace Science, Peace Science Society (International), vol. 35(5), pages 474-502, September.
    20. Carlos Pimienta & Jianfei Shen, 2014. "On the equivalence between (quasi-)perfect and sequential equilibria," International Journal of Game Theory, Springer;Game Theory Society, vol. 43(2), pages 395-402, May.
