
An Empirical Study Of Potential-Based Reward Shaping And Advice In Complex, Multi-Agent Systems

Author

Listed:
  • SAM DEVLIN

    (University of York, UK)

  • DANIEL KUDENKO

    (University of York, UK)

  • MAREK GRZEŚ

    (University of Waterloo, CA, Canada)

Abstract

This paper investigates the impact of reward shaping in multi-agent reinforcement learning as a way to incorporate domain knowledge about good strategies. In theory, potential-based reward shaping does not alter the Nash Equilibria of a stochastic game, only the exploration of the shaped agent. We demonstrate empirically the performance of reward shaping in two problem domains within the context of RoboCup KeepAway by designing three reward shaping schemes, encouraging specific behaviour such as keeping a minimum distance from other players on the same team and taking on specific roles. The results illustrate that reward shaping with multiple, simultaneous learning agents can reduce the time needed to learn a suitable policy and can alter the final group performance.
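Potential-based reward shaping is conventionally defined by adding the term F(s, s') = γΦ(s') − Φ(s) to the environment reward, where Φ is a potential function over states and γ is the discount factor. The following is a minimal sketch of that idea, assuming a hypothetical potential that favours keeping a minimum distance from teammates. The state layout, the threshold `MIN_DISTANCE`, the value of `GAMMA`, and all function names are illustrative assumptions and are not taken from the paper.

```python
import math

GAMMA = 0.99         # discount factor (assumed value, not from the paper)
MIN_DISTANCE = 5.0   # hypothetical separation threshold between teammates


def potential(state):
    """Hypothetical potential: 1.0 if the agent is at least MIN_DISTANCE
    away from every teammate, otherwise 0.0.

    `state` is assumed to be a dict with keys
    'agent' -> (x, y) and 'teammates' -> list of (x, y).
    """
    ax, ay = state["agent"]
    nearest = min(
        (math.hypot(ax - tx, ay - ty) for tx, ty in state["teammates"]),
        default=math.inf,  # no teammates: treat the agent as well separated
    )
    return 1.0 if nearest >= MIN_DISTANCE else 0.0


def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """Environment reward plus the shaping term gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


if __name__ == "__main__":
    crowded = {"agent": (0.0, 0.0), "teammates": [(1.0, 0.0)]}
    spread = {"agent": (0.0, 0.0), "teammates": [(6.0, 0.0)]}
    # Moving away from a teammate earns a shaping bonus; moving back costs one.
    print(shaped_reward(0.0, crowded, spread))
    print(shaped_reward(0.0, spread, crowded))
```

Because the shaping term is a difference of potentials, the bonus for entering a high-potential state is offset when the agent leaves it, which is the property behind the claim that only exploration, and not the equilibria, is affected.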

Suggested Citation

  • Sam Devlin & Daniel Kudenko & Marek Grześ, 2011. "An Empirical Study Of Potential-Based Reward Shaping And Advice In Complex, Multi-Agent Systems," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 14(02), pages 251-278.
  • Handle: RePEc:wsi:acsxxx:v:14:y:2011:i:02:n:s0219525911002998
    DOI: 10.1142/S0219525911002998

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219525911002998
    Download Restriction: Access to full text is restricted to subscribers


    References listed on IDEAS

    1. Drew Fudenberg & Jean Tirole, 1991. "Game Theory," MIT Press Books, The MIT Press, edition 1, volume 1, number 0262061414, April.

    Citations


    Cited by:

    1. De Moor, Bram J. & Gijsbrechts, Joren & Boute, Robert N., 2022. "Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management," European Journal of Operational Research, Elsevier, vol. 301(2), pages 535-545.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Janvier D. Nkurunziza, 2005. "Reputation and Credit without Collateral in Africa's Formal Banking," Economics Series Working Papers WPS/2005-02, University of Oxford, Department of Economics.
    2. Sandy Fréret & Denis Maguain, 2017. "The effects of agglomeration on tax competition: evidence from a two-regime spatial panel model on French data," International Tax and Public Finance, Springer;International Institute of Public Finance, vol. 24(6), pages 1100-1140, December.
    3. Carlo Rosa & Giovanni Verga, 2006. "The Impact of Central Bank Announcements on Asset Prices in Real Time: Testing the Efficiency of the Euribor Futures Market," CEP Discussion Papers dp0764, Centre for Economic Performance, LSE.
    4. Marco Bassetto, 2002. "A Game-Theoretic View of the Fiscal Theory of the Price Level," Econometrica, Econometric Society, vol. 70(6), pages 2167-2195, November.
    5. Arthur Schram & Boris Van Leeuwen & Theo Offerman, 2013. "Superstars Need Social Benefits: An Experiment on Network Formation," Working Papers 1306, Departament Empresa, Universitat Autònoma de Barcelona, revised Jul 2013.
    6. Patrick W. Schmitz, 2006. "Book Review," Journal of Institutional and Theoretical Economics (JITE), Mohr Siebeck, Tübingen, vol. 162(3), pages 535-542, September.
    7. Celik, Gorkem, 2006. "Mechanism design with weaker incentive compatibility constraints," Games and Economic Behavior, Elsevier, vol. 56(1), pages 37-44, July.
    8. Régis Chenavaz & Corina Paraschiv & Gabriel Turinici, 2017. "Dynamic Pricing of New Products in Competitive Markets: A Mean-Field Game Approach," Working Papers hal-01592958, HAL.
    9. Christoph Engel, 2006. "The Difficult Reception of Rigorous Descriptive Social Science in the Law," Discussion Paper Series of the Max Planck Institute for Research on Collective Goods 2006_1, Max Planck Institute for Research on Collective Goods.
    10. Boone, Jan, 2003. "Optimal Competition: A Benchmark for Competition Policy," CEPR Discussion Papers 3766, C.E.P.R. Discussion Papers.
    11. Cantillo, Miguel & Wright, Julian, 2000. "How Do Firms Choose Their Lenders? An Empirical Investigation," The Review of Financial Studies, Society for Financial Studies, vol. 13(1), pages 155-189.
    12. Pierpaolo Battigalli, 2006. "Rationalization In Signaling Games: Theory And Applications," International Game Theory Review (IGTR), World Scientific Publishing Co. Pte. Ltd., vol. 8(01), pages 67-93.
    13. Horn, Henrik & Tangerås, Thomas, 2016. "Economics and Politics of International Investment Agreements," Working Paper Series 1140, Research Institute of Industrial Economics.
    14. Benchekroun, Hassan & van Long, Ngo, 1998. "Efficiency inducing taxation for polluting oligopolists," Journal of Public Economics, Elsevier, vol. 70(2), pages 325-342, November.
    15. Matthias Greiff & Fabian Paetzel, 2012. "The Importance of Knowing Your Own Reputation," MAGKS Papers on Economics 201236, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    16. Bernheim, B Douglas, 1994. "A Theory of Conformity," Journal of Political Economy, University of Chicago Press, vol. 102(5), pages 841-877, October.
    17. Johannes Urpelainen, 2011. "Frontrunners and Laggards: The Strategy of Environmental Regulation under Uncertainty," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 50(3), pages 325-346, November.
    18. William H. Sandholm, 2005. "Negative Externalities and Evolutionary Implementation," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 72(3), pages 885-915.
    19. Schweizer, Nikolaus & Szech, Nora, 2015. "A quantitative version of Myerson regularity," Working Paper Series in Economics 76, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
    20. Stefan Ambec & Michel Poitevin, 2016. "Decision-making in organizations: when to delegate and whom to delegate," Review of Economic Design, Springer;Society for Economic Design, vol. 20(2), pages 115-143, June.
