
Multiagent Learning For Black Box System Reward Functions

Author

Listed:
  • KAGAN TUMER

    (Oregon State University, 204 Rogers Hall, Corvallis, Oregon 97331, USA)

  • ADRIAN AGOGINO

    (UCSC, NASA Ames Research Center, Mailstop 269-3, Moffett Field, California 94035, USA)

Abstract

In large, distributed systems composed of adaptive and interactive components (agents), ensuring the coordination among the agents so that the system achieves certain performance objectives is a challenging proposition. The key difficulty to overcome in such systems is one of credit assignment: How to apportion credit (or blame) to a particular agent based on the performance of the entire system. In this paper, we show how this problem can be solved in general for a large class of reward functions whose analytical form may be unknown (hence "black box" reward). This method combines the salient features of global solutions (e.g. "team games") which are broadly applicable but provide poor solutions in large problems with those of local solutions (e.g. "difference rewards") which learn quickly, but can be computationally burdensome. We introduce two estimates for local rewards for a class of problems where the mapping from the agent actions to system reward functions can be decomposed into a linear combination of nonlinear functions of the agents' actions. We test our method's performance on a distributed marketing problem and an air traffic flow management problem and show a 44% performance improvement over team games and a speedup of order n for difference rewards (for an n agent system).
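
To make the contrast between the two baseline reward schemes in the abstract concrete, here is a minimal Python sketch. It is not the paper's method: the two estimates the authors introduce are not described in the abstract and are not reproduced here. The sketch assumes the common textbook form of a difference reward, in which agent i receives the system reward minus the system reward recomputed with its own action replaced by a fixed default. The names `team_game_rewards`, `difference_rewards`, `example_G` and the `default_action` parameter are hypothetical, and `example_G` is a toy stand-in for a black box system reward.

```python
from typing import Callable, List, Sequence

Reward = Callable[[Sequence[float]], float]


def team_game_rewards(G: Reward, actions: Sequence[float]) -> List[float]:
    """Team game: every agent receives the same global reward G(actions)."""
    g = G(actions)  # a single evaluation of the black box
    return [g] * len(actions)


def difference_rewards(
    G: Reward, actions: Sequence[float], default_action: float = 0.0
) -> List[float]:
    """Counterfactual difference reward (assumed form, not the paper's estimates).

    Agent i receives G(actions) minus G evaluated with agent i's action
    replaced by `default_action`. This needs one extra evaluation of G
    per agent, which is what makes it costly for large systems.
    """
    g = G(actions)
    rewards = []
    for i in range(len(actions)):
        counterfactual = list(actions)
        counterfactual[i] = default_action  # remove agent i's contribution
        rewards.append(g - G(counterfactual))
    return rewards


def example_G(actions: Sequence[float]) -> float:
    """Toy reward: a linear combination of nonlinear functions of the actions."""
    weights = [1.0, 2.0, 3.0]  # hypothetical coefficients
    return sum(w * a ** 2 for w, a in zip(weights, actions))


if __name__ == "__main__":
    actions = [1.0, 2.0, 3.0]
    print(team_game_rewards(example_G, actions))   # same value for every agent
    print(difference_rewards(example_G, actions))  # one value per agent
```

In this toy case the team game gives every agent the same signal, while the difference reward isolates each agent's own contribution at the cost of n additional evaluations of the reward function for an n agent system.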

Suggested Citation

  • Kagan Tumer & Adrian Agogino, 2009. "Multiagent Learning For Black Box System Reward Functions," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 12(04n05), pages 475-492.
  • Handle: RePEc:wsi:acsxxx:v:12:y:2009:i:04n05:n:s0219525909002295
    DOI: 10.1142/S0219525909002295

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219525909002295
    Download Restriction: Access to full text is restricted to subscribers



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Thorsten Chmura & Thomas Pitz, 2007. "An Extended Reinforcement Algorithm for Estimation of Human Behaviour in Experimental Congestion Games," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 10(2), pages 1-1.
    2. Chmura, Thorsten & Pitz, Thomas, 2004. "Minority Game: Experiments and Simulations of Traffic Scenarios," Bonn Econ Discussion Papers 23/2004, University of Bonn, Bonn Graduate School of Economics (BGSE).
    3. Xu, C. & Gu, G.-Q. & Hui, P.M., 2024. "Impacts of an expert’s opinion on the collective performance of a competing population for limited resources," Chaos, Solitons & Fractals, Elsevier, vol. 183(C).
    4. Li-Xin Zhong & Wen-Juan Xu & Ping Huang & Chen-Yang Zhong & Tian Qiu, 2013. "Self-organization and phase transition in financial markets with multiple choices," Papers 1312.0690, arXiv.org, revised Jun 2014.
    5. Epstein, Daniel & Bazzan, Ana L.C., 2013. "The value of less connected agents in Boolean networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(21), pages 5387-5398.
    6. Kets, W., 2007. "The Minority Game : An Economics Perspective," Other publications TiSEM 65d52a6a-b27d-45a9-93a7-e, Tilburg University, School of Economics and Management.
    7. Aki-Hiro Sato & Hideki Takayasu, 2001. "Derivation of ARCH(1) process from market price changes based on deterministic microscopic multi-agent," Papers cond-mat/0104313, arXiv.org.
    8. Matteo Marsili & Damien Challet, 2001. "Trading Behavior And Excess Volatility In Toy Markets," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 4(01), pages 3-17.
    9. Li-Xin Zhong & Wen-Juan Xu & Fei Ren & Yong-Dong Shi, 2012. "Coupled effects of market impact and asymmetric sensitivity in financial markets," Papers 1209.3399, arXiv.org, revised Jan 2013.
    10. Zhong, Li-Xin & Xu, Wen-Juan & Ren, Fei & Shi, Yong-Dong, 2013. "Coupled effects of market impact and asymmetric sensitivity in financial markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(9), pages 2139-2149.
    11. Marsili, Matteo & Challet, Damien & Zecchina, Riccardo, 2000. "Exact solution of a modified El Farol's bar problem: Efficiency and the role of market impact," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 280(3), pages 522-553.
    12. Zhong, Li-Xin & Xu, Wen-Juan & Chen, Rong-Da & Zhong, Chen-Yang & Qiu, Tian & Ren, Fei & He, Yun-Xing, 2018. "Self-reinforcing feedback loop in financial markets with coupling of market impact and momentum traders," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 493(C), pages 301-310.
    13. Mansilla, R, 2000. "From naive to sophisticated behavior in multiagents-based financial market models," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 284(1), pages 478-488.
