
Bayesian Exploration: Incentivizing Exploration in Bayesian Games

Author

Listed:
  • Yishay Mansour

    (Tel Aviv University, Tel Aviv, Israel; Google, Tel Aviv, Israel)

  • Alex Slivkins

    (Microsoft Research, New York, New York 10012)

  • Vasilis Syrgkanis

    (Microsoft Research, Cambridge, Massachusetts 02142)

  • Zhiwei Steven Wu

    (Carnegie Mellon University, Pittsburgh, Pennsylvania 15213)

Abstract

We consider a ubiquitous scenario in the internet economy when individual decision makers (henceforth, agents) both produce and consume information as they make strategic choices in an uncertain environment. This creates a three-way trade-off between exploration (trying out insufficiently explored alternatives to help others in the future), exploitation (making optimal decisions given the information discovered by other agents), and incentives of the agents (who are myopically interested in exploitation while preferring the others to explore). We posit a principal who controls the flow of information from agents that came before to the ones that arrive later and strives to coordinate the agents toward a socially optimal balance between exploration and exploitation, not using any monetary transfers. The goal is to design a recommendation policy for the principal that respects agents’ incentives and minimizes a suitable notion of regret. We extend prior work in this direction to allow the agents to interact with one another in a shared environment: at each time step, multiple agents arrive to play a Bayesian game, receive recommendations, choose their actions, receive their payoffs, and then leave the game forever. The agents now face two sources of uncertainty: the actions of the other agents and the parameters of the uncertain game environment. Our main contribution is to show that the principal can achieve constant regret when the utilities are deterministic (the constant depends on the prior distribution but not on the time horizon) and logarithmic regret when the utilities are stochastic. As a key technical tool, we introduce the concept of explorable actions, the actions that some incentive-compatible policy can recommend with nonzero probability. We show how the principal can identify (and explore) all explorable actions and use the revealed information to perform optimally. In particular, our results significantly improve over the prior work on the special case of a single agent per round, which relies on assumptions to guarantee that all actions are explorable. Interestingly, we do not require the principal’s utility to be aligned with the cumulative utility of the agents; instead, the principal can optimize an arbitrary notion of per-round reward.
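
To make the notion of an explorable action concrete, here is a toy sketch in Python. It is not the paper's algorithm. It assumes the simplest special case the abstract mentions (a single agent per round), plus further simplifications that are ours: two actions with independent priors, deterministic rewards, and a first agent who takes action 1 because it has the higher prior mean. The function name and parameters are hypothetical. Under these assumptions, a policy that has observed action 1's reward x may recommend action 2 only if the expected gain, conditional on that recommendation, is nonnegative; the sketch computes the largest probability with which such a policy can recommend action 2 to the second agent. Action 2 is explorable in this toy model exactly when that probability is nonzero.

```python
from fractions import Fraction


def max_prob_recommend_arm2(prior_arm1, mean_arm2):
    """Toy model (our assumptions, not the paper's setting in full generality).

    prior_arm1: dict mapping each possible deterministic reward x of arm 1
                to its prior probability.
    mean_arm2:  prior mean of arm 2, assumed independent of arm 1.

    Returns the largest probability with which an incentive-compatible
    policy can recommend arm 2 to the second agent, after the first agent
    has played arm 1 and revealed x.
    """
    mean_arm1 = sum(p * x for x, p in prior_arm1.items())
    if mean_arm2 >= mean_arm1:
        raise ValueError("sketch assumes arm 1 has the strictly higher prior mean")

    # Recommending arm 2 whenever x <= mean_arm2 is always safe; it also
    # builds up a "budget" of expected gain that can pay for exploration.
    safe = [(x, p) for x, p in prior_arm1.items() if x <= mean_arm2]
    total = sum(p for _, p in safe)
    budget = sum(p * (mean_arm2 - x) for x, p in safe)
    if total == 0:
        return total  # arm 2 is never weakly better: not explorable here

    # Spend the budget on outcomes where arm 1 looks better, cheapest first
    # (a fractional-knapsack greedy on the cost x - mean_arm2).
    for x in sorted(v for v in prior_arm1 if v > mean_arm2):
        p = prior_arm1[x]
        cost = p * (x - mean_arm2)
        if cost <= budget:
            total += p
            budget -= cost
        else:
            total += budget / (x - mean_arm2)
            break
    return total
```

For example, with arm 1 worth 0 with probability 1/4 and 1 with probability 3/4, and arm 2 having prior mean 1/2, the sketch returns 1/2: the policy always recommends arm 2 when arm 1 turned out to be worth 0, and recommends it one time in three when arm 1 turned out to be worth 1. If arm 1's reward always exceeds the prior mean of arm 2, the sketch returns 0, so arm 2 is not explorable in this toy model.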

Suggested Citation

  • Yishay Mansour & Alex Slivkins & Vasilis Syrgkanis & Zhiwei Steven Wu, 2022. "Bayesian Exploration: Incentivizing Exploration in Bayesian Games," Operations Research, INFORMS, vol. 70(2), pages 1105-1127, March.
  • Handle: RePEc:inm:oropre:v:70:y:2022:i:2:p:1105-1127
    DOI: 10.1287/opre.2021.2205

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.2021.2205
    Download Restriction: no

