
Approachability in Stackelberg Stochastic Games with Vector Costs

Author

Listed:
  • Dileep Kalathil

    (EECS at UC Berkeley)

  • Vivek S. Borkar

    (IIT Bombay)

  • Rahul Jain

    (University of Southern California)

Abstract

The notion of approachability was introduced by Blackwell (Pac J Math 6(1):1–8, 1956) in the context of vector-valued repeated games. The famous ‘Blackwell’s approachability theorem’ prescribes a strategy for approachability, i.e., for ‘steering’ the average vector cost of a given agent toward a given target set, irrespective of the strategies of the other agents. In this paper, motivated by the multi-objective optimization/decision-making problems in dynamically changing environments, we address the approachability problem in Stackelberg stochastic games with vector-valued cost functions. We make two main contributions. Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell’s. Secondly, we give a reinforcement learning algorithm for learning the approachable strategy when the transition kernel is unknown. We also recover as a by-product Blackwell’s necessary and sufficient conditions for approachability for convex sets in this setup and thus a complete characterization. We give sufficient conditions for non-convex sets.
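
As background for the abstract, the sketch below illustrates Blackwell's classic approachability strategy in a vector-valued repeated game, the setting the paper generalizes; it is not the paper's algorithm for Stackelberg stochastic games, and the payoff matrix, target set, and i.i.d. opponent are illustrative assumptions chosen so that Blackwell's condition holds (for a convex target set S, approachability holds exactly when, for every mixed action of the opponent, the agent has a mixed response whose expected vector cost lies in S). The strategy projects the running average cost onto S and, whenever the average lies outside S, plays a mixed action chosen against the outward normal at that projection.

# A minimal sketch (assumed example, not taken from the paper): Blackwell's
# approachability strategy for a two-action vector-cost repeated game.
import numpy as np

rng = np.random.default_rng(0)

# Vector costs u[i, j] in R^2: the agent (action i) decides which coordinate
# absorbs the adversary's "load" j in {0, 1}.
u = np.array([[[0.0, 0.0], [0.0, 1.0]],   # agent action 0
              [[0.0, 0.0], [1.0, 0.0]]])  # agent action 1

# Target set S = {x : x_1 <= 0.5 and x_2 <= 0.5}; its Euclidean projection is
# a coordinate-wise clip from above.
def project_onto_S(x):
    return np.minimum(x, 0.5)

def blackwell_action(avg):
    """If the running average cost is outside S, play the mixed action that
    minimizes the worst-case (over the adversary's actions) inner product of
    the expected cost with the outward normal at the projection onto S."""
    lam = avg - project_onto_S(avg)            # outward normal direction
    if np.allclose(lam, 0.0):                  # already in S: anything works
        return int(rng.integers(2))
    p_grid = np.linspace(0.0, 1.0, 101)        # p = probability of action 1
    expected = (p_grid[:, None, None] * u[1]
                + (1.0 - p_grid)[:, None, None] * u[0])  # shape (101, 2, 2)
    worst = (expected @ lam).max(axis=1)       # worst case over adversary j
    p = p_grid[worst.argmin()]
    return int(rng.random() < p)

T = 20000
avg = np.zeros(2)
for t in range(1, T + 1):
    i = blackwell_action(avg)
    j = int(rng.integers(2))                   # adversary: i.i.d. uniform here
    avg += (u[i, j] - avg) / t                 # running average of vector costs

print("average vector cost:   ", avg)
print("distance to target set:", float(np.linalg.norm(avg - project_onto_S(avg))))

With these payoffs the condition above holds (mixing the two actions equally keeps both expected cost coordinates at or below 0.5 against any opponent), so the distance of the running average to S tends to zero; the paper's contribution is a comparable, computationally tractable steering strategy for Stackelberg stochastic games, plus a reinforcement learning variant for unknown transition kernels.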

Suggested Citation

  • Dileep Kalathil & Vivek S. Borkar & Rahul Jain, 2017. "Approachability in Stackelberg Stochastic Games with Vector Costs," Dynamic Games and Applications, Springer, vol. 7(3), pages 422-442, September.
  • Handle: RePEc:spr:dyngam:v:7:y:2017:i:3:d:10.1007_s13235-016-0198-y
    DOI: 10.1007/s13235-016-0198-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13235-016-0198-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s13235-016-0198-y?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a source where you can use your library subscription to access this item

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Shie Mannor & Nahum Shimkin, 2003. "The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 28(2), pages 327-345, May.
    2. Michel Benaïm & Josef Hofbauer & Sylvain Sorin, 2006. "Stochastic Approximations and Differential Inclusions, Part II: Applications," Mathematics of Operations Research, INFORMS, vol. 31(4), pages 673-695, November.
    3. Milman, Emanuel, 2006. "Approachable sets of vector payoffs in stochastic games," Games and Economic Behavior, Elsevier, vol. 56(1), pages 135-147, July.
    4. Huizhen Yu & Dimitri P. Bertsekas, 2013. "On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems," Mathematics of Operations Research, INFORMS, vol. 38(2), pages 209-227, May.
    5. Michel Benaim & Josef Hofbauer & Sylvain Sorin, 2005. "Stochastic Approximations and Differential Inclusions II: Applications," Levine's Bibliography 784828000000000098, UCLA Department of Economics.
    6. Eyal Even-Dar & Sham. M. Kakade & Yishay Mansour, 2009. "Online Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 34(3), pages 726-736, August.
    7. Jia Yuan Yu & Shie Mannor & Nahum Shimkin, 2009. "Markov Decision Processes with Arbitrary Reward Processes," Mathematics of Operations Research, INFORMS, vol. 34(3), pages 737-757, August.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.

    Cited by:

    1. Soham R. Phade & Venkat Anantharam, 2023. "Learning in Games with Cumulative Prospect Theoretic Preferences," Dynamic Games and Applications, Springer, vol. 13(1), pages 265-306, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mathieu Faure & Gregory Roth, 2010. "Stochastic Approximations of Set-Valued Dynamical Systems: Convergence with Positive Probability to an Attractor," Mathematics of Operations Research, INFORMS, vol. 35(3), pages 624-640, August.
    2. Akimoto, Youhei & Auger, Anne & Hansen, Nikolaus, 2022. "An ODE method to prove the geometric convergence of adaptive stochastic algorithms," Stochastic Processes and their Applications, Elsevier, vol. 145(C), pages 269-307.
    3. Shiau Hong Lim & Huan Xu & Shie Mannor, 2016. "Reinforcement Learning in Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1325-1353, November.
    4. Andrey Bernstein & Shie Mannor & Nahum Shimkin, 2014. "Opportunistic Approachability and Generalized No-Regret Problems," Mathematics of Operations Research, INFORMS, vol. 39(4), pages 1057-1083, November.
    5. Saeed Hadikhanloo & Rida Laraki & Panayotis Mertikopoulos & Sylvain Sorin, 2022. "Learning in nonatomic games, part Ⅰ: Finite action spaces and population games," Post-Print hal-03767995, HAL.
    6. Andriy Zapechelnyuk, 2009. "Limit Behavior of No-regret Dynamics," Discussion Papers 21, Kyiv School of Economics.
    7. Sylvain Sorin, 2023. "Continuous Time Learning Algorithms in Optimization and Game Theory," Dynamic Games and Applications, Springer, vol. 13(1), pages 3-24, March.
    8. Viossat, Yannick & Zapechelnyuk, Andriy, 2013. "No-regret dynamics and fictitious play," Journal of Economic Theory, Elsevier, vol. 148(2), pages 825-842.
    9. Michel Benaïm & Mathieu Faure, 2013. "Consistency of Vanishingly Smooth Fictitious Play," Mathematics of Operations Research, INFORMS, vol. 38(3), pages 437-450, August.
    10. Michel Benaim & Olivier Raimond, 2007. "Simulated Annealing, Vertex-Reinforced Random Walks and Learning in Games," Levine's Bibliography 122247000000001702, UCLA Department of Economics.
    11. Josef Hofbauer & Sylvain Sorin & Yannick Viossat, 2009. "Time Average Replicator and Best-Reply Dynamics," Mathematics of Operations Research, INFORMS, vol. 34(2), pages 263-269, May.
    12. Jason M. Altschuler & Kunal Talwar, 2021. "Online Learning over a Finite Action Set with Limited Switching," Mathematics of Operations Research, INFORMS, vol. 46(1), pages 179-203, February.
    13. Michel Benaïm & Josef Hofbauer & Sylvain Sorin, 2012. "Perturbations of Set-Valued Dynamical Systems, with Applications to Game Theory," Dynamic Games and Applications, Springer, vol. 2(2), pages 195-205, June.
    14. Bervoets, Sebastian & Faure, Mathieu, 2020. "Convergence in games with continua of equilibria," Journal of Mathematical Economics, Elsevier, vol. 90(C), pages 25-30.
    15. Benaïm, Michel & Hofbauer, Josef & Hopkins, Ed, 2009. "Learning in games with unstable equilibria," Journal of Economic Theory, Elsevier, vol. 144(4), pages 1694-1709, July.
    16. Fournier, Gaëtan & Kuperwasser, Eden & Munk, Orin & Solan, Eilon & Weinbaum, Avishay, 2021. "Approachability with constraints," European Journal of Operational Research, Elsevier, vol. 292(2), pages 687-695.
    17. Rad Niazadeh & Negin Golrezaei & Joshua Wang & Fransisca Susan & Ashwinkumar Badanidiyuru, 2023. "Online Learning via Offline Greedy Algorithms: Applications in Market Design and Optimization," Management Science, INFORMS, vol. 69(7), pages 3797-3817, July.
    18. Cason, Timothy N. & Friedman, Daniel & Hopkins, Ed, 2010. "Testing the TASP: An experimental investigation of learning in games with unstable equilibria," Journal of Economic Theory, Elsevier, vol. 145(6), pages 2309-2331, November.
    19. Eunji Lim, 2011. "On the Convergence Rate for Stochastic Approximation in the Nonsmooth Setting," Mathematics of Operations Research, INFORMS, vol. 36(3), pages 527-537, August.
    20. Mannor, Shie & Shimkin, Nahum, 2008. "Regret minimization in repeated matrix games with variable stage duration," Games and Economic Behavior, Elsevier, vol. 63(1), pages 227-258, May.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:dyngam:v:7:y:2017:i:3:d:10.1007_s13235-016-0198-y. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.