Printed from https://ideas.repec.org/a/spr/annopr/v241y2016i1d10.1007_s10479-013-1523-0.html

Four proofs of Gittins’ multiarmed bandit theorem

Author

Listed:
  • Esther Frostig (The University of Haifa)
  • Gideon Weiss (The University of Haifa)

Abstract

We study four proofs that the Gittins index priority rule is optimal for alternative bandit processes. These include Gittins’ original exchange argument, Weber’s prevailing charge argument, Whittle’s Lagrangian dual approach, and Bertsimas and Niño-Mora’s proof based on the achievable region approach and generalized conservation laws. We extend the achievable region proof to infinite countable state spaces, by using infinite dimensional linear programming theory.
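For readers unfamiliar with the index rule the abstract refers to, the following is a minimal illustrative sketch and is not taken from the paper. It assumes a single arm with a finite state space, a transition matrix `P`, a reward vector `r`, and a discount factor `beta` in (0, 1); all of these names, and the function `gittins_indices`, are hypothetical. It uses the largest-index-first recursion commonly attributed to Varaiya, Walrand and Buyukkoc, with a simple fixed-point iteration in place of a linear solve.

```python
def gittins_indices(P, r, beta, tol=1e-12):
    """Gittins indices of a finite-state arm (illustrative sketch).

    P    : list of rows, P[i][j] = transition probability from i to j
    r    : list of one-step rewards, r[i] earned when the arm is played in i
    beta : discount factor, assumed to satisfy 0 < beta < 1
    """
    n = len(r)
    index = [None] * n
    cont = set()  # states already ranked; each has an index >= all unranked ones

    while len(cont) < n:
        # d[i]: expected discounted reward, b[i]: expected discounted time,
        # when the arm is played from i until it first leaves `cont`
        # (the initial play from i always counts).
        d = [0.0] * n
        b = [0.0] * n
        while True:
            new_d = [r[i] + beta * sum(P[i][j] * d[j] for j in cont)
                     for i in range(n)]
            new_b = [1.0 + beta * sum(P[i][j] * b[j] for j in cont)
                     for i in range(n)]
            delta = max(max(abs(x - y) for x, y in zip(new_d, d)),
                        max(abs(x - y) for x, y in zip(new_b, b)))
            d, b = new_d, new_b
            if delta < tol:
                break

        # The unranked state with the largest reward-per-time ratio is next.
        best = max((i for i in range(n) if i not in cont),
                   key=lambda i: d[i] / b[i])
        index[best] = d[best] / b[best]
        cont.add(best)

    return index


# Hypothetical two-state arm: state 0 pays 1 forever, state 1 pays 0 once
# and then moves to state 0.
P = [[1.0, 0.0],
     [1.0, 0.0]]
r = [1.0, 0.0]
indices = gittins_indices(P, r, beta=0.5)
print(indices)  # approximately [1.0, 0.5]
```

With several independent arms, the priority rule would at each step play the arm whose current state has the largest index. State 1 receives index 0.5 here because the best stopping rule from it is never to stop, giving discounted reward 1 over discounted time 2.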

Suggested Citation

  • Esther Frostig & Gideon Weiss, 2016. "Four proofs of Gittins’ multiarmed bandit theorem," Annals of Operations Research, Springer, vol. 241(1), pages 127-165, June.
  • Handle: RePEc:spr:annopr:v:241:y:2016:i:1:d:10.1007_s10479-013-1523-0
    DOI: 10.1007/s10479-013-1523-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-013-1523-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Eric V. Denardo & Haechurl Park & Uriel G. Rothblum, 2007. "Risk-Sensitive and Risk-Neutral Multiarmed Bandits," Mathematics of Operations Research, INFORMS, vol. 32(2), pages 374-394, May.
    2. Eric Denardo & Eugene Feinberg & Uriel Rothblum, 2013. "The multi-armed bandit, with constraints," Annals of Operations Research, Springer, vol. 208(1), pages 37-62, September.
    3. J. Michael Harrison, 1975. "Dynamic Scheduling of a Multiclass Queue: Discount Optimality," Operations Research, INFORMS, vol. 23(2), pages 270-282, April.
    4. A. Federgruen & H. Groenevelt, 1988. "Characterization and Optimization of Achievable Performance in General Queueing Systems," Operations Research, INFORMS, vol. 36(5), pages 733-741, October.
    5. K.D. Glazebrook & R. Garbe, 1999. "Almost optimal policies for stochastic systems which almost satisfy conservation laws," Annals of Operations Research, Springer, vol. 92(0), pages 19-43, January.
    6. Kevin D. Glazebrook & José Niño-Mora, 2001. "Parallel Scheduling of Multiclass M/M/m Queues: Approximate and Heavy-Traffic Optimization of Achievable Performance," Operations Research, INFORMS, vol. 49(4), pages 609-623, August.
    7. Sonin, Isaac M., 2008. "A generalized Gittins index for a Markov chain and its recursive calculation," Statistics & Probability Letters, Elsevier, vol. 78(12), pages 1526-1533, September.
    8. Meilijson, Isaac & Weiss, Gideon, 1977. "Multiple feedback at a single-server station," Stochastic Processes and their Applications, Elsevier, vol. 5(2), pages 195-205, May.
    9. J. George Shanthikumar & David D. Yao, 1992. "Multiclass Queueing Systems: Polymatroidal Structure and Optimal Scheduling Control," Operations Research, INFORMS, vol. 40(3-supplem), pages 293-299, June.
    10. Michael N. Katehakis & Arthur F. Veinott, 1987. "The Multi-Armed Bandit Problem: Decomposition and Computation," Mathematics of Operations Research, INFORMS, vol. 12(2), pages 262-268, May.
    11. M. Dacre & K. Glazebrook & J. Niño‐Mora, 1999. "The achievable region approach to the optimal control of stochastic systems," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(4), pages 747-791.

    Citations


    Cited by:

    1. Samuel N. Cohen & Tanut Treetanthiploet, 2019. "Gittins' theorem under uncertainty," Papers 1907.05689, arXiv.org, revised Jun 2021.
    2. Francisco Alvarez, 2018. "Decomposing risk in an exploitation–exploration problem with endogenous termination time," Annals of Operations Research, Springer, vol. 261(1), pages 45-77, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    2. Shaler Stidham, 2002. "Analysis, Design, and Control of Queueing Systems," Operations Research, INFORMS, vol. 50(1), pages 197-216, February.
    3. Malekipirbazari, Milad & Çavuş, Özlem, 2024. "Index policy for multiarmed bandit problem with dynamic risk measures," European Journal of Operational Research, Elsevier, vol. 312(2), pages 627-640.
    4. R. T. Dunn & K. D. Glazebrook, 2004. "Discounted Multiarmed Bandit Problems on a Collection of Machines with Varying Speeds," Mathematics of Operations Research, INFORMS, vol. 29(2), pages 266-279, May.
    5. José Niño-Mora, 2000. "On certain greedoid polyhedra, partially indexable scheduling problems and extended restless bandit allocation indices," Economics Working Papers 456, Department of Economics and Business, Universitat Pompeu Fabra.
    6. José Niño-Mora, 2000. "Beyond Smith's rule: An optimal dynamic index rule for single machine stochastic scheduling with convex holding costs," Economics Working Papers 514, Department of Economics and Business, Universitat Pompeu Fabra.
    7. Eric Denardo & Eugene Feinberg & Uriel Rothblum, 2013. "The multi-armed bandit, with constraints," Annals of Operations Research, Springer, vol. 208(1), pages 37-62, September.
    8. Jasper Vanlerberghe & Tom Maertens & Joris Walraevens & Stijn Vuyst & Herwig Bruneel, 2016. "On the optimization of two-class work-conserving parameterized scheduling policies," 4OR, Springer, vol. 14(3), pages 281-308, September.
    9. Dimitris Bertsimas & José Niño-Mora, 1994. "Restless bandits, linear programming relaxations and a primal-dual index heuristic," Economics Working Papers 301, Department of Economics and Business, Universitat Pompeu Fabra, revised Oct 1997.
    10. Bertsimas, Dimitris, 1995. "The achievable region method in the optimal control of queueing systems: formulations, bounds and policies," Working papers 3837-95, Massachusetts Institute of Technology (MIT), Sloan School of Management.
    11. José Niño-Mora, 2020. "A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 465-496, May.
    12. Hellerstein, Lisa & Lidbetter, Thomas, 2023. "A game theoretic approach to a problem in polymatroid maximization," European Journal of Operational Research, Elsevier, vol. 305(2), pages 979-988.
    13. Vanlerberghe, Jasper & Walraevens, Joris & Maertens, Tom & Bruneel, Herwig, 2018. "Calculation of the performance region of an easy-to-optimize alternative for Generalized Processor Sharing," European Journal of Operational Research, Elsevier, vol. 270(2), pages 625-635.
    14. Nicole Bäuerle & Ulrich Rieder, 2014. "More Risk-Sensitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 39(1), pages 105-120, February.
    15. Santiago R. Balseiro & Ozan Candogan, 2017. "Optimal Contracts for Intermediaries in Online Advertising," Operations Research, INFORMS, vol. 65(4), pages 878-896, August.
    16. Nicolas Gast & Bruno Gaujal & Kimang Khun, 2023. "Testing indexability and computing Whittle and Gittins index in subcubic time," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 97(3), pages 391-436, June.
    17. Dimitris Bertsimas & José Niño-Mora, 1996. "Optimization of multiclass queueing networks with changeover times via the achievable region approach: Part I, the single-station case," Economics Working Papers 302, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 1998.
    18. Muhammad El-Taha, 2016. "Invariance of workload in queueing systems," Queueing Systems: Theory and Applications, Springer, vol. 83(1), pages 181-192, June.
    19. Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.
    20. K. D. Glazebrook & R. Minty, 2009. "A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements," Mathematics of Operations Research, INFORMS, vol. 34(1), pages 26-44, February.
