
Restless bandits, linear programming relaxations and a primal-dual index heuristic

Authors

  • Dimitris Bertsimas
  • José Niño-Mora

Abstract

We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of n increasingly strong linear programming relaxations, where n is the number of bandits. The last relaxation is exact and corresponds to the (exponential-size) formulation of the problem as a Markov decision chain, while the others provide bounds and can be computed efficiently. We also propose a priority-index heuristic scheduling policy derived from the solution to the first-order relaxation, with indices defined in terms of optimal dual variables. This approach yields both a policy and a suboptimality guarantee. We report computational experiments suggesting that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
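
For orientation, the first-order relaxation mentioned in the abstract can be sketched as a linear program over discounted occupation measures. The notation below is an assumption introduced for illustration and is not taken from this record: projects are indexed k = 1, ..., n; project k has a finite state space E_k; exactly M projects are active in each period; β in (0, 1) is the discount factor; α_j is the initial probability of state j; R^a_i and p^a_{ij} are the reward and transition probability in state i under action a (1 = active, 0 = passive); and x^a_i is the expected total discounted time a project spends in state i under action a.

```latex
% Sketch of a first-order LP relaxation (assumed notation, see lead-in).
\begin{align*}
Z^1 \;=\; \max_{x \ge 0}\quad
  & \sum_{k=1}^{n} \sum_{i \in E_k} \sum_{a \in \{0,1\}} R^a_i \, x^a_i \\
\text{s.t.}\quad
  & \sum_{a \in \{0,1\}} x^a_j
    \;=\; \alpha_j + \beta \sum_{i \in E_k} \sum_{a \in \{0,1\}} p^a_{ij} \, x^a_i,
    && j \in E_k,\; k = 1, \dots, n, \\
  & \sum_{k=1}^{n} \sum_{i \in E_k} x^1_i \;=\; \frac{M}{1-\beta}.
\end{align*}

% Reduced costs from the dual, with \lambda_i attached to the balance
% constraints and \lambda to the coupling constraint (assumed names).
\begin{align*}
\gamma^0_i &= \lambda_i - \beta \sum_{j \in E_k} p^0_{ij} \, \lambda_j - R^0_i \;\ge\; 0, \\
\gamma^1_i &= \lambda_i - \beta \sum_{j \in E_k} p^1_{ij} \, \lambda_j + \lambda - R^1_i \;\ge\; 0,
  \qquad i \in E_k .
\end{align*}
```

Any admissible policy induces occupation measures that satisfy these constraints, so Z^1 is an upper bound on the optimal value. Summing the balance constraints over j in E_k shows that each project contributes total discounted time 1/(1 − β), which is consistent with the coupling constraint M/(1 − β). An index of the form δ_i = γ^1_i − γ^0_i, evaluated at an optimal dual solution, is one plausible reading of "indices defined in terms of optimal dual variables"; the exact definition and the priority rule should be checked against the full paper.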

Suggested Citation

  • Dimitris Bertsimas & José Niño-Mora, 1994. "Restless bandits, linear programming relaxations and a primal-dual index heuristic," Economics Working Papers 301, Department of Economics and Business, Universitat Pompeu Fabra, revised Oct 1997.
  • Handle: RePEc:upf:upfgen:301

    Download full text from publisher

    File URL: https://econ-papers.upf.edu/papers/301.pdf
    File Function: Whole Paper
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    2. Esther Frostig & Gideon Weiss, 2016. "Four proofs of Gittins’ multiarmed bandit theorem," Annals of Operations Research, Springer, vol. 241(1), pages 127-165, June.
    3. Hellerstein, Lisa & Lidbetter, Thomas, 2023. "A game theoretic approach to a problem in polymatroid maximization," European Journal of Operational Research, Elsevier, vol. 305(2), pages 979-988.
    4. José Niño-Mora, 2000. "On certain greedoid polyhedra, partially indexable scheduling problems and extended restless bandit allocation indices," Economics Working Papers 456, Department of Economics and Business, Universitat Pompeu Fabra.
    5. Santiago R. Balseiro & Ozan Candogan, 2017. "Optimal Contracts for Intermediaries in Online Advertising," Operations Research, INFORMS, vol. 65(4), pages 878-896, August.
    6. Dimitris Bertsimas & José Niño-Mora, 1996. "Optimization of multiclass queueing networks with changeover times via the achievable region approach: Part I, the single-station case," Economics Working Papers 302, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 1998.
    7. Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.
    8. Shaler Stidham, 2002. "Analysis, Design, and Control of Queueing Systems," Operations Research, INFORMS, vol. 50(1), pages 197-216, February.
    9. Tianhu Deng & Ying‐Ju Chen & Zuo‐Jun Max Shen, 2015. "Optimal pricing and scheduling control of product shipping," Naval Research Logistics (NRL), John Wiley & Sons, vol. 62(3), pages 215-227, April.
    10. Bertsimas, Dimitris. & Niño-Mora, Jose., 1994. "Restless bandit, linear programming relaxations and a primal-dual heuristic," Working papers 3727-94., Massachusetts Institute of Technology (MIT), Sloan School of Management.
    11. Dimitris Bertsimas & José Niño-Mora, 2000. "Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic," Operations Research, INFORMS, vol. 48(1), pages 80-90, February.
    12. Dimitris Bertsimas & José Niño-Mora, 1996. "Optimization of multiclass queueing networks with changeover times via the achievable region method: Part II, the multi-station case," Economics Working Papers 314, Department of Economics and Business, Universitat Pompeu Fabra, revised Aug 1998.
    13. Dimitris Bertsimas & José Niño-Mora, 1999. "Optimization of Multiclass Queueing Networks with Changeover Times Via the Achievable Region Approach: Part I, The Single-Station Case," Mathematics of Operations Research, INFORMS, vol. 24(2), pages 306-330, May.
    14. Liao Wang & David D. Yao, 2021. "Risk Hedging for Production Planning," Production and Operations Management, Production and Operations Management Society, vol. 30(6), pages 1825-1837, June.
    15. Bertsimas, Dimitris., 1995. "The achievable region method in the optimal control of queueing systems : formulations, bounds and policies," Working papers 3837-95., Massachusetts Institute of Technology (MIT), Sloan School of Management.
    16. Baris Ata & Yichuan Ding & Stefanos Zenios, 2021. "An Achievable-Region-Based Approach for Kidney Allocation Policy Design with Endogenous Patient Choice," Manufacturing & Service Operations Management, INFORMS, vol. 23(1), pages 36-54, January-February.
    17. Marcus Dacre & Kevin Glazebrook & José Niño-Mora, 1998. "The achievable region approach to the optimal control of stochastic systems," Economics Working Papers 306, Department of Economics and Business, Universitat Pompeu Fabra.
    18. José Niño-Mora, 2020. "A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 465-496, May.
    19. Vanlerberghe, Jasper & Walraevens, Joris & Maertens, Tom & Bruneel, Herwig, 2018. "Calculation of the performance region of an easy-to-optimize alternative for Generalized Processor Sharing," European Journal of Operational Research, Elsevier, vol. 270(2), pages 625-635.
    20. Veeraruna Kavitha & Jayakrishnan Nair & Raman Kumar Sinha, 2019. "Pseudo conservation for partially fluid, partially lossy queueing systems," Annals of Operations Research, Springer, vol. 277(2), pages 255-292, June.

    More about this item

    Keywords

    Stochastic scheduling; bandit problems; resource allocation; dynamic programming

    JEL classification:

    • C60 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - General
    • C61 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Optimization Techniques; Programming Models; Dynamic Analysis
