IDEAS home Printed from https://ideas.repec.org/a/inm/oropre/v61y2013i2p413-425.html
   My bibliography  Save this article

A Linear Programming Approach to Nonstationary Infinite-Horizon Markov Decision Processes

Author

Listed:
  • Archis Ghate

    (Industrial and Systems Engineering, University of Washington, Seattle, Washington 98195)

  • Robert L. Smith

    (Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109)

Abstract

Nonstationary infinite-horizon Markov decision processes (MDPs) generalize the most well-studied class of sequential decision models in operations research, namely, that of stationary MDPs, by relaxing the restrictive assumption that problem data do not change over time. Linear programming (LP) has been very successful in obtaining structural insights and devising solution methods for stationary MDPs. However, an LP approach for nonstationary MDPs is currently missing. This is because the LP formulation of a nonstationary infinite-horizon MDP includes countably infinite variables and constraints, and research on such infinite-dimensional LPs has traditionally faced several hurdles. For instance, duality results may not hold; an extreme point may not be a basic feasible solution; and in the context of a simplex algorithm, a pivot operation may require infinite data and computations, and a sequence of improving extreme points need not converge in value to optimal. In this paper, we tackle these challenges and establish (1) weak and strong duality, (2) complementary slackness, (3) a basic feasible solution characterization of extreme points, (4) a one-to-one correspondence between extreme points and deterministic Markovian policies, and (5) we devise a simplex algorithm for an infinite-dimensional LP formulation of nonstationary infinite-horizon MDPs. Pivots in this simplex algorithm use finite data, perform finite computations, and generate a sequence of improving extreme points that converges in value to optimal. Moreover, this sequence of extreme points gets arbitrarily close to the set of optimal extreme points. We also prove that decisions prescribed by these extreme points are eventually exactly optimal in all states of the nonstationary infinite-horizon MDP in early periods.

Suggested Citation

  • Archis Ghate & Robert L. Smith, 2013. "A Linear Programming Approach to Nonstationary Infinite-Horizon Markov Decision Processes," Operations Research, INFORMS, vol. 61(2), pages 413-425, April.
  • Handle: RePEc:inm:oropre:v:61:y:2013:i:2:p:413-425
    DOI: 10.1287/opre.1120.1121
    as

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.1120.1121
    Download Restriction: no

    File URL: https://libkey.io/10.1287/opre.1120.1121?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    References listed on IDEAS

    as
    1. Archis Ghate & Dushyant Sharma & Robert L. Smith, 2010. "A Shadow Simplex Method for Infinite Linear Programs," Operations Research, INFORMS, vol. 58(4-part-1), pages 865-877, August.
    2. Daniel Adelman, 2007. "Dynamic Bid Prices in Revenue Management," Operations Research, INFORMS, vol. 55(4), pages 647-661, August.
    3. Robert L. Smith & Rachel Q. Zhang, 1998. "Infinite Horizon Production Planning in Time-Varying Systems with Convex Production and Inventory Costs," Management Science, INFORMS, vol. 44(9), pages 1313-1320, September.
    4. Wallace J. Hopp, 1989. "Technical Note—Identifying Forecast Horizons in Nonhomogeneous Markov Decision Processes," Operations Research, INFORMS, vol. 37(2), pages 339-343, April.
    5. Awi Federgruen & Michal Tzur, 1995. "Fast Solution and Detection of Minimal Forecast Horizons in Dynamic Programs with a Single Indicator of the Future: Applications to Dynamic Lot-Sizing Models," Management Science, INFORMS, vol. 41(5), pages 874-893, May.
    6. Alan S. Manne, 1960. "Linear Programming and Sequential Decisions," Management Science, INFORMS, vol. 6(3), pages 259-267, April.
    7. James C. Bean & Robert L. Smith, 1984. "Conditions for the Existence of Planning Horizons," Mathematics of Operations Research, INFORMS, vol. 9(3), pages 391-401, August.
    8. Daniel Adelman, 2003. "Price-Directed Replenishment of Subsets: Methodology and Its Application to Inventory Routing," Manufacturing & Service Operations Management, INFORMS, vol. 5(4), pages 348-371, May.
    9. Wallace J. Hopp & James C. Bean & Robert L. Smith, 1987. "A New Optimality Criterion for Nonhomogeneous Markov Decision Processes," Operations Research, INFORMS, vol. 35(6), pages 875-883, December.
    10. Huseyin Topaloglu, 2009. "Using Lagrangian Relaxation to Compute Capacity-Dependent Bid Prices in Network Revenue Management," Operations Research, INFORMS, vol. 57(3), pages 637-649, June.
    11. Yinyu Ye, 2011. "The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate," Mathematics of Operations Research, INFORMS, vol. 36(4), pages 593-603, November.
    12. Torpong Cheevaprawatdomrong & Irwin E. Schochetman & Robert L. Smith & Alfredo Garcia, 2007. "Solution and Forecast Horizons for Infinite-Horizon Nonhomogeneous Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 32(1), pages 51-72, February.
    13. D. P. de Farias & B. Van Roy, 2003. "The Linear Programming Approach to Approximate Dynamic Programming," Operations Research, INFORMS, vol. 51(6), pages 850-865, December.
    14. Jonathan Patrick & Martin L. Puterman & Maurice Queyranne, 2008. "Dynamic Multipriority Patient Scheduling for a Diagnostic Resource," Operations Research, INFORMS, vol. 56(6), pages 1507-1525, December.
    15. Huseyin Topaloglu & Warren B. Powell, 2007. "Sensitivity Analysis of a Dynamic Fleet Management Model Using Approximate Dynamic Programming," Operations Research, INFORMS, vol. 55(2), pages 319-331, April.
    16. Daniel Adelman & Adam J. Mersereau, 2008. "Relaxations of Weakly Coupled Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 56(3), pages 712-727, June.
    17. H. Edwin Romeijn & Robert L. Smith, 1998. "Shadow Prices in Infinite-Dimensional Linear Programming," Mathematics of Operations Research, INFORMS, vol. 23(1), pages 239-256, February.
    18. Daniel Adelman, 2004. "A Price-Directed Approach to Stochastic Inventory/Routing," Operations Research, INFORMS, vol. 52(4), pages 499-514, August.
    19. Suresh Chand & Vernon Ning Hsu & Suresh Sethi, 2002. "Forecast, Solution, and Rolling Horizons in Operations Management Problems: A Classified Bibliography," Manufacturing & Service Operations Management, INFORMS, vol. 4(1), pages 25-43, September.
    20. Torpong Cheevaprawatdomrong & Robert L. Smith, 2004. "Infinite Horizon Production Scheduling in Time-Varying Systems Under Stochastic Demand," Operations Research, INFORMS, vol. 52(1), pages 105-115, February.
    21. Dan Zhang & Daniel Adelman, 2009. "An Approximate Dynamic Programming Approach to Network Revenue Management with Customer Choice," Transportation Science, INFORMS, vol. 43(3), pages 381-394, August.
    22. Yinyu Ye, 2005. "A New Complexity Result on Solving the Markov Decision Problem," Mathematics of Operations Research, INFORMS, vol. 30(3), pages 733-749, August.
    23. C. Bes & S. P. Sethi, 1988. "Concepts of Forecast and Decision Horizons: Applications to Dynamic Stochastic Optimization Problems," Mathematics of Operations Research, INFORMS, vol. 13(2), pages 295-310, May.
    24. GRINOLD, Richard C., 1977. "Finite horizon approximations of infinite horizon linear programs," LIDAM Reprints CORE 294, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    25. Huseyin Topaloglu & Warren B. Powell, 2005. "A Distributed Decision-Making Structure for Dynamic Resource Allocation Using Nonlinear Functional Approximations," Operations Research, INFORMS, vol. 53(2), pages 281-297, April.
    26. Sumit Kunnumkal & Huseyin Topaloglu, 2008. "A duality‐based relaxation and decomposition approach for inventory distribution systems," Naval Research Logistics (NRL), John Wiley & Sons, vol. 55(7), pages 612-631, October.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Ilbin Lee & Marina A. Epelman & H. Edwin Romeijn & Robert L. Smith, 2017. "Simplex Algorithm for Countable-State Discounted Markov Decision Processes," Operations Research, INFORMS, vol. 65(4), pages 1029-1042, August.
    2. Thomas W.M. Vossen & Fan You & Dan Zhang, 2022. "Finite‐horizon approximate linear programs for capacity allocation over a rolling horizon," Production and Operations Management, Production and Operations Management Society, vol. 31(5), pages 2127-2142, May.
    3. Ghate, Archis, 2015. "Circumventing the Slater conundrum in countably infinite linear programs," European Journal of Operational Research, Elsevier, vol. 246(3), pages 708-720.
    4. Rebecca S. Widrick & Sarah G. Nurre & Matthew J. Robbins, 2018. "Optimal Policies for the Management of an Electric Vehicle Battery Swap Station," Transportation Science, INFORMS, vol. 52(1), pages 59-79, January.
    5. Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Torpong Cheevaprawatdomrong & Irwin E. Schochetman & Robert L. Smith & Alfredo Garcia, 2007. "Solution and Forecast Horizons for Infinite-Horizon Nonhomogeneous Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 32(1), pages 51-72, February.
    2. Qihang Lin & Selvaprabu Nadarajah & Negar Soheili, 2020. "Revisiting Approximate Linear Programming: Constraint-Violation Learning with Applications to Inventory Control and Energy Storage," Management Science, INFORMS, vol. 66(4), pages 1544-1562, April.
    3. Thomas W. M. Vossen & Dan Zhang, 2015. "Reductions of Approximate Linear Programs for Network Revenue Management," Operations Research, INFORMS, vol. 63(6), pages 1352-1371, December.
    4. Archis Ghate & Robert L. Smith, 2009. "Optimal Backlogging Over an Infinite Horizon Under Time-Varying Convex Production and Inventory Costs," Manufacturing & Service Operations Management, INFORMS, vol. 11(2), pages 362-368, June.
    5. Antoine Sauré & Jonathan Patrick & Martin L. Puterman, 2015. "Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 579-595, August.
    6. Suresh Chand & Vernon Ning Hsu & Suresh Sethi, 2002. "Forecast, Solution, and Rolling Horizons in Operations Management Problems: A Classified Bibliography," Manufacturing & Service Operations Management, INFORMS, vol. 4(1), pages 25-43, September.
    7. Selvaprabu Nadarajah & François Margot & Nicola Secomandi, 2015. "Relaxations of Approximate Linear Programs for the Real Option Management of Commodity Storage," Management Science, INFORMS, vol. 61(12), pages 3054-3076, December.
    8. Thomas W.M. Vossen & Fan You & Dan Zhang, 2022. "Finite‐horizon approximate linear programs for capacity allocation over a rolling horizon," Production and Operations Management, Production and Operations Management Society, vol. 31(5), pages 2127-2142, May.
    9. Meissner, Joern & Strauss, Arne, 2012. "Network revenue management with inventory-sensitive bid prices and customer choice," European Journal of Operational Research, Elsevier, vol. 216(2), pages 459-468.
    10. Matthew S. Maxwell & Mateo Restrepo & Shane G. Henderson & Huseyin Topaloglu, 2010. "Approximate Dynamic Programming for Ambulance Redeployment," INFORMS Journal on Computing, INFORMS, vol. 22(2), pages 266-281, May.
    11. Adam Diamant, 2021. "Dynamic multistage scheduling for patient-centered care plans," Health Care Management Science, Springer, vol. 24(4), pages 827-844, December.
    12. Dan Zhang, 2011. "An Improved Dynamic Programming Decomposition Approach for Network Revenue Management," Manufacturing & Service Operations Management, INFORMS, vol. 13(1), pages 35-52, April.
    13. Amin Khademi & Denis R. Saure & Andrew J. Schaefer & Ronald S. Braithwaite & Mark S. Roberts, 2015. "The Price of Nonabandonment: HIV in Resource-Limited Settings," Manufacturing & Service Operations Management, INFORMS, vol. 17(4), pages 554-570, October.
    14. Laumer, Simon & Barz, Christiane, 2023. "Reductions of non-separable approximate linear programs for network revenue management," European Journal of Operational Research, Elsevier, vol. 309(1), pages 252-270.
    15. Marquinez, José Tomás & Sauré, Antoine & Cataldo, Alejandro & Ferrer, Juan-Carlos, 2021. "Identifying proactive ICU patient admission, transfer and diversion policies in a public-private hospital network," European Journal of Operational Research, Elsevier, vol. 295(1), pages 306-320.
    16. Archis Ghate & Dushyant Sharma & Robert L. Smith, 2010. "A Shadow Simplex Method for Infinite Linear Programs," Operations Research, INFORMS, vol. 58(4-part-1), pages 865-877, August.
    17. Huang, Kai & Ahmed, Shabbir, 2010. "A stochastic programming approach for planning horizons of infinite horizon capacity planning problems," European Journal of Operational Research, Elsevier, vol. 200(1), pages 74-84, January.
    18. David Sayah, 2015. "Approximate Linear Programming in Network Revenue Management with Multiple Modes," Working Papers 1518, Gutenberg School of Management and Economics, Johannes Gutenberg-Universität Mainz.
    19. Wallace J. Hopp & Suresh K. Nair, 1991. "Timing replacement decisions under discontinuous technological change," Naval Research Logistics (NRL), John Wiley & Sons, vol. 38(2), pages 203-220, April.
    20. Yuhang Ma & Paat Rusmevichientong & Mika Sumida & Huseyin Topaloglu, 2020. "An Approximation Algorithm for Network Revenue Management Under Nonstationary Arrivals," Operations Research, INFORMS, vol. 68(3), pages 834-855, May.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:oropre:v:61:y:2013:i:2:p:413-425. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.