
Revisiting Approximate Linear Programming: Constraint-Violation Learning with Applications to Inventory Control and Energy Storage

Authors

  • Qihang Lin (Tippie College of Business, The University of Iowa, Iowa City, Iowa 52242)

  • Selvaprabu Nadarajah (College of Business Administration, University of Illinois at Chicago, Chicago, Illinois 60607)

  • Negar Soheili (College of Business Administration, University of Illinois at Chicago, Chicago, Illinois 60607)

Abstract

Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) of intractable Markov decision processes (MDPs). VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost. However, solving ALPs near-optimally remains challenging, for example, when approximating MDPs with nonlinear cost functions and transition dynamics or when rich basis functions are required to obtain a good VFA. We address this tension between theory and solvability by proposing a convex saddle-point reformulation of an ALP that includes as primal and dual variables, respectively, a vector of basis function weights and a constraint violation density function over the state-action space. To solve this reformulation, we develop a proximal stochastic mirror descent (PSMD) method that learns regions of high ALP constraint violation via its dual update. We establish that PSMD returns a near-optimal ALP solution and a lower bound on the optimal policy cost in a finite number of iterations with high probability. We numerically compare PSMD with several benchmarks on inventory control and energy storage applications. We find that the PSMD lower bound is tighter than a perfect information bound. In contrast, the constraint-sampling approach to solve ALPs may not provide a lower bound, and applying row generation to tackle ALPs is not computationally viable. PSMD policies outperform problem-specific heuristics and are comparable to or better than the policies obtained using constraint sampling. Overall, our ALP reformulation and solution approach broaden the applicability of approximate linear programming.
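
The lower-bound property mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's PSMD method: it assumes a made-up 3-state, 2-action discounted-cost MDP (the names COST, P, and GAMMA and all numbers are hypothetical) and a single constant basis function, so that the ALP "maximize r subject to r <= cost(s, a) + GAMMA * r for all (s, a)" can be solved in closed form. It then checks that the resulting VFA satisfies every ALP constraint and sits below the optimal cost computed by value iteration.

```python
# Toy discounted-cost MDP. All numbers are made up for illustration only.
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2
COST = [[2.0, 1.0], [1.0, 3.0], [4.0, 2.5]]  # COST[s][a]
P = [  # P[s][a][t] = probability of moving from state s to t under action a
    [[0.8, 0.2, 0.0], [0.1, 0.6, 0.3]],
    [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]],
    [[0.3, 0.3, 0.4], [0.0, 0.0, 1.0]],
]


def q_value(v, s, a):
    """One-step cost plus discounted expected value of the next state."""
    return COST[s][a] + GAMMA * sum(p * v[t] for t, p in enumerate(P[s][a]))


def bellman_slack(v):
    """Smallest slack over all ALP constraints v(s) <= q_value(v, s, a).

    A negative value means some state-action constraint is violated.
    """
    return min(
        q_value(v, s, a) - v[s]
        for s in range(N_STATES)
        for a in range(N_ACTIONS)
    )


def value_iteration(tol=1e-10):
    """Optimal cost-to-go of the toy MDP, used only as a reference."""
    v = [0.0] * N_STATES
    while True:
        new = [min(q_value(v, s, a) for a in range(N_ACTIONS))
               for s in range(N_STATES)]
        if max(abs(x - y) for x, y in zip(new, v)) < tol:
            return new
        v = new


# With one constant basis function, each constraint reads
# r <= COST[s][a] / (1 - GAMMA), so the largest feasible weight is:
r = min(min(row) for row in COST) / (1 - GAMMA)
vfa = [r] * N_STATES
v_star = value_iteration()

print("ALP value function approximation:", vfa)
print("Optimal cost-to-go:", v_star)
print("Minimum constraint slack:", bellman_slack(vfa))
```

A constant basis function gives a loose bound. Richer basis functions tighten it but require solving a genuine linear program with one constraint per state-action pair, which is the computational difficulty the abstract describes.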

Suggested Citation

  • Qihang Lin & Selvaprabu Nadarajah & Negar Soheili, 2020. "Revisiting Approximate Linear Programming: Constraint-Violation Learning with Applications to Inventory Control and Energy Storage," Management Science, INFORMS, vol. 66(4), pages 1544-1562, April.
  • Handle: RePEc:inm:ormnsc:v:66:y:2020:i:4:p:1544-1562
    DOI: 10.1287/mnsc.2019.3289


    Citations

    Cited by:

    1. Bo Wei & William B. Haskell & Sixiang Zhao, 2020. "The CoMirror algorithm with random constraint sampling for convex semi-infinite programming," Annals of Operations Research, Springer, vol. 295(2), pages 809-841, December.
    2. Nadarajah, Selvaprabu & Secomandi, Nicola, 2023. "A review of the operations literature on real options in energy," European Journal of Operational Research, Elsevier, vol. 309(2), pages 469-487.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Laumer, Simon & Barz, Christiane, 2023. "Reductions of non-separable approximate linear programs for network revenue management," European Journal of Operational Research, Elsevier, vol. 309(1), pages 252-270.
    2. Selvaprabu Nadarajah & François Margot & Nicola Secomandi, 2015. "Relaxations of Approximate Linear Programs for the Real Option Management of Commodity Storage," Management Science, INFORMS, vol. 61(12), pages 3054-3076, December.
    3. Santiago R. Balseiro & David B. Brown, 2019. "Approximations to Stochastic Dynamic Programs via Information Relaxation Duality," Operations Research, INFORMS, vol. 67(2), pages 577-597, March.
    4. Adam Diamant, 2021. "Dynamic multistage scheduling for patient-centered care plans," Health Care Management Science, Springer, vol. 24(4), pages 827-844, December.
    5. Alejandro Toriello & William B. Haskell & Michael Poremba, 2014. "A Dynamic Traveling Salesman Problem with Stochastic Arc Costs," Operations Research, INFORMS, vol. 62(5), pages 1107-1125, October.
    6. Antoine Sauré & Jonathan Patrick & Martin L. Puterman, 2015. "Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 579-595, August.
    7. Archis Ghate & Robert L. Smith, 2013. "A Linear Programming Approach to Nonstationary Infinite-Horizon Markov Decision Processes," Operations Research, INFORMS, vol. 61(2), pages 413-425, April.
    8. Alessio Trivella & Danial Mohseni-Taheri & Selvaprabu Nadarajah, 2023. "Meeting Corporate Renewable Power Targets," Management Science, INFORMS, vol. 69(1), pages 491-512, January.
    9. Marquinez, José Tomás & Sauré, Antoine & Cataldo, Alejandro & Ferrer, Juan-Carlos, 2021. "Identifying proactive ICU patient admission, transfer and diversion policies in a public-private hospital network," European Journal of Operational Research, Elsevier, vol. 295(1), pages 306-320.
    10. Selvaprabu Nadarajah & Andre A. Cire, 2020. "Network-Based Approximate Linear Programming for Discrete Optimization," Operations Research, INFORMS, vol. 68(6), pages 1767-1786, November.
    11. Amin Khademi & Burak Eksioglu, 2018. "Spare Parts Inventory Management with Substitution-Dependent Reliability," INFORMS Journal on Computing, INFORMS, vol. 30(3), pages 507-521, August.
    12. Daniel Adelman & Diego Klabjan, 2012. "Computing Near-Optimal Policies in Generalized Joint Replenishment," INFORMS Journal on Computing, INFORMS, vol. 24(1), pages 148-164, February.
    13. Amin Khademi & Denis R. Saure & Andrew J. Schaefer & Ronald S. Braithwaite & Mark S. Roberts, 2015. "The Price of Nonabandonment: HIV in Resource-Limited Settings," Manufacturing & Service Operations Management, INFORMS, vol. 17(4), pages 554-570, October.
    14. Thomas W. M. Vossen & Dan Zhang, 2015. "Reductions of Approximate Linear Programs for Network Revenue Management," Operations Research, INFORMS, vol. 63(6), pages 1352-1371, December.
    15. Meissner, Joern & Strauss, Arne, 2012. "Network revenue management with inventory-sensitive bid prices and customer choice," European Journal of Operational Research, Elsevier, vol. 216(2), pages 459-468.
    16. Nadarajah, Selvaprabu & Margot, François & Secomandi, Nicola, 2017. "Comparison of least squares Monte Carlo methods with applications to energy real options," European Journal of Operational Research, Elsevier, vol. 256(1), pages 196-204.
    17. Anna Maria Gambaro & Nicola Secomandi, 2021. "A Discussion of Non‐Gaussian Price Processes for Energy and Commodity Operations," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 47-67, January.
    18. Daniel R. Jiang & Lina Al-Kanj & Warren B. Powell, 2020. "Optimistic Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds," Operations Research, INFORMS, vol. 68(6), pages 1678-1697, November.
    19. David B. Brown & Martin B. Haugh, 2017. "Information Relaxation Bounds for Infinite Horizon Markov Decision Processes," Operations Research, INFORMS, vol. 65(5), pages 1355-1379, October.
    20. Daniela Pucci de Farias & Benjamin Van Roy, 2006. "A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees," Mathematics of Operations Research, INFORMS, vol. 31(3), pages 597-620, August.
