IDEAS home Printed from https://ideas.repec.org/a/inm/oropre/v67y2019i2p577-597.html

Approximations to Stochastic Dynamic Programs via Information Relaxation Duality

Author

Listed:
  • Santiago R. Balseiro

    (Graduate School of Business, Columbia University, New York, New York 10027)

  • David B. Brown

    (Fuqua School of Business, Duke University, Durham, North Carolina 27708)

Abstract

In the analysis of complex stochastic dynamic programs, we often seek strong theoretical guarantees on the suboptimality of heuristic policies. One technique for obtaining performance bounds is perfect information analysis: this approach provides bounds on the performance of an optimal policy by considering a decision maker who has access to the outcomes of all future uncertainties before making decisions, that is, fully relaxed nonanticipativity constraints. A limitation of this approach is that in many problems perfect information about uncertainties is quite valuable, and thus, the resulting bound is weak. In this paper, we use an information relaxation duality approach, which includes a penalty that punishes violations of the nonanticipativity constraints, to derive stronger analytical bounds on the suboptimality of heuristic policies in stochastic dynamic programs that are too difficult to solve. The general framework we develop ties the heuristic policy and the performance bound together explicitly through the use of an approximate value function: heuristic policies are greedy with respect to this approximation, and penalties are also generated in a specific way using this approximation. We apply this approach to three challenging problems: stochastic knapsack problems, stochastic scheduling on parallel machines, and sequential search problems. In each of these problems, we consider a greedy heuristic policy generated by an approximate value function and a corresponding penalized perfect information bound. We then characterize the gap between the performance of the policy and the information relaxation bound in each problem; the results imply asymptotic optimality of the heuristic policy for specific “large” regimes of interest.
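The contrast the abstract draws between a perfect information bound and a penalized one can be illustrated on a toy stochastic knapsack. The sketch below is not the authors' construction; everything in it is an assumption made for illustration: the model (each item's size is revealed only when insertion is attempted, and an overflowing item earns nothing and ends the process), the exponential size distribution, the ratio-ordered greedy policy, the function names, and the linear penalty `lam * (mean_size - realized_size)` charged for every attempted item with a hypothetical multiplier `lam`. That penalty has zero expectation under any policy that commits to an item before seeing its size, so the expected value of the clairvoyant's penalized problem is still an upper bound on what any such policy can earn.

```python
import itertools
import random


def greedy_path(values, means, sizes, capacity, lam):
    """Attempt items in decreasing value / mean-size order until one overflows.

    Returns (reward, penalty) accumulated along this sample path.
    """
    order = sorted(range(len(values)), key=lambda i: values[i] / means[i], reverse=True)
    used = reward = penalty = 0.0
    for i in order:
        # Assumed penalty: charged for every attempted item, zero in expectation.
        penalty += lam * (means[i] - sizes[i])
        if used + sizes[i] > capacity:
            break  # overflow: item i earns nothing and the process stops
        used += sizes[i]
        reward += values[i]
    return reward, penalty


def clairvoyant_value(values, means, sizes, capacity, lam):
    """Best penalized reward when all sizes are known in advance (brute force)."""
    n = len(values)
    best = 0.0
    for mask in itertools.product((0, 1), repeat=n):
        chosen = [i for i in range(n) if mask[i]]
        used = sum(sizes[i] for i in chosen)
        if used > capacity:
            continue  # this set does not fit
        total = sum(values[i] + lam * (sizes[i] - means[i]) for i in chosen)
        # The clairvoyant may finish with one overflowing attempt, which earns
        # no value but still enters the penalty term.
        bonus = max(
            (lam * (sizes[j] - means[j]) for j in range(n)
             if not mask[j] and used + sizes[j] > capacity),
            default=0.0,
        )
        best = max(best, total + max(bonus, 0.0))
    return best


def estimate(values, means, capacity, lam, n_samples=2000, seed=0):
    """Monte Carlo estimates: greedy value, perfect information bound, penalized bound."""
    rng = random.Random(seed)
    policy = perfect = penalized = 0.0
    for _ in range(n_samples):
        sizes = [rng.expovariate(1.0 / m) for m in means]
        reward, _ = greedy_path(values, means, sizes, capacity, lam)
        policy += reward
        perfect += clairvoyant_value(values, means, sizes, capacity, 0.0)
        penalized += clairvoyant_value(values, means, sizes, capacity, lam)
    return policy / n_samples, perfect / n_samples, penalized / n_samples


if __name__ == "__main__":
    # Illustrative data only; lam = 2.0 is an arbitrary choice.
    print(estimate([4.0, 3.0, 5.0, 2.0, 6.0], [1.0, 1.0, 2.0, 1.0, 3.0], 3.0, 2.0))
```

Setting `lam = 0` recovers the unpenalized perfect information bound. Whether a nonzero `lam` tightens the bound depends on the instance and on the multiplier, so the three estimates are best compared numerically rather than assumed to be ordered in any particular way beyond both bounds lying above the policy value in expectation.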

Suggested Citation

  • Santiago R. Balseiro & David B. Brown, 2019. "Approximations to Stochastic Dynamic Programs via Information Relaxation Duality," Operations Research, INFORMS, vol. 67(2), pages 577-597, March.
  • Handle: RePEc:inm:oropre:v:67:y:2019:i:2:p:577-597
    DOI: 10.1287/opre.2018.1782

    Download full text from publisher

    File URL: https://doi.org/10.1287/opre.2018.1782
    Download Restriction: no



    Citations


    Cited by:

    1. Alberto Vera & Siddhartha Banerjee & Itai Gurvich, 2021. "Online Allocation and Pricing: Constant Regret via Bellman Inequalities," Operations Research, INFORMS, vol. 69(3), pages 821-840, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. David B. Brown & Martin B. Haugh, 2017. "Information Relaxation Bounds for Infinite Horizon Markov Decision Processes," Operations Research, INFORMS, vol. 65(5), pages 1355-1379, October.
    2. Alessio Trivella & Danial Mohseni-Taheri & Selvaprabu Nadarajah, 2023. "Meeting Corporate Renewable Power Targets," Management Science, INFORMS, vol. 69(1), pages 491-512, January.
    3. Daniel R. Jiang & Lina Al-Kanj & Warren B. Powell, 2020. "Optimistic Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds," Operations Research, INFORMS, vol. 68(6), pages 1678-1697, November.
    4. David B. Brown & James E. Smith, 2014. "Information Relaxations, Duality, and Convex Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 62(6), pages 1394-1415, December.
    5. David B. Brown & James E. Smith, 2013. "Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats," Operations Research, INFORMS, vol. 61(3), pages 644-665, June.
    6. Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Pathwise Optimization for Optimal Stopping Problems," Management Science, INFORMS, vol. 58(12), pages 2292-2308, December.
    7. Secomandi, Nicola & Seppi, Duane J., 2014. "Real Options and Merchant Operations of Energy and Other Commodities," Foundations and Trends(R) in Technology, Information and Operations Management, now publishers, vol. 6(3-4), pages 161-331, July.
    8. Mark Broadie & Weiwei Shen, 2016. "High-Dimensional Portfolio Optimization With Transaction Costs," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 19(04), pages 1-49, June.
    9. Dragos Florin Ciocan & Velibor V. Mišić, 2022. "Interpretable Optimal Stopping," Management Science, INFORMS, vol. 68(3), pages 1616-1638, March.
    10. Qihang Lin & Selvaprabu Nadarajah & Negar Soheili, 2020. "Revisiting Approximate Linear Programming: Constraint-Violation Learning with Applications to Inventory Control and Energy Storage," Management Science, INFORMS, vol. 66(4), pages 1544-1562, April.
    11. David B. Brown & James E. Smith & Peng Sun, 2010. "Information Relaxations and Duality in Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 58(4-part-1), pages 785-801, August.
    12. Nadarajah, Selvaprabu & Margot, François & Secomandi, Nicola, 2017. "Comparison of least squares Monte Carlo methods with applications to energy real options," European Journal of Operational Research, Elsevier, vol. 256(1), pages 196-204.
    13. Christian Bender & Christian Gärtner & Nikolaus Schweizer, 2018. "Pathwise Dynamic Programming," Mathematics of Operations Research, INFORMS, vol. 43(3), pages 965-965, August.
    14. Anna Maria Gambaro & Nicola Secomandi, 2021. "A Discussion of Non‐Gaussian Price Processes for Energy and Commodity Operations," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 47-67, January.
    15. Helin Zhu & Fan Ye & Enlu Zhou, 2013. "Fast Estimation of True Bounds on Bermudan Option Prices under Jump-diffusion Processes," Papers 1305.4321, arXiv.org.
    16. Helin Zhu & Fan Ye & Enlu Zhou, 2015. "Fast estimation of true bounds on Bermudan option prices under jump-diffusion processes," Quantitative Finance, Taylor & Francis Journals, vol. 15(11), pages 1885-1900, November.
    17. Christian Bender & Christian Gaertner & Nikolaus Schweizer, 2016. "Pathwise Iteration for Backward SDEs," Papers 1605.07500, arXiv.org, revised Jun 2016.
    18. Alessio Trivella & Selvaprabu Nadarajah & Stein-Erik Fleten & Denis Mazieres & David Pisinger, 2021. "Managing Shutdown Decisions in Merchant Commodity and Energy Production: A Social Commerce Perspective," Manufacturing & Service Operations Management, INFORMS, vol. 23(2), pages 311-330, March.
    19. Cosma, Antonio & Galluccio, Stefano & Pederzoli, Paola & Scaillet, Olivier, 2020. "Early Exercise Decision in American Options with Dividends, Stochastic Volatility, and Jumps," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 55(1), pages 331-356, February.
    20. Jalaj Bhandari & Daniel Russo & Raghav Singal, 2021. "A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation," Operations Research, INFORMS, vol. 69(3), pages 950-973, May.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.