Printed from https://ideas.repec.org/a/eee/ejores/v310y2023i1p249-267.html

Approximate solutions to constrained risk-sensitive Markov decision processes

Authors

  • Kumar, Uday M
  • Bhat, Sanjay P.
  • Kavitha, Veeraruna
  • Hemachandra, Nandyala

Abstract

This paper considers the problem of finding near-optimal Markovian randomized (MR) policies for finite-state-action, infinite-horizon, constrained risk-sensitive Markov decision processes (CRSMDPs). The constraints take the form of standard expected discounted cost functions as well as expected risk-sensitive discounted cost functions over finite and infinite horizons. First, we show that the CRSMDP optimization problem possesses a solution whenever it is feasible (that is, whenever there exists a policy that satisfies all the constraints). Second, we provide two methods for finding an approximate solution in the form of an ultimately stationary (US) MR policy. Both methods work through an approximating finite-horizon CRSMDP constructed from the original CRSMDP by time-truncating the original objective and constraint cost functions and suitably perturbing the constraint upper bounds. The first approximation gives a US policy that is ϵ-optimal and feasible for the original problem, while the second gives a near-optimal US policy whose violation of the original constraints is bounded above by a specified tolerance ϵ. A key step in the proofs is an appropriate choice of metric that makes the set of infinite-horizon MR policies and the feasible regions of the three CRSMDPs compact, and the objective and constraint functions continuous. We also discuss two applications and use an infinite-horizon risk-sensitive inventory control problem as an example to illustrate how existing solution techniques may be used to solve the two approximate finite-horizon problems.
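
The time-truncation idea in the abstract can be illustrated for the simplest case, a standard expected discounted cost with bounded one-step costs. The sketch below is not the paper's construction: it only uses the geometric-series fact that, with discount factor beta in (0, 1) and one-step costs bounded in absolute value by c_max, the cost accumulated from period T onward is at most beta^T * c_max / (1 - beta). The function names `tail_bound` and `truncation_horizon` and all parameter names are hypothetical, and the risk-sensitive cost functions treated in the paper would need the paper's own bounds.

```python
import math


def tail_bound(beta: float, c_max: float, horizon: int) -> float:
    """Upper bound on the discounted cost accumulated from period `horizon` onward.

    Assumes one-step costs bounded in absolute value by c_max and a
    discount factor beta in (0, 1); follows from the geometric series.
    """
    return beta ** horizon * c_max / (1.0 - beta)


def truncation_horizon(beta: float, c_max: float, eps: float) -> int:
    """Smallest horizon T such that tail_bound(beta, c_max, T) <= eps."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if c_max < 0.0 or eps <= 0.0:
        raise ValueError("c_max must be non-negative and eps positive")

    total = c_max / (1.0 - beta)  # bound on the whole infinite-horizon cost
    if total <= eps:
        return 0  # no truncation error to control

    # Closed-form estimate from solving beta**T * total <= eps for T.
    horizon = math.ceil(math.log(eps / total) / math.log(beta))

    # Adjust for floating-point rounding in either direction.
    while tail_bound(beta, c_max, horizon) > eps:
        horizon += 1
    while horizon > 0 and tail_bound(beta, c_max, horizon - 1) <= eps:
        horizon -= 1
    return horizon


if __name__ == "__main__":
    T = truncation_horizon(beta=0.9, c_max=1.0, eps=0.01)
    print("truncation horizon:", T)
    print("tail bound at T:", tail_bound(0.9, 1.0, T))
```

Under these assumptions, a finite-horizon discounted cost truncated at T differs from its infinite-horizon counterpart by at most eps, which is the kind of gap a perturbation of the constraint upper bounds would have to absorb.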

Suggested Citation

  • Kumar, Uday M & Bhat, Sanjay P. & Kavitha, Veeraruna & Hemachandra, Nandyala, 2023. "Approximate solutions to constrained risk-sensitive Markov decision processes," European Journal of Operational Research, Elsevier, vol. 310(1), pages 249-267.
  • Handle: RePEc:eee:ejores:v:310:y:2023:i:1:p:249-267
    DOI: 10.1016/j.ejor.2023.02.039

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221723001893
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Rubio-Herrero, Javier & Baykal-Gürsoy, Melike, 2020. "Mean-variance analysis of the newsvendor problem with price-dependent, isoelastic demand," European Journal of Operational Research, Elsevier, vol. 283(3), pages 942-953.
    2. Abhilasha Prakash Katariya & Sila Cetinkaya & Eylem Tekin, 2014. "On the comparison of risk-neutral and risk-averse newsvendor problems," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 65(7), pages 1090-1107, July.
    3. Eric V. Denardo & Haechurl Park & Uriel G. Rothblum, 2007. "Risk-Sensitive and Risk-Neutral Multiarmed Bandits," Mathematics of Operations Research, INFORMS, vol. 32(2), pages 374-394, May.
    4. Stratton C. Jaquette, 1976. "A Utility Criterion for Markov Decision Processes," Management Science, INFORMS, vol. 23(1), pages 43-49, September.
    5. Krishnamurthy Iyer & Nandyala Hemachandra, 2010. "Sensitivity analysis and optimal ultimately stationary deterministic policies in some constrained discounted cost models," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 71(3), pages 401-425, June.
    6. Eugene A. Feinberg & Adam Shwartz, 1996. "Constrained Discounted Dynamic Programming," Mathematics of Operations Research, INFORMS, vol. 21(4), pages 922-945, November.
    7. Mokrane Bouakiz & Matthew J. Sobel, 1992. "Inventory Control with an Exponential Utility Criterion," Operations Research, INFORMS, vol. 40(3), pages 603-608, June.
    8. Ronald A. Howard & James E. Matheson, 1972. "Risk-Sensitive Markov Decision Processes," Management Science, INFORMS, vol. 18(7), pages 356-369, March.
    9. Cyrus Derman & Morton Klein, 1965. "Some Remarks on Finite Horizon Markovian Decision Models," Operations Research, INFORMS, vol. 13(2), pages 272-278, April.
    10. Choi, Sungyong & Ruszczynski, Andrzej, 2011. "A multi-product risk-averse newsvendor with exponential utility function," European Journal of Operational Research, Elsevier, vol. 214(1), pages 78-84, October.
    11. Jerzy A. Filar & L. C. M. Kallenberg & Huey-Miin Lee, 1989. "Variance-Penalized Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 14(1), pages 147-161, February.
    12. Kamal Golabi & Ram B. Kulkarni & George B. Way, 1982. "A Statewide Pavement Management System," Interfaces, INFORMS, vol. 12(6), pages 5-21, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Krishnamurthy Iyer & Nandyala Hemachandra, 2010. "Sensitivity analysis and optimal ultimately stationary deterministic policies in some constrained discounted cost models," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 71(3), pages 401-425, June.
    2. Monahan, George E. & Sobel, Matthew J., 1997. "Risk-Sensitive Dynamic Market Share Attraction Games," Games and Economic Behavior, Elsevier, vol. 20(2), pages 149-160, August.
    3. Özlem Çavuş & Andrzej Ruszczyński, 2014. "Computational Methods for Risk-Averse Undiscounted Transient Markov Models," Operations Research, INFORMS, vol. 62(2), pages 401-417, April.
    4. Pelin Canbolat, 2014. "Optimal halting policies in Markov population decision chains with constant risk posture," Annals of Operations Research, Springer, vol. 222(1), pages 227-237, November.
    5. Karel Sladký, 2013. "Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes," Czech Economic Review, Charles University Prague, Faculty of Social Sciences, Institute of Economic Studies, vol. 7(3), pages 146-161, November.
    6. Nicole Bäuerle & Ulrich Rieder, 2014. "More Risk-Sensitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 39(1), pages 105-120, February.
    7. Chan, Chi Kin & Lee, Y.C.E. & Campbell, J.F., 2013. "Environmental performance—Impacts of vendor–buyer coordination," International Journal of Production Economics, Elsevier, vol. 145(2), pages 683-695.
    8. Zeynep Erkin & Matthew D. Bailey & Lisa M. Maillart & Andrew J. Schaefer & Mark S. Roberts, 2010. "Eliciting Patients' Revealed Preferences: An Inverse Markov Decision Process Approach," Decision Analysis, INFORMS, vol. 7(4), pages 358-365, December.
    9. Eugene A. Feinberg & Uriel G. Rothblum, 2012. "Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 37(1), pages 129-153, February.
    10. Nandyala Hemachandra & Kamma Sri Naga Rajesh & Mohd. Abdul Qavi, 2016. "A model for equilibrium in some service-provider user-set interactions," Annals of Operations Research, Springer, vol. 243(1), pages 95-115, August.
    11. Rolando Cavazos-Cadena, 2009. "Solutions of the average cost optimality equation for finite Markov decision chains: risk-sensitive and risk-neutral criteria," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 70(3), pages 541-566, December.
    12. Jiahua Zhang & Shu-Cherng Fang & Yifan Xu, 2018. "Inventory centralization with risk-averse newsvendors," Annals of Operations Research, Springer, vol. 268(1), pages 215-237, September.
    13. Sakine Batun & Andrew J. Schaefer & Atul Bhandari & Mark S. Roberts, 2018. "Optimal Liver Acceptance for Risk-Sensitive Patients," Service Science, INFORMS, vol. 10(3), pages 320-333, September.
    14. Sen Lin & Bo Li & Antonio Arreola-Risa & Yiwei Huang, 2023. "Optimizing a single-product production-inventory system under constant absolute risk aversion," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(3), pages 510-537, October.
    15. Lucy Gongtao Chen & Daniel Zhuoyu Long & Melvyn Sim, 2015. "On Dynamic Decision Making to Meet Consumption Targets," Operations Research, INFORMS, vol. 63(5), pages 1117-1130, October.
    16. Li, Xiang & Qi, Xiangtong & Li, Yongjian, 2021. "On sales effort and pricing decisions under alternative risk criteria," European Journal of Operational Research, Elsevier, vol. 293(2), pages 603-614.
    17. C. Barz & K. Waldmann, 2007. "Risk-sensitive capacity control in revenue management," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 65(3), pages 565-579, June.
    18. HuiChen Chiang, 2007. "Financial intermediary's choice of borrowing," Applied Economics, Taylor & Francis Journals, vol. 40(2), pages 251-260.
    19. Kang Boda & Jerzy Filar, 2006. "Time Consistent Dynamic Risk Measures," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 63(1), pages 169-186, February.
    20. Li, Xiang & Qi, Xiangtong, 2021. "On pricing and quality decisions with risk aversion," Omega, Elsevier, vol. 98(C).
