
Partially Observable Risk-Sensitive Markov Decision Processes

Author

Listed:
  • Nicole Bäuerle

    (Department of Mathematics, Karlsruhe Institute of Technology, D-76128 Karlsruhe, Germany)

  • Ulrich Rieder

    (University of Ulm, D-89069 Ulm, Germany)

Abstract

We consider the problem of minimizing a certainty equivalent of the total or discounted cost over a finite and an infinite time horizon that is generated by a partially observable Markov decision process (POMDP). In contrast to a risk-neutral decision maker, this optimization criterion takes the variability of the cost into account. It contains as a special case the classical risk-sensitive optimization criterion with an exponential utility. We show that this optimization problem can be solved by embedding the problem into a completely observable Markov decision process with extended state space and give conditions under which an optimal policy exists. The state space has to be extended by the joint conditional distribution of current unobserved state and accumulated cost. In case of an exponential utility, the problem simplifies considerably and we rediscover what in previous literature has been named information state. However, since we do not use any change of measure techniques here, our approach is simpler. A simple example, namely, a risk-sensitive Bayesian house selling problem, is considered to illustrate our results.
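
The certainty-equivalent criterion named in the abstract can be illustrated with a small numerical sketch. It assumes the standard definition U^{-1}(E[U(C)]) for a cost C and an increasing utility U, with the exponential case U(x) = exp(gamma * x). The function names, the two-point cost distribution, and the values of gamma are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def certainty_equivalent(costs, probs, utility, inverse_utility):
    """Return U^{-1}(E[U(C)]) for a discrete cost distribution (assumed definition)."""
    expected_utility = sum(p * utility(c) for c, p in zip(costs, probs))
    return inverse_utility(expected_utility)

def exponential_certainty_equivalent(costs, probs, gamma):
    """Exponential-utility case: (1 / gamma) * log E[exp(gamma * C)], gamma != 0."""
    return certainty_equivalent(
        costs,
        probs,
        utility=lambda x: math.exp(gamma * x),
        inverse_utility=lambda y: math.log(y) / gamma,
    )

# Hypothetical two-point cost: 0 or 10 with equal probability.
costs = [0.0, 10.0]
probs = [0.5, 0.5]

mean_cost = sum(p * c for c, p in zip(costs, probs))
risk_averse = exponential_certainty_equivalent(costs, probs, gamma=0.5)
risk_seeking = exponential_certainty_equivalent(costs, probs, gamma=-0.5)
near_neutral = exponential_certainty_equivalent(costs, probs, gamma=1e-6)

print(mean_cost, risk_averse, risk_seeking, near_neutral)
```

For a cost that is being minimized, a positive gamma penalizes variability, so the certainty equivalent lies above the mean; a negative gamma lies below it, and gamma close to zero approaches the risk-neutral expected cost.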

Suggested Citation

  • Nicole Bäuerle & Ulrich Rieder, 2017. "Partially Observable Risk-Sensitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 42(4), pages 1180-1196, November.
  • Handle: RePEc:inm:ormoor:v:42:y:2017:i:4:p:1180-1196
    DOI: 10.1287/moor.2016.0844

    Download full text from publisher

    File URL: https://doi.org/10.1287/moor.2016.0844
    Download Restriction: no


    References listed on IDEAS

    1. Rolando Cavazos-Cadena & Daniel Hernández-Hernández, 2016. "A Characterization of the Optimal Certainty Equivalent of the Average Cost via the Arrow-Pratt Sensitivity Function," Mathematics of Operations Research, INFORMS, vol. 41(1), pages 224-235, February.
    2. Jun-Yi Fu & An-Hua Wan, 2002. "Generalized vector equilibrium problems with set-valued mappings," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 56(2), pages 259-268, November.
    3. Nicole Bäuerle & Ulrich Rieder, 2014. "More Risk-Sensitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 39(1), pages 105-120, February.
    4. Ronald A. Howard & James E. Matheson, 1972. "Risk-Sensitive Markov Decision Processes," Management Science, INFORMS, vol. 18(7), pages 356-369, March.

    Citations

    Cited by:

    1. Rasouli, Mohammad & Saghafian, Soroush, 2018. "Robust Partially Observable Markov Decision Processes," Working Paper Series rwp18-027, Harvard University, John F. Kennedy School of Government.
    2. Randall Martyr & John Moriarty & Magnus Perninge, 2019. "Discrete-time risk-aware optimal switching with non-adapted costs," Papers 1910.04047, arXiv.org, revised Sep 2021.
    3. Tomasz Kosmala & Randall Martyr & John Moriarty, 2023. "Markov risk mappings and risk-sensitive optimal prediction," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 97(1), pages 91-116, February.
    4. Jingnan Fan & Andrzej Ruszczyński, 2018. "Risk measurement and risk-averse control of partially observable discrete-time Markov systems," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 88(2), pages 161-184, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bäuerle, Nicole & Rieder, Ulrich, 2017. "Zero-sum risk-sensitive stochastic games," Stochastic Processes and their Applications, Elsevier, vol. 127(2), pages 622-642.
    2. Carlos Camilo-Garay & Rolando Cavazos-Cadena & Hugo Cruz-Suárez, 2022. "Contractive Approximations in Risk-Sensitive Average Semi-Markov Decision Chains on a Finite State Space," Journal of Optimization Theory and Applications, Springer, vol. 192(1), pages 271-291, January.
    3. Naci Saldi & Tamer Başar & Maxim Raginsky, 2020. "Approximate Markov-Nash Equilibria for Discrete-Time Risk-Sensitive Mean-Field Games," Mathematics of Operations Research, INFORMS, vol. 45(4), pages 1596-1620, November.
    4. Rolando Cavazos-Cadena, 2018. "Characterization of the Optimal Risk-Sensitive Average Cost in Denumerable Markov Decision Chains," Mathematics of Operations Research, INFORMS, vol. 43(3), pages 1025-1050, August.
    5. Julio Saucedo-Zul & Rolando Cavazos-Cadena & Hugo Cruz-Suárez, 2020. "A Discounted Approach in Communicating Average Markov Decision Chains Under Risk-Aversion," Journal of Optimization Theory and Applications, Springer, vol. 187(2), pages 585-606, November.
    6. Gustavo Portillo-Ramírez & Rolando Cavazos-Cadena & Hugo Cruz-Suárez, 2023. "Contractive approximations in average Markov decision chains driven by a risk-seeking controller," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 98(1), pages 75-91, August.
    7. Nicole Bäuerle & Alexander Glauner, 2021. "Minimizing spectral risk measures applied to Markov decision processes," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 94(1), pages 35-69, August.
    8. Nicole Bauerle & Alexander Glauner, 2020. "Minimizing Spectral Risk Measures Applied to Markov Decision Processes," Papers 2012.04521, arXiv.org.
    9. Sakine Batun & Andrew J. Schaefer & Atul Bhandari & Mark S. Roberts, 2018. "Optimal Liver Acceptance for Risk-Sensitive Patients," Service Science, INFORMS, vol. 10(3), pages 320-333, September.
    10. Rubén Blancas-Rivera & Rolando Cavazos-Cadena & Hugo Cruz-Suárez, 2020. "Discounted approximations in risk-sensitive average Markov cost chains with finite state space," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 91(2), pages 241-268, April.
    11. Xin Guo & Qiuli Liu & Yi Zhang, 2019. "Finite horizon risk-sensitive continuous-time Markov decision processes with unbounded transition and cost rates," 4OR, Springer, vol. 17(4), pages 427-442, December.
    12. Qingda Wei & Xian Chen, 2023. "Continuous-Time Markov Decision Processes Under the Risk-Sensitive First Passage Discounted Cost Criterion," Journal of Optimization Theory and Applications, Springer, vol. 197(1), pages 309-333, April.
    13. O. L. V. Costa & F. Dufour, 2021. "Integro-differential optimality equations for the risk-sensitive control of piecewise deterministic Markov processes," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 93(2), pages 327-357, April.
    14. L. Q. Anh & P. Q. Khanh, 2007. "On the Stability of the Solution Sets of General Multivalued Vector Quasiequilibrium Problems," Journal of Optimization Theory and Applications, Springer, vol. 135(2), pages 271-284, November.
    15. Pestien, Victor & Wang, Xiaobo, 1998. "Markov-achievable payoffs for finite-horizon decision models," Stochastic Processes and their Applications, Elsevier, vol. 73(1), pages 101-118, January.
    16. M. Balaj & L. J. Lin, 2013. "Existence Criteria for the Solutions of Two Types of Variational Relation Problems," Journal of Optimization Theory and Applications, Springer, vol. 156(2), pages 232-246, February.
    17. Basu, Arnab & Ghosh, Mrinal Kanti, 2014. "Zero-sum risk-sensitive stochastic games on a countable state space," Stochastic Processes and their Applications, Elsevier, vol. 124(1), pages 961-983.
    18. Bhabak, Arnab & Saha, Subhamay, 2022. "Risk-sensitive semi-Markov decision problems with discounted cost and general utilities," Statistics & Probability Letters, Elsevier, vol. 184(C).
    19. Lucy Gongtao Chen & Daniel Zhuoyu Long & Melvyn Sim, 2015. "On Dynamic Decision Making to Meet Consumption Targets," Operations Research, INFORMS, vol. 63(5), pages 1117-1130, October.
    20. J. Y. Fu, 2006. "Stampacchia Generalized Vector Quasiequilibrium Problems and Vector Saddle Points," Journal of Optimization Theory and Applications, Springer, vol. 128(3), pages 605-619, March.
