IDEAS home Printed from https://ideas.repec.org/p/ecl/harjfk/rwp18-027.html

Robust Partially Observable Markov Decision Processes

Author

Listed:
  • Rasouli, Mohammad

    (Stanford University)

  • Saghafian, Soroush

    (Harvard Kennedy School)

Abstract

In a variety of applications, decisions need to be made dynamically after receiving imperfect observations about the state of an underlying system. Partially Observable Markov Decision Processes (POMDPs) are widely used in such applications. To use a POMDP, however, a decision-maker must have access to reliable estimates of core state and observation transition probabilities under each possible state and action pair. This is often challenging, mainly due to a lack of ample data, especially when some actions are not taken frequently enough in practice. This significantly limits the application of POMDPs in real-world settings. In healthcare, for example, medical tests are typically subject to false-positive and false-negative errors, and hence the decision-maker has imperfect information about the health state of a patient. Furthermore, since some treatment options have not been recommended or explored in the past, data cannot be used to reliably estimate all the required transition probabilities regarding the health state of the patient. We introduce an extension of POMDPs, termed Robust POMDPs (RPOMDPs), which allows dynamic decision-making when there is ambiguity regarding transition probabilities. This extension enables making robust decisions by reducing the reliance on a single probabilistic model of transitions, while still allowing for imperfect state observations. We develop dynamic programming equations for solving RPOMDPs, provide a sufficient statistic and an information state, discuss ways in which their computational complexity can be reduced, and connect them to stochastic zero-sum games with imperfect private monitoring.
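
The idea of deciding against the least favorable transition model, rather than a single estimated one, can be illustrated with a minimal sketch. This is not the paper's dynamic programming formulation or its sufficient statistic. It is a generic one-step max-min lookahead from a fixed belief, assuming a finite set of candidate transition models. All names (robust_action, q_value) and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# model[a][s][s2] = assumed probability of moving from state s to s2 under action a
Model = List[List[List[float]]]


def q_value(belief: Sequence[float], rewards: Sequence[Sequence[float]],
            model: Model, values: Sequence[float],
            action: int, discount: float) -> float:
    """Expected one-step return of `action` from `belief` under one model."""
    total = 0.0
    for s, b_s in enumerate(belief):
        # Expected continuation value after taking `action` in state s.
        cont = sum(p * v for p, v in zip(model[action][s], values))
        total += b_s * (rewards[s][action] + discount * cont)
    return total


def robust_action(belief: Sequence[float], rewards: Sequence[Sequence[float]],
                  models: Sequence[Model], values: Sequence[float],
                  discount: float) -> Tuple[int, List[float]]:
    """Pick the action whose worst-case value over all models is largest."""
    n_actions = len(rewards[0])
    worst = [
        min(q_value(belief, rewards, m, values, a, discount) for m in models)
        for a in range(n_actions)
    ]
    best = max(range(n_actions), key=lambda a: worst[a])
    return best, worst


# Hypothetical toy problem: states 0 = healthy, 1 = sick; actions 0 = wait, 1 = treat.
belief = [0.2, 0.8]                   # imperfect information about the health state
rewards = [[1.0, 0.5], [0.0, -0.5]]   # rewards[s][a], made-up numbers
values = [1.0, 0.0]                   # assumed value of each next state
discount = 0.9

# Two candidate models that disagree only on how well treatment works when sick.
model_a: Model = [[[0.9, 0.1], [0.0, 1.0]], [[0.9, 0.1], [0.8, 0.2]]]
model_b: Model = [[[0.9, 0.1], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]]

nominal_choice, nominal_values = robust_action(belief, rewards, [model_a], values, discount)
robust_choice, robust_values = robust_action(belief, rewards, [model_a, model_b], values, discount)

print("single-model choice:", nominal_choice, nominal_values)
print("max-min choice:", robust_choice, robust_values)
```

With these made-up numbers, trusting model_a alone favors treating, while guarding against both candidate models favors waiting, because the rarely tried treatment is the action whose outcome is ambiguous. A full RPOMDP solution would also have to handle belief updating under ambiguity over multiple periods, which this sketch does not attempt.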

Suggested Citation

  • Rasouli, Mohammad & Saghafian, Soroush, 2018. "Robust Partially Observable Markov Decision Processes," Working Paper Series rwp18-027, Harvard University, John F. Kennedy School of Government.
  • Handle: RePEc:ecl:harjfk:rwp18-027

    Download full text from publisher

    File URL: https://research.hks.harvard.edu/publications/getFile.aspx?Id=1696
    Download Restriction: no

    References listed on IDEAS

    1. Peter Klibanoff & Massimo Marinacci & Sujoy Mukerji, 2005. "A Smooth Model of Decision Making under Ambiguity," Econometrica, Econometric Society, vol. 73(6), pages 1849-1892, November.
    2. Hansen, Lars Peter & Sargent, Thomas J., 2007. "Recursive robust estimation and control without commitment," Journal of Economic Theory, Elsevier, vol. 136(1), pages 1-27, September.
    3. Tomasz Strzalecki, 2011. "Axiomatic Foundations of Multiplier Preferences," Econometrica, Econometric Society, vol. 79(1), pages 47-73, January.
    4. Klibanoff, Peter & Marinacci, Massimo & Mukerji, Sujoy, 2009. "Recursive smooth ambiguity preferences," Journal of Economic Theory, Elsevier, vol. 144(3), pages 930-976, May.
    5. Epstein, Larry G. & Schneider, Martin, 2003. "Recursive multiple-priors," Journal of Economic Theory, Elsevier, vol. 113(1), pages 1-31, November.
    6. Hansen, Lars Peter & Sargent, Thomas J., 2005. "Robust estimation and control under commitment," Journal of Economic Theory, Elsevier, vol. 124(2), pages 258-301, October.
    7. Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2013. "Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 153-183, February.
    8. Soroush Saghafian & Brian Tomlin, 2016. "The Newsvendor under Demand Ambiguity: Combining Data with Moment and Tail Information," Operations Research, INFORMS, vol. 64(1), pages 167-185, February.
    9. Maccheroni, Fabio & Marinacci, Massimo & Rustichini, Aldo, 2006. "Dynamic variational preferences," Journal of Economic Theory, Elsevier, vol. 128(1), pages 4-44, May.
    10. Nicole Bäuerle & Ulrich Rieder, 2017. "Partially Observable Risk-Sensitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 42(4), pages 1180-1196, November.
    11. Massimo Marinacci, 2002. "Probabilistic Sophistication and Multiple Priors," Econometrica, Econometric Society, vol. 70(2), pages 755-764, March.
    12. Gilboa, Itzhak & Schmeidler, David, 1989. "Maxmin expected utility with non-unique prior," Journal of Mathematical Economics, Elsevier, vol. 18(2), pages 141-153, April.
    13. Ghirardato, Paolo & Maccheroni, Fabio & Marinacci, Massimo, 2004. "Differentiating ambiguity and ambiguity attitude," Journal of Economic Theory, Elsevier, vol. 118(2), pages 133-173, October.
    14. Jay K. Satia & Roy E. Lave, 1973. "Markovian Decision Processes with Uncertain Transition Probabilities," Operations Research, INFORMS, vol. 21(3), pages 728-740, June.
    15. Arnab Nilim & Laurent El Ghaoui, 2005. "Robust Control of Markov Decision Processes with Uncertain Transition Matrices," Operations Research, INFORMS, vol. 53(5), pages 780-798, October.

    Citations



    Cited by:

    1. Louis Anthony Cox, 2020. "Answerable and Unanswerable Questions in Risk Analysis with Open‐World Novelty," Risk Analysis, John Wiley & Sons, vol. 40(S1), pages 2144-2177, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Saghafian, Soroush, 2018. "Ambiguous partially observable Markov decision processes: Structural results and applications," Journal of Economic Theory, Elsevier, vol. 178(C), pages 1-35.
    2. Andrew J. Keith & Darryl K. Ahner, 2021. "A survey of decision making and optimization under uncertainty," Annals of Operations Research, Springer, vol. 300(2), pages 319-353, May.
    3. Hengjie Ai & Ravi Bansal, 2016. "Risk Preferences and The Macro Announcement Premium," NBER Working Papers 22527, National Bureau of Economic Research, Inc.
    4. Hui Chen & Nengjiu Ju & Jianjun Miao, 2014. "Dynamic Asset Allocation with Ambiguous Return Predictability," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 17(4), pages 799-823, October.
    5. Lars Peter Hansen & Thomas J Sargent, 2014. "Doubts or Variability?," World Scientific Book Chapters, in: UNCERTAINTY WITHIN ECONOMIC MODELS, chapter 7, pages 217-256, World Scientific Publishing Co. Pte. Ltd.
    6. Cerreia-Vioglio, S. & Maccheroni, F. & Marinacci, M. & Montrucchio, L., 2011. "Uncertainty averse preferences," Journal of Economic Theory, Elsevier, vol. 146(4), pages 1275-1330, July.
    7. Karni, Edi & Maccheroni, Fabio & Marinacci, Massimo, 2015. "Ambiguity and Nonexpected Utility," Handbook of Game Theory with Economic Applications, Elsevier.
    8. Hansen, Lars Peter & Sargent, Thomas J., 2022. "Structured ambiguity and model misspecification," Journal of Economic Theory, Elsevier, vol. 199(C).
    9. Cerreia-Vioglio, Simone & Maccheroni, Fabio & Marinacci, Massimo & Montrucchio, Luigi, 2012. "Probabilistic sophistication, second order stochastic dominance and uncertainty aversion," Journal of Mathematical Economics, Elsevier, vol. 48(5), pages 271-283.
    10. Grant, Simon & Polak, Ben, 2013. "Mean-dispersion preferences and constant absolute uncertainty aversion," Journal of Economic Theory, Elsevier, vol. 148(4), pages 1361-1398.
    11. Christian Traeger, 2014. "Why uncertainty matters: discounting under intertemporal risk aversion and ambiguity," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 56(3), pages 627-664, August.
    12. Ellis, Andrew, 2018. "On dynamic consistency in ambiguous games," Games and Economic Behavior, Elsevier, vol. 111(C), pages 241-249.
    13. Simon Quemin, 2016. "Intertemporal abatement decisions under ambiguity aversion in a cap and trade," Working Papers 1604, Chaire Economie du climat.
    14. Massimo Guidolin & Francesca Rinaldi, 2013. "Ambiguity in asset pricing and portfolio choice: a review of the literature," Theory and Decision, Springer, vol. 74(2), pages 183-217, February.
    15. Hansen, Lars Peter & Sargent, Thomas J., 2011. "Robustness and ambiguity in continuous time," Journal of Economic Theory, Elsevier, vol. 146(3), pages 1195-1223, May.
    16. Nengjiu Ju & Jianjun Miao, 2012. "Ambiguity, Learning, and Asset Returns," Econometrica, Econometric Society, vol. 80(2), pages 559-591, March.
    17. Madhav Chandrasekher & Mira Frick & Ryota Iijima & Yves Le Yaouanq, 2022. "Dual‐Self Representations of Ambiguity Preferences," Econometrica, Econometric Society, vol. 90(3), pages 1029-1061, May.
    18. Traeger, Christian P., 2011. "Subjective Risk, Confidence, and Ambiguity," Department of Agricultural & Resource Economics, UC Berkeley, Working Paper Series qt0gw7t7vn, Department of Agricultural & Resource Economics, UC Berkeley.
    19. Spyros Galanis, 2021. "Dynamic consistency, valuable information and subjective beliefs," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 71(4), pages 1467-1497, June.
    20. Klibanoff, Peter & Marinacci, Massimo & Mukerji, Sujoy, 2009. "Recursive smooth ambiguity preferences," Journal of Economic Theory, Elsevier, vol. 144(3), pages 930-976, May.
