Printed from https://ideas.repec.org/p/arx/papers/2409.10096.html

Robust Reinforcement Learning with Dynamic Distortion Risk Measures

Author

Listed:
  • Anthony Coache
  • Sebastian Jaimungal

Abstract

In a reinforcement learning (RL) setting, the agent's optimal strategy heavily depends on her risk preferences and the underlying model dynamics of the training environment. These two aspects influence the agent's ability to make well-informed and time-consistent decisions when facing testing environments. In this work, we devise a framework to solve robust risk-aware RL problems where we simultaneously account for environmental uncertainty and risk with a class of dynamic robust distortion risk measures. Robustness is introduced by considering all models within a Wasserstein ball around a reference model. We estimate such dynamic robust risk measures using neural networks by making use of strictly consistent scoring functions, derive policy gradient formulae using the quantile representation of distortion risk measures, and construct an actor-critic algorithm to solve this class of robust risk-aware RL problems. We demonstrate the performance of our algorithm on a portfolio allocation example.
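The quantile representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a loss convention (larger values are worse) and writes a distortion risk measure as a weighted average of quantiles, rho(X) = integral over [0, 1] of F_X^{-1}(u) gamma(u) du, where the weight gamma integrates to one. The function names (`distortion_risk`, `cvar_cum_weight`, `mean_cum_weight`) and the sample data are hypothetical and not taken from the paper, whose conventions may differ; the sketch covers only the plain (non-robust, static) empirical estimator, not the Wasserstein robustification or the actor-critic algorithm.

```python
import numpy as np


def distortion_risk(samples, cum_weight):
    """Empirical distortion risk of a loss sample via its quantile representation.

    `cum_weight` is the integrated weight Gamma(u) = int_0^u gamma(s) ds on [0, 1],
    with Gamma(0) = 0 and Gamma(1) = 1 (an assumption of this sketch).
    """
    x = np.sort(np.asarray(samples, dtype=float))  # empirical quantiles
    n = x.size
    grid = np.arange(n + 1) / n                    # 0, 1/n, ..., 1
    weights = np.diff(cum_weight(grid))            # mass given to each order statistic
    return float(np.dot(weights, x))


def mean_cum_weight(u):
    """Gamma(u) = u: uniform weights, which recovers the sample mean."""
    return u


def cvar_cum_weight(alpha):
    """Integrated weight for CVaR at level alpha: only the upper tail is weighted."""
    def cum_weight(u):
        return np.maximum(u - alpha, 0.0) / (1.0 - alpha)
    return cum_weight


# Hypothetical loss sample: the values 1, 2, ..., 10.
losses = np.arange(1.0, 11.0)

print(distortion_risk(losses, mean_cum_weight))       # risk-neutral case
print(distortion_risk(losses, cvar_cum_weight(0.8)))  # average of the worst 20% of losses
```

Passing a different integrated weight changes the risk preference without touching the estimator, which is the flexibility a class of distortion risk measures offers.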

Suggested Citation

  • Anthony Coache & Sebastian Jaimungal, 2024. "Robust Reinforcement Learning with Dynamic Distortion Risk Measures," Papers 2409.10096, arXiv.org.
  • Handle: RePEc:arx:papers:2409.10096

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2409.10096
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Paul Milgrom & Ilya Segal, 2002. "Envelope Theorems for Arbitrary Choice Sets," Econometrica, Econometric Society, vol. 70(2), pages 583-601, March.
    2. Gneiting, Tilmann, 2011. "Making and Evaluating Point Forecasts," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 746-762.
    3. Saeed Marzban & Erick Delage & Jonathan Yu-Meng Li, 2023. "Deep reinforcement learning for option pricing and hedging under dynamic expectile risk measures," Quantitative Finance, Taylor & Francis Journals, vol. 23(10), pages 1411-1430, October.
    4. Silvana M. Pesenti & Sebastian Jaimungal & Yuri F. Saporito & Rodrigo S. Targino, 2023. "Risk Budgeting Allocation for Dynamic Risk Measures," Papers 2305.11319, arXiv.org, revised Oct 2024.
    5. Gneiting, Tilmann & Raftery, Adrian E., 2007. "Strictly Proper Scoring Rules, Prediction, and Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 359-378, March.
    6. Carole Bernard & Silvana M. Pesenti & Steven Vanduffel, 2024. "Robust distortion risk measures," Mathematical Finance, Wiley Blackwell, vol. 34(3), pages 774-818, July.
    7. Yuhong Xu, 2014. "Robust valuation and risk measurement under model uncertainty," Papers 1407.8024, arXiv.org.
    8. David Wu & Sebastian Jaimungal, 2023. "Robust Risk-Aware Option Hedging," Papers 2303.15216, arXiv.org, revised Dec 2023.
    9. David Wu & Sebastian Jaimungal, 2023. "Robust Risk-Aware Option Hedging," Applied Mathematical Finance, Taylor & Francis Journals, vol. 30(3), pages 153-174, May.
    10. Jose Blanchet & Karthyek Murthy, 2019. "Quantifying Distributional Model Risk via Optimal Transport," Mathematics of Operations Research, INFORMS, vol. 44(2), pages 565-600, May.
    11. Paul Glasserman & Xingbo Xu, 2014. "Robust risk measurement and model risk," Quantitative Finance, Taylor & Francis Journals, vol. 14(1), pages 29-58, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fritzsch, Simon & Timphus, Maike & Weiß, Gregor, 2024. "Marginals versus copulas: Which account for more model risk in multivariate risk forecasting?," Journal of Banking & Finance, Elsevier, vol. 158(C).
    2. Pascal François & Geneviève Gauthier & Frédéric Godin & Carlos Octavio Pérez Mendoza, 2024. "Enhancing Deep Hedging of Options with Implied Volatility Surface Feedback Information," Papers 2407.21138, arXiv.org.
    3. Steven Kou & Xianhua Peng, 2016. "On the Measurement of Economic Tail Risk," Operations Research, INFORMS, vol. 64(5), pages 1056-1072, October.
    4. Kim, Sojung & Weber, Stefan, 2022. "Simulation methods for robust risk assessment and the distorted mix approach," European Journal of Operational Research, Elsevier, vol. 298(1), pages 380-398.
    5. Mohammed Berkhouch & Fernanda Maria Müller & Ghizlane Lakhnati & Marcelo Brutti Righi, 2022. "Deviation-Based Model Risk Measures," Computational Economics, Springer;Society for Computational Economics, vol. 59(2), pages 527-547, February.
    6. Frongillo, Rafael M. & Kash, Ian A., 2021. "General truthfulness characterizations via convex analysis," Games and Economic Behavior, Elsevier, vol. 130(C), pages 636-662.
    7. Carole Bernard & Silvana M. Pesenti & Steven Vanduffel, 2024. "Robust distortion risk measures," Mathematical Finance, Wiley Blackwell, vol. 34(3), pages 774-818, July.
    8. Lux, Thibaut & Papapantoleon, Antonis, 2019. "Model-free bounds on Value-at-Risk using extreme value information and statistical distances," Insurance: Mathematics and Economics, Elsevier, vol. 86(C), pages 73-83.
    9. Aleksandrina Goeva & Henry Lam & Huajie Qian & Bo Zhang, 2019. "Optimization-Based Calibration of Simulation Input Models," Operations Research, INFORMS, vol. 67(5), pages 1362-1382, September.
    10. Pascal François & Geneviève Gauthier & Frédéric Godin & Carlos Octavio Pérez Mendoza, 2024. "Is the difference between deep hedging and delta hedging a statistical arbitrage?," Papers 2407.14736, arXiv.org, revised Oct 2024.
    11. Parisa Davar & Frédéric Godin & Jose Garrido, 2024. "Catastrophic-risk-aware reinforcement learning with extreme-value-theory-based policy gradients," Papers 2406.15612, arXiv.org, revised Jun 2024.
    12. Makam, Vaishno Devi & Millossovich, Pietro & Tsanakas, Andreas, 2021. "Sensitivity analysis with χ²-divergences," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 372-383.
    13. Rafael Frongillo, 2022. "Quantum Information Elicitation," Papers 2203.07469, arXiv.org.
    14. Thibaut Lux & Antonis Papapantoleon, 2016. "Model-free bounds on Value-at-Risk using extreme value information and statistical distances," Papers 1610.09734, arXiv.org, revised Nov 2018.
    15. Lahiri, Kajal & Yang, Liu, 2013. "Forecasting Binary Outcomes," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 1025-1106, Elsevier.
    16. Tobias Fissler & Silvana M. Pesenti, 2022. "Sensitivity Measures Based on Scoring Functions," Papers 2203.00460, arXiv.org, revised Jul 2022.
    17. Knüppel, Malte & Schultefrankenfeld, Guido, 2019. "Assessing the uncertainty in central banks’ inflation outlooks," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1748-1769.
    18. Mingbin Ben Feng & Eunhye Song, 2020. "Efficient Nested Simulation Experiment Design via the Likelihood Ratio Method," Papers 2008.13087, arXiv.org, revised May 2024.
    19. Costa, Alexandre Bonnet R. & Ferreira, Pedro Cavalcanti G. & Gaglianone, Wagner P. & Guillén, Osmani Teixeira C. & Issler, João Victor & Lin, Yihao, 2021. "Machine learning and oil price point and density forecasting," Energy Economics, Elsevier, vol. 102(C).
    20. Constandina Koki & Loukia Meligkotsidou & Ioannis Vrontos, 2020. "Forecasting under model uncertainty: Non‐homogeneous hidden Markov models with Pòlya‐Gamma data augmentation," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(4), pages 580-598, July.
