Risk-Sensitive Reinforcement Learning: a Martingale Approach to Reward Uncertainty

Author

Listed:

Nelson Vadori
Sumitra Ganesh
Prashant Reddy
Manuela Veloso

Abstract

We introduce a novel framework to account for sensitivity to reward uncertainty in sequential decision-making problems. While risk-sensitive formulations for Markov decision processes studied so far focus on the distribution of the cumulative reward as a whole, we aim at learning policies sensitive to the uncertain/stochastic nature of the rewards, which has the advantage of being conceptually more meaningful in some cases. To this end, we present a new decomposition of the randomness contained in the cumulative reward based on the Doob decomposition of a stochastic process, and introduce a new conceptual tool, the chaotic variation, which can rigorously be interpreted as the risk measure of the martingale component associated with the cumulative reward process. We innovate on the reinforcement learning side by incorporating this new risk-sensitive approach into model-free algorithms, both policy gradient and value function based, and illustrate its relevance on grid world and portfolio optimization problems.
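
For orientation, the classical Doob decomposition mentioned in the abstract can be written as below. This is the standard textbook statement for an adapted, integrable discrete-time process, not the paper's own notation: the symbols $X_t$, $A_t$, $M_t$ and the filtration $\mathcal{F}_t$ are assumptions introduced here, with $X_t$ standing for the cumulative reward up to time $t$.

```latex
\begin{align*}
X_t &= X_0 + A_t + M_t, \\
A_t &= \sum_{k=1}^{t} \mathbb{E}\left[ X_k - X_{k-1} \mid \mathcal{F}_{k-1} \right], \\
M_t &= \sum_{k=1}^{t} \left( X_k - \mathbb{E}\left[ X_k \mid \mathcal{F}_{k-1} \right] \right).
\end{align*}
```

Here $A_t$ is predictable and $M_t$ is a martingale with $M_0 = 0$. Summing the two increments term by term gives $X_k - X_{k-1}$, so the identity holds by telescoping. According to the abstract, the chaotic variation is interpreted as a risk measure of the martingale component.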

Suggested Citation

Nelson Vadori & Sumitra Ganesh & Prashant Reddy & Manuela Veloso, 2020. "Risk-Sensitive Reinforcement Learning: a Martingale Approach to Reward Uncertainty," Papers 2006.12686, arXiv.org, revised Sep 2020.

Handle: RePEc:arx:papers:2006.12686

References listed on IDEAS

Olivier Guéant & Iuliia Manziuk, 2019. "Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality," Applied Mathematical Finance, Taylor & Francis Journals, vol. 26(5), pages 387-452, September.
- Olivier Guéant & Iuliia Manziuk, 2019. "Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-03252505, HAL.
- Olivier Guéant & Iuliia Manziuk, 2019. "Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality," Post-Print hal-03252505, HAL.
V. S. Borkar, 2002. "Q-Learning for Risk-Sensitive Control," Mathematics of Operations Research, INFORMS, vol. 27(2), pages 294-311, May.
Olivier Guéant & Iuliia Manziuk, 2019. "Deep reinforcement learning for market making in corporate bonds: beating the curse of dimensionality," Papers 1910.13205, arXiv.org.
Detlefsen, Kai & Scandolo, Giacomo, 2005. "Conditional and dynamic convex risk measures," SFB 649 Discussion Papers 2005-006, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
Kai Detlefsen & Giacomo Scandolo, 2005. "Conditional and dynamic convex risk measures," Finance and Stochastics, Springer, vol. 9(4), pages 539-561, October.
Sumitra Ganesh & Nelson Vadori & Mengda Xu & Hua Zheng & Prashant Reddy & Manuela Veloso, 2019. "Reinforcement Learning for Market Making in a Multi-agent Dealer Market," Papers 1911.05892, arXiv.org.

Citations

Cited by:

Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
Bruno Gašperov & Zvonko Kostanjčar, 2022. "Deep Reinforcement Learning for Market Making Under a Hawkes Process-Based Limit Order Book Model," Papers 2207.09951, arXiv.org.
Pankaj Kumar, 2021. "Deep Hawkes Process for High-Frequency Market Making," Papers 2109.15110, arXiv.org.
Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.
Bruno Gašperov & Stjepan Begušić & Petra Posedel Šimović & Zvonko Kostanjčar, 2021. "Reinforcement Learning Approaches to Optimal Market Making," Mathematics, MDPI, vol. 9(21), pages 1-22, October.
Ji, Ronglin & Shi, Xuejun & Wang, Shijie & Zhou, Jinming, 2019. "Dynamic risk measures for processes via backward stochastic differential equations," Insurance: Mathematics and Economics, Elsevier, vol. 86(C), pages 43-50.
Zachary Feinstein & Birgit Rudloff, 2018. "Scalar multivariate risk measures with a single eligible asset," Papers 1807.10694, arXiv.org, revised Feb 2021.
Qinyu Wu & Fan Yang & Ping Zhang, 2023. "Conditional generalized quantiles based on expected utility model and equivalent characterization of properties," Papers 2301.12420, arXiv.org.
Thomas Spooner & Rahul Savani, 2020. "Robust Market Making via Adversarial Reinforcement Learning," Papers 2003.01820, arXiv.org, revised Jul 2020.
Philippe Bergault & Olivier Guéant, 2021. "Size matters for OTC market makers: General results and dimensionality reduction techniques," Mathematical Finance, Wiley Blackwell, vol. 31(1), pages 279-322, January.
- Philippe Bergault & Olivier Guéant, 2020. "Size matters for OTC market makers: general results and dimensionality reduction techniques," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-02987894, HAL.
- Philippe Bergault & Olivier Guéant, 2021. "Size matters for OTC market makers: General results and dimensionality reduction techniques," Post-Print hal-03885108, HAL.
- Philippe Bergault & Olivier Guéant, 2021. "Size matters for OTC market makers: General results and dimensionality reduction techniques," Post-Print hal-03252557, HAL.
- Philippe Bergault & Olivier Guéant, 2020. "Size matters for OTC market makers: general results and dimensionality reduction techniques," Working Papers hal-02987894, HAL.
- Philippe Bergault & Olivier Guéant, 2021. "Size matters for OTC market makers: General results and dimensionality reduction techniques," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-03252557, HAL.
Fei Sun & Jingchao Li & Jieming Zhou, 2018. "Dynamic risk measures with fluctuation of market volatility under Bochne-Lebesgue space," Papers 1806.01166, arXiv.org, revised Mar 2024.
Leitner Johannes, 2007. "Pricing and hedging with globally and instantaneously vanishing risk," Statistics & Risk Modeling, De Gruyter, vol. 25(4), pages 311-332, October.
Adel Javanmard & Jingwei Ji & Renyuan Xu, 2024. "Multi-Task Dynamic Pricing in Credit Market with Contextual Information," Papers 2410.14839, arXiv.org, revised Oct 2024.
Bastien Baldacci & Joffrey Derchu & Iuliia Manziuk, 2020. "An approximate solution for options market-making in high dimension," Papers 2009.00907, arXiv.org.
Philippe Bergault & Louis Bertucci & David Bouba & Olivier Guéant & Julien Guilbert, 2024. "Automated Market Making: the case of Pegged Assets," Papers 2411.08145, arXiv.org.
Freddy Delbaen & Shige Peng & Emanuela Rosazza Gianin, 2010. "Representation of the penalty term of dynamic concave utilities," Finance and Stochastics, Springer, vol. 14(3), pages 449-472, September.
Tsanakas, Andreas, 2009. "To split or not to split: Capital allocation with convex risk measures," Insurance: Mathematics and Economics, Elsevier, vol. 44(2), pages 268-277, April.
Elisa Mastrogiacomo & Emanuela Rosazza Gianin, 2019. "Time-consistency of risk measures: how strong is such a property?," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 42(1), pages 287-317, June.
Rudloff, Birgit & Street, Alexandre & Valladão, Davi M., 2014. "Time consistency and risk averse dynamic decision models: Definition, interpretation and practical consequences," European Journal of Operational Research, Elsevier, vol. 234(3), pages 743-750.
Tomasz R. Bielecki & Igor Cialenco & Shibi Feng, 2018. "A Dynamic Model of Central Counterparty Risk," Papers 1803.02012, arXiv.org.
