
Learning to Optimally Stop Diffusion Processes, with Financial Applications

Authors
  • Min Dai
  • Yu Sun
  • Zuo Quan Xu
  • Xun Yu Zhou

Abstract

We study optimal stopping for diffusion processes with unknown model primitives within the continuous-time reinforcement learning (RL) framework developed by Wang et al. (2020), and present applications to option pricing and portfolio choice. By penalizing the corresponding variational inequality formulation, we transform the stopping problem into a stochastic optimal control problem with two actions. We then randomize controls into Bernoulli distributions and add an entropy regularizer to encourage exploration. We derive a semi-analytical optimal Bernoulli distribution, based on which we devise RL algorithms using the martingale approach established in Jia and Zhou (2022a), and prove a policy improvement theorem. We demonstrate the effectiveness of the algorithms in pricing finite-horizon American put options and in solving Merton's problem with transaction costs, and show that both the offline and online algorithms achieve high accuracy in learning the value functions and characterizing the associated free boundaries.
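The effect of randomizing a two-action control into a Bernoulli distribution and adding an entropy regularizer can be illustrated with a generic one-step problem. The sketch below is not the paper's semi-analytical formula, which depends on the value function and the penalty formulation. It only shows the standard fact that maximizing an expected payoff plus a weighted Bernoulli entropy over the stopping probability gives a logistic function of the payoff gap. The names `gain_stop`, `gain_continue`, and `temperature` are assumptions introduced for this illustration.

```python
import math


def bernoulli_entropy(p: float) -> float:
    """Shannon entropy (in nats) of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def regularized_objective(p: float, gain_stop: float, gain_continue: float,
                          temperature: float) -> float:
    """Expected one-step payoff of stopping with probability p, plus an entropy bonus.

    gain_stop, gain_continue and temperature are hypothetical inputs for this
    illustration; they are not quantities taken from the paper.
    """
    expected_gain = p * gain_stop + (1.0 - p) * gain_continue
    return expected_gain + temperature * bernoulli_entropy(p)


def optimal_stop_probability(gain_stop: float, gain_continue: float,
                             temperature: float) -> float:
    """Maximizer of regularized_objective over p in [0, 1].

    Setting the derivative to zero gives log(p / (1 - p)) = gap / temperature,
    so the optimal probability is a sigmoid of the scaled payoff gap.
    """
    z = (gain_stop - gain_continue) / temperature
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow when z is very negative
    return ez / (1.0 + ez)


if __name__ == "__main__":
    for temperature in (1.0, 0.1, 0.01):
        p_star = optimal_stop_probability(0.3, 0.1, temperature)
        print(f"temperature={temperature}: stop probability={p_star:.4f}")
```

As the temperature shrinks, the stopping probability moves toward 0 or 1, so the randomized policy approaches a deterministic stop-or-continue rule. A larger temperature keeps the probability closer to 1/2 and therefore encourages exploration.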

Suggested Citation

  • Min Dai & Yu Sun & Zuo Quan Xu & Xun Yu Zhou, 2024. "Learning to Optimally Stop Diffusion Processes, with Financial Applications," Papers 2408.09242, arXiv.org, revised Sep 2024.
  • Handle: RePEc:arx:papers:2408.09242

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2408.09242
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Sebastian Becker & Patrick Cheridito & Arnulf Jentzen & Timo Welti, 2019. "Solving high-dimensional optimal stopping problems using deep learning," Papers 1908.01602, arXiv.org, revised Aug 2021.
    2. Min Dai & Yuchao Dong & Yanwei Jia & Xun Yu Zhou, 2023. "Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration," Papers 2312.11797, arXiv.org.
    3. R. Jiang & D. Saunders & C. Weng, 2022. "The reinforcement learning Kelly strategy," Quantitative Finance, Taylor & Francis Journals, vol. 22(8), pages 1445-1464, August.
    4. Dai, Min & Kwok, Yue Kuen & You, Hong, 2007. "Intensity-based framework and penalty formulation of optimal stopping problems," Journal of Economic Dynamics and Control, Elsevier, vol. 31(12), pages 3860-3880, December.
    5. Nicholas Barberis, 2012. "A Model of Casino Gambling," Management Science, INFORMS, vol. 58(1), pages 35-51, January.
    6. Sebastian Becker & Patrick Cheridito & Arnulf Jentzen, 2019. "Pricing and hedging American-style options with deep learning," Papers 1912.11060, arXiv.org, revised Jul 2020.
    7. Yanwei Jia & Xun Yu Zhou, 2022. "q-Learning in Continuous Time," Papers 2207.00713, arXiv.org, revised Apr 2023.
    8. Wu, Bo & Li, Lingfei, 2024. "Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 158(C).
    9. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms," Papers 2111.11232, arXiv.org, revised Jul 2022.
    10. Sang Hu & Jan Obłój & Xun Yu Zhou, 2023. "A Casino Gambling Model Under Cumulative Prospect Theory: Analysis and Algorithm," Management Science, INFORMS, vol. 69(4), pages 2474-2496, April.
    11. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach," Papers 2108.06655, arXiv.org, revised Feb 2022.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yanwei Jia, 2024. "Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty," Papers 2404.12598, arXiv.org.
    2. Huy Chau & Duy Nguyen & Thai Nguyen, 2024. "Continuous-time optimal investment with portfolio constraints: a reinforcement learning approach," Papers 2412.10692, arXiv.org.
    3. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    4. Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
    5. Wu, Bo & Li, Lingfei, 2024. "Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 158(C).
    6. Lukas Gonon, 2024. "Deep neural network expressivity for optimal stopping problems," Finance and Stochastics, Springer, vol. 28(3), pages 865-910, July.
    7. Min Dai & Yuchao Dong & Yanwei Jia & Xun Yu Zhou, 2023. "Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration," Papers 2312.11797, arXiv.org.
    8. Yanwei Jia & Xun Yu Zhou, 2022. "q-Learning in Continuous Time," Papers 2207.00713, arXiv.org, revised Apr 2023.
    9. Lukas Gonon, 2022. "Deep neural network expressivity for optimal stopping problems," Papers 2210.10443, arXiv.org.
    10. Beatriz Salvador & Cornelis W. Oosterlee & Remco van der Meer, 2020. "Financial Option Valuation by Unsupervised Learning with Artificial Neural Networks," Mathematics, MDPI, vol. 9(1), pages 1-20, December.
    11. Nader Karimi & Erfan Salavati & Hirbod Assa & Hojatollah Adibi, 2023. "Sensitivity Analysis of Optimal Commodity Decision Making with Neural Networks: A Case for COVID-19," Mathematics, MDPI, vol. 11(5), pages 1-15, February.
    12. Zhou Fang & Haiqing Xu, 2023. "Over-the-Counter Market Making via Reinforcement Learning," Papers 2307.01816, arXiv.org.
    13. Xuefeng Gao & Lingfei Li & Xun Yu Zhou, 2024. "Reinforcement Learning for Jump-Diffusions, with Financial Applications," Papers 2405.16449, arXiv.org, revised Jan 2025.
    14. A. Max Reppen & H. Mete Soner & Valentin Tissot-Daguette, 2022. "Neural Optimal Stopping Boundary," Papers 2205.04595, arXiv.org, revised May 2023.
    15. Zhou Fang & Haiqing Xu, 2023. "Market Making of Options via Reinforcement Learning," Papers 2307.01814, arXiv.org.
    16. Jiefei Yang & Guanglian Li, 2024. "A deep primal-dual BSDE method for optimal stopping problems," Papers 2409.06937, arXiv.org.
    17. Matteo Bissiri & Riccardo Cogo, 2017. "Behavioral Value Adjustments," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 20(08), pages 1-37, December.
    18. Yu-Jui Huang & Adrien Nguyen-Huu, 2018. "Time-consistent stopping under decreasing impatience," Finance and Stochastics, Springer, vol. 22(1), pages 69-95, January.
    19. Christian Hilpert & Jing Li & Alexander Szimayer, 2014. "The Effect of Secondary Markets on Equity-Linked Life Insurance With Surrender Guarantees," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 81(4), pages 943-968, December.
    20. Embrey, Matthew & Seel, Christian & Philipp Reiss, J., 2024. "Gambling in risk-taking contests: Experimental evidence," Journal of Economic Behavior & Organization, Elsevier, vol. 221(C), pages 570-585.
