
Continuous-time optimal investment with portfolio constraints: a reinforcement learning approach

Author

Listed:
  • Huy Chau
  • Duy Nguyen
  • Thai Nguyen

Abstract

In a reinforcement learning (RL) framework, we study the exploratory version of the continuous-time expected utility (EU) maximization problem with a portfolio constraint that covers widely used financial regulations such as short-selling constraints and borrowing prohibition. The optimal feedback policy of the exploratory unconstrained classical EU problem is shown to be Gaussian. When the portfolio weight is constrained to a given interval, the corresponding constrained optimal exploratory policy follows a truncated Gaussian distribution. We verify that the closed-form optimal solutions obtained for logarithmic and quadratic utility, in both the unconstrained and constrained settings, converge to their non-exploratory expected utility counterparts as the exploration weight goes to zero. Finally, we establish a policy improvement theorem and devise an implementable reinforcement learning algorithm by casting the optimization problem in a martingale framework. Our numerical examples show that exploration leads to an optimal wealth process that is more dispersed, with a heavier tail, than in the case without exploration. This effect becomes less significant as the exploration parameter decreases. Moreover, the numerical implementation confirms the intuition that a broader domain of investment opportunities necessitates a higher exploration cost. Notably, when both short-selling and money-borrowing constraints are imposed, the exploration cost becomes negligible compared with the unconstrained case.
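
To make the truncated Gaussian policy concrete, the following minimal Python sketch discretizes a Gaussian restricted to an interval [a, b] of portfolio weights and samples from it. It is an illustration only, not the authors' algorithm: the function names, the grid discretization, and the numbers (mean `mu`, standard deviation `sigma`, and the interval [0, 1], which corresponds to forbidding both short selling and borrowing) are assumptions chosen for the example, and the paper's closed-form mean and variance are not reproduced here. The sketch shows only the general distributional fact that, as the spread shrinks, the mean of a truncated Gaussian approaches the projection of `mu` onto [a, b].

```python
import math
import random


def truncated_gaussian_grid(mu, sigma, a, b, n=2001):
    """Discretize N(mu, sigma^2) restricted to [a, b] on an n-point grid.

    Returns the grid points and their normalized probabilities.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    log_w = [-((x - mu) ** 2) / (2.0 * sigma ** 2) for x in xs]
    shift = max(log_w)  # subtract the max so exp() stays numerically stable
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return xs, [wi / total for wi in w]


def policy_mean(mu, sigma, a, b):
    """Mean portfolio weight under the discretized truncated Gaussian."""
    xs, probs = truncated_gaussian_grid(mu, sigma, a, b)
    return sum(x * p for x, p in zip(xs, probs))


def sample_weights(mu, sigma, a, b, k, seed=0):
    """Draw k portfolio weights from the discretized truncated Gaussian."""
    rng = random.Random(seed)
    xs, probs = truncated_gaussian_grid(mu, sigma, a, b)
    return rng.choices(xs, weights=probs, k=k)


if __name__ == "__main__":
    a, b = 0.0, 1.0   # assumed interval: no short selling, no borrowing
    mu = 1.5          # assumed unconstrained mean, lying outside [a, b]
    for sigma in (0.5, 0.1, 0.01):
        print(sigma, policy_mean(mu, sigma, a, b))
    print(sample_weights(mu, 0.5, a, b, k=5))
```

With these assumed inputs every sampled weight stays inside [0, 1], and the policy mean moves toward the boundary value 1 as `sigma` decreases, which mirrors the qualitative statement that the exploratory solution approaches its non-exploratory counterpart when exploration vanishes.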

Suggested Citation

  • Huy Chau & Duy Nguyen & Thai Nguyen, 2024. "Continuous-time optimal investment with portfolio constraints: a reinforcement learning approach," Papers 2412.10692, arXiv.org.
  • Handle: RePEc:arx:papers:2412.10692

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2412.10692
    File Function: Latest version
    Download Restriction: no


