Pontryagin-Guided Policy Optimization for Merton's Portfolio Problem

Author

Listed:
  • Jeonggyu Huh
  • Jaegi Jeon

Abstract

We present a Pontryagin-Guided Direct Policy Optimization (PG-DPO) framework for Merton's portfolio problem, unifying modern neural-network-based policy parameterization with the adjoint viewpoint from Pontryagin's maximum principle (PMP). Instead of approximating the value function (as done in deep BSDE methods), we track a policy-fixed BSDE for the adjoint processes, which allows each gradient update to align with continuous-time PMP conditions. This setup yields locally optimal consumption and investment policies that are closely tied to classical stochastic control. We further incorporate an alignment penalty that nudges the learned policy toward Pontryagin-derived solutions, enhancing both convergence speed and training stability. Numerical experiments confirm that PG-DPO effectively handles both consumption and investment, achieving strong performance and interpretability without requiring large offline datasets or model-free reinforcement learning.
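
As a small illustration of the adjoint viewpoint mentioned in the abstract, the sketch below checks numerically that the Pontryagin first-order condition for the risky-asset fraction reproduces the classical Merton ratio (mu - r) / (gamma * sigma^2) under CRRA utility. It is not the authors' PG-DPO implementation: no neural network or BSDE is involved, the function names are hypothetical, the parameter values are illustrative assumptions, and the costate is taken to be proportional to x^(-gamma), which is the known form for the CRRA Merton problem.

```python
def merton_fraction(mu, r, sigma, gamma):
    """Classical Merton risky-asset fraction for CRRA utility (risk aversion gamma)."""
    return (mu - r) / (gamma * sigma ** 2)


def costate(x, gamma):
    """Costate lambda(x) = V_x(x) up to a positive factor.

    Assumption: for CRRA utility the value function is homothetic in wealth,
    so lambda is proportional to x**(-gamma).
    """
    return x ** (-gamma)


def pmp_fraction(x, mu, r, sigma, gamma, h=1e-5):
    """Risky fraction implied by the Pontryagin first-order condition

        pi = -(mu - r) * lambda / (sigma^2 * x * d(lambda)/dx),

    with d(lambda)/dx estimated by a central finite difference.
    """
    lam = costate(x, gamma)
    dlam_dx = (costate(x + h, gamma) - costate(x - h, gamma)) / (2 * h)
    return -(mu - r) * lam / (sigma ** 2 * x * dlam_dx)


# Illustrative parameters (assumptions, not the paper's experimental settings).
mu, r, sigma, gamma = 0.08, 0.02, 0.2, 2.0

closed_form = merton_fraction(mu, r, sigma, gamma)
from_pmp = [pmp_fraction(x, mu, r, sigma, gamma) for x in (0.5, 1.0, 2.0)]

print("closed form:", closed_form)
print("from costate:", from_pmp)
```

The costate-based fraction does not depend on the wealth level x, which is the familiar constant-proportion property of the Merton solution.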

Suggested Citation

  • Jeonggyu Huh & Jaegi Jeon, 2024. "Pontryagin-Guided Policy Optimization for Merton's Portfolio Problem," Papers 2412.13101, arXiv.org, revised Jan 2025.
  • Handle: RePEc:arx:papers:2412.13101

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2412.13101
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Merton, Robert C., 1971. "Optimum consumption and portfolio rules in a continuous-time model," Journal of Economic Theory, Elsevier, vol. 3(4), pages 373-413, December.
    2. Min Dai & Yuchao Dong & Yanwei Jia & Xun Yu Zhou, 2023. "Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration," Papers 2312.11797, arXiv.org.
    3. A. Max Reppen & H. Mete Soner & Valentin Tissot-Daguette, 2023. "Deep stochastic optimization in finance," Digital Finance, Springer, vol. 5(1), pages 91-111, March.
    4. Anders Max Reppen & Halil Mete Soner, 2023. "Deep empirical risk minimization in finance: Looking into the future," Mathematical Finance, Wiley Blackwell, vol. 33(1), pages 116-145, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pieter van Staden & Peter Forsyth & Yuying Li, 2024. "Smart leverage? Rethinking the role of Leveraged Exchange Traded Funds in constructing portfolios to beat a benchmark," Papers 2412.05431, arXiv.org.
    2. Auffret, Philippe, 2001. "An alternative unifying measure of welfare gains from risk-sharing," Policy Research Working Paper Series 2676, The World Bank.
    3. Chen, An & Hieber, Peter & Sureth, Caren, 2022. "Pay for tax certainty? Advance tax rulings for risky investment under multi-dimensional tax uncertainty," arqus Discussion Papers in Quantitative Tax Research 273, arqus - Arbeitskreis Quantitative Steuerlehre.
    4. Sanchez-Romero, Miguel, 2006. "Demand for Private Annuities and Social Security: Consequences to Individual Wealth," Working Papers in Economic Theory 2006/07, Universidad Autónoma de Madrid (Spain), Department of Economic Analysis (Economic Theory and Economic History).
    5. Andreas Fagereng & Luigi Guiso & Davide Malacrino & Luigi Pistaferri, 2020. "Heterogeneity and Persistence in Returns to Wealth," Econometrica, Econometric Society, vol. 88(1), pages 115-170, January.
    6. Luca Di Persio & Luca Prezioso & Kai Wallbaum, 2019. "Closed-End Formula for options linked to Target Volatility Strategies," Papers 1902.08821, arXiv.org.
    7. John H. Cochrane, 1999. "New facts in finance," Economic Perspectives, Federal Reserve Bank of Chicago, vol. 23(Q III), pages 36-58.
    8. Larrain, Borja, 2011. "World betas, consumption growth, and financial integration," Journal of International Money and Finance, Elsevier, vol. 30(6), pages 999-1018, October.
    9. Song, Dandan & Wang, Huamao & Yang, Zhaojun, 2014. "Learning, pricing, timing and hedging of the option to invest for perpetual cash flows with idiosyncratic risk," Journal of Mathematical Economics, Elsevier, vol. 51(C), pages 1-11.
    10. Devereux, Michael B. & Saito, Makoto, 1997. "Growth and risk-sharing with incomplete international assets markets," Journal of International Economics, Elsevier, vol. 42(3-4), pages 453-481, May.
    11. John Y. Campbell & Luis M. Viceira & Joshua S. White, 2003. "Foreign Currency for Long-Term Investors," Economic Journal, Royal Economic Society, vol. 113(486), pages 1-25, March.
    12. repec:dau:papers:123456789/56 is not listed on IDEAS
    13. Stephen Satchell & Susan Thorp, 2007. "Scenario Analysis with Recursive Utility: Dynamic Consumption Plans for Charitable Endowments," Research Paper Series 209, Quantitative Finance Research Centre, University of Technology, Sydney.
    14. Cuoco, Domenico & Liu, Hong, 2000. "Optimal consumption of a divisible durable good," Journal of Economic Dynamics and Control, Elsevier, vol. 24(4), pages 561-613, April.
    15. Renaud Bourlès & Dominique Henriet, 2012. "Risk-sharing Contracts with Asymmetric Information," The Geneva Risk and Insurance Review, Palgrave Macmillan;International Association for the Study of Insurance Economics (The Geneva Association), vol. 37(1), pages 27-56, March.
    16. Hong‐Chih Huang, 2010. "Optimal Multiperiod Asset Allocation: Matching Assets to Liabilities in a Discrete Model," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 77(2), pages 451-472, June.
    17. Carlos Garriga & Mark P. Keightley, 2007. "A general equilibrium theory of college with education subsidies, in-school labor supply, and borrowing constraints," Working Papers 2007-051, Federal Reserve Bank of St. Louis.
    18. Orszag, J. Michael & Yang, Hong, 1995. "Portfolio choice with Knightian uncertainty," Journal of Economic Dynamics and Control, Elsevier, vol. 19(5-7), pages 873-900.
    19. Bjork, Tomas, 2009. "Arbitrage Theory in Continuous Time," OUP Catalogue, Oxford University Press, edition 3, number 9780199574742.
    20. Andrew Papanicolaou, 2018. "Backward SDEs for Control with Partial Information," Papers 1807.08222, arXiv.org.
    21. E. Nasakkala & J. Keppo, 2008. "Hydropower with Financial Information," Applied Mathematical Finance, Taylor & Francis Journals, vol. 15(5-6), pages 503-529.
