
QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds

Author

Listed:
  • Igor Halperin

Abstract

This paper presents a discrete-time option pricing model that is rooted in Reinforcement Learning (RL), and more specifically in the famous Q-Learning method of RL. We construct a risk-adjusted Markov Decision Process for a discrete-time version of the classical Black-Scholes-Merton (BSM) model, where the option price is an optimal Q-function, while the optimal hedge is a second argument of this optimal Q-function, so that both the price and hedge are parts of the same formula. Pricing is done by learning to dynamically optimize risk-adjusted returns for an option replicating portfolio, as in the Markowitz portfolio theory. Using Q-Learning and related methods, once created in a parametric setting, the model is able to go model-free and learn to price and hedge an option directly from data, without an explicit model of the world. This suggests that RL may provide efficient data-driven and model-free methods for optimal pricing and hedging of options, once we depart from the academic continuous-time limit, and, conversely, that option pricing methods developed in Mathematical Finance may be viewed as special cases of model-based Reinforcement Learning. Further, due to the simplicity and tractability of our model, which only needs basic linear algebra (plus Monte Carlo simulation, if we work with synthetic data), and its close relation to the original BSM model, we suggest that our model could be used for benchmarking of different RL algorithms for financial trading applications.
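
The abstract gives no formulas, so the following is only a minimal illustrative sketch of the general idea of pricing by optimizing a risk-adjusted hedged portfolio on simulated data. It is not the paper's algorithm. It assumes a single time step, lognormal stock dynamics, a European put, a minimum-variance hedge computed as a covariance-to-variance ratio, and a variance penalty. The function name `one_step_hedge_and_price`, the parameter `risk_aversion`, and all numerical values are hypothetical choices made for this example.

```python
import math
import random


def _mean(xs):
    return sum(xs) / len(xs)


def _cov(xs, ys):
    # Population covariance of two equally long samples.
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def one_step_hedge_and_price(s0=100.0, strike=100.0, mu=0.05, r=0.03,
                             sigma=0.2, dt=1.0 / 12.0, risk_aversion=0.01,
                             n_paths=20000, seed=0):
    """One-period Monte Carlo sketch (illustrative assumptions only).

    Returns (hedge, price, var_hedged, var_unhedged).
    """
    rng = random.Random(seed)
    growth = math.exp(r * dt)  # one-step risk-free growth factor
    payoffs, excess_moves = [], []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Real-world (drift mu) lognormal step for the stock.
        s1 = s0 * math.exp((mu - 0.5 * sigma ** 2) * dt
                           + sigma * math.sqrt(dt) * z)
        payoffs.append(max(strike - s1, 0.0))   # European put payoff
        excess_moves.append(s1 - growth * s0)   # stock move beyond risk-free

    # Minimum-variance hedge: regress the payoff on the excess stock move.
    hedge = _cov(payoffs, excess_moves) / _cov(excess_moves, excess_moves)

    # Discounted value of the hedged position along each path.
    hedged = [(p - hedge * d) / growth for p, d in zip(payoffs, excess_moves)]
    unhedged = [p / growth for p in payoffs]

    var_hedged = _cov(hedged, hedged)
    var_unhedged = _cov(unhedged, unhedged)
    # Risk-adjusted price: expected hedged cost plus a variance penalty.
    price = _mean(hedged) + risk_aversion * var_hedged
    return hedge, price, var_hedged, var_unhedged


hedge, price, var_hedged, var_unhedged = one_step_hedge_and_price()
print("hedge:", hedge)
print("risk-adjusted price:", price)
print("variance hedged vs unhedged:", var_hedged, var_unhedged)
```

In this toy setting the hedge and the price come out of the same simulated sample using only means, variances and covariances, which is in the spirit of the abstract's remark that basic linear algebra plus Monte Carlo simulation suffices. A multi-step version would need a backward recursion over time, which is not attempted here.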

Suggested Citation

  • Igor Halperin, 2017. "QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds," Papers 1712.04609, arXiv.org, revised Sep 2019.
  • Handle: RePEc:arx:papers:1712.04609

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1712.04609
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Duan, Jin-Chuan & Simonato, Jean-Guy, 2001. "American option pricing under GARCH by a Markov chain approximation," Journal of Economic Dynamics and Control, Elsevier, vol. 25(11), pages 1689-1718, November.
    2. Föllmer, H. & Schweizer, M., 1989. "Hedging by Sequential Regression: an Introduction to the Mathematics of Option Trading," ASTIN Bulletin, Cambridge University Press, vol. 19(S1), pages 29-42, November.
    3. Potters, Marc & Bouchaud, Jean-Philippe & Sestovic, Dragan, 2001. "Hedged Monte-Carlo: low variance derivative pricing with objective probabilities," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 289(3), pages 517-525.

    Citations

    Cited by:

    1. Hans Buhler & Lukas Gonon & Josef Teichmann & Ben Wood, 2018. "Deep Hedging," Papers 1802.03042, arXiv.org.
    2. Yoshiharu Sato, 2019. "Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey," Papers 1904.04973, arXiv.org, revised May 2019.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Stentoft, Lars, 2005. "Pricing American options when the underlying asset follows GARCH processes," Journal of Empirical Finance, Elsevier, vol. 12(4), pages 576-611, September.
    2. Christoffersen, Peter & Heston, Steven & Jacobs, Kris, 2010. "Option Anomalies and the Pricing Kernel," Working Papers 11-17, University of Pennsylvania, Wharton School, Weiss Center.
    3. Lars Stentoft, 2008. "American Option Pricing Using GARCH Models and the Normal Inverse Gaussian Distribution," Journal of Financial Econometrics, Oxford University Press, vol. 6(4), pages 540-582, Fall.
    4. Chun-Chou Wu, 2006. "The GARCH Option Pricing Model: A Modification of Lattice Approach," Review of Quantitative Finance and Accounting, Springer, vol. 26(1), pages 55-66, February.
    5. N. Vijayamohanan Pillai, 2004. "Causality and error correction in Markov chain: Inflation in India revisited," Centre for Development Studies, Trivendrum Working Papers 366, Centre for Development Studies, Trivendrum, India.
    6. Wang, Xiao-Tian & Wu, Min & Zhou, Ze-Min & Jing, Wei-Shu, 2012. "Pricing European option with transaction costs under the fractional long memory stochastic volatility model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 391(4), pages 1469-1480.
    7. Edgardo Brigatti & Felipe Macias & Max O. Souza & Jorge P. Zubelli, 2015. "A Hedged Monte Carlo Approach to Real Option Pricing," Papers 1509.03577, arXiv.org.
    8. Battulga Gankhuu, 2021. "Equity-Linked Life Insurances on Maximum of Several Assets," Papers 2111.04038, arXiv.org, revised Sep 2024.
    9. Hu, Yuan & Lindquist, W. Brent & Rachev, Svetlozar T. & Shirvani, Abootaleb & Fabozzi, Frank J., 2022. "Market complete option valuation using a Jarrow-Rudd pricing tree with skewness and kurtosis," Journal of Economic Dynamics and Control, Elsevier, vol. 137(C).
    10. Bernales, Alejandro & Guidolin, Massimo, 2015. "Learning to smile: Can rational learning explain predictable dynamics in the implied volatility surface?," Journal of Financial Markets, Elsevier, vol. 26(C), pages 1-37.
    11. Battulga Gankhuu, 2022. "Augmented Dynamic Gordon Growth Model," Papers 2201.06012, arXiv.org, revised Sep 2024.
    12. Yan Liu & Xiong Zhang, 2023. "Option Pricing Using LSTM: A Perspective of Realized Skewness," Mathematics, MDPI, vol. 11(2), pages 1-21, January.
    13. William Lefebvre & Grégoire Loeper & Huyên Pham, 2022. "Differential learning methods for solving fully nonlinear PDEs," Papers 2205.09815, arXiv.org.
    14. Valeriy Ryabchenko & Sergey Sarykalin & Stan Uryasev, 2004. "Pricing European Options by Numerical Replication: Quadratic Programming with Constraints," Asia-Pacific Financial Markets, Springer;Japanese Association of Financial Economics and Engineering, vol. 11(3), pages 301-333, September.
    15. Jussi Keppo & Lones Smith & Dmitry Davydov, 2006. "Optimal Electoral Timing: Exercise Wisely and You May Live Longer," Cowles Foundation Discussion Papers 1565, Cowles Foundation for Research in Economics, Yale University.
    16. Ke Du, 2013. "Commodity Derivative Pricing Under the Benchmark Approach," PhD Thesis, Finance Discipline Group, UTS Business School, University of Technology, Sydney, number 1-2013, January-A.
    17. Bernales, Alejandro & Guidolin, Massimo, 2014. "Can we forecast the implied volatility surface dynamics of equity options? Predictability and economic value tests," Journal of Banking & Finance, Elsevier, vol. 46(C), pages 326-342.
    18. K. Hsieh & P. Ritchken, 2005. "An empirical comparison of GARCH option pricing models," Review of Derivatives Research, Springer, vol. 8(3), pages 129-150, December.
    19. Alexandru Badescu & Robert J. Elliott & Juan-Pablo Ortega, 2012. "Quadratic hedging schemes for non-Gaussian GARCH models," Papers 1209.5976, arXiv.org, revised Dec 2013.
    20. Ludovic Goudenège & Andrea Molent & Antonino Zanette, 2023. "Backward Hedging for American Options with Transaction Costs," Papers 2305.06805, arXiv.org, revised Jun 2023.
