
Practical Application of Deep Reinforcement Learning to Optimal Trade Execution

Author

Listed:
  • Woo Jae Byun

    (Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea
    These authors contributed equally to this work. Author ordering is randomly determined.)

  • Bumkyu Choi

    (Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea
    These authors contributed equally to this work. Author ordering is randomly determined.)

  • Seongmin Kim

    (Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea)

  • Joohyun Jo

    (Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea)

Abstract

Although deep reinforcement learning (DRL) has recently emerged as a promising technique for optimal trade execution, two problems remain unsolved: (1) the lack of a generalized model for a large collection of stocks and execution time horizons; and (2) the inability to train algorithms accurately due to the discrepancy between the simulation environment and the real market. In this article, we address both issues by using a widely adopted reinforcement learning (RL) algorithm, proximal policy optimization (PPO), with a long short-term memory (LSTM) network, and by building our proprietary order execution simulation environment on historical level 3 market data from the Korea Exchange (KRX). To the best of our knowledge, this paper is the first to achieve generalization across 50 stocks and across execution time horizons ranging from 165 to 380 min, along with dynamic target volumes. The experimental results demonstrate that the proposed algorithm outperforms the popular benchmark, the volume-weighted average price (VWAP), highlighting the potential of DRL for optimal trade execution in real-world financial markets. Furthermore, our algorithm is the first commercialized DRL-based optimal trade execution algorithm in the South Korean stock market.
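
To give a concrete picture of the components named in the abstract, the sketch below is a minimal, hypothetical Python illustration, not the authors' code: an LSTM actor-critic head of the kind a PPO agent could use to pick child-order sizes, together with a VWAP benchmark and a slippage metric for evaluating executions against it. The class and function names (ExecutionPolicy, vwap, slippage_vs_vwap_bps) and the discrete action space are assumptions made for illustration only.

# Hypothetical sketch of the pieces named in the abstract: an LSTM policy/value
# head suitable for PPO-style training, and a VWAP benchmark for evaluation.
# Names and shapes are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class ExecutionPolicy(nn.Module):
    """Maps a sequence of market-state features to a distribution over child-order sizes."""

    def __init__(self, n_features: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.actor = nn.Linear(hidden, n_actions)  # logits over discrete order sizes
        self.critic = nn.Linear(hidden, 1)         # state-value estimate used by PPO

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, n_features); keep the recurrent state across steps
        out, hidden_state = self.lstm(obs_seq, hidden_state)
        last = out[:, -1, :]                       # most recent time step
        return self.actor(last), self.critic(last), hidden_state


def vwap(prices, volumes):
    """Volume-weighted average price of a set of trades."""
    prices = torch.as_tensor(prices, dtype=torch.float64)
    volumes = torch.as_tensor(volumes, dtype=torch.float64)
    return float((prices * volumes).sum() / volumes.sum())


def slippage_vs_vwap_bps(fill_prices, fill_volumes, mkt_prices, mkt_volumes, side="buy"):
    """Execution quality in basis points relative to the market VWAP (positive = better)."""
    agent_px = vwap(fill_prices, fill_volumes)
    market_px = vwap(mkt_prices, mkt_volumes)
    sign = 1.0 if side == "buy" else -1.0
    return sign * (market_px - agent_px) / market_px * 1e4

In the setting the abstract describes, the agent's fills over an execution episode would be compared against the market VWAP over the same horizon; the sketch only illustrates that comparison and the policy's input/output structure, not the PPO training loop or the level 3 order-book simulator.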

Suggested Citation

  • Woo Jae Byun & Bumkyu Choi & Seongmin Kim & Joohyun Jo, 2023. "Practical Application of Deep Reinforcement Learning to Optimal Trade Execution," FinTech, MDPI, vol. 2(3), pages 1-16, June.
  • Handle: RePEc:gam:jfinte:v:2:y:2023:i:3:p:23-429:d:1182401

    Download full text from publisher

    File URL: https://www.mdpi.com/2674-1032/2/3/23/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2674-1032/2/3/23/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Schnaubelt, Matthias, 2022. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," European Journal of Operational Research, Elsevier, vol. 296(3), pages 993-1006.
    2. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    3. Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
    4. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    5. Dieter Hendricks, 2016. "Using real-time cluster configurations of streaming asynchronous features as online state descriptors in financial markets," Papers 1603.06805, arXiv.org, revised May 2017.
    6. Yuchao Dong, 2022. "Randomized Optimal Stopping Problem in Continuous time and Reinforcement Learning Algorithm," Papers 2208.02409, arXiv.org, revised Sep 2023.
    7. Söhnke M. Bartram & Jürgen Branke & Mehrshad Motahari, 2020. "Artificial intelligence in asset management," Working Papers 20202001, Cambridge Judge Business School, University of Cambridge.
    8. Ayman Chaouki & Stephen Hardiman & Christian Schmidt & Emmanuel S'eri'e & Joachim de Lataillade, 2020. "Deep Deterministic Portfolio Optimization," Papers 2003.06497, arXiv.org, revised Apr 2020.
    9. Dicks, Matthew & Paskaramoorthy, Andrew & Gebbie, Tim, 2024. "A simple learning agent interacting with an agent-based market model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 633(C).
    10. Feiyang Pan & Tongzhe Zhang & Ling Luo & Jia He & Shuoling Liu, 2022. "Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution," Papers 2207.11152, arXiv.org.
    11. Soohan Kim & Jimyeong Kim & Hong Kee Sul & Youngjoon Hong, 2023. "An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution," Papers 2307.10649, arXiv.org.
    12. Schnaubelt, Matthias, 2020. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," FAU Discussion Papers in Economics 05/2020, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    13. Haoran Wang & Xun Yu Zhou, 2019. "Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework," Papers 1904.11392, arXiv.org, revised May 2019.
    14. Bruno Gašperov & Stjepan Begušić & Petra Posedel Šimović & Zvonko Kostanjčar, 2021. "Reinforcement Learning Approaches to Optimal Market Making," Mathematics, MDPI, vol. 9(21), pages 1-22, October.
    15. Xiaodong Li & Pangjing Wu & Chenxin Zou & Qing Li, 2022. "Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization," Papers 2212.14670, arXiv.org.
    16. Gianbiagio Curato & Jim Gatheral & Fabrizio Lillo, 2014. "Optimal execution with nonlinear transient market impact," Papers 1412.4839, arXiv.org.
    17. Curatola, Giuliano, 2022. "Price impact, strategic interaction and portfolio choice," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
    18. Tulika Saha & Sriparna Saha & Pushpak Bhattacharyya, 2020. "Towards sentiment aided dialogue policy learning for multi-intent conversations using hierarchical reinforcement learning," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-28, July.
    19. Xiaoyue Li & John M. Mulvey, 2023. "Optimal Portfolio Execution in a Regime-switching Market with Non-linear Impact Costs: Combining Dynamic Program and Neural Network," Papers 2306.08809, arXiv.org.
    20. Hong, Harrison & Rady, Sven, 2002. "Strategic trading and learning about liquidity," Journal of Financial Markets, Elsevier, vol. 5(4), pages 419-450, October.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jfinte:v:2:y:2023:i:3:p:23-429:d:1182401. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows us to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.