
Deep Reinforcement Learning in Non-Markov Market-Making

Author

Listed:
  • Luca Lalor

    (Department of Mathematics and Statistics, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada)

  • Anatoliy Swishchuk

    (Department of Mathematics and Statistics, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada)

Abstract

We develop a deep reinforcement learning (RL) framework for an optimal market-making (MM) trading problem, focusing on price processes with semi-Markov and Hawkes jump-diffusion dynamics. We begin by reviewing the basics of RL and the deep RL framework used; for the deep learning component we deploy the state-of-the-art Soft Actor–Critic (SAC) algorithm, an off-policy entropy-maximization method well suited to complex, high-dimensional problems with continuous state and action spaces, such as optimal MM. We then introduce the optimal MM problem under consideration, detailing the deterministic and stochastic processes used to build an environment that simulates the strategy. Here, we also give an in-depth overview of the jump-diffusion pricing dynamics, our method for handling adverse selection within the limit order book, and the components of our optimization problem. Next, we discuss training and testing results, with visuals showing how key quantities such as bid/ask prices, trade executions, inventory, and the reward function evolve. Our study covers both simulated and real data. We conclude with a discussion of the limitations of these results, which apply to most diffusion-style models in this setting.
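The "entropy maximization" mentioned above refers to SAC's objective of maximizing expected return plus an entropy bonus, J(pi) = E[ sum_t r(s_t, a_t) + alpha * H(pi(.|s_t)) ], which encourages exploration in continuous action spaces. As a rough illustration of the pricing dynamics the abstract describes, the sketch below simulates a mid-price following a jump-diffusion whose jump arrivals come from a self-exciting Hawkes process, generated with Ogata's thinning algorithm. All names and parameter values here (mu_h, alpha_h, beta_h, sigma, jump_scale, and so on) are illustrative placeholders, not the calibrated quantities from the paper.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, rng):
    """Event times of a Hawkes process with exponential kernel
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)),
    simulated by Ogata's thinning. Between events the intensity decays,
    so the intensity at the current time is a valid upper bound."""
    t, events = 0.0, []
    while True:
        lam_bar = mu + sum(alpha * np.exp(-beta * (t - ti)) for ti in events)
        t += rng.exponential(1.0 / lam_bar)  # candidate arrival
        if t > T:
            break
        lam_t = mu + sum(alpha * np.exp(-beta * (t - ti)) for ti in events)
        if rng.uniform() <= lam_t / lam_bar:  # accept with prob lam(t)/lam_bar
            events.append(t)
    return np.array(events)

def simulate_midprice(s0, sigma, jump_scale, mu_h, alpha_h, beta_h,
                      T, n_steps, seed=0):
    """Euler scheme for a toy mid-price: Brownian diffusion plus
    symmetric jumps at Hawkes-driven arrival times. A stand-in for the
    Hawkes jump-diffusion dynamics described in the paper."""
    rng = np.random.default_rng(seed)
    jump_times = simulate_hawkes(mu_h, alpha_h, beta_h, T, rng)
    dt = T / n_steps
    times = np.linspace(0.0, T, n_steps + 1)
    s = np.empty(n_steps + 1)
    s[0] = s0
    j = 0
    for k in range(n_steps):
        ds = sigma * np.sqrt(dt) * rng.standard_normal()
        # add any Hawkes-driven jumps falling inside this time step
        while j < len(jump_times) and jump_times[j] <= times[k + 1]:
            ds += jump_scale * rng.choice([-1.0, 1.0])
            j += 1
        s[k + 1] = s[k] + ds
    return times, s

times, mid = simulate_midprice(s0=100.0, sigma=0.2, jump_scale=0.05,
                               mu_h=1.0, alpha_h=0.8, beta_h=2.0,
                               T=1.0, n_steps=1_000)
```

In the paper's setting, self-exciting arrivals would typically drive order flow, and hence adverse selection, as well; here they only trigger symmetric price jumps, which is the simplest way to see the event clustering that distinguishes Hawkes dynamics from a plain Poisson jump-diffusion.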

Suggested Citation

  • Luca Lalor & Anatoliy Swishchuk, 2025. "Deep Reinforcement Learning in Non-Markov Market-Making," Risks, MDPI, vol. 13(3), pages 1-27, February.
  • Handle: RePEc:gam:jrisks:v:13:y:2025:i:3:p:40-:d:1598238

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-9091/13/3/40/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-9091/13/3/40/
    Download Restriction: no

    References listed on IDEAS

1. Pietro Fodra & Huyên Pham, 2013. "High frequency trading and asymptotics for small risk aversion in a Markov renewal model," Papers 1310.1756, arXiv.org, revised Jan 2015.
    2. Ana Roldan Contreras & Anatoliy Swishchuk, 2022. "Optimal Liquidation, Acquisition and Market Making Problems in HFT under Hawkes Models for LOB," Risks, MDPI, vol. 10(8), pages 1-32, August.
3. Nicholas T. Chan & Christian Shelton, 2001. "An Adaptive Electronic Market-Maker," Computing in Economics and Finance 2001 146, Society for Computational Economics.
    4. Myles Sjogren & Timothy DeLise, 2021. "General Compound Hawkes Processes for Mid-Price Prediction," Papers 2110.07075, arXiv.org.
    5. Timothy DeLise, 2024. "The Negative Drift of a Limit Order Fill," Papers 2407.16527, arXiv.org.
    6. Jonathan Sadighian, 2020. "Extending Deep Reinforcement Learning Frameworks in Cryptocurrency Market Making," Papers 2004.06985, arXiv.org.
    7. Luca Lalor & Anatoliy Swishchuk, 2024. "Market Simulation under Adverse Selection," Papers 2409.12721, arXiv.org.
    8. David Silver & Aja Huang & Chris J. Maddison & Arthur Guez & Laurent Sifre & George van den Driessche & Julian Schrittwieser & Ioannis Antonoglou & Veda Panneershelvam & Marc Lanctot & Sander Dieleman, 2016. "Mastering the game of Go with deep neural networks and tree search," Nature, Nature, vol. 529(7587), pages 484-489, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Luca Lalor & Anatoliy Swishchuk, 2024. "Reinforcement Learning in Non-Markov Market-Making," Papers 2410.14504, arXiv.org, revised Nov 2024.
    2. Luca Lalor & Anatoliy Swishchuk, 2024. "Algorithmic and High-Frequency Trading Problems for Semi-Markov and Hawkes Jump-Diffusion Models," Papers 2409.12776, arXiv.org, revised Mar 2025.
    3. Tristan Lim, 2022. "Predictive Crypto-Asset Automated Market Making Architecture for Decentralized Finance using Deep Reinforcement Learning," Papers 2211.01346, arXiv.org, revised Jan 2023.
    4. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    5. Tristan Lim, 2024. "Predictive crypto-asset automated market maker architecture for decentralized finance using deep reinforcement learning," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 10(1), pages 1-29, December.
6. Bruno Gašperov & Zvonko Kostanjčar, 2022. "Deep Reinforcement Learning for Market Making Under a Hawkes Process-Based Limit Order Book Model," Papers 2207.09951, arXiv.org.
    7. Bruno Gašperov & Stjepan Begušić & Petra Posedel Šimović & Zvonko Kostanjčar, 2021. "Reinforcement Learning Approaches to Optimal Market Making," Mathematics, MDPI, vol. 9(21), pages 1-22, October.
    8. Tian Zhu & Merry H. Ma, 2022. "Deriving the Optimal Strategy for the Two Dice Pig Game via Reinforcement Learning," Stats, MDPI, vol. 5(3), pages 1-14, August.
    9. Xiaoyue Li & John M. Mulvey, 2023. "Optimal Portfolio Execution in a Regime-switching Market with Non-linear Impact Costs: Combining Dynamic Program and Neural Network," Papers 2306.08809, arXiv.org.
    10. Pedro Afonso Fernandes, 2024. "Forecasting with Neuro-Dynamic Programming," Papers 2404.03737, arXiv.org.
    11. Nathan Companez & Aldeida Aleti, 2016. "Can Monte-Carlo Tree Search learn to sacrifice?," Journal of Heuristics, Springer, vol. 22(6), pages 783-813, December.
    12. Yuchen Zhang & Wei Yang, 2022. "Breakthrough invention and problem complexity: Evidence from a quasi‐experiment," Strategic Management Journal, Wiley Blackwell, vol. 43(12), pages 2510-2544, December.
    13. Yassine Chemingui & Adel Gastli & Omar Ellabban, 2020. "Reinforcement Learning-Based School Energy Management System," Energies, MDPI, vol. 13(23), pages 1-21, December.
    14. Leo Ardon & Nelson Vadori & Thomas Spooner & Mengda Xu & Jared Vann & Sumitra Ganesh, 2021. "Towards a fully RL-based Market Simulator," Papers 2110.06829, arXiv.org, revised Nov 2021.
    15. Zhewei Zhang & Youngjin Yoo & Kalle Lyytinen & Aron Lindberg, 2021. "The Unknowability of Autonomous Tools and the Liminal Experience of Their Use," Information Systems Research, INFORMS, vol. 32(4), pages 1192-1213, December.
    16. Yuhong Wang & Lei Chen & Hong Zhou & Xu Zhou & Zongsheng Zheng & Qi Zeng & Li Jiang & Liang Lu, 2021. "Flexible Transmission Network Expansion Planning Based on DQN Algorithm," Energies, MDPI, vol. 14(7), pages 1-21, April.
    17. JinHyo Joseph Yun & EuiSeob Jeong & Xiaofei Zhao & Sung Deuk Hahm & KyungHun Kim, 2019. "Collective Intelligence: An Emerging World in Open Innovation," Sustainability, MDPI, vol. 11(16), pages 1-15, August.
    18. Thomas P. Novak & Donna L. Hoffman, 2019. "Relationship journeys in the internet of things: a new framework for understanding interactions between consumers and smart objects," Journal of the Academy of Marketing Science, Springer, vol. 47(2), pages 216-237, March.
    19. Huang, Ruchen & He, Hongwen & Gao, Miaojue, 2023. "Training-efficient and cost-optimal energy management for fuel cell hybrid electric bus based on a novel distributed deep reinforcement learning framework," Applied Energy, Elsevier, vol. 346(C).
    20. Gokhale, Gargya & Claessens, Bert & Develder, Chris, 2022. "Physics informed neural networks for control oriented thermal modeling of buildings," Applied Energy, Elsevier, vol. 314(C).

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jrisks:v:13:y:2025:i:3:p:40-:d:1598238. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.