Printed from https://ideas.repec.org/a/spr/annopr/v338y2024i2d10.1007_s10479-024-06029-x.html

A goal-oriented reinforcement learning for optimal drug dosage control

Author

Listed:
  • Qian Zhang

    (University of Electronic Science and Technology of China)

  • Tianhao Li

    (University of Electronic Science and Technology of China)

  • Dengfeng Li

    (University of Electronic Science and Technology of China)

  • Wei Lu

    (University of Electronic Science and Technology of China)

Abstract

The dosage control of therapeutic drugs is a concern for clinicians: whether a dosing decision is correct and efficient can determine whether a patient survives. In intensive care units (ICUs), medication decisions form a dynamic, continuous process that is difficult to handle with traditional intelligent technologies. While reinforcement learning (RL) has an advantage in sequential decision making, it faces challenges in multi-level problems because of delayed rewards and complex states. Hierarchical reinforcement learning (HRL) is a layered algorithm built on RL. HRL has proven effective on problems with delayed, sparse rewards, and it reduces learning difficulty by dividing a long-term goal into stages. Inspired by this, we propose a goal-oriented reinforcement learning (GORL) approach to optimize drug dosage control for sepsis patients. Specifically, GORL employs two agents that make dosage decisions cooperatively by simulating the behavior of clinicians. GORL decomposes a long-term goal into several short-term goals to reduce the exploration space. For the long-term goal, a goal-oriented formulation is introduced to address the sparse reward. The goal-oriented hierarchical structure helps the agents interact and cooperate to achieve each short-term goal. In addition, we design a hindsight intrinsic reward to balance the long-term and short-term goals, and are thus able to learn an optimal policy for drug dosage control. We conduct our experiments on MIMIC-IV, one of the largest medical datasets. The experimental results show that our model outperforms the baseline algorithms and learns a more robust treatment policy than that of clinicians, reducing patient mortality by 10.23%.
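
The abstract does not define the hindsight intrinsic reward or how short-term goals are represented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical scalar patient indicator as the state, a fixed segment length per short-term goal, and a sparse goal-reaching reward, and it uses the generic hindsight relabeling trick (treating the state actually reached as if it had been the goal). All names (`Transition`, `intrinsic_reward`, `split_into_segments`, `hindsight_relabel`) and the tolerance value are illustrative assumptions.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass
class Transition:
    state: float       # hypothetical scalar patient indicator before dosing
    action: float      # hypothetical dose chosen by the low-level agent
    next_state: float  # indicator after dosing
    goal: float        # short-term goal set by the high-level agent


def intrinsic_reward(next_state: float, goal: float, tol: float = 0.5) -> float:
    """Sparse goal-reaching reward: 0 within `tol` of the goal, otherwise -1."""
    return 0.0 if abs(next_state - goal) <= tol else -1.0


def split_into_segments(trajectory: List[Transition], horizon: int) -> List[List[Transition]]:
    """Cut a long trajectory into fixed-length pieces, one per short-term goal."""
    return [trajectory[i:i + horizon] for i in range(0, len(trajectory), horizon)]


def hindsight_relabel(segment: List[Transition]) -> List[Tuple[Transition, float]]:
    """Relabel a segment with the state it actually reached as its goal,
    then recompute the intrinsic reward under that substituted goal."""
    achieved = segment[-1].next_state
    return [
        (replace(t, goal=achieved), intrinsic_reward(t.next_state, achieved))
        for t in segment
    ]


# Toy trajectory: the indicator moves 0 -> 1 -> 2 -> 3 -> 4 while the
# original goal (10.0) is never reached, so every original reward is -1.
trajectory = [Transition(float(s), 1.0, float(s + 1), 10.0) for s in range(4)]
segments = split_into_segments(trajectory, horizon=2)
relabeled = hindsight_relabel(segments[0])
```

In this toy run the original goal is never reached, so the low-level learner would see only failure signals. After relabeling, the last step of each segment counts as a success, which is the usual motivation for hindsight-style rewards in sparse-reward settings.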

Suggested Citation

  • Qian Zhang & Tianhao Li & Dengfeng Li & Wei Lu, 2024. "A goal-oriented reinforcement learning for optimal drug dosage control," Annals of Operations Research, Springer, vol. 338(2), pages 1403-1423, July.
  • Handle: RePEc:spr:annopr:v:338:y:2024:i:2:d:10.1007_s10479-024-06029-x
    DOI: 10.1007/s10479-024-06029-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-024-06029-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Yazan F. Roumani & Yaman Roumani & Joseph K. Nwankpa & Mohan Tanniru, 2018. "Classifying readmissions to a cardiac intensive care unit," Annals of Operations Research, Springer, vol. 263(1), pages 429-451, April.
    2. Oriol Vinyals & Igor Babuschkin & Wojciech M. Czarnecki & Michaël Mathieu & Andrew Dudzik & Junyoung Chung & David H. Choi & Richard Powell & Timo Ewalds & Petko Georgiev & Junhyuk Oh & Dan Horgan & M, 2019. "Grandmaster level in StarCraft II using multi-agent reinforcement learning," Nature, Nature, vol. 575(7782), pages 350-354, November.
    3. Nazila Bazrafshan & M. M. Lotfi, 2020. "A finite-horizon Markov decision process model for cancer chemotherapy treatment planning: an application to sequential treatment decision making in clinical trials," Annals of Operations Research, Springer, vol. 295(1), pages 483-502, December.
    4. Ya-Ju Fan & Wanpracha Chaovalitwongse, 2010. "Optimizing feature selection to improve medical diagnosis," Annals of Operations Research, Springer, vol. 174(1), pages 169-183, February.
    5. Daniel Rasmussen & Aaron Voelker & Chris Eliasmith, 2017. "A neural model of hierarchical reinforcement learning," PLOS ONE, Public Library of Science, vol. 12(7), pages 1-39, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Talayeh Razzaghi & Ilya Safro & Joseph Ewing & Ehsan Sadrfaridpour & John D. Scott, 2019. "Predictive models for bariatric surgery risks with imbalanced medical datasets," Annals of Operations Research, Springer, vol. 280(1), pages 1-18, September.
    2. Najmeddine Dhieb & Ismail Abdulrashid & Hakim Ghazzai & Yehia Massoud, 2023. "Optimized drug regimen and chemotherapy scheduling for cancer treatment using swarm intelligence," Annals of Operations Research, Springer, vol. 320(2), pages 757-770, January.
    3. Yi, Zonggen & Luo, Yusheng & Westover, Tyler & Katikaneni, Sravya & Ponkiya, Binaka & Sah, Suba & Mahmud, Sadab & Raker, David & Javaid, Ahmad & Heben, Michael J. & Khanna, Raghav, 2022. "Deep reinforcement learning based optimization for a tightly coupled nuclear renewable integrated energy system," Applied Energy, Elsevier, vol. 328(C).
    4. Liying Xu & Jiadi Zhu & Bing Chen & Zhen Yang & Keqin Liu & Bingjie Dang & Teng Zhang & Yuchao Yang & Ru Huang, 2022. "A distributed nanocluster based multi-agent evolutionary network," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    5. Daphne Cornelisse & Thomas Rood & Mateusz Malinowski & Yoram Bachrach & Tal Kachman, 2022. "Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members," Papers 2208.08798, arXiv.org.
    6. Weisheng Chiu & Thomas Chun Man Fan & Sang-Back Nam & Ping-Hung Sun, 2021. "Knowledge Mapping and Sustainable Development of eSports Research: A Bibliometric and Visualized Analysis," Sustainability, MDPI, vol. 13(18), pages 1-17, September.
    7. Nweye, Kingsley & Sankaranarayanan, Siva & Nagy, Zoltan, 2023. "MERLIN: Multi-agent offline and transfer learning for occupant-centric operation of grid-interactive communities," Applied Energy, Elsevier, vol. 346(C).
    8. Bossert, Leonie & Hagendorff, Thilo, 2021. "Animals and AI. The role of animals in AI research and application – An overview and ethical evaluation," Technology in Society, Elsevier, vol. 67(C).
    9. Yang, Zhengzhi & Zheng, Lei & Perc, Matjaž & Li, Yumeng, 2024. "Interaction state Q-learning promotes cooperation in the spatial prisoner's dilemma game," Applied Mathematics and Computation, Elsevier, vol. 463(C).
    10. Alaleh Razmjoo & Petros Xanthopoulos & Qipeng Phil Zheng, 2019. "Feature importance ranking for classification in mixed online environments," Annals of Operations Research, Springer, vol. 276(1), pages 315-330, May.
    11. Kazim Topuz & Behrooz Davazdahemami & Dursun Delen, 2024. "A Bayesian belief network-based analytics methodology for early-stage risk detection of novel diseases," Annals of Operations Research, Springer, vol. 341(1), pages 673-697, October.
    12. Constantin Waubert de Puiseau & Richard Meyes & Tobias Meisen, 2022. "On reliability of reinforcement learning based production scheduling systems: a comparative survey," Journal of Intelligent Manufacturing, Springer, vol. 33(4), pages 911-927, April.
    13. Weifan Long & Taixian Hou & Xiaoyi Wei & Shichao Yan & Peng Zhai & Lihua Zhang, 2023. "A Survey on Population-Based Deep Reinforcement Learning," Mathematics, MDPI, vol. 11(10), pages 1-17, May.
    14. Wang, Xianjia & Yang, Zhipeng & Liu, Yanli & Chen, Guici, 2023. "A reinforcement learning-based strategy updating model for the cooperative evolution," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 618(C).
    15. Avishkar Bhoopchand & Bethanie Brownfield & Adrian Collister & Agustin Dal Lago & Ashley Edwards & Richard Everett & Alexandre Fréchette & Yanko Gitahy Oliveira & Edward Hughes & Kory W. Mathewson & P, 2023. "Learning few-shot imitation as cultural transmission," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    16. Raeid Saqur, 2024. "What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs," Papers 2406.15508, arXiv.org.
    17. Ni, Ji & Chen, Bowei & Allinson, Nigel M. & Ye, Xujiong, 2020. "A hybrid model for predicting human physical activity status from lifelogging data," European Journal of Operational Research, Elsevier, vol. 281(3), pages 532-542.
    18. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    19. Xuan-Kun Li & Jian-Xu Ma & Xiang-Yu Li & Jun-Jie Hu & Chuan-Yang Ding & Feng-Kai Han & Xiao-Min Guo & Xi Tan & Xian-Min Jin, 2024. "High-efficiency reinforcement learning with hybrid architecture photonic integrated circuit," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    20. Dmitrii Dobriborsci & Roman Zashchitin & Mikhail Kakanov & Wolfgang Aumer & Pavel Osinenko, 2024. "Predictive reinforcement learning: map-less navigation method for mobile robot," Journal of Intelligent Manufacturing, Springer, vol. 35(8), pages 4217-4232, December.