A goal-oriented reinforcement learning for optimal drug dosage control

Author

Listed:

Qian Zhang
(University of Electronic Science and Technology of China)
Tianhao Li
(University of Electronic Science and Technology of China)
Dengfeng Li
(University of Electronic Science and Technology of China)
Wei Lu
(University of Electronic Science and Technology of China)

Abstract

Dosage control of therapeutic drugs is a central concern for clinicians: whether a dosing decision is correct and efficient can determine whether the patient survives. In intensive care units (ICUs), medication decisions form a dynamic, continuous process that is difficult to handle with traditional intelligent technologies. While reinforcement learning (RL) is well suited to sequential decision making, it faces challenges in multi-level problems because of delayed rewards and complex states. Hierarchical reinforcement learning (HRL) is a layered algorithm built on RL. HRL has proved effective on problems with delayed, sparse rewards, and it reduces learning difficulty by dividing a long-term goal into stages. Inspired by this, we propose a goal-oriented reinforcement learning (GORL) approach to optimize drug dosage control for sepsis patients. Specifically, GORL employs two agents that make dosage decisions cooperatively by simulating the behavior of clinicians. GORL decomposes a long-term goal into several short-term goals to reduce the exploration space. For the long-term goal, the goal-oriented concept is introduced to address sparse rewards. A goal-oriented hierarchical structure helps the agents interact and cooperate to achieve the short-term goals. In addition, we design a hindsight intrinsic reward to balance the long-term and short-term goals, and are thus able to learn an optimal drug dosage control policy. We conduct our experiments on MIMIC-IV, one of the largest medical datasets. The experimental results show that our model outperforms the baseline algorithms and learns a more robust treatment policy than clinicians, reducing patient mortality by 10.23%.
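
The abstract does not give the form of the hindsight intrinsic reward, so the following is only a minimal sketch of generic hindsight goal relabeling for a goal-conditioned low-level agent, not the GORL reward itself. The scalar state, the `Transition` fields, the tolerance `tol`, and the 0/-1 reward are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Transition:
    """One low-level step. All fields are hypothetical stand-ins."""
    state: float       # assumed scalar patient indicator before dosing
    action: float      # assumed dose chosen by the low-level agent
    next_state: float  # indicator after dosing
    goal: float        # short-term goal issued by the high-level agent


def intrinsic_reward(next_state: float, goal: float, tol: float = 0.5) -> float:
    """Sparse goal-reaching reward: 0 within `tol` of the goal, else -1."""
    return 0.0 if abs(next_state - goal) <= tol else -1.0


def hindsight_relabel(segment: List[Transition]) -> List[Tuple[Transition, float]]:
    """Replace each goal with the state actually reached at the end of the
    segment, then recompute the intrinsic reward against that goal."""
    achieved = segment[-1].next_state
    relabeled = []
    for t in segment:
        new_t = Transition(t.state, t.action, t.next_state, achieved)
        relabeled.append((new_t, intrinsic_reward(t.next_state, achieved)))
    return relabeled


# A segment that never reaches its original goal of 10.0.
segment = [
    Transition(state=1.0, action=0.2, next_state=2.0, goal=10.0),
    Transition(state=2.0, action=0.3, next_state=3.0, goal=10.0),
]
original_rewards = [intrinsic_reward(t.next_state, t.goal) for t in segment]
relabeled = hindsight_relabel(segment)
relabeled_rewards = [r for _, r in relabeled]
```

In this sketch the original segment earns only -1 rewards, while the relabeled copy ends with a 0 reward, which is how hindsight relabeling turns a failed short-term goal into a usable learning signal under sparse rewards.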

Suggested Citation

Qian Zhang & Tianhao Li & Dengfeng Li & Wei Lu, 2024. "A goal-oriented reinforcement learning for optimal drug dosage control," Annals of Operations Research, Springer, vol. 338(2), pages 1403-1423, July.

Handle: RePEc:spr:annopr:v:338:y:2024:i:2:d:10.1007_s10479-024-06029-x
DOI: 10.1007/s10479-024-06029-x

More about this item

Keywords

Goal-oriented; Reinforcement learning; Hierarchical decision; Multi-agent; Drug dosage control