Author
Listed:
- Qian Zhang
(University of Electronic Science and Technology of China)
- Tianhao Li
(University of Electronic Science and Technology of China)
- Dengfeng Li
(University of Electronic Science and Technology of China)
- Wei Lu
(University of Electronic Science and Technology of China)
Abstract
The dosage control of therapeutic drugs is a concern for clinicians. Whether the clinician’s dosing decision is correct and efficient determines patient’s life. In intensive care units (ICU), medication decision is a dynamic and continuous process, which is difficult to solve by traditional intelligent technologies. while reinforcement learning (RL) has an advantage in handling sequential decision making, it faces challenges in multi-level problems because of the delayed rewards and complex states. Hierarchical reinforcement learning (HRL) is a layered algorithm based on RL. HRL has been proved to be effective in delayed sparse reward issues and reduce the learning difficulty by dividing the long-term goal into stages. Inspired by this, we propose a goal-oriented reinforcement learning (GORL) approach to optimize the drug dosage control for sepsis patients. Specifically, GORL employs two agents to make dosage decisions cooperatively by simulating the behaviors of clinicians. GORL decompose a long-term goal into several short-term goals to reduce the exploration space. In the long-term goal, the concept of the goal-oriented is introduced to solve the sparse reward. A goal-oriented hierarchical structure can help agents to interact and cooperate to achieve the short-term goal. In addition, we design a hindsight intrinsic reward to balance the long-term and short-term goals, and are thus able to learn an optimal policy of drug dosage control. We conduct our experiments on MIMIC-IV, which is one of the biggest medical datasets. The experimental results show that our model outperforms other baseline algorithms and can learn a more robust treatment policy than clinicians, with reducing the patient’s mortality by 10.23%.
Suggested Citation
Qian Zhang & Tianhao Li & Dengfeng Li & Wei Lu, 2024.
"A goal-oriented reinforcement learning for optimal drug dosage control,"
Annals of Operations Research, Springer, vol. 338(2), pages 1403-1423, July.
Handle:
RePEc:spr:annopr:v:338:y:2024:i:2:d:10.1007_s10479-024-06029-x
DOI: 10.1007/s10479-024-06029-x
Download full text from publisher
As the access to this document is restricted, you may want to search for a different version of it.
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:annopr:v:338:y:2024:i:2:d:10.1007_s10479-024-06029-x. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.