Author
Listed:
- Jacobs, Stef
- Ghane, Sara
- Houben, Pieter Jan
- Kabbara, Zakarya
- Huybrechts, Thomas
- Hellinckx, Peter
- Verhaert, Ivan
Abstract
Deep reinforcement learning (DRL) can be used to optimise the performance of Collective Heating Systems (CHS) by reducing operational costs while ensuring thermal comfort. However, heating systems often respond slowly to control inputs due to thermal inertia, which delays the effects of actions such as adapting temperature set points. This delayed feedback complicates the learning process for DRL agents, as it becomes harder to associate specific control actions with their outcomes. To address this challenge, this study evaluates four hyperparameter schemes during training. The focus lies on schemes with a varying learning rate (the rate at which weights in neural networks are adapted) and/or discount factor (the importance the DRL agent attaches to future rewards). In this respect, we introduce the GALER approach, which combines a progressive increase of the discount factor with a reduction of the learning rate throughout the training process. The effectiveness of the four learning schemes is evaluated using the actor-critic Proximal Policy Optimization (PPO) algorithm for three types of CHS with a multi-objective reward function balancing thermal comfort against energy use or operational costs. The results demonstrate that energy-based reward functions offer limited optimisation potential, while the GALER scheme yields the highest potential for price-based optimisation across all considered concepts, achieving a 3%–15% performance improvement over the other successful training schemes. DRL agents trained with GALER schemes strategically anticipate high-price periods by lowering the supply temperature in advance, and vice versa. This research highlights the advantage of varying both learning rates and discount factors when training DRL agents to operate in complex multi-objective environments with slow responsiveness.
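The GALER idea described in the abstract, i.e. progressively raising the discount factor while lowering the learning rate over training, can be sketched as a simple linear annealing schedule. The start/end values and training horizon below are illustrative assumptions for the sketch, not the settings reported in the paper, and the schedule shape (linear) is likewise an assumption.

```python
def galer_schedule(step, total_steps,
                   lr_start=3e-4, lr_end=3e-5,
                   gamma_start=0.90, gamma_end=0.99):
    """Linearly anneal the learning rate down and the discount factor up.

    Early in training the agent adapts its network weights quickly but
    values mostly short-horizon rewards; late in training it updates
    cautiously while attaching more weight to delayed effects, which
    suits slow-responding thermal systems.
    """
    frac = min(step / total_steps, 1.0)  # training progress in [0, 1]
    lr = lr_start + frac * (lr_end - lr_start)        # decreasing
    gamma = gamma_start + frac * (gamma_end - gamma_start)  # increasing
    return lr, gamma

# At each PPO update, the current lr and gamma would be fed to the
# optimiser and the advantage/return computation, respectively.
lr0, gamma0 = galer_schedule(0, 1_000_000)           # fast, myopic
lrN, gammaN = galer_schedule(1_000_000, 1_000_000)   # slow, far-sighted
```

In practice this schedule would be queried once per policy update; PPO implementations that accept a learning-rate schedule as a function of training progress can use the `lr` output directly, while the time-varying `gamma` must be applied when computing discounted returns and advantages.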
Suggested Citation
Jacobs, Stef & Ghane, Sara & Houben, Pieter Jan & Kabbara, Zakarya & Huybrechts, Thomas & Hellinckx, Peter & Verhaert, Ivan, 2025.
"Improving the learning process of deep reinforcement learning agents operating in collective heating environments,"
Applied Energy, Elsevier, vol. 384(C).
Handle:
RePEc:eee:appene:v:384:y:2025:i:c:s0306261925001503
DOI: 10.1016/j.apenergy.2025.125420