Authors
- Zhida Guo
(School of Economics and Management, Dalian Jiaotong University, Dalian 116021, China)
- Jingyuan Fu
(Faculty of Education, The University of Hong Kong, Hong Kong 999077, China)
- Peng Sun
(Institute of Computing Technology, China Academy of Railway Sciences, Beijing 100081, China)
Abstract
Reinforcement learning is an important machine learning method and has become a popular research topic in recent years. Combining reinforcement learning with recommendation systems is an important application scenario that has received close attention from researchers. In this paper, we first propose a feature engineering method based on label distribution learning, which analyzes historical behavior and constructs feature vectors for users and products. Then, we propose a recommendation algorithm based on value distribution reinforcement learning. We first model the recommendation process as a stochastic process, describe the user’s state during the interaction (including both explicit and implicit state information), and dynamically generate product recommendations from user feedback. Next, by studying hybrid recommendation strategies, we combine the user’s dynamic and static information to make full use of it and achieve high-quality recommendations. Finally, the algorithm is implemented and validated, and it is compared against several relevant baseline models to demonstrate its effectiveness. In this study, we test the advantages of models based on nonlinear expectations over other homogeneous individual models. Using nonlinear expectations in recommendation systems considerably increased their accuracy, data utilization, robustness, convergence speed, and stability. We incorporate the idea of nonlinear expectations into both the design and the implementation of recommendation systems.
The main practical value of the improved recommendation model is that it is more accurate than other recommendation models at the same level of computing power. Moreover, because the enhanced model contains more information, it provides theoretical support and a basis for algorithms that deliver high-quality recommendation services, and it has broad application prospects.
Suggested Citation
Zhida Guo & Jingyuan Fu & Peng Sun, 2023.
"Reinforcement Learning Recommendation Algorithm Based on Label Value Distribution,"
Mathematics, MDPI, vol. 11(13), pages 1-15, June.
Handle:
RePEc:gam:jmathe:v:11:y:2023:i:13:p:2895-:d:1181296