
On Convergence Rate of MRetrace

Author

Listed:
  • Xingguo Chen

    (Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)

  • Wangrong Qin

    (Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)

  • Yu Gong

    (Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)

  • Shangdong Yang

    (Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)

  • Wenhao Wang

    (College of Electronic Engineering, National University of Defense Technology, Changsha 410073, China;
    Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

Abstract

Off-policy is a key setting for reinforcement learning algorithms. In recent years, the stability of off-policy learning for value-based reinforcement learning has been guaranteed even when combined with linear function approximation and bootstrapping. Convergence rate analysis is currently a hot topic. However, the convergence rates of learning algorithms vary, and analyzing the reasons behind this remains an open problem. In this paper, we propose an essentially simplified version of a convergence rate to generate general off-policy temporal difference learning algorithms. We emphasize that the primary determinant influencing convergence rate is the minimum eigenvalue of the key matrix. Furthermore, we conduct a comparative analysis of the influencing factor across various off-policy learning algorithms in diverse numerical scenarios. The experimental findings validate the proposed determinant, which serves as a benchmark for the design of more efficient learning algorithms.
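
The abstract's central claim, that the minimum eigenvalue of the key matrix governs the convergence rate, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 2x2 symmetric positive definite matrix `A` is a hypothetical stand-in for a key matrix, the iteration is the generic expected linear update theta <- theta - alpha * A (theta - theta_star) rather than MRetrace itself, and the helper names are invented. For such a symmetric `A` and a small enough step size, the error norm shrinks at least as fast as (1 - alpha * lambda_min)^k.

```python
import math

def min_eig_sym2(a11, a12, a22):
    """Smallest eigenvalue of the symmetric matrix [[a11, a12], [a12, a22]]."""
    mean = (a11 + a22) / 2.0
    radius = math.hypot((a11 - a22) / 2.0, a12)
    return mean - radius

def expected_update_errors(A, theta0, theta_star, alpha, steps):
    """Run theta <- theta - alpha * A (theta - theta_star) and
    return the Euclidean error norm before each step and after the last."""
    theta = list(theta0)
    errors = []
    for _ in range(steps + 1):
        e = [theta[0] - theta_star[0], theta[1] - theta_star[1]]
        errors.append(math.hypot(e[0], e[1]))
        g0 = A[0][0] * e[0] + A[0][1] * e[1]
        g1 = A[1][0] * e[0] + A[1][1] * e[1]
        theta = [theta[0] - alpha * g0, theta[1] - alpha * g1]
    return errors

# Hypothetical key matrix (illustrative values only, not from the paper).
A = [[2.0, 0.5], [0.5, 1.0]]
alpha = 0.1

lam_min = min_eig_sym2(A[0][0], A[0][1], A[1][1])
rate = 1.0 - alpha * lam_min  # per-step contraction bound for this A and alpha
errors = expected_update_errors(A, [1.0, -1.0], [0.0, 0.0], alpha, 50)

for k in (0, 10, 25, 50):
    print(k, errors[k], rate ** k * errors[0])
```

In this toy setting, a larger minimum eigenvalue gives a smaller `rate` and hence a faster guaranteed decay. Key matrices arising in off-policy learning are generally not symmetric, so the bound used here should be read as intuition only.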

Suggested Citation

  • Xingguo Chen & Wangrong Qin & Yu Gong & Shangdong Yang & Wenhao Wang, 2024. "On Convergence Rate of MRetrace," Mathematics, MDPI, vol. 12(18), pages 1-19, September.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:18:p:2930-:d:1482100

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/18/2930/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/18/2930/
    Download Restriction: no