
Combining Correlation-Based and Reward-Based Learning in Neural Control for Policy Improvement

Authors
  • Poramate Manoonpong

    (Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany;
    ATR Computational Neuroscience Laboratories, 2-2-2 Hikaridai Seika-cho, Soraku-gun, Kyoto 619-0288, Japan)

  • Christoph Kolodziejski

    (Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany)

  • Florentin Wörgötter

    (Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany)

  • Jun Morimoto

    (Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany;
    ATR Computational Neuroscience Laboratories, 2-2-2 Hikaridai Seika-cho, Soraku-gun, Kyoto 619-0288, Japan)

Abstract

Classical conditioning (conventionally modeled as correlation-based learning) and operant conditioning (conventionally modeled as reinforcement learning, or reward-based learning) have both been observed in biological systems. Evidence shows that both mechanisms fundamentally involve learning about associations. Based on these biological findings, we propose a new learning model for achieving successful control policies in artificial systems. The model combines correlation-based learning, using input correlation learning (ICO learning), with reward-based learning, using continuous actor–critic reinforcement learning (RL), and thereby works as a dual-learner system. Its performance is evaluated in simulations of a cart-pole system, as a dynamic motion control problem, and a mobile robot system, as a goal-directed behavior control problem. The results show that the model strongly improves the pole-balancing control policy: the controller learns to stabilize the pole over a larger domain of initial conditions than either learning mechanism achieves alone. The model also finds a successful control policy for goal-directed behavior: the robot learns to approach a given goal more effectively than with either of its components alone. The study thus sharpens our understanding of how two different learning mechanisms can be combined and complement each other in solving complex tasks.
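To make the combination concrete, the sketch below illustrates the dual-learner idea in Python: an ICO learner whose weights track the correlation between predictive inputs and the temporal derivative of a reflex signal, and a linear actor–critic whose weights track the TD error, with the two outputs summed into a single control command. This is a minimal sketch under stated assumptions; all class names, feature dimensions, and learning rates are illustrative, not the paper's implementation.

```python
import numpy as np


class ICOLearner:
    """Correlation-based learner (ICO learning): weights of predictive
    inputs u grow with the correlation between each input and the
    temporal derivative of the reflex signal u0."""

    def __init__(self, n_inputs, lr=0.01):
        self.w = np.zeros(n_inputs)
        self.lr = lr
        self.prev_u0 = 0.0

    def output(self, u, u0):
        # Fixed reflex gain of 1.0 on u0 (an assumption).
        return float(self.w @ u) + u0

    def update(self, u, u0):
        du0 = u0 - self.prev_u0        # discrete-time derivative of u0
        self.w += self.lr * u * du0    # ICO rule: dw_i/dt ~ u_i * du0/dt
        self.prev_u0 = u0


class ActorCritic:
    """Reward-based learner: linear TD(0) critic and a linear actor with
    Gaussian exploration (a common continuous actor-critic scheme; the
    paper's exact variant may differ)."""

    def __init__(self, n_features, lr_v=0.05, lr_a=0.01, gamma=0.95, sigma=0.1):
        self.v = np.zeros(n_features)  # critic weights
        self.a = np.zeros(n_features)  # actor weights
        self.lr_v, self.lr_a = lr_v, lr_a
        self.gamma, self.sigma = gamma, sigma

    def act(self, phi):
        noise = np.random.normal(0.0, self.sigma)
        return float(self.a @ phi) + noise, noise

    def update(self, phi, phi_next, reward, noise):
        # TD error drives both critic and actor updates.
        td_error = reward + self.gamma * float(self.v @ phi_next) - float(self.v @ phi)
        self.v += self.lr_v * td_error * phi          # critic: TD(0) step
        self.a += self.lr_a * td_error * noise * phi  # actor: exploration-weighted step


# One control step of the combined (dual) learner on made-up signals.
rng = np.random.default_rng(0)
ico, rl = ICOLearner(n_inputs=4), ActorCritic(n_features=4)
u, phi = rng.standard_normal(4), rng.standard_normal(4)

rl_action, noise = rl.act(phi)
action = ico.output(u, u0=0.2) + rl_action   # summed control output

# After the plant responds (here: dummy next state and reward):
phi_next, reward = rng.standard_normal(4), -abs(action)
ico.update(u, u0=0.15)
rl.update(phi, phi_next, reward, noise)
print(f"combined action = {action:.3f}")
```

In the paper's tasks the reflex signal would come from the plant (e.g., a pole-angle error) and the summed command would drive the cart or robot; here dummy signals stand in for the plant.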

Suggested Citation

  • Poramate Manoonpong & Christoph Kolodziejski & Florentin Wörgötter & Jun Morimoto, 2013. "Combining Correlation-Based and Reward-Based Learning in Neural Control for Policy Improvement," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 16(02n03), pages 1-38.
  • Handle: RePEc:wsi:acsxxx:v:16:y:2013:i:02n03:n:s021952591350015x
    DOI: 10.1142/S021952591350015X

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S021952591350015X
    Download Restriction: Access to full text is restricted to subscribers

    File URL: https://libkey.io/10.1142/S021952591350015X?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a version you can access through your library subscription.

    As access to this document is restricted, you may want to search for a different version of it.

