IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1007944.html

Combined model-free and model-sensitive reinforcement learning in non-human primates

Author

Listed:
  • Bruno Miranda
  • W M Nishantha Malalasekera
  • Timothy E Behrens
  • Peter Dayan
  • Steven W Kennerley

Abstract

Contemporary reinforcement learning (RL) theory suggests that potential choices can be evaluated by strategies that may or may not be sensitive to the computational structure of tasks. A paradigmatic model-free (MF) strategy simply repeats actions that have been rewarded in the past; by contrast, model-sensitive (MS) strategies exploit richer information associated with knowledge of task dynamics. MF and MS strategies should typically be combined, because they have complementary statistical and computational strengths; however, this tradeoff between MF/MS RL has mostly only been demonstrated in humans, often with only modest numbers of trials. We trained rhesus monkeys to perform a two-stage decision task designed to elicit and discriminate the use of MF and MS methods. A descriptive analysis of choice behaviour revealed directly that the structure of the task (of MS importance) and the reward history (of MF and MS importance) significantly influenced both choice and response vigour. A detailed, trial-by-trial computational analysis confirmed that choices were made according to a combination of strategies, with a dominant influence of a particular form of model sensitivity that persisted over weeks of testing. The residuals from this model necessitated development of a new combined RL model which incorporates a particular credit assignment weighting procedure. Finally, response vigor exhibited a subtly different collection of MF and MS influences. These results provide new illumination onto RL behavioural processes in non-human primates.

Author summary

We routinely solve planning problems in which present decisions have consequences in the future. These pose complex computational and statistical problems and are addressed by multiple systems in the brain which use different solutions to these problems, and which may compete and cooperate.
We trained two rhesus monkeys on a paradigmatic planning task that transparently reveals canonical aspects of different strategies. We performed a detailed behavioral analysis using methods of reinforcement learning on choice and reaction time to reveal conjoint influences and structural interactions of different sources of information. We show the strengths and limitations of these analyses, at the same time as we provide a novel perspective on how different learning systems interact for choice in non-human primates.
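
The distinction the abstract draws between MF and MS evaluation can be illustrated with a minimal sketch. This is a generic, textbook-style illustration and not the combined model developed in the paper: the function names, the learning rate `alpha`, the mixing weight `w`, the inverse temperature `beta`, the 0.7/0.3 transition probabilities and the second-stage values are all hypothetical choices made for the example.

```python
import math


def mf_update(q, action, reward, alpha=0.5):
    """Model-free step: move the chosen action's value toward the reward received."""
    q = list(q)
    q[action] += alpha * (reward - q[action])
    return q


def ms_values(transition, stage2_values):
    """Model-sensitive step: expected best second-stage value for each first-stage
    action, computed from assumed knowledge of the transition probabilities."""
    best = [max(values) for values in stage2_values]
    return [sum(p * b for p, b in zip(row, best)) for row in transition]


def combined_values(q_mf, q_ms, w):
    """Weighted mixture of the two evaluations (w = 1 is purely model-sensitive)."""
    return [w * ms + (1.0 - w) * mf for mf, ms in zip(q_mf, q_ms)]


def softmax(values, beta=3.0):
    """Turn values into choice probabilities; beta sets choice determinism."""
    m = max(values)
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical task: two first-stage actions, two second-stage states.
transition = [[0.7, 0.3], [0.3, 0.7]]      # assumed P(state | first-stage action)
stage2_values = [[1.0, 0.0], [0.0, 0.2]]   # assumed learned second-stage values

q_mf = mf_update([0.0, 0.0], action=1, reward=1.0)  # action 1 was just rewarded
q_ms = ms_values(transition, stage2_values)
probs = softmax(combined_values(q_mf, q_ms, w=0.5))
print(probs)
```

In this toy setting the MF values favour the action that was just rewarded, while the MS values favour the action whose common transition leads to the better second-stage state; the weight `w` determines which influence dominates the resulting choice probabilities.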

Suggested Citation

  • Bruno Miranda & W M Nishantha Malalasekera & Timothy E Behrens & Peter Dayan & Steven W Kennerley, 2020. "Combined model-free and model-sensitive reinforcement learning in non-human primates," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-25, June.
  • Handle: RePEc:plo:pcbi00:1007944
    DOI: 10.1371/journal.pcbi.1007944

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007944
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007944&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1007944?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Amir Dezfouli & Bernard W Balleine, 2013. "Actions, Action Sequences and Habits: Evidence That Goal-Directed and Habitual Action Control Are Hierarchically Organized," PLOS Computational Biology, Public Library of Science, vol. 9(12), pages 1-14, December.
    2. Thomas Akam & Rui Costa & Peter Dayan, 2015. "Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task," PLOS Computational Biology, Public Library of Science, vol. 11(12), pages 1-25, December.
    3. Peter Smittenaar & George Prichard & Thomas H B FitzGerald & Joern Diedrichsen & Raymond J Dolan, 2014. "Transcranial Direct Current Stimulation of Right Dorsolateral Prefrontal Cortex Does Not Affect Model-Based or Model-Free Reinforcement Learning in Humans," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-8, January.
    4. Wouter Kool & Fiery A Cushman & Samuel J Gershman, 2016. "When Does Model-Based Control Pay Off?," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-34, August.
    5. Paul T E Cusack, 2020. "The Human Brain," Biomedical Journal of Scientific & Technical Research, Biomedical Research Network+, LLC, vol. 31(3), pages 24261-24266, October.

    Citations

    Cited by:

    1. Andrew Mah & Shannon S. Schiereck & Veronica Bossio & Christine M. Constantinople, 2023. "Distinct value computations support rapid sequential decisions," Nature Communications, Nature, vol. 14(1), pages 1-14, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wouter Kool & Fiery A Cushman & Samuel J Gershman, 2016. "When Does Model-Based Control Pay Off?," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-34, August.
    2. Amir Dezfouli & Bernard W Balleine, 2019. "Learning the structure of the world: The adaptive nature of state-space and action representations in multi-stage decision-making," PLOS Computational Biology, Public Library of Science, vol. 15(9), pages 1-22, September.
    3. Thomas Akam & Rui Costa & Peter Dayan, 2015. "Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task," PLOS Computational Biology, Public Library of Science, vol. 11(12), pages 1-25, December.
    4. He A Xu & Alireza Modirshanechi & Marco P Lehmann & Wulfram Gerstner & Michael H Herzog, 2021. "Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making," PLOS Computational Biology, Public Library of Science, vol. 17(6), pages 1-32, June.
    5. Carolina Feher da Silva & Todd A Hare, 2018. "A note on the analysis of two-stage task results: How changes in task structure affect what model-free and model-based strategies predict about the effects of reward and transition on the stay probabi," PLOS ONE, Public Library of Science, vol. 13(4), pages 1-13, April.
    6. Nitzan Shahar & Tobias U Hauser & Michael Moutoussis & Rani Moran & Mehdi Keramati & NSPN consortium & Raymond J Dolan, 2019. "Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-25, February.
    7. Abigail B. Schneider & Bridget Leonard, 2022. "From anxiety to control: Mask‐wearing, perceived marketplace influence, and emotional well‐being during the COVID‐19 pandemic," Journal of Consumer Affairs, Wiley Blackwell, vol. 56(1), pages 97-119, March.
    8. Odelaisy León-Triana & Julián Pérez-Beteta & David Albillo & Ana Ortiz de Mendivil & Luis Pérez-Romasanta & Elisabet González-Del Portillo & Manuel Llorente & Natalia Carballo & Estanislao Arana & Víc, 2021. "Brain Metastasis Response to Stereotactic Radio Surgery: A Mathematical Approach," Mathematics, MDPI, vol. 9(7), pages 1-19, March.
    9. Mirren Charnley & Saba Islam & Guneet K. Bindra & Jeremy Engwirda & Julian Ratcliffe & Jiangtao Zhou & Raffaele Mezzenga & Mark D. Hulett & Kyunghoon Han & Joshua T. Berryman & Nicholas P. Reynolds, 2022. "Neurotoxic amyloidogenic peptides in the proteome of SARS-COV2: potential implications for neurological symptoms in COVID-19," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    10. Hamed Nili & Alexander Walther & Arjen Alink & Nikolaus Kriegeskorte, 2020. "Inferring exemplar discriminability in brain representations," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-28, June.
    11. Linzmajer, Marc & Hubert, Mirja & Hubert, Marco, 2021. "It’s about the process, not the result: An fMRI approach to explore the encoding of explicit and implicit price information," Journal of Economic Psychology, Elsevier, vol. 86(C).
    12. Natalie J Shook & Barış Sevi & Jerin Lee & Benjamin Oosterhoff & Holly N Fitzgerald, 2020. "Disease avoidance in the time of COVID-19: The behavioral immune system is associated with concern and preventative health behaviors," PLOS ONE, Public Library of Science, vol. 15(8), pages 1-15, August.
    13. Cristina Lázaro-Pérez & José Ángel Martínez-López & José Gómez-Galán, 2020. "Addictions in Spanish College Students in Confinement Times: Preventive and Social Perspective," Social Sciences, MDPI, vol. 9(11), pages 1-21, October.
    14. Yashika Arora & Pushpinder Walia & Mitsuhiro Hayashibe & Makii Muthalib & Shubhajit Roy Chowdhury & Stephane Perrey & Anirban Dutta, 2021. "Grey-box modeling and hypothesis testing of functional near-infrared spectroscopy-based cerebrovascular reactivity to anodal high-definition tDCS in healthy humans," PLOS Computational Biology, Public Library of Science, vol. 17(10), pages 1-38, October.
    15. Elvisa Drishti & Bresena Kopliku & Drini Imami, 2022. "Active political engagement, political patronage and local labour markets – The example of Shkoder," International Journal of Manpower, Emerald Group Publishing Limited, vol. 44(6), pages 1118-1142, April.
    16. Julie J Lee & Mehdi Keramati, 2017. "Flexibility to contingency changes distinguishes habitual and goal-directed strategies in humans," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-15, September.
    17. Nguyen, Ha Trong & Brinkman, Sally & Le, Huong Thu & Zubrick, Stephen R. & Mitrou, Francis, 2022. "Gender differences in time allocation contribute to differences in developmental outcomes in children and adolescents," Economics of Education Review, Elsevier, vol. 89(C).
    18. Gricelda Herrera-Franco & Néstor Montalván-Burbano & Carlos Mora-Frank & Lady Bravo-Montero, 2021. "Scientific Research in Ecuador: A Bibliometric Analysis," Publications, MDPI, vol. 9(4), pages 1-34, December.
    19. Sofie L. Valk & Ting Xu & Casey Paquola & Bo-yong Park & Richard A. I. Bethlehem & Reinder Vos de Wael & Jessica Royer & Shahrzad Kharabian Masouleh & Şeyma Bayrak & Peter Kochunov & B. T. Thomas Yeo , 2022. "Genetic and phylogenetic uncoupling of structure and function in human transmodal cortex," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    20. Rosen Valchev & Cosmin Ilut, 2017. "Economic Agents as Imperfect Problem Solvers," 2017 Meeting Papers 1285, Society for Economic Dynamics.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.