Printed from https://ideas.repec.org/a/plo/pcbi00/1007944.html

Combined model-free and model-sensitive reinforcement learning in non-human primates

Author

Listed:
  • Bruno Miranda
  • W M Nishantha Malalasekera
  • Timothy E Behrens
  • Peter Dayan
  • Steven W Kennerley

Abstract

Contemporary reinforcement learning (RL) theory suggests that potential choices can be evaluated by strategies that may or may not be sensitive to the computational structure of tasks. A paradigmatic model-free (MF) strategy simply repeats actions that have been rewarded in the past; by contrast, model-sensitive (MS) strategies exploit richer information associated with knowledge of task dynamics. MF and MS strategies should typically be combined, because they have complementary statistical and computational strengths; however, this tradeoff between MF/MS RL has mostly only been demonstrated in humans, often with only modest numbers of trials. We trained rhesus monkeys to perform a two-stage decision task designed to elicit and discriminate the use of MF and MS methods. A descriptive analysis of choice behaviour revealed directly that the structure of the task (of MS importance) and the reward history (of MF and MS importance) significantly influenced both choice and response vigour. A detailed, trial-by-trial computational analysis confirmed that choices were made according to a combination of strategies, with a dominant influence of a particular form of model sensitivity that persisted over weeks of testing. The residuals from this model necessitated development of a new combined RL model which incorporates a particular credit assignment weighting procedure. Finally, response vigor exhibited a subtly different collection of MF and MS influences. These results provide new illumination onto RL behavioural processes in non-human primates.

Author summary: We routinely solve planning problems in which present decisions have consequences in the future. These pose complex computational and statistical problems and are addressed by multiple systems in the brain which use different solutions to these problems, and which may compete and cooperate. We trained two rhesus monkeys on a paradigmatic planning task that transparently reveals canonical aspects of different strategies. We performed a detailed behavioral analysis using methods of reinforcement learning on choice and reaction time to reveal conjoint influences and structural interactions of different sources of information. We show the strengths and limitations of these analyses, at the same time as we provide a novel perspective on how different learning systems interact for choice in non-human primates.
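
The contrast the abstract draws between MF and MS evaluation can be illustrated with a minimal sketch of a generic hybrid agent on a two-stage task. This is not the authors' model: the transition probabilities, the learning rate `alpha`, the mixing weight `w`, the inverse temperature `beta`, and all function names are hypothetical choices made only for illustration.

```python
import math

# Hypothetical two-stage task: each first-stage action leads to one of two
# second-stage states through a "common" (0.7) or "rare" (0.3) transition.
# These probabilities are illustrative assumptions, not values from the study.
TRANSITIONS = {0: (0.7, 0.3), 1: (0.3, 0.7)}  # P(state | first-stage action)


def mf_update(q_mf, action, reward, alpha=0.5):
    """Model-free: move the chosen action's value toward the obtained reward."""
    q = list(q_mf)
    q[action] += alpha * (reward - q[action])
    return q


def ms_values(state_values):
    """Model-sensitive: weight second-stage values by transition probabilities."""
    return [
        sum(p * v for p, v in zip(TRANSITIONS[a], state_values))
        for a in (0, 1)
    ]


def combined_values(q_mf, state_values, w=0.6):
    """Mix both estimates; w is an assumed weight on the model-sensitive part."""
    q_ms = ms_values(state_values)
    return [w * ms + (1 - w) * mf for ms, mf in zip(q_ms, q_mf)]


def softmax_choice_probs(values, beta=3.0):
    """Softmax choice rule with inverse temperature beta."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# A rewarded trial after a rare transition: action 0 led to state 1.
q_mf = mf_update([0.0, 0.0], action=0, reward=1.0)  # MF credits action 0
state_values = [0.0, 0.5]                           # state 1 now looks better
q_ms = ms_values(state_values)                      # MS favours action 1
probs = softmax_choice_probs(combined_values(q_mf, state_values))
```

In this sketch the two evaluations disagree after a rewarded rare transition: the MF values favour repeating action 0, while the MS values favour action 1, which more often reaches the rewarded state. Divergences of this kind are what two-stage tasks are designed to expose.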

Suggested Citation

  • Bruno Miranda & W M Nishantha Malalasekera & Timothy E Behrens & Peter Dayan & Steven W Kennerley, 2020. "Combined model-free and model-sensitive reinforcement learning in non-human primates," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-25, June.
  • Handle: RePEc:plo:pcbi00:1007944
    DOI: 10.1371/journal.pcbi.1007944

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007944
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007944&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1007944?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Cited by:

    1. Andrew Mah & Shannon S. Schiereck & Veronica Bossio & Christine M. Constantinople, 2023. "Distinct value computations support rapid sequential decisions," Nature Communications, Nature, vol. 14(1), pages 1-14, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wouter Kool & Fiery A Cushman & Samuel J Gershman, 2016. "When Does Model-Based Control Pay Off?," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-34, August.
    2. He A Xu & Alireza Modirshanechi & Marco P Lehmann & Wulfram Gerstner & Michael H Herzog, 2021. "Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making," PLOS Computational Biology, Public Library of Science, vol. 17(6), pages 1-32, June.
    3. Amir Dezfouli & Bernard W Balleine, 2019. "Learning the structure of the world: The adaptive nature of state-space and action representations in multi-stage decision-making," PLOS Computational Biology, Public Library of Science, vol. 15(9), pages 1-22, September.
    4. Carolina Feher da Silva & Todd A Hare, 2018. "A note on the analysis of two-stage task results: How changes in task structure affect what model-free and model-based strategies predict about the effects of reward and transition on the stay probabi," PLOS ONE, Public Library of Science, vol. 13(4), pages 1-13, April.
    5. Thomas Akam & Rui Costa & Peter Dayan, 2015. "Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task," PLOS Computational Biology, Public Library of Science, vol. 11(12), pages 1-25, December.
    6. Nitzan Shahar & Tobias U Hauser & Michael Moutoussis & Rani Moran & Mehdi Keramati & NSPN consortium & Raymond J Dolan, 2019. "Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-25, February.
    7. Dominic Holland & Oleksandr Frei & Rahul Desikan & Chun-Chieh Fan & Alexey A Shadrin & Olav B Smeland & V S Sundar & Paul Thompson & Ole A Andreassen & Anders M Dale, 2020. "Beyond SNP heritability: Polygenicity and discoverability of phenotypes estimated with a univariate Gaussian mixture model," PLOS Genetics, Public Library of Science, vol. 16(5), pages 1-30, May.
    8. Julia Berezutskaya & Zachary V Freudenburg & Umut Güçlü & Marcel A J van Gerven & Nick F Ramsey, 2020. "Brain-optimized extraction of complex sound features that drive continuous auditory perception," PLOS Computational Biology, Public Library of Science, vol. 16(7), pages 1-34, July.
    9. Abigail B. Schneider & Bridget Leonard, 2022. "From anxiety to control: Mask‐wearing, perceived marketplace influence, and emotional well‐being during the COVID‐19 pandemic," Journal of Consumer Affairs, Wiley Blackwell, vol. 56(1), pages 97-119, March.
    10. Geonhui Lee & Woong Choi & Hanjin Jo & Wookhyun Park & Jaehyo Kim, 2020. "Analysis of motor control strategy for frontal and sagittal planes of circular tracking movements using visual feedback noise from velocity change and depth information," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-22, November.
    11. Odelaisy León-Triana & Julián Pérez-Beteta & David Albillo & Ana Ortiz de Mendivil & Luis Pérez-Romasanta & Elisabet González-Del Portillo & Manuel Llorente & Natalia Carballo & Estanislao Arana & Víc, 2021. "Brain Metastasis Response to Stereotactic Radio Surgery: A Mathematical Approach," Mathematics, MDPI, vol. 9(7), pages 1-19, March.
    12. Mirren Charnley & Saba Islam & Guneet K. Bindra & Jeremy Engwirda & Julian Ratcliffe & Jiangtao Zhou & Raffaele Mezzenga & Mark D. Hulett & Kyunghoon Han & Joshua T. Berryman & Nicholas P. Reynolds, 2022. "Neurotoxic amyloidogenic peptides in the proteome of SARS-COV2: potential implications for neurological symptoms in COVID-19," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    13. Samy Castro & Wael El-Deredy & Demian Battaglia & Patricio Orio, 2020. "Cortical ignition dynamics is tightly linked to the core organisation of the human connectome," PLOS Computational Biology, Public Library of Science, vol. 16(7), pages 1-23, July.
    14. Nguyen, Ha Trong & Brinkman, Sally & Le, Huong Thu & Zubrick, Stephen R. & Mitrou, Francis, 2022. "Gender differences in time allocation contribute to differences in developmental outcomes in children and adolescents," Economics of Education Review, Elsevier, vol. 89(C).
    15. Gregor Wolbring, 2022. "Auditing the ‘Social’ of Quantum Technologies: A Scoping Review," Societies, MDPI, vol. 12(2), pages 1-38, March.
    16. April R. Kriebel & Joshua D. Welch, 2022. "UINMF performs mosaic integration of single-cell multi-omic datasets using nonnegative matrix factorization," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    17. Boada, Júlia Pareto & Maestre, Begoña Román & Genís, Carme Torras, 2021. "The ethical issues of social assistive robotics: A critical literature review," Technology in Society, Elsevier, vol. 67(C).
    18. Hamed Nili & Alexander Walther & Arjen Alink & Nikolaus Kriegeskorte, 2020. "Inferring exemplar discriminability in brain representations," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-28, June.
    19. Valtteri Arstila & Alexandra L Georgescu & Henri Pesonen & Daniel Lunn & Valdas Noreika & Christine M Falter-Wagner, 2020. "Event timing in human vision: Modulating factors and independent functions," PLOS ONE, Public Library of Science, vol. 15(8), pages 1-22, August.
    20. Don, Arjuna P.H. & Peters, James F. & Ramanna, Sheela & Tozzi, Arturo, 2021. "Quaternionic views of rs-fMRI hierarchical brain activation regions. Discovery of multilevel brain activation region intensities in rs-fMRI video frames," Chaos, Solitons & Fractals, Elsevier, vol. 152(C).
