Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task

Author

Listed:

Thomas Akam
Rui Costa
Peter Dayan

Abstract

The recently developed ‘two-step’ behavioural task promises to differentiate model-based from model-free reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted its widespread adoption. Here, we analyse the interactions between a range of different strategies and the structure of transitions and outcomes in order to examine constraints on what can be learned from behavioural performance. The task involves a trade-off between the need for stochasticity, to allow strategies to be discriminated, and a need for determinism, so that it is worth subjects’ investment of effort to exploit the contingencies optimally. We show through simulation that under certain conditions model-free strategies can masquerade as being model-based. We first show that seemingly innocuous modifications to the task structure can induce correlations between action values at the start of the trial and the subsequent trial events in such a way that analysis based on comparing successive trials can lead to erroneous conclusions. We confirm the power of a suggested correction to the analysis that can alleviate this problem. We then consider model-free reinforcement learning strategies that exploit correlations between where rewards are obtained and which actions have high expected value. These generate behaviour that appears model-based under these, and also more sophisticated, analyses. Exploiting the full potential of the two-step task as a tool for behavioural neuroscience requires an understanding of these issues.

Author Summary: Planning is the use of a predictive model of the consequences of actions to guide decision making. Planning plays a critical role in human behaviour, but isolating its contribution is challenging because it is complemented by control systems which learn values of actions directly from the history of reinforcement, resulting in automatized mappings from states to actions often termed habits. Our study examined a recently developed behavioural task which uses choices in a multi-step decision tree to differentiate planning from value-based control. We compared various strategies using simulations, showing a range that produce behaviour that resembles planning but in fact arises as a fixed mapping from particular sorts of states to action. These results show that when a planning problem is faced repeatedly, sophisticated automatization strategies may be developed which identify that there are in fact a limited number of relevant states of the world each with an appropriate fixed or habitual response. Understanding such strategies is important for the design and interpretation of tasks which aim to isolate the contribution of planning to behaviour. Such strategies are also of independent scientific interest as they may contribute to automatization of behaviour in complex environments.
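
The kind of successive-trial analysis the abstract refers to can be illustrated with a small simulation. The Python sketch below is not the authors' code or task: it assumes a reduced two-step task with no second-step choice, a common-transition probability of 0.7, reward probabilities of 0.8 and 0.2 that swap every 50 trials, and two textbook agents (a model-free learner that updates the chosen first-step action directly from the outcome, and a model-based learner that plans through the known transition probabilities). All names and parameter values (`simulate`, `stay_probabilities`, `P_COMMON`, `BLOCK_LEN`, `alpha`, `beta`) are hypothetical choices made for illustration.

```python
import math
import random

P_COMMON = 0.7   # assumed probability of the common transition
BLOCK_LEN = 50   # assumed: reward probabilities swap every 50 trials


def simulate(agent, n_trials=20000, alpha=0.5, beta=5.0, seed=0):
    """Run one agent ('mf' or 'mb'); return (choice, common, rewarded) per trial."""
    rng = random.Random(seed)
    p_reward = [0.8, 0.2]  # reward probability in each second-step state
    q_mf = [0.0, 0.0]      # model-free values of the two first-step actions
    v = [0.0, 0.0]         # learned values of the two second-step states
    trials = []
    for t in range(n_trials):
        if t > 0 and t % BLOCK_LEN == 0:
            p_reward.reverse()
        if agent == "mb":
            # Plan through the (assumed known) transition model.
            q = [P_COMMON * v[a] + (1 - P_COMMON) * v[1 - a] for a in (0, 1)]
        else:
            q = q_mf
        # Softmax over two actions reduces to a logistic function.
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        choice = 1 if rng.random() < p_choose_1 else 0
        common = rng.random() < P_COMMON
        state = choice if common else 1 - choice
        rewarded = rng.random() < p_reward[state]
        r = 1.0 if rewarded else 0.0
        v[state] += alpha * (r - v[state])          # second-step state value
        q_mf[choice] += alpha * (r - q_mf[choice])  # direct action-value update
        trials.append((choice, common, rewarded))
    return trials


def stay_probabilities(trials):
    """P(repeat the first-step choice), keyed by (rewarded, common) on the previous trial."""
    counts = {}
    for (c0, common, rewarded), (c1, _, _) in zip(trials, trials[1:]):
        stays, total = counts.get((rewarded, common), (0, 0))
        counts[(rewarded, common)] = (stays + (c1 == c0), total + 1)
    return {key: stays / total for key, (stays, total) in counts.items()}


def reward_effect(p):
    """Main effect of reward on staying, averaged over transition type."""
    return (p[(True, True)] + p[(True, False)]
            - p[(False, True)] - p[(False, False)]) / 2


def interaction(p):
    """Reward-by-transition interaction, the usual signature of planning."""
    return (p[(True, True)] - p[(True, False)]) - (p[(False, True)] - p[(False, False)])


if __name__ == "__main__":
    for name in ("mf", "mb"):
        p = stay_probabilities(simulate(name))
        print(name, round(reward_effect(p), 3), round(interaction(p), 3))
```

In this sketch the interaction term is what would conventionally be read as evidence of model-based control. The abstract's caution applies directly: the blocked reward schedule assumed here is itself a task modification of the kind that can correlate start-of-trial action values with subsequent trial events, so a non-zero interaction for the model-free agent should not be taken as a bug.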

Suggested Citation

Thomas Akam & Rui Costa & Peter Dayan, 2015. "Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task," PLOS Computational Biology, Public Library of Science, vol. 11(12), pages 1-25, December.

Handle: RePEc:plo:pcbi00:1004648
DOI: 10.1371/journal.pcbi.1004648

References listed on IDEAS

Amir Dezfouli & Bernard W Balleine, 2013. "Actions, Action Sequences and Habits: Evidence That Goal-Directed and Habitual Action Control Are Hierarchically Organized," PLOS Computational Biology, Public Library of Science, vol. 9(12), pages 1-14, December.
Petr Znamenskiy & Anthony M. Zador, 2013. "Corticostriatal neurons in auditory cortex drive decisions during auditory discrimination," Nature, Nature, vol. 497(7450), pages 482-485, May.
Christina M. Gremel & Rui M. Costa, 2013. "Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions," Nature Communications, Nature, vol. 4(1), pages 1-12, October.
Peter Smittenaar & George Prichard & Thomas H B FitzGerald & Joern Diedrichsen & Raymond J Dolan, 2014. "Transcranial Direct Current Stimulation of Right Dorsolateral Prefrontal Cortex Does Not Affect Model-Based or Model-Free Reinforcement Learning in Humans," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-8, January.
Nicky J. Welton & Howard H. Z. Thom, 2015. "Value of Information," Medical Decision Making, vol. 35(5), pages 564-566, July.

Citations

Cited by:

Bruno Miranda & W M Nishantha Malalasekera & Timothy E Behrens & Peter Dayan & Steven W Kennerley, 2020. "Combined model-free and model-sensitive reinforcement learning in non-human primates," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-25, June.
Julie J Lee & Mehdi Keramati, 2017. "Flexibility to contingency changes distinguishes habitual and goal-directed strategies in humans," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-15, September.
Evan M Russek & Ida Momennejad & Matthew M Botvinick & Samuel J Gershman & Nathaniel D Daw, 2017. "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-35, September.
Wouter Kool & Fiery A Cushman & Samuel J Gershman, 2016. "When Does Model-Based Control Pay Off?," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-34, August.
Jaron T Colas & Wolfgang M Pauli & Tobias Larsen & J Michael Tyszka & John P O’Doherty, 2017. "Distinct prediction errors in mesostriatal circuits of the human brain mediate learning about the values of both states and actions: evidence from high-resolution fMRI," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-32, October.
Zhewei Zhang & Yuji K. Takahashi & Marlian Montesinos-Cartegena & Thorsten Kahnt & Angela J. Langdon & Geoffrey Schoenbaum, 2024. "Expectancy-related changes in firing of dopamine neurons depend on hippocampus," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
He A Xu & Alireza Modirshanechi & Marco P Lehmann & Wulfram Gerstner & Michael H Herzog, 2021. "Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making," PLOS Computational Biology, Public Library of Science, vol. 17(6), pages 1-32, June.
Amir Dezfouli & Bernard W Balleine, 2019. "Learning the structure of the world: The adaptive nature of state-space and action representations in multi-stage decision-making," PLOS Computational Biology, Public Library of Science, vol. 15(9), pages 1-22, September.
Minsu Abel Yang & Min Whan Jung & Sang Wan Lee, 2025. "Striatal arbitration between choice strategies guides few-shot adaptation," Nature Communications, Nature, vol. 16(1), pages 1-26, December.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Wouter Kool & Fiery A Cushman & Samuel J Gershman, 2016. "When Does Model-Based Control Pay Off?," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-34, August.
Nitzan Shahar & Tobias U Hauser & Michael Moutoussis & Rani Moran & Mehdi Keramati & NSPN consortium & Raymond J Dolan, 2019. "Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-25, February.
Bruno Miranda & W M Nishantha Malalasekera & Timothy E Behrens & Peter Dayan & Steven W Kennerley, 2020. "Combined model-free and model-sensitive reinforcement learning in non-human primates," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-25, June.
Vincenzo Varriale & Antonello Cammarano & Francesca Michelino & Mauro Caputo, 2021. "Sustainable Supply Chains with Blockchain, IoT and RFID: A Simulation on Order Management," Sustainability, MDPI, vol. 13(11), pages 1-23, June.
Valeria Costantini & Francesco Crespi & Giovanni Marin & Elena Paglialunga, 2016. "Eco-innovation, sustainable supply chains and environmental performance in European industries," LEM Papers Series 2016/19, Laboratory of Economics and Management (LEM), Sant'Anna School of Advanced Studies, Pisa, Italy.
Lee, Alice J. & Ames, Daniel R., 2017. "“I can’t pay more” versus “It’s not worth more”: Divergent effects of constraint and disparagement rationales in negotiations," Organizational Behavior and Human Decision Processes, Elsevier, vol. 141(C), pages 16-28.
Hussain, Hadia & Murtaza, Murtaza & Ajmal, Areeb & Ahmed, Afreen & Khan, Muhammad Ovais Khalid, 2020. "A study on the effects of social media advertisement on consumer’s attitude and customer response," MPRA Paper 104675, University Library of Munich, Germany.
A. G. Fatullayev & Nizami A. Gasilov & Şahin Emrah Amrahov, 2019. "Numerical solution of linear inhomogeneous fuzzy delay differential equations," Fuzzy Optimization and Decision Making, Springer, vol. 18(3), pages 315-326, September.
Cyril Chalendard, 2015. "Use of internal information, external information acquisition and customs underreporting," Working Papers halshs-01179445, HAL.
- Cyril CHALENDARD, 2015. "Use of Internal Information, External Information Acquisition and Customs Underreporting," Working Papers 201522, CERDI.
Arun Advani & William Elming & Jonathan Shaw, 2023. "The Dynamic Effects of Tax Audits," The Review of Economics and Statistics, MIT Press, vol. 105(3), pages 545-561, May.
- Arun Advani & William Elming & Jonathan Shaw, 2017. "The dynamic effects of tax audits," IFS Working Papers W17/24, Institute for Fiscal Studies.
- Advani, Arun & Elming, William & Shaw, Jonathan, 2019. "The Dynamic Effects of Tax Audits," CAGE Online Working Paper Series 414, Competitive Advantage in the Global Economy (CAGE).
- Advani, Arun & Elming, William & Shaw, Jonathan, 2019. "The Dynamic Effects of Tax Audits," The Warwick Economics Research Paper Series (TWERPS) 1198, University of Warwick, Department of Economics.
Aghion, Philippe & Akcigit, Ufuk & Lequien, Matthieu & Stantcheva, Stefanie, 2017. "Tax simplicity and heterogeneous learning," LSE Research Online Documents on Economics 86613, London School of Economics and Political Science, LSE Library.
- P. Aghion & U. Akcigit & M. Lequien & S. Stantcheva, 2018. "Tax Simplicity and Heterogeneous Learning," Working papers 665, Banque de France.
- Stantcheva, Stefanie & Aghion, Philippe & Lequien, Matthieu & Akcigit, Ufuk, 2017. "Tax Simplicity and Heterogeneous Learning," CEPR Discussion Papers 12471, C.E.P.R. Discussion Papers.
- Philippe Aghion & Ufuk Akcigit & Matthieu Lequien & Stefanie Stantcheva, 2017. "Tax simplicity and heterogeneous learning," CEP Discussion Papers dp1516, Centre for Economic Performance, LSE.
Marie Bjørneby & Annette Alstadsæter & Kjetil Telle, 2018. "Collusive tax evasion by employers and employees. Evidence from a randomized field experiment in Norway," Discussion Papers 891, Statistics Norway, Research Department.
- Marie Bjørneby & Annette Alstadsæter & Kjetil Telle, 2018. "Collusive Tax Evasion by Employers and Employees: Evidence from a Randomized Field Experiment in Norway," CESifo Working Paper Series 7381, CESifo.
Chuangen Gao & Shuyang Gu & Jiguo Yu & Hai Du & Weili Wu, 2022. "Adaptive seeding for profit maximization in social networks," Journal of Global Optimization, Springer, vol. 82(2), pages 413-432, February.
Koessler, Frederic & Laclau, Marie & Renault, Jérôme & Tomala, Tristan, 2022. "Long information design," Theoretical Economics, Econometric Society, vol. 17(2), May.
- Frédéric Koessler & Marie Laclau & Jérôme Renault & Tristan Tomala, 2021. "Long Information Design," PSE Working Papers halshs-02400053, HAL.
- Frédéric Koessler & Marie Laclau & Jerôme Renault & Tristan Tomala, 2022. "Long information design," PSE-Ecole d'économie de Paris (Postprint) hal-03700394, HAL.
- Koessler, Frédéric & Laclau, Marie & Renault, Jérôme & Tomala, Tristan, 2022. "Long information design," TSE Working Papers 22-1341, Toulouse School of Economics (TSE).
- Marie Laclau & Frédéric Koessler & Jérôme Renault & Tristan Tomala, 2022. "Long Information Design," Post-Print halshs-03342880, HAL.
- Marie Laclau & Frédéric Koessler & Jérôme Renault & Tristan Tomala, 2022. "Long Information Design," PSE-Ecole d'économie de Paris (Postprint) halshs-03342880, HAL.
- Frédéric Koessler & Marie Laclau & Jerôme Renault & Tristan Tomala, 2022. "Long information design," Post-Print hal-03700394, HAL.
- Frédéric Koessler & Marie Laclau & Jérôme Renault & Tristan Tomala, 2021. "Long Information Design," Working Papers halshs-02400053, HAL.
- Frédéric Koessler & Marie Laclau & Jérôme Renault & Tristan Tomala, 2022. "Long Information Design," Post-Print halshs-02400053, HAL.
- Frédéric Koessler & Marie Laclau & Jérôme Renault & Tristan Tomala, 2022. "Long Information Design," PSE-Ecole d'économie de Paris (Postprint) halshs-02400053, HAL.
Jamal El-Den & Pratap Adikhari & Pratap Adikhari, 2017. "Social media in the service of social entrepreneurship: Identifying factors for better services," Journal of Advances in Humanities and Social Sciences, Dr. Yi-Hsing Hsieh, vol. 3(2), pages 105-114.
Annette Alstadsæter & Wojciech Kopczuk & Kjetil Telle, 2019. "Social networks and tax avoidance: evidence from a well-defined Norwegian tax shelter," International Tax and Public Finance, Springer;International Institute of Public Finance, vol. 26(6), pages 1291-1328, December.
- Annette Alstadsæter & Wojciech Kopczuk & Kjetil Telle, 2018. "Social Networks and Tax Avoidance: Evidence from a Well-Defined Norwegian Tax Shelter," NBER Working Papers 25191, National Bureau of Economic Research, Inc.
- Annette Alstadsæter & Wojciech Kopczuk & Kjetil Telle, 2018. "Social networks and tax avoidance. Evidence from a well-defined Norwegian tax shelter," Discussion Papers 886, Statistics Norway, Research Department.
- Kopczuk, Wojciech & Alstadsæter, Annette & Telle, Kjetil, 2018. "Social networks and tax avoidance: Evidence from a well-defined Norwegian tax shelter," CEPR Discussion Papers 13251, C.E.P.R. Discussion Papers.
Xiongnan Jin & Sejin Chun & Jooik Jung & Kyong-Ho Lee, 0. "A fast and scalable approach for IoT service selection based on a physical service model," Information Systems Frontiers, Springer, vol. 0, pages 1-16.
Jun Hong Park & Sang Ho Kook & Hyeonu Im & Soomin Eum & Chulung Lee, 2018. "Fabless Semiconductor Firms’ Financial Performance Determinant Factors: Product Platform Efficiency and Technological Capability," Sustainability, MDPI, vol. 10(10), pages 1-22, September.
Sebastian Kaumanns, 2019. "“Some fuzzy math”: relational information on debt value adjustments by managers and the financial press," Business Research, Springer;German Academic Association for Business Research, vol. 12(2), pages 755-794, December.
Samuel J Gershman, 2015. "A Unifying Probabilistic View of Associative Learning," PLOS Computational Biology, Public Library of Science, vol. 11(11), pages 1-20, November.

