Modeling the Violation of Reward Maximization and Invariance in Reinforcement Schedules

Author

Giancarlo La Camera
Barry J Richmond

Abstract

It is often assumed that animals and people adjust their behavior to maximize reward acquisition. In visually cued reinforcement schedules, monkeys make errors in trials that are not immediately rewarded, despite having to repeat error trials. Here we show that error rates are typically smaller in trials equally distant from reward but belonging to longer schedules (referred to as “schedule length effect”). This violates the principles of reward maximization and invariance and cannot be predicted by the standard methods of Reinforcement Learning, such as the method of temporal differences. We develop a heuristic model that accounts for all of the properties of the behavior in the reinforcement schedule task but whose predictions are not different from those of the standard temporal difference model in choice tasks. In the modification of temporal difference learning introduced here, the effect of schedule length emerges spontaneously from the sensitivity to the immediately preceding trial. We also introduce a policy for general Markov Decision Processes, where the decision made at each node is conditioned on the motivation to perform an instrumental action, and show that the application of our model to the reinforcement schedule task and the choice task are special cases of this general theoretical framework. Within this framework, Reinforcement Learning can approach contextual learning with the mixture of empirical findings and principled assumptions that seem to coexist in the best descriptions of animal behavior. As examples, we discuss two phenomena observed in humans that often derive from the violation of the principle of invariance: “framing,” wherein equivalent options are treated differently depending on the context in which they are presented, and the “sunk cost” effect, the greater tendency to continue an endeavor once an investment in money, effort, or time has been made. The schedule length effect might be a manifestation of these phenomena in monkeys.

Author Summary: Theories of rational behavior are built on a number of principles, including the assumption that subjects adjust their behavior to maximize their long-term returns and that they should work equally hard to obtain a reward in situations where the effort to obtain reward is the same (called the invariance principle). Humans, however, are sensitive to the manner in which equivalent choices are presented, or “framed,” and often have a greater tendency to continue an endeavor once an investment in money, effort, or time has been made, a phenomenon known as “sunk cost” effect. In a similar manner, when monkeys must perform different numbers of trials to obtain a reward, they work harder as the number of trials already performed increases, even though both the work remaining and the forthcoming reward are the same in all situations. Methods from the theory of Reinforcement Learning, which usually provide learning strategies aimed at maximizing returns, cannot model this violation of invariance. Here we generalize a prominent method of Reinforcement Learning so as to explain the violation of invariance, without losing the ability to model behaviors explained by standard Reinforcement Learning models. This generalization extends our understanding of how animals and humans learn and behave.
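The claim that standard temporal difference learning cannot produce a schedule length effect can be illustrated with a minimal sketch. It is not the authors' model. It assumes a hypothetical task layout in which a schedule of length n consists of trials k = 1..n, only the last trial is rewarded, every trial is completed correctly, and each (n, k) pair is a distinct cued state. The function name `td0_schedule_values` and the parameters `gamma`, `alpha`, `episodes`, and `max_len` are illustrative choices, not values taken from the paper.

```python
import random


def td0_schedule_values(max_len=3, gamma=0.8, alpha=0.1, episodes=5000, seed=0):
    """Learn state values with plain TD(0) on assumed reinforcement schedules.

    A state is (n, k): trial k of a schedule of length n. Reward 1.0 is
    delivered only on completing the last trial (k == n); this layout is an
    assumption made for illustration.
    """
    rng = random.Random(seed)
    values = {(n, k): 0.0 for n in range(1, max_len + 1) for k in range(1, n + 1)}

    for _ in range(episodes):
        n = rng.randint(1, max_len)  # pick a schedule length at random
        for k in range(1, n + 1):
            last = k == n
            reward = 1.0 if last else 0.0
            # The schedule ends after the rewarded trial, so the successor
            # value is 0 there; otherwise bootstrap from the next trial.
            next_value = 0.0 if last else values[(n, k + 1)]
            td_error = reward + gamma * next_value - values[(n, k)]
            values[(n, k)] += alpha * td_error
    return values


if __name__ == "__main__":
    learned = td0_schedule_values()
    for state in sorted(learned):
        n, k = state
        print(f"schedule length {n}, trial {k}, "
              f"trials remaining after this one {n - k}: V = {learned[state]:.4f}")
```

Under these assumptions the learned value of a state depends only on the number of trials remaining before reward, so trial 1 of 2 and trial 2 of 3 end up with the same value. Any error rate derived from these values alone would therefore be identical for trials equally distant from reward, which is the invariance that the reported behavior violates.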

Suggested Citation

Giancarlo La Camera & Barry J Richmond, 2008. "Modeling the Violation of Reward Maximization and Invariance in Reinforcement Schedules," PLOS Computational Biology, Public Library of Science, vol. 4(8), pages 1-17, August.

Handle: RePEc:plo:pcbi00:1000131
DOI: 10.1371/journal.pcbi.1000131
