Printed from https://ideas.repec.org/p/arx/papers/2406.05972.html

Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context

Authors

Listed:
  • Jingru Jia
  • Zehua Yuan
  • Junhao Pan
  • Paul McNamara
  • Deming Chen

Abstract

When making decisions under uncertainty, individuals often deviate from rational behavior, which can be evaluated across three dimensions: risk preference, probability weighting, and loss aversion. Given the widespread use of large language models (LLMs) in decision-making processes, it is crucial to assess whether their behavior aligns with human norms and ethical expectations or exhibits potential biases. Several empirical studies have investigated the rationality and social behavior performance of LLMs, yet their internal decision-making tendencies and capabilities remain inadequately understood. This paper proposes a framework, grounded in behavioral economics, to evaluate the decision-making behaviors of LLMs. Through a multiple-choice-list experiment, we estimate the degree of risk preference, probability weighting, and loss aversion in a context-free setting for three commercial LLMs: ChatGPT-4.0-Turbo, Claude-3-Opus, and Gemini-1.0-pro. Our results reveal that LLMs generally exhibit patterns similar to humans, such as risk aversion and loss aversion, with a tendency to overweight small probabilities. However, there are significant variations in the degree to which these behaviors are expressed across different LLMs. We also explore their behavior when embedded with socio-demographic features, uncovering significant disparities. For instance, when modeled with attributes of sexual minority groups or physical disabilities, Claude-3-Opus displays increased risk aversion, leading to more conservative choices. These findings underscore the need for careful consideration of the ethical implications and potential biases in deploying LLMs in decision-making scenarios. Therefore, this study advocates for developing standards and guidelines to ensure that LLMs operate within ethical boundaries while enhancing their utility in complex decision-making environments.
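
The three dimensions named in the abstract can be illustrated with a small numerical sketch. The abstract does not state the functional forms the authors estimate, so the code below assumes one parameterization that is common in the behavioral economics literature: a power value function with curvature `sigma` and loss-aversion coefficient `lam`, and a one-parameter Prelec probability weighting function with parameter `alpha`. All function and parameter names are hypothetical and chosen only for illustration.

```python
import math


def value(x, sigma, lam):
    """Assumed power value function: x**sigma for gains, -lam * (-x)**sigma for losses."""
    if x >= 0:
        return x ** sigma
    return -lam * ((-x) ** sigma)


def prelec_weight(p, alpha):
    """Assumed Prelec weighting: w(p) = exp(-(-ln p)**alpha); alpha < 1 overweights small p."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


def prospect_utility(x, p, y, sigma, alpha, lam):
    """Utility of a binary lottery paying x with probability p and y otherwise."""
    q = 1.0 - p
    if x * y < 0:
        # Mixed lottery: the gain and the loss are weighted separately.
        return (prelec_weight(p, alpha) * value(x, sigma, lam)
                + prelec_weight(q, alpha) * value(y, sigma, lam))
    # Same-sign lottery: make x the more extreme outcome, then weight the increment.
    if abs(x) < abs(y):
        x, y, p = y, x, q
    vy = value(y, sigma, lam)
    return vy + prelec_weight(p, alpha) * (value(x, sigma, lam) - vy)


# Risk aversion: with sigma < 1 the certainty equivalent of a 50/50 lottery
# over 100 and 0 falls below its expected value of 50.
sigma = 0.5
u = prospect_utility(100, 0.5, 0, sigma, alpha=1.0, lam=1.0)
certainty_equivalent = u ** (1.0 / sigma)
print(certainty_equivalent < 50)

# Probability weighting: with alpha < 1 a 10% chance is weighted above 0.1.
print(prelec_weight(0.1, 0.7) > 0.1)

# Loss aversion: with lam > 1 a loss looms larger than an equal-sized gain.
print(abs(value(-10, 1.0, 2.0)) > value(10, 1.0, 2.0))
```

In this sketch, `sigma < 1` corresponds to risk aversion over gains, `alpha < 1` to overweighting of small probabilities, and `lam > 1` to loss aversion, which are the qualitative patterns the abstract reports for the evaluated LLMs. The parameter values used here are arbitrary and are not estimates from the paper.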

Suggested Citation

  • Jingru Jia & Zehua Yuan & Junhao Pan & Paul McNamara & Deming Chen, 2024. "Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context," Papers 2406.05972, arXiv.org.
  • Handle: RePEc:arx:papers:2406.05972

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2406.05972
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Thomas Dohmen & Armin Falk & David Huffman & Uwe Sunde & Jürgen Schupp & Gert G. Wagner, 2011. "Individual Risk Attitudes: Measurement, Determinants, And Behavioral Consequences," Journal of the European Economic Association, European Economic Association, vol. 9(3), pages 522-550, June.
    2. Glenn Harrison & E. Rutström, 2009. "Expected utility theory and prospect theory: one wedding and a decent funeral," Experimental Economics, Springer;Economic Science Association, vol. 12(2), pages 133-158, June.
    3. Hans-Martin von Gaudecker & Arthur van Soest & Erik Wengstrom, 2011. "Heterogeneity in Risky Choice Behavior in a Broad Population," American Economic Review, American Economic Association, vol. 101(2), pages 664-694, April.
    4. Steffen Andersen & Glenn W. Harrison & Morten I. Lau & E. Elisabet Rutström, 2008. "Eliciting Risk and Time Preferences," Econometrica, Econometric Society, vol. 76(3), pages 583-618, May.
    5. Daniel Kahneman & Amos Tversky, 2013. "Prospect Theory: An Analysis of Decision Under Risk," World Scientific Book Chapters, in: Leonard C MacLean & William T Ziemba (ed.), Handbook of the Fundamentals of Financial Decision Making Part I, chapter 6, pages 99-127, World Scientific Publishing Co. Pte. Ltd.
    6. Elaine M. Liu, 2013. "Time to Change What to Sow: Risk Preferences and Technology Adoption Decisions of Cotton Farmers in China," The Review of Economics and Statistics, MIT Press, vol. 95(4), pages 1386-1403, October.
    7. Daniel J. Benjamin & Sebastian A. Brown & Jesse M. Shapiro, 2013. "Who Is ‘Behavioral’? Cognitive Ability And Anomalous Preferences," Journal of the European Economic Association, European Economic Association, vol. 11(6), pages 1231-1255, December.
    8. Hans P. Binswanger, 1981. "Attitudes toward Risk: Theoretical Implications of an Experiment in Rural India," Economic Journal, Royal Economic Society, vol. 91(364), pages 867-890, December.
    9. Fulin Guo, 2023. "GPT in Game Theory Experiments," Papers 2305.05516, arXiv.org, revised Dec 2023.
    10. John J. Horton, 2023. "Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?," NBER Working Papers 31122, National Bureau of Economic Research, Inc.
    11. Shijie Wu & Ozan Irsoy & Steven Lu & Vadim Dabravolski & Mark Dredze & Sebastian Gehrmann & Prabhanjan Kambadur & David Rosenberg & Gideon Mann, 2023. "BloombergGPT: A Large Language Model for Finance," Papers 2303.17564, arXiv.org, revised Dec 2023.
    12. John J. Horton, 2023. "Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?," Papers 2301.07543, arXiv.org.
    13. Charles A. Holt & Susan K. Laury, 2002. "Risk Aversion and Incentive Effects," American Economic Review, American Economic Association, vol. 92(5), pages 1644-1655, December.
    14. Tomomi Tanaka & Colin F. Camerer & Quang Nguyen, 2010. "Risk and Time Preferences: Linking Experimental and Household Survey Data from Vietnam," American Economic Review, American Economic Association, vol. 100(1), pages 557-571, March.
    15. John A. List, 2003. "Does Market Experience Eliminate Market Anomalies?," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 118(1), pages 41-71.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Schleich, Joachim & Gassmann, Xavier & Meissner, Thomas & Faure, Corinne, 2019. "A large-scale test of the effects of time discounting, risk aversion, loss aversion, and present bias on household adoption of energy-efficient technologies," Energy Economics, Elsevier, vol. 80(C), pages 377-393.
    2. Charness, Gary & Gneezy, Uri & Imas, Alex, 2013. "Experimental methods: Eliciting risk preferences," Journal of Economic Behavior & Organization, Elsevier, vol. 87(C), pages 43-51.
    3. Galarza, Francisco, 2009. "Choices under Risk in Rural Peru," MPRA Paper 17708, University Library of Munich, Germany.
    4. Insaf Bekir & Faten Doss, 2020. "Status quo bias and attitude towards risk: An experimental investigation," Managerial and Decision Economics, John Wiley & Sons, Ltd., vol. 41(5), pages 827-838, July.
    5. Galizzi, Matteo M. & Machado, Sara R. & Miniaci, Raffaele, 2016. "Temporal stability, cross-validity, and external validity of risk preferences measures: experimental evidence from a UK representative sample," LSE Research Online Documents on Economics 67554, London School of Economics and Political Science, LSE Library.
    6. Marc Oliver Rieger & Mei Wang & Thorsten Hens, 2015. "Risk Preferences Around the World," Management Science, INFORMS, vol. 61(3), pages 637-648, March.
    7. Jonathan Chapman & Erik Snowberg & Stephanie Wang & Colin Camerer, 2018. "Loss Attitudes in the U.S. Population: Evidence from Dynamically Optimized Sequential Experimentation (DOSE)," NBER Working Papers 25072, National Bureau of Economic Research, Inc.
    8. James Alm & Antoine Malézieux, 2021. "40 years of tax evasion games: a meta-analysis," Experimental Economics, Springer;Economic Science Association, vol. 24(3), pages 699-750, September.
    9. Ihli, Hanna Julia & Chiputwa, Brian & Winter, Etti & Gassner, Anja, 2022. "Risk and time preferences for participating in forest landscape restoration: The case of coffee farmers in Uganda," World Development, Elsevier, vol. 150(C).
    10. Tamás Csermely & Alexander Rabas, 2016. "How to reveal people’s preferences: Comparing time consistency and predictive power of multiple price list risk elicitation methods," Journal of Risk and Uncertainty, Springer, vol. 53(2), pages 107-136, December.
    11. Dixit, Vinayak V. & Harb, Rami C. & Martínez-Correa, Jimmy & Rutström, Elisabet E., 2015. "Measuring risk aversion to guide transportation policy: Contexts, incentives, and respondents," Transportation Research Part A: Policy and Practice, Elsevier, vol. 80(C), pages 15-34.
    12. Filippin, Antonio & Crosetto, Paolo, 2014. "A Reconsideration of Gender Differences in Risk Attitudes," IZA Discussion Papers 8184, Institute of Labor Economics (IZA).
    13. Ola Andersson & Håkan J. Holm & Jean-Robert Tyran & Erik Wengström, 2020. "Robust inference in risk elicitation tasks," Journal of Risk and Uncertainty, Springer, vol. 61(3), pages 195-209, December.
    14. Jeffrey Butler & Luigi Guiso & Tullio Jappelli, 2014. "The role of intuition and reasoning in driving aversion to risk and ambiguity," Theory and Decision, Springer, vol. 77(4), pages 455-484, December.
    15. Holden, Stein T. & Tilahun, Mesfin, 2019. "The Devil is in the Details: Risk Preferences, Choice List Design, and Measurement Error," CLTS Working Papers 3/19, Norwegian University of Life Sciences, Centre for Land Tenure Studies, revised 16 Oct 2019.
    16. Elena Cettolin & Arno Riedl & Giang Tran, 2017. "Giving in the face of risk," Journal of Risk and Uncertainty, Springer, vol. 55(2), pages 95-118, December.
    17. Ferdinand M. Vieider & Peter Martinsson & Pham Khanh Nam & Nghi Truong, 2019. "Risk preferences and development revisited," Theory and Decision, Springer, vol. 86(1), pages 1-21, February.
    18. Goytom Abraha Kahsay & Workineh Asmare Kassie & Haileselassie Medhin & Lars Gårn Hansen, 2022. "Are religious farmers more risk taking? Empirical evidence from Ethiopia," Agricultural Economics, International Association of Agricultural Economists, vol. 53(4), pages 617-632, July.
    19. Menkhoff, Lukas & Sakha, Sahra, 2014. "Multiple-item risk measures," Kiel Working Papers 1980, Kiel Institute for the World Economy (IfW Kiel).
    20. Arnaud Reynaud & Cécile Aubert, 2020. "Does flood experience modify risk preferences? Evidence from an artefactual field experiment in Vietnam," The Geneva Risk and Insurance Review, Palgrave Macmillan;International Association for the Study of Insurance Economics (The Geneva Association), vol. 45(1), pages 36-74, March.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.