Printed from https://ideas.repec.org/p/arx/papers/2412.18174.html

INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent

Author

Listed:
  • Haohang Li
  • Yupeng Cao
  • Yangyang Yu
  • Shashidhar Reddy Javaji
  • Zhiyang Deng
  • Yueru He
  • Yuechen Jiang
  • Zining Zhu
  • Koduvayur Subbalakshmi
  • Guojun Xiong
  • Jimin Huang
  • Lingfei Qian
  • Xueqing Peng
  • Qianqian Xie
  • Jordan W. Suchow

Abstract

Recent advancements have underscored the potential of large language model (LLM)-based agents in financial decision-making. Despite this progress, the field currently encounters two main challenges: (1) the lack of a comprehensive LLM agent framework adaptable to a variety of financial tasks, and (2) the absence of standardized benchmarks and consistent datasets for assessing agent performance. To tackle these issues, we introduce InvestorBench, the first benchmark specifically designed for evaluating LLM-based agents in diverse financial decision-making contexts. InvestorBench enhances the versatility of LLM-enabled agents by providing a comprehensive suite of tasks applicable to different financial products, including single equities (stocks), cryptocurrencies, and exchange-traded funds (ETFs). Additionally, we assess the reasoning and decision-making capabilities of our agent framework using thirteen different LLMs as backbone models, across various market environments and tasks. Furthermore, we have curated a diverse collection of open-source, multi-modal datasets and developed a comprehensive suite of environments for financial decision-making. This establishes a highly accessible platform for evaluating financial agents' performance across various scenarios.

Suggested Citation

  • Haohang Li & Yupeng Cao & Yangyang Yu & Shashidhar Reddy Javaji & Zhiyang Deng & Yueru He & Yuechen Jiang & Zining Zhu & Koduvayur Subbalakshmi & Guojun Xiong & Jimin Huang & Lingfei Qian & Xueqing Pe, 2024. "INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent," Papers 2412.18174, arXiv.org.
  • Handle: RePEc:arx:papers:2412.18174

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2412.18174
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. David Kuo Chuen Lee & Chong Guan & Yinghui Yu & Qinxu Ding, 2024. "A Comprehensive Review of Generative AI in Finance," FinTech, MDPI, vol. 3(3), pages 1-19, September.
    2. Calvin Thigpen & Kelcie Ralph & Nicholas J. Klein & Anne Brown, 2023. "Can information increase support for transportation reform? Results from an experiment," Transportation, Springer, vol. 50(3), pages 893-912, June.
    3. Cattaneo, Cristina & D’Adda, Giovanna & Tavoni, Massimo & Bonan, Jacopo, 2019. "Can We Make Social Information Programs More Effective? The Role of Identity and Values," RFF Working Paper Series 19-21, Resources for the Future.
    4. Yichen Luo & Yebo Feng & Jiahua Xu & Paolo Tasca & Yang Liu, 2025. "LLM-Powered Multi-Agent System for Automated Crypto Portfolio Management," Papers 2501.00826, arXiv.org, revised Jan 2025.
    5. Xuewen Han & Neng Wang & Shangkun Che & Hongyang Yang & Kunpeng Zhang & Sean Xin Xu, 2024. "Enhancing Investment Analysis: Optimizing AI-Agent Collaboration in Financial Research," Papers 2411.04788, arXiv.org.
    6. Thanos Konstantinidis & Giorgos Iacovides & Mingxue Xu & Tony G. Constantinides & Danilo Mandic, 2024. "FinLlama: Financial Sentiment Classification for Algorithmic Trading Applications," Papers 2403.12285, arXiv.org.
    7. Andrew J. Stier & Sina Sajjadi & Fariba Karimi & Luís M. A. Bettencourt & Marc G. Berman, 2024. "Implicit racial biases are lower in more populous more diverse and less segregated US cities," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    8. Lu, Chia-Wu & Wu, Hsueh-Ling & Su, Yu-Hsuan, 2024. "The icing on the cake: ESG effect on the quality factor portfolios," Finance Research Letters, Elsevier, vol. 70(C).
    9. Balázs Zélity, 2023. "Age diversity and aggregate productivity," Journal of Population Economics, Springer;European Society for Population Economics, vol. 36(3), pages 1863-1899, July.
    10. Lawrence Glosten & Suresh Nallareddy & Yuan Zou, 2021. "ETF Activity and Informational Efficiency of Underlying Securities," Management Science, INFORMS, vol. 67(1), pages 22-47, January.
    11. Green, Alan, 2024. "Are we doing homework wrong? The marginal effect of homework using spaced repetition," International Review of Economics Education, Elsevier, vol. 46(C).
    12. Elton, Edwin J. & Gruber, Martin J. & de Souza, Andre, 2019. "Passive mutual funds and ETFs: Performance and comparison," Journal of Banking & Finance, Elsevier, vol. 106(C), pages 265-275.
    13. Shuyang Wang & Diego Klabjan, 2023. "An Ensemble Method of Deep Reinforcement Learning for Automated Cryptocurrency Trading," Papers 2309.00626, arXiv.org.
    14. Vasiliy A. Tatyannikov, 2018. "Prospects of Exchange-Traded Funds’ Development in Russia," Journal of New Economy, Ural State University of Economics, vol. 19(6), pages 89-100, December.
    15. Viktor Ivanitskiy & Vasily Tatyannikov, 2018. "Information Asymmetry in Financial Markets: Challenges and Threats," Economy of region, Centre for Economic Security, Institute of Economics of Ural Branch of Russian Academy of Sciences, vol. 1(4), pages 1156-1167.
    16. Hua Wang & Liao Xu, 2019. "Do exchange‐traded fund flows increase the volatility of the underlying index? Evidence from the emerging market in China," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 58(5), pages 1525-1548, March.
    17. Costola, Michele & Hinz, Oliver & Nofer, Michael & Pelizzon, Loriana, 2023. "Machine learning sentiment analysis, COVID-19 news and stock market reactions," Research in International Business and Finance, Elsevier, vol. 64(C).
    18. Lin, Tin-Chun, 2024. "Can instruction in consumer choice theory in introduction to microeconomics benefit student learning in upper-level economics courses? The example of public finance," International Review of Economics Education, Elsevier, vol. 46(C).
    19. Dobson, Peter, 2020. "ETFs tracking errors on global markets with consideration of regional diversity," MPRA Paper 103695, University Library of Munich, Germany.
    20. Czereszenko, Witalij, 2021. "Pursuing the aim of Exchange Traded Funds at the time of Covid-19," MPRA Paper 111319, University Library of Munich, Germany.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2412.18174. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.