
Benchmarking Large Language Model Volatility

Author

Listed:
  • Boyang Yu

Abstract

The impact of non-deterministic outputs from Large Language Models (LLMs) is not well examined for financial text understanding tasks. Through a compelling case study on investing in the US equity market via news sentiment analysis, we uncover substantial variability in sentence-level sentiment classification results, underscoring the innate volatility of LLM outputs. These uncertainties cascade downstream, leading to more significant variations in portfolio construction and return. While tweaking the temperature parameter in the language model decoder presents a potential remedy, it comes at the expense of stifled creativity. Similarly, while ensembling multiple outputs mitigates the effect of volatile outputs, it demands a notable computational investment. This work furnishes practitioners with invaluable insights for adeptly navigating uncertainty in the integration of LLMs into financial decision-making, particularly in scenarios dictated by non-deterministic information.
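
The two remedies named in the abstract, lowering the decoder temperature and ensembling repeated outputs, can be illustrated with a small sketch. This is not the paper's code: the function names, the example logits, and the five hypothetical sentiment labels below are assumptions made only for illustration, and majority voting is just one plausible way to ensemble repeated classifications.

```python
import math
from collections import Counter
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return Counter(labels).most_common(1)[0][0]


def disagreement_rate(labels: Sequence[str]) -> float:
    """Fraction of runs whose label differs from the majority label."""
    majority = majority_vote(labels)
    return sum(1 for label in labels if label != majority) / len(labels)


# Hypothetical labels from five repeated runs on the same sentence.
runs = ["positive", "positive", "neutral", "positive", "negative"]
ensemble_label = majority_vote(runs)
volatility = disagreement_rate(runs)

# Hypothetical logits for (positive, neutral, negative).
logits = [2.0, 1.0, 0.5]
probs_default = softmax_with_temperature(logits, 1.0)
probs_cool = softmax_with_temperature(logits, 0.5)
```

In this sketch the lower temperature puts more probability on the top label, so sampled outputs vary less, while the vote needs several model calls per sentence, which is the computational cost the abstract refers to.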

Suggested Citation

  • Boyang Yu, 2023. "Benchmarking Large Language Model Volatility," Papers 2311.15180, arXiv.org.
  • Handle: RePEc:arx:papers:2311.15180

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2311.15180
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francisco Peñaranda & Enrique Sentana, 2024. "Portfolio management with big data," Working Papers wp2024_2411, CEMFI.
    2. Liping Wang & Jiawei Li & Lifan Zhao & Zhizhuo Kou & Xiaohan Wang & Xinyi Zhu & Hao Wang & Yanyan Shen & Lei Chen, 2023. "Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey," Papers 2308.04947, arXiv.org.
    3. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2023. "From Transcripts to Insights: Uncovering Corporate Risks Using Generative AI," Papers 2310.17721, arXiv.org.
    4. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "Harnessing Generative AI for Economic Insights," Papers 2410.03897, arXiv.org, revised Oct 2024.
    5. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2024. "Financial Statement Analysis with Large Language Models," Papers 2407.17866, arXiv.org, revised Nov 2024.
    6. Hanshuang Tong & Jun Li & Ning Wu & Ming Gong & Dongmei Zhang & Qi Zhang, 2024. "Ploutos: Towards interpretable stock movement prediction with financial large language model," Papers 2403.00782, arXiv.org.
    7. Udit Gupta, 2023. "GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models," Papers 2309.03079, arXiv.org.
    8. Haohan Zhang & Fengrui Hua & Chengjin Xu & Hao Kong & Ruiting Zuo & Jian Guo, 2023. "Unveiling the Potential of Sentiment: Can Large Language Models Predict Chinese Stock Price Movements?," Papers 2306.14222, arXiv.org, revised May 2024.
    9. Pogorelova, Polina, 2024. "Investigation of the impact of uncertainty indices on Bitcoin volatility using the ARDL model," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 74, pages 35-50.
    10. Ummara Mumtaz & Summaya Mumtaz, 2023. "Potential of ChatGPT in predicting stock market trends based on Twitter Sentiment Analysis," Papers 2311.06273, arXiv.org.
    11. Georgios Fatouros & Konstantinos Metaxas & John Soldatos & Dimosthenis Kyriazis, 2024. "Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection," Papers 2401.03737, arXiv.org, revised Apr 2024.
    12. Marius Hofert, 2023. "Correlation Pitfalls with ChatGPT: Would You Fall for Them?," Risks, MDPI, vol. 11(7), pages 1-17, June.
    13. Zihan Chen & Lei Nico Zheng & Cheng Lu & Jialu Yuan & Di Zhu, 2023. "ChatGPT Informed Graph Neural Network for Stock Movement Prediction," Papers 2306.03763, arXiv.org, revised Sep 2023.
    14. Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez & Thomas Jacquot, 2024. "Stress index strategy enhanced with financial news sentiment analysis for the equity markets," Papers 2404.00012, arXiv.org.
    15. Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez & Damien Challet, 2024. "Can ChatGPT Compute Trustworthy Sentiment Scores from Bloomberg Market Wraps?," Papers 2401.05447, arXiv.org.
    16. Marra de Artiñano, Ignacio & Riottini Depetris, Franco & Volpe Martincus, Christian, 2023. "Automatic Product Classification in International Trade: Machine Learning and Large Language Models," IDB Publications (Working Papers) 12962, Inter-American Development Bank.
    17. Yuqi Nie & Yaxuan Kong & Xiaowen Dong & John M. Mulvey & H. Vincent Poor & Qingsong Wen & Stefan Zohren, 2024. "A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges," Papers 2406.11903, arXiv.org.
    18. Claudia Biancotti & Carolina Camassa, 2023. "Loquacity and visible emotion: ChatGPT as a policy advisor," Questioni di Economia e Finanza (Occasional Papers) 814, Bank of Italy, Economic Research and International Relations Area.
    19. Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez, 2024. "Optimizing Performance: How Compact Models Match or Exceed GPT's Classification Capabilities through Fine-Tuning," Papers 2409.11408, arXiv.org.
    20. Paul Glasserman & Caden Lin, 2023. "Assessing Look-Ahead Bias in Stock Return Predictions Generated By GPT Sentiment Analysis," Papers 2309.17322, arXiv.org.
