Printed from https://ideas.repec.org/p/arx/papers/2305.07972.html

Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis

Author

Listed:
  • Agam Shah
  • Suvan Paturi
  • Sudheer Chava

Abstract

Monetary policy pronouncements by the Federal Open Market Committee (FOMC) are a major driver of financial market returns. We construct the largest tokenized and annotated dataset of FOMC speeches, meeting minutes, and press conference transcripts in order to understand how monetary policy influences financial markets. In this study, we develop a novel task of hawkish-dovish classification and benchmark various pre-trained language models on the proposed dataset. Using the best-performing model (RoBERTa-large), we construct a measure of monetary policy stance for the FOMC document release days. To evaluate the constructed measure, we study its impact on the treasury market, stock market, and macroeconomic indicators. Our dataset, models, and code are publicly available on Hugging Face and GitHub under the CC BY-NC 4.0 license.
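
The abstract does not spell out how sentence-level hawkish-dovish predictions are turned into a document-level stance measure. As a minimal sketch, assuming each sentence of a document has already been labeled by a classifier, one plausible aggregation is the net share of hawkish sentences. The label names, the `stance_score` function, and the example labels below are hypothetical illustrations, not the authors' code or their exact formula.

```python
from collections import Counter
from typing import Iterable

# Hypothetical label names; the abstract only mentions
# "hawkish-dovish classification" without listing the classes.
HAWKISH, DOVISH, NEUTRAL = "hawkish", "dovish", "neutral"


def stance_score(labels: Iterable[str]) -> float:
    """Net hawkishness in [-1, 1]: (hawkish - dovish) / total sentences.

    This aggregation rule is an assumption made for illustration.
    Any label other than hawkish or dovish only adds to the denominator.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("document has no labeled sentences")
    return (counts[HAWKISH] - counts[DOVISH]) / total


# Made-up sentence labels for one document, as a classifier might emit them.
example_labels = [HAWKISH, HAWKISH, NEUTRAL, DOVISH]
print(stance_score(example_labels))  # prints 0.25
```

A score near 1 would indicate a predominantly hawkish document and a score near -1 a predominantly dovish one; computing it per release day would give a time series that could then be compared with market variables.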

Suggested Citation

  • Agam Shah & Suvan Paturi & Sudheer Chava, 2023. "Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis," Papers 2305.07972, arXiv.org.
  • Handle: RePEc:arx:papers:2305.07972

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2305.07972
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Yuriy Gorodnichenko & Tho Pham & Oleksandr Talavera, 2023. "The Voice of Monetary Policy," American Economic Review, American Economic Association, vol. 113(2), pages 548-584, February.
    2. Stephen Hansen & Michael McMahon & Andrea Prat, 2018. "Transparency and Deliberation Within the FOMC: A Computational Linguistics Approach," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(2), pages 801-870.
    3. Ehrmann, Michael & Talmi, Jonathan, 2020. "Starting from a blank page? Semantic similarity in central bank communication and market volatility," Journal of Monetary Economics, Elsevier, vol. 111(C), pages 48-62.
    4. Dario Caldara & Matteo Iacoviello, 2022. "Measuring Geopolitical Risk," American Economic Review, American Economic Association, vol. 112(4), pages 1194-1225, April.
    5. Emi Nakamura & Jón Steinsson, 2018. "High-Frequency Identification of Monetary Non-Neutrality: The Information Effect," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(3), pages 1283-1330.
    6. Stephen Hansen & Michael McMahon, 2016. "Shocking Language: Understanding the Macroeconomic Effects of Central Bank Communication," NBER Chapters, in: NBER International Seminar on Macroeconomics 2015, National Bureau of Economic Research, Inc.
    7. Bennani, Hamza & Fanta, Nicolas & Gertler, Pavel & Horvath, Roman, 2020. "Does central bank communication signal future monetary policy in a (post)-crisis era? The case of the ECB," Journal of International Money and Finance, Elsevier, vol. 104(C).
    8. Olivier Coibion & Yuriy Gorodnichenko & Michael Weber, 2022. "Monetary Policy Communications and Their Effects on Household Inflation Expectations," Journal of Political Economy, University of Chicago Press, vol. 130(6), pages 1537-1584.
    9. Rozkrut, Marek & Rybinski, Krzysztof & Sztaba, Lucyna & Szwaja, Radoslaw, 2007. "Quest for central bank communication: Does it pay to be "talkative"?," European Journal of Political Economy, Elsevier, vol. 23(1), pages 176-206, March.
    10. Stefano Nardelli & David Martens & Ellen Tobback, 2017. "Between hawks and doves: measuring Central Bank Communication," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Big Data, volume 44, Bank for International Settlements.
    11. Bushee, Brian J. & Matsumoto, Dawn A. & Miller, Gregory S., 2003. "Open versus closed conference calls: the determinants and effects of broadening access to disclosure," Journal of Accounting and Economics, Elsevier, vol. 34(1-3), pages 149-180, January.
    12. Emmanuel Alanis & Sudheer Chava & Agam Shah, 2022. "Benchmarking Machine Learning Models to Predict Corporate Bankruptcy," Papers 2212.12051, arXiv.org.
    13. Schmeling, Maik & Wagner, Christian, 2019. "Does Central Bank Tone Move Asset Prices?," CEPR Discussion Papers 13490, C.E.P.R. Discussion Papers.
    14. Tobback, Ellen & Nardelli, Stefano & Martens, David, 2017. "Between hawks and doves: measuring central bank communication," Working Paper Series 2085, European Central Bank.
    15. Anna Cieslak & Adair Morse & Annette Vissing‐Jorgensen, 2019. "Stock Returns over the FOMC Cycle," Journal of Finance, American Finance Association, vol. 74(5), pages 2201-2248, October.
    16. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.

    Citations


    Cited by:

    1. Kwok Ping Tsang & Zichao Yang, 2023. "Agree to Disagree: Measuring Hidden Dissent in FOMC Meetings," Papers 2308.10131, arXiv.org, revised Nov 2024.
    2. Yuqi Nie & Yaxuan Kong & Xiaowen Dong & John M. Mulvey & H. Vincent Poor & Qingsong Wen & Stefan Zohren, 2024. "A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges," Papers 2406.11903, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Martin Baumgaertner & Johannes Zahner, 2021. "Whatever it takes to understand a central banker - Embedding their words using neural networks," MAGKS Papers on Economics 202130, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    2. Donato Masciandaro & Oana Peia & Davide Romelli, 2024. "Central bank communication and social media: From silence to Twitter," Journal of Economic Surveys, Wiley Blackwell, vol. 38(2), pages 365-388, April.
    3. Parle, Conor, 2022. "The financial market impact of ECB monetary policy press conferences — A text based approach," European Journal of Political Economy, Elsevier, vol. 74(C).
    4. Agam Shah & Sudheer Chava, 2023. "Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks," Papers 2305.16633, arXiv.org.
    5. Donato Masciandaro & Davide Romelli & Gaia Rubera, 2021. "Monetary policy and financial markets: evidence from Twitter traffic," BAFFI CAREFIN Working Papers 21160, BAFFI CAREFIN, Centre for Applied Research on International Markets Banking Finance and Regulation, Universita' Bocconi, Milano, Italy.
    6. Istrefi, Klodiana & Odendahl, Florens & Sestieri, Giulia, 2023. "Fed communication on financial stability concerns and monetary policy decisions: Revelations from speeches," Journal of Banking & Finance, Elsevier, vol. 151(C).
    7. Paloviita, Maritta & Haavio, Markus & Jalasjoki, Pirkka & Kilponen, Juha & Vänni, Ilona, 2020. "Reading between the lines : Using text analysis to estimate the loss function of the ECB," Research Discussion Papers 12/2020, Bank of Finland.
    8. Hubert, Paul & Labondance, Fabien, 2021. "The signaling effects of central bank tone," European Economic Review, Elsevier, vol. 133(C).
    9. Yuriy Gorodnichenko & Tho Pham & Oleksandr Talavera, 2023. "The Voice of Monetary Policy," American Economic Review, American Economic Association, vol. 113(2), pages 548-584, February.
    10. Fadda, Pietro & Hanifi, Rayane & Istrefi, Klodiana & Penalver, Adrian, 2022. "Central Bank Communication of Uncertainty," CEPR Discussion Papers 17728, C.E.P.R. Discussion Papers.
    11. Alexopoulos, Michelle & Han, Xinfen & Kryvtsov, Oleksiy & Zhang, Xu, 2024. "More than words: Fed Chairs’ communication during congressional testimonies," Journal of Monetary Economics, Elsevier, vol. 142(C).
    12. Paul Hubert & Fabien Labondance, 2019. "Central bank tone and the dispersion of views within monetary policy committees," SciencePo Working papers Main hal-03403256, HAL.
    13. Saskia Ter Ellen & Vegard H. Larsen & Leif Anders Thorsrud, 2022. "Narrative Monetary Policy Surprises and the Media," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 54(5), pages 1525-1549, August.
    14. Linas Jurkšas & Rokas Kaminskas, 2023. "ECB monetary policy communication: does it move euro area yields?," Bank of Lithuania Discussion Paper Series 29, Bank of Lithuania.
    15. Kwok Ping Tsang & Zichao Yang, 2023. "Agree to Disagree: Measuring Hidden Dissent in FOMC Meetings," Papers 2308.10131, arXiv.org, revised Nov 2024.
    16. Dimitrios Kanelis & Pierre L. Siklos, 2022. "Emotion in Euro Area Monetary Policy Communication and Bond Yields: The Draghi Era," CQE Working Papers 10322, Center for Quantitative Economics (CQE), University of Muenster.
    17. Armelius, Hanna & Bertsch, Christoph & Hull, Isaiah & Zhang, Xin, 2020. "Spread the Word: International spillovers from central bank communication," Journal of International Money and Finance, Elsevier, vol. 103(C).
    18. Xuefan, Pan, 2023. "Analysing the response of U.S. financial market to the Federal Open Market Committee statements and minutes based on computational linguistic approaches," Warwick-Monash Economics Student Papers 43, Warwick Monash Economics Student Papers.
    19. Möller, Rouven & Reichmann, Doron, 2021. "ECB language and stock returns – A textual analysis of ECB press conferences," The Quarterly Review of Economics and Finance, Elsevier, vol. 80(C), pages 590-604.
    20. Donato Masciandaro & Davide Romelli & Gaia Rubera, 2021. "Monetary policy, Twitter and financial markets: evidence from social media traffic," BAFFI CAREFIN Working Papers 21160, BAFFI CAREFIN, Centre for Applied Research on International Markets Banking Finance and Regulation, Universita' Bocconi, Milano, Italy.
