IDEAS home Printed from https://ideas.repec.org/a/eee/finlet/v62y2024ipas1544612324000928.html

Unlocking the power of the topic content in news headlines: BERTopic for predicting Chinese corporate bond defaults

Author

Listed:
  • Tang, Wenjin
  • Bu, Hui
  • Zuo, Yuan
  • Wu, Junjie

Abstract

This study explores the thematic content of news headlines and assesses their predictive power for corporate bond defaults. It establishes a data-driven framework that emphasizes transparency and interpretability through the incorporation of explainable AI (xAI) techniques. The interpretable AI method, BERTopic, is applied to analyze news headlines from Jan. 1, 2014, to Aug. 31, 2021. A total of 18 economically interpretable topic measures are derived by combining similar topics among 78 original topics, offering insights into bond issuers’ operational behavior and associated risks. Integrating the BERTopic model, various machine learning prediction models, the SHapley Additive exPlanations (SHAP) approach, and feature combination approaches, this study uncovers the incremental information contributed by news headlines beyond financial ratios and economic variables. The inclusion of these topic measures significantly enhances the predictive performance of first-time corporate bond defaults within a 3-month horizon. Additionally, the robustness of news headlines’ information value is validated by extending the sample and employing an alternative study sample with differing credit risk scenarios, diverse markets, and even distinct news sources.
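
The step of combining original topics into a smaller set of topic measures can be illustrated with a minimal sketch. It is not the authors' implementation: the mapping `TOPIC_TO_MEASURE`, the measure labels, the function `topic_measures`, the 90-day look-back window, and the use of simple headline counts are all assumptions made for illustration. The input is assumed to be headlines that a topic model (such as BERTopic) has already assigned to original topic ids.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical mapping from original topic ids to merged measures.
# The abstract reports 78 original topics merged into 18 measures;
# the ids and labels below are invented for illustration only.
TOPIC_TO_MEASURE = {
    0: "litigation",
    1: "litigation",
    2: "financing",
    3: "management_change",
}

def topic_measures(headlines, issuer, as_of, window_days=90):
    """Count an issuer's headlines per merged measure in the window before as_of.

    headlines: iterable of (issuer, publication_date, original_topic_id) tuples.
    Topic ids missing from TOPIC_TO_MEASURE (e.g. outliers) are skipped.
    The 90-day window is an assumption, not a value taken from the paper.
    """
    start = as_of - timedelta(days=window_days)
    counts = Counter()
    for name, published, topic_id in headlines:
        if name != issuer or not (start <= published < as_of):
            continue
        measure = TOPIC_TO_MEASURE.get(topic_id)
        if measure is not None:
            counts[measure] += 1
    return dict(counts)

# Invented sample data: issuer names, dates, and topic ids are placeholders.
sample = [
    ("IssuerA", date(2021, 6, 15), 0),
    ("IssuerA", date(2021, 7, 15), 1),
    ("IssuerA", date(2021, 8, 1), 2),
    ("IssuerA", date(2020, 1, 1), 0),   # outside the look-back window
    ("IssuerB", date(2021, 8, 1), 3),   # different issuer
    ("IssuerA", date(2021, 8, 2), -1),  # unmapped topic id, skipped
]
features = topic_measures(sample, "IssuerA", date(2021, 8, 31))
print(features)
```

In a setup like this, the resulting per-issuer counts would sit alongside financial ratios and economic variables as inputs to a prediction model, which is where an attribution method such as SHAP could then be applied.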

Suggested Citation

  • Tang, Wenjin & Bu, Hui & Zuo, Yuan & Wu, Junjie, 2024. "Unlocking the power of the topic content in news headlines: BERTopic for predicting Chinese corporate bond defaults," Finance Research Letters, Elsevier, vol. 62(PA).
  • Handle: RePEc:eee:finlet:v:62:y:2024:i:pa:s1544612324000928
    DOI: 10.1016/j.frl.2024.105062

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1544612324000928
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.frl.2024.105062?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Joel Peress, 2014. "The Media and the Diffusion of Information in Financial Markets: Evidence from Newspaper Strikes," Journal of Finance, American Finance Association, vol. 69(5), pages 2007-2043, October.
    2. John Y. Campbell & Jens Hilscher & Jan Szilagyi, 2008. "In Search of Distress Risk," Journal of Finance, American Finance Association, vol. 63(6), pages 2899-2939, December.
    3. Malcolm Baker & Jeffrey Wurgler, 2006. "Investor Sentiment and the Cross‐Section of Stock Returns," Journal of Finance, American Finance Association, vol. 61(4), pages 1645-1680, August.
    4. Niklas Bussmann & Paolo Giudici & Dimitri Marinelli & Jochen Papenbrock, 2021. "Explainable Machine Learning in Credit Risk Management," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 203-216, January.
    5. Michelle Lowry & Roni Michaely & Ekaterina Volkova & Francesca Cornelli, 2020. "Information Revealed through the Regulatory Process: Interactions between the SEC and Companies ahead of Their IPO," The Review of Financial Studies, Society for Financial Studies, vol. 33(12), pages 5510-5554.
    6. Alexander Dyck & Adair Morse & Luigi Zingales, 2010. "Who Blows the Whistle on Corporate Fraud?," Journal of Finance, American Finance Association, vol. 65(6), pages 2213-2253, December.
    7. Liang, Deron & Lu, Chia-Chi & Tsai, Chih-Fong & Shih, Guan-An, 2016. "Financial ratios and corporate governance indicators in bankruptcy prediction: A comprehensive study," European Journal of Operational Research, Elsevier, vol. 252(2), pages 561-572.
    8. Larsen, Vegard H. & Thorsrud, Leif A., 2019. "The value of news for economic developments," Journal of Econometrics, Elsevier, vol. 210(1), pages 203-218.
    9. Paul C. Tetlock & Maytal Saar‐Tsechansky & Sofus Macskassy, 2008. "More Than Words: Quantifying Language to Measure Firms' Fundamentals," Journal of Finance, American Finance Association, vol. 63(3), pages 1437-1467, June.
    10. Hoberg, Gerard & Lewis, Craig, 2017. "Do fraudulent firms produce abnormal disclosure?," Journal of Corporate Finance, Elsevier, vol. 43(C), pages 58-85.
    11. Geng, Ruibin & Bose, Indranil & Chen, Xi, 2015. "Prediction of financial distress: An empirical study of listed Chinese companies using data mining," European Journal of Operational Research, Elsevier, vol. 241(1), pages 236-247.
    12. Gustaf Bellstam & Sanjai Bhagat & J. Anthony Cookson, 2021. "A Text-Based Analysis of Corporate Innovation," Management Science, INFORMS, vol. 67(7), pages 4004-4031, July.
    13. Patricia M. Dechow & Weili Ge & Chad R. Larson & Richard G. Sloan, 2011. "Predicting Material Accounting Misstatements," Contemporary Accounting Research, John Wiley & Sons, vol. 28(1), pages 17-82, March.
    14. Nerissa C. Brown & Richard M. Crowley & W. Brooke Elliott, 2020. "What Are You Saying? Using topic to Detect Financial Misreporting," Journal of Accounting Research, Wiley Blackwell, vol. 58(1), pages 237-291, March.
    15. Allen H. Huang & Reuven Lehavy & Amy Y. Zang & Rong Zheng, 2018. "Analyst Information Discovery and Interpretation Roles: A Topic Modeling Approach," Management Science, INFORMS, vol. 64(6), pages 2833-2855, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Borup & Jorge Wolfgang Hansen & Benjamin Dybro Liengaard & Erik Christian Montes Schütte, 2023. "Quantifying investor narratives and their role during COVID‐19," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(4), pages 512-532, June.
    2. Richard Frankel & Jared Jennings & Joshua Lee, 2022. "Disclosure Sentiment: Machine Learning vs. Dictionary Methods," Management Science, INFORMS, vol. 68(7), pages 5514-5532, July.
    3. Fengler, Matthias & Phan, Minh Tri, 2023. "A Topic Model for 10-K Management Disclosures," Economics Working Paper Series 2307, University of St. Gallen, School of Economics and Political Science.
    4. Jacobs, Heiko, 2020. "Hype or help? Journalists’ perceptions of mispriced stocks," Journal of Economic Behavior & Organization, Elsevier, vol. 178(C), pages 550-565.
    5. James P. Ryans, 2021. "Textual classification of SEC comment letters," Review of Accounting Studies, Springer, vol. 26(1), pages 37-80, March.
    6. Liao, Rose & Wang, Xinjie & Wu, Ge, 2021. "The role of media in mergers and acquisitions," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 74(C).
    7. Bhattacharya, Indranil & Mickovic, Ana, 2024. "Accounting fraud detection using contextual language learning," International Journal of Accounting Information Systems, Elsevier, vol. 53(C).
    8. Jiang, Cuiqing & Zhou, Yiru & Chen, Bo, 2023. "Mining semantic features in patent text for financial distress prediction," Technological Forecasting and Social Change, Elsevier, vol. 190(C).
    9. An, Zhe & Chen, Chen & Naiker, Vic & Wang, Jun, 2020. "Does media coverage deter firms from withholding bad news? Evidence from stock price crash risk," Journal of Corporate Finance, Elsevier, vol. 64(C).
    10. Yang, Shuai & Wu, Chao, 2021. "Do Chinese managers listen to the media?: Evidence from mergers and acquisitions," Research in International Business and Finance, Elsevier, vol. 58(C).
    11. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2022. "Asset returns, news topics, and media effects," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 838-868, July.
    12. Vegard H. Larsen & Leif Anders Thorsrud, 2018. "Business cycle narratives," Working Paper 2018/3, Norges Bank.
    13. Simon Fritzsch & Philipp Scharner & Gregor Weiß, 2021. "Estimating the relation between digitalization and the market value of insurers," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 88(3), pages 529-567, September.
    14. Buehlmaier, Matthias M. M. & Zechner, Josef, 2016. "Financial media, price discovery, and merger arbitrage," CFS Working Paper Series 551, Center for Financial Studies (CFS).
    15. Dang, Tung Lam & Dang, Man & Hoang, Luong & Nguyen, Lily & Phan, Hoang Long, 2020. "Media coverage and stock price synchronicity," International Review of Financial Analysis, Elsevier, vol. 67(C).
    16. Zhou, Jingting & Li, Wanli & Yan, Ziqiao & Lyu, Huaili, 2021. "Controlling shareholder share pledging and stock price crash risk: Evidence from China," International Review of Financial Analysis, Elsevier, vol. 77(C).
    17. Caporale, Guglielmo Maria & Menla Ali, Faek & Spagnolo, Fabio & Spagnolo, Nicola, 2022. "Cross-border portfolio flows and news media coverage," Journal of International Money and Finance, Elsevier, vol. 126(C).
    18. Kim, Min & Stice, Derrald & Stice, Han & White, Roger M., 2021. "Stop the presses! Or wait, we might need them: Firm responses to local newspaper closures and layoffs," Journal of Corporate Finance, Elsevier, vol. 69(C).
    19. Nyman, Rickard & Kapadia, Sujit & Tuckett, David, 2021. "News and narratives in financial systems: Exploiting big data for systemic risk assessment," Journal of Economic Dynamics and Control, Elsevier, vol. 127(C).
    20. Rong Liu & Jujun Huang & Zhongju Zhang, 2023. "Tracking disclosure change trajectories for financial fraud detection," Production and Operations Management, Production and Operations Management Society, vol. 32(2), pages 584-602, February.

    More about this item

    Keywords

    Topic modeling; BERTopic; xAI; Corporate bond default; Credit risk evaluation;

    JEL classification:

    • G32 - Financial Economics - - Corporate Finance and Governance - - - Financing Policy; Financial Risk and Risk Management; Capital and Ownership Structure; Value of Firms; Goodwill
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
