IDEAS home Printed from https://ideas.repec.org/p/arx/papers/1406.7330.html

Stock Market Prediction from WSJ: Text Mining via Sparse Matrix Factorization

Author

Listed:
  • Felix Ming Fai Wong
  • Zhenming Liu
  • Mung Chiang

Abstract

We revisit the problem of predicting directional movements of stock prices based on news articles: our algorithm uses daily articles from The Wall Street Journal to predict closing stock prices on the same day. We propose a unified latent space model to characterize the "co-movements" between stock prices and news articles. Unlike many existing approaches, our model simultaneously leverages the correlations (a) among stock prices, (b) among news articles, and (c) between stock prices and news articles. As a result, it can make daily predictions on more than 500 stocks (most of which are not even mentioned in any news article) while retaining low complexity. We carry out extensive backtesting on trading strategies based on our algorithm. The results show that our model has a substantially better accuracy rate (55.7%) than many widely used algorithms. The return (56%) and Sharpe ratio of a trading strategy based on our model are also much higher than those of baseline indices.
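The abstract reports a directional accuracy rate and a Sharpe ratio from backtesting. As a rough illustration of how such figures are commonly computed, here is a minimal Python sketch. It is not the authors' code: the function names, the long/short rule, the zero risk-free rate, and the 252-trading-day annualization are all assumptions made for this example.

```python
import math
from statistics import mean, stdev


def directional_accuracy(predicted, actual):
    """Fraction of days on which the predicted direction matches the realized one.

    Assumption: a value > 0 means "up"; zero or negative means "not up".
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if (p > 0) == (a > 0))
    return hits / len(predicted)


def strategy_returns(predicted, actual):
    """Daily returns of a hypothetical rule: long if predicted up, short otherwise."""
    return [a if p > 0 else -a for p, a in zip(predicted, actual)]


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Mean excess daily return over its sample standard deviation, annualized.

    Assumption: 252 trading days per year and a constant daily risk-free rate.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    sd = stdev(excess)  # sample standard deviation; needs at least two points
    if sd == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return mean(excess) / sd * math.sqrt(periods_per_year)


# Toy data for illustration only (not taken from the paper).
predicted = [1, -1, 1, -1]
actual = [0.01, -0.02, -0.01, -0.03]

accuracy = directional_accuracy(predicted, actual)
returns = strategy_returns(predicted, actual)
sharpe = annualized_sharpe(returns)
print(accuracy, returns, sharpe)
```

On real data, `predicted` would come from a model's daily output and `actual` from realized same-day returns. Transaction costs, which this sketch ignores, would lower both the return and the Sharpe ratio.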

Suggested Citation

  • Felix Ming Fai Wong & Zhenming Liu & Mung Chiang, 2014. "Stock Market Prediction from WSJ: Text Mining via Sparse Matrix Factorization," Papers 1406.7330, arXiv.org.
  • Handle: RePEc:arx:papers:1406.7330

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1406.7330
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Huina Mao & Scott Counts & Johan Bollen, 2011. "Predicting Financial Markets: Comparing Survey, News, Twitter and Search Engine Data," Papers 1112.1051, arXiv.org.
    2. Casey Dougal & Joseph Engelberg & Diego García & Christopher A. Parsons, 2012. "Journalists and the Stock Market," The Review of Financial Studies, Society for Financial Studies, vol. 25(3), pages 639-679.
    3. Paul C. Tetlock, 2007. "Giving Content to Investor Sentiment: The Role of Media in the Stock Market," Journal of Finance, American Finance Association, vol. 62(3), pages 1139-1168, June.

    Citations

    Cited by:

    1. Qiong Wu & Christopher G. Brinton & Zheng Zhang & Andrea Pizzoferrato & Zhenming Liu & Mihai Cucuringu, 2019. "Equity2Vec: End-to-end Deep Learning Framework for Cross-sectional Asset Pricing," Papers 1909.04497, arXiv.org, revised Oct 2021.
    2. František Dařena & Jan Přichystal, 2018. "Analysis of the Association between Topics in Online Documents and Stock Price Movements," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 66(6), pages 1431-1439.
    3. Souta Nakatani & Kiyohiko G. Nishimura & Taiga Saito & Akihiko Takahashi, 2020. "Interest Rate Model with Investor Attitude and Text Mining," CIRJE F-Series CIRJE-F-1152, CIRJE, Faculty of Economics, University of Tokyo.
    4. Kübler, Raoul V. & Colicev, Anatoli & Pauwels, Koen H., 2020. "Social Media's Impact on the Consumer Mindset: When to Use Which Sentiment Extraction Tool?," Journal of Interactive Marketing, Elsevier, vol. 50(C), pages 136-155.
    5. Sun, Andrew & Lachanski, Michael & Fabozzi, Frank J., 2016. "Trade the tweet: Social media text mining and sparse matrix factorization for stock market prediction," International Review of Financial Analysis, Elsevier, vol. 48(C), pages 272-281.
    6. Stefan Feuerriegel & Helmut Prendinger, 2018. "News-based trading strategies," Papers 1807.06824, arXiv.org.
    7. Frantisek Darena & Jonas Petrovsky & Jan Zizka & Jan Prichystal, 2016. "Analyzing the correlation between online texts and stock price movements at micro-level using machine learning," MENDELU Working Papers in Business and Economics 2016-67, Mendel University in Brno, Faculty of Business and Economics.
    8. Pooja Gupta & Angshul Majumdar & Emilie Chouzenoux & Giovanni Chierchia, 2020. "SuperDeConFuse: A Supervised Deep Convolutional Transform based Fusion Framework for Financial Trading Systems," Papers 2011.04364, arXiv.org.
    9. Souta Nakatani & Kiyohiko G. Nishimura & Taiga Saito & Akihiko Takahashi, 2020. "Interest Rate Model with Investor Attitude and Text Mining (Published in IEEE Access)," CARF F-Series CARF-F-479, Center for Advanced Research in Finance, Faculty of Economics, The University of Tokyo.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. David E. Allen & Michael McAleer & Abhay K. Singh, 2019. "Daily market news sentiment and stock prices," Applied Economics, Taylor & Francis Journals, vol. 51(30), pages 3212-3235, June.
    2. Lixiang Wang & Wendi Hou & Yupei Liu, 2023. "How do co‐shareholding networks affect negative media coverage? Evidence from China," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 63(4), pages 4221-4249, December.
    3. Di Giuli, Alberta & Laux, Paul A., 2022. "The effect of media-linked directors on financing and external governance," Journal of Financial Economics, Elsevier, vol. 145(2), pages 103-131.
    4. Chen, Yangyang & Goyal, Abhinav & Veeraraghavan, Madhu & Zolotoy, Leon, 2020. "Terrorist attacks, investor sentiment, and the pricing of initial public offerings," Journal of Corporate Finance, Elsevier, vol. 65(C).
    5. Ahmad, Khurshid & Han, JingGuang & Hutson, Elaine & Kearney, Colm & Liu, Sha, 2016. "Media-expressed negative tone and firm-level stock returns," Journal of Corporate Finance, Elsevier, vol. 37(C), pages 152-172.
    6. Bianconi, Marcelo & Hua, Xiaxin & Tan, Chih Ming, 2015. "Determinants of systemic risk and information dissemination," International Review of Economics & Finance, Elsevier, vol. 38(C), pages 352-368.
    7. Ding, Rong & Hou, Wenxuan & Liu, Yue (Lucy) & Zhang, John Ziyang, 2018. "Media censorship and stock price: Evidence from the foreign share discount in China," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 55(C), pages 112-133.
    8. Steven Heston & Nitish R. Sinha, 2016. "News versus Sentiment : Predicting Stock Returns from News Stories," Finance and Economics Discussion Series 2016-048, Board of Governors of the Federal Reserve System (U.S.).
    9. Tsileponis, Nikolaos & Stathopoulos, Konstantinos & Walker, Martin, 2020. "Do corporate press releases drive media coverage?," The British Accounting Review, Elsevier, vol. 52(2).
    10. Liao, Rose & Wang, Xinjie & Wu, Ge, 2021. "The role of media in mergers and acquisitions," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 74(C).
    11. Mazboudi, Mohamad & Khalil, Samer, 2017. "The attenuation effect of social media: Evidence from acquisitions by large firms," Journal of Financial Stability, Elsevier, vol. 28(C), pages 115-124.
    12. Gabriele Ranco & Darko Aleksovski & Guido Caldarelli & Miha Grčar & Igor Mozetič, 2015. "The Effects of Twitter Sentiment on Stock Price Returns," PLOS ONE, Public Library of Science, vol. 10(9), pages 1-21, September.
    13. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2022. "Asset returns, news topics, and media effects," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 838-868, July.
    14. Eric. W. K. See-To & Yang Yang, 2017. "Market sentiment dispersion and its effects on stock return and volatility," Electronic Markets, Springer;IIM University of St. Gallen, vol. 27(3), pages 283-296, August.
    15. Roberto Casarin & Flaminio Squazzoni, 2012. "Financial press and stock markets in times of crisis," Working Papers 2012_04, Department of Economics, University of Venice "Ca' Foscari".
    16. Sushant Chari & Purva Hegde Desai & Nilesh Borde & Babu George, 2023. "Aggregate News Sentiment and Stock Market Returns in India," JRFM, MDPI, vol. 16(8), pages 1-18, August.
    17. Breitmayer, Bastian & Massari, Filippo & Pelster, Matthias, 2019. "Swarm intelligence? Stock opinions of the crowd and stock returns," International Review of Economics & Finance, Elsevier, vol. 64(C), pages 443-464.
    18. Thang Ngoc Doan & Dong Phu Do & Dat Van Luong, 2023. "Monetary stance and favorableness of the monetary policy in the media: the case of Vietnam," Journal of Asian Business and Economic Studies, Emerald Group Publishing Limited, vol. 31(2), pages 111-123, August.
    19. Scott R. Baker & Nicholas Bloom & Steven J. Davis & Marco C. Sammon, 2021. "What Triggers Stock Market Jumps?," NBER Working Papers 28687, National Bureau of Economic Research, Inc.
    20. Gabriele Ranco & Ilaria Bordino & Giacomo Bormetti & Guido Caldarelli & Fabrizio Lillo & Michele Treccani, 2014. "Coupling news sentiment with web browsing data improves prediction of intra-day price dynamics," Papers 1412.3948, arXiv.org, revised Dec 2015.
