Printed from https://ideas.repec.org/a/pal/palcom/v11y2024i1d10.1057_s41599-024-03316-7.html

Data selection and collection for constructing investor sentiment from social media

Author

Listed:
  • Qing Liu

    (Pukyong National University
    Huainan Normal University)

  • Hosung Son

    (Pukyong National University)

Abstract

Investor sentiment mined from social media has become a prominent research topic in behavioral finance, and the reliability of that sentiment is a precondition for the reliability of the studies built on it. Scholars have so far focused mainly on using more reliable tools to measure investor sentiment. Less attention has been paid to another key factor affecting the reliability of investor sentiment from social media: the selection and collection of data. In this study, we systematically investigate the process of data selection and collection for constructing investor sentiment from social media. Our findings suggest that creating a dataset from social media is a process that starts and ends with a research question. Along the way, researchers must overcome various obstacles and will still end up with an imperfect dataset. They must therefore take a series of steps to get as close as possible to the best dataset and acknowledge its remaining shortcomings and limitations. We emphasize that the absence of accepted, reliable standards makes it particularly important to follow basic principles. This study is an important reference for social media-based behavioral finance research.

Suggested Citation

  • Qing Liu & Hosung Son, 2024. "Data selection and collection for constructing investor sentiment from social media," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-13, December.
  • Handle: RePEc:pal:palcom:v:11:y:2024:i:1:d:10.1057_s41599-024-03316-7
    DOI: 10.1057/s41599-024-03316-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1057/s41599-024-03316-7
    File Function: Abstract
    Download Restriction: Access to full text is restricted to subscribers.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniele Ballinari & Simon Behrendt, 2021. "How to gauge investor behavior? A comparison of online investor sentiment measures," Digital Finance, Springer, vol. 3(2), pages 169-204, June.
    2. Bowden, James & Gemayel, Roland, 2022. "Sentiment and trading decisions in an ambiguous environment: A study on cryptocurrency traders," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 80(C).
    3. Renault, Thomas, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Journal of Banking & Finance, Elsevier, vol. 84(C), pages 25-40.
    4. Mariano González-Sánchez & M. Encina Morales de Vega, 2021. "Influence of Bloomberg’s Investor Sentiment Index: Evidence from European Union Financial Sector," Mathematics, MDPI, vol. 9(4), pages 1-21, February.
    5. Andrew Todd & James Bowden & Yashar Moshfeghi, 2024. "Text‐based sentiment analysis in finance: Synthesising the existing literature and exploring future directions," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 31(1), March.
    6. Audrino, Francesco & Sigrist, Fabio & Ballinari, Daniele, 2020. "The impact of sentiment and attention measures on stock market volatility," International Journal of Forecasting, Elsevier, vol. 36(2), pages 334-357.
    7. Fang, Hao & Chung, Chien-Ping & Lu, Yang-Cheng & Lee, Yen-Hsien & Wang, Wen-Hao, 2021. "The impacts of investors' sentiments on stock returns using fintech approaches," International Review of Financial Analysis, Elsevier, vol. 77(C).
    8. Mahmoudi, Nader & Docherty, Paul & Melia, Adrian, 2022. "Firm-level investor sentiment and corporate announcement returns," Journal of Banking & Finance, Elsevier, vol. 144(C).
    9. Thomas Renault, 2020. "Sentiment analysis and machine learning in finance: a comparison of methods and models on one million messages," Digital Finance, Springer, vol. 2(1), pages 1-13, September.
    10. Yang Gao & Chengjie Zhao & Bianxia Sun & Wandi Zhao, 2022. "Effects of investor sentiment on stock volatility: new evidences from multi-source data in China’s green stock markets," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-30, December.
    11. Rui Fan & Oleksandr Talavera & Vu Tran, 2023. "Social media and price discovery: The case of cross‐listed firms," Journal of Financial Research, Southern Finance Association;Southwestern Finance Association, vol. 46(1), pages 151-167, February.
    12. Fan, Rui & Talavera, Oleksandr & Tran, Vu, 2023. "Information flows and the law of one price," International Review of Financial Analysis, Elsevier, vol. 85(C).
    13. Santi, Caterina, 2023. "Investor climate sentiment and financial markets," International Review of Financial Analysis, Elsevier, vol. 86(C).
    14. Szymon Lis, 2022. "Investor Sentiment in Asset Pricing Models: A Review," Working Papers 2022-14, Faculty of Economic Sciences, University of Warsaw.
    15. Wang, Gang-Jin & Xiong, Lu & Zhu, You & Xie, Chi & Foglia, Matteo, 2022. "Multilayer network analysis of investor sentiment and stock returns," Research in International Business and Finance, Elsevier, vol. 62(C).
    16. Minjian Ye & Guangzhong Li, 2017. "Internet big data and capital markets: a literature review," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 3(1), pages 1-18, December.
    17. Geng, Yuedan & Ye, Qiang & Jin, Yu & Shi, Wen, 2022. "Crowd wisdom and internet searches: What happens when investors search for stocks?," International Review of Financial Analysis, Elsevier, vol. 82(C).
    18. Alina Lerman, 2020. "Individual Investors' Attention to Accounting Information: Evidence from Online Financial Communities," Contemporary Accounting Research, John Wiley & Sons, vol. 37(4), pages 2020-2057, December.
    19. Costola, Michele & Hinz, Oliver & Nofer, Michael & Pelizzon, Loriana, 2023. "Machine learning sentiment analysis, COVID-19 news and stock market reactions," Research in International Business and Finance, Elsevier, vol. 64(C).
    20. Eierle, Brigitte & Klamer, Sebastian & Muck, Matthias, 2022. "Does it really pay off for investors to consider information from social media?," International Review of Financial Analysis, Elsevier, vol. 81(C).
