Printed from https://ideas.repec.org/a/spr/psycho/v87y2022i1d10.1007_s11336-021-09820-y.html

Modeling Latent Topics in Social Media using Dynamic Exploratory Graph Analysis: The Case of the Right-wing and Left-wing Trolls in the 2016 US Elections

Author

Listed:
  • Hudson Golino

    (University of Virginia)

  • Alexander P. Christensen

    (University of Pennsylvania)

  • Robert Moulder

    (University of Virginia)

  • Seohyun Kim

    (University of Virginia)

  • Steven M. Boker

    (University of Virginia)

Abstract

The past few years were marked by increased online offensive strategies perpetrated by state and non-state actors to promote their political agendas, sow discord, and question the legitimacy of democratic institutions in the US and Western Europe. In 2016, the US Congress identified a list of Russian state-sponsored Twitter accounts that were used to try to divide voters on a wide range of issues. Previous research used latent Dirichlet allocation (LDA) to estimate latent topics in data extracted from these accounts. However, LDA has characteristics that may limit the effectiveness of its use on data from social media: The number of latent topics must be specified by the user, interpretability of the topics can be difficult to achieve, and it does not model short-term temporal dynamics. In the current paper, we propose a new method to estimate latent topics in texts from social media termed Dynamic Exploratory Graph Analysis (DynEGA). In a Monte Carlo simulation, we compared the ability of DynEGA and LDA to estimate the number of simulated latent topics. The results show that DynEGA is substantially more accurate than several different LDA algorithms when estimating the number of simulated topics. In an applied example, we performed DynEGA on a large dataset with Twitter posts from state-sponsored right- and left-wing trolls during the 2016 US presidential election. DynEGA revealed topics that were pertinent to several consequential events in the election cycle, demonstrating the coordinated effort of trolls capitalizing on current events in the USA. This example demonstrates the potential power of our approach for revealing temporally relevant information from qualitative text data.

Suggested Citation

  • Hudson Golino & Alexander P. Christensen & Robert Moulder & Seohyun Kim & Steven M. Boker, 2022. "Modeling Latent Topics in Social Media using Dynamic Exploratory Graph Analysis: The Case of the Right-wing and Left-wing Trolls in the 2016 US Elections," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 156-187, March.
  • Handle: RePEc:spr:psycho:v:87:y:2022:i:1:d:10.1007_s11336-021-09820-y
    DOI: 10.1007/s11336-021-09820-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11336-021-09820-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11336-021-09820-y?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Jiahua Chen & Zehua Chen, 2008. "Extended Bayesian information criteria for model selection with large model spaces," Biometrika, Biometrika Trust, vol. 95(3), pages 759-771.
    2. Grün, Bettina & Hornik, Kurt, 2011. "topicmodels: An R Package for Fitting Topic Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i13).
    3. Hudson F Golino & Sacha Epskamp, 2017. "Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research," PLOS ONE, Public Library of Science, vol. 12(6), pages 1-26, June.
    4. Feinerer, Ingo & Hornik, Kurt & Meyer, David, 2008. "Text Mining Infrastructure in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 25(i05).
    5. Golino, Hudson F. & Demetriou, Andreas, 2017. "Estimating the dimensionality of intelligence like data using Exploratory Graph Analysis," Intelligence, Elsevier, vol. 62(C), pages 54-70.
    6. Sacha Epskamp & Mijke Rhemtulla & Denny Borsboom, 2017. "Generalized Network Psychometrics: Combining Network and Latent Variable Models," Psychometrika, Springer;The Psychometric Society, vol. 82(4), pages 904-927, December.
    7. Louis Guttman, 1953. "Image theory for the structure of quantitative variates," Psychometrika, Springer;The Psychometric Society, vol. 18(4), pages 277-296, December.
    8. Wayne Velicer, 1976. "Determining the number of components from the matrix of partial correlations," Psychometrika, Springer;The Psychometric Society, vol. 41(3), pages 321-327, September.

    Citations

    Cited by:

    1. Stefan Claus & Massimo Stella, 2022. "Natural Language Processing and Cognitive Networks Identify UK Insurers’ Trends in Investor Day Transcripts," Future Internet, MDPI, vol. 14(10), pages 1-18, October.
    2. Denny Borsboom, 2022. "Possible Futures for Network Psychometrics," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 253-265, March.
    3. Maarten Marsman & Mijke Rhemtulla, 2022. "Guest Editors’ Introduction to The Special Issue “Network Psychometrics in Action”: Methodological Innovations Inspired by Empirical Problems," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 1-11, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pedro Henrique Ribeiro Santiago & Gustavo Hermes Soares & Lisa Gaye Smithers & Rachel Roberts & Lisa Jamieson, 2022. "Psychological Network of Stress, Coping and Social Support in an Aboriginal Population," IJERPH, MDPI, vol. 19(22), pages 1-22, November.
    2. Stefano Sbalchiero & Maciej Eder, 2020. "Topic modeling, long texts and the best number of topics. Some Problems and solutions," Quality & Quantity: International Journal of Methodology, Springer, vol. 54(4), pages 1095-1108, August.
    3. Daoud, Adel & Kohl, Sebastian, 2016. "How much do sociologists write about economic topics? Using big data to test some conventional views in economic sociology, 1890 to 2014," MPIfG Discussion Paper 16/7, Max Planck Institute for the Study of Societies.
    4. Holand, Øystein & Contiero, Barbara & Næss, Marius W. & Cozzi, Giulio, 2024. "“The Times They Are A-Changin'” – research trends and perspectives of reindeer pastoralism – A review using text mining and topic modelling," Land Use Policy, Elsevier, vol. 136(C).
    5. Kan, Kees-Jan & van der Maas, Han L.J. & Levine, Stephen Z., 2019. "Extending psychometric network analysis: Empirical evidence against g in favor of mutualism?," Intelligence, Elsevier, vol. 73(C), pages 52-62.
    6. Raveenajit Kaur A. P. & Kalvant Singh & Alberto Luis August, 2021. "Exploring the Factor Structure of the Constructs of Technological, Pedagogical, and Content Knowledge (TPACK): An Exploratory Factor Analysis Based on the Perceptions of TESOL Pre-Service Teachers at ," Research Journal of Education, Academic Research Publishing Group, vol. 7(2), pages 103-115, 06-2021.
    7. Cho, Yung-Jan & Fu, Pei-Wen & Wu, Chi-Cheng, 2017. "Popular Research Topics in Marketing Journals, 1995–2014," Journal of Interactive Marketing, Elsevier, vol. 40(C), pages 52-72.
    8. Sacha Epskamp, 2020. "Psychometric network models from time-series and panel data," Psychometrika, Springer;The Psychometric Society, vol. 85(1), pages 206-231, March.
    9. Bing Li & Cody Ding & Huiying Shi & Fenghui Fan & Liya Guo, 2023. "Assessment of Psychological Zone of Optimal Performance among Professional Athletes: EGA and Item Response Theory Analysis," Sustainability, MDPI, vol. 15(10), pages 1-15, May.
    10. Motta Queiroz, Mariza & Roque, Carlos & Moura, Filipe & Marôco, João, 2024. "Understanding the expectations of parents regarding their children's school commuting by public transport using latent Dirichlet Allocation," Transportation Research Part A: Policy and Practice, Elsevier, vol. 181(C).
    11. João Guerreiro & Paulo Rita & Duarte Trigueiros, 2016. "A Text Mining-Based Review of Cause-Related Marketing Literature," Journal of Business Ethics, Springer, vol. 139(1), pages 111-128, November.
    12. W. Holmes Finch, 2024. "Comparison of Methods for Addressing Outliers in Exploratory Factor Analysis and Impact on Accuracy of Determining the Number of Factors," Stats, MDPI, vol. 7(3), pages 1-21, August.
    13. Boris Forthmann & Mark A. Runco, 2020. "An Empirical Test of the Inter-Relationships between Various Bibliometric Creative Scholarship Indicators," Publications, MDPI, vol. 8(2), pages 1-16, June.
    14. Abhinav Khare & Qing He & Rajan Batta, 2020. "Predicting gasoline shortage during disasters using social media," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 42(3), pages 693-726, September.
    15. Yunxiao Chen & Xiaoou Li & Jingchen Liu & Zhiliang Ying, 2018. "Robust Measurement via A Fused Latent and Graphical Item Response Theory Model," Psychometrika, Springer;The Psychometric Society, vol. 83(3), pages 538-562, September.
    16. Lehotský, Lukáš & Černoch, Filip & Osička, Jan & Ocelík, Petr, 2019. "When climate change is missing: Media discourse on coal mining in the Czech Republic," Energy Policy, Elsevier, vol. 129(C), pages 774-786.
    17. Doblinger, Claudia & Surana, Kavita & Li, Deyu & Hultman, Nathan & Anadón, Laura Díaz, 2022. "How do global manufacturing shifts affect long-term clean energy innovation? A study of wind energy suppliers," Research Policy, Elsevier, vol. 51(7).
    18. Andres, Maximilian & Bruttel, Lisa & Friedrichsen, Jana, 2023. "How communication makes the difference between a cartel and tacit collusion: A machine learning approach," European Economic Review, Elsevier, vol. 152(C).
    19. Sun, Katherine Qianwen & Slepian, Michael L., 2020. "The conversations we seek to avoid," Organizational Behavior and Human Decision Processes, Elsevier, vol. 160(C), pages 87-105.
    20. Rieger, Jonas & von Nordheim, Gerret, 2021. "corona100d: German-language Twitter dataset of the first 100 days after Chancellor Merkel addressed the coronavirus outbreak on TV," DoCMA Working Papers 4, TU Dortmund University, Dortmund Center for Data-based Media Analysis (DoCMA).
