
Who Does What to Whom? Making Text Parsers Work for Sociological Inquiry

Author

Listed:
  • Oscar Stuhler

Abstract

Over the past decade, sociologists have become increasingly interested in the formal study of semantic relations within text. Most contemporary studies focus either on mapping concept co-occurrences or on measuring semantic associations via word embeddings. Although conducive to many research goals, these approaches share an important limitation: they abstract away what one can call the event structure of texts, that is, the narrative action that takes place in them. I aim to overcome this limitation by introducing a new framework for extracting semantically rich relations from text that involves three components. First, a semantic grammar structured around textual entities that distinguishes six motif classes: actions of an entity, treatments of an entity, agents acting upon an entity, patients acted upon by an entity, characterizations of an entity, and possessions of an entity; second, a comprehensive set of mapping rules, which make it possible to recover motifs from predictions of dependency parsers; third, an R package that allows researchers to extract motifs from their own texts. The framework is demonstrated in empirical analyses on gendered interaction in novels and constructions of collective identity by U.S. presidential candidates.

Suggested Citation

  • Oscar Stuhler, 2022. "Who Does What to Whom? Making Text Parsers Work for Sociological Inquiry," Sociological Methods & Research, vol. 51(4), pages 1580-1633, November.
  • Handle: RePEc:sae:somere:v:51:y:2022:i:4:p:1580-1633
    DOI: 10.1177/00491241221099551

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/00491241221099551
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bernhardt, Lea & Dewenter, Ralf & Thomas, Tobias, 2023. "Measuring partisan media bias in US newscasts from 2001 to 2012," European Journal of Political Economy, Elsevier, vol. 78(C).
    2. Rauh, Christian, 2015. "Communicating supranational governance? The salience of EU affairs in the German Bundestag, 1991–2013," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 16(1), pages 116-138.
    3. Julia Seiermann, 2018. "Only Words? How Power in Trade Agreement Texts Affects International Trade Flows," UNCTAD Blue Series Papers 80, United Nations Conference on Trade and Development.
    4. Arthur Dyevre & Nicolas Lampach, 2021. "Issue attention on international courts: Evidence from the European Court of Justice," The Review of International Organizations, Springer, vol. 16(4), pages 793-815, October.
    5. Dewenter, Ralf & Dulleck, Uwe & Thomas, Tobias, 2018. "The political coverage index and its application to government capture," Research Papers 6, EcoAustria – Institute for Economic Research.
    6. Pastwa, Anna M. & Shrestha, Prabal & Thewissen, James & Torsin, Wouter, 2021. "Unpacking the black box of ICO white papers: a topic modeling approach," LIDAM Discussion Papers LFIN 2021018, Université catholique de Louvain, Louvain Finance (LFIN).
    7. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    8. Milena Djourelova & Ruben Durante, 2019. "Media attention and strategic timing in politics: Evidence from U.S. presidential executive orders," Economics Working Papers 1675, Department of Economics and Business, Universitat Pompeu Fabra.
    9. Mohamed M. Mostafa, 2023. "A one-hundred-year structural topic modeling analysis of the knowledge structure of international management research," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3905-3935, August.
    10. Jian Zhang & Michael S. Vogeley & Chaomei Chen, 2011. "Scientometrics of big science: a case study of research in the Sloan Digital Sky Survey," Scientometrics, Springer;Akadémiai Kiadó, vol. 86(1), pages 1-14, January.
    11. Erkan Işığıçok & Sadullah Çelik & Dilek Özdemir Yılmaz, 2023. "Analysis of Skills and Qualifications Required in Data Scientist Job Postings Based on the Pareto Analysis Perspective Using Text Mining," EKOIST Journal of Econometrics and Statistics, Istanbul University, Faculty of Economics, vol. 0(39), pages 10-25, December.
    12. Yuting Chen & Don Bredin & Valerio Potì & Roman Matkovskyy, 2022. "COVID risk narratives: a computational linguistic approach to the econometric identification of narrative risk during a pandemic," Digital Finance, Springer, vol. 4(1), pages 17-61, March.
    13. Purwoko Haryadi Santoso & Edi Istiyono & Haryanto & Wahyu Hidayatulloh, 2022. "Thematic Analysis of Indonesian Physics Education Research Literature Using Machine Learning," Data, MDPI, vol. 7(11), pages 1-41, October.
    14. Markus Eberhardt & Giovanni Facchini & Valeria Rueda, 2023. "Gender Differences in Reference Letters: Evidence from the Economics Job Market," The Economic Journal, Royal Economic Society, vol. 133(655), pages 2676-2708.
    15. Rauh, Christian, 2018. "Validating a sentiment dictionary for German political language—a workbench note," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 15(4), pages 319-343.
    16. Ferrara, Federico M. & Masciandaro, Donato & Moschella, Manuela & Romelli, Davide, 2022. "Political voice on monetary policy: Evidence from the parliamentary hearings of the European Central Bank," European Journal of Political Economy, Elsevier, vol. 74(C).
    17. James Evans, 2022. "From Text Signals to Simulations: A Review and Complement to Text as Data by Grimmer, Roberts & Stewart (PUP 2022)," Sociological Methods & Research, vol. 51(4), pages 1868-1885, November.
    18. Camilla Salvatore & Silvia Biffignandi & Annamaria Bianchi, 2022. "Corporate Social Responsibility Activities Through Twitter: From Topic Model Analysis to Indexes Measuring Communication Characteristics," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 164(3), pages 1217-1248, December.
    19. Jason Anastasopoulos & George J. Borjas & Gavin G. Cook & Michael Lachanski, 2018. "Job Vacancies, the Beveridge Curve, and Supply Shocks: The Frequency and Content of Help-Wanted Ads in Pre- and Post-Mariel Miami," NBER Working Papers 24580, National Bureau of Economic Research, Inc.
    20. Yang Bao & Anindya Datta, 2014. "Simultaneously Discovering and Quantifying Risk Types from Textual Risk Disclosures," Management Science, INFORMS, vol. 60(6), pages 1371-1391, June.
