Printed from https://ideas.repec.org/a/cog/poango/v8y2020i2p326-339.html

Integrating Manual and Automatic Annotation for the Creation of Discourse Network Data Sets

Authors

  • Sebastian Haunss (Research Center on Inequality and Social Policy, University of Bremen, Germany)
  • Jonas Kuhn (Institute for Natural Language Processing, University of Stuttgart, Germany)
  • Sebastian Padó (Institute for Natural Language Processing, University of Stuttgart, Germany)
  • Andre Blessing (Institute for Natural Language Processing, University of Stuttgart, Germany)
  • Nico Blokker (Research Center on Inequality and Social Policy, University of Bremen, Germany)
  • Erenay Dayanik (Institute for Natural Language Processing, University of Stuttgart, Germany)
  • Gabriella Lapesa (Institute for Natural Language Processing, University of Stuttgart, Germany)

Abstract

This article investigates the integration of machine learning into the political claim annotation workflow, with the goal of partially automating the annotation and analysis of large text corpora. It introduces the MARDY annotation environment and presents results from an experiment that compares the annotation quality of annotators with and without machine-learning-based annotation support. The design and setting aim to measure and evaluate: a) annotation speed; b) annotation quality; and c) applicability to the use case of discourse network generation. While the results indicate only slight increases in annotation speed, the authors find a moderate boost in annotation quality. Additionally, with the help of manual annotation of the actors and filtering out of the false positives, the machine-learning-based annotation suggestions allow the authors to fully recover the core network of the discourse as extracted from the articles annotated during the experiment. This is due to the redundancy that is naturally present in the annotated texts. Thus, assuming a research focus not on the complete network but on the network core, an AI-based annotation can provide reliable information about discourse networks with much less human intervention than the traditional manual approach.
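
The redundancy argument in the abstract can be illustrated with a small sketch. It is not the authors' method or the MARDY implementation. The actor names, claim labels, function names, and the `min_count` threshold are all hypothetical and chosen only for illustration. The sketch assumes a discourse network is a set of weighted actor-claim edges and that the "core" consists of edges supported by at least a minimum number of annotations.

```python
from collections import Counter

def build_network(annotations):
    """Count how often each (actor, claim) pair was annotated."""
    return Counter((actor, claim) for actor, claim in annotations)

def core_edges(network, min_count=2):
    """Keep only edges supported by at least `min_count` annotations.
    The threshold is an illustrative assumption, not a value from the article."""
    return {edge for edge, n in network.items() if n >= min_count}

# Hypothetical manual annotations: (actor, claim category) pairs.
gold = [
    ("Actor A", "claim 1"), ("Actor A", "claim 1"), ("Actor A", "claim 1"),
    ("Actor B", "claim 2"), ("Actor B", "claim 2"), ("Actor B", "claim 2"),
    ("Actor C", "claim 1"),
]

# Hypothetical automatic annotations: one mention of each frequent edge
# is missed, and the rare edge is missed entirely.
predicted = [
    ("Actor A", "claim 1"), ("Actor A", "claim 1"),
    ("Actor B", "claim 2"), ("Actor B", "claim 2"),
]

gold_core = core_edges(build_network(gold))
pred_core = core_edges(build_network(predicted))
full_gold = set(build_network(gold))
full_pred = set(build_network(predicted))

print("core recovered:", gold_core == pred_core)
print("full network recovered:", full_gold == full_pred)
```

In this toy data the frequently repeated edges survive the missed annotations, so the core is identical in both networks, while the rarely mentioned edge is lost from the full network.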

Suggested Citation

  • Sebastian Haunss & Jonas Kuhn & Sebastian Padó & Andre Blessing & Nico Blokker & Erenay Dayanik & Gabriella Lapesa, 2020. "Integrating Manual and Automatic Annotation for the Creation of Discourse Network Data Sets," Politics and Governance, Cogitatio Press, vol. 8(2), pages 326-339.
  • Handle: RePEc:cog:poango:v8:y:2020:i:2:p:326-339
    DOI: 10.17645/pag.v8i2.2591

    Download full text from publisher

    File URL: https://www.cogitatiopress.com/politicsandgovernance/article/view/2591
    Download Restriction: no

