Printed from https://ideas.repec.org/a/plo/pone00/0242050.html

A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts

Authors
  • Vuk Batanović
  • Miloš Cvetanović
  • Boško Nikolić

Abstract

Choosing a comprehensive and cost-effective way of articulating and annotating the sentiment of a text is not a trivial task, particularly when dealing with short texts, in which sentiment can be expressed through a wide variety of linguistic and rhetorical phenomena. This problem is especially conspicuous in resource-limited settings and languages, where design options are restricted either in terms of manpower and financial means required to produce appropriate sentiment analysis resources, or in terms of available language tools, or both. In this paper, we present a versatile approach to addressing this issue, based on multiple interpretations of sentiment labels that encode information regarding the polarity, subjectivity, and ambiguity of a text, as well as the presence of sarcasm or a mixture of sentiments. We demonstrate its use on Serbian, a resource-limited language, via the creation of a main sentiment analysis dataset focused on movie comments, and two smaller datasets belonging to the movie and book domains. In addition to measuring the quality of the annotation process, we propose a novel metric to validate its cost-effectiveness. Finally, the practicality of our approach is further validated by training, evaluating, and determining the optimal configurations of several different kinds of machine-learning models on a range of sentiment classification tasks using the produced dataset.

Suggested Citation

  • Vuk Batanović & Miloš Cvetanović & Boško Nikolić, 2020. "A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-30, November.
  • Handle: RePEc:plo:pone00:0242050
    DOI: 10.1371/journal.pone.0242050

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242050
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0242050&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Igor Mozetič & Miha Grčar & Jasmina Smailović, 2016. "Multilingual Twitter Sentiment Classification: The Role of Human Annotators," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-26, May.
    2. Lowri Williams & Michael Arribas-Ayllon & Andreas Artemiou & Irena Spasić, 2019. "Comparing the Utility of Different Classification Schemes for Emotive Language Analysis," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 619-648, October.

    Citations

    Cited by:

    1. Reem ALBayari & Sherief Abdallah, 2022. "Instagram-Based Benchmark Dataset for Cyberbullying Detection in Arabic Text," Data, MDPI, vol. 7(7), pages 1-11, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peter Gabrovšek & Darko Aleksovski & Igor Mozetič & Miha Grčar, 2017. "Twitter sentiment around the Earnings Announcement events," PLOS ONE, Public Library of Science, vol. 12(2), pages 1-21, February.
    2. Paweł Matuszewski, 2023. "How to prepare data for the automatic classification of politically related beliefs expressed on Twitter? The consequences of researchers’ decisions on the number of coders, the algorithm learning pro," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(1), pages 301-321, February.
