
Topic-Based Document-Level Sentiment Analysis Using Contextual Cues

Author

  • Ciprian-Octavian Truică

    (Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, RO-060042 Bucharest, Romania; these authors contributed equally to this work)

  • Elena-Simona Apostol

    (Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, RO-060042 Bucharest, Romania; these authors contributed equally to this work)

  • Maria-Luiza Șerban

    (Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, RO-060042 Bucharest, Romania; these authors contributed equally to this work)

  • Adrian Paschke

    (Fraunhofer Institute for Open Communication Systems, 10589 Berlin, Germany)

Abstract

Document-level Sentiment Analysis is a complex task that involves analyzing large textual content that can incorporate multiple contradictory polarities at the phrase and word levels. Most current approaches either represent textual data using pre-trained word embeddings without considering the local context that can be extracted from the dataset, or detect the overall topic polarity without considering both the local and global context. In this paper, we propose a novel document-topic embedding model, DocTopic2Vec, for document-level polarity detection in large texts. It employs general and specific contextual cues obtained through document embeddings (Doc2Vec) and Topic Modeling. In our approach, (1) we use a large dataset of game reviews to create different word embeddings by applying Word2Vec, FastText, and GloVe; (2) we create Doc2Vec embeddings enriched with the local context given by the word embeddings for each review; (3) we construct topic embeddings, Topic2Vec, using three Topic Modeling algorithms, i.e., LDA, NMF, and LSI, to enhance the global context of the Sentiment Analysis task; and (4) for each document and its dominant topic, we build the new DocTopic2Vec by concatenating the Doc2Vec with the Topic2Vec created with the same word embedding. We also design six new Convolutional-based (Bidirectional) Recurrent Deep Neural Network Architectures that show promising results for this task. The proposed DocTopic2Vec embeddings are used to benchmark multiple Machine and Deep Learning models, i.e., a Logistic Regression model, used as a baseline, and 18 Deep Neural Network Architectures. The experimental results show that the new embedding and the new Deep Neural Network Architectures achieve better results than the baseline, i.e., Logistic Regression and Doc2Vec.
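
Step (4) of the abstract, concatenating a document embedding with the embedding of the document's dominant topic, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that the document vector is the mean of its word vectors (a stand-in for a trained Doc2Vec model) and that a topic vector is the weight-averaged embedding of the topic's top words. The function names, the toy vocabulary, and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def weighted_mean_vector(vectors: Sequence[Vector],
                         weights: Sequence[float]) -> Vector:
    """Component-wise mean of vectors, weighted by the given weights."""
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, component)) / total
            for component in zip(*vectors)]


def doc_topic_vector(tokens: Sequence[str],
                     doc_topic_weights: Sequence[float],
                     topic_top_words: Sequence[Dict[str, float]],
                     word_vectors: Dict[str, Vector]) -> Vector:
    """Concatenate a document vector with its dominant topic's vector."""
    # Assumption: the document vector is the mean of its in-vocabulary
    # word vectors (a simple stand-in for Doc2Vec).
    doc_vec = mean_vector([word_vectors[t] for t in tokens if t in word_vectors])

    # The dominant topic is the one with the largest weight for this document.
    dominant = max(range(len(doc_topic_weights)),
                   key=lambda k: doc_topic_weights[k])

    # Assumption: the topic vector is the weighted mean of the embeddings
    # of the topic's top words, built from the same word embedding.
    words = list(topic_top_words[dominant])
    weights = [topic_top_words[dominant][w] for w in words]
    topic_vec = weighted_mean_vector([word_vectors[w] for w in words], weights)

    # Concatenation: local (document) context followed by global (topic) context.
    return doc_vec + topic_vec


# Toy data (hypothetical): 2-dimensional word vectors and two topics.
word_vectors = {
    "great": [1.0, 0.0],
    "boring": [0.0, 1.0],
    "story": [0.5, 0.5],
    "graphics": [1.0, 1.0],
}
topic_top_words = [
    {"story": 0.75, "boring": 0.25},    # topic 0
    {"graphics": 0.75, "great": 0.25},  # topic 1
]
review = ["great", "graphics", "unknownword"]
doc_topic_weights = [0.2, 0.8]  # e.g. produced by LDA, NMF, or LSI

embedding = doc_topic_vector(review, doc_topic_weights,
                             topic_top_words, word_vectors)
print(embedding)  # [1.0, 0.5, 1.0, 0.75]
```

The resulting vector has twice the dimensionality of the underlying word embedding and would then be passed to a classifier such as Logistic Regression or a neural network.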

Suggested Citation

  • Ciprian-Octavian Truică & Elena-Simona Apostol & Maria-Luiza Șerban & Adrian Paschke, 2021. "Topic-Based Document-Level Sentiment Analysis Using Contextual Cues," Mathematics, MDPI, vol. 9(21), pages 1-23, October.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:21:p:2722-:d:665812

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/21/2722/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/21/2722/
    Download Restriction: no

    References listed on IDEAS

    1. Yoon, Hyui Geon & Kim, Hyungjun & Kim, Chang Ouk & Song, Min, 2016. "Opinion polarity detection in Twitter data combining shrinkage regression and topic modeling," Journal of Informetrics, Elsevier, vol. 10(2), pages 634-644.
    2. Scott Deerwester & Susan T. Dumais & George W. Furnas & Thomas K. Landauer & Richard Harshman, 1990. "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 41(6), pages 391-407, September.

    Citations



    Cited by:

    1. Qiang Gao & Xiao Huang & Ke Dong & Zhentao Liang & Jiang Wu, 2022. "Semantic-enhanced topic evolution analysis: a combination of the dynamic topic model and word2vec," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(3), pages 1543-1563, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shr-Wei Kao & Pin Luarn, 2020. "Topic Modeling Analysis of Social Enterprises: Twitter Evidence," Sustainability, MDPI, vol. 12(8), pages 1-20, April.
    2. Irina Wedel & Michael Palk & Stefan Voß, 2022. "A Bilingual Comparison of Sentiment and Topics for a Product Event on Twitter," Information Systems Frontiers, Springer, vol. 24(5), pages 1635-1646, October.
    3. Mohammed Salem Binwahlan, 2023. "Polynomial Networks Model for Arabic Text Summarization," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 10(2), pages 74-84, February.
    4. Curci, Ylenia & Mongeau Ospina, Christian A., 2016. "Investigating biofuels through network analysis," Energy Policy, Elsevier, vol. 97(C), pages 60-72.
    5. Chao Wei & Senlin Luo & Xincheng Ma & Hao Ren & Ji Zhang & Limin Pan, 2016. "Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation," PLOS ONE, Public Library of Science, vol. 11(1), pages 1-20, January.
    6. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    7. Ding, Ying, 2011. "Community detection: Topological vs. topical," Journal of Informetrics, Elsevier, vol. 5(4), pages 498-514.
    8. Klaus Gugler & Florian Szücs & Ulrich Wohak, 2023. "Start-up Acquisitions, Venture Capital and Innovation: A Comparative Study of Google, Apple, Facebook, Amazon and Microsoft," Department of Economics Working Papers wuwp340, Vienna University of Economics and Business, Department of Economics.
    9. Juan Shi & Kin Keung Lai & Ping Hu & Gang Chen, 2018. "Factors dominating individual information disseminating behavior on social networking sites," Information Technology and Management, Springer, vol. 19(2), pages 121-139, June.
    10. Ganesh Dash & Chetan Sharma & Shamneesh Sharma, 2023. "Sustainable Marketing and the Role of Social Media: An Experimental Study Using Natural Language Processing (NLP)," Sustainability, MDPI, vol. 15(6), pages 1-16, March.
    11. Paola Cerchiello & Giancarlo Nicola, 2018. "Assessing News Contagion in Finance," Econometrics, MDPI, vol. 6(1), pages 1-19, February.
    12. Gissler, Stefan & Oldfather, Jeremy & Ruffino, Doriana, 2016. "Lending on hold: Regulatory uncertainty and bank lending standards," Journal of Monetary Economics, Elsevier, vol. 81(C), pages 89-101.
    13. Wittek, Peter, 2013. "Two-way incremental seriation in the temporal domain with three-dimensional visualization: Making sense of evolving high-dimensional datasets," Computational Statistics & Data Analysis, Elsevier, vol. 66(C), pages 193-201.
    14. Alina Evstigneeva & Mark Sidorovskiy, 2021. "Assessment of Clarity of Bank of Russia Monetary Policy Communication by Neural Network Approach," Russian Journal of Money and Finance, Bank of Russia, vol. 80(3), pages 3-33, September.
    15. Arno de Caigny & Kristof Coussement & Koen W. de Bock & Stefan Lessmann, 2019. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," Post-Print hal-02275958, HAL.
    16. Hei-Chia Wang & Tzu-Ting Hsu & Yunita Sari, 2019. "Personal research idea recommendation using research trends and a hierarchical topic model," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1385-1406, December.
    17. Borke, Lukas & Härdle, Wolfgang Karl, 2016. "Q3-D3-Lsa," SFB 649 Discussion Papers 2016-049, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
    18. Hiroaki Sugino & Tatsuya Sekiguchi & Yuuki Terada & Naoki Hayashi, 2023. "“Future Compass”, a Tool That Allows Us to See the Right Horizon—Integration of Topic Modeling and Multiple-Factor Analysis," Sustainability, MDPI, vol. 15(13), pages 1-20, June.
    19. David A. Broniatowski, 2018. "Building the tower without climbing it: Progress in engineering systems," Systems Engineering, John Wiley & Sons, vol. 21(3), pages 259-281, May.
    20. Marcin Chlebus & Maciej Stefan Świtała, 2020. "So close and so far. Finding similar tendencies in econometrics and machine learning papers. Topic models comparison," Working Papers 2020-16, Faculty of Economic Sciences, University of Warsaw.
