
Social Media Cross-Source and Cross-Domain Sentiment Classification

Author

Listed:
  • Paola Zola

    (Department of Economy and Management, University of Brescia, 25121, Brescia C.da S. Chiara, 50, Italy)

  • Paulo Cortez

    (ALGORITMI Centre, Department of Information Systems, University of Minho, 4804-533, Guimarães, Portugal)

  • Costantino Ragno

    (School of Science and Technology, University of Camerino, Camerino, Italy)

  • Eugenio Brentari

    (Department of Economy and Management, University of Brescia, 25121, Brescia C.da S. Chiara, 50, Italy)

Abstract

Due to the expansion of the Internet and the Web 2.0 phenomenon, there is growing interest in sentiment analysis of freely opinionated text. In this paper, we propose a novel cross-source, cross-domain sentiment classification approach, in which cross-domain labeled Web sources (Amazon and Tripadvisor) are used to train supervised learning models (including two deep learning algorithms) that are tested on typically unlabeled social media reviews (Facebook and Twitter). We explored a three-step methodology, in which distinct balanced training, text preprocessing and machine learning methods were tested, using two languages: English and Italian. The best results were achieved using undersampling training and a Convolutional Neural Network. Interesting cross-source classification performances were achieved, in particular when using Amazon and Tripadvisor reviews to train a model that is tested on Facebook data for both English and Italian.
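
The abstract names undersampling as the balanced-training option that worked best, but this record does not describe the procedure. As a rough illustration only, generic random undersampling can be sketched as below; the function name `undersample`, the toy reviews and the fixed seed are all assumptions made for this example and are not taken from the paper.

```python
import random
from collections import defaultdict


def undersample(texts, labels, seed=0):
    """Randomly drop examples so each class keeps as many as the rarest class.

    Generic random undersampling; hypothetical helper, not the authors' code.
    """
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Size of the smallest class: every class is cut down to this count.
    n_min = min(len(items) for items in by_label.values())

    rng = random.Random(seed)
    balanced = []
    for label, items in by_label.items():
        for text in rng.sample(items, n_min):
            balanced.append((text, label))

    # Shuffle so the classes are not grouped together in the training set.
    rng.shuffle(balanced)
    return balanced


# Toy, made-up labeled reviews (4 positive, 2 negative).
texts = ["great hotel", "loved it", "works well", "very nice",
         "broke in a day", "terrible service"]
labels = ["pos", "pos", "pos", "pos", "neg", "neg"]

balanced = undersample(texts, labels)
counts = defaultdict(int)
for _, label in balanced:
    counts[label] += 1
```

After this step each class contributes the same number of training examples, at the cost of discarding part of the majority class.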

Suggested Citation

  • Paola Zola & Paulo Cortez & Costantino Ragno & Eugenio Brentari, 2019. "Social Media Cross-Source and Cross-Domain Sentiment Classification," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 18(05), pages 1469-1499, September.
  • Handle: RePEc:wsi:ijitdm:v:18:y:2019:i:05:n:s0219622019500305
    DOI: 10.1142/S0219622019500305

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219622019500305
    Download Restriction: Access to full text is restricted to subscribers


    References listed on IDEAS

    1. Ron S. Kenett & Galit Shmueli, 2014. "On information quality," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 177(1), pages 3-38, January.
    2. Yang Liu & Jian-Wu Bi & Zhi-Ping Fan, 2017. "A Method for Ranking Products Through Online Reviews Based on Sentiment Classification and Interval-Valued Intuitionistic Fuzzy TOPSIS," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(06), pages 1497-1522, November.
    3. Jian Li & Zhenjing Xu & Huijuan Xu & Ling Tang & Lean Yu, 2017. "Forecasting Oil Price Trends with Sentiment of Online News Articles," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 34(02), pages 1-22, April.
    4. Ning Wang & Shanhui Ke & Yibo Chen & Tao Yan & Andrew Lim, 2019. "Textual Sentiment of Chinese Microblog Toward the Stock Market," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 18(02), pages 649-671, March.
    5. Xiangling Fu & Jintae Lee & Chenwei Yan & Li Gao, 2019. "Mining Newsworthy Events in the Traffic Accident Domain from Chinese Microblog," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 18(02), pages 717-742, March.
    6. Yijun Li & Qiang Ye & Ziqiong Zhang & Tienan Wang, 2011. "Snippet-Based Unsupervised Approach For Sentiment Classification Of Chinese Online Reviews," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 10(06), pages 1097-1110.
    7. P. D. Mahendhiran & S. Kannimuthu, 2018. "Deep Learning Techniques for Polarity Classification in Multimodal Sentiment Analysis," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(03), pages 883-910, May.

    Citations



    Cited by:

    1. Davide Giacomini & Paola Zola & Diego Paredi & Mario Mazzoleni, 2020. "Environmental disclosure and stakeholder engagement via social media: State of the art and potential in public utilities," Corporate Social Responsibility and Environmental Management, John Wiley & Sons, vol. 27(4), pages 1552-1564, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pierpaolo D’Urso & Vincenzina Vitale, 2021. "Modeling Local BES Indicators by Copula-Based Bayesian Networks," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 153(3), pages 823-847, February.
    2. Moussa Larbani & Po Lung Yu, 2020. "Empowering Data Mining Sciences by Habitual Domains Theory, Part I: The Concept of Wonderful Solution," Annals of Data Science, Springer, vol. 7(3), pages 373-397, September.
    3. Jun Hao & Qianqian Feng & Jiaxin Yuan & Xiaolei Sun & Jianping Li, 2022. "A dynamic ensemble learning with multi-objective optimization for oil prices prediction," Resources Policy, Elsevier, vol. 79(C).
    4. Zhifang He, 2020. "Dynamic impacts of crude oil price on Chinese investor sentiment: Nonlinear causality and time-varying effect," International Review of Economics & Finance, Elsevier, vol. 66(C), pages 131-153.
    5. Galit Shmueli, 2020. "Discussion on “Assessing the goodness of fit of logistic regression models in large samples: A modification of the Hosmer‐Lemeshow test” by Giovanni Nattino, Michael L. Pennell, and Stanley Lemeshow," Biometrics, The International Biometric Society, vol. 76(2), pages 561-563, June.
    6. Chenxi Zhang & Zeshui Xu, 2024. "Gaining insights for service improvement through unstructured text from online reviews," Journal of Retailing and Consumer Services, Elsevier, vol. 80(C).
    7. Paul Biemer & Dennis Trewin & Heather Bergdahl & Lilli Japec, 2014. "A System for Managing the Quality of Official Statistics," Journal of Official Statistics, Sciendo, vol. 30(3), pages 381-415, September.
    8. Lifeng He & Dongmei Han & Xiaohang Zhou & Zheng Qu, 2020. "The Voice of Drug Consumers: Online Textual Review Analysis Using Structural Topic Model," IJERPH, MDPI, vol. 17(10), pages 1-18, May.
    9. Melda Kokoç & Süleyman Ersöz, 2021. "A literature review of interval-valued intuitionistic fuzzy multi-criteria decision-making methodologies," Operations Research and Decisions, Wroclaw University of Science and Technology, Faculty of Management, vol. 31(4), pages 89-116.
    10. Jerzy Michnik & Artur Grabowski, 2020. "Modeling Uncertainty in the Wings Method Using Interval Arithmetic," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 19(01), pages 221-240, January.
    11. Brian Lucey & Boru Ren, 2021. "Does news tone help forecast oil?," Economic Modelling, Elsevier, vol. 104(C).
    12. Ron S. Kenett & Abraham Rubinstein, 2021. "Generalizing research findings for enhanced reproducibility: an approach based on verbal alternative representations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4137-4151, May.
    13. Rosaria Simone, 2023. "Uncertainty Diagnostics of Binomial Regression Trees for Ordered Rating Data," Journal of Classification, Springer;The Classification Society, vol. 40(1), pages 79-105, April.
    14. Jalil Heidary Dahooie & Romina Raafat & Ali Reza Qorbani & Tugrul Daim, 2021. "An intuitionistic fuzzy data-driven product ranking model using sentiment analysis and multi-criteria decision-making," Technological Forecasting and Social Change, Elsevier, vol. 173(C).
    15. Nikolaos Askitas, 2016. "Big Data is a big deal but how much data do we need? [Big Data gut und schön. Aber wie viel Data brauchen wir?]," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 10(2), pages 113-125, October.
    16. Pierpaolo D’Urso & Vincenzina Vitale, 2020. "Bayesian Networks Model Averaging for Bes Indicators," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 151(3), pages 897-919, October.
    17. Ling Lin & Yong Jiang & Helu Xiao & Zhongbao Zhou, 2020. "Crude oil price forecasting based on a novel hybrid long memory GARCH-M and wavelet analysis model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 543(C).
    18. Federica Cugnata & Silvia Salini, 2014. "Model-based approach for importance–performance analysis," Quality & Quantity: International Journal of Methodology, Springer, vol. 48(6), pages 3053-3064, November.
    19. Shirley Y. Coleman, 2016. "Data-Mining Opportunities for Small and Medium Enterprises with Official Statistics in the UK," Journal of Official Statistics, Sciendo, vol. 32(4), pages 849-865, December.
    20. Jiangwei Liu & Xiaohong Huang, 2021. "Forecasting Crude Oil Price Using Event Extraction," Papers 2111.09111, arXiv.org.
