Printed from https://ideas.repec.org/a/gam/jdataj/v8y2023i5p90-d1146474.html

An Efficient Deep Learning for Thai Sentiment Analysis

Author

Listed:
  • Nattawat Khamphakdee

    (Natural Language and Speech Processing Research Group, Department of Computer Science, College of Computing, Khon Kaen University, Khon Kaen 40002, Thailand)

  • Pusadee Seresangtakul

    (Natural Language and Speech Processing Research Group, Department of Computer Science, College of Computing, Khon Kaen University, Khon Kaen 40002, Thailand)

Abstract

The number of customer reviews on travel websites and platforms is increasing quickly. These platforms let people write reviews about their experience of service quality, location, room, and cleanliness, which helps others decide before booking hotels. Many people struggle to evaluate hotels before booking because the numerous reviews take a long time to read, and many are written in a language that is not their own. Thus, hotel businesses need an efficient process to analyze and categorize the polarity of reviews as positive, negative, or neutral. Low-resource languages such as Thai, in particular, have fewer resources available for classifying sentiment polarity. In this paper, a sentiment analysis method is proposed for Thai sentiment classification in the hotel domain. Firstly, the Word2Vec technique (the continuous bag-of-words (CBOW) and skip-gram approaches) was applied to create word embeddings of different vector dimensions. Secondly, each word embedding model was combined with deep learning (DL) models to observe the effect of each word vector dimension. We compared the performance of nine DL models (CNN, LSTM, Bi-LSTM, GRU, Bi-GRU, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-BiGRU) with different numbers of layers in polarity classification. We also classified the dataset with the pre-trained FastText and BERT models. Finally, our experimental results show that the WangchanBERTa model achieved a slightly higher accuracy of 0.9225, while the skip-gram and CNN combination outperformed the other DL models, reaching an accuracy of 0.9170. From the experiments, we found that the word vector dimensions, hyperparameter values, and the number of layers of the DL models affected the performance of sentiment classification. Our research provides guidance for setting suitable hyperparameter values to improve the accuracy of sentiment classification for the Thai language in the hotel domain.
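
The two Word2Vec approaches named in the abstract differ in what they predict: skip-gram predicts each context word from the center word, while CBOW predicts the center word from its surrounding context. The following is a minimal sketch of how the training examples are formed in each case, not the authors' code. The function names, the window size, and the token list are hypothetical, and English stand-in tokens are used in place of segmented Thai words (Thai text would first need word segmentation, since it is written without spaces between words).

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """Build (center, context) pairs: skip-gram predicts context from center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens: List[str], window: int) -> List[Tuple[List[str], str]]:
    """Build (context words, center) examples: CBOW predicts center from context."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:  # skip single-token inputs, which have no context
            examples.append((context, center))
    return examples


# Hypothetical pre-segmented review; stand-in tokens, not data from the paper.
review = ["room", "very", "clean"]

sg = skipgram_pairs(review, window=1)
cb = cbow_examples(review, window=1)

for center, context in sg:
    print(f"skip-gram: {center} -> {context}")
for context, center in cb:
    print(f"CBOW: {context} -> {center}")
```

In a full pipeline these examples would train an embedding matrix of a chosen vector dimension, and the resulting word vectors would feed a classifier such as a CNN; those stages are omitted here.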

Suggested Citation

  • Nattawat Khamphakdee & Pusadee Seresangtakul, 2023. "An Efficient Deep Learning for Thai Sentiment Analysis," Data, MDPI, vol. 8(5), pages 1-22, May.
  • Handle: RePEc:gam:jdataj:v:8:y:2023:i:5:p:90-:d:1146474

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/8/5/90/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/8/5/90/
    Download Restriction: no


