Author
Listed:
- Nan Yang
(University of East Anglia)
- Nikolaos Korfiatis
(University of East Anglia)
- Dimitris Zissis
(University of East Anglia)
- Konstantina Spanaki
(Audencia Business School)
Abstract
Rating prediction is a crucial element of business analytics as it enables decision-makers to assess service performance based on expressive customer feedback. Enhancing rating score predictions and demand forecasting through incorporating performance features from verbatim text fields, particularly in service quality measurement and customer satisfaction modelling is a key objective in various areas of analytics. A range of methods has been identified in the literature for improving the predictability of customer feedback, including simple bag-of-words-based approaches and advanced supervised machine learning models, which are designed to work with response variables such as Likert-based rating scores. This paper presents a dynamic model that incorporates values from topic membership, an outcome variable from Latent Dirichlet Allocation, with sentiment analysis in an Extreme Gradient Boosting (XGBoost) model used for rating prediction. The results show that, by incorporating features from simple unsupervised machine learning approaches (LDA-based), an 86% prediction accuracy (AUC based) can be achieved on objective rating values. At the same time, a combination of polarity and single-topic membership can yield an even higher accuracy when compared with sentiment text detection tasks both at the document and sentence levels. This study carries significant practical implications since sentiment analysis tasks often require dictionary coverage and domain-specific adjustments depending on the task at hand. To further investigate this result, we used Shapley Additive Values to determine the additive predictability of topic membership values in combination with sentiment-based methods using a dataset of customer reviews from food delivery services.
Suggested Citation
Nan Yang & Nikolaos Korfiatis & Dimitris Zissis & Konstantina Spanaki, 2024.
"Incorporating topic membership in review rating prediction from unstructured data: a gradient boosting approach,"
Annals of Operations Research, Springer, vol. 339(1), pages 631-662, August.
Handle:
RePEc:spr:annopr:v:339:y:2024:i:1:d:10.1007_s10479-023-05336-z
DOI: 10.1007/s10479-023-05336-z
Download full text from publisher
As the access to this document is restricted, you may want to search for a different version of it.
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:annopr:v:339:y:2024:i:1:d:10.1007_s10479-023-05336-z. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.