
Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews

Authors

  • Zheng Xiang (Virginia Tech; Beijing Union University)
  • Qianzhou Du (Virginia Tech)
  • Yufeng Ma (Virginia Tech)
  • Weiguo Fan (Virginia Tech)

Abstract

As an emerging research paradigm, big data analytics has been gaining currency in various fields. However, the existing hospitality and tourism literature rarely discusses data quality, which may affect the validity and generalizability of research findings. This study examines the reliability of online hotel reviews on TripAdvisor by developing a text classifier that predicts travel purpose (i.e., business vs. leisure) from the textual content of reviews. The classifier is tested over a range of cities and data sizes to examine its sensitivity to data samples. The findings show that, while the classifier's performance is consistent across cities, it varies with data size and sampling method. More importantly, a considerable amount of noise is found in the data, which leads to misclassification. A novel approach is therefore developed to address the misclassification problem resulting from data noise. This study reveals important data quality issues and contributes to the theoretical development of social media analytics in hospitality and tourism.
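
The abstract does not say which classification algorithm or features the authors used, so the following is only a minimal sketch of the general idea: a bag-of-words classifier that predicts travel purpose from review text and is then re-trained on samples of different sizes to see how accuracy responds. The multinomial Naive Bayes model, the names `NaiveBayesTextClassifier` and `accuracy_by_sample_size`, and the toy reviews are all assumptions made for illustration; none of them come from the paper or from TripAdvisor data.

```python
import math
import random
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one smoothing (an assumed model)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def accuracy(model, examples):
    """Share of (text, label) pairs the model labels correctly."""
    hits = sum(model.predict(text) == label for text, label in examples)
    return hits / len(examples)


def accuracy_by_sample_size(train, test, sizes, seed=0):
    """Train on random samples of each size and report test accuracy."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        sample = rng.sample(train, n)
        texts, labels = zip(*sample)
        model = NaiveBayesTextClassifier().fit(texts, labels)
        results[n] = accuracy(model, test)
    return results


# Hypothetical toy reviews, invented for this sketch.
TRAIN = [
    ("Stayed here for a conference, good wifi and a desk in the room", "business"),
    ("Close to the office, quick checkout before my client meeting", "business"),
    ("Work trip; the business center and meeting rooms were fine", "business"),
    ("Our kids loved the pool and the beach was a short walk", "leisure"),
    ("Perfect for a family vacation, great pool and sightseeing nearby", "leisure"),
    ("Romantic weekend getaway with my wife, lovely beach view", "leisure"),
]
TEST = [
    ("Good wifi and meeting rooms for my conference", "business"),
    ("Family vacation by the beach, kids enjoyed the pool", "leisure"),
]

if __name__ == "__main__":
    for n, acc in accuracy_by_sample_size(TRAIN, TEST, sizes=[2, 4, 6]).items():
        print(f"training size {n}: accuracy {acc:.2f}")
```

On real data the same loop could be repeated per city and per sampling method, which is the kind of comparison the abstract describes; mislabeled travel-purpose tags in the training data would show up here as systematic misclassifications.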

Suggested Citation

  • Zheng Xiang & Qianzhou Du & Yufeng Ma & Weiguo Fan, 2018. "Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews," Information Technology & Tourism, Springer, vol. 18(1), pages 43-59, April.
  • Handle: RePEc:spr:infott:v:18:y:2018:i:1:d:10.1007_s40558-017-0098-z
    DOI: 10.1007/s40558-017-0098-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40558-017-0098-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.




