Authors
- Zhiang Wu (School of Computer Science, Nanjing Audit University, Nanjing 211815, China)
- Guannan Liu (School of Economics and Management, Beihang University, Beijing 100191, China; Key Laboratory of Data Intelligence and Management, Ministry of Industry and Information Technology, Beijing 100191, China)
- Junjie Wu (School of Economics and Management, Beihang University, Beijing 100191, China; Key Laboratory of Data Intelligence and Management, Ministry of Industry and Information Technology, Beijing 100191, China)
- Yong Tan (Michael G. Foster School of Business, University of Washington, Seattle, Washington 98195)
Abstract
Review spammers can harm the trustworthy environment of online platforms by purposefully posting unauthentic ratings and comments for products or online merchants, with the aim of gaining improper benefits. Although many methods have been proposed to resolve the spammer detection problem, several challenges, such as collusion recognition, label scarcity, and biased distributions, are still persistent and call for further investigation. Building on prevalent collusive spamming behaviors and the network homophily theory, we introduce a reviewer network to account for explicit coreview relations, and then, we propose a semisupervised probabilistic collaborative learning model to capture both reviewers’ individual behavioral features and the reviewer network. Our model features integrating partial label propagation with a pseudolabeling strategy and feature-based learning for reviewer network modeling, which is proved theoretically to be a weighted logistic regression on a network-derived synthetic data set. The rich parameters that characterize the importance of network information, the strength of network homophily, and the value of unlabeled data make our model more transparent. The empirical evaluations on two distinctive real-life data sets have demonstrated the effectiveness of our model and the significance of unlabeled data, in which the reviewer network after proper trimming demonstrates notable homophily effects and plays a vital role. In particular, the proposed model exhibits robustness against label scarcity and biased label distribution.
Suggested Citation
Zhiang Wu & Guannan Liu & Junjie Wu & Yong Tan, 2024. "Are Neighbors Alike? A Semisupervised Probabilistic Collaborative Learning Model for Online Review Spammers Detection," Information Systems Research, INFORMS, vol. 35(4), pages 1565-1585, December.
Handle: RePEc:inm:orisre:v:35:y:2024:i:4:p:1565-1585
DOI: 10.1287/isre.2022.0047