Printed from https://ideas.repec.org/a/eee/infome/v18y2024i2s1751157724000142.html

Textual features of peer review predict top-cited papers: An interpretable machine learning perspective

Author

Listed:
  • Sun, Zhuanlan

Abstract

Peer review is crucial in improving the quality and reliability of scientific research. However, the mechanisms through which peer review practices ensure papers become top-cited papers (TCPs) after publication are not well understood. In this study, using a data set of 13,066 papers published between 2016 and 2020 in Nature Communications with open peer review reports, we examine how textual features embedded in the peer review reports, which reflect the reviewers’ emotions, may predict whether papers become TCPs. We compiled a list of 15 textual features and classified them into three categories: peer review features, linguistic features, and sentiment features. We then chose the XGBoost machine learning model, which performed best in predicting TCPs, and used the explainable artificial intelligence technique SHAP to interpret how much each feature contributes to the prediction results. The distribution of feature importance rankings demonstrates that sentiment features play a crucial role in determining papers’ potential to be highly cited. This conclusion still holds even when the feature importance ranking changes in subgroup analyses that divide the sample into four disciplines (biological sciences, health sciences, physical sciences, and earth and environmental sciences) and into two groups based on whether reviewers’ identities were revealed. This research emphasizes that the textual features retrieved from peer review reports, which play a role in improving manuscript quality, can predict post-publication research impact.
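
To make the interpretation step concrete, the following is a minimal sketch of the idea behind SHAP: exact Shapley values computed by brute force for a single prediction. It is not the study's pipeline. The scoring function `toy_score` is a hypothetical stand-in for a trained classifier such as XGBoost, the three feature names and their values are invented for illustration, and the all-zero baseline is an assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input x, relative to a baseline input."""
    names = list(x)
    n = len(names)
    values = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Coalition members take their value from x, the rest from baseline.
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_target = dict(without)
                with_target[target] = x[target]
                total += weight * (predict(with_target) - predict(without))
        values[target] = total
    return values


def toy_score(features):
    # Hypothetical scoring function standing in for a trained model.
    return (0.5 * features["positive_sentiment"]
            + 0.2 * features["review_length"]
            + 0.3 * features["positive_sentiment"] * features["readability"])


# Invented feature values for one paper, and an assumed all-zero baseline.
paper = {"positive_sentiment": 1.0, "review_length": 0.5, "readability": 1.0}
baseline = {"positive_sentiment": 0.0, "review_length": 0.0, "readability": 0.0}

phi = shapley_values(toy_score, paper, baseline)
# Rank features by the magnitude of their contribution to this prediction.
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
print(ranking)
```

The contributions sum to the difference between the score for the paper and the score for the baseline, which is the property that lets per-feature attributions be aggregated into an importance ranking. Brute-force enumeration is only feasible for a handful of features; for tree ensembles, SHAP libraries use faster tree-specific algorithms.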

Suggested Citation

  • Sun, Zhuanlan, 2024. "Textual features of peer review predict top-cited papers: An interpretable machine learning perspective," Journal of Informetrics, Elsevier, vol. 18(2).
  • Handle: RePEc:eee:infome:v:18:y:2024:i:2:s1751157724000142
    DOI: 10.1016/j.joi.2024.101501

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157724000142
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations


    Cited by:

    1. Sun, Zhuanlan & He, Dongjin & Li, Yiwei, 2024. "How the readability of manuscript before journal submission advantages peer review process: Evidence from biomedical scientific publications," Journal of Informetrics, Elsevier, vol. 18(3).
    2. Sun, Zhuanlan & Pang, Ka Lok & Li, Yiwei, 2024. "The fading of status bias during the open peer review process," Journal of Informetrics, Elsevier, vol. 18(3).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hou, Li & Wu, Qiang & Xie, Yundong, 2024. "Does open identity of peer reviewers positively relate to citations?," Journal of Informetrics, Elsevier, vol. 18(1).
    2. Sun, Zhuanlan & Clark Cao, C. & Ma, Chao & Li, Yiwei, 2023. "The academic status of reviewers predicts their language use," Journal of Informetrics, Elsevier, vol. 17(4).
    3. Zhang, Guangyao & Xu, Shenmeng & Sun, Yao & Jiang, Chunlin & Wang, Xianwen, 2022. "Understanding the peer review endeavor in scientific publishing," Journal of Informetrics, Elsevier, vol. 16(2).
    4. Bianchi, Federico & Grimaldo, Francisco & Squazzoni, Flaminio, 2019. "The F3-index. Valuing reviewers for scholarly journals," Journal of Informetrics, Elsevier, vol. 13(1), pages 78-86.
    5. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    6. Pengfei Jia & Weixi Xie & Guangyao Zhang & Xianwen Wang, 2023. "Do reviewers get their deserved acknowledgments from the authors of manuscripts?," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(10), pages 5687-5703, October.
    7. Sun, Zhuanlan & He, Dongjin & Li, Yiwei, 2024. "How the readability of manuscript before journal submission advantages peer review process: Evidence from biomedical scientific publications," Journal of Informetrics, Elsevier, vol. 18(3).
    8. Sun, Zhuanlan & Pang, Ka Lok & Li, Yiwei, 2024. "The fading of status bias during the open peer review process," Journal of Informetrics, Elsevier, vol. 18(3).
    9. Chunli Wei & Jingyi Zhao & Jue Ni & Jiang Li, 2023. "What does open peer review bring to scientific articles? Evidence from PLoS journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2763-2776, May.
    10. Zhuanlan Sun & C. Clark Cao & Sheng Liu & Yiwei Li & Chao Ma, 2024. "Behavioral consequences of second-person pronouns in written communications between authors and reviewers of scientific papers," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    11. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    12. Buljan, Ivan & Garcia-Costa, Daniel & Grimaldo, Francisco & Klein, Richard A. & Bakker, Marjan & Marušić, Ana, 2024. "Development and application of a comprehensive glossary for the identification of statistical and methodological concepts in peer review reports," Journal of Informetrics, Elsevier, vol. 18(3).
    13. Akbaritabar, Aliakbar & Stephen, Dimity & Squazzoni, Flaminio, 2022. "A study of referencing changes in preprint-publication pairs across multiple fields," Journal of Informetrics, Elsevier, vol. 16(2).
    14. Shan Jiang, 2021. "Understanding authors' psychological reactions to peer reviews: a text mining approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6085-6103, July.
    15. Bianchi, Federico & García-Costa, Daniel & Grimaldo, Francisco & Squazzoni, Flaminio, 2022. "Measuring the effect of reviewers on manuscript change: A study on a sample of submissions to Royal Society journals (2006–2017)," Journal of Informetrics, Elsevier, vol. 16(3).
    16. Katchanov, Yurij L. & Markova, Yulia V. & Shmatko, Natalia A., 2023. "Uncited papers in the structure of scientific communication," Journal of Informetrics, Elsevier, vol. 17(2).
    17. Sepideh Fahimifar & Khadijeh Mousavi & Fatemeh Mozaffari & Marcel Ausloos, 2023. "Identification of the most important external features of highly cited scholarly papers through 3 (i.e., Ridge, Lasso, and Boruta) feature selection data mining methods," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3685-3712, August.
    18. Wenqing Wu & Haixu Xi & Chengzhi Zhang, 2024. "Are the confidence scores of reviewers consistent with the review content? Evidence from top conference proceedings in AI," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 4109-4135, July.
    19. Akella, Akhil Pandey & Alhoori, Hamed & Kondamudi, Pavan Ravikanth & Freeman, Cole & Zhou, Haiming, 2021. "Early indicators of scientific impact: Predicting citations with altmetrics," Journal of Informetrics, Elsevier, vol. 15(2).
    20. Peter Sjögårde & Fereshteh Didegah, 2022. "The association between topic growth and citation impact of research publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1903-1921, April.
