Printed from https://ideas.repec.org/a/spr/scient/v113y2017i2d10.1007_s11192-017-2516-6.html

Inter-rater reliability and validity of peer reviews in an interdisciplinary field

Author

Listed:
  • Jens Jirschitzka

    (Eberhard Karls Universität Tübingen)

  • Aileen Oeberst

    (Johannes Gutenberg-Universität Mainz
    Leibniz-Institut für Wissensmedien)

  • Richard Göllner

    (Eberhard Karls Universität Tübingen)

  • Ulrike Cress

    (Eberhard Karls Universität Tübingen
    Leibniz-Institut für Wissensmedien)

Abstract

Peer review is an integral part of science. Devised to ensure and enhance the quality of scientific work, it is a crucial step that influences the publication of papers, the provision of grants and, as a consequence, the career of scientists. In order to meet the challenges of this responsibility, a certain shared understanding of scientific quality seems necessary. Yet previous studies have shown that inter-rater reliability in peer reviews is relatively low. However, most of these studies did not take ill-structured measurement design of the data into account. Moreover, no prior (quantitative) study has analyzed inter-rater reliability in an interdisciplinary field. And finally, issues of validity have hardly ever been addressed. Therefore, the three major research goals of this paper are (1) to analyze inter-rater agreement of different rating dimensions (e.g., relevance and soundness) in an interdisciplinary field, (2) to account for ill-structured designs by applying state-of-the-art methods, and (3) to examine the construct and criterion validity of reviewers’ evaluations. A total of 443 reviews were analyzed. These reviews were provided by m = 130 reviewers for n = 145 submissions to an interdisciplinary conference. Our findings demonstrate the urgent need for improvement of scientific peer review. Inter-rater reliability was rather poor and there were no significant differences between evaluations from reviewers of the same scientific discipline as the papers they were reviewing versus reviewer evaluations of papers from disciplines other than their own. These findings extend beyond those of prior research. Furthermore, convergent and discriminant construct validity of the rating dimensions were low as well. Nevertheless, a multidimensional model yielded a better fit than a unidimensional model. Our study also shows that the citation rate of accepted papers was positively associated with the relevance ratings made by reviewers from the same discipline as the paper they were reviewing. In addition, high novelty ratings from same-discipline reviewers were negatively associated with citation rate.

Suggested Citation

  • Jens Jirschitzka & Aileen Oeberst & Richard Göllner & Ulrike Cress, 2017. "Inter-rater reliability and validity of peer reviews in an interdisciplinary field," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(2), pages 1059-1092, November.
  • Handle: RePEc:spr:scient:v:113:y:2017:i:2:d:10.1007_s11192-017-2516-6
    DOI: 10.1007/s11192-017-2516-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-017-2516-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations


    Cited by:

    1. Balázs Győrffy & Andrea Magda Nagy & Péter Herman & Ádám Török, 2018. "Factors influencing the scientific performance of Momentum grant holders: an evaluation of the first 117 research groups," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 409-426, October.
    2. Sergio Copiello, 2018. "On the money value of peer review," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 613-620, April.
    3. García, J.A. & Montero-Parodi, J.J. & Rodriguez-Sánchez, Rosa & Fdez-Valdivia, J., 2023. "How to motivate a reviewer with a present bias to work harder," Journal of Informetrics, Elsevier, vol. 17(4).
    4. Bornmann, Lutz & Haunschild, Robin, 2022. "Empirical analysis of recent temporal dynamics of research fields: Annual publications in chemistry and related areas as an example," Journal of Informetrics, Elsevier, vol. 16(2).
    5. Weixi Xie & Pengfei Jia & Guangyao Zhang & Xianwen Wang, 2024. "Are reviewer scores consistent with citations?," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(8), pages 4721-4740, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rüdiger Mutz & Tobias Wolbring & Hans-Dieter Daniel, 2017. "The effect of the “very important paper” (VIP) designation in Angewandte Chemie International Edition on citation impact: A propensity score matching analysis," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(9), pages 2139-2153, September.
    2. Abramo, Giovanni & D'Angelo, Ciriaco Andrea & Grilli, Leonardo, 2021. "The effects of citation-based research evaluation schemes on self-citation behavior," Journal of Informetrics, Elsevier, vol. 15(4).
    3. Feliciani, Thomas & Morreau, Michael & Luo, Junwen & Lucas, Pablo & Shankar, Kalpana, 2022. "Designing grant-review panels for better funding decisions: Lessons from an empirically calibrated simulation model," Research Policy, Elsevier, vol. 51(4).
    4. Stephen A Gallo & Joanne H Sullivan & Scott R Glisson, 2016. "The Influence of Peer Reviewer Expertise on the Evaluation of Research Funding Applications," PLOS ONE, Public Library of Science, vol. 11(10), pages 1-18, October.
    5. Wiltrud Kuhlisch & Magnus Roos & Jörg Rothe & Joachim Rudolph & Björn Scheuermann & Dietrich Stoyan, 2016. "A statistical approach to calibrating the scores of biased reviewers of scientific papers," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 79(1), pages 37-57, January.
    6. Richard R Snell, 2015. "Menage a Quoi? Optimal Number of Peer Reviewers," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-14, April.
    7. Vieira, Elizabeth S. & Cabral, José A.S. & Gomes, José A.N.F., 2014. "How good is a model based on bibliometric indicators in predicting the final decisions made by peers?," Journal of Informetrics, Elsevier, vol. 8(2), pages 390-405.
    8. Rüdiger Mutz & Lutz Bornmann & Hans-Dieter Daniel, 2015. "Testing for the fairness and predictive validity of research funding decisions: A multilevel multiple imputation for missing data approach using ex-ante and ex-post peer evaluation data from the Austr," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(11), pages 2321-2339, November.
    9. Dell'Anno, Roberto & Caferra, Rocco & Morone, Andrea, 2020. "A “Trojan Horse” in the peer-review process of fee-charging economic journals," Journal of Informetrics, Elsevier, vol. 14(3).
    10. Elena Veretennik & Maria Yudkevich, 2023. "Inconsistent quality signals: evidence from the regional journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(6), pages 3675-3701, June.
    11. Lanu Kim & Jason H. Portenoy & Jevin D. West & Katherine W. Stovel, 2020. "Scientific journals still matter in the era of academic search engines and preprint archives," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 71(10), pages 1218-1226, October.
    12. Lutz Bornmann, 2015. "Interrater reliability and convergent validity of F1000Prime peer review," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(12), pages 2415-2426, December.
    13. Yuetong Chen & Hao Wang & Baolong Zhang & Wei Zhang, 2022. "A method of measuring the article discriminative capacity and its distribution," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3317-3341, June.
    14. Vincent Chandler, 2019. "Identifying emerging scholars: seeing through the crystal ball of scholarship selection committees," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(1), pages 39-56, July.
    15. Marsh, Herbert W. & Jayasinghe, Upali W. & Bond, Nigel W., 2011. "Gender differences in peer reviews of grant applications: A substantive-methodological synergy in support of the null hypothesis model," Journal of Informetrics, Elsevier, vol. 5(1), pages 167-180.
    16. Kok, Holmer & Faems, Dries & de Faria, Pedro, 2022. "Pork Barrel or Barrel of Gold? Examining the performance implications of earmarking in public R&D grants," Research Policy, Elsevier, vol. 51(7).
    17. Dag W. Aksnes & Liv Langfeldt & Paul Wouters, 2019. "Citations, Citation Indicators, and Research Quality: An Overview of Basic Concepts and Theories," SAGE Open, vol. 9(1), pages 21582440198, February.
    18. Bradford Demarest & Guo Freeman & Cassidy R. Sugimoto, 2014. "The reviewer in the mirror: examining gendered and ethnicized notions of reciprocity in peer review," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 717-735, October.
    19. Gaëlle Vallée-Tourangeau & Ana Wheelock & Tushna Vandrevala & Priscilla Harries, 2022. "Peer reviewers’ dilemmas: a qualitative exploration of decisional conflict in the evaluation of grant applications in the medical humanities and social sciences," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-11, December.
    20. Antonio Fernandez-Cano, 2021. "Letter to the Editor: publish, publish … cursed!," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(4), pages 3673-3682, April.
