Printed from https://ideas.repec.org/a/eee/infome/v7y2013i1p50-62.html

Caveats for using statistical significance tests in research assessments

Author

Listed:
  • Schneider, Jesper W.

Abstract

This article raises concerns about the advantages of using statistical significance tests in research assessments, as has recently been suggested in the debate about proper normalization procedures for citation indicators by Opthof and Leydesdorff (2010). Statistical significance tests are highly controversial, and numerous criticisms have been leveled against their use. Based on examples from articles by proponents of the use of statistical significance tests in research assessments, we address some of the numerous problems with such tests. The issues specifically discussed are the ritual practice of such tests, their dichotomous application in decision making, the difference between statistical and substantive significance, the implausibility of most null hypotheses, the crucial assumption of randomness, as well as the utility of standard errors and confidence intervals for inferential purposes. We argue that applying statistical significance tests and mechanically adhering to their results are highly problematic and detrimental to critical thinking. We claim that the use of such tests does not provide any advantage in deciding whether differences between citation indicators are important or not; on the contrary, their use may be harmful. Like many other critics, we generally believe that statistical significance tests are over- and misused in the empirical sciences, including scientometrics, and we encourage reform on these matters.
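The abstract's distinction between statistical and substantive significance can be illustrated with a minimal sketch (all numbers below are hypothetical and not taken from the article): with very large samples, a two-sample z-test will flag a negligible difference in mean citation scores as "significant" at any conventional alpha, even though the difference is a tiny fraction of a standard deviation.

```python
import math

def two_sample_z(mean_a, mean_b, sd, n_a, n_b):
    """Two-sample z-test on summary statistics, assuming a common known SD."""
    se = math.sqrt(sd**2 / n_a + sd**2 / n_b)  # standard error of the difference
    z = (mean_b - mean_a) / se
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided p-value
    return z, p

# Hypothetical scenario: two publication sets with nearly identical
# mean citation scores, compared over very many papers.
z, p = two_sample_z(mean_a=5.00, mean_b=5.05, sd=2.0, n_a=100_000, n_b=100_000)
print(f"z = {z:.2f}, p = {p:.1e}")           # "significant" at any common alpha
print(f"standardized difference = {0.05 / 2.0:.3f}")  # only 1/40 of an SD
```

The p-value here answers only whether the difference could plausibly be zero under the test's assumptions; whether a difference of 0.05 citations matters for a research assessment is a substantive judgment the test cannot make.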

Suggested Citation

  • Schneider, Jesper W., 2013. "Caveats for using statistical significance tests in research assessments," Journal of Informetrics, Elsevier, vol. 7(1), pages 50-62.
  • Handle: RePEc:eee:infome:v:7:y:2013:i:1:p:50-62
    DOI: 10.1016/j.joi.2012.08.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S175115771200065X
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2012.08.005?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Deirdre N. McCloskey & Stephen T. Ziliak, 1996. "The Standard Error of Regressions," Journal of Economic Literature, American Economic Association, vol. 34(1), pages 97-114, March.
    2. Armstrong, J. Scott, 2007. "Significance tests harm progress in forecasting," International Journal of Forecasting, Elsevier, vol. 23(2), pages 321-327.
    3. Loet Leydesdorff & Lutz Bornmann & Rüdiger Mutz & Tobias Opthof, 2011. "Turning the tables on citation analysis one more time: Principles for comparing sets of documents," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(7), pages 1370-1381, July.
    4. Mingers, John & Xu, Fang, 2010. "The drivers of citations in management science journals," European Journal of Operational Research, Elsevier, vol. 205(2), pages 422-430, September.
    5. Jürgen Harald Jacob & Siegfried Lehrl & Andreas Wolfram Henkel, 2007. "Early recognition of high quality researchers of the German psychiatry by worldwide accessible bibliometric indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 73(2), pages 117-130, November.
    6. McCloskey, Donald N, 1985. "The Loss Function Has Been Mislaid: The Rhetoric of Significance Tests," American Economic Review, American Economic Association, vol. 75(2), pages 201-205, May.
    7. Western, Bruce & Jackman, Simon, 1994. "Bayesian Inference for Comparative Research," American Political Science Review, Cambridge University Press, vol. 88(2), pages 412-423, June.
    8. van Raan, Anthony F.J. & van Leeuwen, Thed N. & Visser, Martijn S. & van Eck, Nees Jan & Waltman, Ludo, 2010. "Rivals for the crown: Reply to Opthof and Leydesdorff," Journal of Informetrics, Elsevier, vol. 4(3), pages 431-435.
    9. Ludo Waltman & Nees Jan van Eck & Thed N. van Leeuwen & Martijn S. Visser & Anthony F. J. van Raan, 2011. "Towards a new crown indicator: an empirical analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 87(3), pages 467-481, June.
    10. Louis Guttman, 1985. "The illogic of statistical inference for cumulative science," Applied Stochastic Models and Data Analysis, John Wiley & Sons, vol. 1(1), pages 3-9.
    11. Rebecca Long & Aleta Crawford & Michael White & Kimberly Davis, 2009. "Determinants of faculty research productivity in information systems: An empirical analysis of the impact of academic origin and academic affiliation," Scientometrics, Springer;Akadémiai Kiadó, vol. 78(2), pages 231-260, February.
    12. Larivière, Vincent & Gingras, Yves, 2011. "Averages of ratios vs. ratios of averages: An empirical analysis of four levels of aggregation," Journal of Informetrics, Elsevier, vol. 5(3), pages 392-399.
    13. Nick Haslam & Lauren Ban & Leah Kaufmann & Stephen Loughnan & Kim Peters & Jennifer Whelan & Sam Wilson, 2008. "What makes an article influential? Predicting impact in social and personality psychology," Scientometrics, Springer;Akadémiai Kiadó, vol. 76(1), pages 169-185, July.
    14. S. Stremersch & I. Verniers & C. Verhoef, 2006. "The Quest for Citations: Drivers of Article Impact," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 06/422, Ghent University, Faculty of Economics and Business Administration.
    15. Lundberg, Jonas, 2007. "Lifting the crown—citation z-score," Journal of Informetrics, Elsevier, vol. 1(2), pages 145-154.
    16. Ludo Waltman & Nees Jan van Eck & Thed N. van Leeuwen & Martijn S. Visser & Anthony F. J. van Raan, 2011. "On the correlation between bibliometric indicators and peer review: reply to Opthof and Leydesdorff," Scientometrics, Springer;Akadémiai Kiadó, vol. 88(3), pages 1017-1022, September.
    17. Waltman, Ludo & van Eck, Nees Jan & van Leeuwen, Thed N. & Visser, Martijn S. & van Raan, Anthony F.J., 2011. "Towards a new crown indicator: Some theoretical considerations," Journal of Informetrics, Elsevier, vol. 5(1), pages 37-47.
    18. Opthof, Tobias & Leydesdorff, Loet, 2010. "Caveats for the journal and field normalizations in the CWTS (“Leiden”) evaluations of research performance," Journal of Informetrics, Elsevier, vol. 4(3), pages 423-430.
    19. Vieira, E.S. & Gomes, J.A.N.F., 2010. "Citations to scientific articles: Its distribution and dependence on the article features," Journal of Informetrics, Elsevier, vol. 4(1), pages 1-13.
    20. Marsh, Herbert W. & Jayasinghe, Upali W. & Bond, Nigel W., 2011. "Gender differences in peer reviews of grant applications: A substantive-methodological synergy in support of the null hypothesis model," Journal of Informetrics, Elsevier, vol. 5(1), pages 167-180.
    21. Colliander, Cristian & Ahlgren, Per, 2011. "The effects and their stability of field normalization baseline on relative performance with respect to citation impact: A case study of 20 natural science departments," Journal of Informetrics, Elsevier, vol. 5(1), pages 101-113.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Verleysen, Frederik T. & Engels, Tim C.E., 2014. "Barycenter representation of book publishing internationalization in the Social Sciences and Humanities," Journal of Informetrics, Elsevier, vol. 8(1), pages 234-240.
    2. Williams, Richard & Bornmann, Lutz, 2016. "Sampling issues in bibliometric analysis," Journal of Informetrics, Elsevier, vol. 10(4), pages 1225-1232.
    3. Jesper W. Schneider, 2015. "Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(1), pages 411-432, January.
    4. Cinzia Daraio & Simone Di Leo & Loet Leydesdorff, 2023. "A heuristic approach based on Leiden rankings to identify outliers: evidence from Italian universities in the European landscape," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 483-510, January.
    5. Franklin G. Mixon & Benno Torgler & Kamal P. Upadhyaya, 2022. "Committees or Markets? An Exploratory Analysis of Best Paper Awards in Economics," Economies, MDPI, vol. 10(5), pages 1-15, May.
    6. Bornmann, Lutz & Williams, Richard, 2013. "How to calculate the practical significance of citation impact differences? An empirical example from evaluative institutional bibliometrics using adjusted predictions and marginal effects," Journal of Informetrics, Elsevier, vol. 7(2), pages 562-574.
    7. Jae H. Kim, 2022. "Moving to a world beyond p-value," Review of Managerial Science, Springer, vol. 16(8), pages 2467-2493, November.
    8. Lorna Wildgaard & Jesper W. Schneider & Birger Larsen, 2014. "A review of the characteristics of 108 author-level bibliometric indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 125-158, October.
    9. Lu, Chao & Bu, Yi & Dong, Xianlei & Wang, Jie & Ding, Ying & Larivière, Vincent & Sugimoto, Cassidy R. & Paul, Logan & Zhang, Chengzhi, 2019. "Analyzing linguistic complexity and scientific impact," Journal of Informetrics, Elsevier, vol. 13(3), pages 817-829.
    10. Cinzia Daraio & Simone Di Leo & Loet Leydesdorff, 2022. "Using the Leiden Rankings as a Heuristics: Evidence from Italian universities in the European landscape," LEM Papers Series 2022/08, Laboratory of Economics and Management (LEM), Sant'Anna School of Advanced Studies, Pisa, Italy.
    11. Díaz-Faes, Adrián A. & Costas, Rodrigo & Galindo, M. Purificación & Bordons, María, 2015. "Unravelling the performance of individual scholars: Use of Canonical Biplot analysis to explore the performance of scientists by academic rank and scientific field," Journal of Informetrics, Elsevier, vol. 9(4), pages 722-733.
    12. Lorna Wildgaard, 2015. "A comparison of 17 author-level bibliometric indicators for researchers in Astronomy, Environmental Science, Philosophy and Public Health in Web of Science and Google Scholar," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 873-906, September.
    13. Abramo, Giovanni & D’Angelo, Ciriaco Andrea & Grilli, Leonardo, 2015. "Funnel plots for visualizing uncertainty in the research performance of institutions," Journal of Informetrics, Elsevier, vol. 9(4), pages 954-961.
    14. Bonaccorsi, Andrea & Cicero, Tindaro, 2016. "Nondeterministic ranking of university departments," Journal of Informetrics, Elsevier, vol. 10(1), pages 224-237.
    15. Frederik T. Verleysen & Tim C. E. Engels, 2014. "Internationalization of peer reviewed and non-peer reviewed book publications in the Social Sciences and Humanities," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1431-1444, November.
    16. Thelwall, Mike, 2016. "The precision of the arithmetic mean, geometric mean and percentiles for citation data: An experimental simulation modelling approach," Journal of Informetrics, Elsevier, vol. 10(1), pages 110-123.
    17. Engsted, Tom & Schneider, Jesper W., 2023. "Non-Experimental Data, Hypothesis Testing, and the Likelihood Principle: A Social Science Perspective," SocArXiv nztk8, Center for Open Science.
    18. Wildgaard, Lorna, 2016. "A critical cluster analysis of 44 indicators of author-level performance," Journal of Informetrics, Elsevier, vol. 10(4), pages 1055-1078.
    19. Leydesdorff, Loet & Wagner, Caroline S. & Bornmann, Lutz, 2014. "The European Union, China, and the United States in the top-1% and top-10% layers of most-frequently cited publications: Competition and collaborations," Journal of Informetrics, Elsevier, vol. 8(3), pages 606-617.
    20. Chen, Kuan-Ming & Jen, Tsung-Hau & Wu, Margaret, 2014. "Estimating the accuracies of journal impact factor through bootstrap," Journal of Informetrics, Elsevier, vol. 8(1), pages 181-196.
    21. Loet Leydesdorff & Lutz Bornmann & Jonathan Adams, 2019. "The integrated impact indicator revisited (I3*): a non-parametric alternative to the journal impact factor," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1669-1694, June.
    22. Bornmann, Lutz, 2013. "The problem of citation impact assessments for recent publication years in institutional evaluations," Journal of Informetrics, Elsevier, vol. 7(3), pages 722-729.
    23. Judit Bar-Ilan, 2014. "Astrophysics publications on arXiv, Scopus and Mendeley: a case study," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(1), pages 217-225, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    2. Bouyssou, Denis & Marchant, Thierry, 2016. "Ranking authors using fractional counting of citations: An axiomatic approach," Journal of Informetrics, Elsevier, vol. 10(1), pages 183-199.
    3. Dunaiski, Marcel & Geldenhuys, Jaco & Visser, Willem, 2019. "Globalised vs averaged: Bias and ranking performance on the author level," Journal of Informetrics, Elsevier, vol. 13(1), pages 299-313.
    4. Mingers, John & Leydesdorff, Loet, 2015. "A review of theory and practice in scientometrics," European Journal of Operational Research, Elsevier, vol. 246(1), pages 1-19.
    5. Pislyakov, Vladimir, 2022. "On some properties of medians, percentiles, baselines, and thresholds in empirical bibliometric analysis," Journal of Informetrics, Elsevier, vol. 16(4).
    6. Zahedi, Zohreh & Haustein, Stefanie, 2018. "On the relationships between bibliographic characteristics of scientific documents and citation and Mendeley readership counts: A large-scale analysis of Web of Science publications," Journal of Informetrics, Elsevier, vol. 12(1), pages 191-202.
    7. Vinkler, Péter, 2012. "The case of scientometricians with the “absolute relative” impact indicator," Journal of Informetrics, Elsevier, vol. 6(2), pages 254-264.
    8. Dunaiski, Marcel & Geldenhuys, Jaco & Visser, Willem, 2019. "On the interplay between normalisation, bias, and performance of paper impact metrics," Journal of Informetrics, Elsevier, vol. 13(1), pages 270-290.
    9. Rons, Nadine, 2012. "Partition-based Field Normalization: An approach to highly specialized publication records," Journal of Informetrics, Elsevier, vol. 6(1), pages 1-10.
    10. Thelwall, Mike, 2017. "Three practical field normalised alternative indicator formulae for research evaluation," Journal of Informetrics, Elsevier, vol. 11(1), pages 128-151.
    11. Herranz, Neus & Ruiz-Castillo, Javier, 2012. "Sub-field normalization in the multiplicative case: Average-based citation indicators," Journal of Informetrics, Elsevier, vol. 6(4), pages 543-556.
    12. Abramo, Giovanni & D’Angelo, Ciriaco Andrea, 2016. "A farewell to the MNCS and like size-independent indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 646-651.
    13. Larivière, Vincent & Gingras, Yves, 2011. "Averages of ratios vs. ratios of averages: An empirical analysis of four levels of aggregation," Journal of Informetrics, Elsevier, vol. 5(3), pages 392-399.
    14. Ludo Waltman & Nees Jan van Eck & Thed N. van Leeuwen & Martijn S. Visser & Anthony F. J. van Raan, 2011. "Towards a new crown indicator: an empirical analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 87(3), pages 467-481, June.
    15. Tol, Richard S.J., 2013. "Identifying excellent researchers: A new approach," Journal of Informetrics, Elsevier, vol. 7(4), pages 803-810.
    16. Zhou, Ping & Zhong, Yongfeng, 2012. "The citation-based indicator and combined impact indicator—New options for measuring impact," Journal of Informetrics, Elsevier, vol. 6(4), pages 631-638.
    17. Loet Leydesdorff, 2012. "Alternatives to the journal impact factor: I3 and the top-10% (or top-25%?) of the most-highly cited papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 92(2), pages 355-365, August.
    18. Wu, Jiang, 2013. "Investigating the universal distributions of normalized indicators and developing field-independent index," Journal of Informetrics, Elsevier, vol. 7(1), pages 63-71.
    19. Smolinsky, Lawrence, 2016. "Expected number of citations and the crown indicator," Journal of Informetrics, Elsevier, vol. 10(1), pages 43-47.
    20. Thelwall, Mike & Sud, Pardeep, 2016. "National, disciplinary and temporal variations in the extent to which articles with more authors have more impact: Evidence from a geometric field normalised citation indicator," Journal of Informetrics, Elsevier, vol. 10(1), pages 48-61.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:infome:v:7:y:2013:i:1:p:50-62. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/joi.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.