IDEAS home Printed from https://ideas.repec.org/a/sae/socres/v22y2017i2p204-210.html

Significance Testing is Still Wrong, and Damages Real Lives: A Brief Reply to Spreckelsen and Van Der Horst, and Nicholson and McCusker

Author

Listed:
  • Stephen Gorard

Abstract

This paper is a brief reply to two responses to a paper I published previously in this journal. In that first paper I presented a summary of part of the long-standing literature critical of the use of significance testing in real-life research, and reported again on how significance testing is abused, leading to invalid and therefore potentially damaging research outcomes. I illustrated and explained the inverse logic error that is routinely used in significance testing, and argued that all of this should now cease. Although clearly disagreeing with me, neither of the responses to my paper addressed these issues head on. One focussed mainly on arguing with things I had not said (such as that there are no other problems in social science). The other tried to argue either that the inverse logic error is not prevalent, or that there is some other unspecified way of presenting the results of significance testing that does not involve this error. This reply paper summarises my original points, deals with each response paper in turn, and then turns to an examination of how the responders use significance testing in practice in their own studies. All of them use significance testing exactly as I described in the original paper – with non-random cases, and using the probability of the observed data erroneously as though it were the probability of the hypothesis assumed in order to calculate the probability of the observed data.
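The "inverse logic error" the abstract describes can be made concrete with a small Bayes-theorem calculation. The sketch below is illustrative only (the priors and likelihoods are assumed numbers, not figures from the article): a p-value reports P(data | H0), but the quantity researchers usually want is P(H0 | data), and the two can differ markedly.

```python
# Illustrative sketch of the inverse logic error: a p-value is
# P(data | H0), not P(H0 | data). Bayes' theorem shows the gap.
# All numbers below are assumptions chosen for illustration.

prior_h0 = 0.5            # assumed prior probability that H0 is true
p_data_given_h0 = 0.05    # likelihood of the data under H0 (a "significant" p-value)
p_data_given_h1 = 0.26    # assumed likelihood of the same data under H1

# Posterior probability of H0 given the data, via Bayes' theorem
posterior_h0 = (p_data_given_h0 * prior_h0) / (
    p_data_given_h0 * prior_h0 + p_data_given_h1 * (1 - prior_h0)
)

print(f"P(data | H0) = {p_data_given_h0:.2f}")   # prints 0.05
print(f"P(H0 | data) = {posterior_h0:.2f}")      # prints 0.16
```

With these assumed inputs the null hypothesis still has roughly a 16% posterior probability despite the "significant" 0.05 likelihood, so reading the p-value as the probability that H0 is true overstates the evidence. This is the same gap that motivates calibration arguments such as the Sellke, Bayarri and Berger paper cited below.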

Suggested Citation

  • Stephen Gorard, 2017. "Significance Testing is Still Wrong, and Damages Real Lives: A Brief Reply to Spreckelsen and Van Der Horst, and Nicholson and McCusker," Sociological Research Online, vol. 22(2), pages 204-210, May.
  • Handle: RePEc:sae:socres:v:22:y:2017:i:2:p:204-210
    DOI: 10.5153/sro.4281

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.5153/sro.4281
    Download Restriction: no


    References listed on IDEAS

    1. Stephen Gorard, 2016. "Damaging Real Lives through Obstinacy: Re-Emphasising Why Significance Testing is Wrong," Sociological Research Online, vol. 21(1), pages 102-115, February.
    2. Sellke T. & Bayarri M. J. & Berger J. O., 2001. "Calibration of p Values for Testing Precise Null Hypotheses," The American Statistician, American Statistical Association, vol. 55, pages 62-71, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jyotirmoy Sarkar, 2018. "Will P-Value Triumph over Abuses and Attacks?," Biostatistics and Biometrics Open Access Journal, Juniper Publishers Inc., vol. 7(4), pages 66-71, July.
    2. Bachmann, Dirk & Dette, Holger, 2004. "A note on the Bickel-Rosenblatt test in autoregressive time series," Technical Reports 2004,17, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    3. Ferguson John P. & Palejev Dean, 2014. "P-value calibration for multiple testing problems in genomics," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(6), pages 659-673, December.
    4. Guido Consonni & Roberta Paroli, 2017. "Objective Bayesian Comparison of Constrained Analysis of Variance Models," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 589-609, September.
    5. Alessio Farcomeni, 2003. "Unified conditional frequentist and Bayesian testing: computations in practice and sample size determination," Metron - International Journal of Statistics, Dipartimento di Statistica, Probabilità e Statistiche Applicate - University of Rome, vol. 0(2), pages 243-266.
    6. Snyder, Christopher & Zhuo, Ran, 2018. "Sniff Tests in Economics: Aggregate Distribution of Their Probability Values and Implications for Publication Bias," MetaArXiv 8vdrh, Center for Open Science.
    7. Daniel J. Benjamin & James O. Berger & Magnus Johannesson & Brian A. Nosek & E.-J. Wagenmakers & Richard Berk & Kenneth A. Bollen & Björn Brembs & Lawrence Brown & Colin Camerer & David Cesarini & Chr, 2018. "Redefine statistical significance," Nature Human Behaviour, Nature, vol. 2(1), pages 6-10, January.
      • Daniel Benjamin & James Berger & Magnus Johannesson & Brian Nosek & E. Wagenmakers & Richard Berk & Kenneth Bollen & Bjorn Brembs & Lawrence Brown & Colin Camerer & David Cesarini & Christopher Chambe, 2017. "Redefine Statistical Significance," Artefactual Field Experiments 00612, The Field Experiments Website.
    8. Hirschauer Norbert & Mußhoff Oliver & Grüner Sven & Frey Ulrich & Theesfeld Insa & Wagner Peter, 2016. "Die Interpretation des p-Wertes – Grundsätzliche Missverständnisse," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(5), pages 557-575, October.
    9. James Nicholson & Sean McCusker, 2016. "Damaging the Case for Improving Social Science Methodology through Misrepresentation: Re-Asserting Confidence in Hypothesis Testing as a Valid Scientific Process," Sociological Research Online, vol. 21(2), pages 136-147, May.
    10. Gary Koop & Roberto Leon-Gonzalez & Rodney Strachan, 2008. "Bayesian inference in a cointegrating panel data model," Advances in Econometrics, in: Bayesian Econometrics, pages 433-469, Emerald Group Publishing Limited.
    11. Hirschauer Norbert & Grüner Sven & Mußhoff Oliver & Becker Claudia, 2019. "Twenty Steps Towards an Adequate Inferential Interpretation of p-Values in Econometrics," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 239(4), pages 703-721, August.
    12. Jesper W. Schneider, 2018. "NHST is still logically flawed," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 627-635, April.
    13. Jesper W. Schneider, 2015. "Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(1), pages 411-432, January.
    14. Nosek, Brian A. & Ebersole, Charles R. & DeHaven, Alexander Carl & Mellor, David Thomas, 2018. "The Preregistration Revolution," OSF Preprints 2dxu5, Center for Open Science.
    15. D. Vélez & M. E. Pérez & L. R. Pericchi, 2022. "Increasing the replicability for linear models via adaptive significance levels," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(3), pages 771-789, September.
    16. Mayo, Deborah & Morey, Richard Donald, 2017. "A Poor Prognosis for the Diagnostic Screening Critique of Statistical Tests," OSF Preprints ps38b, Center for Open Science.
    17. Kim-Ngan Ta-Thi & Kai-Jen Chuang & Chyi-Huey Bai, 2021. "Association between Migraine and the Risk of Stroke: A Bayesian Meta-Analysis," Sustainability, MDPI, vol. 13(7), pages 1-12, March.
    18. Denes Szucs & John P A Ioannidis, 2017. "Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature," PLOS Biology, Public Library of Science, vol. 15(3), pages 1-18, March.
    19. Dolores Catelan & Manuela Giangreco & Annibale Biggeri & Fabio Barbone & Lorenzo Monasta & Giuseppe Ricci & Federico Romano & Valentina Rosolen & Gabriella Zito & Luca Ronfani, 2021. "Spatial Patterns of Endometriosis Incidence. A Study in Friuli Venezia Giulia (Italy) in the Period 2004–2017," IJERPH, MDPI, vol. 18(13), pages 1-14, July.
    20. Christopher Snyder & Ran Zhuo, 2018. "Sniff Tests as a Screen in the Publication Process: Throwing out the Wheat with the Chaff," NBER Working Papers 25058, National Bureau of Economic Research, Inc.



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.