
Revisiting the Large n (Sample Size) Problem: How to Avert Spurious Significance Results

Author

Listed:
  • Aris Spanos

    (Department of Economics, Virginia Tech, Blacksburg, VA 24061, USA)

Abstract

Although large data sets are generally viewed as advantageous for their ability to provide more precise and reliable evidence, it is often overlooked that these benefits are contingent upon certain conditions being met. The primary condition is the approximate validity (statistical adequacy) of the probabilistic assumptions comprising the statistical model M_θ(x) applied to the data. In the case of a statistically adequate M_θ(x) and a given significance level α, as n increases, the power of a test increases, and the p-value decreases due to the inherent trade-off between type I and type II error probabilities in frequentist testing. This trade-off raises concerns about the reliability of declaring ‘statistical significance’ based on conventional significance levels when n is exceptionally large. To address this issue, the author proposes that a principled approach, in the form of post-data severity (SEV) evaluation, be employed. The SEV evaluation represents a post-data error probability that converts unduly data-specific ‘accept/reject H_0 results’ into evidence either supporting or contradicting inferential claims regarding the parameters of interest. This approach offers a more nuanced and robust perspective in navigating the challenges posed by the large n problem.
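
The large n effect described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the simple Normal model with known σ, the one-sided hypotheses H_0: μ ≤ μ_0 vs. H_1: μ > μ_0, and made-up numbers (a fixed sample mean of 0.01). The function names are hypothetical. The severity formula used is the standard one for this textbook case after a rejection, SEV(μ > μ_1) = Φ(√n (x̄ − μ_1)/σ).

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard Normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * erfc(-z / sqrt(2.0))

def p_value(xbar, mu0, sigma, n):
    """One-sided p-value for H0: mu <= mu0 (assumed simple Normal model, known sigma)."""
    d = sqrt(n) * (xbar - mu0) / sigma   # observed test statistic d(x0)
    return 0.5 * erfc(d / sqrt(2.0))     # upper-tail probability P(Z > d)

def severity_greater(xbar, mu1, sigma, n):
    """Post-data severity of the claim mu > mu1 after rejecting H0 (illustrative)."""
    return normal_cdf(sqrt(n) * (xbar - mu1) / sigma)

# Illustrative values only (assumptions, not data from the paper).
mu0, sigma, xbar = 0.0, 1.0, 0.01

for n in (100, 10_000, 1_000_000):
    p = p_value(xbar, mu0, sigma, n)
    sev_small = severity_greater(xbar, 0.005, sigma, n)  # claim: mu > 0.005
    sev_large = severity_greater(xbar, 0.1, sigma, n)    # claim: mu > 0.1
    print(f"n={n:>9,}  p-value={p:.3e}  "
          f"SEV(mu>0.005)={sev_small:.4f}  SEV(mu>0.1)={sev_large:.4f}")
```

Under these assumptions the same tiny observed discrepancy yields an ever smaller p-value as n grows, while the severity evaluation indicates which discrepancies from μ_0 the rejection does or does not warrant: the claim μ > 0.1 receives severity near zero even at n = 1,000,000.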

Suggested Citation

  • Aris Spanos, 2023. "Revisiting the Large n (Sample Size) Problem: How to Avert Spurious Significance Results," Stats, MDPI, vol. 6(4), pages 1-16, December.
  • Handle: RePEc:gam:jstats:v:6:y:2023:i:4:p:81-1338:d:1294237

    Download full text from publisher

    File URL: https://www.mdpi.com/2571-905X/6/4/81/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2571-905X/6/4/81/
    Download Restriction: no

    References listed on IDEAS

    1. Spanos, Aris, 2010. "Statistical adequacy and the trustworthiness of empirical evidence: Statistical vs. substantive information," Economic Modelling, Elsevier, vol. 27(6), pages 1436-1452, November.
    2. Aris Spanos, 2018. "Mis-Specification Testing In Retrospect," Journal of Economic Surveys, Wiley Blackwell, vol. 32(2), pages 541-577, April.
    3. Aris Spanos, 2009. "Statistical Misspecification and the Reliability of Inference: The Simple T-Test in the Presence of Markov Dependence," Korean Economic Review, Korean Economic Association, vol. 25, pages 165-213.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hanck, Christoph, 2011. "Now, whose schools are really better (or weaker) than Germany's? A multiple testing approach," Economic Modelling, Elsevier, vol. 28(4), pages 1739-1746, July.
    2. Aris Spanos, 2016. "Transforming structural econometrics: substantive vs. statistical premises of inference," Review of Political Economy, Taylor & Francis Journals, vol. 28(3), pages 426-437, July.
    3. Salvati, Luca & Carlucci, Margherita, 2015. "Towards sustainability in agro-forest systems? Grazing intensity, soil degradation and the socioeconomic profile of rural communities in Italy," Ecological Economics, Elsevier, vol. 112(C), pages 1-13.
    4. Niraj Poudyal & Aris Spanos, 2022. "Model Validation and DSGE Modeling," Econometrics, MDPI, vol. 10(2), pages 1-25, April.
    5. Spanos, Aris, 2010. "Akaike-type criteria and the reliability of inference: Model selection versus statistical model specification," Journal of Econometrics, Elsevier, vol. 158(2), pages 204-220, October.
    6. Owen, P. Dorian, 2018. "Replication to assess statistical adequacy," Economics - The Open-Access, Open-Assessment E-Journal (2007-2020), Kiel Institute for the World Economy (IfW Kiel), vol. 12, pages 1-16.
    7. David J. Hand, 2022. "Trustworthiness of statistical inference," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 329-347, January.
    8. Kiviet, Jan F., 2020. "Microeconometric dynamic panel data methods: Model specification and selection issues," Econometrics and Statistics, Elsevier, vol. 13(C), pages 16-45.
    9. Aris Spanos, 2021. "Yule–Simpson’s paradox: the probabilistic versus the empirical conundrum," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(2), pages 605-635, June.
    10. Aris Spanos, 2022. "Statistical modeling and inference in the era of Data Science and Graphical Causal modeling," Journal of Economic Surveys, Wiley Blackwell, vol. 36(5), pages 1251-1287, December.
    11. Niraj Poudyal & Nabin Baral & Stanley T Asah, 2016. "Wolf Lethal Control and Livestock Depredations: Counter-Evidence from Respecified Models," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-8, February.
    12. Francisco Estrada & Víctor Guerrero & Carlos Gay-García & Benjamín Martínez-López, 2013. "A cautionary note on automated statistical downscaling methods for climate change," Climatic Change, Springer, vol. 120(1), pages 263-276, September.
    13. Jae H. Kim, 2022. "Moving to a world beyond p-value," Review of Managerial Science, Springer, vol. 16(8), pages 2467-2493, November.
    14. Aris Spanos, 2022. "Frequentist Model-based Statistical Induction and the Replication Crisis," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 20(1), pages 133-159, September.
    15. Jae H. Kim & Philip I. Ji, 2024. "Testing for signal-to-noise ratio in linear regression: a test under large or massive sample," Review of Managerial Science, Springer, vol. 18(10), pages 3007-3024, October.
