
Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn

Author

Listed:
  • Phipson Belinda

    (The Walter and Eliza Hall Institute of Medical Research)

  • Smyth Gordon K

    (The Walter and Eliza Hall Institute of Medical Research)

Abstract

Permutation tests are amongst the most commonly used statistical tools in modern genomic research, a process by which p-values are attached to a test statistic by randomly permuting the sample or gene labels. Yet permutation p-values published in the genomic literature are often computed incorrectly, understated by about 1/m, where m is the number of permutations. The same is often true in the more general situation when Monte Carlo simulation is used to assign p-values. Although the p-value understatement is usually small in absolute terms, the implications can be serious in a multiple testing context. The understatement arises from the intuitive but mistaken idea of using permutation to estimate the tail probability of the test statistic. We argue instead that permutation should be viewed as generating an exact discrete null distribution. The relevant literature, some of which is likely to have been relatively inaccessible to the genomic community, is reviewed and summarized. A computation strategy is developed for exact p-values when permutations are randomly drawn. The strategy is valid for any number of permutations and samples. Some simple recommendations are made for the implementation of permutation tests in practice.
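The understatement described in the abstract can be illustrated with a minimal Python sketch. It assumes a two-sample comparison with the absolute difference in group means as the test statistic; the function name `permutation_pvalues`, the statistic, and the example data are hypothetical choices for illustration, not taken from the paper. If b of m randomly drawn permutations give a statistic at least as extreme as the observed one, the naive estimate b/m can be exactly zero, whereas the standard Monte Carlo form (b+1)/(m+1) cannot. The exact p-value computation the paper develops for randomly drawn permutations is not reproduced here.

```python
import random


def permutation_pvalues(x, y, m=999, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (naive, corrected), where naive = b / m and
    corrected = (b + 1) / (m + 1), with b the number of random
    permutations whose statistic is at least as extreme as the
    observed one. Illustrative sketch only (hypothetical helper).
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Absolute difference in means between the first n_x values
        # and the remaining values.
        gx, gy = values[:n_x], values[n_x:]
        return abs(sum(gx) / len(gx) - sum(gy) / len(gy))

    observed = stat(pooled)
    shuffled = pooled[:]
    b = 0
    for _ in range(m):
        rng.shuffle(shuffled)  # random relabelling of the samples
        # Small tolerance so floating-point ties count as "as extreme".
        if stat(shuffled) >= observed - 1e-12:
            b += 1
    return b / m, (b + 1) / (m + 1)


# Made-up, well-separated groups: few permutations are as extreme.
group_a = [10.1, 11.3, 12.2, 10.8, 11.9, 12.5, 11.1, 10.4, 12.0, 11.6]
group_b = [1.0, 0.4, 1.7, 0.9, 1.2, 0.6, 1.5, 0.8, 1.1, 0.3]
naive, corrected = permutation_pvalues(group_a, group_b, m=999, seed=1)
print(f"naive b/m = {naive:.4f}, corrected (b+1)/(m+1) = {corrected:.4f}")
```

The corrected value is bounded below by 1/(m+1), so it stays strictly positive however extreme the observed statistic is. That lower bound is what matters in a multiple testing context, where a reported zero would survive any correction.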

Suggested Citation

  • Phipson Belinda & Smyth Gordon K, 2010. "Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-16, October.
  • Handle: RePEc:bpj:sagmbi:v:9:y:2010:i:1:n:39
    DOI: 10.2202/1544-6115.1585

    Download full text from publisher

    File URL: https://doi.org/10.2202/1544-6115.1585
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. Dørum Guro & Snipen Lars & Solheim Margrete & Sæbø Solve, 2009. "Rotation Testing in Gene Set Enrichment Analysis for Small Direct Comparison Experiments," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-26, July.

    Citations


    Cited by:

    1. Ilenia Lovato & Alessia Pini & Aymeric Stamm & Maxime Taquet & Simone Vantini, 2021. "Multiscale null hypothesis testing for network‐valued data: Analysis of brain networks of patients with autism," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 372-397, March.
    2. Langaas Mette & Bakke Øyvind, 2014. "Robust methods to detect disease-genotype association in genetic association studies: calculate p-values using exact conditional enumeration instead of simulated permutations or asymptotic approximati," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(6), pages 675-692, December.
    3. Fabian J.E. Telschow & Michael R. Pierrynowski & Stephan F. Huckemann, 2021. "Functional inference on rotational curves under sample‐specific group actions and identification of human gait," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(4), pages 1256-1276, December.
    4. Jolene S. Ranek & Wayne Stallaert & J. Justin Milner & Margaret Redick & Samuel C. Wolff & Adriana S. Beltran & Natalie Stanley & Jeremy E. Purvis, 2024. "DELVE: feature selection for preserving biological trajectories in single-cell data," Nature Communications, Nature, vol. 15(1), pages 1-26, December.
    5. Hiromitsu Kobayashi & Chorong Song & Harumi Ikei & Bum-Jin Park & Takahide Kagawa & Yoshifumi Miyazaki, 2017. "Diurnal Changes in Distribution Characteristics of Salivary Cortisol and Immunoglobulin A Concentrations," IJERPH, MDPI, vol. 14(9), pages 1-9, August.
    6. Claude Welcker & Nadir Abusamra Spencer & Olivier Turc & Italo Granato & Romain Chapuis & Delphine Madur & Katia Beauchene & Brigitte Gouesnard & Xavier Draye & Carine Palaffre & Josiane Lorgeou & Ste, 2022. "Physiological adaptive traits are a potential allele reservoir for maize genetic progress under challenging conditions," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    7. Lovato, Ilenia & Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2020. "Model-free two-sample test for network-valued data," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    8. N W Koning & J Hemerik, 2024. "More efficient exact group invariance testing: using a representative subgroup," Biometrika, Biometrika Trust, vol. 111(2), pages 441-458.
    9. Jesse Hemerik & Jelle J. Goeman, 2021. "Another Look at the Lady Tasting Tea and Differences Between Permutation Tests and Randomisation Tests," International Statistical Review, International Statistical Institute, vol. 89(2), pages 367-381, August.
    10. Gunther Glehr & Paloma Riquelme & Katharina Kronenberg & Robert Lohmayer & Víctor J. López-Madrona & Michael Kapinsky & Hans J. Schlitt & Edward K. Geissler & Rainer Spang & Sebastian Haferkamp & Jame, 2024. "Restricting datasets to classifiable samples augments discovery of immune disease biomarkers," Nature Communications, Nature, vol. 15(1), pages 1-21, December.
    11. Hivert, Benjamin & Agniel, Denis & Thiébaut, Rodolphe & Hejblum, Boris P., 2024. "Post-clustering difference testing: Valid inference and practical considerations with applications to ecological and biological data," Computational Statistics & Data Analysis, Elsevier, vol. 193(C).
    12. Lucy L. Gao & Daniela Witten & Jacob Bien, 2022. "Testing for association in multiview network data," Biometrics, The International Biometric Society, vol. 78(3), pages 1018-1030, September.
    13. Chaturvedi Nimisha & Menezes Renée X. de & Goeman Jelle J. & Wieringen Wessel van, 2018. "A test for detecting differential indirect trans effects between two groups of samples," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 17(5), pages 1-11, October.
    14. Yaroslav Rosokha & Kenneth Younge, 2020. "Motivating Innovation: The Effect of Loss Aversion on the Willingness to Persist," The Review of Economics and Statistics, MIT Press, vol. 102(3), pages 569-582, July.
    15. Masha Shunko & Julie Niederhoff & Yaroslav Rosokha, 2018. "Humans Are Not Machines: The Behavioral Impact of Queueing Design on Service Time," Management Science, INFORMS, vol. 64(1), pages 453-473, January.
    16. Angela L. Riffo-Campos & Guillermo Ayala & Juan Domingo, 2021. "Ordering of Omics Features Using Beta Distributions on Montecarlo p -Values," Mathematics, MDPI, vol. 9(11), pages 1-18, June.
    17. Jesse Hemerik & Jelle Goeman, 2018. "Exact testing with random permutations," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(4), pages 811-825, December.
    18. Romero, Julian & Rosokha, Yaroslav, 2018. "Constructing strategies in the indefinitely repeated prisoner’s dilemma game," European Economic Review, Elsevier, vol. 104(C), pages 185-219.
    19. Baddeley, Adrian & Hardegen, Andrew & Lawrence, Thomas & Milne, Robin K. & Nair, Gopalan & Rakshit, Suman, 2017. "On two-stage Monte Carlo tests of composite hypotheses," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 75-87.
    20. Silke Janitza & Ender Celik & Anne-Laure Boulesteix, 2018. "A computationally fast variable importance test for random forests for high-dimensional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(4), pages 885-915, December.
    21. Kristina Handler & Karsten Bach & Costanza Borrelli & Salvatore Piscuoglio & Xenia Ficht & Ilhan E. Acar & Andreas E. Moor, 2023. "Fragment-sequencing unveils local tissue microenvironments at single-cell resolution," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    22. Hiromitsu Kobayashi & Chorong Song & Harumi Ikei & Bum-Jin Park & Juyoung Lee & Takahide Kagawa & Yoshifumi Miyazaki, 2017. "Population-Based Study on the Effect of a Forest Environment on Salivary Cortisol Concentration," IJERPH, MDPI, vol. 14(8), pages 1-9, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dørum Guro & Snipen Lars & Solheim Margrete & Saebo Solve, 2011. "Smoothing Gene Expression Data with Network Information Improves Consistency of Regulated Genes," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-26, August.
