
Optimal Sufficient Statistics for Parametric and Non-Parametric Multiple Simultaneous Hypothesis Testing

Author

Listed:
  • Oba Shigeyuki

    (Kyoto University and PRESTO, Japan Science and Technology Corporation)

  • Ishii Shin

    (Kyoto University and Nara Institute of Science and Technology)

Abstract

In multiple simultaneous hypothesis testing (MSHT), a significance thresholding function, expressed as a scalar statistic, can be designed adaptively by sharing information among the many tests performed simultaneously. With such an adapted statistic, MSHT has greater detection power than tests that use simple individual statistics. To systematically obtain an optimal thresholding function that maximizes the detection power in MSHT, Storey (2007) proposed a theoretical framework called the optimal discovery procedure (ODP). He also proposed an empirical estimate of the ODP thresholding function for a parametric MSHT that presupposes parametric forms of the null and alternative likelihood functions. Empirical Bayesian testing (Efron et al. 2001), which is based on a non-parametric treatment of arbitrary test statistics, has sometimes exhibited power comparable to that of the ODP. These two MSHT frameworks appear to be closely related but, because their approaches differ (frequentist vs. Bayesian), the relationship is not well understood.

We present the new concept of an optimal sufficient statistic that links the ODP and empirical Bayesian frameworks, and we show that the local false discovery rate based on empirical Bayes can be an optimal thresholding function if a certain condition holds. We lay out exhaustive sets of presumptions needed to achieve optimal thresholding functions and show that, if an optimal thresholding function is derived for a parametric MSHT problem, it remains optimal for a broader range of MSHT problems defined in a non- or semi-parametric way. Our study thus provides a guide to designing optimal thresholding functions for general MSHT problems.
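As a rough illustration of the local false discovery rate mentioned in the abstract, the sketch below assumes the standard two-group mixture model, in which a test statistic z has density f(z) = pi0 * f0(z) + (1 - pi0) * f1(z) and the local false discovery rate is pi0 * f0(z) / f(z). The normal null and alternative densities, the values pi0 = 0.9 and alt_mean = 3.0, the 0.2 cutoff, and the function names are all assumptions made for this example. They are not taken from the paper, which concerns when such a statistic is optimal rather than how to compute it.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0=0.9, alt_mean=3.0):
    """Local false discovery rate under an assumed two-group model.

    Assumptions (illustrative only): the null density is N(0, 1), the
    alternative density is N(alt_mean, 1), and pi0 is the known
    proportion of true null hypotheses.
    """
    f0 = normal_pdf(z)                    # null density
    f1 = normal_pdf(z, mean=alt_mean)     # alternative density
    mixture = pi0 * f0 + (1.0 - pi0) * f1
    return pi0 * f0 / mixture


def select(z_scores, cutoff=0.2, **model):
    """Indices of tests whose local fdr is at or below the cutoff."""
    return [i for i, z in enumerate(z_scores)
            if local_fdr(z, **model) <= cutoff]


# Hypothetical z-scores shared across six simultaneous tests.
z_scores = [-0.4, 0.3, 1.1, 2.6, 3.4, 4.0]
fdr_values = [local_fdr(z) for z in z_scores]
selected = select(z_scores)
```

In this sketch the same thresholding function is applied to every test, so rejecting where the local false discovery rate is small amounts to ranking the tests by a single scalar statistic. In practice pi0 and the mixture density would have to be estimated from the data instead of being fixed in advance.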

Suggested Citation

  • Oba Shigeyuki & Ishii Shin, 2009. "Optimal Sufficient Statistics for Parametric and Non-Parametric Multiple Simultaneous Hypothesis Testing," The International Journal of Biostatistics, De Gruyter, vol. 5(1), pages 1-27, June.
  • Handle: RePEc:bpj:ijbist:v:5:y:2009:i:1:n:20
    DOI: 10.2202/1557-4679.1163



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Van Hanh Nguyen & Catherine Matias, 2014. "On Efficient Estimators of the Proportion of True Null Hypotheses in a Multiple Testing Setup," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 41(4), pages 1167-1194, December.
    2. Shigeyuki Matsui & Hisashi Noma, 2011. "Estimating Effect Sizes of Differentially Expressed Genes for Power and Sample-Size Assessments in Microarray Experiments," Biometrics, The International Biometric Society, vol. 67(4), pages 1225-1235, December.
    3. Axel Gandy & Georg Hahn, 2016. "A Framework for Monte Carlo based Multiple Testing," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(4), pages 1046-1063, December.
    4. Celisse, Alain & Robin, Stephane, 2008. "Nonparametric density estimation by exact leave-p-out cross-validation," Computational Statistics & Data Analysis, Elsevier, vol. 52(5), pages 2350-2368, January.
    5. repec:jss:jstsof:40:i14 is not listed on IDEAS
    6. Han, Bing & Dalal, Siddhartha R., 2012. "A Bernstein-type estimator for decreasing density with application to p-value adjustments," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 427-437.
    7. Ferreira José A. & Berkhof Johannes & Souverein Olga & Zwinderman Koos, 2009. "A Multiple Testing Approach to High-Dimensional Association Studies with an Application to the Detection of Associations between Risk Factors of Heart Disease and Genetic Polymorphisms," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-58, January.
    8. Long Qu & Dan Nettleton & Jack C. M. Dekkers, 2012. "Improved Estimation of the Noncentrality Parameter Distribution from a Large Number of t-Statistics, with Applications to False Discovery Rate Estimation in Microarray Data Analysis," Biometrics, The International Biometric Society, vol. 68(4), pages 1178-1187, December.
    9. Chen, Xiongzhi, 2019. "Uniformly consistently estimating the proportion of false null hypotheses via Lebesgue–Stieltjes integral equations," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 724-744.
    10. Montazeri Zahra & Yanofsky Corey M. & Bickel David R., 2010. "Shrinkage Estimation of Effect Sizes as an Alternative to Hypothesis Testing Followed by Estimation in High-Dimensional Biology: Applications to Differential Gene Expression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-33, June.
    11. Zehetmayer Sonja & Graf Alexandra C. & Posch Martin, 2015. "Sample size reassessment for a two-stage design controlling the false discovery rate," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 14(5), pages 429-442, November.
    12. Dickhaus Thorsten & Straßburger Klaus & Schunk Daniel & Morcillo-Suarez Carlos & Illig Thomas & Navarro Arcadi, 2012. "How to analyze many contingency tables simultaneously in genetic association studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-33, July.
    13. Habiger, Joshua D. & Adekpedjou, Akim, 2014. "Optimal rejection curves for exact false discovery rate control," Statistics & Probability Letters, Elsevier, vol. 94(C), pages 21-28.
    14. Yu, Chang & Zelterman, Daniel, 2017. "A parametric model to estimate the proportion from true null using a distribution for p-values," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 105-118.
    15. Friguet, Chloé & Causeur, David, 2011. "Estimation of the proportion of true null hypotheses in high-dimensional data under dependence," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2665-2676, September.
    16. T. Tony Cai & Wenguang Sun & Weinan Wang, 2019. "Covariate‐assisted ranking and screening for large‐scale two‐sample inference," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 187-234, April.
    17. Marot Guillemette & Mayer Claus-Dieter, 2009. "Sequential Analysis for Microarray Data Based on Sensitivity and Meta-Analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-35, January.
    18. Song Huang & Tiejun Tong & Hongyu Zhao, 2010. "Bias-Corrected Diagonal Discriminant Rules for High-Dimensional Classification," Biometrics, The International Biometric Society, vol. 66(4), pages 1096-1106, December.
    19. Rohit Kumar Patra & Bodhisattva Sen, 2016. "Estimation of a two-component mixture model with applications to multiple testing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(4), pages 869-893, September.
    20. Rossell David & Guerra Rudy & Scott Clayton, 2008. "Semi-Parametric Differential Expression Analysis via Partial Mixture Estimation," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-29, April.
    21. Chang Yu & Daniel Zelterman, 2020. "Distributions associated with simultaneous multiple hypothesis testing," Journal of Statistical Distributions and Applications, Springer, vol. 7(1), pages 1-17, December.
