Where to find needles in a haystack?

Author

Listed:
  • Zhigen Zhao

    (Temple University)

Abstract

In many existing methods of multiple comparison, one starts with either Fisher’s p value or the local fdr. One commonly used p value, defined as the tail probability exceeding the observed test statistic under the null distribution, fails to use information from the distribution under the alternative hypothesis. The targeted region of signals could be wrong when the likelihood ratio is not monotone. The oracle local fdr based approaches could be optimal because they use the probability density functions of the test statistic under both the null and alternative hypotheses. However, the data-driven version could be problematic because of the difficulty and challenge of probability density function estimation. In this paper, we propose a new method, Cdf and Local fdr Assisted multiple Testing method (CLAT), which is optimal for cases when the p value based methods are optimal and for some other cases when p value based methods are not. Additionally, CLAT only relies on the empirical distribution function which quickly converges to the oracle one. Both the simulations and real data analysis demonstrate the superior performance of the CLAT method. Furthermore, the computation is instantaneous based on a novel algorithm and is scalable to large data sets.
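The contrast between a tail-area p value and the local fdr under a non-monotone likelihood ratio can be seen in a small numerical sketch. This is not the CLAT method; it only evaluates the two oracle quantities in a hypothetical two-group model. The null N(0, 1), the alternative N(2, 0.5^2), and the null proportion pi0 = 0.9 are illustrative assumptions that do not come from the paper.

```python
import math


def norm_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def upper_tail_p(x):
    """One-sided p value under the N(0, 1) null: P(Z >= x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def local_fdr(x, pi0=0.9, mu=2.0, sigma=0.5):
    """Oracle local fdr pi0*f0 / (pi0*f0 + (1 - pi0)*f1).

    pi0, mu and sigma are assumed values chosen for illustration only.
    Because sigma < 1, the likelihood ratio f1/f0 is not monotone in x.
    """
    f0 = norm_pdf(x)
    f1 = norm_pdf(x, mu, sigma)
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)


if __name__ == "__main__":
    # The p value keeps shrinking as x grows, while the local fdr is
    # smallest near x = 8/3 and climbs back toward 1 for large x.
    for x in (1.0, 2.0, 8.0 / 3.0, 4.0, 6.0):
        print(f"x={x:5.2f}  p={upper_tail_p(x):.3e}  lfdr={local_fdr(x):.3f}")
```

Under these assumptions a p value rule rejects a one-sided region of the form x > c, whereas the local fdr favors a bounded interval around x = 8/3. This is the kind of mismatch in the targeted region that the abstract describes.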

Suggested Citation

  • Zhigen Zhao, 2022. "Where to find needles in a haystack?," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 148-174, March.
  • Handle: RePEc:spr:testjl:v:31:y:2022:i:1:d:10.1007_s11749-021-00775-x
    DOI: 10.1007/s11749-021-00775-x

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruth Heller & Saharon Rosset, 2021. "Optimal control of false discovery criteria in the two‐group model," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(1), pages 133-155, February.
    2. T. Tony Cai & Wenguang Sun & Weinan Wang, 2019. "Covariate‐assisted ranking and screening for large‐scale two‐sample inference," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 187-234, April.
    3. Tingting Cui & Pengfei Wang & Wensheng Zhu, 2021. "Covariate-adjusted multiple testing in genome-wide association studies via factorial hidden Markov models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 737-757, September.
    4. Ghosh Debashis, 2012. "Incorporating the Empirical Null Hypothesis into the Benjamini-Hochberg Procedure," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-21, July.
    5. Izmirlian, Grant, 2020. "Strong consistency and asymptotic normality for quantities related to the Benjamini–Hochberg false discovery rate procedure," Statistics & Probability Letters, Elsevier, vol. 160(C).
    6. Cipolli III, William & Hanson, Timothy & McLain, Alexander C., 2016. "Bayesian nonparametric multiple testing," Computational Statistics & Data Analysis, Elsevier, vol. 101(C), pages 64-79.
    7. Jiaying Gu & Roger Koenker, 2020. "Invidious Comparisons: Ranking and Selection as Compound Decisions," Papers 2012.12550, arXiv.org, revised Sep 2021.
    8. Hai Shu & Bin Nan & Robert Koeppe, 2015. "Multiple testing for neuroimaging via hidden Markov random field," Biometrics, The International Biometric Society, vol. 71(3), pages 741-750, September.
    9. Nikolaos Ignatiadis & Wolfgang Huber, 2021. "Covariate powered cross‐weighted multiple testing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(4), pages 720-751, September.
    10. Joungyoun Kim & Donghyeon Yu & Johan Lim & Joong-Ho Won, 2018. "A peeling algorithm for multiple testing on a random field," Computational Statistics, Springer, vol. 33(1), pages 503-525, March.
    11. Jiaying Gu & Roger Koenker, 2023. "Invidious Comparisons: Ranking and Selection as Compound Decisions," Econometrica, Econometric Society, vol. 91(1), pages 1-41, January.
    12. Pallavi Basu & Luella Fu & Alessio Saretto & Wenguang Sun, 2021. "Empirical Bayes Control of the False Discovery Exceedance," Working Papers 2115, Federal Reserve Bank of Dallas.
    13. He, Li & Sarkar, Sanat K. & Zhao, Zhigen, 2015. "Capturing the severity of type II errors in high-dimensional multiple testing," Journal of Multivariate Analysis, Elsevier, vol. 142(C), pages 106-116.
    14. Daniel Yekutieli, 2015. "Bayesian tests for composite alternative hypotheses in cross-tabulated data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 24(2), pages 287-301, June.
    15. Gómez-Villegas Miguel A. & Salazar Isabel & Sanz Luis, 2014. "A Bayesian decision procedure for testing multiple hypotheses in DNA microarray experiments," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(1), pages 49-65, February.
    16. Xiaoquan Wen, 2017. "Robust Bayesian FDR Control Using Bayes Factors, with Applications to Multi-tissue eQTL Discovery," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 28-49, June.
    17. Joshua Habiger & David Watts & Michael Anderson, 2017. "Multiple testing with heterogeneous multinomial distributions," Biometrics, The International Biometric Society, vol. 73(2), pages 562-570, June.
    18. Qingyun Cai & Hock Peng Chan, 2017. "A Double Application of the Benjamini-Hochberg Procedure for Testing Batched Hypotheses," Methodology and Computing in Applied Probability, Springer, vol. 19(2), pages 429-443, June.
    19. Noirrit Kiran Chandra & Sourabh Bhattacharya, 2021. "Asymptotic theory of dependent Bayesian multiple testing procedures under possible model misspecification," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(5), pages 891-920, October.
    20. Alejandro Ochoa & John D Storey & Manuel Llinás & Mona Singh, 2015. "Beyond the E-Value: Stratified Statistics for Protein Domain Prediction," PLOS Computational Biology, Public Library of Science, vol. 11(11), pages 1-21, November.
