
Double Empirical Bayes Testing

Authors

  • Wesley Tansey
  • Yixin Wang
  • Raul Rabadan
  • David Blei

Abstract

Analysing data from large‐scale, multiexperiment studies requires scientists both to analyse each experiment and to assess the results as a whole. In this article, we develop double empirical Bayes testing (DEBT), an empirical Bayes method for analysing multiexperiment studies when many covariates are gathered per experiment. DEBT is a two‐stage method: in the first stage, it reports which experiments yielded significant outcomes, and in the second stage, it hypothesises which covariates drive the experimental significance. In both stages, DEBT builds on the work of Efron, who laid out an elegant empirical Bayes approach to testing. DEBT enhances this framework by learning a series of black box predictive models to boost power and control the false discovery rate. In Stage 1, it uses a deep neural network prior to report which experiments yielded significant outcomes. In Stage 2, it uses an empirical Bayes version of the knockoff filter to select covariates that significantly predict Stage 1 significance. In both simulated and real data, DEBT increases the proportion of discovered significant outcomes and selects more features when signals are weak. In a real study of cancer cell lines, DEBT selects a robust set of biologically plausible genomic drivers of drug sensitivity and resistance in cancer.
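
The empirical Bayes testing framework that the abstract attributes to Efron can be illustrated with a minimal sketch of the two-groups model, in which each z-score comes from a null density with probability pi0 and from an alternative density otherwise, and the local false discovery rate is the posterior probability of the null. This is not the DEBT method: it has no neural network prior, no covariates and no knockoff filter. The fixed N(0, 1) null, the single Gaussian alternative, the EM starting values, and the function names `fit_two_groups`, `local_fdr` and `select` are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(z, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at z."""
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_groups(z, n_iter=200):
    """EM for f(z) = pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2).

    Assumption: the null is fixed at the theoretical N(0, 1); only pi0,
    mu1 and sigma1 are estimated. Returns (pi0, mu1, sigma1).
    """
    pi0, mu1, sigma1 = 0.9, 2.0, 1.0  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each z-score is non-null.
        w = []
        for zi in z:
            alt = (1.0 - pi0) * normal_pdf(zi, mu1, sigma1)
            null = pi0 * normal_pdf(zi)
            w.append(alt / (alt + null))
        # M-step: update the mixing weight and the alternative component.
        sw = sum(w)
        pi0 = 1.0 - sw / len(z)
        mu1 = sum(wi * zi for wi, zi in zip(w, z)) / sw
        var1 = sum(wi * (zi - mu1) ** 2 for wi, zi in zip(w, z)) / sw
        sigma1 = max(math.sqrt(var1), 1e-3)  # guard against collapse
    return pi0, mu1, sigma1


def local_fdr(z, pi0, mu1, sigma1):
    """Posterior null probability pi0 * f0(z) / f(z) for each z-score."""
    out = []
    for zi in z:
        null = pi0 * normal_pdf(zi)
        alt = (1.0 - pi0) * normal_pdf(zi, mu1, sigma1)
        out.append(null / (null + alt))
    return out


def select(lfdr, alpha=0.1):
    """Reject the largest set whose mean local fdr is at most alpha."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    chosen, running = [], 0.0
    for k, i in enumerate(order, start=1):
        running += lfdr[i]
        if running / k > alpha:
            break
        chosen.append(i)
    return chosen


# Simulated z-scores (hypothetical data): 20% of experiments are non-null.
random.seed(0)
truth = [random.random() < 0.2 for _ in range(2000)]
z = [random.gauss(3.0, 1.0) if t else random.gauss(0.0, 1.0) for t in truth]

pi0, mu1, sigma1 = fit_two_groups(z)
lfdr = local_fdr(z, pi0, mu1, sigma1)
discoveries = select(lfdr, alpha=0.1)
print(f"estimated pi0={pi0:.3f}, discoveries={len(discoveries)}")
```

Averaging the local false discovery rates over the rejected set gives an estimate of the Bayesian false discovery rate of that set, which is why `select` stops as soon as the running mean exceeds `alpha`. According to the abstract, DEBT replaces the single shared prior used here with one learned from covariates by a deep neural network.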

Suggested Citation

  • Wesley Tansey & Yixin Wang & Raul Rabadan & David Blei, 2020. "Double Empirical Bayes Testing," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 91-113, December.
  • Handle: RePEc:bla:istatr:v:88:y:2020:i:s1:p:s91-s113
    DOI: 10.1111/insr.12430

    Download full text from publisher

    File URL: https://doi.org/10.1111/insr.12430
    Download Restriction: no


    References listed on IDEAS

    1. Wu, Yujun & Boos, Dennis D. & Stefanski, Leonard A., 2007. "Controlling Variable Selection by the Addition of Pseudovariables," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 235-243, March.
    2. Lihua Lei & William Fithian, 2018. "AdaPT: an interactive procedure for multiple testing with side information," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 649-679, September.
    3. Efron B. & Tibshirani R. & Storey J.D. & Tusher V., 2001. "Empirical Bayes Analysis of a Microarray Experiment," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1151-1160, December.
    4. Mathew J. Garnett & Elena J. Edelman & Sonja J. Heidorn & Chris D. Greenman & Anahita Dastur & King Wai Lau & Patricia Greninger & I. Richard Thompson & Xi Luo & Jorge Soares & Qingsong Liu & Francesc, 2012. "Systematic identification of genomic markers of drug sensitivity in cancer cells," Nature, Nature, vol. 483(7391), pages 570-575, March.
    5. Efron, Bradley, 2004. "Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 96-104, January.
    6. Nikia A. Laurie & Stacy L. Donovan & Chie-Schin Shih & Jiakun Zhang & Nicholas Mills & Christine Fuller & Amina Teunisse & Suzanne Lam & Yolande Ramos & Adithi Mohan & Dianna Johnson & Matthew Wilson , 2006. "Inactivation of the p53 pathway in retinoblastoma," Nature, Nature, vol. 444(7115), pages 61-66, November.
    7. Ang Li & Rina Foygel Barber, 2017. "Accumulation Tests for FDR Control in Ordered Hypothesis Testing," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 837-849, April.
    8. Vanesa Fernández-Majada & Patrick-Simon Welz & Maria A. Ermolaeva & Michael Schell & Alexander Adam & Felix Dietlein & David Komander & Reinhard Büttner & Roman K. Thomas & Björn Schumacher & Manolis , 2016. "The tumour suppressor CYLD regulates the p53 DNA damage response," Nature Communications, Nature, vol. 7(1), pages 1-14, November.
    9. James G. Scott & Ryan C. Kelly & Matthew A. Smith & Pengcheng Zhou & Robert E. Kass, 2015. "False Discovery Rate Regression: An Application to Neural Synchrony Detection in Primary Visual Cortex," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(510), pages 459-471, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. T. Tony Cai & Wenguang Sun & Weinan Wang, 2019. "Covariate‐assisted ranking and screening for large‐scale two‐sample inference," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 187-234, April.
    2. Nikolaos Ignatiadis & Wolfgang Huber, 2021. "Covariate powered cross‐weighted multiple testing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(4), pages 720-751, September.
    3. Otília Menyhart & Boglárka Weltz & Balázs Győrffy, 2021. "MultipleTesting.com: A tool for life science researchers for multiple hypothesis testing correction," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-12, June.
    4. Pounds Stanley B. & Gao Cuilan L. & Zhang Hui, 2012. "Empirical Bayesian Selection of Hypothesis Testing Procedures for Analysis of Sequence Count Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(5), pages 1-32, October.
    5. Jelle J Goeman & Aldo Solari, 2024. "On selection and conditioning in multiple testing and selective inference," Biometrika, Biometrika Trust, vol. 111(2), pages 393-416.
    6. He, Yi & Pan, Wei & Lin, Jizhen, 2006. "Cluster analysis using multivariate normal mixture models to detect differential gene expression with microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 641-658, November.
    7. Gordon, Alexander & Chen, Linlin & Glazko, Galina & Yakovlev, Andrei, 2009. "Balancing type one and two errors in multiple testing for differential expression of genes," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1622-1629, March.
    8. Dennis Leung & Wenguang Sun, 2022. "ZAP: Z‐value adaptive procedures for false discovery rate control with side information," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1886-1946, November.
    9. Bickel David R., 2012. "Empirical Bayes Interval Estimates that are Conditionally Equal to Unadjusted Confidence Intervals or to Default Prior Credibility Intervals," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-34, February.
    10. Leek Jeffrey T & Storey John D., 2011. "The Joint Null Criterion for Multiple Hypothesis Tests," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-22, June.
    11. Chen, Yunxiao & Lee, Yi-Hsuan & Li, Xiaoou, 2022. "Item pool quality control in educational testing: change point model, compound risk, and sequential detection," LSE Research Online Documents on Economics 112498, London School of Economics and Political Science, LSE Library.
    12. Campbell R. Harvey & Yan Liu & Heqing Zhu, 2014. ". . . and the Cross-Section of Expected Returns," NBER Working Papers 20592, National Bureau of Economic Research, Inc.
    13. Sairam Rayaprolu & Zhiyi Chi, 2021. "False Discovery Variance Reduction in Large Scale Simultaneous Hypothesis Tests," Methodology and Computing in Applied Probability, Springer, vol. 23(3), pages 711-733, September.
    14. Zhaoyang Tian & Kun Liang & Pengfei Li, 2021. "A powerful procedure that controls the false discovery rate with directional information," Biometrics, The International Biometric Society, vol. 77(1), pages 212-222, March.
    15. Chen, Yunxiao & Lu, Yan & Moustaki, Irini, 2022. "Detection of two-way outliers in multivariate data and application to cheating detection in educational tests," LSE Research Online Documents on Economics 112499, London School of Economics and Political Science, LSE Library.
    16. Joshua Habiger & Edsel Peña, 2011. "Randomised p-values and nonparametric procedures in multiple testing," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 23(3), pages 583-604.
    17. Bickel David R., 2008. "Correcting the Estimated Level of Differential Expression for Gene Selection Bias: Application to a Microarray Study," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-27, March.
    18. Pallavi Basu & Luella Fu & Alessio Saretto & Wenguang Sun, 2021. "Empirical Bayes Control of the False Discovery Exceedance," Working Papers 2115, Federal Reserve Bank of Dallas.
    19. Habiger, Joshua D. & Peña, Edsel A., 2014. "Compound p-value statistics for multiple testing procedures," Journal of Multivariate Analysis, Elsevier, vol. 126(C), pages 153-166.
    20. Kline, Patrick & Walters, Christopher, 2019. "Audits as Evidence: Experiments, Ensembles, and Enforcement," Institute for Research on Labor and Employment, Working Paper Series qt3z72m9kn, Institute of Industrial Relations, UC Berkeley.
