
On selection and conditioning in multiple testing and selective inference

Author

Listed:
  • Jelle J Goeman
  • Aldo Solari

Abstract

Summary: We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven collection of hypotheses is chosen from some large universe of hypotheses. Subsequently, inference takes place within this data-driven collection, conditioned on the information that was used for the selection. Examples of such methods include basic data splitting as well as modern data-carving methods and post-selection inference methods for lasso coefficients based on the polyhedral lemma. In this article, we take a holistic view of such methods, considering the selection, conditioning and final error control steps together as a single method. From this perspective, we demonstrate that multiple testing methods defined directly on the full universe of hypotheses are always at least as powerful as selective inference methods based on selection and conditioning. This result holds true even when the universe is potentially infinite and only implicitly defined, such as in the case of data splitting. We provide general theory and intuition before investigating in detail several case studies where a shift to a nonselective or unconditional perspective can yield a power gain.
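
The two-stage process described in the abstract can be illustrated with a small sketch of basic data splitting, viewed as one procedure on the full universe of hypotheses. This is not the authors' code and does not reproduce the paper's power comparison. Everything in it is an assumption made for illustration: the function names `split_and_test` and `estimate_fwer`, the unit-variance normal model, the one-sided nulls H_i: mu_i <= 0, the selection rule (positive mean on the first part) and the Bonferroni correction within the selected set. The simulation only checks the familiar fact that error control conditional on the selection gives error control over the whole universe.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal, used for one-sided p-values


def split_and_test(data, n_select, alpha=0.05):
    """Illustrative two-stage data splitting for H_i: mu_i <= 0.

    data: one row of unit-variance normal observations per hypothesis.
    n_select: number of leading observations used for selection
              (assumed smaller than the row length).
    Returns (selected indices, rejected indices).
    """
    # Stage 1: select hypotheses whose first-part sample mean is positive.
    selected = [i for i, row in enumerate(data) if fmean(row[:n_select]) > 0]
    if not selected:
        return selected, []

    # Stage 2: one-sided z-tests on the held-out part only,
    # with Bonferroni applied over the selected set rather than the universe.
    rejected = []
    for i in selected:
        rest = data[i][n_select:]
        z = fmean(rest) * len(rest) ** 0.5
        p_value = 1.0 - STD_NORMAL.cdf(z)
        if p_value <= alpha / len(selected):
            rejected.append(i)
    return selected, rejected


def estimate_fwer(m=20, n=20, n_select=10, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo familywise error rate over all m hypotheses
    under the global null (every mu_i = 0)."""
    rng = random.Random(seed)
    runs_with_error = 0
    for _ in range(reps):
        data = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
        _, rejected = split_and_test(data, n_select, alpha)
        # Under the global null every rejection is a false rejection.
        runs_with_error += bool(rejected)
    return runs_with_error / reps


if __name__ == "__main__":
    print("estimated FWER under the global null:", estimate_fwer())
```

Calling `estimate_fwer()` treats selection and testing together as a single procedure on all `m` hypotheses, which is the holistic view the abstract takes. Whether a method defined directly on the full universe is more powerful in a given setting is the subject of the article itself and is not something this sketch demonstrates.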

Suggested Citation

  • Jelle J Goeman & Aldo Solari, 2024. "On selection and conditioning in multiple testing and selective inference," Biometrika, Biometrika Trust, vol. 111(2), pages 393-416.
  • Handle: RePEc:oup:biomet:v:111:y:2024:i:2:p:393-416.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/asad078
    Download Restriction: Access to full text is restricted to subscribers.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yoav Benjamini, 2010. "Discovering the false discovery rate," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(4), pages 405-416, September.
    2. Habiger, Joshua D. & Peña, Edsel A., 2014. "Compound p-value statistics for multiple testing procedures," Journal of Multivariate Analysis, Elsevier, vol. 126(C), pages 153-166.
    3. Guillermo Durand & Gilles Blanchard & Pierre Neuvial & Etienne Roquain, 2020. "Post hoc false positive control for structured hypotheses," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1114-1148, December.
    4. D García Rasines & G A Young, 2023. "Splitting strategies for post-selection inference," Biometrika, Biometrika Trust, vol. 110(3), pages 597-614.
    5. Rügamer, David & Baumann, Philipp F.M. & Greven, Sonja, 2022. "Selective inference for additive and linear mixed models," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
    6. Wesley Tansey & Yixin Wang & Raul Rabadan & David Blei, 2020. "Double Empirical Bayes Testing," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 91-113, December.
    7. Wang, Jiangzhou & Cui, Tingting & Zhu, Wensheng & Wang, Pengfei, 2023. "Covariate-modulated large-scale multiple testing under dependence," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    8. Toshiaki Tsukurimichi & Yu Inatsu & Vo Nguyen Le Duy & Ichiro Takeuchi, 2022. "Conditional selective inference for robust regression and outlier detection using piecewise-linear homotopy continuation," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(6), pages 1197-1228, December.
    9. Nikolaos Ignatiadis & Wolfgang Huber, 2021. "Covariate powered cross‐weighted multiple testing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(4), pages 720-751, September.
    10. Haibing Zhao & Wing Kam Fung, 2018. "Controlling mixed directional false discovery rate in multidimensional decisions with applications to microarray studies," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(2), pages 316-337, June.
    11. Andrea C. Garcia‐Angulo & Gerda Claeskens, 2023. "Exact uniformly most powerful postselection confidence distributions," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 50(1), pages 358-382, March.
    12. Qingyuan Zhao & Dylan S. Small & Ashkan Ertefaie, 2022. "Selective inference for effect modification via the lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 382-413, April.
    13. Algo Carè & Simone Garatti & Marco C. Campi, 2017. "A coverage theory for least squares," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1367-1389, November.
    14. Haibing Zhao & Xinping Cui, 2020. "Constructing confidence intervals for selected parameters," Biometrics, The International Biometric Society, vol. 76(4), pages 1098-1108, December.
    15. Markus Pelger & Jiacheng Zou, 2022. "Inference for Large Panel Data with Many Covariates," Papers 2301.00292, arXiv.org, revised Mar 2023.
    16. Michael L. Anderson & Fangwen Lu, 2017. "Learning to Manage and Managing to Learn: The Effects of Student Leadership Service," Management Science, INFORMS, vol. 63(10), pages 3246-3261, October.
    17. Daniel Yekutieli, 2008. "Comments on: Control of the false discovery rate under dependence using the bootstrap and subsampling," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 17(3), pages 458-460, November.
    18. Emilio Depetris-Chauvin & Ömer Özak, 2020. "The origins of the division of labor in pre-industrial times," Journal of Economic Growth, Springer, vol. 25(3), pages 297-340, September.
    19. Fernández Guerrico, Sofía, 2021. "The effects of trade-induced worker displacement on health and mortality in Mexico," Journal of Health Economics, Elsevier, vol. 80(C).
    20. Daniel Bjorkegren & Joshua Blumenstock & Omowunmi Folajimi-Senjobi & Jacqueline Mauro & Suraj R. Nair, 2022. "Instant Loans Can Lift Subjective Well-Being: A Randomized Evaluation of Digital Credit in Nigeria," Papers 2202.13540, arXiv.org.
