Hierarchical correction of p-values via an ultrametric tree running Ornstein-Uhlenbeck process

Authors
  • Antoine Bichat

    (LaMME, Université d’Évry val d’Essonne; Enterome)

  • Christophe Ambroise

    (LaMME, Université d’Évry val d’Essonne)

  • Mahendra Mariadassou

    (MaIAGE, INRAE, Université Paris-Saclay)

Abstract

Statistical testing is classically used as an exploratory tool to search for associations between a phenotype and many possible explanatory variables. This approach often leads to multiple testing under dependence. We assume a hierarchical structure between tests via an Ornstein-Uhlenbeck process on a tree. The correlation structure of the process is used to smooth the p-values. We design a penalized estimation of the mean of the Ornstein-Uhlenbeck process for p-value computation. The performance of the algorithm is assessed via simulations. Its ability to discover new associations is demonstrated on a metagenomic dataset. The corresponding R package is available from https://github.com/abichat/zazou.
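
The smoothing step relies on the correlation that an Ornstein-Uhlenbeck process induces between the leaves of a tree. As a minimal sketch, not the zazou implementation, the Python snippet below assumes a stationary root, under which the covariance between leaves i and j is sigma^2 / (2 * alpha) * exp(-alpha * d_ij), with d_ij the tree distance between the two leaves. It also assumes that p-values are mapped to z-scores with the standard normal quantile function. The function names, the four-leaf tree, and the parameter values are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def p_to_z(p_values):
    """Map p-values in (0, 1) to z-scores via the standard normal quantile.

    Assumption for this sketch: small p-values map to large negative z-scores.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(p) for p in p_values]


def ou_covariance(distances, alpha, sigma):
    """Leaf covariance of an OU process on a tree, assuming a stationary root.

    Sigma[i][j] = sigma^2 / (2 * alpha) * exp(-alpha * d_ij),
    where d_ij is the tree (patristic) distance between leaves i and j.
    """
    stationary_var = sigma ** 2 / (2.0 * alpha)
    return [[stationary_var * math.exp(-alpha * d) for d in row]
            for row in distances]


# Hypothetical ultrametric tree ((A,B),(C,D)) of height 1:
# A and B split at depth 0.6, C and D at depth 0.2, both pairs join at the root.
LEAVES = ["A", "B", "C", "D"]
DISTANCES = [
    [0.0, 0.8, 2.0, 2.0],
    [0.8, 0.0, 2.0, 2.0],
    [2.0, 2.0, 0.0, 1.6],
    [2.0, 2.0, 1.6, 0.0],
]

if __name__ == "__main__":
    # Illustrative p-values, one per leaf (not taken from the paper).
    z_scores = p_to_z([0.01, 0.03, 0.40, 0.75])
    cov = ou_covariance(DISTANCES, alpha=1.0, sigma=1.0)
    for leaf, z, row in zip(LEAVES, z_scores, cov):
        print(leaf, round(z, 3), [round(c, 3) for c in row])
```

Leaves that share a recent ancestor (A and B here) receive the largest covariance, which is what allows information to be borrowed between closely related tests; a larger alpha makes that covariance decay faster with tree distance.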

Suggested Citation

  • Antoine Bichat & Christophe Ambroise & Mahendra Mariadassou, 2022. "Hierarchical correction of p-values via an ultrametric tree running Ornstein-Uhlenbeck process," Computational Statistics, Springer, vol. 37(3), pages 995-1013, July.
  • Handle: RePEc:spr:compst:v:37:y:2022:i:3:d:10.1007_s00180-021-01148-6
    DOI: 10.1007/s00180-021-01148-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-021-01148-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



