
Hierarchical testing of variable importance

Author

Listed:
  • Nicolai Meinshausen

Abstract

A frequently encountered challenge in high-dimensional regression is the detection of relevant variables. Variable selection suffers from instability and the power to detect relevant variables is typically low if predictor variables are highly correlated. When taking the multiplicity of the testing problem into account, the power diminishes even further. To gain power and insight, it can be advantageous to look for influence not at the level of individual variables but rather at the level of clusters of highly correlated variables. We propose a hierarchical approach. Variable importance is first tested at the coarsest level, corresponding to the global null hypothesis. The method then tries to attribute any effect to smaller subclusters or even individual variables. The smallest possible clusters, which still exhibit a significant influence on the response variable, are retained. It is shown that the proposed testing procedure controls the familywise error rate at a prespecified level, simultaneously over all resolution levels. The method has power comparable to the Bonferroni-Holm procedure on the level of individual variables and dramatically larger power for coarser resolution levels. The best resolution level is selected adaptively. Copyright 2008, Oxford University Press.
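The top-down procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the test statistic or the multiplicity adjustment, so the sketch assumes (a) a caller-supplied function `p_value` that returns a raw p-value for any cluster of variables, and (b) a size-weighted Bonferroni-type adjustment that multiplies the raw p-value of a cluster C by m/|C|, where m is the total number of variables. The names `Cluster`, `hierarchical_test`, and `p_value` are hypothetical, as are the p-values in the example.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List


@dataclass
class Cluster:
    """A node of a cluster tree; `variables` holds the variable indices in it."""
    variables: FrozenSet[int]
    children: List["Cluster"] = field(default_factory=list)


def hierarchical_test(
    root: Cluster,
    p_value: Callable[[FrozenSet[int]], float],
    alpha: float = 0.05,
) -> List[FrozenSet[int]]:
    """Return the smallest clusters that are still significant.

    Testing starts at the root (the global null hypothesis) and descends
    only into clusters that were rejected.
    """
    m = len(root.variables)
    retained: List[FrozenSet[int]] = []

    def visit(node: Cluster) -> bool:
        # ASSUMPTION: size-weighted adjustment p * m / |C|, capped at 1.
        adjusted = min(1.0, p_value(node.variables) * m / len(node.variables))
        if adjusted > alpha:
            return False  # not rejected: do not test any subcluster
        # Build a list first so that every child is visited.
        child_rejected = [visit(child) for child in node.children]
        if not any(child_rejected):
            # Rejected, but no subcluster is: this is a smallest cluster.
            retained.append(node.variables)
        return True

    visit(root)
    return retained


# Hypothetical example: four variables, two clusters of two variables each.
leaves = [Cluster(frozenset({i})) for i in range(4)]
left = Cluster(frozenset({0, 1}), leaves[:2])
right = Cluster(frozenset({2, 3}), leaves[2:])
root = Cluster(frozenset({0, 1, 2, 3}), [left, right])

# Made-up raw p-values, keyed by cluster.
raw = {
    frozenset({0, 1, 2, 3}): 0.001,
    frozenset({0, 1}): 0.004,
    frozenset({2, 3}): 0.2,
    frozenset({0}): 0.03,
    frozenset({1}): 0.02,
    frozenset({2}): 0.5,
    frozenset({3}): 0.6,
}

result = hierarchical_test(root, raw.__getitem__, alpha=0.05)
```

In this made-up example the global null and the cluster {0, 1} are rejected, while neither variable 0 nor variable 1 is significant on its own after adjustment. The effect is therefore attributed to the cluster {0, 1} and not to a single variable, which is the situation the abstract describes for highly correlated predictors.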

Suggested Citation

  • Nicolai Meinshausen, 2008. "Hierarchical testing of variable importance," Biometrika, Biometrika Trust, vol. 95(2), pages 265-278.
  • Handle: RePEc:oup:biomet:v:95:y:2008:i:2:p:265-278

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/asn007
    Download Restriction: Access to full text is restricted to subscribers.


    Citations



    Cited by:

    1. Kim Kyung In & Roquain Etienne & van de Wiel Mark A, 2010. "Spatial Clustering of Array CGH Features in Combination with Hierarchical Multiple Testing," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-25, November.
    2. Jelle J. Goeman & Stefan Böhringer, 2020. "Comments on: Hierarchical inference for genome-wide association studies by Jelle J. Goeman and Stefan Böhringer," Computational Statistics, Springer, vol. 35(1), pages 41-45, March.
    3. Meijer Rosa J. & Krebs Thijmen J.P. & Goeman Jelle J., 2015. "A region-based multiple testing method for hypotheses ordered in space or time," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 14(1), pages 1-19, February.
    4. Goeman Jelle J. & Finos Livio, 2012. "The Inheritance Procedure: Multiple Testing of Tree-structured Hypotheses," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(1), pages 1-18, January.
    5. Anders Bredahl Kock & David Preinerstorfer, 2021. "Superconsistency of Tests in High Dimensions," Papers 2106.03700, arXiv.org, revised Jan 2022.
    6. Wang Xiaoming & Dinu Irina & Liu Wei & Yasui Yutaka, 2011. "Linear Combination Test for Hierarchical Gene Set Analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-18, March.
    7. Gilles R. Ducharme & Walid Al Akhras, 2016. "Tree based diagnostic procedures following a smooth test of goodness-of-fit," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 79(8), pages 971-989, November.
    8. Paulo C. Rodrigues & Vanda M. Lourenço, 2020. "Comments on: Hierarchical Inference for genome-wide association studies: a view on methodology with software by Paulo C. Rodrigues and Vanda M. Lourenço," Computational Statistics, Springer, vol. 35(1), pages 57-58, March.
    9. T. Tony Cai & Wenguang Sun, 2017. "Optimal screening and discovery of sparse signals with applications to multistage high throughput studies," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 197-223, January.
    10. Claude Renaux & Laura Buzdugan & Markus Kalisch & Peter Bühlmann, 2020. "Rejoinder on: Hierarchical inference for genome-wide association studies: a view on methodology with software," Computational Statistics, Springer, vol. 35(1), pages 59-67, March.
    11. Gao Wang & Abhishek Sarkar & Peter Carbonetto & Matthew Stephens, 2020. "A simple new approach to variable selection in regression, with application to genetic fine mapping," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1273-1300, December.
    12. Guillermo Durand & Gilles Blanchard & Pierre Neuvial & Etienne Roquain, 2020. "Post hoc false positive control for structured hypotheses," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1114-1148, December.
    13. Patrick K. Kimes & Yufeng Liu & David Neil Hayes & James Stephen Marron, 2017. "Statistical significance for hierarchical clustering," Biometrics, The International Biometric Society, vol. 73(3), pages 811-821, September.
    14. Antoine Bichat & Christophe Ambroise & Mahendra Mariadassou, 2022. "Hierarchical correction of p-values via an ultrametric tree running Ornstein-Uhlenbeck process," Computational Statistics, Springer, vol. 37(3), pages 995-1013, July.
    15. Claude Renaux & Laura Buzdugan & Markus Kalisch & Peter Bühlmann, 2020. "Hierarchical inference for genome-wide association studies: a view on methodology with software," Computational Statistics, Springer, vol. 35(1), pages 1-40, March.
    16. Yoav Benjamini, 2010. "Discovering the false discovery rate," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(4), pages 405-416, September.
    17. Rina Foygel Barber & Aaditya Ramdas, 2017. "The p-filter: multilayer false discovery rate control for grouped hypotheses," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1247-1268, September.
