
Reproducible learning in large-scale graphical models

Authors
  • Zhou, Jia
  • Li, Yang
  • Zheng, Zemin
  • Li, Daoji

Abstract

Learning conditional dependence structures through high-dimensional graphical models is of fundamental importance in many contemporary applications. Despite the fast-growing literature on graphical models, the practical issue of reproducibility remains largely unexplored, as most existing methods for graph recovery do not guarantee false discovery rate (FDR) control. In this paper, we propose a new procedure, called the high-dimensional graphical knockoff filter, to control the overall FDR for large-scale graph recovery. The proposed procedure comes with theoretical guarantees and high power, and its FDR control remains robust even when the population precision matrices of the predictors are replaced by consistent estimates. Furthermore, we develop a scalable implementation in which all knockoff variables are generated from a single estimation of the overall graphical structure. Numerical studies support the methodology and the theoretical results.
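
The abstract does not spell out how a knockoff filter turns feature statistics into a selected set. As a rough illustration only, the sketch below implements the generic knockoff+ selection rule from the broader knockoff literature: given one statistic W_j per candidate (large positive values favour the original variable over its knockoff), it picks the smallest threshold whose estimated false discovery proportion is at most the target level q. The function names and the toy statistics are hypothetical, and this is not the paper's high-dimensional graphical procedure, which additionally handles knockoff construction and graph-wide FDR control.

```python
import numpy as np

def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for statistics w at target FDR level q."""
    w = np.asarray(w, dtype=float)
    # Candidate thresholds: distinct nonzero magnitudes, smallest first.
    candidates = np.unique(np.abs(w[w != 0]))
    for t in candidates:
        # Estimated false discovery proportion at threshold t:
        # negatives stand in for false positives; the +1 is the knockoff+ correction.
        fdp_hat = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold meets the target, so nothing is selected

def knockoff_select(w, q):
    """Return the indices j with W_j at or above the knockoff+ threshold."""
    w = np.asarray(w, dtype=float)
    t = knockoff_plus_threshold(w, q)
    return np.flatnonzero(w >= t)

# Toy statistics (made up for illustration), target FDR level 0.2.
w_toy = [5.0, 4.0, 3.5, 3.0, -0.5, 2.5, 2.0, -1.0, 0.2, 1.5]
selected = knockoff_select(w_toy, q=0.2)
```

In a graph-recovery setting the selected indices would correspond to candidate edges or neighbours, but how the statistics are built and aggregated across nodes is specific to each method and is not described in the abstract.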

Suggested Citation

  • Zhou, Jia & Li, Yang & Zheng, Zemin & Li, Daoji, 2022. "Reproducible learning in large-scale graphical models," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
  • Handle: RePEc:eee:jmvana:v:189:y:2022:i:c:s0047259x21002001
    DOI: 10.1016/j.jmva.2021.104934

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X21002001
    Download Restriction: Full text for ScienceDirect subscribers only



