
Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population

Author

Listed:
  • Jianyu Liu
  • Wei Sun
  • Yufeng Liu

Abstract

The directed acyclic graph (DAG) is a powerful tool to model the interactions of high‐dimensional variables. While estimating edge directions in a DAG often requires interventional data, one can estimate the skeleton of a DAG (i.e., an undirected graph formed by removing the direction of each edge in a DAG) using observational data. In real data analyses, the samples of the high‐dimensional variables may be collected from a mixture of multiple populations. Each population has its own DAG while the DAGs across populations may have significant overlap. In this article, we propose a two‐step approach to jointly estimate the DAG skeletons of multiple populations while the population origin of each sample may or may not be labeled. In particular, our method allows a probabilistic soft label for each sample, which can be easily computed and often leads to more accurate skeleton estimation than hard labels. Compared with separate estimation of skeletons for each population, our method is more accurate and robust to labeling errors. We study the estimation consistency for our method, and demonstrate its performance using simulation studies in different settings. Finally, we apply our method to analyze gene expression data from breast cancer patients of multiple cancer subtypes.
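
The skeleton definition in the abstract (the undirected graph obtained by removing the direction of each edge in a DAG) can be illustrated with a minimal sketch. The variable names and the two toy DAGs below are hypothetical and are not taken from the article. The code only shows how two DAGs with different edge directions can still share skeleton edges; it does not implement the authors' two-step estimation method.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def skeleton(directed_edges: Iterable[Tuple[str, str]]) -> Set[FrozenSet[str]]:
    """Return the undirected edge set obtained by dropping edge directions."""
    # A frozenset {u, v} ignores order, so (u, v) and (v, u) map to the same edge.
    return {frozenset((u, v)) for u, v in directed_edges}


# Hypothetical DAGs for two populations (illustrative only).
dag_a = [("X1", "X2"), ("X2", "X3"), ("X1", "X3")]
dag_b = [("X2", "X1"), ("X2", "X3")]

skel_a = skeleton(dag_a)
skel_b = skeleton(dag_b)

# Skeleton edges present in both populations, regardless of direction.
shared = skel_a & skel_b

print(sorted(sorted(edge) for edge in shared))
```

Here the edge between X1 and X2 points in opposite directions in the two toy DAGs, yet it appears in both skeletons, which is the kind of overlap a joint skeleton estimator could exploit.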

Suggested Citation

  • Jianyu Liu & Wei Sun & Yufeng Liu, 2019. "Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population," Biometrics, The International Biometric Society, vol. 75(1), pages 36-47, March.
  • Handle: RePEc:bla:biomet:v:75:y:2019:i:1:p:36-47
    DOI: 10.1111/biom.12941

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.12941
    Download Restriction: no



    Cited by:

    1. Lee, Kyoungjae & Cao, Xuan, 2022. "Bayesian joint inference for multiple directed acyclic graphs," Journal of Multivariate Analysis, Elsevier, vol. 191(C).

