
FastGGM: An Efficient Algorithm for the Inference of Gaussian Graphical Model in Biological Networks

Author

Listed:
  • Ting Wang
  • Zhao Ren
  • Ying Ding
  • Zhou Fang
  • Zhe Sun
  • Matthew L MacDonald
  • Robert A Sweet
  • Jieru Wang
  • Wei Chen

Abstract

Biological networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single variables. The Gaussian graphical model (GGM), a probability model that characterizes the conditional dependence structure of a set of random variables by a graph, has wide applications in the analysis of biological networks, such as inferring interactions or comparing differential networks. However, existing approaches are either not statistically rigorous or are inefficient for making inference on high-dimensional data that include tens of thousands of variables. In this study, we propose an efficient algorithm to implement the estimation of GGM and obtain a p-value and confidence interval for each edge in the graph, based on a recent proposal by Ren et al., 2015. Through simulation studies, we demonstrate that the algorithm is faster by several orders of magnitude than the currently implemented algorithm for Ren et al. without losing any accuracy. Then, we apply our algorithm to two real data sets: transcriptomic data from a study of childhood asthma and proteomic data from a study of Alzheimer’s disease. We estimate the global gene or protein interaction networks for the disease and healthy samples. The resulting networks reveal interesting interactions, and the differential networks between cases and controls show functional relevance to the diseases. In conclusion, we provide a computationally fast algorithm to implement a statistically sound procedure for constructing Gaussian graphical models and making inference with high-dimensional biological data. The algorithm has been implemented in an R package named “FastGGM”.

Author Summary: The Gaussian graphical model (GGM), a probability model for characterizing conditional dependence among a set of random variables, has been widely used in studying biological networks. It is important and practical to make inference with rigorous statistical properties and high efficiency under a high-dimensional setting, which is common in biological systems that usually contain tens of thousands of molecular elements, such as genes and proteins. This work proposes a novel efficient algorithm, FastGGM, to implement the asymptotically normal estimation of large GGM established by Ren et al. [1]. It quickly estimates the precision matrix and partial correlations, as well as p-values and confidence intervals for the graph. Simulation studies demonstrate that our algorithm outperforms the current algorithm for Ren et al. and algorithms for some other estimation methods, and real data analyses further demonstrate its efficiency in studying biological networks. In conclusion, FastGGM is a statistically sound and computationally fast algorithm for constructing GGM with high-dimensional data. An R package for implementation can be downloaded from http://www.pitt.edu/~wec47/FastGGM.html.
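
As a minimal illustration of the quantities named above (precision matrix, partial correlations, and graph edges), the following Python sketch applies the standard identity that the partial correlation between variables i and j equals -ω_ij / sqrt(ω_ii ω_jj), where ω is the precision (inverse covariance) matrix. It is not the FastGGM algorithm and does not compute p-values or confidence intervals. The three-variable chain, the helper names `invert` and `partial_correlations`, and the use of an exact rather than estimated covariance matrix are all assumptions made for illustration.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right-hand side.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest absolute entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(covariance):
    """Return the matrix of partial correlations implied by a covariance matrix."""
    omega = invert(covariance)  # precision matrix
    n = len(omega)
    return [
        [
            1.0 if i == j else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical chain X1 - X2 - X3: X1 and X3 interact only through X2,
# so the (1, 3) entry of the precision matrix is zero.
precision = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
covariance = invert(precision)
rho = partial_correlations(covariance)

print("marginal covariance of X1 and X3:", covariance[0][2])
print("partial correlation of X1 and X3:", rho[0][2])
print("partial correlation of X1 and X2:", rho[0][1])
```

In this toy example X1 and X3 have a nonzero marginal covariance, yet their partial correlation is zero, which corresponds to a missing edge in the graph. With real data the precision matrix has to be estimated, and in high dimensions the sample covariance is not invertible, which is the setting the abstract addresses.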

Suggested Citation

  • Ting Wang & Zhao Ren & Ying Ding & Zhou Fang & Zhe Sun & Matthew L MacDonald & Robert A Sweet & Jieru Wang & Wei Chen, 2016. "FastGGM: An Efficient Algorithm for the Inference of Gaussian Graphical Model in Biological Networks," PLOS Computational Biology, Public Library of Science, vol. 12(2), pages 1-16, February.
  • Handle: RePEc:plo:pcbi00:1004755
    DOI: 10.1371/journal.pcbi.1004755

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004755
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004755&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Yin, Jianxin & Li, Hongzhe, 2013. "Adjusting for high-dimensional covariates in sparse precision matrix estimation by ℓ1-penalization," Journal of Multivariate Analysis, Elsevier, vol. 116(C), pages 365-381.
    2. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    3. Ming Yuan & Yi Lin, 2007. "Model selection and estimation in the Gaussian graphical model," Biometrika, Biometrika Trust, vol. 94(1), pages 19-35.

    Citations



    Cited by:

    1. Rong Zhang & Zhao Ren & Wei Chen, 2018. "SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-14, August.
    2. Ning Zhang & Jin Yang, 2023. "Sparse precision matrix estimation with missing observations," Computational Statistics, Springer, vol. 38(3), pages 1337-1355, September.
    3. Laurenţiu Cătălin Hinoveanu & Fabrizio Leisen & Cristiano Villa, 2020. "A loss‐based prior for Gaussian graphical models," Australian & New Zealand Journal of Statistics, Australian Statistical Publishing Association Inc., vol. 62(4), pages 444-466, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Avagyan, Vahe & Nogales, Francisco J., 2015. "D-trace Precision Matrix Estimation Using Adaptive Lasso Penalties," DES - Working Papers. Statistics and Econometrics. WS 21775, Universidad Carlos III de Madrid. Departamento de Estadística.
    2. Dong Liu & Changwei Zhao & Yong He & Lei Liu & Ying Guo & Xinsheng Zhang, 2023. "Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix‐variate fMRI data," Biometrics, The International Biometric Society, vol. 79(3), pages 2246-2259, September.
    3. Avagyan, Vahe, 2016. "D-Trace precision matrix estimator with eigenvalue control," DES - Working Papers. Statistics and Econometrics. WS 23410, Universidad Carlos III de Madrid. Departamento de Estadística.
    4. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    5. Xiao Guo & Hai Zhang, 2020. "Sparse directed acyclic graphs incorporating the covariates," Statistical Papers, Springer, vol. 61(5), pages 2119-2148, October.
    6. Yang, Yuehan & Xia, Siwei & Yang, Hu, 2023. "Multivariate sparse Laplacian shrinkage for joint estimation of two graphical structures," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    7. Rieser, Christopher & Filzmoser, Peter, 2023. "Extending compositional data analysis from a graph signal processing perspective," Journal of Multivariate Analysis, Elsevier, vol. 198(C).
    8. Murat Genç, 2022. "A new double-regularized regression using Liu and lasso regularization," Computational Statistics, Springer, vol. 37(1), pages 159-227, March.
    9. Runmin Shi & Faming Liang & Qifan Song & Ye Luo & Malay Ghosh, 2018. "A Blockwise Consistency Method for Parameter Estimation of Complex Models," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 80(1), pages 179-223, December.
    10. Siwei Xia & Yuehan Yang & Hu Yang, 2022. "Sparse Laplacian Shrinkage with the Graphical Lasso Estimator for Regression Problems," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 255-277, March.
    11. Rong Zhang & Zhao Ren & Wei Chen, 2018. "SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-14, August.
    12. Pan, Yuqing & Mai, Qing, 2020. "Efficient computation for differential network analysis with applications to quadratic discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    13. Zhang Haixiang & Zheng Yinan & Yoon Grace & Zhang Zhou & Gao Tao & Joyce Brian & Zhang Wei & Schwartz Joel & Vokonas Pantel & Colicino Elena & Baccarelli Andrea & Hou Lifang & Liu Lei, 2017. "Regularized estimation in sparse high-dimensional multivariate regression, with application to a DNA methylation study," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(3), pages 159-171, August.
    14. Fan, Xinyan & Zhang, Qingzhao & Ma, Shuangge & Fang, Kuangnan, 2021. "Conditional score matching for high-dimensional partial graphical models," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    15. Ding, Wenliang & Shu, Lianjie & Gu, Xinhua, 2023. "A robust Glasso approach to portfolio selection in high dimensions," Journal of Empirical Finance, Elsevier, vol. 70(C), pages 22-37.
    16. Avagyan, Vahe & Nogales, Francisco J., 2014. "Improving the graphical lasso estimation for the precision matrix through roots of the sample covariance matrix," DES - Working Papers. Statistics and Econometrics. WS ws141208, Universidad Carlos III de Madrid. Departamento de Estadística.
    17. Liu, Weidong & Luo, Xi, 2015. "Fast and adaptive sparse precision matrix estimation in high dimensions," Journal of Multivariate Analysis, Elsevier, vol. 135(C), pages 153-162.
    18. Vahe Avagyan & Andrés M. Alonso & Francisco J. Nogales, 2018. "D-trace estimation of a precision matrix using adaptive Lasso penalties," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(2), pages 425-447, June.
    19. Laura Freijeiro‐González & Manuel Febrero‐Bande & Wenceslao González‐Manteiga, 2022. "A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates," International Statistical Review, International Statistical Institute, vol. 90(1), pages 118-145, April.
    20. Vahe Avagyan, 2022. "Precision matrix estimation using penalized Generalized Sylvester matrix equation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(4), pages 950-967, December.
