
MPLasso: Inferring microbial association networks using prior microbial knowledge

Author

Listed:
  • Chieh Lo
  • Radu Marculescu

Abstract

Recent advances in high-throughput sequencing technologies have made it possible to directly analyze microbial communities in the human body and the environment. To understand how microbial communities adapt, develop, and interact with the human body and the surrounding environment, one of the fundamental challenges is to infer the interactions among different microbes. However, because microbial data are compositional and high-dimensional, standard statistical inference cannot offer reliable results. Consequently, new approaches that can accurately and robustly estimate the associations (putative interactions) among microbes are needed to analyze such data. We propose a novel framework called Microbial Prior Lasso (MPLasso), which integrates a graph learning algorithm with microbial co-occurrences and associations obtained from the scientific literature by automated text mining. We show that MPLasso outperforms existing models in terms of accuracy, microbial network recovery rate, and reproducibility. Furthermore, the association networks we obtain from the Human Microbiome Project datasets show credible results when compared against laboratory data.

Author summary: Microbial communities exhibit rich dynamics, including the way they adapt, develop, and interact with the human body and the surrounding environment. The associations among microbes can provide a solid foundation for modeling the interplay between the (host) human body and the microbial populations. However, because microbial data are compositional and high-dimensional, standard statistical methods are likely to produce spurious results. Although several existing methods can estimate the associations among microbes under the sparsity assumption, they still have major difficulty inferring these associations from such high-dimensional data.

To improve the accuracy of inferred microbial associations, we propose to integrate multiple levels of biological information by mining co-occurrence patterns and interactions directly from a large body of scientific literature. We first show that our proposed method outperforms existing methods in synthetic experiments. Next, we obtain credible inference results from the Human Microbiome Project datasets when compared against laboratory data. A more accurate microbial association network lets scientists in this field focus their experimental verification on the most plausible associations rather than exhaustively searching all possible pairs.
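
The abstract does not spell out the MPLasso objective, so the following is only a minimal sketch of the general idea, assuming a graphical lasso in which the L1 penalty on each edge is lowered when prior knowledge supports that pair of microbes. The ADMM solver, the function names, and the values of `lam` and `prior_scale` are illustrative assumptions, not the authors' implementation, and Gaussian toy data stand in for suitably transformed abundance data.

```python
import numpy as np

def soft_threshold(a, k):
    """Elementwise soft-thresholding of a by the (matrix) threshold k."""
    return np.sign(a) * np.maximum(np.abs(a) - k, 0.0)

def penalty_from_prior(p, prior_edges, lam=0.3, prior_scale=0.1):
    """Build a p x p penalty matrix (hypothetical weighting scheme).

    Pairs listed in prior_edges get the smaller penalty lam * prior_scale,
    all other off-diagonal pairs get lam, and the diagonal is unpenalized.
    """
    penalty = np.full((p, p), lam)
    for i, j in prior_edges:
        penalty[i, j] = penalty[j, i] = lam * prior_scale
    np.fill_diagonal(penalty, 0.0)
    return penalty

def weighted_graphical_lasso(S, penalty, rho=1.0, n_iter=200):
    """Estimate a sparse precision matrix with per-edge L1 penalties via ADMM.

    Minimizes -logdet(Theta) + tr(S Theta) + sum_ij penalty_ij * |Theta_ij|.
    """
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    for _ in range(n_iter):
        # Theta update: closed form through an eigendecomposition.
        eigvals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigvals + np.sqrt(eigvals**2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T
        # Z update: soft-threshold each entry by its own penalty.
        Z = soft_threshold(Theta + U, penalty / rho)
        # Dual update.
        U = U + Theta - Z
    return Z

# Toy usage: 5 hypothetical taxa, with prior support for the pair (0, 1).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
S = np.cov(X, rowvar=False)
penalty = penalty_from_prior(5, prior_edges=[(0, 1)])
theta_hat = weighted_graphical_lasso(S, penalty)
```

Nonzero off-diagonal entries of `theta_hat` would be read as putative associations. Lowering the penalty on literature-supported pairs makes those edges easier to retain, while unsupported pairs still need strong evidence in the data to survive.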

Suggested Citation

  • Chieh Lo & Radu Marculescu, 2017. "MPLasso: Inferring microbial association networks using prior microbial knowledge," PLOS Computational Biology, Public Library of Science, vol. 13(12), pages 1-20, December.
  • Handle: RePEc:plo:pcbi00:1005915
    DOI: 10.1371/journal.pcbi.1005915

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005915
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005915&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1005915?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Ming Yuan & Yi Lin, 2007. "Model selection and estimation in the Gaussian graphical model," Biometrika, Biometrika Trust, vol. 94(1), pages 19-35.
    2. Paul J McMurdie & Susan Holmes, 2014. "Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible," PLOS Computational Biology, Public Library of Science, vol. 10(4), pages 1-12, April.
    3. Zachary D Kurtz & Christian L Müller & Emily R Miraldi & Dan R Littman & Martin J Blaser & Richard A Bonneau, 2015. "Sparse and Compositionally Robust Inference of Microbial Ecological Networks," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-25, May.

    Citations

    Cited by:

    1. Sean M Devlin & Axel Martin & Irina Ostrovnaya, 2021. "Identifying prognostic pairwise relationships among bacterial species in microbiome studies," PLOS Computational Biology, Public Library of Science, vol. 17(11), pages 1-12, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Duo Jiang & Thomas Sharpton & Yuan Jiang, 2021. "Microbial Interaction Network Estimation via Bias-Corrected Graphical Lasso," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(2), pages 329-350, July.
    2. Pratheepa Jeganathan & Susan P. Holmes, 2021. "A Statistical Perspective on the Challenges in Molecular Microbial Biology," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 26(2), pages 131-160, June.
    3. Rieser, Christopher & Filzmoser, Peter, 2023. "Extending compositional data analysis from a graph signal processing perspective," Journal of Multivariate Analysis, Elsevier, vol. 198(C).
    4. Ines Wilms & Jacob Bien, 2021. "Tree-based Node Aggregation in Sparse Graphical Models," Papers 2101.12503, arXiv.org.
    5. Zachary D Kurtz & Christian L Müller & Emily R Miraldi & Dan R Littman & Martin J Blaser & Richard A Bonneau, 2015. "Sparse and Compositionally Robust Inference of Microbial Ecological Networks," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-25, May.
    6. Zhigang Li & Katherine Lee & Margaret R. Karagas & Juliette C. Madan & Anne G. Hoen & A. James O’Malley & Hongzhe Li, 2018. "Conditional Regression Based on a Multivariate Zero-Inflated Logistic-Normal Model for Microbiome Relative Abundance Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 10(3), pages 587-608, December.
    7. McGillivray, Annaliza & Khalili, Abbas & Stephens, David A., 2020. "Estimating sparse networks with hubs," Journal of Multivariate Analysis, Elsevier, vol. 179(C).
    8. Aaron C Ericsson & J Wade Davis & William Spollen & Nathan Bivens & Scott Givan & Catherine E Hagan & Mark McIntosh & Craig L Franklin, 2015. "Effects of Vendor and Genetic Background on the Composition of the Fecal Microbiota of Inbred Mice," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-19, February.
    9. Avagyan, Vahe & Nogales, Francisco J., 2015. "D-trace Precision Matrix Estimation Using Adaptive Lasso Penalties," DES - Working Papers. Statistics and Econometrics. WS 21775, Universidad Carlos III de Madrid. Departamento de Estadística.
    10. Byrd, Michael & Nghiem, Linh H. & McGee, Monnie, 2021. "Bayesian regularization of Gaussian graphical models with measurement error," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    11. Jie Jian & Peijun Sang & Mu Zhu, 2024. "Two Gaussian Regularization Methods for Time-Varying Networks," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 29(4), pages 853-873, December.
    12. Lam, Clifford, 2008. "Estimation of large precision matrices through block penalization," LSE Research Online Documents on Economics 31543, London School of Economics and Political Science, LSE Library.
    13. Giraud Christophe & Huet Sylvie & Verzelen Nicolas, 2012. "Graph Selection with GGMselect," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-52, February.
    14. Seunghwan Lee & Sang Cheol Kim & Donghyeon Yu, 2023. "An efficient GPU-parallel coordinate descent algorithm for sparse precision matrix estimation via scaled lasso," Computational Statistics, Springer, vol. 38(1), pages 217-242, March.
    15. Shilan Li & Jianxin Shi & Paul Albert & Hong-Bin Fang, 2022. "Dependence Structure Analysis and Its Application in Human Microbiome," Mathematics, MDPI, vol. 11(1), pages 1-14, December.
    16. Benjamin Poignard & Manabu Asai, 2023. "Estimation of high-dimensional vector autoregression via sparse precision matrix," The Econometrics Journal, Royal Economic Society, vol. 26(2), pages 307-326.
    17. Dong Liu & Changwei Zhao & Yong He & Lei Liu & Ying Guo & Xinsheng Zhang, 2023. "Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix‐variate fMRI data," Biometrics, The International Biometric Society, vol. 79(3), pages 2246-2259, September.
    18. Mehran Aflakparast & Mathisca de Gunst & Wessel van Wieringen, 2020. "Analysis of Twitter data with the Bayesian fused graphical lasso," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-28, July.
    19. M. McCauley & T. L. Goulet & C. R. Jackson & S. Loesgen, 2023. "Systematic review of cnidarian microbiomes reveals insights into the structure, specificity, and fidelity of marine associations," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    20. Huangdi Yi & Qingzhao Zhang & Cunjie Lin & Shuangge Ma, 2022. "Information‐incorporated Gaussian graphical model for gene expression data," Biometrics, The International Biometric Society, vol. 78(2), pages 512-523, June.
