IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-34405-3.html

Faecal microbiome-based machine learning for multi-class disease diagnosis

Author

Listed:
  • Qi Su

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Qin Liu

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Raphaela Iris Lau

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Jingwan Zhang

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Zhilu Xu

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Yun Kit Yeoh

    (Microbiota I-Center (MagIC))

  • Thomas W. H. Leung

    (The Chinese University of Hong Kong)

  • Whitney Tang

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Lin Zhang

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Jessie Q. Y. Liang

    (The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Yuk Kam Yau

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Jiaying Zheng

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Chengyu Liu

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Mengjing Zhang

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Chun Pan Cheung

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Jessica Y. L. Ching

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Hein M. Tun

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong)

  • Jun Yu

    (The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Francis K. L. Chan

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

  • Siew C. Ng

    (Microbiota I-Center (MagIC)
    The Chinese University of Hong Kong
    Faculty of Medicine, The Chinese University of Hong Kong)

Abstract

Systemic characterisation of the human faecal microbiome provides the opportunity to develop non-invasive approaches in the diagnosis of a major human disease. However, shared microbial signatures across different diseases make accurate diagnosis challenging in single-disease models. Herein, we present a machine-learning multi-class model using a faecal metagenomic dataset of 2,320 individuals with nine well-characterised phenotypes, including colorectal cancer, colorectal adenomas, Crohn’s disease, ulcerative colitis, irritable bowel syndrome, obesity, cardiovascular disease, post-acute COVID-19 syndrome and healthy individuals. Our processed data covers 325 microbial species derived from 14.3 terabytes of sequence. The trained model achieves an area under the receiver operating characteristic curve (AUROC) of 0.90 to 0.99 (interquartile range, IQR, 0.91–0.94) in predicting different diseases in the independent test set, with a sensitivity of 0.81 to 0.95 (IQR, 0.87–0.93) at a specificity of 0.76 to 0.98 (IQR, 0.83–0.95). Metagenomic analysis of public datasets of 1,597 samples across different populations yields comparable predictions, with AUROC of 0.69 to 0.91 (IQR, 0.79–0.87). Correlation of the top 50 microbial species with disease phenotypes identifies 363 significant associations (FDR
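The abstract reports a separate AUROC for each phenotype, together with sensitivity at a given specificity. As an illustration only, the sketch below shows one common way such per-class figures can be computed from a multi-class model's scores: each class is evaluated one-vs-rest. This is an assumption about the evaluation scheme, not the authors' pipeline. The function names, the toy labels and the scores are all hypothetical and do not come from the study.

```python
from typing import Dict, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive sample scores higher
    than a random negative one (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def one_vs_rest_auroc(
    true_classes: Sequence[str], class_scores: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """One AUROC per class: that class is treated as positive, all others as negative."""
    return {
        cls: auroc([1 if t == cls else 0 for t in true_classes], scores)
        for cls, scores in class_scores.items()
    }


# Hypothetical toy data: six samples, three classes, made-up model scores.
toy_true = ["CRC", "CRC", "IBS", "IBS", "healthy", "healthy"]
toy_scores = {
    "CRC": [0.9, 0.6, 0.2, 0.7, 0.1, 0.3],
    "IBS": [0.05, 0.3, 0.7, 0.2, 0.1, 0.4],
    "healthy": [0.05, 0.1, 0.1, 0.1, 0.8, 0.3],
}

per_class = one_vs_rest_auroc(toy_true, toy_scores)
for cls, value in per_class.items():
    print(cls, value)
```

The pairwise comparison is quadratic in the number of samples, which is fine for a sketch. A rank-based computation would be the usual choice for cohorts of the size described in the abstract.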

Suggested Citation

  • Qi Su & Qin Liu & Raphaela Iris Lau & Jingwan Zhang & Zhilu Xu & Yun Kit Yeoh & Thomas W. H. Leung & Whitney Tang & Lin Zhang & Jessie Q. Y. Liang & Yuk Kam Yau & Jiaying Zheng & Chengyu Liu & Mengjin, 2022. "Faecal microbiome-based machine learning for multi-class disease diagnosis," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-34405-3
    DOI: 10.1038/s41467-022-34405-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-34405-3
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wanting Dong & Xinyue Fan & Yaqiong Guo & Siyi Wang & Shulei Jia & Na Lv & Tao Yuan & Yuanlong Pan & Yong Xue & Xi Chen & Qian Xiong & Ruifu Yang & Weigang Zhao & Baoli Zhu, 2024. "An expanded database and analytical toolkit for identifying bacterial virulence factors and their associations with chronic diseases," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Efrat Muller & Itamar Shiryan & Elhanan Borenstein, 2024. "Multi-omic integration of microbiome data for identifying disease-associated modules," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    3. Daniel Chang & Vinod K. Gupta & Benjamin Hur & Sergio Cobo-López & Kevin Y. Cunningham & Nam Soo Han & Insuk Lee & Vanessa L. Kronzer & Levi M. Teigen & Lioudmila V. Karnatovskaia & Erin E. Longbrake , 2024. "Gut Microbiome Wellness Index 2 enhances health status prediction from gut microbiome taxonomic profiles," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    4. Sean M Gibbons & Claire Duvallet & Eric J Alm, 2018. "Correcting for batch effects in case-control microbiome studies," PLOS Computational Biology, Public Library of Science, vol. 14(4), pages 1-17, April.
    5. Youwen Qin & Xin Tong & Wei-Jian Mei & Yanshuang Cheng & Yuanqiang Zou & Kai Han & Jiehai Yu & Zhuye Jie & Tao Zhang & Shida Zhu & Xin Jin & Jian Wang & Huanming Yang & Xun Xu & Huanzi Zhong & Liang X, 2024. "Consistent signatures in the human gut microbiome of old- and young-onset colorectal cancer," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    6. Brandon Hickman & Anne Salonen & Alise J. Ponsero & Roosa Jokela & Kaija-Leena Kolho & Willem M. Vos & Katri Korpela, 2024. "Gut microbiota wellbeing index predicts overall health in a cohort of 1000 infants," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    7. Alan Le Goallec & Braden T Tierney & Jacob M Luber & Evan M Cofer & Aleksandar D Kostic & Chirag J Patel, 2020. "A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type," PLOS Computational Biology, Public Library of Science, vol. 16(5), pages 1-21, May.
    8. Vanessa R. Marcelino & Caitlin Welsh & Christian Diener & Emily L. Gulliver & Emily L. Rutten & Remy B. Young & Edward M. Giles & Sean M. Gibbons & Chris Greening & Samuel C. Forster, 2023. "Disease-specific loss of microbial cross-feeding interactions in the human gut," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    9. Xia Qing & Thompson Jeffrey A. & Koestler Devin C., 2021. "Batch effect reduction of microarray data with dependent samples using an empirical Bayes approach (BRIDGE)," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 20(4-6), pages 101-119, December.
    10. Jaron Thompson & Renee Johansen & John Dunbar & Brian Munsky, 2019. "Machine learning to predict microbial community functions: An analysis of dissolved organic carbon from litter decomposition," PLOS ONE, Public Library of Science, vol. 14(7), pages 1-16, July.
    11. Ashwag Shami & Rewaa S. Jalal & Ruba A. Ashy & Haneen W. Abuauf & Lina Baz & Mohammed Y. Refai & Aminah A. Barqawi & Hanadi M. Baeissa & Manal A. Tashkandi & Sahar Alshareef & Aala A. Abulfaraj, 2022. "Use of Metagenomic Whole Genome Shotgun Sequencing Data in Taxonomic Assignment of Dipterygium glaucum Rhizosphere and Surrounding Bulk Soil Microbiomes, and Their Response to Watering," Sustainability, MDPI, vol. 14(14), pages 1-21, July.
    12. Hung-Chih Chen & Yen-Wen Liu & Kuan-Cheng Chang & Yen-Wen Wu & Yi-Ming Chen & Yu-Kai Chao & Min-Yi You & David J. Lundy & Chen-Ju Lin & Marvin L. Hsieh & Yu-Che Cheng & Ray P. Prajnamitra & Po-Ju Lin , 2023. "Gut butyrate-producers confer post-infarction cardiac protection," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    13. Braden T Tierney & Yingxuan Tan & Zhen Yang & Bing Shui & Michaela J Walker & Benjamin M Kent & Aleksandar D Kostic & Chirag J Patel, 2022. "Systematically assessing microbiome–disease associations identifies drivers of inconsistency in metagenomic research," PLOS Biology, Public Library of Science, vol. 20(3), pages 1-18, March.
    14. Rhee, Chaeyoung & Park, Sung-Gwan & Yu, Sung Il & Dalantai, Tergel & Shin, Juhee & Chae, Kyu-Jung & Shin, Seung Gu, 2023. "Mapping microbial dynamics in anaerobic digestion system linked with organic composition of substrates: Protein and lipid," Energy, Elsevier, vol. 275(C).
    15. Georges P. Schmartz & Jacqueline Rehner & Miriam J. Schuff & Leidy-Alejandra G. Molano & Sören L. Becker & Marcin Krawczyk & Azat Tagirdzhanov & Alexey Gurevich & Richard Francke & Rolf Müller & Veren, 2024. "Exploring microbial diversity and biosynthetic potential in zoo and wildlife animal microbiomes," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    16. J. Casper Swarte & Tim J. Knobbe & Johannes R. Björk & Ranko Gacesa & Lianne M. Nieuwenhuis & Shuyan Zhang & Arnau Vich Vila & Daan Kremer & Rianne M. Douwes & Adrian Post & Evelien E. Quint & Robert , 2023. "Health-related quality of life is linked to the gut microbiome in kidney transplant recipients," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    17. Yadid M. Algavi & Elhanan Borenstein, 2023. "A data-driven approach for predicting the impact of drugs on the human microbiome," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    18. Antonino Malacrinò & Victoria A Sadowski & Tvisha K Martin & Nathalia Cavichiolli de Oliveira & Ian J Brackett & James D Feller & Kristian J Harris & Orlando Combita Heredia & Rosa Vescio & Alison E B, 2020. "Biological invasions alter environmental microbiomes: A meta-analysis," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-12, October.
    19. Runtan Cheng & Lu Wang & Shenglong Le & Yifan Yang & Can Zhao & Xiangqi Zhang & Xin Yang & Ting Xu & Leiting Xu & Petri Wiklund & Jun Ge & Dajiang Lu & Chenhong Zhang & Luonan Chen & Sulin Cheng, 2022. "A randomized controlled trial for response of microbiome network to exercise and diet intervention in patients with nonalcoholic fatty liver disease," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    20. Oliver Aasmets & Kertu Liis Krigul & Kreete Lüll & Andres Metspalu & Elin Org, 2022. "Gut metagenome associations with extensive digital health data in a volunteer-based Estonian microbiome cohort," Nature Communications, Nature, vol. 13(1), pages 1-11, December.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.