
Exploring high-quality microbial genomes by assembling short-reads with long-range connectivity

Author

Listed:
  • Zhenmiao Zhang

    (Hong Kong Baptist University)

  • Jin Xiao

    (Hong Kong Baptist University)

  • Hongbo Wang

    (Hong Kong Baptist University)

  • Chao Yang

    (Hong Kong Baptist University)

  • Yufen Huang

    (BGI Research)

  • Zhen Yue

    (BGI Research)

  • Yang Chen

    (The Second Affiliated Hospital of Guangzhou University of Chinese Medicine)

  • Lijuan Han

    (Kangmeihuada GeneTech Co., Ltd (KMHD))

  • Kejing Yin

    (Hong Kong Baptist University)

  • Aiping Lyu

    (Hong Kong Baptist University)

  • Xiaodong Fang

    (BGI Research
    Kangmeihuada GeneTech Co., Ltd (KMHD))

  • Lu Zhang

    (Hong Kong Baptist University)

Abstract

Although long-read sequencing enables the generation of complete genomes for unculturable microbes, its high cost limits its widespread adoption in large-scale metagenomic studies. A cost-effective alternative for generating high-quality microbial genomes is to assemble short-reads with long-range connectivity. Here, we develop Pangaea, a bioinformatic approach designed to enhance metagenome assembly using short-reads with long-range connectivity. Pangaea leverages connectivity derived either from the physical barcodes of linked-reads or from virtual barcodes obtained by aligning short-reads to long-reads. Pangaea uses a deep learning-based read binning algorithm to assemble co-barcoded reads that exhibit similar sequence contexts and abundances, thereby improving the assembly of high- and medium-abundance microbial genomes. Pangaea also applies a multi-thresholding strategy to refine the assembly of low-abundance microbes. We benchmark Pangaea on linked-reads and on a combination of short- and long-reads from simulation data, mock communities and human gut metagenomes. Pangaea achieves significantly higher contig continuity and more near-complete metagenome-assembled genomes (NCMAGs) than existing assemblers. Pangaea also generates three complete and circular NCMAGs from the human gut microbiomes.

Suggested Citation

  • Zhenmiao Zhang & Jin Xiao & Hongbo Wang & Chao Yang & Yufen Huang & Zhen Yue & Yang Chen & Lijuan Han & Kejing Yin & Aiping Lyu & Xiaodong Fang & Lu Zhang, 2024. "Exploring high-quality microbial genomes by assembling short-reads with long-range connectivity," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-49060-z
    DOI: 10.1038/s41467-024-49060-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-49060-z
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Li Zhang & Karen R. Jonscher & Zuyuan Zhang & Yi Xiong & Ryan S. Mueller & Jacob E. Friedman & Chongle Pan, 2022. "Islet autoantibody seroconversion in type-1 diabetes is associated with metagenome-assembled genomes in infant gut microbiomes," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    2. Jae-Chang Cho, 2021. "Human microbiome privacy risks associated with summary statistics," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-11, April.
    3. Ying-Li Zhou & Paraskevi Mara & Guo-Jie Cui & Virginia P. Edgcomb & Yong Wang, 2022. "Microbiomes in the Challenger Deep slope and bottom-axis sediments," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    4. Candice R. Gurbatri & Georgette A. Radford & Laura Vrbanac & Jongwon Im & Elaine M. Thomas & Courtney Coker & Samuel R. Taylor & YoungUk Jang & Ayelet Sivan & Kyu Rhee & Anas A. Saleh & Tiffany Chien, 2024. "Engineering tumor-colonizing E. coli Nissle 1917 for detection and treatment of colorectal neoplasia," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Bin Ma & Caiyu Lu & Yiling Wang & Jingwen Yu & Kankan Zhao & Ran Xue & Hao Ren & Xiaofei Lv & Ronghui Pan & Jiabao Zhang & Yongguan Zhu & Jianming Xu, 2023. "A genomic catalogue of soil microbiomes boosts mining of biodiversity and genetic resources," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    6. Fiona B. Tamburini & Dylan Maghini & Ovokeraye H. Oduaran & Ryan Brewster & Michaella R. Hulley & Venesa Sahibdeen & Shane A. Norris & Stephen Tollman & Kathleen Kahn & Ryan G. Wagner & Alisha N. Wade, 2022. "Short- and long-read metagenomics of urban and rural South African gut microbiomes reveal a transitional composition and undescribed taxa," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    7. Sigal Leviatan & Saar Shoer & Daphna Rothschild & Maria Gorodetski & Eran Segal, 2022. "An expanded reference map of the human gut microbiome reveals hundreds of previously unknown species," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    8. Chan Yeong Kim & Junyeong Ma & Insuk Lee, 2022. "HiFi metagenomic sequencing enables assembly of accurate and complete genomes from human gut microbiota," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    9. Jing Guo & Luyao Gong & Haiying Yu & Ming Li & Qiaohui An & Zhenquan Liu & Shuru Fan & Changjialian Yang & Dahe Zhao & Jing Han & Hua Xiang, 2024. "Engineered minimal type I CRISPR-Cas system for transcriptional activation and base editing in human cells," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    10. Shaojun Pan & Chengkai Zhu & Xing-Ming Zhao & Luis Pedro Coelho, 2022. "A deep siamese neural network improves metagenome-assembled genomes in microbiome datasets across different environments," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    11. Chiranjib Chakraborty & Ashish Ranjan Sharma & Garima Sharma & Manojit Bhattacharya & Sang-Soo Lee, 2023. "Exploring the status of global terrestrial and aquatic microbial diversity through ‘Biodiversity Informatics’," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 25(10), pages 10567-10598, October.
    12. Can Chen & Chen Liao & Yang-Yu Liu, 2023. "Teasing out missing reactions in genome-scale metabolic networks through hypergraph learning," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    13. Eleonora Pedrazzoli & Michele Demozzi & Elisabetta Visentin & Matteo Ciciani & Ilaria Bonuzzi & Laura Pezzè & Lorenzo Lucchetta & Giulia Maule & Simone Amistadi & Federica Esposito & Mariangela Lupo &, 2024. "CoCas9 is a compact nuclease from the human microbiome for efficient and precise genome editing," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    14. Shuqin Zeng & Dhrati Patangia & Alexandre Almeida & Zhemin Zhou & Dezhi Mu & R. Paul Ross & Catherine Stanton & Shaopu Wang, 2022. "A compendium of 32,277 metagenome-assembled genomes and over 80 million genes from the early-life human gut microbiome," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    15. Mingyue Cheng & Shuai Luo & Peng Zhang & Guangzhou Xiong & Kai Chen & Chuanqi Jiang & Fangdian Yang & Hanhui Huang & Pengshuo Yang & Guanxi Liu & Yuhao Zhang & Sang Ba & Ping Yin & Jie Xiong & Wei Mia, 2024. "A genome and gene catalog of the aquatic microbiomes of the Tibetan Plateau," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
