
Towards the biogeography of prokaryotic genes

Author

Listed:
  • Luis Pedro Coelho

    (Fudan University
    MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science
    European Molecular Biology Laboratory)

  • Renato Alves

    (European Molecular Biology Laboratory)

  • Álvaro Rodríguez Río

    (Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC))

  • Pernille Neve Myers

    (Technical University of Denmark)

  • Carlos P. Cantalapiedra

    (Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC))

  • Joaquín Giner-Lamia

    (Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC)
    Universidad Politécnica de Madrid (UPM))

  • Thomas Sebastian Schmidt

    (European Molecular Biology Laboratory)

  • Daniel R. Mende

    (European Molecular Biology Laboratory
    University of Hawai‘i at Mānoa)

  • Askarbek Orakov

    (European Molecular Biology Laboratory)

  • Ivica Letunic

    (biobyte solutions GmbH)

  • Falk Hildebrand

    (European Molecular Biology Laboratory
    Norwich Research Park
    Quadram Institute, Norwich Research Park)

  • Thea Rossum

    (European Molecular Biology Laboratory)

  • Sofia K. Forslund

    (European Molecular Biology Laboratory
    a joint venture of the Max Delbrück Centre (MDC) and Charité University Hospital
    Berlin Initiative of Health)

  • Supriya Khedkar

    (European Molecular Biology Laboratory)

  • Oleksandr M. Maistrenko

    (European Molecular Biology Laboratory)

  • Shaojun Pan

    (Fudan University
    MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science)

  • Longhao Jia

    (Fudan University
    MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science)

  • Pamela Ferretti

    (European Molecular Biology Laboratory)

  • Shinichi Sunagawa

    (European Molecular Biology Laboratory
    Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich)

  • Xing-Ming Zhao

    (Fudan University
    MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science)

  • Henrik Bjørn Nielsen

    (Clinical Microbiomics A/S)

  • Jaime Huerta-Cepas

    (European Molecular Biology Laboratory
    Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC))

  • Peer Bork

    (European Molecular Biology Laboratory
    Max Delbrück Centre for Molecular Medicine
    Yonsei University
    University of Würzburg)

Abstract

Microbial genes encode the majority of the functional repertoire of life on Earth. However, despite increasing efforts in metagenomic sequencing of various habitats (refs. 1–3), little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and used it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contains the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.
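The concentration statistic quoted in the abstract (0.6% of families account for 50% of the genes) can be read as follows: rank the families from largest to smallest, then ask what fraction of families is needed before their cumulative gene count reaches half of all genes. The sketch below illustrates that arithmetic only. The function name `fraction_of_families_covering` and the toy family sizes are hypothetical, chosen for illustration; they are not taken from the article or its data.

```python
from typing import Sequence


def fraction_of_families_covering(sizes: Sequence[int], share: float = 0.5) -> float:
    """Return the smallest fraction of families (largest first) whose
    combined gene count reaches `share` of all genes.

    `sizes` holds the number of genes in each family (hypothetical input).
    """
    if not sizes:
        raise ValueError("sizes must be non-empty")
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")

    total = sum(sizes)
    running = 0
    # Walk families from largest to smallest, accumulating gene counts.
    for count, size in enumerate(sorted(sizes, reverse=True), start=1):
        running += size
        if running >= share * total:
            return count / len(sizes)
    return 1.0


# Toy data (made up): two large families and thirty singletons.
toy_sizes = [40, 30] + [1] * 30
print(fraction_of_families_covering(toy_sizes))
```

In the toy data, 2 of 32 families are needed to cover half of the 100 genes; the abstract reports a far stronger skew for the real catalogue.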

Suggested Citation

  • Luis Pedro Coelho & Renato Alves & Álvaro Rodríguez Río & Pernille Neve Myers & Carlos P. Cantalapiedra & Joaquín Giner-Lamia & Thomas Sebastian Schmidt & Daniel R. Mende & Askarbek Orakov & Ivica Let, 2022. "Towards the biogeography of prokaryotic genes," Nature, Nature, vol. 601(7892), pages 252-256, January.
  • Handle: RePEc:nat:nature:v:601:y:2022:i:7892:d:10.1038_s41586-021-04233-4
    DOI: 10.1038/s41586-021-04233-4

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41586-021-04233-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Citations

    Cited by:

    1. Mingyue Cheng & Shuai Luo & Peng Zhang & Guangzhou Xiong & Kai Chen & Chuanqi Jiang & Fangdian Yang & Hanhui Huang & Pengshuo Yang & Guanxi Liu & Yuhao Zhang & Sang Ba & Ping Yin & Jie Xiong & Wei Mia, 2024. "A genome and gene catalog of the aquatic microbiomes of the Tibetan Plateau," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    2. Xiyang Dong & Yongyi Peng & Muhua Wang & Laura Woods & Wenxue Wu & Yong Wang & Xi Xiao & Jiwei Li & Kuntong Jia & Chris Greening & Zongze Shao & Casey R. J. Hubert, 2023. "Evolutionary ecology of microbial populations inhabiting deep sea sediments associated with cold seeps," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    3. Patrick J. Dörner & Harithaa Anandakumar & Ivo Röwekamp & Facundo Fiocca Vernengo & Belén Millet Pascual-Leone & Marta Krzanowski & Josua Sellmaier & Ulrike Brüning & Raphaela Fritsche-Guenther & Lenn, 2024. "Clinically used broad-spectrum antibiotics compromise inflammatory monocyte-dependent antibacterial defense in the lung," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    4. Wenhui Li & Xianyue Jiang & Wuke Wang & Liya Hou & Runze Cai & Yongqian Li & Qiuxi Gu & Qinchang Chen & Peixiang Ma & Jin Tang & Menghao Guo & Guohui Chuai & Xingxu Huang & Jun Zhang & Qi Liu, 2024. "Discovering CRISPR-Cas system with self-processing pre-crRNA capability by foundation models," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    5. Xianzhe Gong & Álvaro Rodríguez Río & Le Xu & Zhiyi Chen & Marguerite V. Langwig & Lei Su & Mingxue Sun & Jaime Huerta-Cepas & Valerie Anda & Brett J. Baker, 2022. "New globally distributed bacterial phyla within the FCB superphylum," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    6. Yiqian Duan & Célio Dias Santos-Júnior & Thomas Sebastian Schmidt & Anthony Fullam & Breno L. S. Almeida & Chengkai Zhu & Michael Kuhn & Xing-Ming Zhao & Peer Bork & Luis Pedro Coelho, 2024. "A catalog of small proteins from the global microbiome," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    7. Ning Duan & Emily Hand & Mannuku Pheko & Shikha Sharma & Akintunde Emiola, 2024. "Structure-guided discovery of anti-CRISPR and anti-phage defense proteins," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    8. Shaojun Pan & Chengkai Zhu & Xing-Ming Zhao & Luis Pedro Coelho, 2022. "A deep siamese neural network improves metagenome-assembled genomes in microbiome datasets across different environments," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
