A pangenome reference of 36 Chinese populations

Author

Listed:
  • Yang Gao

    (Fudan University
    Chinese Academy of Sciences
    ShanghaiTech University)

  • Xiaofei Yang

    (Xi’an Jiaotong University
    Genome Institute, The First Affiliated Hospital of Xi’an Jiaotong University)

  • Hao Chen

    (Chinese Academy of Sciences)

  • Xinjiang Tan

    (Chinese Academy of Sciences)

  • Zhaoqing Yang

    (Chinese Academy of Medical Sciences)

  • Lian Deng

    (Fudan University)

  • Baonan Wang

    (Fudan University)

  • Shuang Kong

    (Fudan University)

  • Songyang Li

    (Fudan University)

  • Yuhang Cui

    (Fudan University)

  • Chang Lei

    (Fudan University)

  • Yimin Wang

    (Chinese Academy of Sciences)

  • Yuwen Pan

    (Chinese Academy of Sciences)

  • Sen Ma

    (Chinese Academy of Sciences)

  • Hao Sun

    (Chinese Academy of Medical Sciences)

  • Xiaohan Zhao

    (Fudan University)

  • Yingbing Shi

    (Fudan University)

  • Ziyi Yang

    (Fudan University)

  • Dongdong Wu

    (Chinese Academy of Sciences)

  • Shaoyuan Wu

    (Jiangsu Normal University)

  • Xingming Zhao

    (MOE Frontiers Center for Brain Science, Fudan University)

  • Binyin Shi

    (The First Affiliated Hospital of Xi’an Jiaotong University)

  • Li Jin

    (Fudan University)

  • Zhibin Hu

    (Nanjing Medical University)

  • Yan Lu

    (Fudan University)

  • Jiayou Chu

    (Chinese Academy of Medical Sciences)

  • Kai Ye

    (Xi’an Jiaotong University)

  • Shuhua Xu

    (Fudan University
    ShanghaiTech University
    Jiangsu Normal University)

Abstract

Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65× high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference (ref. 1). The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping.
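
For readers unfamiliar with the summary statistics quoted in the abstract, the following is a minimal Python sketch of how a contiguity N50 is commonly computed, together with the simple arithmetic behind the reported counts. It is not the consortium's pipeline. The function name `n50` and the toy contig lengths are illustrative assumptions; the figure of two haplotype assemblies per sample is inferred from "116 assemblies based on 58 core samples" rather than stated outright in the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not data from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)

# Counts quoted in the abstract.
core_samples = 58
haplotypes_per_sample = 2  # assumption: one assembly per haplotype of a diploid genome
assemblies = core_samples * haplotypes_per_sample

small_variants_total = 15.9e6
small_variants_novel = 5.9e6
structural_variants_total = 78_072
structural_variants_novel = 34_223

# Share of variants not reported in the earlier pangenome reference.
novel_small_fraction = small_variants_novel / small_variants_total
novel_sv_fraction = structural_variants_novel / structural_variants_total

print(f"toy N50: {toy_n50}")
print(f"assemblies: {assemblies}")
print(f"novel small variants: {novel_small_fraction:.1%}")
print(f"novel structural variants: {novel_sv_fraction:.1%}")
```

Under these assumptions, roughly 37% of the small variants and 44% of the structural variants reported in the abstract were absent from the earlier pangenome reference.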

Suggested Citation

  • Yang Gao & Xiaofei Yang & Hao Chen & Xinjiang Tan & Zhaoqing Yang & Lian Deng & Baonan Wang & Shuang Kong & Songyang Li & Yuhang Cui & Chang Lei & Yimin Wang & Yuwen Pan & Sen Ma & Hao Sun & Xiaohan Z, 2023. "A pangenome reference of 36 Chinese populations," Nature, Nature, vol. 619(7968), pages 112-121, July.
  • Handle: RePEc:nat:nature:v:619:y:2023:i:7968:d:10.1038_s41586-023-06173-7
    DOI: 10.1038/s41586-023-06173-7

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41586-023-06173-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
