
Diverse ancestral representation improves genetic intolerance metrics

Author

Listed:
  • Alexander L. Han

    (Baylor College of Medicine
    Texas Children’s Hospital)

  • Chloe F. Sands

    (Baylor College of Medicine
    Texas Children’s Hospital)

  • Dorota Matelska

    (AstraZeneca)

  • Jessica C. Butts

    (Rice University)

  • Vida Ravanmehr

    (Baylor College of Medicine
    Texas Children’s Hospital)

  • Fengyuan Hu

    (AstraZeneca)

  • Esmeralda Villavicencio Gonzalez

    (Texas Children’s Hospital
    Baylor College of Medicine)

  • Nicholas Katsanis

    (Galatea Bio, Inc)

  • Carlos D. Bustamante

    (Galatea Bio, Inc)

  • Quanli Wang

    (AstraZeneca)

  • Slavé Petrovski

    (AstraZeneca
    University of Melbourne)

  • Dimitrios Vitsios

    (AstraZeneca)

  • Ryan S. Dhindsa

    (Baylor College of Medicine
    Texas Children’s Hospital)

Abstract

The unprecedented scale of genomic databases has revolutionized our ability to identify regions in the human genome intolerant to variation, regions that are often implicated in disease. However, these datasets remain constrained by limited ancestral diversity. Here, we analyze whole-exome sequencing data from 460,551 UK Biobank and 125,748 Genome Aggregation Database (gnomAD) participants across multiple ancestries to test several key intolerance metrics, including the Residual Variance Intolerance Score (RVIS), Missense Tolerance Ratio (MTR), and Loss-of-Function Observed/Expected ratio (LOF O/E). We demonstrate that increasing ancestral representation, rather than sample size alone, critically drives their performance. Scores trained on variation observed in African and Admixed American ancestral groups show higher resolution in detecting haploinsufficient and neurodevelopmental disease risk genes compared to scores trained on European ancestry groups. Most strikingly, MTR trained on 43,000 multi-ancestry exomes demonstrates greater predictive power than when trained on a nearly 10-fold larger dataset of 440,000 non-Finnish European exomes. We further find that European ancestry group-based scores are likely approaching saturation. These findings highlight the need for enhanced population representation in genomic resources to fully realize the potential of precision medicine and drug discovery. Ancestry group-specific scores are publicly available through an interactive portal: http://intolerance.public.cgr.astrazeneca.com/
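
For readers unfamiliar with the ratio-style metrics named in the abstract, the following is a minimal sketch of how an observed/expected ratio and a missense tolerance ratio are commonly defined in the literature. It is an illustration only: the function names, parameter names, and example counts are hypothetical, and the article's own pipeline (variant filtering, mutational models, windowing, ancestry stratification) is not described on this page and may differ.

```python
def observed_expected_ratio(observed: int, expected: float) -> float:
    """Observed variant count divided by the count expected under a
    neutral mutational model. Lower values suggest stronger intolerance.
    The expected count is taken as an input here (assumption): how it is
    modelled is outside the scope of this sketch."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def missense_tolerance_ratio(
    obs_missense: int,
    obs_synonymous: int,
    possible_missense: int,
    possible_synonymous: int,
) -> float:
    """Commonly cited MTR form: the observed fraction of missense variants
    among missense + synonymous variants, divided by the same fraction
    over all possible single-nucleotide changes in the region."""
    observed_total = obs_missense + obs_synonymous
    possible_total = possible_missense + possible_synonymous
    if observed_total == 0 or possible_total == 0 or possible_missense == 0:
        raise ValueError("counts are too sparse to form the ratio")
    observed_fraction = obs_missense / observed_total
    expected_fraction = possible_missense / possible_total
    return observed_fraction / expected_fraction


# Hypothetical counts for a single region, chosen only for illustration.
lof_oe = observed_expected_ratio(observed=3, expected=12.0)
mtr = missense_tolerance_ratio(20, 30, 700, 300)
print(f"LOF O/E = {lof_oe:.2f}, MTR = {mtr:.2f}")
```

In both cases a value well below 1 indicates fewer variants of the given class than expected, which is the signal these metrics use to flag intolerant genes or regions. Under this framing, a cohort's ancestral composition changes the observed counts that feed each ratio.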

Suggested Citation

  • Alexander L. Han & Chloe F. Sands & Dorota Matelska & Jessica C. Butts & Vida Ravanmehr & Fengyuan Hu & Esmeralda Villavicencio Gonzalez & Nicholas Katsanis & Carlos D. Bustamante & Quanli Wang & Slavé Petrovski & Dimitrios Vitsios & Ryan S. Dhindsa, 2025. "Diverse ancestral representation improves genetic intolerance metrics," Nature Communications, Nature, vol. 16(1), pages 1-9, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-57885-5
    DOI: 10.1038/s41467-025-57885-5

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-57885-5
    File Function: Abstract
    Download Restriction: no

