
Unified inference of missense variant effects and gene constraints in the human genome

Author

Listed:
  • Yi-Fei Huang

Abstract

A challenge in medical genomics is to identify variants and genes associated with severe genetic disorders. Based on the premise that severe, early-onset disorders often result in a reduction of evolutionary fitness, several statistical methods have been developed to predict pathogenic variants or constrained genes based on the signatures of negative selection in human populations. However, we currently lack a statistical framework to jointly predict deleterious variants and constrained genes from both variant-level features and gene-level selective constraints. Here we present such a unified approach, UNEECON, based on deep learning and population genetics. UNEECON treats the contributions of variant-level features and gene-level constraints as a variant-level fixed effect and a gene-level random effect, respectively. The sum of the fixed and random effects is then combined with an evolutionary model to infer the strength of negative selection at both variant and gene levels. Compared with previously published methods, UNEECON shows improved performance in predicting missense variants and protein-coding genes associated with autosomal dominant disorders, and feature importance analysis suggests that both gene-level selective constraints and variant-level predictors are important for accurate variant prioritization. Furthermore, based on UNEECON, we observe a low correlation between gene-level intolerance to missense mutations and that to loss-of-function mutations, which can be partially explained by the prevalence of disordered protein regions that are highly tolerant to missense mutations. Finally, we show that genes intolerant to both missense and loss-of-function mutations play key roles in the central nervous system and the autism spectrum disorders. 
Overall, UNEECON is a promising framework for both variant and gene prioritization.

Author summary: Numerous statistical methods have been developed to predict deleterious missense variants or constrained genes in the human genome, but unified prioritization methods that utilize both variant- and gene-level information are underdeveloped. Here we present UNEECON, an evolution-based deep learning framework for unified variant and gene prioritization. By integrating variant-level predictors and gene-level selective constraints, UNEECON outperforms existing methods in predicting missense variants and protein-coding genes associated with dominant disorders. Based on UNEECON, we show that disordered proteins are tolerant to missense mutations but not to loss-of-function mutations. In addition, we find that genes under strong selective constraints at both missense and loss-of-function levels are strongly associated with the central nervous system and the autism spectrum disorders, highlighting the need to investigate the function of these highly constrained genes in future studies.
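
The abstract describes adding a variant-level fixed effect to a gene-level random effect and passing the sum through an evolutionary model. The following is a minimal toy sketch of that idea, not the published UNEECON implementation. Everything in it is an assumption made for illustration: the linear predictor standing in for the deep learning component, the logistic link, the Gaussian gene effects, the `neutral_prob` term standing in for the evolutionary model, and all function and parameter names.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fixed_effect(features, weights, bias):
    """Variant-level fixed effect.

    Assumption: a plain linear predictor stands in for the neural
    network mentioned in the abstract.
    """
    return bias + sum(w * x for w, x in zip(weights, features))


def simulate_gene_effects(gene_ids, sigma, seed=0):
    """Draw one hypothetical random effect per gene from N(0, sigma^2)."""
    rng = random.Random(seed)
    return {gene: rng.gauss(0.0, sigma) for gene in gene_ids}


def occurrence_probability(features, weights, bias, gene_effect, neutral_prob):
    """Probability that a variant is observed in a population sample.

    The summed effects are squashed into a relative rate in (0, 1) and
    scaled by the probability expected under neutrality (assumed form).
    """
    z = fixed_effect(features, weights, bias) + gene_effect
    return sigmoid(z) * neutral_prob


def constraint_score(features, weights, bias, gene_effect):
    """Illustrative score: larger values mean stronger inferred selection."""
    z = fixed_effect(features, weights, bias) + gene_effect
    return 1.0 - sigmoid(z)


if __name__ == "__main__":
    # Hypothetical gene labels and feature values, for illustration only.
    gene_effects = simulate_gene_effects(["GENE_A", "GENE_B"], sigma=1.0)
    weights, bias = [-0.8, 0.3], 0.5
    variant_features = [1.2, 0.4]
    for gene, effect in gene_effects.items():
        score = constraint_score(variant_features, weights, bias, effect)
        print(gene, round(score, 3))
```

In this sketch the same variant features yield different scores in different genes because the gene-level term shifts the summed effect, which mirrors the abstract's point that both variant-level predictors and gene-level constraints contribute to prioritization.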

Suggested Citation

  • Yi-Fei Huang, 2020. "Unified inference of missense variant effects and gene constraints in the human genome," PLOS Genetics, Public Library of Science, vol. 16(7), pages 1-24, July.
  • Handle: RePEc:plo:pgen00:1008922
    DOI: 10.1371/journal.pgen.1008922

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1008922
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1008922&type=printable
    Download Restriction: no



