
Unified inference of missense variant effects and gene constraints in the human genome

Author

Listed:
  • Yi-Fei Huang

Abstract

A challenge in medical genomics is to identify variants and genes associated with severe genetic disorders. Based on the premise that severe, early-onset disorders often result in a reduction of evolutionary fitness, several statistical methods have been developed to predict pathogenic variants or constrained genes based on the signatures of negative selection in human populations. However, we currently lack a statistical framework to jointly predict deleterious variants and constrained genes from both variant-level features and gene-level selective constraints. Here we present such a unified approach, UNEECON, based on deep learning and population genetics. UNEECON treats the contributions of variant-level features and gene-level constraints as a variant-level fixed effect and a gene-level random effect, respectively. The sum of the fixed and random effects is then combined with an evolutionary model to infer the strength of negative selection at both variant and gene levels. Compared with previously published methods, UNEECON shows improved performance in predicting missense variants and protein-coding genes associated with autosomal dominant disorders, and feature importance analysis suggests that both gene-level selective constraints and variant-level predictors are important for accurate variant prioritization. Furthermore, based on UNEECON, we observe a low correlation between gene-level intolerance to missense mutations and that to loss-of-function mutations, which can be partially explained by the prevalence of disordered protein regions that are highly tolerant to missense mutations. Finally, we show that genes intolerant to both missense and loss-of-function mutations play key roles in the central nervous system and the autism spectrum disorders. 
Overall, UNEECON is a promising framework for both variant and gene prioritization.

Author summary: Numerous statistical methods have been developed to predict deleterious missense variants or constrained genes in the human genome, but unified prioritization methods that utilize both variant- and gene-level information are underdeveloped. Here we present UNEECON, an evolution-based deep learning framework for unified variant and gene prioritization. By integrating variant-level predictors and gene-level selective constraints, UNEECON outperforms existing methods in predicting missense variants and protein-coding genes associated with dominant disorders. Based on UNEECON, we show that disordered proteins are tolerant to missense mutations but not to loss-of-function mutations. In addition, we find that genes under strong selective constraints at both missense and loss-of-function levels are strongly associated with the central nervous system and the autism spectrum disorders, highlighting the need to investigate the function of these highly constrained genes in future studies.
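Illustrative sketch (not part of the original record): the abstract describes a variant-level fixed effect and a gene-level random effect whose sum feeds an evolutionary model. The toy Python below shows only the additive structure. Everything in it is an assumption made for illustration: the linear fixed effect stands in for the deep-learning component, the logistic link stands in for the evolutionary model, and the weights, feature values, and gene names are made up. It is not the UNEECON implementation.

```python
import math


def fixed_effect(features, weights, bias=0.0):
    """Variant-level term: a linear score over variant features.

    Hypothetical stand-in for the deep-learning component in the abstract.
    """
    return bias + sum(w * x for w, x in zip(weights, features))


def constraint_score(features, weights, gene_effect):
    """Add the gene-level random effect to the variant-level fixed effect,
    then squash the sum to (0, 1).

    The logistic link is an assumption standing in for the evolutionary model.
    """
    z = fixed_effect(features, weights) + gene_effect
    return 1.0 / (1.0 + math.exp(-z))


# Made-up weights and per-gene effects, for illustration only.
weights = [1.5, -0.5]
gene_effects = {"GENE_A": 1.0, "GENE_B": -1.0}

# The same variant-level features scored in two hypothetical genes.
variant = [0.8, 0.2]
score_a = constraint_score(variant, weights, gene_effects["GENE_A"])
score_b = constraint_score(variant, weights, gene_effects["GENE_B"])
```

In this sketch, identical variant features receive a higher score in the gene with the larger gene-level effect, which is the sense in which gene-level constraint shifts every variant in that gene.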

Suggested Citation

  • Yi-Fei Huang, 2020. "Unified inference of missense variant effects and gene constraints in the human genome," PLOS Genetics, Public Library of Science, vol. 16(7), pages 1-24, July.
  • Handle: RePEc:plo:pgen00:1008922
    DOI: 10.1371/journal.pgen.1008922

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1008922
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1008922&type=printable
    Download Restriction: no



