
Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies

Author

Listed:
  • Miao-Xin Li
  • Johnny S H Kwan
  • Su-Ying Bao
  • Wanling Yang
  • Shu-Leong Ho
  • Yong-Qiang Song
  • Pak C Sham

Abstract

Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase prediction accuracy. Here, we propose using a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time, we assess the power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) to distinguish pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering in exome-sequencing studies. We found that a logit model combining all or some of the original prediction methods outperforms the other methods examined but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that around 5% of an individual's rare nsSNVs are pathogenic and that an individual carries at least ∼22 pathogenic derived alleles, which, if made homozygous by consanguineous marriages, may lead to recessive diseases.

Author Summary: Sequencing the coding regions of the human genome is becoming a standard approach for identifying causal genes for human Mendelian diseases. Researchers often rely on multiple functional prediction methods or tools to separate the candidate causal mutation(s) from other rare mutations in these studies. In this paper, we propose the use of a statistical model to combine prediction scores from multiple methods and to estimate the chance of a rare mutation being Mendelian disease-causing (or pathogenic). We found that our model, using all or some of the individual prediction methods, consistently outperforms the other prediction methods examined and could exclude more than 55% of rare non-pathogenic mutations in an individual genome. Unfortunately, no method was able to discriminate between autosomal dominant and autosomal recessive disease mutations. In addition, based on the predictions of our model, we estimated that a person can carry at least ∼22 pathogenic derived alleles, which, if present in two copies at the same position in the genome (that is, homozygous), may lead to Mendelian diseases.
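
The idea of combining several prediction scores through a logit model can be illustrated with a minimal sketch. This is not the authors' implementation: the two scores, the toy training data, the function names (`fit_logit`, `predict_proba`), and the learning-rate settings are all hypothetical, and the sketch makes no attempt at the adjustment needed to obtain an unbiased probability for rare variants in a real exome. It only shows how a logistic regression turns multiple per-variant scores into a single probability of pathogenicity.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(rows, labels, lr=0.5, epochs=2000):
    """Fit an intercept plus one weight per score by batch gradient descent on log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
            err = p - y  # derivative of the log-loss with respect to the linear term
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict_proba(weights, x):
    """Probability that a variant with score vector x is pathogenic under the fitted model."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


# Made-up toy data (assumption): each row holds two hypothetical prediction
# scores scaled to [0, 1], where higher means "more likely damaging".
TOY_SCORES = [
    (0.95, 0.90), (0.80, 0.99), (0.70, 0.85), (0.90, 0.60),  # labelled pathogenic
    (0.10, 0.20), (0.30, 0.05), (0.20, 0.40), (0.05, 0.15),  # labelled benign
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

WEIGHTS = fit_logit(TOY_SCORES, TOY_LABELS)
p_high = predict_proba(WEIGHTS, (0.9, 0.9))  # both scores say "damaging"
p_low = predict_proba(WEIGHTS, (0.1, 0.1))   # both scores say "tolerated"
print(f"P(pathogenic | high scores) = {p_high:.3f}")
print(f"P(pathogenic | low scores)  = {p_low:.3f}")
```

In practice one would fit such a model with an established statistics package on a curated set of known pathogenic and benign rare variants; the hand-written gradient descent here only keeps the example dependency-free.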

Suggested Citation

  • Miao-Xin Li & Johnny S H Kwan & Su-Ying Bao & Wanling Yang & Shu-Leong Ho & Yong-Qiang Song & Pak C Sham, 2013. "Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies," PLOS Genetics, Public Library of Science, vol. 9(1), pages 1-11, January.
  • Handle: RePEc:plo:pgen00:1003143
    DOI: 10.1371/journal.pgen.1003143

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1003143
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1003143&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1003143?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ni Huang & Insuk Lee & Edward M Marcotte & Matthew E Hurles, 2010. "Characterising and Predicting Haploinsufficiency in the Human Genome," PLOS Genetics, Public Library of Science, vol. 6(10), pages 1-11, October.
    2. Elaine T. Lim & Yingleong Chan & Pepper Dawes & Xiaoge Guo & Serkan Erdin & Derek J. C. Tai & Songlei Liu & Julia M. Reichert & Mannix J. Burns & Ying Kai Chan & Jessica J. Chiang & Katharina Meyer & , 2022. "Orgo-Seq integrates single-cell and bulk transcriptomic data to identify cell type specific-driver genes associated with autism spectrum disorder," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Zura Kakushadze & Willie Yu, 2017. "Mutation Clusters from Cancer Exome," Papers 1707.08504, arXiv.org.
    4. Jason Flannick & Joshua M Korn & Pierre Fontanillas & George B Grant & Eric Banks & Mark A Depristo & David Altshuler, 2012. "Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation," PLOS Computational Biology, Public Library of Science, vol. 8(7), pages 1-13, July.
    5. Thomas J Hoffmann & Bronya J Keats & Noriko Yoshikawa & Catherine Schaefer & Neil Risch & Lawrence R Lustig, 2016. "A Large Genome-Wide Association Study of Age-Related Hearing Impairment Using Electronic Health Records," PLOS Genetics, Public Library of Science, vol. 12(10), pages 1-20, October.
    6. Degui Zhi & Rui Chen, 2012. "Statistical Guidance for Experimental Design and Data Analysis of Mutation Detection in Rare Monogenic Mendelian Diseases by Exome Sequencing," PLOS ONE, Public Library of Science, vol. 7(2), pages 1-11, February.
    7. Kirsley Chennen & Thomas Weber & Xavière Lornage & Arnaud Kress & Johann Böhm & Julie Thompson & Jocelyn Laporte & Olivier Poch, 2020. "MISTIC: A prediction tool to reveal disease-relevant deleterious missense variants," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-23, July.
