
SmartPhase: Accurate and fast phasing of heterozygous variant pairs for genetic diagnosis of rare diseases

Authors
  • Paul Hager
  • Hans-Werner Mewes
  • Meino Rohlfs
  • Christoph Klein
  • Tim Jeske

Abstract

There is an increasing need to use genome and transcriptome sequencing to genetically diagnose patients with suspected monogenic rare diseases. Properly detecting compound heterozygous variant combinations as disease-causing candidates is a challenge in diagnostic workflows because haplotype information is lost by currently used next-generation sequencing technologies. Consequently, computational tools are required to phase, that is, to resolve the haplotype of, the large number of heterozygous variants in the exome or genome of each patient. Here we present SmartPhase, a phasing tool designed to efficiently reduce the set of potential compound heterozygous variant pairs in genetic diagnosis pipelines. The phasing algorithm of SmartPhase creates haplotypes using both parental genotype information and reads generated by DNA or RNA sequencing and is thus well suited to resolving the phase of rare variants. To inform the user about the reliability of a phasing prediction, SmartPhase computes a confidence score, which is essential for selecting error-free predictions. It incorporates existing haplotype information and applies logical rules to determine variants that can be excluded as causing a recessive, monogenic disease. SmartPhase can phase either all possible variant pairs in predefined genetic loci or preselected variant pairs of interest, thus keeping the focus on clinically relevant results. We compared SmartPhase to WhatsHap, one of the leading comparable phasing tools, using simulated data and a real clinical cohort of 921 patients. On both data sets, SmartPhase generated error-free predictions using our derived confidence score threshold. It outperformed WhatsHap in the percentage of resolved pairs when parental genotype information was available. On the cohort data, SmartPhase enabled the exclusion of, on average, approximately 22% of the input variant pairs in each singleton patient and 44% in each trio patient.
SmartPhase is implemented as an open-source Java tool and freely available at http://ibis.helmholtz-muenchen.de/smartphase/.
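
The idea of phasing a heterozygous variant pair from parental genotypes, and then excluding pairs that cannot be compound heterozygous, can be illustrated with a minimal Python sketch. This is not SmartPhase's implementation or API: it ignores sequencing reads and confidence scores, assumes biallelic sites, correct genotype calls and no de novo mutations, and all names (Phase, alt_origin, phase_pair, can_exclude) are hypothetical. It only shows the textbook inheritance logic: two variants inherited from the same parent lie in cis, variants inherited from different parents lie in trans, and only trans pairs remain candidates for a recessive compound heterozygous cause.

```python
from enum import Enum
from typing import Optional, Tuple

# Genotype as an allele pair: 0 = reference allele, 1 = alternate allele.
Genotype = Tuple[int, int]


class Phase(Enum):
    CIS = "cis"          # both alternate alleles on the same haplotype
    TRANS = "trans"      # alternate alleles on opposite haplotypes
    UNKNOWN = "unknown"  # parental genotypes do not resolve the phase


def alt_origin(mother: Genotype, father: Genotype) -> Optional[str]:
    """Return the parent that transmitted the alternate allele of a variant
    that is heterozygous in the child, or None if this cannot be decided.

    Deliberately conservative: the origin is only called when exactly one
    parent carries the alternate allele.
    """
    mother_carries = 1 in mother
    father_carries = 1 in father
    if mother_carries and not father_carries:
        return "mother"
    if father_carries and not mother_carries:
        return "father"
    # Both parents carry it (ambiguous), or neither does
    # (possible de novo variant or genotyping error).
    return None


def phase_pair(parents_a: Tuple[Genotype, Genotype],
               parents_b: Tuple[Genotype, Genotype]) -> Phase:
    """Phase two variants that are both heterozygous in the child.

    Each argument is (mother_genotype, father_genotype) at one variant site.
    """
    origin_a = alt_origin(*parents_a)
    origin_b = alt_origin(*parents_b)
    if origin_a is None or origin_b is None:
        return Phase.UNKNOWN
    return Phase.CIS if origin_a == origin_b else Phase.TRANS


def can_exclude(phase: Phase) -> bool:
    """A pair in cis leaves one haplotype unaffected, so it can be excluded
    as a compound heterozygous candidate. Unresolved pairs are kept."""
    return phase is Phase.CIS


if __name__ == "__main__":
    # Variant A comes from the mother, variant B from the father.
    pair = phase_pair(((0, 1), (0, 0)), ((0, 0), (0, 1)))
    print(pair.value, can_exclude(pair))
```

In this sketch, a pair whose phase stays unknown is never excluded, which mirrors the cautious stance a diagnostic pipeline would need; read-based evidence of the kind described in the abstract would be one way to resolve such pairs.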

Suggested Citation

  • Paul Hager & Hans-Werner Mewes & Meino Rohlfs & Christoph Klein & Tim Jeske, 2020. "SmartPhase: Accurate and fast phasing of heterozygous variant pairs for genetic diagnosis of rare diseases," PLOS Computational Biology, Public Library of Science, vol. 16(2), pages 1-12, February.
  • Handle: RePEc:plo:pcbi00:1007613
    DOI: 10.1371/journal.pcbi.1007613

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007613
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007613&type=printable
    Download Restriction: no



