A diploid assembly-based benchmark for variants in the major histocompatibility complex

Authors

Listed:
  • Chen-Shan Chin

    (DNAnexus, Inc, 1975 W El Camino Real)

  • Justin Wagner

    (Material Measurement Laboratory, National Institute of Standards and Technology)

  • Qiandong Zeng

    (Laboratory Corporation of America Holdings)

  • Erik Garrison

    (University of California, Santa Cruz)

  • Shilpa Garg

    (Harvard Medical School)

  • Arkarachai Fungtammasan

    (DNAnexus, Inc, 1975 W El Camino Real)

  • Mikko Rautiainen

    (Center for Bioinformatics, Saarland University, Saarland Informatics Campus E2.1
    Max Planck Institute for Informatics, Saarland Informatics Campus E1.4
    Saarland Graduate School for Computer Science, Saarland Informatics Campus E1.3)

  • Sergey Aganezov

    (Johns Hopkins University)

  • Melanie Kirsche

    (Johns Hopkins University)

  • Samantha Zarate

    (Johns Hopkins University)

  • Michael C. Schatz

    (Johns Hopkins University
    Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor)

  • Chunlin Xiao

    (National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health)

  • William J. Rowell

    (Pacific Biosciences)

  • Charles Markello

    (University of California, Santa Cruz)

  • Jesse Farek

    (Human Genome Sequencing Center, Baylor College of Medicine)

  • Fritz J. Sedlazeck

    (Human Genome Sequencing Center, Baylor College of Medicine)

  • Vikas Bansal

    (University of California San Diego)

  • Byunggil Yoo

    (Genomic Medicine Center, Children’s Mercy Kansas City)

  • Neil Miller

    (Genomic Medicine Center, Children’s Mercy Kansas City)

  • Xin Zhou

    (Stanford University)

  • Andrew Carroll

    (Google Inc, 1600 Amphitheatre Pkwy)

  • Alvaro Martinez Barrio

    (10x Genomics)

  • Marc Salit

    (Joint Initiative for Metrology in Biology)

  • Tobias Marschall

    (Institute of Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf)

  • Alexander T. Dilthey

    (Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf)

  • Justin M. Zook

    (Material Measurement Laboratory, National Institute of Standards and Technology)

Abstract

Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful: the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align the contigs to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC. The benchmark covers 94% of the MHC and 22,368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. It reliably identifies errors in mapping-based callsets and enables performance assessment in regions with much denser, more complex variation than the regions covered by previous benchmarks.
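The headline figures in the abstract can be turned into rough absolute numbers. The sketch below is only back-of-the-envelope arithmetic: the function names are hypothetical, the 5 Mbp region size is the rounded value given in the abstract, and the abstract does not state the mapping-based variant count or the exact regions over which the 49% increase was measured, so the derived values are approximations rather than figures reported by the authors.

```python
def implied_baseline_count(new_count: int, percent_more: float) -> int:
    """Estimate the baseline count implied by 'new_count is percent_more % more'.

    Assumption: both counts refer to comparable regions, which the
    abstract does not confirm.
    """
    return round(new_count / (1 + percent_more / 100))


def approx_bases_covered(region_bp: int, fraction: float) -> int:
    """Estimate how many bases a benchmark covers given a coverage fraction."""
    return round(region_bp * fraction)


if __name__ == "__main__":
    # Figures taken from the abstract: 22,368 variants, 49% more, 94% of ~5 Mbp.
    baseline = implied_baseline_count(22368, 49)
    covered = approx_bases_covered(5_000_000, 0.94)
    print(f"Implied mapping-based variant count: about {baseline}")
    print(f"Approximate bases covered by the benchmark: about {covered}")
```

Under these assumptions, the mapping-based benchmark would contain roughly 15,000 small variants in the MHC, and the assembly-based benchmark would span about 4.7 Mbp.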

Suggested Citation

  • Chen-Shan Chin & Justin Wagner & Qiandong Zeng & Erik Garrison & Shilpa Garg & Arkarachai Fungtammasan & Mikko Rautiainen & Sergey Aganezov & Melanie Kirsche & Samantha Zarate & Michael C. Schatz & Ch, 2020. "A diploid assembly-based benchmark for variants in the major histocompatibility complex," Nature Communications, Nature, vol. 11(1), pages 1-9, December.
  • Handle: RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-18564-9
    DOI: 10.1038/s41467-020-18564-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-020-18564-9
    File Function: Abstract
    Download Restriction: no


    Citations

    Cited by:

    1. Taotao Li & Duo Du & Dandan Zhang & Yicheng Lin & Jiakang Ma & Mengyu Zhou & Weida Meng & Zelin Jin & Ziqiang Chen & Haozhe Yuan & Jue Wang & Shulong Dong & Shaoyang Sun & Wenjing Ye & Bosen Li & Houb, 2023. "CRISPR-based targeted haplotype-resolved assembly of a megabase region," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
