
Viral Population Estimation Using Pyrosequencing

Author

Listed:
  • Nicholas Eriksson
  • Lior Pachter
  • Yumi Mitsuya
  • Soo-Yon Rhee
  • Chunlin Wang
  • Baback Gharizadeh
  • Mostafa Ronaghi
  • Robert W Shafer
  • Niko Beerenwinkel

Abstract

The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an expectation–maximization (EM) algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.

Author Summary: The genetic diversity of viral populations is important for biomedical problems such as disease progression, vaccine design, and drug resistance, yet it is not generally well understood. In this paper, we use pyrosequencing, a novel DNA sequencing technique, to reconstruct viral populations. Pyrosequencing produces DNA sequences, called reads, in numbers much greater than standard DNA sequencing techniques. However, these reads are substantially shorter and more error-prone than those obtained from standard sequencing techniques. Therefore, pyrosequencing data requires new methods of analysis. Here, we develop mathematical and statistical tools for reconstructing viral populations using pyrosequencing. To this end, we show how to correct errors in the reads and assemble them into the different viral strains present in the population. We apply these methods to HIV-1 populations from drug-resistant patients and show that our techniques produce results quite close to accepted techniques at a lower cost and potentially higher resolution.
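
The final step named in the abstract, estimating haplotype frequencies with an EM algorithm, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple mixture model in which reads are already error-corrected, every read starts at a known offset, start positions are equally likely on every haplotype, and a read is either fully compatible with a candidate haplotype or not. The function names `compatible` and `em_haplotype_frequencies` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Read = Tuple[int, str]  # (start offset on the reference, read sequence)


def compatible(read: Read, haplotype: str) -> bool:
    """Assumption: a corrected read matches a haplotype exactly or not at all."""
    start, seq = read
    return haplotype[start:start + len(seq)] == seq


def em_haplotype_frequencies(
    reads: Sequence[Read],
    haplotypes: Sequence[str],
    n_iter: int = 200,
    tol: float = 1e-9,
) -> List[float]:
    """Estimate haplotype frequencies under a simple mixture model via EM."""
    k = len(haplotypes)
    # compat[i][j] is 1.0 if read i could have been drawn from haplotype j.
    compat = [[1.0 if compatible(r, h) else 0.0 for h in haplotypes] for r in reads]
    # Reads that match no candidate haplotype carry no information here.
    compat = [row for row in compat if sum(row) > 0]
    if not compat:
        raise ValueError("no read is compatible with any candidate haplotype")

    freqs = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(n_iter):
        counts = [0.0] * k
        # E-step: split each read among its compatible haplotypes,
        # weighted by the current frequency estimates.
        for row in compat:
            weights = [f * c for f, c in zip(freqs, row)]
            total = sum(weights)
            for j in range(k):
                counts[j] += weights[j] / total
        # M-step: new frequencies are the normalized expected read counts.
        new_freqs = [c / len(compat) for c in counts]
        delta = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


if __name__ == "__main__":
    haplotypes = ["AAAA", "AATA"]
    # Four reads fit both haplotypes, three fit only the first, one only the second.
    reads = [(0, "AA")] * 4 + [(2, "AA")] * 3 + [(2, "TA")]
    print(em_haplotype_frequencies(reads, haplotypes))
```

In this toy example the ambiguous reads carry no information about the mixture, so the estimate converges to approximately 0.75 and 0.25, the proportions among the reads that discriminate between the two haplotypes. A realistic model would also have to account for residual sequencing errors and uneven coverage, which this sketch ignores.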

Suggested Citation

  • Nicholas Eriksson & Lior Pachter & Yumi Mitsuya & Soo-Yon Rhee & Chunlin Wang & Baback Gharizadeh & Mostafa Ronaghi & Robert W Shafer & Niko Beerenwinkel, 2008. "Viral Population Estimation Using Pyrosequencing," PLOS Computational Biology, Public Library of Science, vol. 4(5), pages 1-13, May.
  • Handle: RePEc:plo:pcbi00:1000074
    DOI: 10.1371/journal.pcbi.1000074

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000074
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000074&type=printable
    Download Restriction: no


    Citations



    Cited by:

    1. Alexander R Macalalad & Michael C Zody & Patrick Charlebois & Niall J Lennon & Ruchi M Newman & Christine M Malboeuf & Elizabeth M Ryan & Christian L Boutwell & Karen A Power & Doug E Brackney & Kendr, 2012. "Highly Sensitive and Specific Detection of Rare Variants in Mixed Viral Populations from Massively Parallel Sequence Data," PLOS Computational Biology, Public Library of Science, vol. 8(3), pages 1-10, March.
