
Statistical Resolution of Ambiguous HLA Typing Data

Authors
  • Jennifer Listgarten
  • Zabrina Brumme
  • Carl Kadie
  • Gao Xiaojiang
  • Bruce Walker
  • Mary Carrington
  • Philip Goulder
  • David Heckerman

Abstract

High-resolution HLA typing plays a central role in many areas of immunology, such as in identifying immunogenetic risk factors for disease, in studying how the genomes of pathogens evolve in response to immune selection pressures, and also in vaccine design, where identification of HLA-restricted epitopes may be used to guide the selection of vaccine immunogens. Perhaps one of the most immediate applications is in direct medical decisions concerning the matching of stem cell transplant donors to unrelated recipients. However, high-resolution HLA typing is frequently unavailable due to its high cost or the inability to re-type historical data. In this paper, we introduce and evaluate a method for statistical, in silico refinement of ambiguous and/or low-resolution HLA data. Our method, which requires an independent, high-resolution training data set drawn from the same population as the data to be refined, uses linkage disequilibrium in HLA haplotypes as well as four-digit allele frequency data to probabilistically refine HLA typings. Central to our approach is the use of haplotype inference. We introduce new methodology to this area, improving upon the Expectation-Maximization (EM)-based approaches currently used within the HLA community. Our improvements are achieved by using a parsimonious parameterization for haplotype distributions and by smoothing the maximum likelihood (ML) solution. These improvements make it possible to scale the refinement to a larger number of alleles and loci in a more computationally efficient and stable manner. We also show how to augment our method in order to incorporate ethnicity information (as HLA allele distributions vary widely according to race/ethnicity as well as geographic area), and demonstrate the potential utility of this experimentally. 
A tool based on our approach is freely available for research purposes at http://microsoft.com/science.

Author Summary

At the core of the human adaptive immune response is the train-to-kill mechanism, in which specialized immune cells are sensitized to recognize small peptides from foreign sources (e.g., from HIV or bacteria). Following this sensitization, these immune cells are activated to kill other cells that contain and display this same foreign peptide. However, in order for sensitization and killing to occur, the foreign peptide must be “paired up” with one of the infected person's other specialized immune molecules, an HLA molecule. The way in which peptides interact with these HLA molecules determines whether and how an immune response will be generated. There is a huge repertoire of such HLA molecules, with almost no two people having the same set. Furthermore, a person's HLA type can determine, for example, their susceptibility to disease or the success of a transplant. However, obtaining high-quality HLA data for patients is often difficult, either because of the great cost and specialized laboratories required or because the data are historical and cannot be retyped with modern methods. Therefore, we introduce a statistical model that uses existing high-quality HLA data to infer higher-quality HLA data from lower-quality data.
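As a rough illustration of the refinement idea summarized above, the following Python sketch ranks the high-resolution haplotypes compatible with an ambiguous typing by their smoothed frequency in a high-resolution training set. It is a deliberately simplified toy and not the authors' method: it assumes already-phased training haplotypes, refines a single chromosome at a time, and uses plain additive smoothing as a stand-in for the paper's parsimonious parameterization and smoothing of the ML solution. The function names, the pseudocount value, and the training counts are all hypothetical.

```python
from collections import Counter
from itertools import product


def smoothed_frequencies(training_haplotypes, candidates, pseudocount=0.5):
    """Estimate haplotype frequencies with additive smoothing.

    Every haplotype seen in training or listed as a candidate receives
    `pseudocount` extra observations, so unseen candidates keep a small
    non-zero probability. (Assumption: additive smoothing is used here
    only for illustration; it is not the smoothing used in the paper.)
    """
    counts = Counter(training_haplotypes)
    support = set(counts) | set(candidates)
    total = sum(counts.values()) + pseudocount * len(support)
    return {h: (counts[h] + pseudocount) / total for h in support}


def refine(ambiguous_typing, training_haplotypes, pseudocount=0.5):
    """Return a posterior over high-resolution haplotypes.

    `ambiguous_typing` holds one list of candidate four-digit alleles per
    locus; a locus typed unambiguously has a single-element list.
    """
    # All haplotypes compatible with the ambiguous observation.
    candidates = list(product(*ambiguous_typing))
    freqs = smoothed_frequencies(training_haplotypes, candidates, pseudocount)
    # Condition on the observation: renormalize over compatible haplotypes.
    normalizer = sum(freqs[h] for h in candidates)
    return {h: freqs[h] / normalizer for h in candidates}


# Hypothetical phased training data: (HLA-A allele, HLA-B allele) pairs.
training = (
    [("A*0201", "B*4402")] * 6
    + [("A*0205", "B*5801")] * 3
    + [("A*0201", "B*5801")] * 1
)

# HLA-A is known only at two-digit resolution ("A*02"); HLA-B is resolved.
posterior = refine([["A*0201", "A*0205"], ["B*5801"]], training)

for haplotype, prob in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(haplotype, round(prob, 3))
```

In this made-up training set A*0201 is the more common HLA-A allele overall, yet the sketch favors A*0205 because it co-occurs more often with B*5801. That is the role linkage disequilibrium plays in the refinement: the resolved locus carries information about the ambiguous one.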

Suggested Citation

  • Jennifer Listgarten & Zabrina Brumme & Carl Kadie & Gao Xiaojiang & Bruce Walker & Mary Carrington & Philip Goulder & David Heckerman, 2008. "Statistical Resolution of Ambiguous HLA Typing Data," PLOS Computational Biology, Public Library of Science, vol. 4(2), pages 1-15, February.
  • Handle: RePEc:plo:pcbi00:1000016
    DOI: 10.1371/journal.pcbi.1000016

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000016
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000016&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Jennifer Listgarten & Nicole Frahm & Carl Kadie & Christian Brander & David Heckerman, 2007. "A Statistical Framework for Modeling HLA-Dependent T Cell Response Data," PLOS Computational Biology, Public Library of Science, vol. 3(10), pages 1-8, October.


    Cited by:

    1. Jonathan M Carlson & Zabrina L Brumme & Christine M Rousseau & Chanson J Brumme & Philippa Matthews & Carl Kadie & James I Mullins & Bruce D Walker & P Richard Harrigan & Philip J R Goulder & David He, 2008. "Phylogenetic Dependency Networks: Inferring Patterns of CTL Escape and Codon Covariation in HIV-1 Gag," PLOS Computational Biology, Public Library of Science, vol. 4(11), pages 1-23, November.
