IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1005086.html

Likelihood-Based Inference of B Cell Clonal Families

Author

Listed:
  • Duncan K Ralph
  • Frederick A Matsen IV

Abstract

The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called “rearrangement” forming progenitor B cells, then a Darwinian process of lineage diversification and selection called “affinity maturation.” The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem “clonal family inference.” In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.

Author Summary: Antibodies must recognize a great diversity of antigens to protect us from infectious disease. The binding properties of antibodies are determined by the DNA sequences of their corresponding B cell receptors (BCRs). These BCR sequences are created in naive form by VDJ recombination, which randomly selects and trims the ends of V, D, and J genes, then joins the resulting segments together with additional random nucleotides. If they pass initial screening and bind an antigen, these sequences then undergo an evolutionary process of reproduction, mutation, and selection, revising the BCR to improve binding to its cognate antigen. It has recently become possible to determine the BCR sequences resulting from this process in high throughput. Although these sequences implicitly contain a wealth of information about both antigen exposure and the process by which we learn to resist pathogens, this information can only be extracted using computer algorithms. In this paper we describe a likelihood-based statistical method to determine, given a collection of BCR sequences, which of them are derived from the same recombination events. It is based on a hidden Markov model (HMM) of VDJ rearrangement which is able to calculate likelihoods for many sequences at once.
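
The idea of agglomerative, likelihood-based clustering mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the multi-HMM likelihood is replaced here by a deliberately simple toy model (an assumption made for illustration) in which all sequences in a cluster descend independently from one unknown ancestor, each site mutating with probability `mu`. The function names `cluster_log_likelihood` and `agglomerate`, the parameter `mu`, and the requirement that sequences be pre-aligned and of equal length are all hypothetical choices for this example.

```python
import math
from itertools import combinations

BASES = "ACGT"


def cluster_log_likelihood(seqs, mu=0.1):
    """Toy stand-in for a cluster likelihood (NOT the paper's multi-HMM).

    Assumes equal-length, aligned sequences that each descend directly
    from one unknown ancestor; the ancestral base at every site is
    uniform over ACGT and is summed out.
    """
    total = 0.0
    for column in zip(*seqs):
        site = 0.0
        for ancestor in BASES:
            p = 0.25  # uniform prior on the ancestral base
            for base in column:
                p *= (1.0 - mu) if base == ancestor else mu / 3.0
            site += p
        total += math.log(site)
    return total


def agglomerate(seqs, mu=0.1):
    """Greedily merge the pair of clusters with the largest log-likelihood
    ratio, stopping when no merge increases the total likelihood."""
    clusters = [[i] for i in range(len(seqs))]

    def loglik(cluster):
        return cluster_log_likelihood([seqs[i] for i in cluster], mu)

    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(range(len(clusters)), 2):
            # log [ P(A u B) / (P(A) P(B)) ]
            gain = (loglik(clusters[a] + clusters[b])
                    - loglik(clusters[a]) - loglik(clusters[b]))
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # every remaining merge would lower the likelihood
        a, b = best_pair
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "ACGTACGT", "TTGGCCAA", "TTGGCCAT"]
    print(agglomerate(reads))
```

The sketch recomputes every pairwise ratio at each step, which is quadratic in the number of clusters per merge. That cost is one plausible reason to want approximate variants that trade accuracy for speed, as the abstract describes.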

Suggested Citation

  • Duncan K Ralph & Frederick A Matsen IV, 2016. "Likelihood-Based Inference of B Cell Clonal Families," PLOS Computational Biology, Public Library of Science, vol. 12(10), pages 1-28, October.
  • Handle: RePEc:plo:pcbi00:1005086
    DOI: 10.1371/journal.pcbi.1005086

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005086
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005086&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Duncan K Ralph & Frederick A Matsen IV, 2016. "Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation," PLOS Computational Biology, Public Library of Science, vol. 12(1), pages 1-25, January.

    Citations

    Cited by:

    1. Amrit Dhar & Kristian Davidsen & Frederick A Matsen IV & Vladimir N Minin, 2018. "Predicting B cell receptor substitution profiles using public repertoire data," PLOS Computational Biology, Public Library of Science, vol. 14(10), pages 1-24, October.
    2. Nima Nouri & Steven H Kleinstein, 2020. "Somatic hypermutation analysis for improved identification of B cell clonal families from next-generation sequencing data," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-22, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Amrit Dhar & Kristian Davidsen & Frederick A Matsen IV & Vladimir N Minin, 2018. "Predicting B cell receptor substitution profiles using public repertoire data," PLOS Computational Biology, Public Library of Science, vol. 14(10), pages 1-24, October.
