
CSMET: Comparative Genomic Motif Detection via Multi-Resolution Phylogenetic Shadowing

Author

Listed:
  • Pradipta Ray
  • Suyash Shringarpure
  • Mladen Kolar
  • Eric P Xing

Abstract

Functional turnover of transcription factor binding sites (TFBSs), such as whole-motif loss or gain, is a common event during genome evolution. Conventional probabilistic phylogenetic shadowing methods model the evolution of genomes only at the nucleotide level, and lack the ability to capture the evolutionary dynamics of functional turnover of aligned sequence entities. As a result, comparative genomic search of non-conserved motifs across evolutionarily related taxa remains a difficult challenge, especially in higher eukaryotes, where the cis-regulatory regions containing motifs can be long and divergent; existing methods rely heavily on specialized pattern-driven heuristic search or sampling algorithms, which can be difficult to generalize and hard to interpret based on phylogenetic principles. We propose a new method: Conditional Shadowing via Multi-resolution Evolutionary Trees, or CSMET, which uses a context-dependent probabilistic graphical model that allows aligned sites from different taxa in a multiple alignment to be modeled by either a background or an appropriate motif phylogeny, conditional on the functional specifications of each taxon. The functional specifications themselves are the output of a phylogeny which models the evolution not of individual nucleotides, but of the overall functionality (e.g., functional retention or loss) of the aligned sequence segments over lineages. Combining this method with a hidden Markov model that autocorrelates evolutionary rates on successive sites in the genome, CSMET offers a principled way to take into consideration lineage-specific evolution of TFBSs during motif detection, and a readily computable analytical form of the posterior distribution of motifs under TFBS turnover. On both simulated and real Drosophila cis-regulatory modules, CSMET outperforms other state-of-the-art comparative genomic motif finders.

Author Summary: Functional turnover of transcription factor binding sites (TFBSs), such as whole-motif loss or gain, is a common event during genome evolution and plays a major role in shaping the genome and regulatory circuitry of contemporary species. Conventional methods for searching non-conserved motifs across evolutionarily related species have little or no probabilistic machinery to explicitly model this important evolutionary process; therefore, they offer little insight into the mechanism and dynamics of TFBS turnover and have limited power in finding motif patterns shaped by such processes. In this paper, we propose a new method: Conditional Shadowing via Multi-resolution Evolutionary Trees, or CSMET, which uses a mathematically elegant and computationally efficient way to model biological sequence evolution at both the nucleotide level of each individual site and the functional level of a whole TFBS. CSMET offers the first principled way to take into consideration lineage-specific evolution of TFBSs and CRMs during motif detection, and offers a readily computable analytical form of the posterior distribution of motifs under TFBS turnover. Its performance improves upon current state-of-the-art programs. It represents an initial foray into the problem of statistical inference of functional evolution of TFBSs, and offers a well-founded mathematical basis for the development of more realistic and informative models.
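The idea of scoring an aligned site under either a motif or a background phylogeny, depending on which taxa are functional, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a Felsenstein 81 substitution model, a three-taxon tree with made-up branch lengths, a hypothetical motif base distribution, and a simplified conditioning rule in which functional taxa are scored under the motif model and the remaining taxa under the background model. The functional phylogeny and the hidden Markov model described in the abstract are left out.

```python
import math

NUCS = "ACGT"

def f81_transition(pi, t):
    """F81 substitution probabilities P[a][b] for branch length t (rate 1 assumed)."""
    decay = math.exp(-t)
    return [[decay * (a == b) + (1.0 - decay) * pi[b] for b in range(4)]
            for a in range(4)]

def partial_likelihood(node, pi, column):
    """Felsenstein pruning: P(observed leaves below node | base at node)."""
    if isinstance(node, str):          # a leaf is named by its taxon
        if node not in column:         # taxon left out: marginalize over its base
            return [1.0] * 4
        return [1.0 if NUCS[a] == column[node] else 0.0 for a in range(4)]
    left, t_left, right, t_right = node
    result = [1.0] * 4
    for child, t in ((left, t_left), (right, t_right)):
        child_like = partial_likelihood(child, pi, column)
        trans = f81_transition(pi, t)
        for a in range(4):
            result[a] *= sum(trans[a][b] * child_like[b] for b in range(4))
    return result

def column_likelihood(tree, pi, column):
    """Likelihood of one alignment column, with the root base drawn from pi."""
    root = partial_likelihood(tree, pi, column)
    return sum(pi[a] * root[a] for a in range(4))

def conditional_likelihood(tree, pi_motif, pi_bg, column, functional):
    """Score functional taxa under the motif model and the rest under background.

    Simplifying assumption of this sketch: the two groups are treated as
    independent given the functional labels.
    """
    motif_part = {k: v for k, v in column.items() if functional[k]}
    bg_part = {k: v for k, v in column.items() if not functional[k]}
    return (column_likelihood(tree, pi_motif, motif_part)
            * column_likelihood(tree, pi_bg, bg_part))

# Hypothetical inputs: internal nodes are (left, t_left, right, t_right).
tree = (("mel", 0.1, "sim", 0.1), 0.2, "pse", 0.4)
pi_bg = [0.25, 0.25, 0.25, 0.25]       # uniform background base frequencies
pi_motif = [0.85, 0.05, 0.05, 0.05]    # motif position that prefers A
column = {"mel": "A", "sim": "A", "pse": "G"}
functional = {"mel": True, "sim": True, "pse": False}  # site lost in "pse"

with_turnover = conditional_likelihood(tree, pi_motif, pi_bg, column, functional)
no_turnover = column_likelihood(tree, pi_motif, column)
print(with_turnover, no_turnover)
```

Comparing the two printed values shows how strongly the score of a column depends on whether the divergent taxon is treated as functional, which is the kind of lineage-specific effect that a nucleotide-only shadowing model cannot express.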

Suggested Citation

  • Pradipta Ray & Suyash Shringarpure & Mladen Kolar & Eric P Xing, 2008. "CSMET: Comparative Genomic Motif Detection via Multi-Resolution Phylogenetic Shadowing," PLOS Computational Biology, Public Library of Science, vol. 4(6), pages 1-20, June.
  • Handle: RePEc:plo:pcbi00:1000090
    DOI: 10.1371/journal.pcbi.1000090

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000090
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000090&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Michael Z. Ludwig & Casey Bergman & Nipam H. Patel & Martin Kreitman, 2000. "Evidence for stabilizing selection in a eukaryotic enhancer element," Nature, Nature, vol. 403(6769), pages 564-567, February.


    Cited by:

    1. William H Majoros & Uwe Ohler, 2010. "Modeling the Evolution of Regulatory Elements by Simultaneous Detection and Alignment with Phylogenetic Pair HMMs," PLOS Computational Biology, Public Library of Science, vol. 6(12), pages 1-12, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Armita Nourmohammad & Michael Lässig, 2011. "Formation of Regulatory Modules by Local Sequence Duplication," PLOS Computational Biology, Public Library of Science, vol. 7(10), pages 1-12, October.
    2. Lauren A. Choate & Gilad Barshad & Pierce W. McMahon & Iskander Said & Edward J. Rice & Paul R. Munn & James J. Lewis & Charles G. Danko, 2021. "Multiple stages of evolutionary change in anthrax toxin receptor expression in humans," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    3. Iksoo Huh & Isabel Mendizabal & Taesung Park & Soojin V Yi, 2018. "Functional conservation of sequence determinants at rapidly evolving regulatory regions across mammals," PLOS Computational Biology, Public Library of Science, vol. 14(10), pages 1-21, October.
    4. Mathilde Paris & Tommy Kaplan & Xiao Yong Li & Jacqueline E Villalta & Susan E Lott & Michael B Eisen, 2013. "Extensive Divergence of Transcription Factor Binding in Drosophila Embryos with Highly Conserved Gene Expression," PLOS Genetics, Public Library of Science, vol. 9(9), pages 1-18, September.
