
Studying the evolution of transcription factor binding events using multi-species ChIP-Seq data

Author

Listed:
  • Zheng Wei

    (Yale University – Keck Biostatistics Resources, Room 503, 300 George Street, New Haven, CT 06511, USA)

  • Zhao Hongyu

    (Yale School of Public Health – Biostatistics Division, New Haven, CT, USA)

Abstract

Recent technological advances make it possible to collect whole-genome transcription factor binding (TFB) profiles from multiple species using ChIP-Seq. These data provide rich information for understanding TFB evolution. However, few rigorous statistical models are available to infer TFB evolution from them. We have developed a phylogenetic-tree-based method to model the on/off rates of TFB events. Our method has two features that distinguish it from existing models. First, we mask nucleotide substitutions and focus on INDEL disruption of TFB events; INDELs are rarer evolutionary events and are more appropriate for divergent species and non-coding regulatory regions. Second, we correct for ascertainment bias in ChIP-Seq data by maximizing the likelihood conditional on the observed (incomplete) data. Simulations show that our method performs well in model selection and parameter estimation when there are sufficient aligned TFB events. When the method is applied to a ChIP-Seq data set from five vertebrates, we find that the instantaneous transition rates to INDELs are higher in TFB regions than in homologous non-binding regions. This difference is driven by an excess of alignment columns showing binding in one species but gaps in all other species. When we compare the inferred transition rates between conserved and non-conserved regions, the conserved regions are, as expected, estimated to have lower transition rates. The R package TFBphylo, which implements the described model, can be downloaded from http://bioinformatics.med.yale.edu/.
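The idea of maximizing a likelihood conditional on the observed data can be illustrated with a minimal Python sketch. It is not the TFBphylo implementation: it assumes a simplified two-state (off/on) continuous-time Markov chain rather than the paper's full model with INDEL states, and it assumes that an alignment column is ascertained only when at least one species shows binding. The tree, species names, rates, and function names are all hypothetical.

```python
import math
from itertools import product


def transition_probs(gain, loss, t):
    """2x2 matrix P(t) for a two-state chain (0 = off, 1 = on)."""
    s = gain + loss
    e = math.exp(-s * t)
    return [
        [loss / s + gain / s * e, gain / s * (1.0 - e)],
        [loss / s * (1.0 - e), gain / s + loss / s * e],
    ]


def partial_likelihood(node, pattern, gain, loss):
    """Felsenstein pruning: P(data below node | node state) for states 0, 1."""
    if isinstance(node, str):  # leaf, identified by its species name
        return [1.0 if pattern[node] == 0 else 0.0,
                1.0 if pattern[node] == 1 else 0.0]
    result = [1.0, 1.0]
    for child, branch_length in node:
        p = transition_probs(gain, loss, branch_length)
        below = partial_likelihood(child, pattern, gain, loss)
        for state in (0, 1):
            result[state] *= p[state][0] * below[0] + p[state][1] * below[1]
    return result


def pattern_likelihood(tree, pattern, gain, loss):
    """Unconditional likelihood, with the stationary distribution at the root."""
    s = gain + loss
    root = partial_likelihood(tree, pattern, gain, loss)
    return loss / s * root[0] + gain / s * root[1]


def conditional_likelihood(tree, pattern, species, gain, loss):
    """Likelihood given that the column is observable (assumed: not all off)."""
    all_off = {name: 0 for name in species}
    p_unobserved = pattern_likelihood(tree, all_off, gain, loss)
    return pattern_likelihood(tree, pattern, gain, loss) / (1.0 - p_unobserved)


def all_patterns(species):
    """Every on/off assignment across the given species."""
    return [dict(zip(species, bits))
            for bits in product((0, 1), repeat=len(species))]


# Hypothetical tree: an internal node is a list of (child, branch_length) pairs.
SPECIES = ["A", "B", "C"]
TREE = [([("A", 0.1), ("B", 0.1)], 0.2), ("C", 0.3)]

if __name__ == "__main__":
    column = {"A": 1, "B": 0, "C": 0}  # binding seen in species A only
    print(conditional_likelihood(TREE, column, SPECIES, gain=0.5, loss=1.5))
```

Dividing by the probability that a column is observable renormalizes the likelihood over the patterns that can actually appear in the data. Under these assumptions, the rates would be estimated by maximizing the sum of the log conditional likelihoods over all aligned columns.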

Suggested Citation

  • Zheng Wei & Zhao Hongyu, 2013. "Studying the evolution of transcription factor binding events using multi-species ChIP-Seq data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(1), pages 1-15, March.
  • Handle: RePEc:bpj:sagmbi:v:12:y:2013:i:1:p:1-15:n:1
    DOI: 10.1515/sagmb-2012-0004

    Download full text from publisher

    File URL: https://doi.org/10.1515/sagmb-2012-0004
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nils Lid Hjort & Cristiano Varin, 2008. "ML, PL, QL in Markov Chain Models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 35(1), pages 64-82, March.
    2. Hobolth, Asger & Siren, Jukka, 2016. "The multivariate Wright–Fisher process with mutation: Moment-based analysis and inference using a hierarchical Beta model," Theoretical Population Biology, Elsevier, vol. 108(C), pages 36-50.
    3. Hobolth, Asger & Wiuf, Carsten, 2024. "Maximum likelihood estimation and natural pairwise estimating equations are identical for three sequences and a symmetric 2-state substitution model," Theoretical Population Biology, Elsevier, vol. 156(C), pages 1-4.
    4. Mathilde Paris & Tommy Kaplan & Xiao Yong Li & Jacqueline E Villalta & Susan E Lott & Michael B Eisen, 2013. "Extensive Divergence of Transcription Factor Binding in Drosophila Embryos with Highly Conserved Gene Expression," PLOS Genetics, Public Library of Science, vol. 9(9), pages 1-18, September.
    5. Alexander Kremer & Rafael Weißbach, 2013. "Consistent estimation for discretely observed Markov jump processes with an absorbing state," Statistical Papers, Springer, vol. 54(4), pages 993-1007, November.
