dPeak: High Resolution Identification of Transcription Factor Binding Sites from PET and SET ChIP-Seq Data

Author

Listed:
  • Dongjun Chung
  • Dan Park
  • Kevin Myers
  • Jeffrey Grass
  • Patricia Kiley
  • Robert Landick
  • Sündüz Keleş

Abstract

Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-Seq) has been successfully used for genome-wide profiling of transcription factor binding sites, histone modifications, and nucleosome occupancy in many model organisms and humans. Because the compact genomes of prokaryotes harbor many binding sites separated by only a few base pairs, applications of ChIP-Seq in this domain have not reached their full potential. Applications in prokaryotic genomes are further hampered by the fact that well studied data analysis methods for ChIP-Seq do not achieve the resolution required for deciphering the locations of nearby binding events. We generated single-end tag (SET) and paired-end tag (PET) ChIP-Seq data for […] factor in Escherichia coli (E. coli). Direct comparison of these datasets revealed that although the PET assay enables higher resolution identification of binding events, standard ChIP-Seq analysis methods are not equipped to utilize PET-specific features of the data. To address this problem, we developed dPeak as a high resolution binding site identification (deconvolution) algorithm. dPeak implements a probabilistic model that accurately describes the ChIP-Seq data generation process for both the SET and PET assays. For SET data, dPeak outperforms or performs comparably to the state-of-the-art high-resolution ChIP-Seq peak deconvolution algorithms such as PICS, GPS, and GEM. When coupled with PET data, dPeak significantly outperforms SET-based analysis with any of the current state-of-the-art methods. Experimental validations of a subset of dPeak predictions from PET ChIP-Seq data indicate that dPeak can estimate locations of binding events with as high as […] to […] resolution. Applications of dPeak to ChIP-Seq data in E. coli under aerobic and anaerobic conditions reveal closely located promoters that are differentially occupied and further illustrate the importance of high resolution analysis of ChIP-Seq data.

Author Summary: Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-Seq) is widely used for studying in vivo protein-DNA interactions genome-wide. Current state-of-the-art ChIP-Seq protocols use the single-end tag (SET) assay, which only sequences ends of DNA fragments in the library. Although paired-end tag (PET) sequencing is routinely used in other applications of next generation sequencing, it has not been widely adopted for ChIP-Seq. We illustrate both experimentally and computationally that PET sequencing significantly improves the resolution of ChIP-Seq experiments and enables ChIP-Seq applications in compact genomes like Escherichia coli (E. coli). To enable efficient identification of binding sites using PET ChIP-Seq data, we develop dPeak as a high resolution binding site identification algorithm. dPeak implements probabilistic models for both SET and PET data and facilitates efficient analysis of both data types. Applications of dPeak to deeply sequenced E. coli PET and SET ChIP-Seq data establish the significantly better resolution of PET compared to SET sequencing.

Suggested Citation

  • Dongjun Chung & Dan Park & Kevin Myers & Jeffrey Grass & Patricia Kiley & Robert Landick & Sündüz Keleş, 2013. "dPeak: High Resolution Identification of Transcription Factor Binding Sites from PET and SET ChIP-Seq Data," PLOS Computational Biology, Public Library of Science, vol. 9(10), pages 1-13, October.
  • Handle: RePEc:plo:pcbi00:1003246
    DOI: 10.1371/journal.pcbi.1003246

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003246
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1003246&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1003246?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Xuekui Zhang & Gordon Robertson & Martin Krzywinski & Kaida Ning & Arnaud Droit & Steven Jones & Raphael Gottardo, 2011. "PICS: Probabilistic Inference for ChIP-seq," Biometrics, The International Biometric Society, vol. 67(1), pages 151-163, March.
    2. Kuan, Pei Fen & Chung, Dongjun & Pan, Guangjin & Thomson, James A. & Stewart, Ron & Keleş, Sündüz, 2011. "A Statistical Framework for the Analysis of ChIP-Seq Data," Journal of the American Statistical Association, American Statistical Association, vol. 106(495), pages 891-903.
    3. Tarjei S. Mikkelsen & Manching Ku & David B. Jaffe & Biju Issac & Erez Lieberman & Georgia Giannoukos & Pablo Alvarez & William Brockman & Tae-Kyung Kim & Richard P. Koche & William Lee & Eric Mendenh, 2007. "Genome-wide maps of chromatin state in pluripotent and lineage-committed cells," Nature, Nature, vol. 448(7153), pages 553-560, August.
    4. Elizabeth G Wilbanks & Marc T Facciotti, 2010. "Evaluation of Algorithm Performance in ChIP-Seq Peak Detection," PLOS ONE, Public Library of Science, vol. 5(7), pages 1-12, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Liang-Yu Fu & Tao Zhu & Xinkai Zhou & Ranran Yu & Zhaohui He & Peijing Zhang & Zhigui Wu & Ming Chen & Kerstin Kaufmann & Dijun Chen, 2022. "ChIP-Hub provides an integrative platform for exploring plant regulome," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    2. Gaylor Boulay & Liliane C. Broye & Rui Dong & Sowmya Iyer & Rajendran Sanalkumar & Yu-Hang Xing & Rémi Buisson & Shruthi Rengarajan & Beverly Naigles & Benoît Duc & Angela Volorio & Mary E. Awad & Raf, 2024. "EWS-WT1 fusion isoforms establish oncogenic programs and therapeutic vulnerabilities in desmoplastic small round cell tumors," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    3. Caiyan Jia & Matthew B Carson & Yang Wang & Youfang Lin & Hui Lu, 2014. "A New Exhaustive Method and Strategy for Finding Motifs in ChIP-Enriched Regions," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-13, January.
    4. Daniel Sánchez-Taltavull & Parameswaran Ramachandran & Nelson Lau & Theodore J Perkins, 2016. "Bayesian Correlation Analysis for Sequence Count Data," PLOS ONE, Public Library of Science, vol. 11(10), pages 1-24, October.
    5. Yayoi Natsume-Kitatani & Hiroshi Mamitsuka, 2016. "Classification of Promoters Based on the Combination of Core Promoter Elements Exhibits Different Histone Modification Patterns," PLOS ONE, Public Library of Science, vol. 11(3), pages 1-18, March.
    6. Anthony Mathelier & Wyeth W Wasserman, 2013. "The Next Generation of Transcription Factor Binding Site Prediction," PLOS Computational Biology, Public Library of Science, vol. 9(9), pages 1-18, September.
    7. Linghua Zhou & Yong Shen & Libo Jiang & Danni Yin & Jingxin Guo & Hui Zheng & Hao Sun & Rongling Wu & Yunqian Guo, 2015. "Systems Mapping for Hematopoietic Progenitor Cell Heterogeneity," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-18, May.
    8. Samb Rawane & Khadraoui Khader & Belleau Pascal & Deschênes Astrid & Lakhal-Chaieb Lajmi & Droit Arnaud, 2015. "Using informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 14(6), pages 517-532, December.
    9. Weronika Sikora-Wohlfeld & Marit Ackermann & Eleni G Christodoulou & Kalaimathy Singaravelu & Andreas Beyer, 2013. "Assessing Computational Methods for Transcription Factor Target Gene Identification Based on ChIP-seq Data," PLOS Computational Biology, Public Library of Science, vol. 9(11), pages 1-11, November.
    10. Yuzhuo Wang & Chengzhi Zhang & Kai Li, 2022. "A review on method entities in the academic literature: extraction, evaluation, and application," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2479-2520, May.
    11. Timothy Bailey & Pawel Krajewski & Istvan Ladunga & Celine Lefebvre & Qunhua Li & Tao Liu & Pedro Madrigal & Cenny Taslim & Jie Zhang, 2013. "Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data," PLOS Computational Biology, Public Library of Science, vol. 9(11), pages 1-8, November.
    12. Mohammad Jaber & Ahmed Radwan & Netanel Loyfer & Mufeed Abdeen & Shulamit Sebban & Areej Khatib & Hazar Yassen & Thorsten Kolb & Marc Zapatka & Kirill Makedonski & Aurelie Ernst & Tommy Kaplan & Yosef, 2022. "Comparative parallel multi-omics analysis during the induction of pluripotent and trophectoderm states," Nature Communications, Nature, vol. 13(1), pages 1-21, December.
    13. Dai, Hongsheng & Bao, Yanchun & Bao, Mingtang, 2013. "Maximum likelihood estimate for the dispersion parameter of the negative binomial distribution," Statistics & Probability Letters, Elsevier, vol. 83(1), pages 21-27.
    14. Federica Baccini & Monica Bianchini & Filippo Geraci, 2022. "Graph-Based Integration of Histone Modification Profiles," Mathematics, MDPI, vol. 10(11), pages 1-15, May.
    15. Ben Li & Yunxiao Li & Zhaohui S. Qin, 2017. "Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 73-90, June.
    16. Yurika Matsui & Mohamed Nadhir Djekidel & Katherine Lindsay & Parimal Samir & Nina Connolly & Gang Wu & Xiaoyang Yang & Yiping Fan & Beisi Xu & Jamy C. Peng, 2023. "SNIP1 and PRC2 coordinate cell fates of neural progenitors during brain development," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    17. Chet H. Loh & Siebe Genesen & Matteo Perino & Magnus R. Bark & Gert Jan C. Veenstra, 2021. "Loss of PRC2 subunits primes lineage choice during exit of pluripotency," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    18. Guannan Sun & Rajini Srinivasan & Camila Lopez-Anido & Holly A Hung & John Svaren & Sündüz Keleş, 2014. "In Silico Pooling of ChIP-seq Control Experiments," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-9, November.
    19. Yan Jiang & Siqi Sun & Yuan Quan & Xin Wang & Yuling You & Xiao Zhang & Yue Zhang & Yin Liu & Bingjing Wang & Henan Xu & Xuetao Cao, 2023. "Nuclear RPSA senses viral nucleic acids to promote the innate inflammatory response," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    20. Apratim Mitra & Jiuzhou Song, 2012. "WaveSeq: A Novel Data-Driven Method of Detecting Histone Modification Enrichments Using Wavelets," PLOS ONE, Public Library of Science, vol. 7(9), pages 1-11, September.
