Authors
- Xi Chen
- Andrew F Neuwald
- Leena Hilakivi-Clarke
- Robert Clarke
- Jianhua Xuan
Abstract
Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
Author summary: Investigating TF binding to different types of regulatory regions can help reveal underlying activation mechanisms. However, accurately inferring modules among a large set of TFs is challenging due to the existence of weak, noisy, and context-sensitive binding signals. To reliably infer TF modules, here we describe ChIP-GSM, a Gibbs sampler built upon a Bayesian framework, that can further predict active regulatory elements. A comparison with other methods demonstrates ChIP-GSM’s improved performance on module identification and active regulatory element prediction. Experimental results demonstrate that TF modules identified by ChIP-GSM are likely mediating distinct cellular functions by activating regulatory regions at different time points.
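The general idea of Gibbs sampling over module-region assignments can be illustrated with a toy sketch. This is not the ChIP-GSM model or code: the Poisson read-count likelihood, the fixed foreground and background rates, the Dirichlet prior on module abundance, and all names (`gibbs_modules`, `fg_rate`, `bg_rate`, `alpha`) are assumptions made only for illustration. Each region is assigned to one candidate module, and the sampler alternates between drawing the module abundance and drawing each region's module.

```python
import math
import random


def poisson_logpmf(k, rate):
    """Log-probability of observing k reads under a Poisson(rate) model."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def gibbs_modules(counts, modules, fg_rate=20.0, bg_rate=2.0,
                  alpha=1.0, n_iter=200, burn_in=50, seed=0):
    """Toy Gibbs sampler (illustrative assumption, not the published method).

    counts:  one list of per-TF read counts for each region
    modules: one binary tuple per candidate module (1 = TF belongs to it)
    Returns posterior-mean module abundance and per-region module probabilities.
    """
    rng = random.Random(seed)
    n_regions, n_modules = len(counts), len(modules)

    # Log-likelihood of each region under each module: member TFs are
    # modelled with the foreground rate, non-members with the background rate.
    loglik = [[sum(poisson_logpmf(c, fg_rate if member else bg_rate)
                   for c, member in zip(region, module))
               for module in modules]
              for region in counts]

    assign = [rng.randrange(n_modules) for _ in range(n_regions)]
    abundance_sum = [0.0] * n_modules
    prob_sum = [[0.0] * n_modules for _ in range(n_regions)]
    kept = 0

    for it in range(n_iter):
        # Step 1: abundance | assignments ~ Dirichlet(alpha + tallies),
        # drawn by normalising independent Gamma variates.
        tallies = [alpha] * n_modules
        for z in assign:
            tallies[z] += 1
        gammas = [rng.gammavariate(t, 1.0) for t in tallies]
        total = sum(gammas)
        abundance = [g / total for g in gammas]

        # Step 2: each region's module | abundance, from its conditional posterior.
        for r in range(n_regions):
            logp = [math.log(abundance[k]) + loglik[r][k]
                    for k in range(n_modules)]
            top = max(logp)
            weights = [math.exp(v - top) for v in logp]
            norm = sum(weights)
            post = [w / norm for w in weights]
            assign[r] = rng.choices(range(n_modules), weights=post)[0]
            if it >= burn_in:
                for k in range(n_modules):
                    prob_sum[r][k] += post[k]

        if it >= burn_in:
            kept += 1
            for k in range(n_modules):
                abundance_sum[k] += abundance[k]

    abundance_mean = [a / kept for a in abundance_sum]
    binding_prob = [[p / kept for p in row] for row in prob_sum]
    return abundance_mean, binding_prob


# Hypothetical data: 4 regions x 3 TFs, two candidate modules.
toy_counts = [[25, 18, 1], [22, 21, 3], [2, 19, 24], [1, 6, 17]]
toy_modules = [(1, 1, 0), (0, 1, 1)]
abundance, binding_prob = gibbs_modules(toy_counts, toy_modules)
```

In this sketch, each row of `binding_prob` plays the role of the module-region probabilistic bindings that the abstract describes as features for the downstream logistic regression. The fourth toy region has a weak count (6) for the shared TF, yet it is still attributed to the second module because the evidence from the other TFs is pooled.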
Suggested Citation
Xi Chen & Andrew F Neuwald & Leena Hilakivi-Clarke & Robert Clarke & Jianhua Xuan, 2021.
"ChIP-GSM: Inferring active transcription factor modules to predict functional regulatory elements,"
PLOS Computational Biology, Public Library of Science, vol. 17(7), pages 1-22, July.
Handle:
RePEc:plo:pcbi00:1009203
DOI: 10.1371/journal.pcbi.1009203