Inferring Binding Energies from Selected Binding Sites

Author

Listed:
  • Yue Zhao
  • David Granas
  • Gary D Stormo

Abstract

We employ a biophysical model that accounts for the non-linear relationship between binding energy and the statistics of selected binding sites. The model includes the chemical potential of the transcription factor, non-specific binding affinity of the protein for DNA, as well as sequence-specific parameters that may include non-independent contributions of bases to the interaction. We obtain maximum likelihood estimates for all of the parameters and compare the results to standard probabilistic methods of parameter estimation. On simulated data, where the true energy model is known and samples are generated with a variety of parameter values, we show that our method returns much more accurate estimates of the true parameters and much better predictions of the selected binding site distributions. We also introduce a new high-throughput SELEX (HT-SELEX) procedure to determine the binding specificity of a transcription factor in which the initial randomized library and the selected sites are sequenced with next generation methods that return hundreds of thousands of sites. We show that after a single round of selection our method can estimate binding parameters that give very good fits to the selected site distributions, much better than standard motif identification algorithms.

Author Summary: The DNA binding sites of transcription factors that control gene expression are often predicted based on a collection of known or selected binding sites. The most commonly used methods for inferring the binding site pattern, or sequence motif, assume that the sites are selected in proportion to their affinity for the transcription factor, ignoring the effect of the transcription factor concentration. We have developed a new maximum likelihood approach, in a program called BEEML, that directly takes into account the transcription factor concentration as well as non-specific contributions to the binding affinity, and we show in simulation studies that it gives a much more accurate model of the transcription factor binding sites than previous methods. We also develop a new method for extracting binding sites for a transcription factor from a random pool of DNA sequences, called high-throughput SELEX (HT-SELEX), and we show that after a single round of selection BEEML can obtain an accurate model of the transcription factor binding sites.
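
As a rough illustration of the non-linear relationship between binding energy and site selection described in the abstract, the following Python sketch assumes the commonly used Fermi-Dirac form for the probability that a site is bound, P = 1 / (1 + exp(E - mu)), with the energy E and the chemical potential mu in units of kT. The abstract does not state a formula, so this form, the way non-specific binding is folded in, and every function name and number below are assumptions chosen for illustration. It is not the BEEML implementation.

```python
import math


def effective_energy(specific_energy, nonspecific_energy):
    """Combine sequence-specific and non-specific binding (assumed form).

    The Boltzmann weights of the two binding modes are summed, so the
    result is always lower than either input energy.
    """
    return -math.log(math.exp(-specific_energy) + math.exp(-nonspecific_energy))


def binding_probability(energy, mu):
    """Assumed Fermi-Dirac occupancy, 1 / (1 + exp(E - mu)), in units of kT."""
    return 1.0 / (1.0 + math.exp(energy - mu))


# Two hypothetical sites: a strong one (0 kT) and a weaker one (2 kT).
strong, weak = 0.0, 2.0

# Ratio expected if sites were selected in proportion to affinity.
boltzmann_ratio = math.exp(weak - strong)

# Low chemical potential (scarce protein): selection is nearly proportional.
ratio_low_mu = binding_probability(strong, -5.0) / binding_probability(weak, -5.0)

# High chemical potential (abundant protein): both sites approach saturation.
ratio_high_mu = binding_probability(strong, 3.0) / binding_probability(weak, 3.0)

# Non-specific binding (hypothetical 5 kT) puts a floor under very weak sites.
p_weak_site_alone = binding_probability(10.0, 0.0)
p_weak_site_with_ns = binding_probability(effective_energy(10.0, 5.0), 0.0)

print(f"proportional-to-affinity ratio: {boltzmann_ratio:.2f}")
print(f"strong/weak ratio at mu = -5:   {ratio_low_mu:.2f}")
print(f"strong/weak ratio at mu = +3:   {ratio_high_mu:.2f}")
print(f"weak site, specific only:       {p_weak_site_alone:.2e}")
print(f"weak site, with non-specific:   {p_weak_site_with_ns:.2e}")
```

Under these assumptions, the strong-to-weak ratio stays close to the proportional-to-affinity value when mu is low and falls toward 1 when mu is high. This is the effect a method misses if it assumes selection in proportion to affinity and ignores transcription factor concentration.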

Suggested Citation

  • Yue Zhao & David Granas & Gary D Stormo, 2009. "Inferring Binding Energies from Selected Binding Sites," PLOS Computational Biology, Public Library of Science, vol. 5(12), pages 1-8, December.
  • Handle: RePEc:plo:pcbi00:1000590
    DOI: 10.1371/journal.pcbi.1000590

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000590
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000590&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Dana S F Homsi & Vineet Gupta & Gary D Stormo, 2009. "Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites," PLOS ONE, Public Library of Science, vol. 4(8), pages 1-9, August.
    2. Eilon Sharon & Shai Lubliner & Eran Segal, 2008. "A Feature-Based Approach to Modeling Protein–DNA Interactions," PLOS Computational Biology, Public Library of Science, vol. 4(8), pages 1-17, August.
    3. Mei-Ling Ting Lee & Martha L. Bulyk & G. A. Whitmore & George M. Church, 2002. "A Statistical Model for Investigating Binding Probabilities of DNA Nucleotide Sequences Using Microarrays," Biometrics, The International Biometric Society, vol. 58(4), pages 981-988, December.

    Citations


    Cited by:

    1. Vishaka Datta & Sridhar Hannenhalli & Rahul Siddharthan, 2019. "ChIPulate: A comprehensive ChIP-seq simulation pipeline," PLOS Computational Biology, Public Library of Science, vol. 15(3), pages 1-32, March.
    2. Claudia Coronnello & Ryan Hartmaier & Arshi Arora & Luai Huleihel & Kusum V Pandit & Abha S Bais & Michael Butterworth & Naftali Kaminski & Gary D Stormo & Steffi Oesterreich & Panayiotis V Benos, 2012. "Novel Modeling of Combinatorial miRNA Targeting Identifies SNP with Potential Role in Bone Density," PLOS Computational Biology, Public Library of Science, vol. 8(12), pages 1-13, December.
    3. Shuxiang Ruan & Gary D Stormo, 2017. "Inherent limitations of probabilistic models for protein-DNA binding specificity," PLOS Computational Biology, Public Library of Science, vol. 13(7), pages 1-15, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dana S F Homsi & Vineet Gupta & Gary D Stormo, 2009. "Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites," PLOS ONE, Public Library of Science, vol. 4(8), pages 1-9, August.
    2. Shuxiang Ruan & Gary D Stormo, 2017. "Inherent limitations of probabilistic models for protein-DNA binding specificity," PLOS Computational Biology, Public Library of Science, vol. 13(7), pages 1-15, July.
