
Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors

Author

Listed:
  • Saeed Omidi
  • Mihaela Zavolan
  • Mikhail Pachkov
  • Jeremie Breda
  • Severin Berger
  • Erik van Nimwegen

Abstract

Gene regulatory networks are ultimately encoded by the sequence-specific binding of transcription factors (TFs) to short DNA segments. Although it is customary to represent the binding specificity of a TF by a position-specific weight matrix (PSWM), which assumes that each position within a site contributes independently to the overall binding affinity, evidence has been accumulating that there can be significant dependencies between positions. Unfortunately, methodological challenges have so far hindered the development of a practical and generally accepted extension of the PSWM model. On the one hand, simple models that only consider dependencies between nearest-neighbor positions are easy to use in practice, but fail to account for the distal dependencies that are observed in the data. On the other hand, models that allow for arbitrary dependencies are prone to overfitting, requiring regularization schemes that are difficult for non-experts to use in practice. Here we present a new regulatory motif model, called the dinucleotide weight tensor (DWT), that incorporates arbitrary pairwise dependencies between positions in binding sites, rigorously from first principles, and free from tunable parameters. We demonstrate the power of the method on a large set of ChIP-seq datasets, showing that DWTs outperform both PSWMs and motif models that only incorporate nearest-neighbor dependencies. We also demonstrate that DWTs outperform two previously proposed methods. Finally, we show that DWTs inferred from ChIP-seq data also outperform PSWMs on HT-SELEX data for the same TF, suggesting that DWTs capture inherent biophysical properties of the interactions between the DNA binding domains of TFs and their binding sites. We make a suite of DWT tools available at dwt.unibas.ch, which allow users to automatically perform ‘motif finding’, i.e. the inference of DWT motifs from a set of sequences, binding site prediction with DWTs, and visualization of DWT ‘dilogo’ motifs.

Author summary: Gene regulatory networks are ultimately encoded in constellations of short binding sites in the DNA and RNA that are recognized by regulatory factors such as transcription factors (TFs). For several decades, computational analysis of regulatory networks has relied on a model of TF sequence-specificity, the position-specific weight matrix (PSWM), that assumes different positions in a binding site contribute independently to the total binding energy of the TF. However, in recent years evidence has been accumulating that, at least for some TFs, this assumption does not hold. Here we present a new model for the sequence-specificity of TFs, the dinucleotide weight tensor (DWT), that takes arbitrary pairwise dependencies between positions in binding sites into account, and show that it consistently outperforms PSWMs on high-throughput datasets on TF binding. Moreover, in contrast to previous approaches, DWTs are directly derived from first principles within a Bayesian framework, and contain no tunable parameters. This allows them to be easily applied in practice, and we make a suite of tools available for computational analysis with DWTs.
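The independence assumption described above can be illustrated with a minimal Python sketch. This is not the DWT model and not code from the authors' tools: the function names, the pseudocount of 0.5, the uniform background, and the toy alignment are all assumptions made for illustration. It scores sequences with a plain PSWM log-odds sum and uses empirical mutual information as a simple stand-in for detecting a pairwise dependency that a PSWM cannot represent.

```python
import math
from collections import Counter

BASES = "ACGT"

def pswm_from_sites(sites, pseudocount=0.5):
    """Estimate per-position base probabilities from aligned sites."""
    total = len(sites) + 4 * pseudocount
    pswm = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        pswm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pswm

def pswm_log_odds(pswm, seq, background=0.25):
    """Sum of independent per-position log-odds (the PSWM assumption)."""
    return sum(math.log(col[base] / background) for col, base in zip(pswm, seq))

def mutual_information(sites, i, j):
    """Empirical mutual information (in nats) between positions i and j."""
    n = len(sites)
    pair = Counter((s[i], s[j]) for s in sites)
    left = Counter(s[i] for s in sites)
    right = Counter(s[j] for s in sites)
    return sum(
        (c / n) * math.log((c / n) / ((left[a] / n) * (right[b] / n)))
        for (a, b), c in pair.items()
    )

# Hypothetical toy alignment: positions 0 and 3 always co-vary (A..T or C..G).
sites = ["AGGT", "ATGT", "CGGG", "CTGG"]
pswm = pswm_from_sites(sites)

observed = pswm_log_odds(pswm, "AGGT")    # combination present in the alignment
unobserved = pswm_log_odds(pswm, "AGGG")  # A..G combination never occurs
```

In this toy alignment the PSWM assigns the same score to `AGGT` and `AGGG`, because each column is treated on its own, while `mutual_information(sites, 0, 3)` is clearly positive and `mutual_information(sites, 0, 1)` is zero. A distal dependency of this kind, between non-adjacent positions, is what a nearest-neighbor model would also miss.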

Suggested Citation

  • Saeed Omidi & Mihaela Zavolan & Mikhail Pachkov & Jeremie Breda & Severin Berger & Erik van Nimwegen, 2017. "Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors," PLOS Computational Biology, Public Library of Science, vol. 13(7), pages 1-22, July.
  • Handle: RePEc:plo:pcbi00:1005176
    DOI: 10.1371/journal.pcbi.1005176

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005176
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005176&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rahman, Shaikh Moksadur, 2020. "Relationship between Job Satisfaction and Turnover Intention: Evidence from Bangladesh," Asian Business Review, Asian Business Consortium, vol. 10(2), pages 99-108.
    2. Wang Kai, 2019. "Towards a Taxonomy of Idea Generation Techniques," Foundations of Management, Sciendo, vol. 11(1), pages 65-80, January.
    3. Bridgelall, Raj & Stubbing, Edward, 2021. "Forecasting the effects of autonomous vehicles on land use," Technological Forecasting and Social Change, Elsevier, vol. 163(C).
    4. Bevilacqua, Maurizio & Ciarapica, Filippo Emanuele, 2018. "Human factor risk management in the process industry: A case study," Reliability Engineering and System Safety, Elsevier, vol. 169(C), pages 149-159.
    5. Naveena Prakasam & Louisa Huxtable-Thomas, 2021. "Reddit: Affordances as an Enabler for Shifting Loyalties," Information Systems Frontiers, Springer, vol. 23(3), pages 723-751, June.
    6. Colin Jerolmack & Alexandra K. Murphy, 2019. "The Ethical Dilemmas and Social Scientific Trade-offs of Masking in Ethnography," Sociological Methods & Research, , vol. 48(4), pages 801-827, November.
    7. Valeriy Makarov & Albert Bakhtizin, 2014. "The Estimation Of The Regions’ Efficiency Of The Russian Federation Including The Intellectual Capital, The Characteristics Of Readiness For Innovation, Level Of Well-Being, And Quality Of Life," Economy of region, Centre for Economic Security, Institute of Economics of Ural Branch of Russian Academy of Sciences, vol. 1(4), pages 9-30.
    8. Zhao, Jing & Knoop, Victor L. & Wang, Meng, 2020. "Two-dimensional vehicular movement modelling at intersections based on optimal control," Transportation Research Part B: Methodological, Elsevier, vol. 138(C), pages 1-22.
    9. Kristine Edgar Danielyan & Samvel Grigoriy Chailyan, 2019. "Delineation of Effectors Impact on The Human Brain Derived Phosphoribosylpyrophosphate Synthetase-1 Activity," Biomedical Journal of Scientific & Technical Research, Biomedical Research Network+, LLC, vol. 24(1), pages 17918-17926, December.
    10. Chuan Wang & Yupeng Liu & Wen Hou & Chao Yu & Guorong Wang & Yuyan Zheng, 2021. "Reliability and availability modeling of Subsea Autonomous High Integrity Pressure Protection System with partial stroke test by Dynamic Bayesian," Journal of Risk and Reliability, , vol. 235(2), pages 268-281, April.
    11. Mohammad AL-Zoubi, 2018. "The Role of Technology, Organization, and Environment Factors in Enterprise Resource Planning Implementation Success in Jordan," International Business Research, Canadian Center of Science and Education, vol. 11(8), pages 48-65, August.
    12. Damgaard, Mette Trier & Nielsen, Helena Skyt, 2018. "Nudging in education," Economics of Education Review, Elsevier, vol. 64(C), pages 313-342.
    13. Nicole D. Sintov & P. Wesley Schultz, 2017. "Adjustable Green Defaults Can Help Make Smart Homes More Sustainable," Sustainability, MDPI, vol. 9(4), pages 1-12, April.
    14. Hwang, ShinYoung & Kim Seongcheol, 2017. "What triggers the use of mIM service provider’s sequel O2O service extensions?," 14th ITS Asia-Pacific Regional Conference, Kyoto 2017: Mapping ICT into Transformation for the Next Information Society 168494, International Telecommunications Society (ITS).
    15. Sana Sadiq & Khadija Anasse & Najib Slimani, 2022. "The impact of mobile phones on high school students: connecting the research dots," Technium Social Sciences Journal, Technium Science, vol. 30(1), pages 252-270, April.
    16. Maude Hasbi & Antoine Dubus, 2019. "Determinants of Mobile Broadband Use in Developing Economies: Evidence from Sub-Saharan Africa," Working Papers hal-02264651, HAL.
    17. Jascha-Alexander Koch & Michael Siering, 2019. "The recipe of successful crowdfunding campaigns," Electronic Markets, Springer;IIM University of St. Gallen, vol. 29(4), pages 661-679, December.
    18. Martins, José & Costa, Catarina & Oliveira, Tiago & Gonçalves, Ramiro & Branco, Frederico, 2019. "How smartphone advertising influences consumers' purchase intention," Journal of Business Research, Elsevier, vol. 94(C), pages 378-387.
    19. Retina Rimal & Chris Papadopoulos, 2016. "The mental health of sexually trafficked female survivors in Nepal," International Journal of Social Psychiatry, , vol. 62(5), pages 487-495, August.
    20. Wu, Bing & Yip, Tsz Leung & Yan, Xinping & Guedes Soares, C., 2022. "Review of techniques and challenges of human and organizational factors analysis in maritime transportation," Reliability Engineering and System Safety, Elsevier, vol. 219(C).
