Printed from https://ideas.repec.org/a/plo/pcbi00/1004418.html

Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast

Author

Listed:
  • Zing Tsung-Yeh Tsai
  • Shin-Han Shiu
  • Huai-Kuang Tsai

Abstract

Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as a model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA “intrinsic properties” (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs.
Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome.

Author Summary

Identification of transcription factor binding sites based on sequence motifs is typically accompanied by a high false positive rate. Increasing evidence suggests that there are many other factors besides DNA sequence that may affect the binding and interaction of TFs with DNA. Through the integration of sequence motif, chromatin state, and DNA structure properties, we show that TF binding can be better predicted. Moreover, considering chromatin state and DNA structure properties simultaneously yields a significant improvement. While the binding of some TFs can be readily predicted using either chromatin state information or DNA structure, other TFs need both. Thus, our findings provide insights on how different histone modifications and DNA structure properties may influence the binding of a particular TF and thus how TFs regulate gene expression. These features are referred to as sequence “intrinsic properties” because they can be predicted from sequences alone. These intrinsic properties can be used to build a TF binding prediction model that has a similar performance to considering all features. Moreover, the intrinsic property model allows TFBS predictions not only across TFs, but also across DNA-binding domain families that are present in most eukaryotes, suggesting that the model likely can be used across species.

Suggested Citation

  • Zing Tsung-Yeh Tsai & Shin-Han Shiu & Huai-Kuang Tsai, 2015. "Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast," PLOS Computational Biology, Public Library of Science, vol. 11(8), pages 1-22, August.
  • Handle: RePEc:plo:pcbi00:1004418
    DOI: 10.1371/journal.pcbi.1004418

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004418
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004418&type=printable
    Download Restriction: no



    Citations


    Cited by:

    1. Sahra Uygun & Cheng Peng & Melissa D Lehti-Shiu & Robert L Last & Shin-Han Shiu, 2016. "Utility and Limitations of Using Gene Expression Data to Identify Functional Associations," PLOS Computational Biology, Public Library of Science, vol. 12(12), pages 1-27, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guo-Cheng Yuan & Jun S Liu, 2008. "Genomic Sequence Is Highly Predictive of Local Nucleosome Depletion," PLOS Computational Biology, Public Library of Science, vol. 4(1), pages 1-11, January.
    2. Ji-Ping Wang & Yvonne Fondufe-Mittendorf & Liqun Xi & Guei-Feng Tsai & Eran Segal & Jonathan Widom, 2008. "Preferentially Quantized Linker DNA Lengths in Saccharomyces cerevisiae," PLOS Computational Biology, Public Library of Science, vol. 4(9), pages 1-10, September.
    3. Wolfram Möbius & Ulrich Gerland, 2010. "Quantitative Test of the Barrier Nucleosome Model for Statistical Positioning of Nucleosomes Up- and Downstream of Transcription Start Sites," PLOS Computational Biology, Public Library of Science, vol. 6(8), pages 1-11, August.
    4. Leelavati Narlikar & Raluca Gordân & Alexander J Hartemink, 2007. "A Nucleosome-Guided Map of Transcription Factor Binding Sites in Yeast," PLOS Computational Biology, Public Library of Science, vol. 3(11), pages 1-10, November.
    5. Eilon Sharon & Shai Lubliner & Eran Segal, 2008. "A Feature-Based Approach to Modeling Protein–DNA Interactions," PLOS Computational Biology, Public Library of Science, vol. 4(8), pages 1-17, August.
    6. Alexander W. Blocker & Edoardo M. Airoldi, 2016. "Template-Based Models for Genome-Wide Analysis of Next-Generation Sequencing Data at Base-Pair Resolution," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(515), pages 967-987, July.
    7. Rahman, Shaikh Moksadur, 2020. "Relationship between Job Satisfaction and Turnover Intention: Evidence from Bangladesh," Asian Business Review, Asian Business Consortium, vol. 10(2), pages 99-108.
    8. Wang Kai, 2019. "Towards a Taxonomy of Idea Generation Techniques," Foundations of Management, Sciendo, vol. 11(1), pages 65-80, January.
    9. Bridgelall, Raj & Stubbing, Edward, 2021. "Forecasting the effects of autonomous vehicles on land use," Technological Forecasting and Social Change, Elsevier, vol. 163(C).
    10. Bevilacqua, Maurizio & Ciarapica, Filippo Emanuele, 2018. "Human factor risk management in the process industry: A case study," Reliability Engineering and System Safety, Elsevier, vol. 169(C), pages 149-159.
    11. Naveena Prakasam & Louisa Huxtable-Thomas, 2021. "Reddit: Affordances as an Enabler for Shifting Loyalties," Information Systems Frontiers, Springer, vol. 23(3), pages 723-751, June.
    12. Colin Jerolmack & Alexandra K. Murphy, 2019. "The Ethical Dilemmas and Social Scientific Trade-offs of Masking in Ethnography," Sociological Methods & Research, , vol. 48(4), pages 801-827, November.
    13. Valeriy Makarov & Albert Bakhtizin, 2014. "The Estimation Of The Regions’ Efficiency Of The Russian Federation Including The Intellectual Capital, The Characteristics Of Readiness For Innovation, Level Of Well-Being, And Quality Of Life," Economy of region, Centre for Economic Security, Institute of Economics of Ural Branch of Russian Academy of Sciences, vol. 1(4), pages 9-30.
    14. Zhao, Jing & Knoop, Victor L. & Wang, Meng, 2020. "Two-dimensional vehicular movement modelling at intersections based on optimal control," Transportation Research Part B: Methodological, Elsevier, vol. 138(C), pages 1-22.
    15. Kristine Edgar Danielyan & Samvel Grigoriy Chailyan, 2019. "Delineation of Effectors Impact on The Human Brain Derived Phosphoribosylpyrophosphate Synthetase-1 Activity," Biomedical Journal of Scientific & Technical Research, Biomedical Research Network+, LLC, vol. 24(1), pages 17918-17926, December.
    16. Chuan Wang & Yupeng Liu & Wen Hou & Chao Yu & Guorong Wang & Yuyan Zheng, 2021. "Reliability and availability modeling of Subsea Autonomous High Integrity Pressure Protection System with partial stroke test by Dynamic Bayesian," Journal of Risk and Reliability, , vol. 235(2), pages 268-281, April.
    17. Mohammad AL-Zoubi, 2018. "The Role of Technology, Organization, and Environment Factors in Enterprise Resource Planning Implementation Success in Jordan," International Business Research, Canadian Center of Science and Education, vol. 11(8), pages 48-65, August.
    18. Damgaard, Mette Trier & Nielsen, Helena Skyt, 2018. "Nudging in education," Economics of Education Review, Elsevier, vol. 64(C), pages 313-342.
    19. Nicole D. Sintov & P. Wesley Schultz, 2017. "Adjustable Green Defaults Can Help Make Smart Homes More Sustainable," Sustainability, MDPI, vol. 9(4), pages 1-12, April.
    20. Hwang, ShinYoung & Kim Seongcheol, 2017. "What triggers the use of mIM service provider’s sequel O2O service extensions?," 14th ITS Asia-Pacific Regional Conference, Kyoto 2017: Mapping ICT into Transformation for the Next Information Society 168494, International Telecommunications Society (ITS).
