IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1000562.html

Discovery of Regulatory Elements is Improved by a Discriminatory Approach

Author

Listed:
  • Eivind Valen
  • Albin Sandelin
  • Ole Winther
  • Anders Krogh

Abstract

A major goal in post-genome biology is the complete mapping of the gene regulatory networks for every organism. Identification of regulatory elements is a prerequisite for realizing this ambitious goal. A common problem is finding regulatory patterns in promoters of a group of co-expressed genes, but contemporary methods are challenged by the size and diversity of regulatory regions in higher metazoans. Two key issues are the small amount of information contained in a pattern compared to the large promoter regions and the repetitive characteristics of genomic DNA, both of which lead to “pattern drowning”. We present a new computational method for identifying transcription factor binding sites in promoters using a discriminatory approach with a large negative set encompassing a significant sample of the promoters from the relevant genome. The sequences are described by a probabilistic model, and the most discriminatory motifs are identified by maximizing the probability of the sets given the motif model and prior probabilities of motif occurrences in both sets. Due to the large number of promoters in the negative set, an enhanced suffix array is used to improve speed and performance. Using our method, we demonstrate higher accuracy than the best of contemporary methods, high robustness when extending the length of the input sequences, and a strong correlation between our objective function and the correct solution. Using a large background set of real promoters instead of a simplified model leads to higher discriminatory power and markedly reduces the need for repeat masking, a common pre-processing step for other pattern finders.

Author Summary: In the years following the sequencing of the human genome, the focus has shifted towards trying to understand how this blueprint results in the diversity of cells that we observe. Part of the answer lies in the regulation of transcription and in how the proteins responsible for it recognize where they should attach to the DNA. This is a well-studied problem, but most methods developed for it have a hard time dealing with the heterogeneity of mammalian genomes. Here we present a method that greatly improves the efficiency of this search by contrasting the DNA with a large number of background DNA sequences. This enables us to handle repetitive segments of the genome that may be functional but are usually considered intractable by most methods.
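
The discriminatory idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes exact k-mers instead of a probabilistic motif model, replaces the paper's objective function with a simple Bernoulli log-likelihood ratio on per-sequence occurrence, and counts by brute force rather than with an enhanced suffix array. All function names and the pseudocount are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def sequences_containing(seqs, k):
    """For every k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        # A set ensures each k-mer is counted once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def bernoulli_loglik(hits, total, p):
    """Log-likelihood of `hits` occurrences among `total` sequences at rate p."""
    return hits * math.log(p) + (total - hits) * math.log(1.0 - p)


def discriminative_score(pos_hits, n_pos, neg_hits, n_neg, pseudo=1.0):
    """Stand-in objective (an assumption, not the paper's): log-likelihood
    ratio of separate occurrence rates per set versus one shared rate,
    signed so that enrichment in the positive set scores high."""
    p_pos = (pos_hits + pseudo) / (n_pos + 2 * pseudo)
    p_neg = (neg_hits + pseudo) / (n_neg + 2 * pseudo)
    p_all = (pos_hits + neg_hits + pseudo) / (n_pos + n_neg + 2 * pseudo)
    separate = (bernoulli_loglik(pos_hits, n_pos, p_pos)
                + bernoulli_loglik(neg_hits, n_neg, p_neg))
    shared = (bernoulli_loglik(pos_hits, n_pos, p_all)
              + bernoulli_loglik(neg_hits, n_neg, p_all))
    score = separate - shared
    return score if p_pos > p_neg else -score


def rank_kmers(positive, negative, k):
    """Rank k-mers seen in the positive set by how well they separate the sets."""
    pos = sequences_containing(positive, k)
    neg = sequences_containing(negative, k)
    scored = [
        (discriminative_score(c, len(positive), neg.get(w, 0), len(negative)), w)
        for w, c in pos.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Toy data (made up): every positive sequence carries the word GTGAC.
    positive = ["ACGTGACC", "TTGTGACA", "GTGACGGA"]
    negative = ["ACACACAC", "TTTTGGGG", "CCCCAAAA", "GGTTGGTT"]
    for score, word in rank_kmers(positive, negative, 5)[:3]:
        print(f"{word}\t{score:.3f}")
```

Scoring against real negative sequences, rather than a simplified background model, means that words common across the whole promoter set are penalized automatically; this is the intuition behind the abstract's claim about a reduced need for repeat masking.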

Suggested Citation

  • Eivind Valen & Albin Sandelin & Ole Winther & Anders Krogh, 2009. "Discovery of Regulatory Elements is Improved by a Discriminatory Approach," PLOS Computational Biology, Public Library of Science, vol. 5(11), pages 1-8, November.
  • Handle: RePEc:plo:pcbi00:1000562
    DOI: 10.1371/journal.pcbi.1000562

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000562
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000562&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zing Tsung-Yeh Tsai & Shin-Han Shiu & Huai-Kuang Tsai, 2015. "Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast," PLOS Computational Biology, Public Library of Science, vol. 11(8), pages 1-22, August.
    2. René Dreos & Anna Sloutskin & Nati Malachi & Diana Ideses & Philipp Bucher & Tamar Juven-Gershon, 2021. "Computational identification and experimental characterization of preferred downstream positions in human core promoters," PLOS Computational Biology, Public Library of Science, vol. 17(8), pages 1-27, August.
    3. Eilon Sharon & Shai Lubliner & Eran Segal, 2008. "A Feature-Based Approach to Modeling Protein–DNA Interactions," PLOS Computational Biology, Public Library of Science, vol. 4(8), pages 1-17, August.
