Author
Listed:
- Vladimir B Bajic
- Sin Lam Tan
- Alan Christoffels
- Christian Schönbach
- Leonard Lipovich
- Liang Yang
- Oliver Hofmann
- Adele Kruger
- Winston Hide
- Chikatoshi Kai
- Jun Kawai
- David A Hume
- Piero Carninci
- Yoshihide Hayashizaki
Abstract
Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species. The TSS types were analyzed for associations with initiating dinucleotides, CpG islands, TATA boxes, and an extensive collection of statistically significant cis-elements in mouse and human. We found that different TSS types show preferences for different sets of initiating dinucleotides and cis-elements. Through Gene Ontology and eVOC categories and tissue expression libraries we linked TSS characteristics to expression. Moreover, we show a link of TSS characteristics to very specific genomic organization in an example of immune-response-related genes (GO:0006955). Our results shed light on the global properties of the two transcriptomes not revealed before and therefore provide the framework for better understanding of the transcriptional mechanisms in the two species, as well as a framework for development of new and more efficient promoter- and gene-finding tools.
Synopsis: Tens of thousands of mammalian genes are expressed in various cells at different times, controlled mainly at the promoter level through the interaction of transcription factors with cis-elements. The authors analyzed properties of a large collection of experimental mouse (Mus musculus) and human (Homo sapiens) transcription start sites (TSSs). They defined four types of TSSs based on the compositional properties of surrounding regions and showed that (a) the regions surrounding TSSs are much richer in properties than previously thought, (b) the four TSS types are associated with distinct groups of cis-elements and initiating dinucleotides, (c) the regions upstream of TSSs are distinctly different from the downstream ones in terms of the associated cis-elements, and (d) mouse and human TSS properties relative to CpG islands (CGIs) and TATA box elements suggest species-specific adaptation. The authors linked TSS characteristics to gene expression through categories defined by the Gene Ontology and eVOC classifications and tissue expression libraries. They provided examples of the preference of immune response genes for TSS types and specific genomic organization. Their results shed light on the fine compositional properties of TSSs in mammals and could lead to better design of promoter- and gene-finding tools, better annotation of promoters by cis-elements, and better regulatory network reconstructions. These areas represent some of the focal topics of bioinformatics and genomics research that are of interest to a wide range of life scientists.
Suggested Citation
Vladimir B Bajic & Sin Lam Tan & Alan Christoffels & Christian Schönbach & Leonard Lipovich & Liang Yang & Oliver Hofmann & Adele Kruger & Winston Hide & Chikatoshi Kai & Jun Kawai & David A Hume & Pi, 2006.
"Mice and Men: Their Promoter Properties,"
PLOS Genetics, Public Library of Science, vol. 2(4), pages 1-13, April.
Handle:
RePEc:plo:pgen00:0020054
DOI: 10.1371/journal.pgen.0020054
Citations
Cited by:
- Xiaobei Zhao & Eivind Valen & Brian J Parker & Albin Sandelin, 2011.
"Systematic Clustering of Transcription Start Site Landscapes,"
PLOS ONE, Public Library of Science, vol. 6(8), pages 1-16, August.
- Sutapa Datta & Subhasis Mukhopadhyay, 2013.
"A Composite Method Based on Formal Grammar and DNA Structural Features in Detecting Human Polymerase II Promoter Region,"
PLOS ONE, Public Library of Science, vol. 8(2), pages 1-11, February.