=== Bacterial ===
In bacteria, the promoter contains two short sequence elements approximately 10 (Pribnow box) and 35 nucleotides upstream from the transcription start site.
• The sequence at -10 (the -10 element) has the consensus sequence TATAAT.
• The sequence at -35 (the -35 element) has the consensus sequence TTGACA.
• These consensus sequences, while conserved on average, are not found intact in most promoters. On average, only 3 to 4 of the 6 base pairs in each consensus sequence are found in any given promoter. Few natural promoters identified to date possess intact consensus sequences at both -10 and -35; artificial promoters with complete conservation of the -10 and -35 elements have been found to transcribe at lower frequencies than those with a few mismatches with the consensus.
• The optimal spacing between the -35 and -10 sequences is 17 bp. The spacer sequence affects promoter strength by up to 600-fold.
• Some promoters contain one or more upstream promoter element (UP element) subsites (consensus sequence 5'-AAAAAARNR-3' when centered in the -42 region; consensus sequence 5'-AWWWWWTTTTT-3' when centered in the -52 region; W = A or T; R = A or G; N = any base).
• The transcription start site has the consensus sequence YRY (Y = C or T; R = A or G).
The above promoter sequences are recognized only by RNA polymerase holoenzyme containing sigma-70. RNA polymerase holoenzymes containing other sigma factors recognize different core promoter sequences.

← upstream                                                                  downstream →
5'-XXXXXXXPPPPPPXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGXXXX-3'
          -35         -10       Gene to be transcribed
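The -35 and -10 consensus hexamers and the 17 bp optimal spacing described above lend themselves to a simple scan. The following is a minimal sketch, not a validated promoter predictor: the function names, the `min_each` threshold, and the example sequence are illustrative assumptions, and real promoter finders also weigh the spacer sequence, UP elements, and the start site.

```python
MINUS_35 = "TTGACA"  # -35 consensus from the text
MINUS_10 = "TATAAT"  # -10 consensus from the text


def matches(window, consensus):
    """Count positions where the window agrees with the consensus."""
    return sum(a == b for a, b in zip(window, consensus))


def scan(seq, spacer=17, min_each=3):
    """Return (pos35, pos10, m35, m10) for each candidate element pair.

    pos35 and pos10 are 0-based indices of the first base of each hexamer.
    min_each is a hypothetical cutoff, chosen because natural promoters
    typically match only 3 to 4 of the 6 consensus positions.
    """
    seq = seq.upper()
    n = len(MINUS_35)
    hits = []
    for i in range(len(seq) - (2 * n + spacer) + 1):
        j = i + n + spacer  # start of the -10 window
        m35 = matches(seq[i:i + n], MINUS_35)
        m10 = matches(seq[j:j + n], MINUS_10)
        if m35 >= min_each and m10 >= min_each:
            hits.append((i, j, m35, m10))
    return hits


# Made-up test sequence: perfect hexamers separated by a 17 bp spacer.
example = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GG"
print(scan(example, min_each=6))  # [(2, 25, 6, 6)]
```

Lowering `min_each` to the default of 3 returns many more candidates, which illustrates why consensus matching alone is a weak predictor.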
Probability of occurrence of each nucleotide

for -10 sequence
T    A    T    A    A    T
77%  76%  60%  61%  56%  82%

for -35 sequence
T    T    G    A    C    A
69%  79%  61%  56%  54%  54%
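The per-position frequencies above can be checked against the statement that only 3 to 4 of the 6 consensus bases appear in a typical promoter. The sketch below sums the frequencies to get the expected number of matching positions and multiplies them to estimate how often a fully intact hexamer would occur. The multiplication assumes the positions vary independently, which is a simplifying assumption and not a claim from the source.

```python
from math import prod

# Frequencies of the consensus base at each position, from the table above.
FREQ_MINUS_10 = [0.77, 0.76, 0.60, 0.61, 0.56, 0.82]  # T A T A A T
FREQ_MINUS_35 = [0.69, 0.79, 0.61, 0.56, 0.54, 0.54]  # T T G A C A


def expected_matches(freqs):
    """Expected number of positions that match the consensus."""
    return sum(freqs)


def prob_intact(freqs):
    """Probability that all positions match, ASSUMING independence."""
    return prod(freqs)


for name, freqs in (("-10", FREQ_MINUS_10), ("-35", FREQ_MINUS_35)):
    print(f"{name}: expected matches {expected_matches(freqs):.2f}, "
          f"intact hexamer {prob_intact(freqs):.3f}")
```

Under these assumptions the expected match counts come out at about 4.1 for the -10 element and about 3.7 for the -35 element, in line with the 3 to 4 of 6 figure, and an intact hexamer is expected in roughly 10% and 5% of promoters respectively.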
=== Bidirectional (prokaryotic) ===
Promoters can be located very close together in the DNA. Such "closely spaced promoters" have been observed in the DNA of all life forms, from humans to prokaryotes, and are highly conserved. Therefore, they may provide some (presently unknown) advantages. These pairs of promoters can be positioned in divergent, tandem, and convergent directions. They can also be regulated by transcription factors and differ in various features, such as the nucleotide distance between them and the strengths of the two promoters.
The most important aspect of two closely spaced promoters is that they will, most likely, interfere with each other. Several studies have explored this using both analytical and stochastic models. Other studies have measured gene expression in synthetic genes or in one to a few genes controlled by bidirectional promoters.
More recently, one study measured most genes controlled by tandem promoters in E. coli. In that study, two main forms of interference were measured. One occurs when an RNA polymerase (RNAP) on the downstream promoter blocks the movement of RNAPs elongating from the upstream promoter. The other occurs when the two promoters are so close that an RNAP sitting on one of them blocks any other RNAP from reaching the other. These events are possible because a bound RNAP occupies several nucleotides of DNA, including the transcription start site. Similar events occur when the promoters are in divergent and convergent formations. The possible events also depend on the distance between the promoters.
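The two interference modes described for tandem promoters can be expressed as a simple geometric check. This is a toy sketch under stated assumptions, not the model used in the study: the function name, the idea of a single fixed footprint length, and the value 75 bp in the usage lines are hypothetical placeholders, not measured values.

```python
def tandem_interference(upstream_tss, downstream_tss, footprint_bp):
    """List the interference modes possible for two tandem promoters.

    upstream_tss, downstream_tss: start-site coordinates on the same strand.
    footprint_bp: stretch of DNA covered by one bound RNAP (an assumed,
    caller-supplied value; both polymerases are treated as covering the
    same span relative to their own start site).
    """
    if downstream_tss <= upstream_tss:
        raise ValueError("downstream_tss must lie downstream of upstream_tss")
    distance = downstream_tss - upstream_tss
    # An RNAP sitting on the downstream promoter can block polymerases
    # elongating from the upstream promoter at any spacing.
    modes = ["roadblock"]
    # If the two footprints would overlap, an RNAP bound at one promoter
    # also prevents another RNAP from binding the other promoter.
    if distance < footprint_bp:
        modes.append("occlusion")
    return modes


print(tandem_interference(0, 30, footprint_bp=75))   # ['roadblock', 'occlusion']
print(tandem_interference(0, 200, footprint_bp=75))  # ['roadblock']
```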
=== Eukaryotic ===
Gene promoters are typically located upstream of the gene and can have regulatory elements (enhancers) several kilobases away from the transcriptional start site. In eukaryotes, the transcriptional complex can cause the DNA to bend back on itself, which allows regulatory sequences to be placed far from the actual site of transcription. Eukaryotic RNA-polymerase-II-dependent promoters can contain a TATA box (consensus sequence TATAAA), which is recognized by the general transcription factor TATA-binding protein (TBP), and a B recognition element (BRE), which is recognized by the general transcription factor TFIIB. The TATA element and BRE are typically located close to the transcriptional start site (within 30 to 40 base pairs).
Eukaryotic promoter regulatory sequences typically bind proteins called transcription factors that are involved in the formation of the transcriptional complex. An example is the E-box (sequence CACGTG), which binds transcription factors in the basic helix-loop-helix (bHLH) family (e.g. BMAL1-Clock, c-Myc). Some promoters that are targeted by multiple transcription factors might achieve a hyperactive state, leading to increased transcriptional activity.
• Core promoter – the minimal portion of the promoter required to properly initiate transcription
  • Includes the transcription start site (TSS) and elements directly upstream
  • A binding site for RNA polymerase
    • RNA polymerase I: transcribes genes encoding 18S, 5.8S and 28S ribosomal RNAs
    • RNA polymerase II: transcribes genes encoding messenger RNA and certain small nuclear RNAs and microRNA
    • RNA polymerase III: transcribes genes encoding transfer RNA, 5S ribosomal RNAs and other small RNAs
  • General transcription factor binding sites, e.g. TATA box, B recognition element
  • Many other elements/motifs may be present. There is no such thing as a set of "universal elements" found in every core promoter.
• Proximal promoter – the proximal sequence upstream of the gene that tends to contain primary regulatory elements
  • Approximately 250 base pairs upstream of the start site
  • Specific transcription factor binding sites
• Distal promoter – the distal sequence upstream of the gene that may contain additional regulatory elements, often with a weaker influence than the proximal promoter
  • Anything further upstream (but not an enhancer or other regulatory region whose influence is positional/orientation independent)
  • Specific transcription factor binding sites
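The motifs and promoter regions listed above can be illustrated with a short motif search that reports where each hit falls relative to the transcription start site. This is a minimal sketch: the 40 bp core cutoff is an assumption taken from the "within 30 to 40 base pairs" figure for the TATA box and BRE, the 250 bp proximal cutoff follows the approximate figure in the list, and the example sequence is artificial. Offsets are plain index differences, so -30 means 30 bp upstream of the start site.

```python
import re

CORE_LIMIT = 40       # assumption: TATA/BRE lie within 30 to 40 bp of the TSS
PROXIMAL_LIMIT = 250  # approximate proximal-promoter extent from the text


def region(offset):
    """Classify an upstream offset (negative number) as core, proximal or distal."""
    distance = -offset
    if distance <= CORE_LIMIT:
        return "core"
    if distance <= PROXIMAL_LIMIT:
        return "proximal"
    return "distal"


def find_motif(seq, motif, tss_index):
    """Return (offset, region) for each occurrence of motif upstream of the TSS."""
    hits = []
    # A lookahead pattern lets overlapping occurrences be reported too.
    for m in re.finditer(f"(?={re.escape(motif)})", seq.upper()):
        offset = m.start() - tss_index
        if offset < 0:
            hits.append((offset, region(offset)))
    return hits


# Artificial sequence: an E-box 200 bp and a TATA box 30 bp upstream of the TSS.
TSS = 300
seq = "G" * 100 + "CACGTG" + "G" * 164 + "TATAAA" + "G" * 24 + "ATG"

print(find_motif(seq, "CACGTG", TSS))  # [(-200, 'proximal')]
print(find_motif(seq, "TATAAA", TSS))  # [(-30, 'core')]
```

Because CACGTG is its own reverse complement, scanning one strand is enough for the E-box; a non-palindromic motif such as TATAAA would need a second pass over the reverse complement to cover both strands.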
=== Mammalian promoters ===
Figure: An active enhancer regulatory region is enabled to interact with the promoter region of its target gene by formation of a chromosome loop. This can initiate messenger RNA (mRNA) synthesis by RNA polymerase II (RNAP II) bound to the promoter at the transcription start site of the gene. The loop is stabilized by one architectural protein anchored to the enhancer and one anchored to the promoter, and these proteins are joined to form a dimer (red zigzags). Specific regulatory transcription factors bind to DNA sequence motifs on the enhancer. General transcription factors bind to the promoter. When a transcription factor is activated by a signal (here indicated as phosphorylation, shown by a small red star on a transcription factor on the enhancer), the enhancer is activated and can now activate its target promoter. The active enhancer is transcribed on each strand of DNA in opposite directions by bound RNAP IIs. Mediator (a coactivator complex consisting of about 26 proteins in an interacting structure) communicates regulatory signals from the enhancer DNA-bound transcription factors to the promoter.
Up-regulated expression of genes in mammals is initiated when signals are transmitted to the promoters associated with the genes. Promoter DNA sequences may include different elements such as CpG islands (present in about 70% of promoters), a TATA box (present in about 24% of promoters), initiator (Inr) (present in about 49% of promoters), upstream and downstream TFIIB recognition elements (BREu and BREd) (present in about 22% of promoters), and downstream core promoter element (DPE) (present in about 12% of promoters).
The presence of multiple methylated CpG sites in CpG islands of promoters causes stable silencing of genes. However, the presence or absence of the other elements has relatively small effects on gene expression in experiments. Two sequences, the TATA box and Inr, caused small but significant increases in expression (45% and 28%, respectively). The BREu and BREd elements significantly decreased expression by 35% and 20%, respectively, and the DPE element had no detected effect on expression.
Cis-regulatory modules include enhancers, silencers, insulators and tethering elements. Among this constellation of elements, enhancers and their associated transcription factors have a leading role in the regulation of gene expression.
Enhancers are regions of the genome that are major gene-regulatory elements. Enhancers control cell-type-specific gene expression programs, most often by looping through long distances to come into physical proximity with the promoters of their target genes. In a study of brain cortical neurons, 24,937 loops were found bringing enhancers to promoters. Several cell-function-specific transcription factors (there are about 1,600 transcription factors in a human cell) generally bind to specific motifs on an enhancer, and a small combination of these enhancer-bound transcription factors, when brought close to a promoter by a DNA loop, governs the level of transcription of the target gene. Mediator (a coactivator complex usually consisting of about 26 proteins in an interacting structure) communicates regulatory signals from enhancer DNA-bound transcription factors directly to the RNA polymerase II (pol II) enzyme bound to the promoter.
Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two eRNAs as illustrated in the Figure. An inactive enhancer may be bound by an inactive transcription factor. Phosphorylation of the transcription factor may activate it, and the activated transcription factor may then activate the enhancer to which it is bound (see the small red star representing phosphorylation of a transcription factor bound to the enhancer in the illustration). An activated enhancer begins transcription of its RNA before activating a promoter to initiate transcription of messenger RNA from its target gene.
=== Bidirectional (mammalian) ===
Bidirectional promoters are short (<1 kbp) intergenic regions of DNA between the 5' ends of the genes in a bidirectional gene pair. A "bidirectional gene pair" refers to two adjacent genes coded on opposite strands, with their 5' ends oriented toward one another. The two genes are often functionally related, and modification of their shared promoter region allows them to be co-regulated and thus co-expressed. Bidirectional promoters are a common feature of mammalian genomes. About 11% of human genes are bidirectionally paired.
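The definition of a bidirectional gene pair translates directly into a check on gene coordinates: two adjacent genes on opposite strands whose 5' ends face each other across less than 1 kbp. The sketch below is illustrative only; the `Gene` record, the gene names, and the coordinates are made up, and it assumes the genes do not overlap so that sorting by start position gives true neighbours.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate on the reference
    end: int     # rightmost coordinate on the reference
    strand: str  # "+" or "-"


def five_prime(gene):
    """Coordinate of the gene's 5' end (its transcription start side)."""
    return gene.start if gene.strand == "+" else gene.end


def bidirectional_pairs(genes, max_gap=1000):
    """Return (left_name, right_name, gap) for head-to-head neighbours.

    A pair qualifies when the left gene is on the minus strand, the right
    gene is on the plus strand, and their 5' ends are less than max_gap bp
    apart (the <1 kbp figure from the text).
    """
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand == "-" and right.strand == "+":
            gap = five_prime(right) - five_prime(left)
            if 0 < gap < max_gap:
                pairs.append((left.name, right.name, gap))
    return pairs


# Hypothetical genes: A/B are head-to-head, C/D are tail-to-tail (convergent).
genes = [
    Gene("geneA", 1000, 5000, "-"),
    Gene("geneB", 5400, 9000, "+"),
    Gene("geneC", 12000, 15000, "+"),
    Gene("geneD", 15500, 18000, "-"),
]
print(bidirectional_pairs(genes))  # [('geneA', 'geneB', 400)]
```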
Microarray analysis has shown bidirectionally paired genes to be co-expressed to a higher degree than random genes or neighboring unidirectional genes. There are exceptions to this, however. In some cases (about 11%), only one gene of a bidirectional pair is expressed. Some functional classes of genes are more likely to be bidirectionally paired than others. Genes implicated in DNA repair are five times more likely to be regulated by bidirectional promoters than by unidirectional promoters. Chaperone proteins are three times more likely, and mitochondrial genes are more than twice as likely. Many basic housekeeping and cellular metabolic genes are regulated by bidirectional promoters.
CCAAT boxes are common, as they are in many promoters that lack TATA boxes. In addition, the motifs NRF-1, GABPA, YY1, and ACTACAnnTCCC are represented in bidirectional promoters at significantly higher rates than in unidirectional promoters. The absence of TATA boxes in bidirectional promoters suggests that TATA boxes play a role in determining the directionality of promoters, but counterexamples (bidirectional promoters that do possess TATA boxes and unidirectional promoters that lack them) indicate that TATA boxes cannot be the only factor.
Although the term "bidirectional promoter" refers specifically to promoter regions of mRNA-encoding genes, luciferase assays have shown that over half of human genes do not have a strong directional bias. Research suggests that non-coding RNAs are frequently associated with the promoter regions of mRNA-encoding genes. It has been hypothesized that the recruitment and initiation of RNA polymerase II usually begins bidirectionally, but divergent transcription is halted at a checkpoint later during elongation. Possible mechanisms behind this regulation include sequences in the promoter region, chromatin modification, and the spatial orientation of the DNA.

== Subgenomic ==