
Transcription in Eukaryotes
by Jean-Pierre Herveg, Etienne De Plaen and a lot of friends at
the Brussels Branch of the Ludwig Institute for Cancer Research (LICR) and the Christian de Duve
Institute for Cellular Pathology (ICP).
April 2006
Université Catholique de Louvain
Avenue E. Mounier, 1200 Brussels (Belgium)
Questions
1. In prokaryotes, the sigma factor helps the RNA pol to recognize a promoter.
How is this done in eukaryotes?
2. Describe a eukaryotic promoter.
3. What are the three main post-transcriptional modifications in eukaryotes?
4. What is a lariat?
5. How can the sequence of a pseudogene be recognized?
Transcription in Eukaryotes
In eukaryotes, DNA is contained within the nucleus, where DNA is transcribed into RNA. RNA must then be
carried across the nuclear pores (exported) into the cytosol.
In prokaryotes, transcription is performed by a single RNA pol, whereas in eukaryotes it is performed
by 3 different RNA pols:
RNA pol I transcribes 5.8S, 18S, and 28S ribosomal RNA in the nucleolus.
RNA pol II transcribes mRNA and most small nuclear RNAs (snRNA).
RNA pol III transcribes 5S rRNA as well as all the tRNA species.
* S stands for Svedberg, the unit of the sedimentation coefficient.
Svedberg was a Swedish chemist who invented the analytical ultracentrifuge.
S has the dimension of time:
1 S = 10^-13 seconds
A 16S particle is one that sediments at 16 S in this machine.
16S is also the sedimentation coefficient of the RNA of the small ribosomal subunit in prokaryotes,
the small subunit ribosomal RNA (SSU rRNA).
In eukaryotes, the corresponding RNA is 18S.
Nowadays, RNAs are no longer compared by measuring their sedimentation rates;
their sequences are compared instead.
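As a small numerical illustration of the Svedberg unit, the sketch below converts S values to seconds and computes a sedimentation coefficient as the sedimentation velocity divided by the centrifugal acceleration. The function names and the example figures (rotor speed, radius, velocity) are assumptions chosen for illustration, not measurements from this course.

```python
import math

SVEDBERG_IN_SECONDS = 1e-13  # 1 S = 10^-13 s


def svedberg_to_seconds(s_value):
    """Convert a sedimentation coefficient from Svedberg units to seconds."""
    return s_value * SVEDBERG_IN_SECONDS


def sedimentation_coefficient(velocity_m_per_s, rpm, radius_m):
    """Return the sedimentation coefficient in Svedberg units.

    s = v / (omega^2 * r), with v the sedimentation velocity (m/s),
    omega the angular velocity (rad/s) and r the distance from the
    rotor axis (m). All input values here are hypothetical.
    """
    omega = 2 * math.pi * rpm / 60.0
    acceleration = omega ** 2 * radius_m
    return (velocity_m_per_s / acceleration) / SVEDBERG_IN_SECONDS


# Hypothetical run: 60,000 rpm, 6 cm from the axis, 1 micrometre per second.
print(svedberg_to_seconds(16))
print(sedimentation_coefficient(1e-6, 60000, 0.06))
```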
RNA pol I and III
RNA polymerase I (5.8S, 18S, and 28S ribosomal RNA)
RNA polymerase I transcribes only the genes for ribosomal RNA, from a single type of promoter.
The transcript includes the sequences of both large and small rRNAs, which are later released by
cleavages and processing. There are many copies of the transcription unit,
alternating with nontranscribed spacers, and organized in a cluster.
RNA polymerase III (5S rRNA and tRNA species).
The promoters fall into two general classes that are recognized in different ways by different
groups of factors. The promoters for 5S and tRNA genes are internal;
they lie downstream of the startpoint.
The promoters for snRNA (small nuclear RNA) genes lie upstream of the startpoint in the more
conventional manner of other promoters. In both cases, the individual elements that are
necessary for promoter function consist exclusively of sequences recognized
by transcription factors, which in turn direct the binding of RNA polymerase.
RNA pol II
General transcription factors (TFII) instead of the prokaryotic sigma (σ) factor
RNA Pol II does not contain a subunit similar to the prokaryotic σ factor, which can recognize the promoter
and unwind the DNA double helix. In eukaryotes, these two functions are carried out by a set of proteins
called general transcription factors. RNA Pol II is associated with six general transcription factors,
designated TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH, where "TF" stands for "transcription factor" and
"II" for RNA Pol II.
TATA-box binding protein and TAFs
TFIID consists of TBP (TATA-box binding protein) and TAFs (TBP-associated factors). The role of TBP
is to bind the "TATA" core promoter. TAFs may assist TBP in this process. In human cells, there are
12 TAF subunits. One of them, TAF250 (molecular weight 250 kDa), has histone
acetyltransferase activity, which can relieve the binding between DNA and histones in the nucleosome.
Pre-initiation complex (PIC)
The transcription factor that catalyzes DNA melting is TFIIH. However, before TFIIH can unwind DNA,
RNA Pol II and at least five general transcription factors (TFIIA is not absolutely necessary)
have to form a pre-initiation complex (PIC).
Elongation
After the PIC is assembled at the promoter, TFIIH, a helicase, can unwind DNA.
This requires energy released from ATP hydrolysis. Then, RNA Pol II uses nucleoside triphosphates (NTPs)
to synthesize an RNA transcript.
During RNA elongation, TFIIF remains attached to the RNA polymerase, but all of the other
transcription factors have dissociated from PIC.
The carboxyl-terminal domain (CTD) of the largest subunit of RNA Pol II is critical for elongation.
In the initiation phase, CTD is unphosphorylated, but during elongation it has to be phosphorylated.
Termination
Eukaryotic protein-coding genes contain a poly-A signal located downstream of the last exon. This signal is used
to add a series of adenylate residues during RNA processing. Transcription often terminates 0.5-2 kb
downstream of the poly-A signal, but the mechanism is unclear.
initiation
The promoter:
Eukaryotic RNA pols lack the sigma (σ) factor found in the prokaryotic enzyme. Instead of the Pribnow box,
a TATA box is found in many eukaryotic genes. It is located at approximately -25.
Many promoters have a CAAT box and some a GC box, both located around -40 to -110, upstream
of the startpoint. The location of these additional elements can vary, and they can be present on
either strand.
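A minimal sketch of how such a promoter element could be looked for in a sequence: the function below scans the region upstream of the startpoint for a TATA-like motif and keeps the hits that fall near -25. The simplified pattern TATA[AT]A, the search window (-40 to -15), the function name and the test sequence are assumptions for illustration; real promoter prediction is considerably more involved.

```python
import re


def find_tata_boxes(upstream, window=(-40, -15)):
    """Return (position, motif) pairs for TATA-like motifs near -25.

    `upstream` is the coding-strand DNA sequence that ends immediately
    before the startpoint (+1), so its last base is at position -1.
    The pattern and the window are simplifying assumptions.
    """
    seq = upstream.upper()
    n = len(seq)
    hits = []
    # A lookahead is used so that overlapping motifs are all reported.
    for match in re.finditer(r"(?=TATA[AT]A)", seq):
        position = match.start() - n  # coordinate relative to +1
        if window[0] <= position <= window[1]:
            hits.append((position, seq[match.start():match.start() + 6]))
    return hits


# Hypothetical 60-base upstream region with a TATA box starting at -30.
example = "G" * 30 + "TATAAA" + "C" * 24
print(find_tata_boxes(example))
```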
The basal transcription machinery: TF means transcription factor, and II indicates that the factor works
with RNA pol II.
The TFII factors bind to the promoter region, guiding the polymerase to this site.
Together they form the basal transcription machinery. The initial event in the process is the recognition of
the TATA box by the TATA box-binding protein, a component of TFIID. This is followed by the
sequential binding of other factors, including TFIIA, TFIIB, RNA polymerase II, and TFIIE.
Enhancers and silencers:
Enhancer sequences, which can be located several thousand bases upstream, downstream, or in the
middle of the transcribed region, can also bind proteins which stimulate transcription.
These are often tissue- and species-specific, explaining the regulation of genes in some tissues,
and the host range of viruses which have usurped these sequences to stimulate transcription of their
own genes.
------------------------------------question
Describe a eukaryotic promoter
Acetylation: to separate DNA from the nucleosomes
In eukaryotes, the association between DNA and histones prevents access of the polymerase and general
transcription factors to the promoter. Histone acetylation, catalyzed by histone acetyltransferases (HATs),
can relieve the binding between DNA and histones. Although a subunit of TFIID (TAF250 in humans) has HAT
activity, participation of other HATs can make transcription more efficient.
The following rules apply to most (but not all) cases:
Binding of activators to the enhancer element recruits HATs, which relieve the association between histones
and DNA, thereby enhancing transcription.
Binding of repressors to the silencer element recruits histone deacetylases
(denoted HDs or HDACs), which tighten the association between histones and DNA.
Methylation: to silence genes
Experimental evidence has shown that, in certain cells, heavily methylated genes are not expressed,
whereas cells carrying non-methylated forms of the same genes do express them.
An example is provided by housekeeping genes (genes needed for basic cellular functions in essentially
all cells): they are non-methylated and are transcribed continuously.
In addition, one of the two X chromosomes in females is not expressed.
One hypothesis links this phenomenon to the heavy methylation of the inactive X chromosome.
TAFs:
TAFs may assist TBP in connecting the basal transcription machinery to enhancers or silencers.
In human cells, there are 12 TAF subunits.
One of them, TAF250 (molecular weight 250 kDa), has histone acetyltransferase activity,
which can relieve the binding between DNA and histones in the nucleosome.
-----------------------------------question
In prokaryotes, the sigma factor helps the RNA pol to recognize a promoter. How is this done
in eukaryotes?
elongation
TFIIH can now use its helicase activity to unwind DNA.
This requires energy released from ATP hydrolysis.
The DNA melting starts from about -10 bp. Then, RNA Pol II uses nucleoside triphosphates (NTPs)
to synthesize an RNA transcript.
During RNA elongation, TFIIF remains attached to the RNA polymerase,
but all of the other transcription factors have dissociated from PIC (pre-initiation complex).
The carboxyl-terminal domain (CTD) of the largest subunit of RNA Pol II is critical for elongation.
In the initiation phase, CTD is unphosphorylated, but during elongation it has to be phosphorylated.
This domain contains many proline, serine and threonine residues.
termination
Eukaryotic protein-coding genes contain a poly-A signal located downstream of the last exon.
This signal is used to add a series of adenylate residues during RNA processing.
Transcription often terminates 0.5-2 kb downstream of the poly-A signal,
but the mechanism is unclear.
post-transcriptional modifications
Capping
Modification of the 5' ends of eukaryotic mRNAs is called capping.
The cap consists of a methylated guanosine (7-methylguanosine) linked to the rest of the mRNA by a 5' to 5'
triphosphate "bridge". Capping occurs very early during the synthesis of eukaryotic mRNAs, even before the mRNA
molecules have been completed by RNA polymerase II. Capped mRNAs are translated very efficiently
by ribosomes to make proteins. Some viruses, such as poliovirus, prevent capped cellular mRNAs
from being translated into proteins. This enables poliovirus to take over the protein-synthesizing machinery
of the infected cell to make new viruses.
Polyadenylation
Modification of the 3' ends of eukaryotic mRNAs is called polyadenylation.
Polyadenylation is the addition of several hundred A nucleotides to the 3' ends of mRNAs.
Polyadenylation signal
All eukaryotic mRNAs destined to get a poly-A tail (note: most, but not all, eukaryotic mRNAs get such a tail)
contain the sequence AAUAAA about 11-30 nucleotides upstream of the site where the tail is added.
AAUAAA is recognized by an endonuclease that cuts the RNA, allowing the tail to be added by a specific
enzyme: poly-A polymerase.
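The rule above (an AAUAAA signal followed, 11-30 nucleotides further on, by the site where the tail is added) can be sketched as a simple sequence scan. In this sketch the 11-30 nucleotide distance is measured from the end of the signal, which is an assumption about how the rule is counted; the function name, the 0-based indexing and the example sequence are also illustrative only.

```python
def find_polya_sites(rna, min_gap=11, max_gap=30, signal="AAUAAA"):
    """Locate poly-A signals and the window where cleavage is expected.

    Returns a list of (signal_start, (first, last)) tuples with 0-based
    indices. `first` and `last` bound the candidate cleavage positions,
    counted from the end of the signal (an assumption of this sketch).
    DNA input is accepted: T is converted to U.
    """
    seq = rna.upper().replace("T", "U")
    results = []
    start = seq.find(signal)
    while start != -1:
        signal_end = start + len(signal)  # index just after the signal
        first = signal_end + min_gap
        last = min(signal_end + max_gap, len(seq))
        if first <= len(seq):
            results.append((start, (first, last)))
        start = seq.find(signal, start + 1)
    return results


# Hypothetical 3' end of a transcript.
print(find_polya_sites("GGG" + "AAUAAA" + "C" * 40))
```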
Splicing (trans or cis):
------------------------Question
What are the three main post-transcriptional modifications in eukaryotes?
splicing
Another major difference between prokaryotic and eukaryotic mRNA is the occurrence of splicing
in eukaryotes. This process removes intervening sequences (introns) from the primary transcript
and precisely joins a set of exons to form the transcript that is translated.
The fragments that are removed are not random: they have consensus sequences at the 5'
and 3' splice sites.
Exons end with the sequence AG and begin with a G.
Introns begin with GU and end with AG.
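The GU...AG rule for intron boundaries can be illustrated with a short sketch that removes annotated introns from a primary transcript after checking that each one begins with GU and ends with AG. The function name, the 0-based half-open coordinates and the example sequences are assumptions made for illustration; a real spliceosome recognizes far more than these four bases.

```python
def splice(pre_mrna, introns):
    """Remove introns from a primary transcript and join the exons.

    `introns` is a list of (start, end) pairs, 0-based and half-open.
    Each intron must begin with GU and end with AG, otherwise a
    ValueError is raised. Coordinates are supplied by the caller;
    this sketch does not predict splice sites.
    """
    rna = pre_mrna.upper()
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = rna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} breaks the GU...AG rule")
        exons.append(rna[cursor:start])
        cursor = end
    exons.append(rna[cursor:])
    return "".join(exons)


# Hypothetical transcript: exon 1 ends with AG, exon 2 begins with G.
exon1 = "AUGGCAG"
intron = "GUAAGUCCCUUACUAACCCCCCCCCCCUUUUCAG"
exon2 = "GUUCUAA"
print(splice(exon1 + intron + exon2, [(len(exon1), len(exon1) + len(intron))]))
```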
About 20-50 bases upstream from the 3' splice site is an adenine residue known as the
branch site:
Splicing occurs through the production of a lariat intermediate.
The 2'-OH of the adenine (A) residue at the branch site attacks the phosphorus atom at the 5' splice site,
cleaving the bond and generating a lariat loop, in which the branch residue carries 3 phosphodiester
bonds, at its 2', 3', and 5' positions.
The 3' end of the 5' exon then attacks the phosphorus atom at the 3' splice site, joining the two exons
and releasing the lariat.
Note that the number of phosphodiester bonds remains the same throughout this reaction;
it proceeds through transesterification, not hydrolysis and ligation.
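The bookkeeping behind this remark can be checked with a few lines of arithmetic. The sketch below counts phosphodiester bonds before splicing, after the first transesterification (free exon 1 plus the lariat intermediate) and after the second (joined exons plus the released lariat). The function name and the example lengths are assumptions for illustration.

```python
def count_bonds(n_exon1, n_intron, n_exon2):
    """Count phosphodiester bonds at each stage of splicing.

    A linear chain of N nucleotides has N - 1 phosphodiester bonds.
    The lariat adds one 2'-5' bond at the branch site, which replaces
    the bond broken at the 5' splice site.
    Returns (before, after_step1, after_step2).
    """
    total = n_exon1 + n_intron + n_exon2
    before = total - 1

    # Step 1: exon 1 is released as a linear chain; intron + exon 2
    # stay linked and gain the 2'-5' branch bond.
    free_exon1 = n_exon1 - 1
    lariat_intermediate = (n_intron + n_exon2 - 1) + 1
    after_step1 = free_exon1 + lariat_intermediate

    # Step 2: the exons are joined into one linear chain; the intron
    # leaves as a lariat that keeps its 2'-5' branch bond.
    joined_exons = n_exon1 + n_exon2 - 1
    free_lariat = (n_intron - 1) + 1
    after_step2 = joined_exons + free_lariat

    return before, after_step1, after_step2


# Hypothetical lengths: 7-nt exon, 33-nt intron, 7-nt exon.
print(count_bonds(7, 33, 7))
```

The three counts are equal for any choice of lengths, which is what one expects from two transesterifications: each step breaks one bond and forms one.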
These reactions are mediated by small nuclear ribonucleoprotein particles (snRNPs), which consist of
small nuclear RNAs (snRNAs) and specific proteins and assemble into spliceosomes.
Spliceosomes recognize and align the splice sites, and prevent the intermediates from leaving
the complex until the reaction is complete. The RNA molecules present in the spliceosomes
base-pair with the primary transcript at the splice sites to assist in this reaction.
The snRNPs referred to as U1 and U2 recognize and bind to the 5' splice site at the end of exon 1 and
the branch site, respectively. A complex of U4, U5, and U6 then joins to complete the spliceosome.
---------------------------Question
What is a lariat ?
Base pairing appears to be involved in the recognition of these sites.
The snRNA contained in U1 has a sequence which matches the consensus for the beginning of an intron,
and the snRNA of U2 has a sequence which is complementary to the branch site.
The U2 and U6 snRNPs together probably function to form the catalytic site of the complex:
alternative splicing
For those who are interested, there is also a self splicing RNA:
http://138.192.68.68/bio/Courses/biochem2/RNA/SelfSplicingRNA.html
An intron inside a vector
Mammalian vectors often have an intron before the polylinker.
This intron increases transcription.
[Figure: vector map; labels: capping, polyadenylation signal]
pseudogenes
--------------------------------Question
How can the sequence of a pseudogene be recognized ?