Sir William Dunn School of Pathology, University of Oxford, Oxford, OX1 3RE, United Kingdom
Correspondence: Sir William Dunn School of Pathology, University of Oxford, South Parks Rd., Oxford, OX1 3RE, U.K. E-mail: Dean.Jackson{at}Path.ox.ac.uk
| ABSTRACT |
|---|
Key Words: gene expression; RNA polymerase; nascent transcripts; transcript profiles; nuclear structure
| INTRODUCTION |
|---|
Three different RNA polymerase (pol) complexes perform RNA synthesis in
the nuclei of mammalian cells (11, 12). In most cells, RNA polymerase II
(pol II) is the major activity, transcribing all protein-coding genes to
generate patterns of gene expression that determine cell type. Synthesis
is performed by an ~4 MDa holoenzyme containing the pol II core enzyme
and other activities required during RNA synthesis and processing
(13-17). RNA polymerase I (pol I) is dedicated to transcribing the
repeated ribosomal RNA (rRNA) genes within specialized nuclear sites,
the nucleoli, and RNA polymerase III (pol III), a minor nucleoplasmic
activity, transcribes transfer RNA (tRNA) and 5S rRNA genes. Small
nuclear RNA (snRNA) and small nucleolar RNA (snoRNA) genes encode
structural RNAs needed for RNA processing; some are transcribed by
pol II and others by pol III.
| THE COMPLEXITY OF RNA SYNTHESIS IN MAMMALIAN CELLS |
|---|
Numbers of nascent transcripts in intact cells
Establishing the number of engaged transcripts is the first step
toward explaining the behavior of nuclear RNA (nRNA). Overall rates of
RNA synthesis in vivo can reveal the number of nascent
transcripts, providing that rates of elongation and average transcript
lengths are known.
In mammalian cells, rRNA genes give the most reliable estimates for
these values (18). In humans, each chromosome set has ~180 copies of a
45 kbp rRNA repeat, clustered in tandem arrays on chromosomes 13, 14,
15, 21, and 22. Each repeat contains a 13 kbp transcription unit that
is transcribed and processed in nucleoli to give the 28, 18, and 5.8S
rRNAs. The repeat organization of these highly active genes makes them
easy to identify in chromatin spreads (19), where the number of engaged
transcription complexes can be counted (see Fig. 1). In a HeLa cell,
120-150 active rRNA genes (20) are each associated with 100-120
transcription complexes (19). With a synthetic rate of ~2.5 kbp/min
(21, 22), this is sufficient to yield 4 x 10^6 complete transcripts per
22 h cell cycle and maintain the observed steady-state level of
~3.5 x 10^6 ribosomes in these proliferative cells (23).
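The arithmetic behind this rRNA estimate can be sketched as follows. This is an illustrative back-of-envelope check, not the calculation used in the cited work; the function and constant names are hypothetical, and it assumes that every engaged complex elongates continuously at the quoted rate and finishes its transcript.

```python
# Rough check of rRNA output per HeLa cell cycle, using the figures
# quoted in the text. All names here are illustrative assumptions.
ELONGATION_KB_PER_MIN = 2.5   # pol I synthetic rate
UNIT_LENGTH_KB = 13.0         # rRNA transcription unit
CELL_CYCLE_MIN = 22 * 60      # 22 h cell cycle

def transcripts_per_cycle(active_genes, pols_per_gene):
    """Completed pre-rRNA transcripts per cell cycle.

    Each engaged polymerase needs UNIT_LENGTH_KB / ELONGATION_KB_PER_MIN
    minutes (5.2 min) to traverse the unit, so a gene carrying
    `pols_per_gene` complexes releases that many transcripts per transit.
    """
    transit_min = UNIT_LENGTH_KB / ELONGATION_KB_PER_MIN
    per_min = active_genes * pols_per_gene / transit_min
    return per_min * CELL_CYCLE_MIN

# Lower and upper bounds from 120-150 genes and 100-120 pols per gene.
low = transcripts_per_cycle(120, 100)
high = transcripts_per_cycle(150, 120)
print(f"{low:.2e} to {high:.2e} transcripts per cell cycle")
```

Under these assumptions the two bounds bracket the ~4 x 10^6 transcripts quoted in the text, which is enough to replace the ribosome population once per cell cycle.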
Estimates of the rate of synthesis by pol II rely on the time of
appearance of sequences located at known distances along a transcript.
In human cells, quantitative polymerase chain reaction (PCR) was used
to analyze transcription across different regions of the ~2.5 Mb
dystrophin gene following the induction of expression in muscle cell
cultures. Elongation rates in the range 1.7-2.5 kb/min were determined
(24). In rat kidney cells, the reactivation of serum responsive genes
following serum deprivation suggests a synthetic rate of 1.1-1.4 kb/min
(25). For comparison, in Drosophila, the larval Ultrabithorax genes are
transcribed at ~1.4 kb/min (26) and stress-induced hsp70 genes at
~1.2 kb/min, in cells growing in culture (27).
Kinetic analyses of the entry of [3H]adenosine into ATP and RNA
establish levels of RNA synthesis within intact cells. Mouse L cells,
which divide every 12-14 h, support a continuous synthetic rate of
~2 x 10^8 nucleotides/min with 39, 58, and 3% into pre-rRNA, pre-mRNA
(hnRNA), and 4-5S RNA, respectively (28). With synthetic rates of 2.5
and 2 kbp/min for pol I and pol II/III, respectively (see above), and
average transcript lengths of 13, 10, and 0.1 kb for pol I, pol II, and
pol III, respectively, these values correspond to ~30,000, 60,000, and
3,000 engaged pol I, II, and III complexes/cell, respectively. In
rabbit erythroid cells, levels of RNA synthesis are at least 10-fold
lower (29), though similar proportions of pol I and II activity are
seen.
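The conversion from bulk synthetic rate to numbers of engaged polymerases is simple division, sketched below with the mouse L cell figures from the text. The variable names are illustrative, and the sketch assumes that all engaged complexes of a class elongate at the single quoted rate.

```python
# Engaged polymerases per cell = nucleotides incorporated per minute by
# that class / elongation rate of one polymerase (nucleotides per minute).
TOTAL_NT_PER_MIN = 2e8  # whole-cell RNA synthesis, mouse L cells

# Fraction of incorporation attributed to each polymerase class.
FRACTION = {"pol I": 0.39, "pol II": 0.58, "pol III": 0.03}

# Elongation rates in nucleotides/min (2.5 kb/min for pol I, 2 kb/min otherwise).
RATE_NT_PER_MIN = {"pol I": 2500, "pol II": 2000, "pol III": 2000}

# Average transcript lengths in nucleotides (13, 10, and 0.1 kb).
LENGTH_NT = {"pol I": 13_000, "pol II": 10_000, "pol III": 100}

engaged = {
    pol: TOTAL_NT_PER_MIN * FRACTION[pol] / RATE_NT_PER_MIN[pol]
    for pol in FRACTION
}

# Transcript lengths are only needed for the number of chains completed
# per minute, not for the number of engaged complexes.
completed_per_min = {
    pol: TOTAL_NT_PER_MIN * FRACTION[pol] / LENGTH_NT[pol]
    for pol in FRACTION
}

for pol in FRACTION:
    print(pol, round(engaged[pol]), "engaged;",
          round(completed_per_min[pol]), "transcripts/min")
```

The results round to the ~30,000, 60,000, and 3,000 engaged complexes quoted in the text.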
At the same time, this approach allows the proportion of labeled RNA
entering the cytoplasm to be established. Only 2.1% hnRNA reaches the
cytoplasm as mRNA in mouse L cells (28), similar to the 3.5% seen for
rabbit erythroid cells (29), where globin transcripts represent at
least 85% of cytoplasmic mRNA.
A complementary analysis performed on HeLa cells, labeled for 10 min
with [32P]orthophosphate, confirmed that nucleolar and nucleoplasmic
RNA account for ~20 and 80% of synthesis, respectively (30). However,
in view of their unusually high level of nRNA turnover (31), it is
worth considering whether cells adapted for continuous culture have
developed atypical patterns of RNA metabolism. In fact, studies on
freshly isolated cells indicate that this is not the case. For example,
the distribution of nRNA has been analyzed in freshly isolated rat
hepatocytes that were treated with hydrocortisone, labeled for 5 min
with [3H]uridine, and subsequently grown in medium with excess uridine
(32). In these cells, nucleolar transcripts account for ~30% of RNA
synthesis, while the remainder, in the nucleoplasm, behave like pol II
transcripts in immortalized cells.
The numbers of active RNA polymerases in vitro
Various factors complicate the analysis of nascent transcripts
labeled in vivo (30, 32). However, in principle, it should be a simple
matter to establish the number of nascent transcripts, because each
nascent transcript is held by one active pol and cells therefore have
the same number of each. Numbers of active pols can be determined
in vitro using purified nuclei and defined conditions with labeled
precursors of known specific activity and inhibitors to establish
levels of synthesis by the different pols. Following incorporation, the
amounts of radiolabel at internal and terminal positions of the nascent
chains allow the number of active complexes to be established. Using
this approach, rat liver nuclei were shown to have 25,000-35,000
engaged polymerase complexes, ~60% of which were pol I (33). However,
this type of analysis is complicated by variable recoveries of active
pols in nuclei and the behavior of different enzymes under the various
conditions used (33-35).
Most of the problems can be minimized if cells are simply permeabilized
using mild detergents, washed to remove endogenous pools, and
transcription performed under conditions that mimic those found
in vivo (20). This preserves active polymerase complexes and allows the
number of active polymerases/cell to be determined, providing the
number of nucleotides incorporated into each RNA can be established. In
HeLa cells, this approach gives ~90,000 nascent RNA chains per cell,
with ~15,000 active pol I and 75,000 active pol II/III complexes, ~10%
of which are pol III (20, 36).
Direct measurements of the number of pol II molecules in proliferative
cells are in line with these estimates. For example, binding of
[3H]amanitin to either crude or purified cell homogenates shows
cultured cells to have ~40,000 pol II molecules per haploid genome
(37). In this case, a typical HeLa cell, with almost 5 haploid
equivalents of DNA, will contain ~200,000 pol II complexes. Comparison
of HeLa extracts with known amounts of purified protein confirms that
each cell has ~320,000 molecules of the largest subunit of RNA
polymerase II, though only ~65,000 molecules are engaged in
transcription (38); as with many nuclear proteins, only a minor
fraction of pol II complexes is active at any time.
| THE ORGANIZATION OF TRANSCRIPTION IN MAMMALIAN CELLS |
|---|
In view of the scarcity of nascent complexes in Miller spreads, a
quantitative analysis of the distribution of active transcripts was
performed using DNA spreads from cells dissolved in 0.25% sarkosyl
(20). Efficient labeling of nascent RNAs in HeLa cells was achieved
using saponin-permeabilized cells and a reaction mix supplemented with
Br-UTP. After spreading, two distinct classes of nascent transcript
were seen (Fig. 2). Sensitivity to α-amanitin confirmed that
transcripts synthesized by pol I are restricted to intensely labeled
structures, with labeling reminiscent of that seen in nucleoli (20).
Weakly labeled foci, dispersed throughout the spread (Fig. 2),
indicated that each cell contains at least 50,000 nucleoplasmic
transcripts. The analysis of Br-RNA by electron microscopy showed that
~66% of transcription units contain a single nascent complex,
suggesting that most genes are activated less than once every ~5 min.
Clustered pol complexes on single DNA molecules were often spaced
uniformly (Fig. 2), suggesting that active protein coding genes engage
pol II complexes at a constant rate.
Sites of transcription in vivo
Various approaches have been used to visualize sites of RNA
synthesis and map the pathways taken by mature RNAs en route to sites
of protein synthesis in the cytoplasm. For many years, 3H-labeled
nucleosides provided the best means of labeling RNA inside mammalian
cells. Active sites were shown to be associated with perichromatin
fibrils at the periphery of dense chromatin (42, 43). More recently,
analogues, such as Br-UTP, that can be detected by immunolabeling have
facilitated a more detailed analysis of transcription sites (44, 45).
Transcription performed either in permeabilized cells or after
microinjection showed nascent transcripts associated with the dense
fibrillar component of nucleoli and many discrete sites, dispersed
throughout the nucleoplasm (20, 36, 44-47). Using high-resolution
immunostaining, the active transcription sites were shown to contain
transcript-rich zones measuring ~50 nm with adjacent regions rich in
chromatin, the active form of RNA polymerase II, or proteins required
for RNA processing (36, 47, 48).
At the time of synthesis, nascent RNA molecules associate with
different RNA-binding proteins to form ribonucleoprotein (RNP)
complexes (49) that serve as substrates for RNA processing (50, 51)
and transport (52). At least 20 abundant nuclear proteins contribute to
RNP structure and influence the metabolism of nRNA. Different RNPs are
known to bind preferentially to particular RNA sequences so that
individual transcripts might associate with different constellations of
hnRNPs. Furthermore, as RNP structure is established as a consequence
of both RNA synthesis and processing, it is easy to see how proteins
present at different stages of the synthetic pathway might function to
ensure that only fully processed mRNAs reach the cytoplasm (49, 51).
Transcript profiles in mammalian cells
It is now obvious how cytoplasmic mRNAs are related to primary
transcripts in the nucleus. However, important aspects of RNA
metabolism are only revealed by detailed studies of the size
distribution of nRNAs (53, 54). Using radiolabeled RNA precursor and
actinomycin D to eliminate incorporation by pol I, it is a simple
matter to generate size profiles for transcripts synthesized by pol II
(54). During short labeling periods (i.e., <1 min), most label is
incorporated into nascent hnRNA molecules that remain associated with
an active pol complex throughout. When separated on sucrose gradients,
these RNAs have sizes ranging up to ~200S (i.e., >25 kb), with maximum
labeling at 18S (~2 kb). As transcripts are labeled to similar extents,
the average size close to 28S (~5 kb) corresponds to one-half the
average length of a primary transcript (with uniformly distributed
pols, a typical transcript will be 50% complete). After labeling for
1-3 h, transcripts will be uniformly labeled and profiles reflect their
average molecular weight; long molecules are inevitably labeled more
than shorter ones. Under these conditions, transcripts cover the same
range of sizes, but now the peak of labeled material is at ~50S
(~15 kb), with an average transcript length of 9,000-10,000
nucleotides.
It is notable that cytoplasmic mRNAs, with an average size of ~2 kb
in mammalian cells, are much shorter than their presumed nuclear
precursors. Noncoding, intronic sequences found throughout most genes
account for this difference; primary transcripts are typically 2-6x
longer than the mature mRNAs they encode (55). However, the size
distribution of nRNAs labeled over 1-3 h suggests that mRNAs migrate
from the nucleus soon after maturation, while many primary transcripts
remain unprocessed hours after synthesis.
The abundance of different RNAs in mammalian cells
Details of pol activity and transcript profiles give only a
partial image of RNA synthesis in mammalian cells. For a complete
picture, it is also necessary to establish how many genes are active
and how frequently these are transcribed. Such information is obtained
by analyzing the sequence composition of different RNA populations.
Hybridization experiments show that in HeLa cells a remarkable 15% of
nonrepetitive DNA is transcribed (56). This corresponds to
~5 x 10^8 bp DNA and implies that at least 50,000 different
transcription units are expressed. The complexity of
ribosome-associated mRNAs is ~10% of that of nRNA and equivalent to
~25 x 10^3 different mRNAs (56).
The actual number of mRNA molecules can be estimated using simple
calculations. A HeLa cell is known to contain ~3.5 x 10^6 copies of
the 18 and 28S rRNAs. These are ~7 kb together and make up ~75% of
cellular RNA. Also in the cytoplasm, tRNAs and mRNAs comprise ~10 and
2.5% of cellular RNA, respectively. With average sizes of ~0.1 and
2 kb, there are ~3 x 10^7 tRNA and 0.4 x 10^6 mRNA molecules. Of the
latter, a small number of mRNAs have many thousand copies per cell,
while the vast majority have <10 (56, 57). Similar analyses show that
most tissues of mammalian origin express at least 10,000 genes. In rat
liver, for example, there are ~10 species present at ~10,000 copies
per cell, 500 at ~200 copies per cell, and 15,000 at ~10 copies per
cell (33), ~350,000 mRNAs per cell in all.
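These "simple calculations" can be written out explicitly. The sketch below uses only the figures quoted in the text; the names are illustrative, and it assumes that the mass fractions of cellular RNA can be treated as fractions of total nucleotides.

```python
# Estimate tRNA and mRNA copy numbers in a HeLa cell from the rRNA content.
RRNA_COPIES = 3.5e6     # copies of 18S + 28S rRNA per cell
RRNA_NT = 7_000         # combined length of 18S and 28S, nucleotides
RRNA_FRACTION = 0.75    # share of cellular RNA that is rRNA

# Total RNA per cell in nucleotides, inferred from the rRNA share.
total_nt = RRNA_COPIES * RRNA_NT / RRNA_FRACTION

def copies(fraction_of_rna, average_nt):
    """Molecules per cell for an RNA class of given share and mean length."""
    return total_nt * fraction_of_rna / average_nt

trna_copies = copies(0.10, 100)     # tRNA: ~10% of RNA, ~0.1 kb
mrna_copies = copies(0.025, 2_000)  # mRNA: ~2.5% of RNA, ~2 kb

# Rat liver abundance classes: (number of species, copies per cell each).
rat_liver_classes = [(10, 10_000), (500, 200), (15_000, 10)]
rat_liver_total = sum(species * n for species, n in rat_liver_classes)

print(f"tRNA: {trna_copies:.1e}, mRNA: {mrna_copies:.1e}")
print("rat liver mRNAs per cell:", rat_liver_total)
```

The three rat liver abundance classes contribute 100,000, 100,000, and 150,000 molecules, so the rare class accounts for the largest share of molecules as well as nearly all of the species.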
Recently, techniques have been developed that allow levels of
expression from specific transcription units to be established.
Comprehensive gene expression profiles can be generated using the
serial analysis of cDNA tags (SAGE; refs 58, 59) or hybridization of
labeled RNA to high-density arrays of oligonucleotides (ref 60; see
also www.wi.mit.edu/young/expression.html) or DNA fragments generated
using the PCR (61). Such techniques confirm the metabolic differences
between cells grown in culture and those with specialized roles. In
HeLa cells (62), the most active genes code for proteins involved in
translation, while in pancreas (58) >20% of poly(A)+ mRNAs are
transcribed from only four tissue-specific genes.
Analyzing the activity of specific genes
Although detailed studies of RNA populations emphasize the
complexities of RNA metabolism, understanding the dynamics of gene
expression clearly benefits from the analysis of specific products.
Biochemical fractionation of nascent, nuclear, and cytoplasmic RNAs
together with metabolic studies on rates of synthesis and decay give a
valuable impression of RNA dynamics. In human cells, the exceptionally
long dystrophin gene provides an interesting model. Using myotubes in
culture, transcription of the ~2.5 Mb gene takes ~16 h (24) and mRNA
decays with a half-life of 15.6 h (63). Quantitative PCR shows skeletal
muscle to contain ~8 mature dystrophin mRNAs per nucleus and ~15 hnRNA
molecules, determined from the relative concentrations of 3' and 5'
sequences. These studies suggest that even for the dystrophin gene (the
longest gene known), a mature mRNA is made by at least 50% of
polymerases that initiate transcription. This emphasizes how robust the
mechanisms of
RNA synthesis and processing must be. As an interesting comparison,
each nucleus in this tissue maintains a steady-state level of ~50,000
cytoplasmic myosin heavy chain mRNAs.
The nuclear distribution of specific transcripts
Fluorescence in situ hybridization (FISH) allows the
distribution of individual RNAs to be examined in fixed cells
(64, 65). Using this approach, highly active genes generally display
two RNA-rich sites that lie adjacent to the active alleles. In most
cases, specific RNAs that have moved away from these transcriptional
hot spots are not seen, though under some circumstances processed RNAs
may accumulate locally to form extended hot spots or tracks (64).
A particularly elegant demonstration of the value of this approach is
presented by Femino et al. (25), using the expression of serum
responsive β- and γ-actin genes in rat kidney cells. When these cells
were grown for 24 h in medium with 0.5% serum, cytoplasmic levels of
β-actin mRNA fell to ~500 copies per cell; ~1,500 copies per cell are
detected during exponential growth. On replacing serum, β-actin RNA was
detected within 3 min, and by 15 min ~30 RNA molecules were associated
with each gene (Fig. 3). During this period, the rate of initiation of
transcription peaked at ~4 transcripts per gene per minute. Within
120 min, steady-state levels of synthesis returned, with two pols over
the body of each gene and a further two associated with its 3' end
(Fig. 3). Reassuringly, the decay of β-actin mRNA accompanying serum
deprivation shows that proliferating cells produce ~2,500 mature
transcripts per cell generation to maintain steady-state levels. This
requires a synthetic rate of one initiation per gene per 1.5 min,
assuming a 20 h cell cycle, that the gene replicates within the first
one-third of S-phase, and that all four alleles are subsequently
active. As this closely matches the observed rate of synthesis, almost
all transcripts must be processed successfully to give mRNAs that pass
to the cytoplasm.
Clearly, high levels of activity seen after serum induction are not
required under the steady-state conditions established during
proliferative growth. Although the number of transcripts close to each
β-actin allele might vary by almost 10-fold between peak levels after
induction and steady-state levels, it should be noted that the maximum
rate leads to an accumulation of hnRNAs that have completed
transcription but require additional processing before mature mRNAs can
pass to the cytoplasm. Transcription requires ~4 min to complete, and
10-20 min after induction mature mRNAs begin to move away from the
active genes. At this time, the separation of individual molecules
suggests a constant rate of movement from the site of synthesis to the
cytoplasm, where a corresponding increase in the number of mRNAs is
seen (25).
One downside of this technique is the possibility that cell fixation
and processing might distort the organization of sensitive structures
existing in vivo. A method is available that allows an RNA
to be identified in vivo when specific binding sites are
occupied by a bacteriophage protein coupled to green fluorescent
protein (66). Though the use of this approach is limited
by sensitivity, this strategy might ultimately allow individual RNA
molecules to be tracked inside living cells.
Mapping RNAs in transit to the cytoplasm
RNA splicing is known to begin during transcription. However, the
existence of intron-containing poly(A)+ pre-mRNA shows that
polyadenylation can begin before splicing is complete and raises the
possibility that some processing might occur on RNAs in transit. This
prospect appears to be supported by the high concentration of poly(A)+
RNA (67, 68) and the accumulation of intron-containing pre-mRNA (69),
introduced into cells by microinjection, within splicing protein-rich
nuclear speckles (3). However, while the partial purification of these
structures has been reported (70; interchromatin granule clusters seen
by EM correspond to speckles), it remains to be confirmed whether they
contain natural mRNAs en route to the cytoplasm. Indeed, in the case of
β-actin transcripts, pre-mRNA processing appears to be completed by
the time individual mRNAs leave their site of synthesis (25), so that
export via speckles would be unnecessary. The generality of this view
is supported by the rate of poly(A)+ mRNA export (71) and the fact that
poly(A)+ RNAs in speckles have a protein composition unlike that of
typical hnRNPs (72).
Detailed studies have confirmed that many active genes are associated
with nuclear speckles (65). This clearly correlates with the local
demand for RNA processing and suggests that nuclei are organized so
that the splicing components can accumulate in close proximity to sites
of greatest demand (3). This is not surprising, as the CTD of pol II is
known to couple RNA synthesis and processing (14), while pol II with a
truncated CTD (with 47 of the 52 repeats deleted) supports much reduced
levels of RNA splicing (73). The loss of association between a chosen
active gene and nuclear speckles in cells with this mutated pol
confirms that speckles accumulate close to active genes as a result of
ongoing splicing (73). It should be noted, however, that highly active
genes that are not spliced also associate with speckles, suggesting
that these structures are involved in other aspects of RNA metabolism
(74).
Pulse-labeling with either [3H]- (31, 41) or Br-uridine (75) also
shows that RNA leaves transcription sites after 10-20 min and moves
into the interchromatin channels from where mature mRNA passes to the
cytoplasm. In HeLa cells, ~1,000 RNPs/min engage the transport pathway
(75). Within minutes these are able to move to the cytoplasm, a process
that appears to rely on passive diffusion (76).
| THE TURNOVER OF NUCLEAR RNA: A PARADOX |
|---|
We know that approximately one-third of this profligacy can be
explained by RNA processing events that generate mRNAs from much longer
primary transcripts, but other clues are needed to understand the
behavior of the remaining RNAs (discussed in ref 55). For
example, an analysis of RNA synthesis in quiescent and proliferating
fibroblasts shows that the ~5-fold increase in cytoplasmic mRNA seen
during proliferation is accounted for by an ~2- to 3-fold increase in
levels of RNA synthesis and <2-fold increase in the efficiency of RNA
processing and export. In such shift-up experiments, protein
synthesis increases in parallel with the cytoplasmic concentration of
specific mRNAs, ~75% of which are polysome-associated. This implies
that the concentration of cytoplasmic mRNAs is rate-limiting for
protein synthesis and is controlled predominantly at the level of
transcription.
Most mature mRNAs contain 5' and 3' modifications that protect them
from exonuclease activities; these 5' caps and 3'
poly(A)+ tails are known to be added at the time
of RNA synthesis. Caps are found in hnRNA and mRNA at frequencies of
0.1 and 0.5 per 1,000 nucleotides, respectively, consistent with one
per RNA chain in both populations. In contrast, poly(A)+ tails, of
200-250 nucleotides, represent ~10% of cytoplasmic mRNA mass but <1%
of the mass of hnRNA. In fact, only ~30% of the primary transcripts
appear to be polyadenylated and pulse-chase experiments demonstrate
that most of these enter the cytoplasm (71). This suggests that
productive and nonproductive hnRNA molecules are specified at the time
of synthesis.
An explanation for the extent of hnRNA turnover is that many RNAs
transcribed from authentic protein coding genes turn over in nuclei
because of faulty transcription or processing. If such defects were
common, most transcripts initiated could be potential precursors of
mature mRNA. However, it is notable that HeLa cells and erythroid cells
have similar proportions of nonproductive hnRNAs even though the latter
are specialized for the synthesis of short globin transcripts that
represent ~85% of cytoplasmic mRNA (29). In erythroid cells at least,
it seems improbable that the bulk of nRNA might arise through
defects in the primary synthetic pathway.
Though the global efficiency of gene expression is difficult to assess,
specific examples show how successful this process can be. The dynamics
of actin gene expression (discussed at length earlier) show that
synthesis of this shorter than average transcript occurs efficiently
(25). Even the ~2.5 Mb dystrophin gene produces one mature mRNA for
every two hnRNA molecules initiated (24),
suggesting that failures of RNA synthesis and processing might occur
less than once per Mb hnRNA. It is also worth remembering that in most
cases, different specialized cells have >90% hnRNA sequences in
common, while genes such as globin and ovalbumin are only transcribed
in tissues where mature mRNA is found.
Available facts lead us to conclude that mRNA synthesis is efficient,
while an unrelated population of primary transcripts remains nuclear
and is never destined to produce mRNA. If this is so, the question
arises: what role do these RNAs perform? It is possible that
nonproductive RNA synthesis could act to keep chromatin domains
open, so preventing the formation of inactive chromatin states.
This would explain the unexpectedly high complexity of nRNA and
intergenic transcripts that can be identified using a nuclear run-on
analysis (77). In addition, there are situations where
transcripts have roles other than the production of mRNA. The best
example of this is the structural role played by XIST
transcripts during X chromosome inactivation (78). Whether
other classes of pol II transcripts play analogous roles to locally
activate or suppress gene expression remains to be elucidated. Finally,
although chromatin must be structured to optimize expression from
promoters, circumstances may arise where transcription initiates from
nonpromoter sequences to generate noncoding, junk RNA.
Steady-state levels of nuclear RNAs
Evidence discussed above leads us to conclude that pol II
transcripts have at least three possible fates. A minority (~1 in 3)
produce mRNAs that move quickly to the cytoplasm. Some of the
transcripts on this pathway fail to mature and are recycled in nuclei;
available evidence suggests that this might happen with only 1% of
pre-mRNAs. However, if processing intermediates that fail to mature
account for the accumulation of poly(A)+ RNAs in nuclear speckles,
efficient mRNA synthesis must reflect the longevity of RNA in these
sites. The third, and predominant, population appears not to be
polyadenylated and seems to be a poor substrate for RNA splicing
(54, 71). 3H- and Br-labeled RNAs in this population move away from
transcription sites with kinetics similar to mRNAs on the productive
pathway (32, 42, 75). However, they are not exported from the nucleus
and remain in the hnRNP-rich interchromatin space where they must
eventually be recycled. The purpose of this population and the nature
of its transcripts remain obscure.
To develop a better understanding of the metabolism of nRNA, we have
modeled the behavior of different RNA populations using the flow
diagram shown in Fig. 4. HnRNA is assumed to be the nascent transcripts
arising from synthesis by pol II, as originally defined from its
heterodispersed size distribution on pulse-labeling with radioactive
precursors. The turnover of hnRNA is both rapid and complex (31). Some
hnRNAs generate mRNA and nucleotides from degradation of intron
sequences, while others disperse throughout the nucleus. The behavior
of this class, here called nRNA, is ill-characterized. Whether some
molecules are partially processed and how quickly they are destroyed
are complex questions (31, 75).
The model for RNA synthesis and decay shown in Fig. 4 is tested in
Table 1. The analysis requires that certain parameters are fixed. HeLa
cell values were used for steady-state amounts of pol II-derived RNAs:
~1 x 10^5 hnRNAs (each cell has ~6 x 10^4 engaged pol II transcripts
and a minority of genes have many associated hnRNAs, some having
completed transcription); 4 x 10^5 mRNAs (see above); and 3 x 10^5
nRNAs (estimated from population size at steady state and ~7.5% of
cellular RNA). The rates of synthesis (2 kb/min) and t1/2 of decay of
processed primary transcripts (20 min) were also fixed in the
calculations. Different parameters were then varied in turn, and
unknown rates calculated (Table 1). In this way, we established a best
fit to the observed ratio of productive to nonproductive hnRNAs
(estimated to be ~1 out of 3) when t1/2 for mRNA decay was set at
300 min and the corresponding t1/2 for decay of nRNA was 120 min. Such
estimates lie within the accepted range for the decay of these RNA
populations (31, 55, 79).
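The consistency of these fitted half-lives with the ~1 in 3 ratio can be illustrated with a minimal steady-state sketch. This is not the calculation behind Table 1; it assumes simple first-order decay for each pool and ignores dilution by cell growth, and the function name is hypothetical. At steady state, the flux into a pool equals its decay rate, N x ln(2) / t1/2.

```python
import math

def steady_state_flux(pool_size, half_life_min):
    """Molecules per minute that must enter a pool to hold it at steady
    state, assuming first-order decay with the given half-life."""
    return pool_size * math.log(2) / half_life_min

# Pool sizes and half-lives quoted in the text for a HeLa cell.
mrna_flux = steady_state_flux(4e5, 300)  # cytoplasmic mRNA
nrna_flux = steady_state_flux(3e5, 120)  # nonproductive nuclear RNA

# Share of completed pol II transcripts that enter the productive pathway.
productive_fraction = mrna_flux / (mrna_flux + nrna_flux)

print(f"mRNA flux: {mrna_flux:.0f}/min, nRNA flux: {nrna_flux:.0f}/min")
print(f"productive fraction: {productive_fraction:.2f}")
```

Under these assumptions the mRNA flux is a little under 1,000 molecules/min, in line with the ~1,000 RNPs/min reported to engage the transport pathway, and roughly one transcript in three is productive.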
The degradation of nuclear RNA
In human cells, ~50% of the RNA synthesized by pol I and >95% of that
synthesized by pol II never reaches the cytoplasm and is recycled
within nuclei. In nucleoli, numerous small-nucleolar RNA-protein
complexes (80) perform reactions needed to generate mature 28, 18, and
5.8S rRNAs from a 45S pre-rRNA, whereas complexes such as RNase MRP
are involved in processing events that take place within discrete
nucleolar sites (81). A related endonuclease, RNase P, performs tRNA
maturation at sites scattered throughout the nucleoplasm (81).
In the nucleoplasm of mammalian cells, pre-mRNA splicing is an obvious
source of RNA fragments destined for destruction. Indeed, ~80% of
the nucleotides in pre-mRNAs are removed during their maturation by a
series of RNA splicing events catalyzed by small nRNA-protein complexes
within spliceosomes (50). The noncoding regions of the primary
transcripts (introns) are rapidly degraded at the time of splicing by
exonucleases that have not yet been characterized in detail. However,
very recent experiments have described a complex of 3'-to-5'
exonucleases (the exosome) that contains enzymes involved in the
maturation of stable RNAs and processing/degradation of nuclear
pre-mRNAs and poly(A) nuclear RNAs, as well as mRNAs in the cytoplasm
(82).
Under some circumstances, RNA-RNA duplexes can activate nuclear RNA
turnover. For example, it is well known that short (~50 nucleotides)
transcripts dramatically reduce the expression of target genes that
encode a complementary sequence (83). RNA-RNA duplexes are the likely
cause of this reduced expression, though the rarity of hybrid
structures implies that they are degraded very rapidly once formed
(84).
Nonsense-mediated decay of nuclear mRNA
The behavior of nRNAs with premature translation stop codons
provides a rare opportunity to compare the nuclear turnover of specific
pol II transcripts with precise mutations (85, 86). Both mutations in
DNA and defects in nRNA metabolism can potentially give rise to
nonsense codons in mRNA that would yield truncated proteins. However,
mRNA molecules containing nonsense codons as a result of nonsense or
frameshift mutation are generally of low abundance as a result of the
activation of nonsense-mediated mRNA decay (NMD). Examples are seen in
many diseases (85) and naturally, during lymphocyte development (86),
where immunoglobulin (Ig) and T cell receptor (TCR) gene rearrangements
generate premature stop codons in two out of three potential mRNAs.
Though the effects of protein synthesis inhibitors and suppressor tRNAs
imply that the mechanism is ribosome-dependent (and cytoplasmic), a
number of experiments support nuclear mechanisms for NMD. For example,
Ig and TCR transcripts with premature stop codons exhibit 10- to
100-fold reductions in stable mRNA and dramatically reduced amounts of
nuclear mRNA even though levels of RNA synthesis are largely
unaffected. In addition, various studies have shown that decay is most
efficient for stop codons at the beginning of transcripts, while the
preferred substrates retain at least one intron downstream of the
premature stop (85, 86).
These observations suggest that eukaryotic nuclei possess a specific mechanism to identify and destroy partially processed nRNAs with in-frame premature stop codons. However, the extent of this destruction, the mechanism by which it is activated, the ribonucleases involved, and their site of action in mammalian nuclei remain to be discovered. Whether this process is related to other modes of RNA turnover is also unclear.
| CONTROLLING RATES OF GENE EXPRESSION IN MAMMALIAN CELLS |
|---|
|
|
|---|
1 kb apart.
Few genes support higher polymerase densities. In proliferative
mammalian cells, pol I complexes on active rRNA genes are 100-120 bp
apart (see above). Genes for other structural RNAs, though much
smaller, are transcribed at similar rates. For example, during
proliferation, cells produce ~10^6 copies of the
snRNAs needed for RNA splicing. To do this, each U2 snRNA gene (with
~20 copies per haploid genome) engages a pol II complex every 5-10
s. As snRNAs are only ~200 nucleotides long, active genes will have
only one or two engaged pols. In contrast, most RNAs synthesized by pol
II act as templates for protein synthesis, with each transcript
producing ~150 proteins per hour. For mRNAs, the maximum rates of
synthesis by pol II (with pols separated by ~100 bp) are clearly
restricted to specialized cells, to certain stages of development and,
transiently, to stress response genes (87) and genes that
respond to fluctuations in growth factor concentration (25, 61).
It is clear from the profile of RNAs in proliferative cells
that most mRNAs are present at <10 copies per cell and, therefore,
need only be transcribed about once every hour to maintain their
steady-state levels of cytoplasmic mRNA.
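The densities quoted above follow from simple arithmetic, which the minimal sketch below makes explicit. The 13 kbp rRNA transcription unit, the 100-120 bp pol I spacing, the ~200 nucleotide snRNA length, and the 5-10 s initiation interval are taken from the text. The elongation rate of 20 nucleotides per second and the 10 h mean mRNA lifetime are illustrative assumptions, not values reported here, and the function names are hypothetical.

```python
# Back-of-envelope polymerase densities; a sketch under stated assumptions.

ELONGATION_NT_PER_S = 20.0   # ASSUMPTION: illustrative pol II elongation rate
MEAN_MRNA_LIFETIME_H = 10.0  # ASSUMPTION: illustrative mean mRNA lifetime


def engaged_polymerases(unit_length_bp, spacing_bp):
    """Polymerases engaged on one transcription unit at a given spacing."""
    return unit_length_bp / spacing_bp


def spacing_from_initiation(elongation_nt_per_s, interval_s):
    """Distance (bp) between successive polymerases when one initiates
    every interval_s seconds and all elongate at the same rate."""
    return elongation_nt_per_s * interval_s


def steady_state_copies(transcripts_per_hour, mean_lifetime_h):
    """Steady-state copy number = synthesis rate x mean lifetime."""
    return transcripts_per_hour * mean_lifetime_h


# pol I on a 13 kbp rRNA transcription unit, complexes 100-120 bp apart
rrna_pols = [engaged_polymerases(13_000, s) for s in (100, 120)]

# pol II on a ~200 nt snRNA gene, one initiation every 5-10 s
snrna_pols = [
    engaged_polymerases(200, spacing_from_initiation(ELONGATION_NT_PER_S, t))
    for t in (5, 10)
]

# one productive transcript per hour, assumed 10 h mean lifetime
mrna_copies = steady_state_copies(1.0, MEAN_MRNA_LIFETIME_H)

print(rrna_pols, snrna_pols, mrna_copies)
```

Under these assumptions, an rRNA gene carries roughly 108-130 engaged pol I complexes, an snRNA gene carries one or two pol II complexes, and one productive transcript per hour sustains about 10 mRNA copies. A shorter assumed lifetime would lower the steady-state copy number proportionally.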
Variations in the steady-state concentrations of different mRNAs raise
the question: how are different levels of gene expression maintained?
Presumably, genes expressed from their natural chromosomal site inhabit
chromatin that has evolved to allow the desired level of transcription.
Under the appropriate conditions, chromatin structure (4, 5)
allows transcription factors to access promoter sequences
(6-9) and activate RNA synthesis (11-17).
Stable transcription factor complexes will drive many cycles of
synthesis while unstable ones might activate transcription
infrequently. Outside promoters, DNA motifs within enhancers
(88), locus control regions (89), and nuclear
matrix- or scaffold-attached regions (90) might also
influence levels of gene expression.
| NUCLEAR STRUCTURE AND GENE EXPRESSION |
|---|
|
|
|---|
These and other observations emphasize how nuclear organization can
influence the metabolism of RNA (1-3, 93, 94). It is
notable, for example, that the metabolism of specific RNAs can be
influenced by transcription from different promoters (95)
or the structure of the primary transcript (96), while
certain pre-mRNAs only mature successfully when transcribed from the
appropriate class of promoter (97, 98). This latter
observation implies that the different steps needed for mRNA synthesis
are coupled and suggests that efficient gene expression involves a
complex pathway which can only succeed if transcription is performed by
the appropriate RNA polymerase. This is reinforced by the fact that
transcripts are synthesized and processed in dedicated nuclear sites
that may coordinate the different events required for mRNA synthesis
(99).
| CONCLUSIONS |
|---|
|
|
|---|
Once transcription is activated, RNAs must be processed and transferred
to cytoplasmic sites where translation occurs. Pathways of RNA
synthesis and processing that generate mRNA appear to deliver these to
the cytoplasm with reasonable speed and efficiency (24, 25).
An important aspect of this efficiency is the apparent need
to coordinate critical processes required to generate nascent RNA. It
has been known for many years that intron-containing genes are often
expressed poorly from cDNAs, implying that splicing might influence the
stability of nRNA or the export of mature transcripts. However, it is
now clear that the CTD of RNA pol II coordinates RNA processing
(14-16) and, therefore, ensures that transcripts
generated from pol II promoters are processed into mature messages with
optimal efficiency. In this way, efficient processing (cap addition,
splicing, and polyadenylation) must represent a cascade of events that
generates mature mRNAs with a constellation of bound RNPs that
reflect the synthetic history of the transcript and ensure efficient
export of the final product (49-52).
Given the sophistication of this process, it is perhaps surprising that
in most cells studied only approximately one-third of primary
transcripts initiated yield mRNA. Apparently, nonproductive nRNAs
together with introns removed from mRNAs during maturation account for
the surprising observation that ~95% of the hnRNA synthesized in mammalian
cells turns over inside nuclei (31). Though the purpose of
this excessive synthesis remains unclear, it is hard to imagine that
this level of activity could survive while serving no function at all.
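As a rough consistency check on these two figures, the sketch below asks what share of an average primary transcript would have to survive as exon sequence if one-third of transcripts are productive and ~95% of hnRNA turns over in nuclei. It assumes, purely for illustration, that productive and nonproductive transcripts have the same average length and that all exon sequence from productive transcripts is exported. The function name is hypothetical.

```python
# Consistency check on the one-third and ~95% figures; a sketch only.

PRODUCTIVE_FRACTION = 1.0 / 3.0  # from the text: ~1/3 of transcripts yield mRNA
NUCLEAR_TURNOVER = 0.95          # from the text: ~95% of hnRNA turns over in nuclei


def exon_fraction_needed(productive_fraction, nuclear_turnover):
    """Exon share of an average primary transcript implied by the two figures.

    ASSUMPTIONS: productive and nonproductive transcripts have equal average
    length, and every exon nucleotide of a productive transcript is exported.
    Then: exported mass = productive_fraction * exon_fraction
                        = 1 - nuclear_turnover.
    """
    return (1.0 - nuclear_turnover) / productive_fraction


exon_fraction = exon_fraction_needed(PRODUCTIVE_FRACTION, NUCLEAR_TURNOVER)
intron_fraction = 1.0 - exon_fraction

print(f"exon fraction ~{exon_fraction:.2f}, intron fraction ~{intron_fraction:.2f}")
```

Under these assumptions, exons would make up about 15% of an average primary transcript, with introns accounting for the remaining 85% or so.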
The complexity of mammalian nuclei undoubtedly contributes to our
conceptual inadequacies. Indeed, the process of gene expression is
grossly understated by a simple flow of information from the gene to
mRNA and corresponding polypeptide. For example, experiments on
transgenic animals show that introducing genes into nuclei with the
appropriate transcription factors is not always sufficient to guarantee
natural levels of mRNA synthesis (91, 92). This suggests
that different nuclear sites have different synthetic capabilities and
is supported by the fact that certain genetic alterations generate bona
fide primary transcripts that fail to mature successfully
(96-98). Different pathways of RNA metabolism can even be
used by identical mRNAs driven from particular pol II-dependent
promoters (95). Such observations imply that features
present at the time of transcription have a significant impact on
events that follow.
With these facts in mind, it might be argued that mammalian nuclei are not a collection of genetic units that operate independently, but instead a community of units that function cooperatively to achieve highly sophisticated patterns of gene expression. We anticipate that the way various units are networked within nuclei will be a crucial factor in determining the metabolism of their transcripts in higher eukaryotic cells.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
| REFERENCES |
|---|
|
|
|---|