Department of Microbiology & Molecular Genetics, New Jersey Medical School-UMDNJ, Newark, New Jersey, USA; and
* PharmaSeq, Inc., Monmouth Junction, New Jersey, USA
2Correspondence: Department of Microbiology & Molecular Genetics, New Jersey Medical School-UMDNJ, 225 Warren St., P.O. Box 1709, Newark, NJ 07101-1709, USA. E-mail: egoldman{at}umdnj.edu
| ABSTRACT |
|---|
Key Words: E. coli; protein synthesis; internal initiation of translation; non-ORF expression; UGA stop codons
| INTRODUCTION |
|---|
The random peptides in the library were encoded by 40 NNK codons (N=A, C, G, or T; K=G or T) flanked by FLAG (amino-terminal) and E-tag (carboxyl-terminal) epitopes, fused to the gene III coat protein of phage M13. By use of an amber suppressor host, UAG stop codons were translated as glutamine. Despite the design of the library, deletions, insertions, or point mutations were occasionally observed in the genes isolated from phage display (1, 2).
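To make the library design concrete, the following is a minimal sketch (not part of the original study) that enumerates the NNK codons under the standard genetic code. It illustrates that TAG (amber) is the only stop codon an NNK codon can encode, so in-frame TAA or TGA codons in isolated clones cannot come from the design itself. The names `CODON_TABLE` and `nnk_codons` are illustrative choices.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}


def nnk_codons():
    """All codons of the form NNK (N = A, C, G, or T; K = G or T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})

print(len(codons), "NNK codons")
print("stop codons reachable by design:", stops)
print(len(amino_acids), "amino acids covered")
```

Because the third position is restricted to G or T, codons ending in A (including TAA and TGA) are excluded, which is consistent with the use of an amber suppressor host for the library.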
Only about half of the 150 peptide-encoding sequences isolated from these experiments contained a canonical open reading frame (1). The remainder contained sequences with TAA or TGA stop codons in frame and, in many instances, the E-tag sequence in a different reading frame (1).
Clone "H10" was selected as producing a peptide that bound to GHBP (growth hormone binding protein) (2). The sequence contained two zero frame TGA stop codons upstream of the sequence encoding the E-tag epitope (and the fusion to M13 gene III protein). Using doped mutagenesis, a secondary peptide library for phage display was made and again used in biopanning against GHBP. Most of the additional clones obtained encoded frameshifts, both +1 and −1, in the placement of the E-tag sequence relative to the translation start, as well as retention of the 5' proximal TGA stop codon of parental H10. Because the 3' region of the sequence showed almost no mutations, it was possible to deduce the amino acid sequence that bound to GHBP; this was confirmed by synthesizing the putative peptide fragment and demonstrating the appropriate binding properties (2).
The H10 sequence and several secondary derivatives were subcloned into a reporter system and tested for expression in Escherichia coli (3). This sequence turned out to facilitate high-frequency expression (10–40% of control) despite two TGA stop codons in the zero frame for H10, and +1 or −1 frameshifted E-tag and fused reporter in some of the secondary derivatives. Expression was obtained in two of the three reading frames: the original frame from phage display, and the frame −1 to that. We initially thought that expression from this sequence must result from a recoding event (4), since a site-directed mutant (to sense) of the TGA at codon 15 (in the reporter fusion) abolished expression in an isolate requiring a +1 frameshift to express reporter (3).
Inspection of the sequence in the −1 frame of H10 suggested the possibility of a new translation initiation event to account for observed expression in the −1 frame. This frame contained an ATG codon without any stop codons before the reporter junction and a consensus Shine-Dalgarno sequence appropriately distanced upstream of this ATG. Mutating the putative start codon in the frame −1 to the original H10 frame did in fact eliminate translation from that frame, strongly supporting new initiation as the explanation for expression in the −1 frame in these sequences (5).
Unexpectedly, mutating that out-of-frame downstream ATG also increased expression in the original non-open reading frame by two- to threefold. The mutation created a TTG codon adjacent to an existing in-frame TTG codon. Thus, despite the earlier result with the mutation of the TGA at codon 15, we began to suspect that a downstream translational reinitiation at a putative TTG start at codon 38 in the H10-reporter fusion sequence could explain most if not all of our results.
We undertook an extensive site-directed mutagenesis approach and report here the evidence that this hypothesis is almost certainly correct. We also describe some of the features required for this apparent translational reinitiation.
| MATERIALS AND METHODS |
|---|
Δ[lac-pro], supE, thi/F' lacIQZΔM15, proA+B+), generously provided by Jim Curran. This strain (6)
The vector into which the H10 derivatives were cloned was the 7049-bp plasmid pJC27 (7), also supplied by Jim Curran and used previously in this laboratory (3, 5, 8, 9). This vector contains a chloramphenicol resistance gene, a p15A origin of replication, and a pseudo-wild type lacZ gene under control of the lacUV5 promoter, with a HindIII site at nucleotides 18–23 and a BamHI site at nucleotides 38–43 (where nucleotide 1 is the "A" of the first initiating ATG codon of lacZ). Plasmids carrying the H10 sequence or its derivatives (2) were obtained from DGI Biotechnologies (Edison, NJ, USA). Plasmids in which H10 or its derivatives were recloned in our reporter vector to test expression, as well as the 8T mutants of these plasmids (changing the first TGA to sense), were described by Goldman et al. (3). Because our earlier studies compared the reporter in three reading frames, we adopted a convention of using a suffix to designate in which frame the reporter was attached; when reporter was attached in the same frame as E-tag at the carboxyl terminus of the peptide, the suffix was ".1." Since the only constructs tested in the present report are of the ".1" type, the suffix will be omitted for general discussion of results in the text. The plasmid carrying H10.1.74T was constructed by Jennifer Zemsky in our laboratory (5).
Recombinant PCR amplification
Mutants were constructed by a two-step PCR (polymerase chain reaction) procedure based on a protocol described by Higuchi (10)
In the first step, two overlapping PCR fragments containing the designed mutation were generated in separate reactions, each using one outer primer, one inner primer, and the appropriate template DNA; 10 µL of these PCR products were loaded onto a 1% agarose gel. The DNA bands were cut out of the gel and purified using the QIAquick Gel Extraction Kit (Qiagen, Chatsworth, CA, USA). The products were combined, heated at 95°C for 10 min, and allowed to slowly cool to room temperature. The annealed products were then used as templates for generating a full-length copy of the mutated sequences using the outer PCR primers H3F and BHIR. The PCR Master Mix Kit was purchased from Qiagen.
All PCR oligonucleotide primers were generated at the New Jersey Medical School Molecular Resources Facility and are listed below (nucleotides in parentheses denote introduced mutations):
Outer Primers:
H3F: GAAACAGCTATGACCATGATTACGC
BHIR: CCCAGTCACGACGTTGTAAAA
Inner primers:
For mutant H10.1.74T.76C:
SMFI: GCAGGAGGAGGTTGATTG(C)TG
SMRI: CA(G)CAATCAACCTCCTCCTGC
For mutants H10.1.76C, 210.1.76C, H10.1.8T.76C, 210.1.8T.76C:
SMFII: GCAGGAGGAGGTTGATGG(C)TG
SMRII: CA(G)CCATCAACCTCCTCCTGC
For mutant 221.1.76C:
SMFIII: GCAGCGAGGAGGTTGATGG(C)TG
SMRIII: CA(G)CCATCAACCTCCTCGCTGC
For mutant H10.1.76A:
SMFIV: GCAGGAGGAGGTTGATGG(A)TG
SMRIV: CA(T)CCATCAACCTCCTCCTGC
For mutant 210.1.8T.101C:
SMFV: GCTATTTTGTTGCTGCTGGGG(C)AG
SMRV: CT(G)CCCCAGCAACAAAATAGC
For mutants 210.1.8T.70G, H10.1.70G:
SMFVI: GTTTGCAGGAGGAGGT(G)GATG
SMRVI: CATC(C)ACCTCCTCCTGCAAAC
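The inner primers of each pair anneal to opposite strands over the same region, so the reverse primer should be the reverse complement of the forward primer once the parentheses are removed. A minimal sketch of that check follows, using three of the pairs listed above; the helper names `strip_marks`, `reverse_complement`, and `is_complementary_pair` are illustrative and not from the original protocol.

```python
# Map each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strip_marks(primer):
    """Remove the parentheses that flag introduced mutations."""
    return primer.replace("(", "").replace(")", "")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(strip_marks(forward)) == strip_marks(reverse)


# Primer pairs copied from the list above (a subset, for illustration).
PAIRS = {
    "SMFI/SMRI": ("GCAGGAGGAGGTTGATTG(C)TG", "CA(G)CAATCAACCTCCTCCTGC"),
    "SMFII/SMRII": ("GCAGGAGGAGGTTGATGG(C)TG", "CA(G)CCATCAACCTCCTCCTGC"),
    "SMFVI/SMRVI": ("GTTTGCAGGAGGAGGT(G)GATG", "CATC(C)ACCTCCTCCTGCAAAC"),
}

results = {name: is_complementary_pair(f, r) for name, (f, r) in PAIRS.items()}
for name, ok in results.items():
    print(name, "complementary" if ok else "NOT complementary")
```

The same function can be applied to any other pair in the list to confirm that a transcription of the primer sequences is internally consistent.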
Cloning of mutated PCR products to generate plasmids
The PCR products included the BamHI and HindIII restriction endonuclease sites present in the parental plasmids. PCR products were purified by QIAquick PCR Purification kit (Qiagen), then purified PCR products and plasmid vector pJC27 were digested with BamHI and HindIII (Fisher Scientific, Pittsburgh, PA, USA) at 37°C for 4 h. The products were phenol/chloroform extracted three times and ethanol precipitated, then ligated using T4 DNA ligation Kit (Fisher Scientific) for 3 h at room temperature or at 4°C overnight. Ligation products were transformed into E. coli strain MY411. Plasmid DNA was purified from resultant colonies using the Qiagen Spin Miniprep Kit. The H10.1 family inserts all contain a Not I enzyme site; the pJC27 parental vector does not contain this enzyme site, so we used Not I (Life Technologies, Grand Island, NY, USA) to verify whether a new clone contained the H10.1 family insert. Plasmid DNAs of putative positive clones were submitted to New Jersey Medical School Molecular Resource Facility for DNA sequencing.
Site-directed mutagenesis by whole plasmid synthesis
To mutate two in-frame upstream ATGs (for initiation of reporter translation) in vector pJC27, a pair of completely complementary primers was synthesized by New Jersey Medical School Molecular Resource Facility:
H3FI: GAAACAGCTA(C)GACCA(C)GATTACGC
Anti-H3FI: GCGTAATC(G)TGGTC(G)TAGCTGTTTC
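As an illustration of the parenthesis notation, the sketch below parses the marked forward primer, compares it with the unmutated outer primer H3F listed earlier, and reports the codon context of each change. It is only a consistency check under the assumption that the mutagenic primer covers the same 25 nucleotides as H3F; the function name `parse_marked_primer` is hypothetical.

```python
def parse_marked_primer(marked):
    """Return (plain sequence, 0-based indices of the parenthesized bases)."""
    seq, marked_idx, inside = [], [], False
    for ch in marked:
        if ch == "(":
            inside = True
        elif ch == ")":
            inside = False
        else:
            if inside:
                marked_idx.append(len(seq))
            seq.append(ch)
    return "".join(seq), marked_idx


COMPLEMENT = str.maketrans("ACGT", "TGCA")

H3F = "GAAACAGCTATGACCATGATTACGC"            # unmutated outer primer
H3FI_MARKED = "GAAACAGCTA(C)GACCA(C)GATTACGC"  # mutagenic forward primer
ANTI_MARKED = "GCGTAATC(G)TGGTC(G)TAGCTGTTTC"  # mutagenic reverse primer

h3fi, positions = parse_marked_primer(H3FI_MARKED)
anti, _ = parse_marked_primer(ANTI_MARKED)

# Triplet centered on each marked base, before and after mutation.
contexts = [(H3F[i - 1:i + 2], h3fi[i - 1:i + 2]) for i in positions]
# Every position at which the two forward primers differ.
differences = [i for i in range(len(H3F)) if H3F[i] != h3fi[i]]
# The two mutagenic primers should be exact reverse complements.
fully_complementary = h3fi.translate(COMPLEMENT)[::-1] == anti

for (before, after), i in zip(contexts, positions):
    print(f"index {i}: {before} -> {after}")
print("fully complementary:", fully_complementary)
```

Both marked bases fall in the middle of an ATG, six nucleotides apart, which matches the description of two in-frame start codons being changed.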
Expand Long Template PCR System (http://biochem.roche.com/pack-insert/1681834a.pdf) was purchased from Roche (Nutley, NJ, USA).
PCR procedure (http://biochem.roche.com/pack-insert/1681834a.pdf):
Master mix 1: dATP, dCTP, dGTP, dTTP 500 µM each, H3FI 400 nM, anti-H3FI 400 nM, template DNA 10 ng, and dd (deionized and distilled) water to 25 µL.
Master mix 2: 5 µL 10xPCR buffer-3, 19.25 µL dd water, and 0.75 µL enzyme mix.
Master mix 1 and 2 were combined, mixed, and placed in a thermal cycler 1x 94°C for 2 min, 20 x (94°C 30 s, 60°C 30 s, 68°C 14 min), and finally 1x 68°C for 7 min.
DpnI digestion:
DpnI endonuclease is specific for methylated and hemi-methylated DNA and is used to digest parental DNA template and select for mutation-containing synthesized DNA. DNA isolated from almost all E. coli strains is dam-methylated and therefore susceptible to DpnI digestion (11); 1 µL (10 units) of DpnI (Roche) was added directly to the PCR reaction and digested for 2 h at 37°C.
Transformation and verification of the mutation:
The DpnI digested PCR products were purified by QIAquick PCR Purification kit; competent E. coli MY411 were transformed with 2 µL of the purified DNA. Plasmids were sent to New Jersey Medical School Molecular Resource Facility for DNA sequencing.
β-Galactosidase assays
Assay procedures were based on standard protocols (12, 13). Five milliliters of 1x medium "A" supplemented with 0.001 M MgSO4, 0.2% dextrose, 0.0005% vitamin B1, and 20 µg/mL chloramphenicol were inoculated 1:50 from overnight LB cultures containing 20 µg/mL chloramphenicol. The culture was grown to an optical density of 0.4 at 600 nm (2 × 10^8 cells/mL). Fusion protein expression was induced by addition of 1 mM IPTG for 1 h. A600 measurements were taken and 0.1 mL of the culture was added to 0.9 mL of Z buffer (0.06 M Na2HPO4, 0.04 M NaH2PO4, 0.01 M KCl, 0.001 M MgSO4, 0.05 M β-mercaptoethanol). Two drops of chloroform and one drop of 0.1% SDS were added to the reaction mixture and vortexed for 10 s. Reaction mixtures were allowed to equilibrate to 28°C; 0.2 mL of o-nitrophenyl-β-D-galactopyranoside in water (4 mg/mL) was added to reaction mixtures at 1 min intervals. When a yellow color developed, reactions were stopped by addition of 0.5 mL of 1 M Na2CO3. A550 and A420 measurements were taken. β-Galactosidase activity was calculated based on the following formula: β-galactosidase activity units = 1000 × (A420 − 0.75 × A550)/(t × 0.1 × A600), where t is equal to the time in minutes and 0.1 is the 0.1 mL of culture volume used in the assay.
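The activity formula can be written as a small function. This is a sketch only: the absorbance readings in the example are invented for illustration, and the light-scattering coefficient is exposed as a parameter because the text prints 0.75 whereas Miller's standard protocol uses 1.75. The function name `beta_gal_units` is hypothetical.

```python
def beta_gal_units(a420, a550, a600, minutes, culture_ml=0.1, scatter_coeff=0.75):
    """Activity units = 1000 * (A420 - k * A550) / (t * v * A600).

    scatter_coeff (k) defaults to the value printed in the text; pass 1.75
    to use the coefficient of Miller's standard protocol instead.
    """
    if minutes <= 0 or culture_ml <= 0 or a600 <= 0:
        raise ValueError("time, culture volume, and A600 must all be positive")
    return 1000.0 * (a420 - scatter_coeff * a550) / (minutes * culture_ml * a600)


# Hypothetical readings, not data from the study.
example_units = beta_gal_units(a420=0.5, a550=0.02, a600=0.4, minutes=10)
print(round(example_units, 1))
```

Expression relative to the in-frame control can then be obtained by dividing the units for a construct by the units for the control measured in the same experiment.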
| RESULTS |
|---|
As previously reported (3), expression is in the 10–20% range compared with an in-frame ORF control (Table 1, line 1) for a β-galactosidase reporter fused immediately downstream of the H10 sequence, or derivatives of H10 in which the reporter is in the +1 frame (clone 221) or the −1 frame (clone 210) (Table 1, lines 3–5). Mutation of the first TGA to TTA at codon 15 in the H10-reporter fusions (the "8T" mutation; see Fig. 1) has no effect on reporter expression in the −1 frame variant, 210 (Table 1, line 8) and causes a relatively small reduction in expression of H10 (Table 1, line 6). However, the "8T" mutation in the +1 variant 221 almost abolished expression of the reporter (Table 1, line 7); this is the result that had led us to hypothesize that a recoding phenomenon was involved (3).
As reported, the mutation "74T" in H10 changed a putative ATG start in the −1 frame to ATT, thereby eliminating expression from the −1 frame (5). However, as a result of this mutation, the zero frame TGG at codon 37 of the H10-reporter fusion was changed to TTG, adjacent to the preexisting TTG at codon 38 in the sequence (see Fig. 1), and expression in the zero frame increased more than twofold (Table 1, line 9). This is the result that made us suspect that the preexisting TTG at codon 38 might be functioning as a translational reinitiation site. Appropriately spaced, overlapping Shine-Dalgarno sequences are upstream of these TTG codons in the H10.74T sequence (boxed sequence "ggaggagg" at nucleotides 61–68 in Fig. 1). The 74T mutation possibly increased the target size for reinitiation by placing two potential TTG start codons side by side. Though rarely used, UUG is known to be a valid start codon in E. coli mRNA, although the efficiency has been reported to be as low as 10% compared with AUG (14, 15). The putative TTG start was in the same frame as reporter for constructs where reporter was in the +1 or −1 frame.
To ascertain whether the TTG at codon 38 was in fact responsible for translation of reporter in the H10 series, we performed site-directed mutagenesis of this codon, changing the TTG to synonymous CTG (mutant 76C) in H10, 221, and 210 (see Fig. 1). Indeed, this mutation almost abolished expression of the reporter (Table 1, lines 10–12). In the case of the "8T" mutants of H10 and 210 (i.e., where the first TGA had been changed to sense), adding the second mutation of TTG to CTG almost abolished expression from those constructs as well (Table 1, lines 13, 14), indicating that the presence or absence of the first TGA had no effect on the TTG→CTG mutants. (There was no reason to construct a mutant of TTG in 221.8T since the 8T mutation had already eliminated expression of reporter [Table 1, line 7].)
As mentioned above, the 74T mutation more than doubled expression in the zero frame of H10, which we speculated could be due to the increased target size for initiation by having two tandem UUG codons in the mRNA resulting from this mutation. Support for this explanation comes from the result with the double mutant 74T.76C (Table 1, line 15), which shows essentially a reversion to the level of expression of parental H10 (Table 1, line 3). The 74T.76C double mutant contains the new putative TTG start at codon 37, but has eliminated the putative parental TTG start at codon 38, thereby returning the sequence to a single TTG for potential reinitiation; this in turn restores the level of expression to that seen with the single TTG at codon 38. The 74T.76C double mutant no longer offers the −1 frame ATG restart; therefore, removing the competition of −1 frame reinitiation does not enhance TTG reinitiation in the 0 frame.
Since ATG is the standard start codon for translation, we mutated the TTG at codon 38 in H10 to ATG (mutant 76A) and found that expression of the reporter doubled (Table 1, line 16), further supporting the model of translation reinitiation at codon 38 in the H10 series.
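The mutant names encode a nucleotide position and the new base (for example, 76C puts a C at nucleotide 76). The sketch below applies several of them to a short window of the H10 sequence and reads out codons 36 to 38 in the zero frame and the triplet at nucleotides 72–74 in the −1 frame. The window is an assumption reconstructed from primer SMFII (with the parenthesized C reverted to T) and anchored on the "ggaggagg" box at nucleotides 61–68; `H10_WINDOW`, `WINDOW_START`, `mutate`, and `triplet` are illustrative names.

```python
WINDOW_START = 58                      # nucleotide number of the first base below
H10_WINDOW = "GCAGGAGGAGGTTGATGGTTG"   # assumed nucleotides 58-78 of H10


def mutate(seq, *names):
    """Apply mutations named '<position><new base>', e.g. '74T'."""
    bases = list(seq)
    for name in names:
        position, new_base = int(name[:-1]), name[-1]
        bases[position - WINDOW_START] = new_base
    return "".join(bases)


def triplet(seq, first_nt):
    """Three bases starting at the given nucleotide number."""
    i = first_nt - WINDOW_START
    return seq[i:i + 3]


def zero_frame_codons(seq):
    """Codons 36, 37, and 38 (nucleotides 70-72, 73-75, 76-78)."""
    return [triplet(seq, nt) for nt in (70, 73, 76)]


def minus_one_start(seq):
    """Triplet at nucleotides 72-74, the putative -1 frame start."""
    return triplet(seq, 72)


variants = {
    "H10": (),
    "74T": ("74T",),
    "76C": ("76C",),
    "74T.76C": ("74T", "76C"),
    "76A": ("76A",),
    "70G": ("70G",),
}
for label, names in variants.items():
    seq = mutate(H10_WINDOW, *names)
    print(label, zero_frame_codons(seq), minus_one_start(seq))
```

Under these assumptions the readout reproduces the codon changes described in the text: 74T converts both TGG (codon 37) to TTG and the −1 frame ATG to ATT, while 76C and 76A change only codon 38.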
A graphical representation of expression levels obtained with all the reinitiation site mutations, including reinitiation in the frame −1 to E-tag (5), is shown in Fig. 2.
Translation initiation at codon 38 TTG requires an upstream translation start
Although the result showing a twofold increase in expression when the codon 38 TTG was mutated to ATG supported the model of translation reinitiation at codon 38, we were surprised that the enhancement with ATG was only twofold. A possible explanation is that the context, presumably secondary/tertiary structure of the mRNA, is such that the codon 38 start is never highly accessible to the translation apparatus for de novo initiation.
To ascertain whether upstream translation was indeed required for the codon 38 TTG to be able to initiate translation, we again performed site-directed mutagenesis, this time of the bona fide start codon(s) at the beginning of the lacZ reporter gene in the vector. Since there are two in-frame ATGs at codons 1 and 3 of the lacZ sequence, we decided to mutate both to ensure that this translation start would be prevented. These double mutants, -29C.-35C, change both ATGs to ACGs (Fig. 1). Every time we introduced this double mutation, expression was eliminated or very nearly so (Table 1, lines 17–20), indicating that the downstream translation start at codon 38 TTG could not support independent translation initiation. Among the constructs tested was the -29C.-35C mutant of 221.8T (Table 1, line 18). Since the 8T mutation in this clone already eliminated expression, the start codon mutant was constructed to determine whether the downstream TTG might support translation when there was no competition from ribosome traffic in the zero frame blocking access to reinitiation in the +1 frame (there are no other zero frame stop codons in the 221.8T mutant upstream of the putative +1 frame TTG start). That the -29C.-35C mutants do not facilitate expression demonstrates that ribosome access to the downstream TTG start requires upstream translation.
Translation initiation at codon 38 TTG requires a nearby translation stop
Although at this point the evidence seemed compelling that expression of the reporter was a consequence of translation initiation at codon 38 TTG in the H10 series, some observations were still unexplained. Why didn't 221.8T show some expression of reporter (Table 1, line 7), or conversely, why did 210.8T continue to show expression of reporter (Table 1, line 8)? In both instances, the 8T mutation eliminated all zero frame stop codons before the downstream TTG start. If the lack of expression in 221.8T was due to ribosome traffic in the zero frame blocking access to the TTG start in the +1 frame, then why wasn't a similar observation made for 210.8T, where ribosome traffic in the zero frame would have been expected to block access to the TTG start in the −1 frame?
The literature on translation reinitiation indicated the necessity for a translation stop in the proximity of the reinitiation site (e.g., refs 16–19), so we decided to further mutate stop codons near the downstream TTG start (Fig. 1). Mutant 70G in H10 changes the codon 36 TGA to GGA; however, H10 still retains the codon 15 TGA, which we already knew supported reinitiation at the downstream TTG in clones 221 and 210. Indeed, the 70G mutation in H10 had no significant effect on the level of expression (Table 1, line 21). We made an analogous mutant, 70G, in clone 210.8T, eliminating both TGAs even though the downstream TGA was not in the zero frame for 210; this had no effect on expression level (Table 1, line 22). Because some of the published work on reinitiation indicated that the stop codon could actually be downstream of the reinitiation site (19, 20), we inspected the region 3' to the TTG start and realized that even though we were using an amber suppressor host and had considered TAG as coding for glutamine, this codon is still a stop signal, albeit in competition with suppressor tRNA. In fact, the sequence of clone 210 exhibits a TAG at codon 46 in its zero frame 22 nucleotides downstream of the −1 frame TTG start: ggT TGg gtt gct att ttg ttg ctg ggg TAG (sequence from Fig. 1 with codons in the zero frame for clone 210; the −1 frame TTG restart and the zero frame TAG stop are in upper case).
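The spacing quoted for clone 210 can be checked directly from the fragment given above. The following sketch (variable names are illustrative) locates the first TTG and the TAG, counts the nucleotides between them, and reports each codon's offset relative to the zero frame in which the fragment is written.

```python
# Fragment as printed, grouped into zero-frame codons for clone 210.
fragment = "ggT TGg gtt gct att ttg ttg ctg ggg TAG"
seq = fragment.replace(" ", "").upper()

ttg_index = seq.find("TTG")   # first TTG: the putative restart
tag_index = seq.find("TAG")   # amber codon

# Nucleotides lying strictly between the end of TTG and the start of TAG.
gap = tag_index - (ttg_index + 3)

# Offset within the zero frame: 0 means in frame; 2 is equivalent to -1 (mod 3).
ttg_offset = ttg_index % 3
tag_offset = tag_index % 3

print("gap between TTG and TAG:", gap)
print("TTG frame offset:", ttg_offset, "| TAG frame offset:", tag_offset)
```

The TAG sits on a zero-frame codon boundary while the TTG starts one base before a boundary, in agreement with the description of a −1 frame restart upstream of a zero frame amber codon.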
So we constructed mutant 101C in clone 210.8T, thereby eliminating all zero frame stop codons, and finally obtained a significant drop in expression (Table 1, line 23), indicating that reinitiation in 210.8T had been supported even by the amber codon despite the presence of suppressor tRNA, presumably because efficiencies of amber suppression are usually 50% or less. [We know the amber suppressor is functional in this strain from expression of another H10 derivative, clone 117, which contains a zero frame amber codon in the same frame as reporter (3).]
A graphical representation of expression levels obtained with all the stop codon mutations, including the 8T mutants that eliminated codon 15 TGA, is shown in Fig. 3.
Thus, we conclude that either TGA at codon 15 or 36 is sufficient for reinitiation (at codon 38) in clone H10, that the TGA at codon 15 is required for reinitiation in clone 221, and that either TGA at codon 15 or (suppressed) TAG at codon 46 is sufficient for reinitiation in clone 210. In clone 221.8T, there actually is a stop codon (TGA) further downstream in the zero frame, at codon 68 (see legend to Fig. 1), but evidently this is too far downstream to support reinitiation at codon 38.
| DISCUSSION |
|---|
twofold. Adding a second tandem TTG increases expression by more than twofold, presumably because of increasing the target size for reinitiation. Even though the reinitiation site evidently is exposed by upstream translation, it is nonfunctional in the absence of a nearby stop codon, presumably because ribosome traffic in the zero frame obscures access to the reinitiation site (e.g., refs 21
Thus, our data explain the puzzling expression patterns of the H10 series of isolates and reinforce and enhance conclusions from earlier work on translational reinitiation in β-galactosidase (ref 17 and references therein), lac repressor (ref 16 and references therein), and phage T4 rIIB (ref 18 and references therein). Similar conclusions to our work were described in a recent publication, in which segments from a human gene were cloned in front of the alpha fragment of β-galactosidase and non-ORF expression appeared to be explained by translational reinitiation (23).
In our system, there is a rather long 66 nucleotide distance from the first TGA to the reinitiation site, and it remains possible that some kind of "bridging" mechanism is used, as appears to be the case with lac repressor (24); however, there are no TAAs in any of the frames and only a few suppressed TAGs that could function as bridging translation stops before another reinitiation. It is possible that downstream cis-acting elements facilitate the putative reinitiation (25, 26). We cannot entirely rule out a recoding explanation somehow involving the codon 38 TTG.
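The 66-nucleotide figure follows from the codon numbers for H10, where the stop and the restart are in the same frame. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def intervening_nt(upstream_codon, downstream_codon):
    """Nucleotides lying strictly between two same-frame codons."""
    if downstream_codon <= upstream_codon:
        raise ValueError("downstream codon must follow the upstream codon")
    return 3 * (downstream_codon - upstream_codon - 1)


# Codon 15 TGA to codon 38 TTG in the H10-reporter fusion.
first_stop_to_restart = intervening_nt(15, 38)
# Codon 36 TGA to codon 38 TTG, the nearer stop.
second_stop_to_restart = intervening_nt(36, 38)

print(first_stop_to_restart)   # 66
print(second_stop_to_restart)  # 3
```

The contrast between 66 and 3 nucleotides shows how different the two stop-to-restart distances are, even though the results indicate that either stop is sufficient in H10.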
Translational coupling is very similar to translational reinitiation, the main differences being that coupling involves two adjacent genes whereas reinitiation is an intragenic phenomenon; reinitiation is rather dependent on a Shine-Dalgarno sequence (27), whereas at least some translationally coupled genes do not appear to have this requirement. Our results support conclusions from studies of translationally coupled genes (ref 19 and references therein) and modify slightly the scanning model for bacterial ribosomes finding the next available initiation codon within a certain limited length of nucleotide sequence near the stop codon (19), since H10 appears to be able to reinitiate in either of two frames. After termination at codon 15 TGA, the ribosome reinitiates either with 30–40% efficiency at the ATG (nucleotides 72–74) in the −1 frame (5) or with 10–20% efficiency at the TTG (nucleotides 76–78) in the 0 frame. Both alternate restarts are downstream of suitable Shine-Dalgarno sequences.
The recent publication of the human genome sequence (28) has led to estimates of the number of genes in the human genome that are significantly lower than previously thought. However, ribosomes appear to have a propensity to find unexpected ways to start translation, as exemplified in the work reported here as well as by others. Of special note are the recent demonstrations of translation initiation following an IRES (internal ribosome entry site) that do not even use a start codon or methionyl-tRNA (29–31). These observations raise the possibility that many tiny ORFs and not-so-tiny ORFs might be translated after all, with potential functional ramifications, and that estimates of the human proteome may need to be revisited.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Received for publication February 18, 2003. Accepted for publication May 8, 2003.
| REFERENCES |
|---|