FASEB J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by NOTARO, R.
Right arrow Articles by LUZZATTO, L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by NOTARO, R.
Right arrow Articles by LUZZATTO, L.
(The FASEB Journal. 2000;14:485-494.)
© 2000 FASEB

Human mutations in glucose 6-phosphate dehydrogenase reflect evolutionary history

ROSARIO NOTARO*, ADEYINKA AFOLAYAN*,{dagger} and LUCIO LUZZATTO*1

* Department of Human Genetics, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA; and
{dagger} Department of Biochemistry, Obafemi Awolowo University, Ile-Ife, Nigeria

1Correspondence: Department of Human Genetics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021, USA. E-mail: l-luzzatto{at}ski.mskcc.org


   ABSTRACT
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 
Glucose 6-phosphate dehydrogenase (G6PD) is a cytosolic enzyme encoded by a housekeeping X-linked gene whose main function is to produce NADPH, a key electron donor in the defense against oxidizing agents and in reductive biosynthetic reactions. Inherited G6PD deficiency is associated with either episodic hemolytic anemia (triggered by fava beans or other agents) or life-long hemolytic anemia. We show here that an evolutionary analysis is a key to understanding the biology of a housekeeping gene. From the alignment of the amino acid (aa) sequence of 52 glucose 6-phosphate dehydrogenase (G6PD) species from 42 different organisms, we found a striking correlation between the aa replacements that cause G6PD deficiency in humans and the sequence conservation of G6PD: two-thirds of such replacements are in highly and moderately conserved (50–99%) aa; relatively few are in fully conserved aa (where they might be lethal) or in poorly conserved aa, where presumably they simply would not cause G6PD deficiency. This is consistent with the notion that all human mutants have residual enzyme activity and that null mutations are lethal at some stage of development. Comparing the distribution of mutations in a human housekeeping gene with evolutionary conservation is a useful tool for pinpointing amino acid residues important for the stability or the function of the corresponding protein. In view of the current explosive increase in full genome sequencing projects, this tool will become rapidly available for numerous other genes.—Notaro, N., Afolayan, A., Luzzatto, L. Human mutations in glucose 6-phosphate dehydrogenase reflect evolutionary history.


Key Words: G6PD deficiency • G6PD variants • housekeeping genes • human mutants • evolution


   INTRODUCTION
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 
GLUCOSE 6-PHOSPHATE DEHYDROGENASE (G6PD) is a cytosolic enzyme whose main function is to produce NADPH, a key electron donor in the defense against oxidizing agents and in reductive biosynthetic reactions (1) . In humans, genetically determined G6PD deficiency is associated with either episodic hemolytic anemia, triggered by fava beans or other agents (referred to hereinafter as mild phenotype), or life-long hemolytic anemia (referred to hereafter as severe phenotype) (2) . Among 122 variants characterized at the molecular level, those with severe phenotype are all rare, but those with mild phenotype have become common in many populations as a result of malaria selection (1) . G6PD was first identified in the early 1930s in yeast (3) and human red blood cells (4) , and since then in all organisms that have been tested for this enzyme activity as well as in all tissues and cell types from higher animals and plants (5) ; hence, it can be regarded as a prototype example of a housekeeping enzyme. Structural and functional features of the G6PD promoter conform to those of a typical housekeeping gene (6 7 8) . The G6PD gene from Homo sapiens was the first to be cloned (9) . Since then, 52 complete G6PD coding sequences have become available; of these, 43 were sought out deliberately and 9 have emerged in the last 2 years from the complete sequencing of microorganisms. From these sequence data we have derived a general idea of the evolution of G6PD over a Myr time scale (macroevolution). We used this evolutionary information to understand the distribution of human mutations that affect enzyme activity (microevolution).


   MATERIALS AND METHODS
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 
G6PD sequences
For this study we have selected only complete G6PD sequences for which there was an available open reading frame from a start to a stop codon. The following 49 G6PD complete coding sequences have been obtained from GENBANK by keyword search or by BLAST search using human G6PD sequence as probe (in parentheses, GENBANK sequence ID): Actinobacillus actinomycetemcomitans (1651208); Anabaena sp. (988293); Aquifex aeolicus (2983137); Arabidopsis thaliana 1, chloroplast (3021305); Arabidopsis thaliana 2 (1174336); Aspergillus niger (870831); Bacillus subtilis (1303961); Borrelia burgdorferi (2688568); Caenorhabditis elegans (1321752); Ceratitis capitata (460877); Chlamydia trachomatis (1791245); Cricetulus griseus (2828743); Drosophila melanogaster (157470); Emericella nidulans (642160); Erwinia chrysanthemi (397854); Escherichia coli (1788158); Fugu rubripes (2734869); Haemophilus influenzae (1573543); Helicobacter pylori (2314250); Homo sapiens (31543); Kluyveromyces lactis (5539); Leuconostoc mesenteroides (149631); Macropus robustus (560549); Medicago sativa (603219); Mus musculus G6PD1 (51114) and G6PD2 (1806126); Mycobacterium tuberculosis zwf (2117215) and zwf2 (2131049); Nicotiana tabacum cytosolic TCG6 (3021508), cytosolic TCG9 (3021510), chloroplast TPG16 (AJ001771), chloroplast TPG18 (3021532) and chloroplast (1480344) G6PD; Nostoc sp. (558505); Petroselinum crispum cytosolic cG6PDH1 (2352921), cytosolic cG6PDH2 (2352923) and chloroplast pG6PDH (2352919); Pichia jadinii (348417); Plasmodium falciparum (438212); Pseudomonas aeruginosa (2935661); Rattus norvegicus (56196); Saccharomyces cerevisiae (171545); Solanum tuberosum cytosolic (471345) and chloroplast (1197385) G6PD; Spinacia oleracea, chloroplast (2276344); Synechococcus sp. (988286); Synechocystis sp. (1652530); Treponema pallidum (3322776); Zymomonas mobilis (155591). Two additional complete sequences (Deinococcus radiodurans and Streptococcus pnaeumoniae) were retrieved by BLAST search from the archive of The Institute of Genomic Research. The complete sequence from Rhodobacter capsulatus was retrieved from the archive of University of Chicago. The full genomes of Archaeoglobus fulgidus, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Mycoplasma genitalium, Mycoplasma pneumoniae, and Rickettsia prowazekii are listed in GENBANK and were searched by using BLAST; the full-sequence Pyrococcus horikoshii was searched by using BLAST on the server of National Institute of Technology and Evaluation, Tokyo, Japan. We have not included the sequence of a rabbit microsomal G6PD (10) ; it is likely that this sequence belongs to the non-glucose-6-phosphate-specific hexose-6-phosphate dehydrogenase previously characterized from rat liver microsomes (11) .

Alignment and sequence analysis
The sequences were aligned using the program PIMA 1.4 (12) (Human Genome Center, Baylor College of Medicine, Houston, Tex.: http://dot.imgen.bcm.tmc.edu:9331/multi-align.html). The alignment was manually refined. The percentages of identity and similarity were calculated from the alignment using the program GeneDoc (13) . For aa similarity, we adopted the criteria of the Dayhoff’s PAM 250 matrix (14) : 1) DEQHN, 2) FY, 3) KR, 4) SAT, and 5) LIVM. For physicochemical grouping, we adopted GeneDoc criteria in which aa are classified in twelve groups based on three subsequent hierarchical levels: the first level is based on size, the second is based on electrical charge for polar aa, the third is based on aromaticity for nonpolar aa (15 , 16) .

Phylogenetic tree
The evolutionary tree of G6PD was derived from the alignment of the 52 sequences. Distances between sequences were calculated from the alignment by using PROTDIST (according to Dayhoff’s PAM 250 matrix; ref 14 ); the tree was then generated by using KITSCH and drawn by using TREEVIEW (17) . The programs PROTDIST and KITSCH are part of the PHYLIP 3.57 package (18) .

Amino acid solvent accessibility
The 3-dimensional structure of human G6PD was previously modeled on the crystal structure of L. mesenteroides G6PD (19) . Specifically, residues 27–512 of human G6PD are aligned to residues 1–486 of L. mesenteroides. The coordinates of this model have been used to calculate the solvent accessibility (SA) of individual aa residues in the monomeric structure of human G6PD using Swisse-PdbViewer (20) program (http://www.expasy.ch/spdbv/mainpage.htm). Amino acids with SA < 10% are regarded as buried and amino acids with SA >= 10% are regarded as exposed (21) .

Human mutations
One hundred twenty-two mutations or combination of mutations of human G6PD have recently been tabulated (22) : 114 variants have missense mutations; of these, 106 have a single mutation, 7 have two mutations, and 1 has three mutations. Six variants have in-frame deletions (four single aa deletions, one 2 aa, and one 8 aa deletion). One variant has a nonsense mutation and one has a splicing site mutation. We have made use of only 118 variants from this database (Fig. 4) because no clinical data are available on the other four. Of these 118 variants, 3 do not affect enzyme activity; all others entail enzyme deficiency: of these, 57 are associated with chronic nonspherocytic hemolytic anemia (WHO class I, severe phenotype) and 58 are associated with the risk of acute hemolytic anemia (WHO classes II and III, mild phenotype). In studying the relationship between human mutations and evolutionary conservation, we have confined the analysis to only the 99 single missense mutations and the 4 single amino acid deletions.



View larger version (67K):
[in this window]
[in a new window]
 
Figure 4. Microevolution of the G6PD molecule. This diagram reflects a synoptic analysis of 52 different G6PD sequences and of 118 human mutations. Human G6PD is used as the reference sequence and the exons are in black boxes; the coding sequence starts in exon 2. In the lettering of human G6PD, aa similarity (cfr. Fig. 1 ) is reported with the following color code: White letters in blue field, 100%; black letters in green field, 76–99%; white letters in red field, 51–75%; black lowercase for aa with <50% conservation. Below the human sequence, aa replacements in human mutants (mutations) are shown in uppercase in a black field if they have a severe clinical phenotype, in boldface underlined lowercase if they have a mild clinical phenotype, in italics lowercase if they do not have any known clinical phenotype, and in lowercase if they have been observed only in association with other mutations; {Delta} indicates a deleted residue; Z indicates a nonsense mutation; {int} indicates a mutation that interferes with splicing. The human mutations are from ref 22 .


   RESULTS
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 
G6PD in living organisms
We have retrieved from available databases a total of 52 sequences clearly homologous to human G6PD, which we have used as reference. Of these, 50 were annotated as G6PD and 2 were not. When all of these sequences are aligned by using conventional algorithms (Fig. 1 ), we find that the mammalian sequences have ~94% identity (by comparison, homology among mammalian hemoglobins is of the order of 70%); when these are compared with those of lower vertebrates, invertebrates, and microorganisms, it is not surprising that the degree of identity decreases but remains higher than 20% (Table 1 ). The percentage of similarity among all of the 52 sequences ranges between 43 to 98%. A long stretch of 141 AA (163–303 in human) is aligned without a single gap in all sequences (except H. pylori and A. aeolicus). Some G6PD species in plants have amino-terminal extensions related to their localization in chloroplasts; G6PD from Plasmodium falciparum has an intriguing amino-terminal extension (23) that may have a different enzymatic function (24) , but in the G6PD-like portion the homology is of the same order of magnitude as that found in other organisms.



View larger version (79K):
[in this window]
[in a new window]
 
Figure 1. Macroevolution of the G6PD molecule. This diagram reflects a synoptic analysis of 52 different G6PD sequences. Human G6PD (italics) is used as the reference sequence and the even exons are underlined and numbered: the coding sequence starts in exon 2. In the lettering of human G6PD, aa solvent accessibility (SA) is reported with the following color code: blue, buried aa (SA <10%); red, exposed aa (SA=10%). Immediately above the human sequence we show the similarity consensus for the 52 sequences analyzed (similarity), and above that the identity consensus sequence (identity). At each position, the most frequent aa is shown, with a coding for its frequency (white uppercase in blue field: 100%; black uppercase in green field: 76–99%; white lowercase in red field: 51–75%); hyphens are entered in the two consensus lines for aa with <50% conservation. For aa similarity we have adopted the criteria of the Dayhoff PAM 250 matrix (14) : DEQHN, FY, KR, SAT, and LIVM. Conserved blocks described in the text are boxed and identified with roman numerals. Above the identity consensus sequence, runs of underlined a and b indicate {alpha}-helices and ß-strands respectively, as found in the Leuconostoc mesenteroides molecule (sec. structure) (19) . The structure of the first 24 aa and the last 3 aa (lowercase) is not known for L. mesenteroids.


View this table:
[in this window]
[in a new window]
 
Table 1.

In our retrieval of G6PD sequences, it was of special interest that no coding sequence recognizable as G6PD could be found in seven microorganisms whose genome has been fully sequenced. Four of these (Archaeoglobus fulgidus, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Pyrococcus horikoshii) are Archea, which grow in an oxygen-poor habitat. The other three are Eubacteria with a small and defective genome, two of which (Mycoplasma genitalium and M. pneumoniae) are cell surface parasites, whereas one (Rickettsia prowazekii) is an obligate intracellular parasite that might be able to capitalize on the G6PD activity of the host cell.

The homology between G6PD sequences from Eubacteria and Eukaryota clearly supports a common evolutionary origin of G6PD throughout living organisms. Unlike tissue-specific genes such as globins, a housekeeping gene such as G6PD allows us to outline evolutionary relationships for most of the living organism, the only exception so far being the Archea. A dendrogram based on economy principles is consonant with taxonomy (Fig. 2 ), and some specific points are noteworthy. 1) Metazoa, fungi, and plants all separate early on from bacteria. 2) Vertebrates separate cleanly from invertebrates. 3) The chloroplast G6PD sequences of the plants separate early from the cytosolic G6PD species of the plants and from G6PD of fungi and metazoa. 4) Rather than being nearer metazoa, the G6PD sequences of fungi seem to fall between chloroplast G6PD and cytosolic G6PD.



View larger version (35K):
[in this window]
[in a new window]
 
Figure 2. Evolutionary tree of G6PD derived from the alignment of the 52 sequences (see Fig. 1 and Materials and Methods). Distances between sequences were calculated from the alignment by using PROTDIST according to Dayhoff’s PAM 250 matrix (14) ; the tree was then generated by using KITSCH and drawn by using TREEVIEW (17) . The programs PROTDIST and KITSCH are part of the PHYLIP 3.57 package (18) . Cyt, cytosol; Chl: chloroplast. #The author who reported the M. sativa G6PD sequence (603219) did not state whether the sequence is encoding for the G6PD present in the cytosol or in the chloroplast. From the alignment and from this phylogenetic tree, we infer that the sequence almost certainly encodes for a cytosolic G6PD. $The author who reported A. thaliana 2 G6PD sequence (1174336) did not state whether the sequence is encoding for the G6PD present in the cytosol or in the chloroplast. From the alignment and from this phylogenetic tree, we infer that the sequence almost certainly encodes for a chloroplast G6PD.

Sequence conservation and 3-dimensional structure
In all of the 52 known G6PD species, there are 25 identical aa and 56 similar aa. The total number of aa is 515 in human and between 425 and 604 in most species. If we apply broader criteria of similarity, whereby aa are classified into 12 groups sharing some physicochemical properties, there are in fact 143 conserved aa. The overall percent homology among G6PD sequences is of course only a rough measure of evolutionary conservation, because homology is not uniform throughout the sequence. The degree of homology tends to decrease toward both the NH2 terminus and the carboxyl terminus, perhaps because these regions are allowed greater flexibility without prejudice to the overall conformation of the molecule. Similarity conservation is shaped in blocks; we have identified 12 by visual inspection (Fig. 1) . Although we cannot yet fully explain the significance of each conserved block, we can offer a rationale in most cases. Block I is the NADP binding region (residues 38–47 of human G6PD); it has a characteristic dinucleotide pocket (GXXGXX), flanked on either side by additional conserved hydrophobic amino acids (e.g., IIM 35–37, P 50 and P 62, L 55, and L 61) and a KKK motif. Block IV comprises the catalytic site, which surrounds a completely conserved lysine; in human G6PD this is K 205, previously shown to be essential for enzyme activity though not for substrate binding (25) .

To understand the significance of other regions of high conservation, we must consider the conformation of the protein. The crystal structure has been fully solved for L. mesenteroides G6PD (26) and human G6PD (27) . The latter agrees well with a previously published model (19) ; therefore, it is reasonable to presume that the 3-dimensional structure is essentially conserved. With reference to the human G6PD model (19) , we find that the conserved blocks III and V are facing the active center (see above). Block X contributes much of the subunit interface. Blocks II, VII, VIII, IX, and XI are part of the hydrophobic core. The carboxyl-terminal ends of block IV and block XI also contribute to the subunit interface. Blocks VI and XII include amphipatic {alpha} helices. To test the notion that surface accessibility is significantly related to evolutionary conservation, we consider four groups of aa based on similarity conservation (fully conserved, 100%; highly conserved, 75–99%; moderately conserved, 50–74%; poorly conserved, <50%). We find that buried aa residues are over-represented and that exposed aa residues are under-represented among those with a similarity higher than 75% (Fig. 3 ) ({chi}2=50; P<0.001).



View larger version (22K):
[in this window]
[in a new window]
 
Figure 3. Conservation and 3-dimensional structure. Each bar represents the distribution of G6PD amino acids among the various conservation categories (see text) for exposed aa, total aa, and buried aa, respectively.

Mutations in human G6PD
We next analyzed the known mutations of human G6PD (microevolution) in relation to the macroevolutionary conservation of the protein sequence (Fig. 4 ). The majority of mutations causing G6PD deficiency are missense, and obvious ‘null’ mutations (early nonsense mutations, mutations destroying the reading frame, mutations in the substrate binding or coenzyme binding regions) have never been observed. In addition, the mutations in human G6PD are spread throughout the sequence. Only one discrete cluster emerges in the aa range 380–410 that corresponds to the subunit interface in the enzymatically active G6PD dimer (19) ,

To assess how human mutations associated with G6PD deficiency relate in general to the evolutionary history of the G6PD sequence, we refer again to the four similarity conservation groups defined above and see that a distinct pattern emerges (Table 2 ; Fig. 5 ). Fully conserved amino acids and poorly conserved amino acids are under-represented in G6PD-deficient mutants. By contrast, highly and moderately conserved amino acids are over-represented in G6PD-deficient mutants. This skewed distribution of mutations among the different amino acid conservation group is statistically significant ({chi}2=9.36; P<0.03).


View this table:
[in this window]
[in a new window]
 
Table 2. Human G6PD deficiency mutations relate to evolutionary conservationa



View larger version (22K):
[in this window]
[in a new window]
 
Figure 5. Conservation and G6PD variants. The upper bar represents the distribution of G6PD mutants among the various conservation categories (see text). For comparison, the lower bar represents the distribution of all amino acids in human G6PD among the various conservation categories.

In 84% (83/99) of cases, regardless of the degree of evolutionary conservation, aa replacements in human mutants do not respect similarity; in 68% (67/99) of cases, the aa replaced in the mutants are not found in any nonhuman G6PD—if so, they probably would not cause G6PD deficiency. In support of this notion, we noticed that in 9 of the 16 cases (56%) where a human mutation does respect similarity, the mutated residue is normal in another species; but of the 83 cases where the human mutation does not respect similarity, there are only 23 (28%) where the mutated residue is normal in another species.


   DISCUSSION
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 
The G6PD gene is unique among housekeeping genes because of the existence of a large number of human mutants that have been discovered, of which many are polymorphic rather than sporadic. Analyzing this system therefore affords an opportunity to relate the naturally occurring genetic variation in the human species (microevolution) to sequence information available for G6PD from more than 50 other organisms (macroevolution).

G6PD is not ubiquitous
Knowledge of the complete sequence of the genome of several microorganisms has been one of the most significant advances in genomics research in the past 5 years. A remarkable implication is that for the first time it is possible to ask directly not only what genes can be found in an organism, but also what genes are lacking. To our surprise, we found no recognizable G6PD sequence in seven microorganisms whose genome had been fully sequenced. Indeed, three of the microorganisms that lack G6PD are Eubacteria, with small and ‘defective’ genomes, and parasitic habit (Mycoplasma genitalium and M. pneumoniae, Rickettsia prowazekii). We surmise that all three might be able to capitalize on the G6PD activity of the host cell. The other four microorganisms that lack G6PD (Archaeoglobus fulgidus, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Pyrococcus horikoshii) are Archea, which grow in an oxygen-poor habitat; we surmise that they do not need G6PD because they do not need to defend themselves against oxidative stress. These evolutionary findings, while showing that life without G6PD is possible, provide independent confirmation for the notion derived from targeting the G6PD gene in mouse embryonic stem cells—namely, that in the organisms that do have G6PD, its only indispensable function consists not in pentose synthesis, but rather in supplying reductive potential in the form of NADPH (28) . NADPH in turn serves as a defense against oxidative stress as well as for the synthesis of nitric oxide (29 , 30) .

Phylogenesis
The phylogenetic tree we obtained for G6PD clearly supports a common evolutionary origin throughout living organisms and is consonant with taxonomy (Fig. 2) . Our findings on the evolution of the G6PD gene are reminiscent of those that have been reported with other housekeeping genes, most of which are ancient in evolutionary history (31) : namely, the degree of homology bears a broadly inverse correlation to evolutionary distances, and certain critical regions are nearly completely conserved. P. falciparum seems to be at an evolutionary dead-end, as is often the case with parasitic protozoa. The absence of G6PD in all the Archea that have been fully sequenced is intriguing, but not unique. Eight hundred sixty-four clusters of orthologous groups (COG) have been defined on the basis of consistent patterns of sequence similarities when comparing protein sequences encoded in eight complete genomes from six major phylogenetic lineages (32) . Each COG consists of individual proteins or groups of paralogs from at least three lineages. Indeed, 26% of COG show the same phylogenetic pattern of G6PD (presence in Bacteria and Eukarya, absence in Archaea) (33) . This phylogenetic pattern suggests that G6PD gene has been lost in the common ancestor of Archaea. Alternatively, the common ancestor of Archea and Eukaria did not have G6PD, and Eukaria later acquired G6PD gene by horizontal transfer from Bacteria.

Evolutionary conservation and 3-dimensional structure
In general, one would expect in an enzyme that is a globular protein that the active center and the hydrophobic core would show a high degree of conservation (34) , and we have validated this principle in the case of G6PD: 64.2% of buried (but only 35.8% of exposed) amino acid residues are highly or fully conserved (Fig. 1 , Fig. 3 ; P<0.001). In addition, we note that in a homodimer the subunit interface is structurally special because each amino acid within that region is represented twice, in symmetrical positions, on the contact surface. As a result, any amino acid replacement within this region will produce two nearby changes in the structure of the protein. In fact, we have found that the subunit interface probably accounts for the majority of the conserved blocks in the G6PD sequence on a long-range evolutionary scale, suggesting that constraints against sequence change are greater than average in that region.

Genetic variation in human G6PD
Comparing the distribution of mutations in a human gene with its evolutionary history is a powerful tool for pinpointing the function of domains and even of individual aa within a protein. To do this, it is useful to consider the range of potential phenotypic (clinical) consequences of different types of mutations in a housekeeping gene such as G6PD. 1) Null mutations have never been observed. We have obtained mice heterozygous for G6PD deficiency from G6PD null embryonic stem cells (28) , but no hemizygous G6PD-deficient mice; we have recently found that the condition is lethal at an early stage in embryonic development (35) . 2) Mutations that compromise drastically substrate binding, coenzyme binding, or the catalytic mechanism would be expected to affect G6PD function in all cells to approximately the same extent and they ought to cluster in the respective regions of the sequence; no such clusters are seen (Fig. 4) . We presume that mutations with such drastic effects may frequently be lethal. 3) Mutations that cause instability of the G6PD protein molecule would be expected to produce marked deficiency of G6PD in red cells (much more than in other cells), because these cells have a long life span after they have lost capacity for protein synthesis (36) . The majority of known human G6PD-deficient variants fall into this category. 4) The only cluster of mutations emerges between aa 380 and 410, and corresponds to the subunit interface in the enzymatically active G6PD dimer (19) . Moreover, nearly all of the mutations in this cluster cause a severe phenotype (class I); indeed, 34% of all class I mutations fall within this 6% of the entire sequence. This highlights the critical role of precisely fitting noncovalent interactions for the formation and stability of the dimeric molecule.

The distribution of mutations in relationship to the evolutionary history of the G6PD sequence shows a definite and peculiar pattern: more than two-thirds of the aa replacements that cause G6PD deficiency in humans are in highly and moderately conserved aa, whereas relatively few are in fully or poorly conserved aa (Fig. 5 , P<0.03). Fully conserved amino acids are under-represented in G6PD-deficient mutants, presumably because of a higher probability that their replacement may be lethal. Poorly conserved amino acids are also under-represented, presumably because in many cases their replacement may not cause G6PD deficiency and therefore such mutations may go undetected. In fact, of the 19 mutants in this group, 7 are in the subunit interface region (see above) and 3 are in-frame deletions rather than missense mutations; of the remaining, none has a severe phenotype, confirming that replacements of these nonconserved amino acids are generally well tolerated. By contrast, highly and moderately conserved amino acids are over-represented in G6PD-deficient mutants; indeed, two-thirds of these mutants affect such amino acids residues, as though the consequence of their replacement is often serious enough to cause instability but not sufficiently serious to be lethal. This finding is a novel modification of the notion that mutations associated with genetic defects tend to affect the amino acids residues that are most conserved.

Human polymorphic mutations vs. human sporadic mutations
Extensive databases on human disease genes in several cases comprise 100 or more mutations (37) . However, G6PD is the only case of a housekeeping gene in which many (about one-half; see Table 2 ) of the known mutations are both potentially pathogenic and polymorphic2 as a result of malaria selection (38 , 39) . Indeed, each of the mutant alleles concerned constitutes an example of balanced polymorphism, whereby the deleterious phenotypic consequences of the mutation are balanced by the resistance that it confers against Plasmodium falciparum. By contrast, we can presume that each of the sporadic G6PD mutants has remained sporadic because its phenotypic consequences are too deleterious to be balanced by the resistance it might confer against Plasmodium falciparum. If we analyze separately the severe sporadic mutations and the mild polymorphic mutations against the backdrop of macroevolution, we find that the former are slightly more frequent in the 76–99% conservation bracket and the latter are more frequent in the 50–75% conservation bracket (see Table 2 ). However, with the number of mutants known thus far, the difference is not yet significant.


   CONCLUSION
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 
The evolutionary conservation of a housekeeping gene such as G6PD is greater than that of tissue-specific genes, presumably because the latter may require more specific adaptation to the physiology of individual organisms. Total lack of G6PD is almost certainly lethal in mammals, but partial G6PD deficiency confers biological advantage with respect to malaria. Most of the aa replacements causing G6PD deficiency take place in positions that are conserved, but less than fully conserved. Thus, the range of permissible G6PD mutations in the microevolution of the human species bears a clear imprint of the macroevolutionary history of this gene. With increasing numbers of full genome sequences becoming available, the approach we have used for the analysis of G6PD mutations is likely to be generally applicable to genes (particularly housekeeping genes) underlying other human diseases.


   ACKNOWLEDGMENTS
 
We thank Dr. Margaret Adams for providing the coordinates of the human G6PD model, Dr. Mike Rosen for valuable help with the analysis of the 3-dimensional structure, Drs. Nicola Pavletich and Dr. Martin Wiedmann for critical reading of the manuscript, and Dr. Howard T. Thaler for helpful suggestions. We also thank all colleagues of the G6PD groups in Ibadan (Nigeria), Napoli (Italy), and London (UK). This work was supported by MSKCC and NIH (PO1 HL 59312). R.N. is on leave of absence from the Hematology Division of Federico II University, Naples, Italy.

Note added in proof: In a publication that appeared after this paper was submitted, Cheng et al. [J. Biomed. Sci. (1999) 6, 106–114] have reported an independent analysis of human mutations in relation to amino acid conservation among 23 G6PD sequences from different organisms; some of their conclusions are in good agreement with ours.


   FOOTNOTES
 
2 The other pertinent case is of course that of hemoglobin, produced by two highly tissue specific (nonhousekeeping) genes. Polymorphic mutants of both the {alpha} and the ß globin genes are known, although they are not as numerous as the G6PD mutants.

Received for publication February 28, 1999. Revised for publication September 16, 1999.


   REFERENCES
TOP
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
 

  1. Luzzatto, L., Mehta, A. (1995) Glucose 6-phosphate dehydrogenase deficiency. Scriver, C. R. Beaudet, A. L. Sly, W. S. Valle, D. eds. The Metabolic and Molecular Basis of Inherited Disease ,3367-3398 McGraw-Hill New York.
  2. Beutler, E. (1991) Glucose 6-phosphate dehydrogenase deficiency. N. Engl. J. Med. 324,169-174[Medline]
  3. Warburg, O., Christian, W. (1932) Über ein neues Oxydationsferment und sein Absorptionsspektrum. Biochem. Z. 254,438-458
  4. Warburg, O., Christian, W. (1931) Über Aktivierung der robinsonschen Hexosemono-phosphorsäure in roten Blutzellen and die Gewinnung aktivierender Fermentlösung. Biochem. Z. 242,206-227
  5. Luzzatto, L., Battistuzzi, G. (1985) Glucose-6-phosphate dehydrogenase. Adv. Hum. Genet. 14,217-329[Medline]
  6. Martini, G., Toniolo, D., Vulliamy, T., Luzzatto, L., Dono, R., Viglietto, G., Paonessa, G., D’Urso, M., Persico, M. G. (1986) Structural analysis of the X-linked gene encoding human glucose 6-phosphate dehydrogenase. EMBO J 5,1849-1855[Medline]
  7. Philippe, M., Larondelle, Y., Lemaigre, F., Mariame, B., Delhez, H., Mason, P., Luzzatto, L., Rousseau, G. G. (1994) Promoter function of the human glucose-6-phosphate dehydrogenase gene depends on two GC boxes that are cell specifically controlled. Eur. J. Biochem. 226,377-384[Medline]
  8. Ursini, M. V., Scalera, L., Martini, G. (1990) High levels of transcription driven by a 400 bp segment of the human G6PD promoter. Biochem. Biophys. Res. Commun. 170,1203-1209[Medline]
  9. Persico, M. G., Viglietto, G., Martini, G., Toniolo, D., Paonessa, G., Moscatelli, C., Dono, R., Vulliamy, T., Luzzatto, L., D’Urso, M. (1986) Isolation of human glucose-6-phosphate dehydrogenase (G6PD) cDNA clones: primary structure of the protein and unusual 5' non-coding region. Nucleic Acids Res. 14,2511-2522; 7822[Abstract/Free Full Text]
  10. Ozols, J. (1993) Isolation and the complete amino acid sequence of lumenal endoplasmic reticulum glucose-6-phosphate dehydrogenase. Proc. Natl. Acad. Sci. USA 90,5302-5306[Abstract/Free Full Text]
  11. Hino, Y., Minakami, S. (1982) Hexose-6-phosphate dehydrogenase of rat liver microsomes. Isolation by affinity chromatography and properties. J. Biol. Chem. 257,2563-2568[Abstract/Free Full Text]
  12. Smith, R. F., Smith, T. F. (1990) Automatic generation of primary sequence patterns from sets of related protein sequences. Proc. Natl. Acad. Sci. USA 87,118-122[Abstract/Free Full Text]
  13. Nicholas, K. B., Nicholas, B. J. (1997) GeneDoc Pittsburgh Supercomputer Center
  14. Dayhoff, M. O. (1979) Atlas of Protein Sequences and Structure Vol. 5, Suppl. 3 National Biomedical Research Foundation Washington, D.C..
  15. Nicholas, K. B., and Nicholas, H. B. J. (1997) Analysis and visualization of genetic variant. http://wwwcriscom/~Ketchup/genedocshtml
  16. Taylor, W. R. (1986) The classification of amino acid conservation. J. Theor. Biol. 119,205-218[Medline]
  17. Page, R. D. (1996) TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12,357-358[Free Full Text]
  18. Felsenstein, J. (1989) PHYLIP—phylogeny inference package (version 3.2). Cladistic 5,164-166
  19. Naylor, C. E., Rowland, P., Basak, A. K., Gover, S., Mason, P. J., Bautista, J. M., Vulliamy, T. J., Luzzatto, L., Adams, M. J. (1996) Glucose 6-phosphate dehydrogenase mutations causing enzyme deficiency in a model of the tertiary structure of the human enzyme. Blood 87,2974-2982[Abstract/Free Full Text]
  20. Guex, N., Peitsch, M. C. (1997) SWISS-MODEL and the Swiss-dbViewer: an environment for comparative protein modeling. Electrophoresis 18,2714-2723[Medline]
  21. Goldman, N., Thorne, J. L., Jones, D. T. (1998) Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics 149,445-458[Abstract/Free Full Text]
  22. Vulliamy, T., Luzzatto, L., Hirono, A., Beutler, E. (1997) Hematologically important mutations: glucose-6-phosphate dehydrogenase. Blood Cells Mol. Dis. 23,302-313[Medline]
  23. O’Brien, E., Kurdi-Haidar, B., Wanachiwanawin, W., Carvajal, J. L., Vulliamy, T. J., Cappadoro, M., Mason, P. J., Luzzatto, L. (1994) Cloning of the glucose 6-phosphate dehydrogenase gene from Plasmodium falciparum. Mol. Biochem. Parasitol. 64,313-326[Medline]
  24. Scopes, D. A., Bautista, J. M., Vulliamy, T. J., Mason, P. J. (1997) Plasmodium falciparum glucose-6-phosphate dehydrogenase (G6PD)-the N-terminal portion is homologous to a predicted protein encoded near to G6PD in Haemophilus influenzae. Mol. Microbiol. 23,847-848[Medline]
  25. Bautista, J. M., Mason, P. J., Luzzatto, L. (1995) Human glucose-6-phosphate dehydrogenase. Lysine 205 is dispensable for substrate binding but essential for catalysis. FEBS Lett. 366,61-64[Medline]
  26. Rowland, P., Basak, A. K., Gover, S., Levy, H. R., Adams, M. J. (1994) The 3-dimensional structure of glucose 6-phosphate dehydrogenase from Leuconostoc mesenteroides refined at 2.0 A resolution. Structure 2,1073-1087[Medline]
  27. Au, S. W., Naylor, C. E., Gover, S., Vandeputte-Rutten, L., Scopes, D. A., Mason, P. J., Luzzatto, L., Lam, V. M., Adams, M. J. (1999) Solution of the structure of tetrameric human glucose 6-phosphate dehydrogenase by molecular replacement. Acta Crystallogr. D. Biol. Crystallogr. 55,826-834[Medline]
  28. Pandolfi, P. P., Sonati, F., Rivi, R., Mason, P., Grosveld, F., Luzzatto, L. (1995) Targeted disruption of the housekeeping gene encoding glucose 6-phosphate dehydrogenase (G6PD): G6PD is dispensable for pentose synthesis but essential for defense against oxidative stress. EMBO J 14,5209-5215[Medline]
  29. Tsai, K. J., Hung, I. J., Chow, C. K., Stern, A., Chao, S. S., Chiu, D. T. (1998) Impaired production of nitric oxide, superoxide, and hydrogen peroxide in glucose 6-phosphate-dehydrogenase-deficient granulocytes. FEBS Lett 436,411-414[Medline]
  30. Nathan, C. (1992) Nitric oxide as a secretory product of mammalian cells. FASEB J 6,3051-3064[Abstract]
  31. Straus, D., Gilbert, W. (1985) Genetic engineering in the Precambrian: structure of the chicken triosephosphate isomerase gene. Mol. Cell. Biol. 5,3497-3506[Abstract/Free Full Text]
  32. Tatusov, R. L., Koonin, E. V., Lipman, D. J. (1997) A genomic perspective on protein families. Science 278,631-637[Abstract/Free Full Text]
  33. Tatusov, R. L., Koonin, E. V. K., and Lipman, D. J. (1998) A natural system of gene families from complete genomes. http://www.ncbi.nlm.nih.gov/COG
  34. Chothia, C., Lesk, A. M. (1986) The relation between the divergence of sequence and structure in proteins. EMBO J 5,823-826[Medline]
  35. Longo, L., Rosti, V., Gaboli, M., Tabarini, D., Pandolfi, P. P., Luzzatto, L. (1997) Mice heterozygous for complete glucose 6-phosphate dehydrogenase (G6PD) deficiency are viable and show somatic cell selection. Blood 90(Suppl. 1),408a
  36. Mason, P. J. (1996) New insights into G6PD deficiency. Br. J. Haematol. 94,585-591[Medline]
  37. Krawczak, M., Cooper, D. N. (1997) The human gene mutation database. Trends Genet 13,121-122[Medline]
  38. Luzzatto, L. (1979) Genetics of red cells and susceptibility to malaria. Blood 54,961-976[Free Full Text]
  39. Ruwende, C., Khoo, S. C., Snow, R. W., Yates, S. N., Kwiatkowski, D., Gupta, S., Warn, P., Allsopp, C. E., Gilbert, S. C., Peschu, N., et al (1995) Natural selection of hemi- and heterozygotes for G6PD deficiency in Africa by resistance to severe malaria. Nature (London) 376,246-249[Medline]



This article has been cited by other articles:


Home page
Eukaryot CellHome page
M. Saliola, G. Scappucci, I. De Maria, T. Lodi, P. Mancini, and C. Falcone
Deletion of the Glucose-6-Phosphate Dehydrogenase Gene KlZWF1 Affects both Fermentative and Respiratory Metabolism in Kluyveromyces lactis
Eukaryot. Cell, January 1, 2007; 6(1): 19 - 27.
[Abstract] [Full Text] [PDF]


Home page
BloodHome page
F. Paglialunga, A. Fico, I. Iaccarino, R. Notaro, L. Luzzatto, G. Martini, and S. Filosa
G6PD is indispensable for erythropoiesis after the embryonic-adult hemoglobin switch
Blood, November 15, 2004; 104(10): 3148 - 3152.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
L. M. Matzkin
Population Genetics and Geographic Variation of Alcohol Dehydrogenase (Adh) Paralogs and Glucose-6-Phosphate Dehydrogenase (G6pd) in Drosophila mojavensis
Mol. Biol. Evol., February 1, 2004; 21(2): 276 - 285.
[Abstract] [Full Text] [PDF]


Home page
Am. J. Clin. Nutr.Home page
B. N Ames, I. Elson-Schwab, and E. A Silver
High-dose vitamin therapy stimulates variant enzymes with decreased coenzyme binding affinity (increased Km): relevance to genetic disease and polymorphisms
Am. J. Clinical Nutrition, April 1, 2002; 75(4): 616 - 658.
[Abstract] [Full Text] [PDF]


Home page
Hum Mol GenetHome page
M. P. Miller and S. Kumar
Understanding human disease mutations through the use of interspecific genetic variation
Hum. Mol. Genet., October 1, 2001; 10(21): 2319 - 2328.
[Abstract] [Full Text] [PDF]


Home page
BloodHome page
A. Rovira, M. De Angioletti, O. Camacho-Vanegas, D. Liu, V. Rosti, H. F. Gallardo, R. Notaro, M. Sadelain, and L. Luzzatto
Stable in vivo expression of glucose-6-phosphate dehydrogenase (G6PD) and rescue of G6PD deficiency in stem cells by gene transfer
Blood, December 15, 2000; 96(13): 4111 - 4117.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by NOTARO, R.
Right arrow Articles by LUZZATTO, L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by NOTARO, R.
Right arrow Articles by LUZZATTO, L.


HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS