(The FASEB Journal. 2006;20:202-206.)
© 2006 FASEB
Diverse membrane-associated proteins contain a novel SMP domain
Ian Lee*,1 and
Wanjin Hong*,
* Computational Molecular Biology Programme and
Membrane Biology Lab, Institute of Molecular and Cell Biology, Proteos, Singapore
1 Correspondence: Computational Molecular Biology Programme, Institute of Molecular and Cell Biology, 61 Biopolis Dr., Proteos, Singapore (138673). E-mail: leehh{at}imcb.a-star.edu.sg
ABSTRACT
We have analyzed the sequence of a mitochondrial integral membrane protein, Mdm12, and found that it forms the prototype for a novel domain, designated the SMP domain, that is common to an extended family of membrane-associated proteins. Comprehensive sequence searches using protein alignment models of SMP proteins were cross-validated by statistical resampling, providing strong support for these relationships. No consensus on 3-dimensional structure was reached upon threading sequences through known folds. SMP proteins are widespread among eukaryotic species, with a particular enrichment in plants and features suggestive of species-specific functional variations. Members of 2 SMP families, the mitochore and tricalbin proteins, are essential components of protein complexes involved in mitochondrial inheritance and receptor endocytosis, while a third SMP protein family, HT008, is associated with the Rvs161-Rvs167 complex, a known regulator of sphingolipid metabolism. In addition, HT008 and PDZK8 SMP proteins possess additional protein-protein interaction domains in domain architectures that are typical of protein scaffolds and adaptors. We therefore predict that the SMP domain is an important link between these distinct membrane-associated proteins and a key regulatory hub for unidentified global regulators.
Lee, I., Hong, W. Diverse membrane-associated proteins contain a novel SMP domain.
Key Words: PROBE chaperones signaling eukaryotic complexes
INTRODUCTION
PROTEIN CHAPERONES are molecular machines that are integral to a plethora of cellular activities. These include the regulation of cellular responses to environmental stress, transport of protein and RNA molecules, and remodeling events on protein complexes involved in signaling, transcription and cell division. Transient associations of protein chaperones with multiple protein substrates via protein domains often form the basis for the organization of dense, interconnected cellular networks (1). We therefore hypothesized that the identification of novel domains in chaperone families would lead to important new linkages between molecular pathways.
DISCOVERY OF A NOVEL DOMAIN IN PUTATIVE CHAPERONES
Despite their functional diversity, tertiary structures across distinct chaperone families are remarkably similar. Primary sequence similarities between members of the same structural fold tend to be sparse, usually limited to short sequence motifs separated by linkers of variable length. Thus, sequence search strategies that explicitly account for these properties have been particularly efficacious in uncovering relationships between otherwise disparate families (2).
A PROBE (3) search against the NCBI nonredundant protein database (NRDB) was initiated with the full protein sequence of Saccharomyces cerevisiae Mdm12p. The initial seed alignment model was constructed from proteins retrieved with a transitive search threshold of P = 0.001. This model, optimized by a combination of Gibbs sampling and a genetic algorithm, was used to search for additional members of the motif family. Only sequences that matched the alignment model at an E-value threshold below 0.01 were included in subsequent steps of alignment model refinement and database search. After 7 iterations, the algorithm converged, having recovered over 160 protein sequences from at least 12 distinct eukaryotic species.
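The iterate-until-convergence logic of such a search can be illustrated with a minimal sketch. This is not the PROBE implementation: the function name `iterative_search`, the `build_model` and `search` callables, and the toy database with its sequence identifiers and E values are all hypothetical and serve only to show how an inclusion threshold (here 0.01, as in the text) and a convergence test interact.

```python
from typing import Callable, Dict, Set, Tuple


def iterative_search(
    seed: Set[str],
    build_model: Callable[[Set[str]], object],
    search: Callable[[object], Dict[str, float]],
    e_threshold: float = 0.01,
    max_iterations: int = 20,
) -> Tuple[Set[str], int]:
    """Alternate model building and database search until no new member is found.

    Returns the final member set and the number of iterations performed.
    """
    members = set(seed)
    for iteration in range(1, max_iterations + 1):
        model = build_model(members)      # hypothetical model-building step
        hits = search(model)              # maps sequence id -> E value
        accepted = {sid for sid, e in hits.items() if e < e_threshold}
        updated = members | accepted
        if updated == members:            # converged: nothing new was added
            return members, iteration
        members = updated
    return members, max_iterations


# Toy stand-ins (assumptions, not real data): each member "detects" a few
# other sequences with an invented E value.
TOY_LINKS = {
    "seqA": {"seqB": 1e-5},
    "seqB": {"seqC": 1e-4, "noise": 0.5},
    "seqC": {"seqD": 1e-3},
}


def toy_build(members: Set[str]) -> frozenset:
    return frozenset(members)


def toy_search(model: frozenset) -> Dict[str, float]:
    hits: Dict[str, float] = {}
    for member in model:
        for target, e_value in TOY_LINKS.get(member, {}).items():
            hits[target] = min(e_value, hits.get(target, 1.0))
    return hits


family, rounds = iterative_search({"seqA"}, toy_build, toy_search)
```

In this toy run the weak hit `noise` never enters the family because its E value stays above the threshold, and the loop stops at the first iteration that adds nothing new.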
To weed out potential false positive hits identified by the PROBE algorithm, two separate resampling procedures were performed as cross-validations.
1. Jackknife
For each query sequence, related sequences were identified and excluded based on a PROBE transitive BLAST search (depth=1, P=0.005). The remaining sequences were used to build an alignment model (1 round of Gibbs sampling and model refinement), which was then used to scan the NRDB. Bonferroni corrections for database size were made for each query sequence as described in ref 3. The rationale behind the jackknife resampling procedure is that potential false positives will not be retrieved with high confidence when absent from the seed set of sequences used in model building. Few sequences were eliminated as false positives by this procedure, with jackknifed P values ranging between 1e-9 and 6.3e-34. This suggests that the stochastic optimization procedures implemented in PROBE converged on the alignment model that best captured the statistical relationships between the distinct protein families.
2. Repetition of iterative searching using alternate queries
The accuracy of iterative sequence search procedures in identifying novel relationships between disparate protein families is known to depend critically on the choice of query sequence used to initiate the search (4). Consistency across sets of sequences retrieved from distinct starting points is thus an important indicator of the significance of observed relationships. We repeated the PROBE search (parameters as described above) using the SMP-associated regions of two proteins, Saccharomyces cerevisiae Mmm1p and Homo sapiens HT008 (retrieved on the first and fifth iterations, respectively, in the original PROBE search), as seeds. The sets of proteins detected by these separate searches were essentially the same as the set obtained when Saccharomyces cerevisiae Mdm12 was used as the initial query.
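Both cross-validation checks reduce to simple arithmetic, sketched below under stated assumptions. The correction shown is the standard Bonferroni adjustment (P multiplied by the number of comparisons, capped at 1); the exact form used in ref 3 may differ. The significance cutoff `alpha`, the database size, and the sequence identifiers are hypothetical. Agreement between hit sets from different seeds is measured here with a Jaccard index, which is our choice of measure and not one stated in the text.

```python
from typing import Dict, Set


def bonferroni(p_value: float, n_tests: int) -> float:
    """Standard Bonferroni adjustment: scale P by the number of comparisons."""
    return min(1.0, p_value * n_tests)


def jackknife_filter(
    jackknifed_p: Dict[str, float],
    database_size: int,
    alpha: float = 0.01,   # assumed cutoff, not taken from the paper
) -> Set[str]:
    """Keep sequences still significant after correcting for database size.

    `jackknifed_p` maps each sequence id to the P value obtained when that
    sequence (and its close relatives) was left out of model building.
    """
    return {
        sid
        for sid, p in jackknifed_p.items()
        if bonferroni(p, database_size) < alpha
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two hit sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical inputs for illustration only.
retained = jackknife_filter({"seqA": 1e-9, "seqB": 1e-4}, database_size=2_000_000)
overlap = jaccard({"seqA", "seqB", "seqC"}, {"seqB", "seqC", "seqD"})
```

A sequence with a jackknifed P value of 1e-9 survives correction against two million database entries, whereas one at 1e-4 does not; a Jaccard index close to 1 across seeds would correspond to the "essentially the same" hit sets described above.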
Plant synaptotagmin-like proteins as well as putative orthologs of Mdm12 and Mmm1 in various yeasts dominate the list of proteins retrieved from searching the NRDB. In addition, members of two uncharacterized protein families (represented by their Homo sapiens orthologs HT008 and PDZK8) were detected. Sequence similarities between the various proteins were confined to discrete segments of each protein and define a compact globular domain. The identified proteins can be classified into four major families based on the organization of other domains (Fig. 1): C2 domain synaptotagmin-like, PH domain-containing HT008, PDZK8 and mitochondrial protein families. These characteristics formed the basis for naming the protein module the SMP domain (an acronym for synaptotagmin-like, mitochondrial and lipid binding proteins).

Figure 1. Multiple alignment of SMP domains, consisting of 6 alignment blocks generated by PROBE (3). Numbers in parentheses represent residues not shown between alignment blocks. PROBE jackknifed P values are shown after the alignment blocks. The secondary structure predicted with PHD (5) is shown below the alignment (H denotes helix, L denotes loop residues). Sequences are denoted by protein, gene or GenBank locus identifiers followed by species abbreviations, GenBank protein ids and residue limits. CHROMA (21) was used to label aligned columns according to an 80% consensus sequence. Species abbreviations: Mm, Mus musculus; Hs, Homo sapiens; Rn, Rattus norvegicus; Dm, Drosophila melanogaster; Ag, Anopheles gambiae; Ce, Caenorhabditis elegans; Cb, Caenorhabditis briggsae; Os, Oryza sativa; At, Arabidopsis thaliana; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe. Sequences are grouped based on additional domains detected by SMART (7).
After removal of duplicates and sequence fragments, the distribution of the SMP domain revealed a striking enrichment in plants relative to mammals and other higher eukaryotes of comparable proteome size: at least 10 proteins were detected in Arabidopsis thaliana, double the number in Homo sapiens. An increased repertoire of synaptotagmin-like proteins in plants seems to account for the majority of this relative expansion. The HT008 protein family is represented across a broad spectrum of eukaryotic species. Intriguingly, 2 SMP domain families showed species restriction, which suggests cell-type or tissue-specific functional variations. The PDZK8 family was detected only in animals, while the mitochondrial outer membrane proteins Mdm12 and Mmm1 are apparently exclusive to yeasts.
The multiple alignment constructed from the iterative search procedure consists of 6 alignment blocks and reveals a preponderance of polar and hydrophobic residues (see Fig. 1). Prediction of the secondary structure from this alignment using PHD (5) indicates a mix of β-strand and helical elements, with the domain extending between 180 and 300 residues in length. An asparagine-rich column is prominent in alignment block 1, while loop-inducing glycine and proline positions are found within blocks 2, 5, and 6. An extended loop region (located between alignment blocks 3 and 4 and predicted with high confidence by PHD) distinguishes members of the HT008 family from the other SMP families. The extraordinary length of this loop may imply greater flexibility of 3-dimensional structure that is of particular relevance for protein interaction. Several charged residues are prominent at various positions within this loop and may be contact sites for charged surfaces.
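The idea of labeling alignment columns by an 80% consensus, as done for Fig. 1, can be illustrated with a simplified sketch. It assumes a plain per-residue majority rule and does not reproduce CHROMA, which also recognizes residue classes; the function name `consensus` and the toy alignment are hypothetical and are not taken from the SMP alignment.

```python
from collections import Counter
from typing import List


def consensus(alignment: List[str], threshold: float = 0.8, gap: str = "-") -> str:
    """Return one symbol per column: the majority residue if it reaches the
    threshold fraction of all sequences, otherwise '.'."""
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    labels = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        if not residues:
            labels.append(".")
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Gaps count against the consensus because we divide by all rows.
        labels.append(residue if count / len(column) >= threshold else ".")
    return "".join(labels)


# Toy alignment block (invented): column 1 is an invariant asparagine,
# column 5 has no residue reaching 80%.
toy_block = [
    "NGLP-",
    "NGLPA",
    "NGIPA",
    "NGLPS",
    "NALPT",
]
labels = consensus(toy_block)
```

For the toy block the result is `NGLP.`: the first four columns each have a residue present in at least 4 of 5 sequences, while the last column is too variable to be labeled.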
SMP domains in each protein were subjected to sequence-structure threading with Threader (6). Top-ranking matches to known structural folds that were statistically significant are shown in Table 1. Despite the detection of several highly significant matches (with Z-scores > 4.0), there was no consensus between matches to any specific fold. Even at the level of SCOP fold class, there was disagreement between matches that better aligned with β strands (all-β fold class) and those that included the predicted tail-end α helices of the SMP domain (mixed α-β fold class). On the basis of these disparities, the fold predictions are best interpreted only as a corroboration of the above secondary structure prediction by PHD (5).
BIOLOGICAL SIGNIFICANCE OF THE SMP DOMAIN
A search of domain repositories with SMART (7) reveals that at least 20 distinct modular architectures are represented among the identified SMP proteins (see Fig. 2), the majority of which are defined by the presence of putative transmembrane segments. Yeast Mmm1p is an integral mitochondrial membrane protein, which forms a three-protein complex with Mdm12p and Mdm10p termed the mitochore. The Mmm1p membrane anchor is necessary for straddling both the mitochondrial outer and inner membranes. Its C-terminal end (which harbors the SMP domain) remains exposed in the cytoplasm and is essential for maintenance of mitochondrial morphology (8). Recent studies suggest that an intact mitochore is necessary for linking mitochondria to the actin cytoskeleton in yeasts, using the ARP2/3 complex to mediate control of mitochondrial motility, inheritance and morphology (9). Consistent with this, deletion of Mmm1p or Mdm12p leads to mislocalization of the other, producing phenotypes indicative of dysfunction in mitochondrial inheritance. Thus, the Mmm1p SMP domain is likely to form the link with other mitochore components, coupling mitochondria to the actin segregation machinery.
An interesting parallel is found in the Saccharomyces cerevisiae Tricalbins Tcb1, Tcb2, and Tcb3, components of a complex that is proposed to regulate vacuole morphology via the Pdr1p drug resistance pathway (10). A screen for suppressors of cycloheximide sensitivity induced by Tcb mutants identified RSP5, a ubiquitin-conjugating ligase essential for receptor-mediated and fluid-phase endocytosis, as an effector of the Tricalbin complex. Analogous to the mitochore, deletions of specific Tricalbins lead to defects in the trafficking of cell surface receptors to the vacuole.
Thus, available evidence on the mitochore and Tricalbin complexes points to the roles of SMP domains in maintaining intact molecular chaperone complexes that direct the transport of target substrates to their respective destinations.
The pleckstrin homology (PH) domain is a functional module that has been identified in a significant number of proteins involved in signal transduction, post-transcriptional regulation and cytoskeletal organization (11). The PH domain is part of a large structural family, known as the PH superfold, which includes the phosphotyrosine binding domain (PTB), the Ena/WASP homology domain (WH1) and Ran binding domain (RanBD). The formation of scaffolds or adaptors is a common mechanistic feature of the superfold (12). While the PTB, WH1 and RanBD domains recognize protein ligands, the major function of the PH domain is to interact with phosphoinositides (13). Members of the HT008 family possess a tandem transmembrane segment, PH domain and SMP domain arrangement and are functionally uncharacterized. The putative yeast ortholog Ypr091cp, however, was recovered by phage display using the SH3 domain of the yeast amphiphysin Rvs167p as bait (14). Rvs167p is known to form a complex with Rvs161p that regulates sphingolipid metabolism (15). Thus, the HT008 family may function to transduce signals to the sphingolipid metabolic pathway via the Rvs167-Rvs161 complex.
Little is known about the functions of the PDZK8 family apart from the identification of its human member in an immunoscreen of cDNA libraries using serum antibodies from a fibrosarcoma patient (SEREX) (16). PDZK8/NY-SAR-104 contains an N-terminal SMP domain and a PDZ domain, a module known to localize membrane channels, signaling enzymes and adhesion molecules through binding of C-terminal tripeptides. PDZ domains are almost invariably found as repeats, and previous studies indicate that these ensembles are critical to activities specific to metazoans. An important example is Drosophila melanogaster InaD, which possesses 5 tandem PDZ domains that are individually responsible for binding distinct proteins, forming a membrane-associated scaffold. The sequential arrangement of PDZ domains is integral to the creation of this interface, allowing aggregation of signal-transducing proteins in a large multiprotein complex essential for photoreceptor function (17). Members of the PDZK8 family, which possess an SMP domain and a single PDZ domain in tandem, are therefore predicted to form similar scaffolds (with the SMP domain functioning as an ad hoc PDZ-like interface in place of additional PDZ domains) that may also involve the C-terminal C1 domain as a third binding interface. The sequence of the C1 domain more closely resembles the C1 domains of small G protein interactors, although binding to phorbol esters or diacylglycerol cannot be ruled out (18).
Strikingly, examination of protein domain architectures reveals that 12 of the 20 proteins shown in Figs. 1 and 2 are members of the synaptotagmin-like family and are composed of combinations of transmembrane segments, SMP and C2 domains. The C2 domains are present as repeats with copy numbers ranging between 2 and 5. Although the C2 domain is generally recognized as a Ca2+-dependent lipid binding module, binding of ligands and substrates ranging from phospholipids and inositol polyphosphates to intracellular proteins has been reported (19). Given the expansion of synaptotagmin-like proteins in plants, differentiation of substrate recognition may be particularly relevant to membrane association. Furthermore, different C2 domains are known to possess non-overlapping functions even among functionally coupled proteins of the same species. The yeast Tricalbins are a salient example of this in that the C2 domains of Tcb2 do not bind Ca2+ while the third C2 domains of Tcb1 and Tcb3 do so with exquisite sensitivity (20). Thus, variations in the number and type of C2 domains among synaptotagmins are likely to represent heterogeneous responses to calcium signaling depending on their target substrates. This in turn may be connected with the type of molecule, tethered by the SMP domain, being targeted to the plasma membrane.
CONCLUDING REMARKS
Taken together, these observations indicate that the SMP domain is associated with a host of eukaryotic membrane proteins that bear the hallmarks of molecular chaperones or protein complexes. A significant number of these proteins, such as the mitochore proteins and Tricalbins, are known to either regulate or affect the transport of protein molecules or other complexes, while themselves forming complexes. Others, such as the HT008 and PDZK8 families, possess other functional domains that exist together with the SMP domain, forming domain architectures that are ideally suited to function as adaptors or scaffolds. The SMP domain therefore represents a novel link between the chaperones of diverse biochemical pathways, pointing tantalizingly to an as yet undiscovered common mechanism of regulation.
ACKNOWLEDGMENTS
The authors wish to express their gratitude for the generous provision of computational resources by the Bioinformatics Institute (BII) that allowed this work.
Received for publication August 26, 2005.
Accepted for publication October 11, 2005.
REFERENCES
1. Soti, C., Pal, C., Papp, B., Csermely, P. (2005) Molecular chaperones as regulatory elements of cellular networks. Curr. Opin. Cell Biol. 17, 210-215
2. Neuwald, A. F., Aravind, L., Spouge, J. L., Koonin, E. V. (1999) AAA+: a class of chaperone-like ATPases associated with the assembly, operation and disassembly of protein complexes. Genome Res. 9, 27-43
3. Neuwald, A. F., Liu, J. S., Lipman, D. J., Lawrence, C. E. (1997) Extracting protein alignment models from the sequence database. Nucleic Acids Res. 25, 1665-1673
4. Aravind, L., Koonin, E. V. (1999) Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J. Mol. Biol. 287, 1023-1040
5. Rost, B. (1996) PHD: predicting one-dimensional protein structure by profile based neural networks. Methods Enzymol. 266, 525-539
6. Jones, D. T., Miller, R. T., Thornton, J. M. (1995) Successful protein fold recognition by optimal sequence threading validated by rigorous blind testing. Proteins 23, 387-397
7. Schultz, J., Milpetz, F., Bork, P., Ponting, C. P. (1998) SMART, a simple modular architecture research tool: Identification of signaling domains. Proc. Natl. Acad. Sci. USA 95, 5857-5864
8. Kondo-Okamoto, N., Shaw, J., Okamoto, K. (2003) Mmm1p spans the outer and inner mitochondrial membranes and contains distinct domains for targeting and foci formation. J. Biol. Chem. 278, 48997-49005
9. Boldogh, I. R., Nowakowski, D. W., Yang, H. C., Chung, H., Karmon, S., Royes, P., Pon, L. A. (2003) A protein complex containing Mdm10p, Mdm12p, and Mmm1p links mitochondrial membranes and DNA to the cytoskeleton-based segregation machinery. Mol. Biol. Cell 14, 4618-4627
10. Creutz, C. E., Snyder, S. L., Schutz, T. A. (2004) Characterization of the yeast tricalbins: membrane-bound multi-C2-domain proteins that form complexes involved in membrane trafficking. Cell. Mol. Life Sci. 61, 1208-1220
11. Rebecchi, M. J., Scarlata, S. (1998) Pleckstrin homology domains: a common fold with diverse functions. Annu. Rev. Biophys. Biomol. Struct. 27, 503-528
12. Blomberg, N., Baraldi, E., Nilges, M., Saraste, M. (1999) The PH superfold: a structural scaffold for multiple functions. Trends Biochem. Sci. 24, 441-445
13. Lemmon, M. A., Ferguson, K. M. (2000) Signal-dependent membrane targeting by pleckstrin homology domains. Biochem. J. 350, 1-18
14. Landgraf, C., Panni, S., Montecchi-Palazzi, L., Castagnoli, L., Schneider-Mergener, J., Volkmer-Engert, R., Cesareni, G. (2004) Protein interaction networks by proteome peptide scanning. PLoS Biol. 2, 94-103
15. Germann, M., Swain, E., Bergman, L., Nickels, J. T., Jr. (2005) Characterizing the sphingolipid signaling pathway that remediates defects associated with loss of the yeast Amphiphysin-like orthologs, Rvs161p and Rvs167p. J. Biol. Chem. 280, 4270-4278
16. Lee, S. Y., Obata, Y., Yoshida, M., Stockert, E., Williamson, B., Jungbluth, A. A., Chen, Y. T., Old, L. J., Scanlan, M. J. (2003) Immunomic analysis of human sarcoma. Proc. Natl. Acad. Sci. USA 100, 2651-2656
17. Ranganathan, R., Ross, E. M. (1997) PDZ domain proteins: scaffolds for signaling complexes. Curr. Biol. 7, R770-R773
18. Hurley, J. H., Newton, A. C., Parker, P. J., Blumberg, P. M., Nishizuka, Y. (1997) Taxonomy and function of C1 protein kinase C homology domains. Protein Sci. 6, 477-480
19. Nalefski, E. A., Falke, J. J. (1996) The C2 domain calcium-binding motif: Structural and functional diversity. Protein Sci. 5, 2375-2390
20. Schultz, T., Creutz, C. (2004) The Tricalbin C2 domains: Lipid-binding properties of a novel, synaptotagmin-like yeast protein family. Biochemistry 43, 3987-3995
21. Goodstadt, L., Ponting, C. P. (2001) CHROMA: consensus-based coloring of multiple alignments for publication. Bioinformatics 17, 845-846