
* Computational Molecular Biology Programme and
Membrane Biology Lab, Institute of Molecular and Cell Biology, Proteos, Singapore
1 Correspondence: Computational Molecular Biology Programme, Institute of Molecular and Cell Biology, 61 Biopolis Dr., Proteos, Singapore (138673). E-mail: leehh{at}imcb.a-star.edu.sg
| ABSTRACT |
|---|
Key Words: PROBE chaperones signaling eukaryotic complexes
| INTRODUCTION |
|---|
| DISCOVERY OF A NOVEL DOMAIN IN PUTATIVE CHAPERONES |
|---|
A PROBE (3) search against the NCBI nonredundant protein database was initiated with the full protein sequence of Saccharomyces cerevisiae Mdm12p. The initial seed alignment model was constructed from proteins retrieved with a transitive search threshold of P = 0.001. This model, optimized by a combination of Gibbs sampling and a genetic algorithm, was used to search for additional members of the motif family. Only sequences that matched the alignment model at an E value below 0.01 were included in subsequent steps of alignment model refinement and database search. After 7 iterations, the algorithm converged, having recovered over 160 protein sequences from at least 12 distinct eukaryotic species.
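The iterate-until-convergence logic described above can be illustrated with a minimal sketch. This is not PROBE itself: the `search` callback stands in for model building plus database scanning, and `TOY_LINKS`, `toy_search`, and the sequence names are hypothetical placeholders. Only the E value cutoff of 0.01 comes from the text.

```python
from typing import Callable, Dict, Set, Tuple


def iterative_search(
    seed: Set[str],
    search: Callable[[Set[str]], Dict[str, float]],
    e_value_cutoff: float = 0.01,
    max_iterations: int = 20,
) -> Tuple[Set[str], int]:
    """Grow a member set until no new sequence passes the E value cutoff."""
    members = set(seed)
    for iteration in range(1, max_iterations + 1):
        # Placeholder for "rebuild the alignment model, then scan the database".
        hits = search(members)
        accepted = {seq for seq, e_value in hits.items() if e_value < e_value_cutoff}
        updated = members | accepted
        if updated == members:
            # Converged: this round added nothing new.
            return members, iteration
        members = updated
    return members, max_iterations


# Hypothetical toy "database": each sequence links to others with an E value.
TOY_LINKS = {
    "A": {"B": 1e-5, "X": 0.5},
    "B": {"C": 1e-3},
    "C": {"D": 0.2},
}


def toy_search(members: Set[str]) -> Dict[str, float]:
    """Return the best (lowest) E value seen for every linked sequence."""
    hits: Dict[str, float] = {}
    for member in members:
        for target, e_value in TOY_LINKS.get(member, {}).items():
            hits[target] = min(e_value, hits.get(target, 1.0))
    return hits


members, n_iter = iterative_search({"A"}, toy_search)
print(sorted(members), n_iter)
```

In this toy run the set grows from A to A, B, C and stops once a round adds nothing, which mirrors how convergence is reported in the text.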
To weed out potential false positive hits identified by the PROBE algorithm, two separate resampling procedures were performed as cross-validations.
1. Jackknife
For each query sequence, related sequences were identified and excluded based on a PROBE transitive BLAST search (depth = 1, P = 0.005). The remaining sequences were used to build an alignment model (1 round of Gibbs sampling and model refinement), which was then used to scan the NRDB. Bonferroni corrections for database size were made for each query sequence as described in ref. 3. The rationale behind the jackknife resampling procedure is that potential false positives will not be retrieved with high confidence when absent from the seed set of sequences used in model building. Few sequences were eliminated as false positives by this procedure, with jackknifed P values ranging between 1e-9 and 6.3e-34. This suggests that the stochastic optimization procedures implemented in PROBE converged on the alignment model that best captured the statistical relationships between the distinct protein families.
2. Repetition of iterative searching using alternate queries
The accuracy of iterative sequence search procedures in identifying novel relationships between disparate protein families is known to depend critically on the choice of query sequence used to initiate the search (4). Consistency across sets of sequences retrieved from distinct starting points is thus an important indicator of the significance of observed relationships. We repeated the PROBE search (parameters as described above) using the SMP-associated regions of two proteins, Saccharomyces cerevisiae Mmm1p and Homo sapiens HT008 (retrieved on the first and fifth iterations, respectively, in the original PROBE search), as seeds. The sets of proteins detected by these separate searches were essentially the same as the set obtained when Saccharomyces cerevisiae Mdm12p was used as the initial query.
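The two cross-validation checks can be sketched as follows, under stated assumptions. The Bonferroni correction is shown in its textbook form (P multiplied by the number of comparisons, capped at 1); the exact procedure of ref. 3 may differ. The database size, the significance level `alpha`, the P values, and the sequence names are all hypothetical, and the Jaccard index is merely one simple way to quantify agreement between hit sets, not a measure taken from the article.

```python
from typing import Dict, Set


def bonferroni(p_value: float, n_tests: int) -> float:
    """Textbook Bonferroni correction: scale P by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


def jackknife_filter(
    held_out_p: Dict[str, float],
    database_size: int,
    alpha: float = 0.01,  # hypothetical significance level
) -> Set[str]:
    """Keep sequences still significant when scored by a model built without them."""
    return {
        seq
        for seq, p_value in held_out_p.items()
        if bonferroni(p_value, database_size) < alpha
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two hit sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical P values for sequences scored while held out of model building.
held_out_p = {"seqA": 1e-20, "seqB": 1e-12, "seqC": 1e-4}
kept = jackknife_filter(held_out_p, database_size=2_000_000)
print(sorted(kept))

# Hypothetical hit sets from searches started with two different queries.
hits_query_1 = {"seqA", "seqB", "seqC"}
hits_query_2 = {"seqA", "seqB", "seqD"}
overlap = jaccard(hits_query_1, hits_query_2)
print(overlap)
```

A sequence such as `seqC` looks significant before correction but is dropped once database size is accounted for, which is the behavior the jackknife relies on to expose false positives.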
Plant synaptotagmin-like proteins as well as putative orthologs of Mdm12 and Mmm1 in various yeasts dominate the list of proteins retrieved from searching the NRDB. In addition, members of two uncharacterized protein families (represented by their Homo sapiens orthologs HT008 and PDZK8) were detected. Sequence similarities between the various proteins were confined to discrete segments of each protein and define a compact globular domain. The identified proteins can be classified into four major families based on the organization of other domains (Fig. 1): the C2 domain synaptotagmin-like, PH domain-containing HT008, PDZK8, and mitochondrial protein families. These characteristics formed the basis for naming the protein module the SMP domain (an acronym for synaptotagmin-like, mitochondrial and lipid binding proteins).
After removal of duplicates and sequence fragments, the distribution of the SMP domain revealed a striking enrichment in plants relative to mammals and other higher eukaryotes of comparable proteome size: at least 10 proteins were detected in Arabidopsis thaliana, double the number in Homo sapiens. An increased repertoire of synaptotagmin-like proteins in plants seems to account for most of this relative expansion. The HT008 protein family is represented across a broad spectrum of eukaryotic species. Intriguingly, two SMP domain families showed species restriction, which suggests cell-type or tissue-specific functional variations. The PDZK8 family was detected only in animals, while the mitochondrial outer membrane proteins Mdm12 and Mmm1 are apparently exclusive to yeasts.
The multiple alignment constructed from the iterative search procedure consists of 6 alignment blocks and reveals a preponderance of polar and hydrophobic residues (see Fig. 1). Secondary structure prediction from this alignment using PHD (5) indicates a mix of β-strand and helical elements spanning between 180 and 300 residues. An asparagine-rich column is prominent in alignment block 1, while loop-inducing glycine and proline positions are found within blocks 2, 5, and 6. An extended loop region (located between alignment blocks 3 and 4 and predicted with high confidence by PHD) distinguishes members of the HT008 family from the other SMP families. The extraordinary length of this loop may imply greater flexibility of three-dimensional structure that is of particular relevance for protein interaction. Several charged residues are prominent at various positions within this loop and may be contact sites for charged surfaces.
SMP domains in each protein were subjected to sequence-structure threading with Threader (6). Statistically significant top-ranking matches to known structural folds are shown in Table 1. Despite the detection of several highly significant matches (with Z-scores above 4.0), there was no consensus between matches to any specific fold. Even at the level of SCOP fold class, there was disagreement between matches that aligned better with β strands (all-β fold class) and those that included the predicted tail-end α helices of the SMP domain (mixed α-β fold class). On the basis of these disparities, the fold predictions are best interpreted only as corroboration of the above secondary structure prediction by PHD (5).
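For readers unfamiliar with Z-scores in fold recognition, the following minimal sketch shows the generic idea: a match is expressed as the number of standard deviations by which its score exceeds the mean over the whole fold library. It assumes that higher raw scores are better and uses the population standard deviation; Threader's own calculation may differ. The fold names and scores are hypothetical, and treating 4.0 as a strict lower bound is an assumption.

```python
from statistics import mean, pstdev
from typing import Dict


def z_scores(raw_scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw threading scores to Z-scores over the whole fold library."""
    values = list(raw_scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    return {fold: (score - mu) / sigma for fold, score in raw_scores.items()}


# Hypothetical library: 19 background folds plus one clearly better match.
raw = {f"fold_{i}": 0.0 for i in range(19)}
raw["fold_hit"] = 20.0

z = z_scores(raw)
significant = [fold for fold, value in z.items() if value > 4.0]
print(significant)
```

A high Z-score only says that a match stands out from the library background; as the text notes, several such matches can still disagree about which fold is correct.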
| BIOLOGICAL SIGNIFICANCE OF THE SMP DOMAIN |
|---|
An interesting parallel is found in the Saccharomyces cerevisiae Tricalbins Tcb1, Tcb2, and Tcb3, components of a complex that is proposed to regulate vacuole morphology via the Pdr1p drug resistance pathway (10). A screen for suppressors of cycloheximide sensitivity induced by Tcb mutants identified RSP5, a ubiquitin ligase essential for receptor-mediated and fluid-phase endocytosis, as an effector of the Tricalbin complex. Analogous to the mitochore, deletions of specific Tricalbins lead to defects in the trafficking of cell surface receptors to the vacuole.
Thus, available evidence on the mitochore and Tricalbin complexes points to the roles of SMP domains in maintaining intact molecular chaperone complexes that direct the transport of target substrates to their respective destinations.
The Pleckstrin homology (PH) domain is a functional module that has been identified in a significant number of proteins involved in signal transduction, post-transcriptional regulation, and cytoskeletal organization (11). The PH domain is part of a large structural family, known as the PH superfold, which includes the phosphotyrosine binding domain (PTB), the Ena/WASP homology domain (WH1), and the Ran binding domain (RanBD). The formation of scaffolds or adaptors is a common mechanistic feature of the superfold (12). While the PTB, WH1, and RanBD domains recognize protein ligands, the major function of the PH domain is to interact with phosphoinositides (13). Members of the HT008 family possess a tandem arrangement of a transmembrane segment, a PH domain, and an SMP domain and are functionally uncharacterized. The putative yeast ortholog Ypr091cp, however, was recovered by phage display using the SH3 domain of the yeast amphiphysin Rvs167p as bait (14). Rvs167p is known to form a complex with Rvs161p that regulates sphingolipid metabolism (15). Thus, the HT008 family may function to transduce signals to the sphingolipid metabolic pathway via the Rvs167-Rvs161 complex.
Little is known about the functions of the PDZK8 family apart from the identification of its human member in an immunoscreen of cDNA libraries using serum antibodies from a fibrosarcoma patient (SEREX) (16). PDZK8/NY-SAR-104 contains an N-terminal SMP domain and a PDZ domain, a module known to localize membrane channels, signaling enzymes, and adhesion molecules through binding of C-terminal tripeptides. PDZ domains are almost invariably found as repeats, and previous studies indicate that these ensembles are critical to activities specific to metazoans. An important example is Drosophila melanogaster InaD, which possesses 5 tandem PDZ domains that are individually responsible for binding distinct proteins, forming a membrane-associated scaffold. The sequential arrangement of PDZ domains is integral to the creation of this interface, allowing aggregation of signal-transducing proteins in a large multiprotein complex essential for photoreceptor function (17). Members of the PDZK8 family, which possess an SMP domain and a single PDZ domain in tandem, are therefore predicted to form similar scaffolds (with the SMP domain functioning as an ad hoc PDZ-like interface in place of additional PDZ domains) that may also involve the C-terminal C1 domain as a third binding interface. The sequence of the C1 domain more closely resembles the C1 domains of small G protein interactors, although binding to phorbol esters or diacylglycerol cannot be ruled out (18).
Strikingly, examination of protein domain architectures reveals that 12 of the 20 proteins shown in Figs. 1 and 2 are members of the synaptotagmin-like family and are composed of combinations of transmembrane segments, SMP domains, and C2 domains. The C2 domains are present as repeats with copy numbers ranging between 2 and 5. Although the C2 domain is generally recognized as a Ca2+-dependent lipid binding module, binding of ligands and substrates ranging from phospholipids and inositol polyphosphates to intracellular proteins has been reported (19). Given the expansion of synaptotagmin-like proteins in plants, differentiation of substrate recognition may be particularly relevant to membrane association. Furthermore, different C2 domains are known to possess non-overlapping functions even among functionally coupled proteins of the same species. The yeast Tricalbins are a salient example: the C2 domains of Tcb2 do not bind Ca2+, while the third C2 domains of Tcb1 and Tcb3 do so with exquisite sensitivity (20). Thus, variations in the number and type of C2 domains among synaptotagmins are likely to represent heterogeneous responses to calcium signaling depending on their target substrates. This in turn may be connected with the type of molecule, tethered by the SMP domain, that is targeted to the plasma membrane.
| CONCLUDING REMARKS |
|---|
| ACKNOWLEDGMENTS |
|---|
Received for publication August 26, 2005. Accepted for publication October 11, 2005.
| REFERENCES |
|---|