201
Pashkova N, Gakhar L, Winistorfer SC, Yu L, Ramaswamy S, Piper RC. WD40 repeat propellers define a ubiquitin-binding domain that regulates turnover of F box proteins. Mol Cell 2010; 40:433-43. [PMID: 21070969 PMCID: PMC3266742 DOI: 10.1016/j.molcel.2010.10.018] [Citation(s) in RCA: 95] [Impact Index Per Article: 6.3] [Received: 02/25/2010] [Revised: 06/01/2010] [Accepted: 08/18/2010] [Indexed: 11/24/2022]
Abstract
WD40-repeat β-propellers are found in a wide range of proteins involved in distinct biological activities. We define a large subset of WD40 β-propellers as a class of ubiquitin-binding domains. Using the β-propeller from Doa1/Ufd3 as a paradigm, we find the conserved top surface of the Doa1 β-propeller binds the hydrophobic patch of ubiquitin centered on residues I44, L8, and V70. Mutations that disrupt ubiquitin binding abrogate Doa1 function, demonstrating the importance of this interaction. We further demonstrate that WD40 β-propellers from a functionally diverse set of proteins bind ubiquitin in a similar fashion. This set includes members of the F box family of SCF ubiquitin E3 ligase adaptors. Using mutants defective in binding, we find that ubiquitin interaction by the F box protein Cdc4 promotes its autoubiquitination and turnover. Collectively, our results reveal a molecular mechanism that may account for how ubiquitin controls a broad spectrum of cellular activities.
Affiliation(s)
- Lokesh Gakhar
- Carver College of Medicine Protein Crystallography Facility
- Liping Yu
- Carver College of Medicine Protein NMR Facility
- S. Ramaswamy
- Department of Biochemistry University of Iowa, Iowa City, IA 52242, USA
202
Volkamer A, Griewel A, Grombacher T, Rarey M. Analyzing the Topology of Active Sites: On the Prediction of Pockets and Subpockets. J Chem Inf Model 2010; 50:2041-52. [DOI: 10.1021/ci100241y] [Citation(s) in RCA: 122] [Impact Index Per Article: 8.1] [Indexed: 11/28/2022]
Affiliation(s)
- Andrea Volkamer
- Research Group for Computational Molecular Design, Bundesstr. 43, 20146 Hamburg, Germany, and Merck KGaA, Frankfurter Str. 250, 64293 Darmstadt, Germany
- Axel Griewel
- Research Group for Computational Molecular Design, Bundesstr. 43, 20146 Hamburg, Germany, and Merck KGaA, Frankfurter Str. 250, 64293 Darmstadt, Germany
- Thomas Grombacher
- Research Group for Computational Molecular Design, Bundesstr. 43, 20146 Hamburg, Germany, and Merck KGaA, Frankfurter Str. 250, 64293 Darmstadt, Germany
- Matthias Rarey
- Research Group for Computational Molecular Design, Bundesstr. 43, 20146 Hamburg, Germany, and Merck KGaA, Frankfurter Str. 250, 64293 Darmstadt, Germany
203
Lensink MF, Wodak SJ. Blind predictions of protein interfaces by docking calculations in CAPRI. Proteins 2010; 78:3085-95. [DOI: 10.1002/prot.22850] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Indexed: 12/30/2022]
204
Johansson F, Toh H. A comparative study of conservation and variation scores. BMC Bioinformatics 2010; 11:388. [PMID: 20663120 PMCID: PMC2920274 DOI: 10.1186/1471-2105-11-388] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Received: 11/05/2009] [Accepted: 07/21/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Conservation and variation scores are used when evaluating sites in a multiple sequence alignment, in order to identify residues critical for structure or function. A variety of scores are available today but it is not clear how different scores relate to each other. RESULTS We applied 25 conservation and variation scores to alignments from the Catalytic Site Atlas (CSA). We calculated distances among scores based on correlation coefficients, and constructed a dendrogram of the scores by average linking cluster analysis. The cluster analysis showed that most scores fall into one of two groups--substitution matrix based group and frequency based group respectively. We also evaluated the scores' performance in predicting catalytic sites and found that frequency based scores generally perform best. CONCLUSIONS Conservation and variation scores can be classified into mainly two large groups. When using a score to predict catalytic sites, frequency based scores that also consider a background distribution are most successful.
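To make the distinction in this abstract concrete, the following minimal Python sketch contrasts a plain frequency-based score (Shannon entropy of an alignment column) with a frequency-based score that also considers a background distribution (relative entropy). It is an illustration only and is not taken from the paper: the function names, the pseudocount value, the handling of gaps, and the uniform background used in the usage note are all assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column, pseudocount=1e-6):
    """Return amino acid frequencies for one alignment column.

    Gaps and non-standard symbols are ignored (an assumption of this sketch).
    A small pseudocount keeps every frequency strictly positive.
    """
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values())
    denom = total + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts.get(aa, 0) + pseudocount) / denom for aa in AMINO_ACIDS}


def shannon_entropy(column):
    """Frequency-based variation score: low entropy means high conservation."""
    freqs = column_frequencies(column)
    return -sum(p * math.log2(p) for p in freqs.values())


def relative_entropy(column, background):
    """Frequency-based conservation score measured against a background.

    This is the Kullback-Leibler divergence of the column frequencies from
    the background frequencies: higher values mean stronger conservation.
    """
    freqs = column_frequencies(column)
    return sum(p * math.log2(p / background[aa]) for aa, p in freqs.items())
```

As a usage example, with a hypothetical uniform background `{aa: 1 / 20 for aa in AMINO_ACIDS}`, a fully conserved column such as `"AAAA"` scores higher in `relative_entropy` than a variable column such as `"ACDE"`. A realistic background would instead use observed amino acid frequencies, so that conservation of a rare residue counts for more than conservation of a common one.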
Affiliation(s)
- Fredrik Johansson
- Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan.
205
Bell RE, Ben-Tal N. In silico identification of functional protein interfaces. Comp Funct Genomics 2010; 4:420-3. [PMID: 18629079 PMCID: PMC2447364 DOI: 10.1002/cfg.309] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 05/22/2003] [Revised: 06/03/2003] [Accepted: 06/03/2003] [Indexed: 12/02/2022]
Abstract
Proteins perform many of their biological roles through protein–protein, protein–DNA or protein–ligand interfaces. The identification of the amino acids comprising these interfaces often enhances our understanding of the biological function of the proteins. Many methods for the detection of functional interfaces have been developed, and large-scale analyses have provided assessments of their accuracy. Among them are those that consider the size of the protein interface, its amino acid composition and its physicochemical and geometrical properties. Other methods to this effect use statistical potential functions of pairwise interactions, and evolutionary information. The rationale of the evolutionary approach is that functional and structural constraints impose selective pressure; hence, biologically important interfaces often evolve at a slower pace than do other external regions of the protein. Recently, an algorithm, Rate4Site, and a web-server, ConSurf (http://consurf.tau.ac.il/), for the identification of functional interfaces based on the evolutionary relations among homologous proteins as reflected in phylogenetic trees, were developed in our laboratory. The explicit use of the tree topology and branch lengths makes the method remarkably accurate and sensitive. Here we demonstrate its potency in the identification of the functional interfaces of a hypothetical protein, the structure of which was determined as part of the international structural genomics effort. Finally, we propose to combine complementary procedures, in order to enhance the overall performance of methods for the identification of functional interfaces in proteins.
Affiliation(s)
- Rachel E Bell
- Department of Biochemistry, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
206
Guharoy M, Chakrabarti P. Conserved residue clusters at protein-protein interfaces and their use in binding site identification. BMC Bioinformatics 2010; 11:286. [PMID: 20507585 PMCID: PMC2894039 DOI: 10.1186/1471-2105-11-286] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 02/16/2010] [Accepted: 05/27/2010] [Indexed: 12/30/2022]
Abstract
BACKGROUND Biological evolution conserves protein residues that are important for structure and function. Both protein stability and function often require a certain degree of structural co-operativity between spatially neighboring residues and it has previously been shown that conserved residues occur clustered together in protein tertiary structures, enzyme active sites and protein-DNA interfaces. Residues comprising protein interfaces are often more conserved compared to those occurring elsewhere on the protein surface. We investigate the extent to which conserved residues within protein-protein interfaces are clustered together in three-dimensions. RESULTS Out of 121 and 392 interfaces in homodimers and heterocomplexes, 96.7 and 86.7%, respectively, have the conserved positions clustered within the overall interface region. The significance of this clustering was established in comparison to what is seen for the subsets of the same size of randomly selected residues from the interface. Conserved residues occurring in larger interfaces could often be sub-divided into two or more distinct sub-clusters. These structural cluster(s) comprising conserved residues indicate functionally important regions within the protein-protein interface that can be targeted for further structural and energetic analysis by experimental scanning mutagenesis. Almost 60% of experimental hot spot residues (with DeltaDeltaG > 2 kcal/mol) were localized to these conserved residue clusters. An analysis of the residue types that are enriched within these conserved subsets compared to the overall interface showed that hydrophobic and aromatic residues are favored, but charged residues (both positive and negative) are less common. 
The potential use of this method for discriminating binding sites (interfaces) versus random surface patches was explored by comparing the clustering of conserved residues within each of these regions; in about 50% of cases the true interface is ranked among the top 10% of all surface patches. CONCLUSIONS Protein-protein interaction sites are much larger than small molecule binding sites, yet conserved residues are not randomly distributed over the whole interface and are distinctly clustered. The clustered nature of evolutionarily conserved residues within interfaces as compared to those within other surface patches not involved in binding has important implications for the identification of protein-protein binding sites and would have applications in docking studies.
Affiliation(s)
- Mainak Guharoy
- Bioinformatics Centre, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata, India
207
Structure, evolutionary conservation, and conformational dynamics of Homo sapiens fascin-1, an F-actin crosslinking protein. J Mol Biol 2010; 400:589-604. [PMID: 20434460 DOI: 10.1016/j.jmb.2010.04.043] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 03/09/2010] [Revised: 04/21/2010] [Accepted: 04/22/2010] [Indexed: 12/19/2022]
Abstract
Eukaryotes have several highly conserved actin-binding proteins that crosslink filamentous actin into compact ordered bundles present in distinct cytoskeletal processes, including microvilli, stereocilia and filopodia. Fascin is an actin-binding protein that is present predominantly in filopodia, which are believed to play a central role in normal and aberrant cell migration. An important outstanding question regards the molecular basis for the unique localization and functional properties of fascin compared with other actin crosslinking proteins. Here, we present the crystal structure of full-length Homo sapiens fascin-1, and examine its packing, conformational flexibility, and evolutionary sequence conservation. The structure reveals a novel arrangement of four tandem beta-trefoil domains that form a bi-lobed structure with approximate pseudo 2-fold symmetry. Each lobe has internal approximate pseudo 2-fold and pseudo 3-fold symmetry axes that are approximately perpendicular, with beta-hairpin triplets located symmetrically on opposite sides of each lobe that mutational data suggest are actin-binding domains. Sequence conservation analysis confirms the importance of hydrophobic core residues that stabilize the beta-trefoil fold, as well as interfacial residues that are likely to stabilize the overall fascin molecule. Sequence conservation also indicates highly conserved surface patches near the putative actin-binding domains of fascin, which conformational dynamics analysis suggests to be coupled via an allosteric mechanism that might have important functional implications for F-actin crosslinking by fascin.
208
Iwaya N, Kuwahara Y, Fujiwara Y, Goda N, Tenno T, Akiyama K, Mase S, Tochio H, Ikegami T, Shirakawa M, Hiroaki H. A common substrate recognition mode conserved between katanin p60 and VPS4 governs microtubule severing and membrane skeleton reorganization. J Biol Chem 2010; 285:16822-9. [PMID: 20339000 DOI: 10.1074/jbc.m110.108365] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 12/18/2022]
Abstract
Katanin p60 (kp60), a microtubule-severing enzyme, plays a key role in cytoskeletal reorganization during various cellular events in an ATP-dependent manner. We show that a single domain isolated from the N terminus of mouse katanin p60 (kp60-NTD) binds to tubulin. The solution structure of kp60-NTD was determined by NMR. Although their sequence similarities were as low as 20%, the structure of kp60-NTD revealed a striking similarity to those of the microtubule interacting and trafficking (MIT) domains, which adopt anti-parallel three-stranded helix bundle. In particular, the arrangement of helices 2 and 3 is well conserved between kp60-NTD and the MIT domain from Vps4, which is a homologous protein that promotes disassembly of the endosomal sorting complexes required for transport III membrane skeleton complex. Mutation studies revealed that the positively charged surface formed by helices 2 and 3 binds tubulin. This binding mode resembles the interaction between the MIT domain of Vps4 and Vps2/CHMP1a, a component of endosomal sorting complexes required for transport III. Our results show that both the molecular architecture and the binding modes are conserved between two AAA-ATPases, kp60 and Vps4. A common mechanism is evolutionarily conserved between two distinct cellular events, one that drives microtubule severing and the other involving membrane skeletal reorganization.
Affiliation(s)
- Naoko Iwaya
- Department of Molecular Engineering, Graduate School of Engineering, Kyoto University, Kyoto-Daigaku Katsura, Nishikyo-ku, Kyoto 615-8530, Japan
209
Tripathi A, Kellogg GE. A novel and efficient tool for locating and characterizing protein cavities and binding sites. Proteins 2010; 78:825-42. [PMID: 19847777 DOI: 10.1002/prot.22608] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Indexed: 11/06/2022]
Abstract
Systematic investigation of a protein and its binding site characteristics is crucial for designing small molecules that modulate protein functions. However, fundamental uncertainties in binding site interactions and insufficient knowledge of the properties of even well-defined binding pockets can make it difficult to design optimal drugs. Herein, we report the development and implementation of a cavity detection algorithm built with HINT toolkit functions that we are naming Vectorial Identification of Cavity Extents (VICE). This very efficient algorithm is based on geometric criteria applied to simple integer grid maps. In testing, we carried out a systematic investigation on a very diverse data set of proteins and protein-protein/protein-polynucleotide complexes for locating and characterizing the indentations, cavities, pockets, grooves, channels, and surface regions. Additionally, we evaluated a curated data set of unbound proteins for which ligand-bound protein structures are also known; here the VICE algorithm located the actual ligand in the largest cavity in 83% of the cases and in one of the three largest in 90% of the cases. An interactive front-end provides a quick and simple procedure for locating, displaying and manipulating cavities in these structures. Information describing the cavity, including its volume and surface area metrics, and lists of atoms, residues, and/or chains lining the binding pocket, can be easily obtained and analyzed. For example, the relative cross-sectional surface area (to total surface area) of cavity openings in well-enclosed cavities is 0.06 +/- 0.04 and in surface clefts or crevices is 0.25 +/- 0.09. Proteins 2010. (c) 2009 Wiley-Liss, Inc.
Affiliation(s)
- Ashutosh Tripathi
- Department of Medicinal Chemistry and Institute for Structural Biology and Drug Discovery, Virginia Commonwealth University, Richmond, Virginia 23298-0540, USA
210
Chemes LB, Sánchez IE, Smal C, de Prat-Gay G. Targeting mechanism of the retinoblastoma tumor suppressor by a prototypical viral oncoprotein. Structural modularity, intrinsic disorder and phosphorylation of human papillomavirus E7. FEBS J 2010; 277:973-88. [PMID: 20088881 DOI: 10.1111/j.1742-4658.2009.07540.x] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022]
Abstract
DNA tumor viruses ensure genome amplification by hijacking the cellular replication machinery and forcing infected cells to enter the S phase. The retinoblastoma (Rb) protein controls the G1/S checkpoint, and is targeted by several viral oncoproteins, among these the E7 protein from human papillomaviruses (HPVs). A quantitative investigation of the interaction mechanism between the HPV16 E7 protein and the RbAB domain in solution revealed that 90% of the binding energy is determined by the LxCxE motif, with an additional binding determinant (1.0 kcal.mol(-1)) located in the C-terminal domain of E7, establishing a dual-contact mode. The stoichiometry and subnanomolar affinity of E7 indicated that it can bind RbAB as a monomer. The low-risk HPV11 E7 protein bound 2.0 kcal.mol(-1) more weakly than the high-risk HPV16 and HPV18 type counterparts, but the modularity and binding mode were conserved. Phosphorylation at a conserved casein kinase II site in the natively unfolded N-terminal domain of E7 affected the local conformation by increasing the polyproline II content and stabilizing an extended conformation, which allowed for a tighter interaction with the Rb protein. Thus, the E7-RbAB interaction involves multiple motifs within the N-terminal domain of E7 and at least two conserved interaction surfaces in RbAB. We discussed a mechanistic model of the interaction of the Rb protein with a viral target in solution, integrated with structural data and the analysis of other cellular and viral proteins, which provided information about the balance of interactions involving the Rb protein and how these determine the progression into either the normal cell cycle or transformation.
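The free-energy differences quoted in this abstract can be read as fold changes in dissociation constant through the standard relation ΔΔG = RT ln(Kd,weak / Kd,tight). The Python sketch below applies that relation. It is an illustration and not a calculation from the paper: the function name and the temperature of 298.15 K are assumptions, since the abstract does not state the experimental temperature.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def affinity_fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Convert a binding free-energy difference into a Kd ratio.

    Uses ddG = R * T * ln(Kd_weak / Kd_tight), so the returned value is
    Kd_weak / Kd_tight. The default temperature is an assumption.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


if __name__ == "__main__":
    # 2.0 kcal/mol: the reported difference between HPV11 E7 and the high-risk types.
    print(f"{affinity_fold_change(2.0):.1f}-fold weaker binding")
    # 1.0 kcal/mol: the reported contribution of the E7 C-terminal domain.
    print(f"{affinity_fold_change(1.0):.1f}-fold contribution to affinity")
```

Under the assumed temperature, a difference of 2.0 kcal/mol corresponds to a Kd ratio on the order of 30, which gives a sense of scale for the low-risk versus high-risk comparison.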
Affiliation(s)
- Lucía B Chemes
- Protein Structure-Function and Engineering Laboratory, Fundación Instituto Leloir and IIBBA-CONICET, Buenos Aires, Argentina
211
Brylinski M, Skolnick J. Comparison of structure-based and threading-based approaches to protein functional annotation. Proteins 2010; 78:118-34. [PMID: 19731377 PMCID: PMC2804779 DOI: 10.1002/prot.22566] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022]
Abstract
To exploit the vast amount of sequence information provided by the Genomic revolution, the biological function of these sequences must be identified. As a practical matter, this is often accomplished by functional inference. Purely sequence-based approaches, particularly in the "twilight zone" of low sequence similarity levels, are complicated by many factors. For proteins, structure-based techniques aim to overcome these problems; however, most require high-quality crystal structures and suffer from complex and equivocal relations between protein fold and function. In this study, in extensive benchmarking, we consider a number of aspects of structure-based functional annotation: binding pocket detection, molecular function assignment and ligand-based virtual screening. We demonstrate that protein threading driven by a strong sequence profile component greatly improves the quality of purely structure-based functional annotation in the "twilight zone." By detecting evolutionarily related proteins, it considerably reduces the high false positive rate of function inference derived on the basis of global structure similarity alone. Combined evolution/structure-based function assignment emerges as a powerful technique that can make a significant contribution to comprehensive proteome annotation.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318
- Jeffrey Skolnick
- Center for the Study of Systems Biology School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318
212
Ramanathan K, Shanthi V, Sethumadhavan R. In silico identification of catalytic residues in azobenzene reductase from Bacillus subtilis and its docking studies with azo dyes. Interdiscip Sci 2009; 1:290-7. [PMID: 20640807 DOI: 10.1007/s12539-009-0035-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 12/18/2008] [Revised: 09/02/2009] [Accepted: 09/05/2009] [Indexed: 11/28/2022]
Abstract
Prediction of catalytic residues of an enzyme molecule is of great importance for a range of applications including molecular docking, drug design, structural identification and comparison of binding sites. Over the last decades, many studies have been conducted to identify the enzyme catalytic site. However, the catalytic residues of the azobenzene reductase from Bacillus subtilis are still unknown. Investigation shows that under anaerobic conditions, azo dyes can be reduced by this enzyme and other environmental microorganisms to colorless amines, which may be toxic, mutagenic, and carcinogenic to humans and animals. To assess and estimate the toxicity, it is essential to identify the catalytic residues of this enzyme. Few computational methods have been developed to address this issue. In this approach, we identify the catalytic residues of azobenzene reductase from Bacillus subtilis, which were then analyzed in terms of properties including function, conservation, hydrogen bonding, B-factor, solvent accessibility, and flexibility. The results indicate that Lys (83) and Tyr (74) play an important role as catalytic site residues in the azobenzene reductase from Bacillus subtilis. It is hoped that this information will provide a better understanding of the molecular mechanisms involved in catalysis and a heuristic basis for predicting the catalytic residues in enzymes of unknown function. In this study, our approach mainly looks for a better understanding of the biodegradation of the Sudan I, Sudan II, Sudan III and Sudan IV dyes mediated by azobenzene reductase from Bacillus subtilis. Furthermore, information on the catalytic site residues is essential for understanding and altering substrate specificity and for the design of enzyme inhibitors.
Affiliation(s)
- K Ramanathan
- School of Biosciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
213
Binkley J, Karra K, Kirby A, Hosobuchi M, Stone EA, Sidow A. ProPhylER: a curated online resource for protein function and structure based on evolutionary constraint analyses. Genome Res 2009; 20:142-54. [PMID: 19846609 DOI: 10.1101/gr.097121.109] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 12/31/2022]
Abstract
ProPhylER (Protein Phylogeny and Evolutionary Rates) is a next-generation curated proteome resource that uses comparative sequence analysis to predict constraint and mutation impact for eukaryotic proteins. Its purpose is to inform any research program for which protein function and structure are relevant, by the predictive power of evolutionary constraint analyses. ProPhylER currently has nearly 9000 clusters of related proteins, including more than 200,000 sequences. It serves data via two interfaces. The "ProPhylER Interface" displays predictive analyses in sequence space; the "CrystalPainter" maps evolutionary constraints onto solved protein structures. Here we summarize ProPhylER's data content and analysis pipeline, demonstrate the use of ProPhylER's interfaces, and evaluate ProPhylER's unique regional analysis of evolutionary constraint. The high accuracy of ProPhylER's regional analysis complements the high resolution of its single-site analysis to effectively guide and inform structure-function investigations and predict the impact of polymorphisms.
Affiliation(s)
- Jonathan Binkley
- Stanford University School of Medicine, Departments of Pathology and Genetics, Stanford, California 94305, USA
214
Comparing the functional roles of nonconserved sequence positions in homologous transcription repressors: implications for sequence/function analyses. J Mol Biol 2009; 395:785-802. [PMID: 19818797 DOI: 10.1016/j.jmb.2009.10.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 08/03/2009] [Revised: 10/01/2009] [Accepted: 10/02/2009] [Indexed: 11/21/2022]
Abstract
The explosion of protein sequences deduced from genetic code has led to both a problem and a potential resource: Efficient data use requires interpreting the functional impact of sequence change without experimentally characterizing each protein variant. Several groups have hypothesized that interpretation could be aided by analyzing the sequences of naturally occurring homologues. To that end, myriad sequence/function analyses have been developed to predict which conserved, semi-conserved, and nonconserved positions are functionally important. These positions must be discriminated from the nonconserved positions that are functionally silent. However, the assumptions that underlie sequence analyses are based on experimental results that are sparse and usually designed to address different questions. Here, we use three homologues from a test family common to bioinformatics-the LacI/GalR transcription repressors-to test a common assumption: If a position is functionally important for one family member, it has similar importance in all homologues. We generated experimental sequence/function information for each nonconserved position in the 18 amino acids that link the DNA-binding and regulatory domains of three LacI/GalR homologues. We find that the functional importance of each position is preserved among the three linkers, albeit to different degrees. We also find that every linker position contributes to function, which has twofold implications. (1) Since the linker positions range from highly conserved to semi-conserved to nonconserved and contribute to affinity, selectivity, and allosteric response, we assert that sequence/function analyses must identify positions in the LacI/GalR linkers to be qualified as "successful". Many analyses overlook this region since most of the residues do not directly contact ligand. (2) No position in the LacI/GalR linker is functionally silent. 
This finding is inconsistent with another underlying principle of many analyses: Using sequence sets to discriminate important from non-contributing positions obligates silent positions, which denotes that most homologues tolerate a variety of amino acid substitutions at the position without functional change. Instead, additional combinatorial mutants in the LacI/GalR linkers show that particular substitutions can be silent in a context-dependent manner. Thus, specific permutations of sequence change (rather than change at silent positions) would facilitate neutral drift during evolution. Finally, the combinatorial mutants also reveal functional synergy between semi- and nonconserved positions. Such functional relationships would be missed by analyses that rely primarily upon co-evolution.
215
Friedman EJ, Temple BRS, Hicks SN, Sondek J, Jones CD, Jones AM. Prediction of protein-protein interfaces on G-protein beta subunits reveals a novel phospholipase C beta2 binding domain. J Mol Biol 2009; 392:1044-54. [PMID: 19646992 PMCID: PMC2767172 DOI: 10.1016/j.jmb.2009.07.076] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 02/10/2009] [Revised: 07/08/2009] [Accepted: 07/27/2009] [Indexed: 11/25/2022]
Abstract
Gbeta subunits from heterotrimeric G-proteins (guanine nucleotide-binding proteins) directly bind diverse proteins, including effectors and regulators, to modulate a wide array of signaling cascades. These numerous interactions constrained the evolution of the molecular surface of Gbeta. Although mammals contain five Gbeta genes comprising two classes (Gbeta1-like and Gbeta5-like), plants and fungi have a single ortholog, and organisms such as Caenorhabditis elegans and Drosophila melanogaster contain one copy from each class. A limited number of crystal structures of complexes containing Gbeta subunits and complementary biochemical data highlight specific sites within Gbetas needed for protein interactions. It is difficult to determine from these interaction sites what, if any, additional regions of the Gbeta molecular surface comprise interaction interfaces essential to Gbeta's role as a nexus in numerous signaling cascades. We used a comparative evolutionary approach to identify five known and eight previously unknown putative interfaces on the surface of Gbeta. We show that one such novel interface occurs between Gbeta and phospholipase C beta2 (PLC-beta2), a mammalian Gbeta interacting protein. Substitutions of residues within this Gbeta-PLC-beta2 interface reduce the activation of PLC-beta2 by Gbeta1, confirming that our de novo comparative evolutionary approach predicts previously unknown Gbeta-protein interfaces. Similarly, we hypothesize that the seven remaining untested novel regions contribute to putative interfaces for other Gbeta interacting proteins. Finally, this comparative evolutionary approach is suitable for application to any protein involved in a significant number of protein-protein interactions.
Affiliation(s)
- Erin J. Friedman
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599
| | - Brenda R. S. Temple
- The R. L. Juliano Structural Bioinformatics Core Facility, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599-7260
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - Stephanie N. Hicks
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - John Sondek
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599
- Lineberger Comprehensive Cancer Center University of North Carolina School of Medicine, Chapel Hill, NC 27599
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - Corbin D. Jones
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599
- Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599
| | - Alan M. Jones
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599
|
216
|
Pozo-Dengra J, Martínez-Rodríguez S, Contreras LM, Prieto J, Andújar-Sánchez M, Clemente-Jiménez JM, Las Heras-Vázquez FJ, Rodríguez-Vico F, Neira JL. Structure and conformational stability of a tetrameric thermostable N-succinylamino acid racemase. Biopolymers 2009; 91:757-72. [DOI: 10.1002/bip.21226] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
217
|
Thomas J, Ramakrishnan N, Bailey-Kellogg C. Graphical models of protein-protein interaction specificity from correlated mutations and interaction data. Proteins 2009; 76:911-29. [DOI: 10.1002/prot.22398] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
218
|
Tkaczuk KL. Trm13p, the tRNA:Xm4 modification enzyme from Saccharomyces cerevisiae is a member of the Rossmann-fold MTase superfamily: prediction of structure and active site. J Mol Model 2009; 16:599-606. [PMID: 19697067 DOI: 10.1007/s00894-009-0570-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2009] [Accepted: 07/28/2009] [Indexed: 01/09/2023]
Abstract
2'-O-ribose methylation is one of the most common posttranscriptional modifications in RNA. Methylations at different positions are introduced by enzymes from at least two unrelated superfamilies. Recently, a new family of eukaryotic RNA methyltransferases (MTases) has been identified, and its representative from yeast (Yol125w, renamed as Trm13p) has been shown to 2'-O-methylate position 4 of tRNA. Trm13 is conserved in Eukaryota, but exhibits no sequence similarity to other known MTases. Here, I present the results of bioinformatics analysis which suggest that Trm13 is a strongly diverged member of the Rossmann-fold MTase (RFM) superfamily, and therefore is evolutionarily related to 2'-O-MTases such as Trm7 and fibrillarin. However, the character of conserved residues in the predicted active site of the Trm13 family suggests it may use a different mechanism of ribose methylation than its relatives. A molecular model of the Trm13p structure has been constructed and evaluated for potential accuracy using model quality assessment methods. The predicted structure will facilitate experimental analyses of the Trm13p mechanism of action.
|
219
|
Huang CC, Yoshino-Koh K, Tesmer JJG. A surface of the kinase domain critical for the allosteric activation of G protein-coupled receptor kinases. J Biol Chem 2009; 284:17206-17215. [PMID: 19364770 PMCID: PMC2719358 DOI: 10.1074/jbc.m809544200] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2008] [Revised: 03/10/2009] [Indexed: 11/06/2022] Open
Abstract
G protein-coupled receptor (GPCR) kinases (GRKs) phosphorylate activated GPCRs and initiate their desensitization. Many prior studies suggest that activated GPCRs dock to an allosteric site on the GRKs and thereby stimulate kinase activity. The extreme N-terminal region of GRKs is clearly involved in this process, but its role is not understood. Using our recent structure of bovine GRK1 as a guide, we generated mutants of solvent-exposed residues in the GRK1 kinase domain that are conserved among GRKs but not in the extended protein kinase A, G, and C family and evaluated their catalytic activity. Mutation of select residues in strands beta1 and beta3 of the kinase small lobe, alphaD of the kinase large lobe, and the protein kinase A, G, and C kinase C-tail greatly impaired receptor phosphorylation. The most dramatic effect was observed for mutation of an invariant arginine on the beta1-strand (approximately 1000-fold decrease in k(cat)/K(m)). These residues form a continuous surface that is uniquely available in GRKs for protein-protein interactions. Surprisingly, these mutants, as well as a 19-amino acid N-terminal truncation of GRK1, also show decreased catalytic efficiency for peptide substrates, although to a lesser extent than for receptor phosphorylation. Our data suggest that the N-terminal region and the newly identified surface interact and stabilize the closed, active conformation of the kinase domain. Receptor binding is proposed to promote this interaction, thereby enhancing GRK activity.
Affiliation(s)
- Chih-Chin Huang
- From the Life Sciences Institute, Department of Pharmacology, University of Michigan, Ann Arbor, Michigan 48109-2216
| | - Kae Yoshino-Koh
- From the Life Sciences Institute, Department of Pharmacology, University of Michigan, Ann Arbor, Michigan 48109-2216
| | - John J G Tesmer
- From the Life Sciences Institute, Department of Pharmacology, University of Michigan, Ann Arbor, Michigan 48109-2216.
|
220
|
Fischer K, Langendorf CG, Irving JA, Reynolds S, Willis C, Beckham S, Law RHP, Yang S, Bashtannyk-Puhalovich TA, McGowan S, Whisstock JC, Pike RN, Kemp DJ, Buckle AM. Structural mechanisms of inactivation in scabies mite serine protease paralogues. J Mol Biol 2009; 390:635-45. [PMID: 19427318 DOI: 10.1016/j.jmb.2009.04.082] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2009] [Revised: 04/28/2009] [Accepted: 04/30/2009] [Indexed: 10/20/2022]
Abstract
The scabies mite (Sarcoptes scabiei) is a parasite responsible for major morbidity in disadvantaged communities and immuno-compromised patients worldwide. In addition to the physical discomfort caused by the disease, scabies infestations facilitate infection by Streptococcal species via skin lesions, resulting in a high prevalence of rheumatic fever/heart disease in affected communities. The scabies mite produces 33 proteins that are closely related to those in the dust mite group 3 allergen and belong to the S1-like protease family (chymotrypsin-like). However, all but one of these molecules contain mutations in the conserved active-site catalytic triad that are predicted to render them catalytically inactive. These molecules are thus termed scabies mite inactivated protease paralogues (SMIPPs). The precise function of SMIPPs is unclear; however, it has been suggested that these proteins might function by binding and protecting target substrates from cleavage by host immune proteases, thus preventing the host from mounting an effective immune challenge. In order to begin to understand the structural basis for SMIPP function, we solved the crystal structures of SMIPP-S-I1 and SMIPP-S-D1 at 1.85 Å and 2.0 Å resolution, respectively. Both structures adopt the characteristic serine protease fold, albeit with large structural variations over much of the molecule. In both structures, mutations in the catalytic triad together with occlusion of the S1 subsite by a conserved Tyr200 residue are predicted to block substrate ingress. Accordingly, we show that both proteases lack catalytic function. Attempts to restore function (via site-directed mutagenesis of catalytic residues as well as Tyr200) were unsuccessful. Taken together, these data suggest that SMIPPs have lost the ability to bind substrates in a classical "canonical" fashion, and instead have evolved alternative functions in the lifecycle of the scabies mite.
Affiliation(s)
- Katja Fischer
- Scabies Laboratory, Queensland Institute of Medical Research, Brisbane, Australia.
|
221
|
Hemond M, Rothstein TL, Wagner G. Fas apoptosis inhibitory molecule contains a novel beta-sandwich in contact with a partially ordered domain. J Mol Biol 2009; 386:1024-37. [PMID: 19168072 PMCID: PMC2745281 DOI: 10.1016/j.jmb.2009.01.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2008] [Revised: 01/03/2009] [Accepted: 01/06/2009] [Indexed: 11/17/2022]
Abstract
Fas apoptosis inhibitory molecule (FAIM) is a soluble cytosolic protein inhibitor of programmed cell death and is found in organisms throughout the animal kingdom. A short isoform of FAIM is expressed in all tissue types, while an alternatively spliced long isoform is specifically expressed in the brain. Here, the short isoform is shown to consist of two independently folding domains in contact with each other. The NMR solution structure of the C-terminal domain of murine FAIM is solved in isolation and revealed to be a novel protein fold, a noninterleaved seven-stranded beta-sandwich. The structure and sequence reveal several residues that are likely to be involved in functionally significant interactions with the N-terminal domain or other binding partners. Chemical shift perturbation is used to elucidate contacts made between the N-terminal domain and the C-terminal domain.
Affiliation(s)
- Michael Hemond
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Ave., Boston, MA 02115, USA
| | - Thomas L. Rothstein
- Center for Oncology and Cell Biology, The Feinstein Institute for Medical Research, 350 Community Drive, Manhasset, NY 11030, USA
| | - Gerhard Wagner
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Ave., Boston, MA 02115, USA
|
222
|
Engelen S, Trojan LA, Sacquin-Mora S, Lavery R, Carbone A. Joint evolutionary trees: a large-scale method to predict protein interfaces based on sequence sampling. PLoS Comput Biol 2009; 5:e1000267. [PMID: 19165315 PMCID: PMC2613531 DOI: 10.1371/journal.pcbi.1000267] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2008] [Accepted: 12/04/2008] [Indexed: 11/18/2022] Open
Abstract
The Joint Evolutionary Trees (JET) method detects protein interfaces, the core residues involved in the folding process, and residues susceptible to site-directed mutagenesis and relevant to molecular recognition. The approach, based on the Evolutionary Trace (ET) method, introduces a novel way to treat evolutionary information. Families of homologous sequences are analyzed through a Gibbs-like sampling of distance trees to reduce effects of erroneous multiple alignment and impacts of weakly homologous sequences on distance tree construction. The sampling method makes sequence analysis more sensitive to functional and structural importance of individual residues by avoiding effects of the overrepresentation of highly homologous sequences and improves computational efficiency. A carefully designed clustering method is parametrized on the target structure to detect and extend patches on protein surfaces into predicted interaction sites. Clustering takes into account residues' physical-chemical properties as well as conservation. Large-scale application of JET requires the system to be adjustable for different datasets and to guarantee predictions even if the signal is low. Flexibility was achieved by a careful treatment of the number of retrieved sequences, the amino acid distance between sequences, and the selective thresholds for cluster identification. An iterative version of JET (iJET) that guarantees finding the most likely interface residues is proposed as the appropriate tool for large-scale predictions. Tests are carried out on the Huang database of 62 heterodimer, homodimer, and transient complexes and on 265 interfaces belonging to signal transduction proteins, enzymes, inhibitors, antibodies, antigens, and others. A specific set of proteins chosen for their special functional and structural properties illustrate JET behavior on a large variety of interactions covering proteins, ligands, DNA, and RNA. JET is compared at a large scale to ET and to Consurf, Rate4Site, siteFiNDER|3D, and SCORECONS on specific structures. A significant improvement in performance and computational efficiency is shown.
Information obtained on the structure of macromolecular complexes is important for identifying functionally important partners but also for determining how such interactions will be perturbed by natural or engineered site mutations. Hence, to fully understand or control biological processes we need to predict in the most accurate manner protein interfaces for a protein structure, possibly without knowing its partners. Joint Evolutionary Trees (JET) is a method designed to detect very different types of interactions of a protein with another protein, ligands, DNA, and RNA. It uses a carefully designed sampling method, making sequence analysis more sensitive to the functional and structural importance of individual residues, and a clustering method parametrized on the target structure for the detection of patches on protein surfaces and their extension into predicted interaction sites. JET is a large-scale method, highly accurate and potentially applicable to search for protein partners.
Affiliation(s)
- Stefan Engelen
- Génomique Analytique, Université Pierre et Marie Curie-Paris 6, UMR S511, Paris, France
- INSERM, U511, Paris, France
| | - Ladislas A. Trojan
- Génomique Analytique, Université Pierre et Marie Curie-Paris 6, UMR S511, Paris, France
- INSERM, U511, Paris, France
| | | | - Richard Lavery
- Institut de Biologie et Chimie des Protéines, CNRS UMR 5086/IFR 128/Université de Lyon, Lyon, France
| | - Alessandra Carbone
- Génomique Analytique, Université Pierre et Marie Curie-Paris 6, UMR S511, Paris, France
- INSERM, U511, Paris, France
|
223
|
Rajagopalan L, Pereira FA, Lichtarge O, Brownell WE. Identification of functionally important residues/domains in membrane proteins using an evolutionary approach coupled with systematic mutational analysis. Methods Mol Biol 2009; 493:287-97. [PMID: 18839354 PMCID: PMC2673147 DOI: 10.1007/978-1-59745-523-7_17] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/12/2023]
Abstract
Structure-function studies of membrane proteins present a unique challenge to researchers due to the numerous technical difficulties associated with their expression, purification and structural characterization. In the absence of structural information, rational identification of putative functionally important residues/regions is difficult. Phylogenetic relationships could provide valuable information about the functional significance of a particular residue or region of a membrane protein. Evolutionary Trace (ET) analysis is a method developed to utilize this phylogenetic information to predict functional sites in proteins. In this method, residues are ranked according to conservation or divergence through evolution, based on the hypothesis that mutations at key positions should coincide with functional evolutionary divergences. This information can be used as the basis for a systematic mutational analysis of identified residues, leading to the identification of functionally important residues and/or domains in membrane proteins, in the absence of structural information apart from the primary amino acid sequence. This approach is potentially useful in the context of the auditory system, as several key processes in audition involve the action of membrane proteins, many of which are novel and not well characterized structurally or functionally to date.
Affiliation(s)
- Lavanya Rajagopalan
- Bobby R. Alford Department of Otolaryngology- Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Fred A. Pereira
- Bobby R. Alford Department of Otolaryngology- Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA
- Huffington Center on Aging and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - William E. Brownell
- Bobby R. Alford Department of Otolaryngology- Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA
|
224
|
Abstract
The receptor for activated C-kinase (RACK1), a conserved protein implicated in numerous signaling pathways, is a stoichiometric component of eukaryotic ribosomes located on the head of the 40S ribosomal subunit. To test the hypothesis that ribosome association is central to the function of RACK1 in vivo, we determined the 2.1-Å crystal structure of RACK1 from Saccharomyces cerevisiae (Asc1p) and used it to design eight mutant versions of RACK1 to assess roles in ribosome binding and in vivo function. Conserved charged amino acids on one side of the beta-propeller structure were found to confer most of the 40S subunit binding affinity, whereas an adjacent conserved and structured loop had little effect on RACK1-ribosome association. Yeast mutations that confer moderate to strong defects in ribosome binding mimic some phenotypes of a RACK1 deletion strain, including increased sensitivity to drugs affecting cell wall biosynthesis and translation elongation. Furthermore, disruption of RACK1's position at the 40S ribosomal subunit results in the failure of the mRNA binding protein Scp160 to associate with actively translating ribosomes. These results provide the first direct evidence that RACK1 functions from the ribosome, implying a physical link between the eukaryotic ribosome and cell signaling pathways in vivo.
|
225
|
Li N, Sun Z, Jiang F. Prediction of protein-protein binding site by using core interface residue and support vector machine. BMC Bioinformatics 2008; 9:553. [PMID: 19102736 PMCID: PMC2627892 DOI: 10.1186/1471-2105-9-553] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2008] [Accepted: 12/22/2008] [Indexed: 12/04/2022] Open
Abstract
Background: The prediction of protein-protein binding sites can provide structural annotation to the protein interaction data from proteomics studies. This is very important for the biological application of the protein interaction data that is increasing rapidly. Moreover, methods for predicting protein interaction sites can also provide crucial information for improving the speed and accuracy of protein docking methods.
Results: In this work, we describe a binding site prediction method by designing a new residue neighbour profile and by selecting only the core-interface residues for SVM training. The residue neighbour profile includes both the sequential and the spatial neighbour residues of an interface residue, which is a more complete description of the physical and chemical characteristics surrounding the interface residue. The concept of core interface is applied in selecting the interface residues for training the SVM models, which is shown to result in better discrimination between the core interface and other residues. The best SVM model trained was tested on a test set of 50 randomly selected proteins. The sensitivity, specificity, and MCC for the prediction of the core interface residues were 60.6%, 53.4%, and 0.243, respectively. Our prediction results on this test set were compared with three other binding site prediction methods and found to perform better. Furthermore, our method was tested on the 101 unbound proteins from the protein-protein interaction benchmark v2.0. The sensitivity, specificity, and MCC of this test were 57.5%, 32.5%, and 0.168, respectively.
Conclusion: By improving both the descriptions of the interface residues and their surrounding environment and the training strategy, better SVM models were obtained and shown to outperform previous methods. Our tests on the unbound protein structures suggest further improvement is possible.
Affiliation(s)
- Nan Li
- Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing, PR China.
|
226
|
Zhu Z, Tovchigrechko A, Baronova T, Gao Y, Douguet D, O'Toole N, Vakser IA. Large-scale structural modeling of protein complexes at low resolution. J Bioinform Comput Biol 2008; 6:789-810. [PMID: 18763743 DOI: 10.1142/s0219720008003679] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2007] [Revised: 11/20/2007] [Accepted: 01/04/2008] [Indexed: 11/18/2022]
Abstract
Structural aspects of protein-protein interactions provided by large-scale, genome-wide studies are essential for the description of life processes at the molecular level. A methodology is developed that applies the protein docking approach (GRAMM), based on the knowledge of experimentally determined protein-protein structures (DOCKGROUND resource) and properties of intermolecular energy landscapes, to genome-wide systems of protein interactions. The full sequence-to-structure-of-complex modeling pipeline is implemented in the Genome Wide Docking Database (GWIDD) resource. Protein interaction data are imported to GWIDD from external datasets of experimentally determined interaction networks. Essential information is extracted and unified to form the GWIDD database. Structures of individual interacting proteins in the database are retrieved (if available) or modeled, and protein complex structures are predicted by the docking program. All protein sequence, structure, and docking information is conveniently accessible through a Web interface.
Affiliation(s)
- Zhengwei Zhu
- Center for Bioinformatics, The University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA
|
227
|
Punta M, Ofran Y. The rough guide to in silico function prediction, or how to use sequence and structure information to predict protein function. PLoS Comput Biol 2008; 4:e1000160. [PMID: 18974821 PMCID: PMC2518264 DOI: 10.1371/journal.pcbi.1000160] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Affiliation(s)
- Marco Punta
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Columbia University Center for Computational Biology and Bioinformatics (C2B2), New York, New York, United States of America
- Northeast Structural Genomics Consortium (NESG), Columbia University, New York, New York, United States of America
| | - Yanay Ofran
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
|
228
|
Goldenberg O, Erez E, Nimrod G, Ben-Tal N. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic Acids Res 2008; 37:D323-7. [PMID: 18971256 PMCID: PMC2686473 DOI: 10.1093/nar/gkn822] [Citation(s) in RCA: 170] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/
Affiliation(s)
- Ofir Goldenberg
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Israel
|
229
|
Hertzano R, Shalit E, Rzadzinska AK, Dror AA, Song L, Ron U, Tan JT, Shitrit AS, Fuchs H, Hasson T, Ben-Tal N, Sweeney HL, de Angelis MH, Steel KP, Avraham KB. A Myo6 mutation destroys coordination between the myosin heads, revealing new functions of myosin VI in the stereocilia of mammalian inner ear hair cells. PLoS Genet 2008; 4:e1000207. [PMID: 18833301 PMCID: PMC2543112 DOI: 10.1371/journal.pgen.1000207] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2008] [Accepted: 08/25/2008] [Indexed: 11/18/2022] Open
Abstract
Myosin VI, found in organisms from Caenorhabditis elegans to humans, is essential for auditory and vestibular function in mammals, since genetic mutations lead to hearing impairment and vestibular dysfunction in both humans and mice. Here, we show that a missense mutation in this molecular motor in an ENU-generated mouse model, Tailchaser, disrupts myosin VI function. Structural changes in the Tailchaser hair bundles include mislocalization of the kinocilia and branching of stereocilia. Transfection of GFP-labeled myosin VI into epithelial cells and delivery of endocytic vesicles to the early endosome revealed that the mutant phenotype displays disrupted motor function. The actin-activated ATPase rates measured for the D179Y mutation are decreased, and indicate loss of coordination of the myosin VI heads or 'gating' in the dimer form. Proper coordination is required for walking processively along, or anchoring to, actin filaments, and is apparently destroyed by the proximity of the mutation to the nucleotide-binding pocket. This loss of myosin VI function may not allow myosin VI to transport its cargoes appropriately at the base and within the stereocilia, or to anchor the membrane of stereocilia to actin filaments via its cargos, both of which lead to structural changes in the stereocilia of myosin VI-impaired hair cells, and ultimately leading to deafness.
MESH Headings
- Adenosine Triphosphatases/genetics
- Adenosine Triphosphatases/metabolism
- Animals
- Cell Line
- Chromosome Mapping
- Deafness/genetics
- Deafness/metabolism
- Female
- Hair Cells, Auditory, Inner/chemistry
- Hair Cells, Auditory, Inner/metabolism
- Humans
- Male
- Mice
- Mice, Inbred C3H
- Mice, Inbred C57BL
- Mice, Knockout
- Models, Molecular
- Mutation, Missense
- Myosin Heavy Chains/chemistry
- Myosin Heavy Chains/genetics
- Myosin Heavy Chains/metabolism
- Protein Structure, Tertiary
- Protein Transport
- Transport Vesicles/chemistry
- Transport Vesicles/metabolism
Affiliation(s)
- Ronna Hertzano
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
| | - Ella Shalit
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
| | - Agnieszka K. Rzadzinska
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Amiel A. Dror
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
| | - Lin Song
- Department of Physiology, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, United States of America
| | - Uri Ron
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Joshua T. Tan
- Section of Cell and Developmental Biology, University of California San Diego, La Jolla, California, United States of America
| | | | - Helmut Fuchs
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics, Neuherberg, Germany
| | - Tama Hasson
- Section of Cell and Developmental Biology, University of California San Diego, La Jolla, California, United States of America
| | - Nir Ben-Tal
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - H. Lee Sweeney
- Department of Physiology, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, United States of America
| | - Martin Hrabe de Angelis
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics, Neuherberg, Germany
- Technical University of Munich, Munich, Germany
| | - Karen P. Steel
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Karen B. Avraham
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
|
230
|
Abstract
Protein–protein recognition plays an essential role in structure and function. Specific non-covalent interactions stabilize the structure of macromolecular assemblies, exemplified in this review by oligomeric proteins and the capsids of icosahedral viruses. They also allow proteins to form complexes that have a very wide range of stability and lifetimes and are involved in all cellular processes. We present some of the structure-based computational methods that have been developed to characterize the quaternary structure of oligomeric proteins and other molecular assemblies and analyze the properties of the interfaces between the subunits. We compare the size, the chemical and amino acid compositions and the atomic packing of the subunit interfaces of protein–protein complexes, oligomeric proteins, viral capsids and protein–nucleic acid complexes. These biologically significant interfaces are generally close-packed, whereas the non-specific interfaces between molecules in protein crystals are loosely packed, an observation that gives a structural basis to specific recognition. A distinction is made within each interface between a core that contains buried atoms and a solvent accessible rim. The core and the rim differ in their amino acid composition and their conservation in evolution, and the distinction helps to correlate the structural data with the results of site-directed mutagenesis and in vitro studies of self-assembly.
|
231
|
Najmanovich R, Kurbatova N, Thornton J. Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites. Bioinformatics 2008; 24:i105-11. [DOI: 10.1093/bioinformatics/btn263] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
|
232
|
Insights into subunit interactions in the heterotetrameric structure of potato ADP-glucose pyrophosphorylase. Biophys J 2008; 95:3628-39. [PMID: 18641076 DOI: 10.1529/biophysj.107.123042] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022]
Abstract
ADP-glucose pyrophosphorylase, a key allosteric enzyme involved in higher plant starch biosynthesis, is composed of pairs of large (LS) and small subunits (SS). Current evidence indicates that the two subunit types play distinct roles in enzyme function. The LS is mainly involved in allosteric regulation through its interaction with the catalytic SS. Recently the crystal structure of the SS homotetramer has been solved, but no crystal structure of the native heterotetrameric enzyme is currently available. In this study, we first modeled the three-dimensional structure of the LS to construct the heterotetrameric enzyme. Because the enzyme has a 2-fold symmetry, six different dimeric (either up-down or side-by-side) interactions were possible. Molecular dynamics simulations were carried out for each of these possible dimers. Trajectories obtained from molecular dynamics simulations of each dimer were then analyzed by the molecular mechanics/Poisson-Boltzmann surface area method to identify the most favorable dimers, one for up-down and the other for side-by-side. Computational results combined with site-directed mutagenesis and yeast two-hybrid experiments suggested that the most favorable heterotetramer is formed by LS-SS (side-by-side) and LS-SS (up-down). We further determined the order of assembly during the heterotetrameric structure formation. Based on the relative binding free energies, side-by-side LS-SS dimers form first, followed by up-down tetramerization.
|
233
|
Xie BB, Chen XL, Zhang XY, He HL, Zhang YZ, Zhou BC. Predicting protein interaction interfaces from protein sequences: case studies of subtilisin and phycocyanin. Proteins 2008; 71:1461-74. [PMID: 18076046 DOI: 10.1002/prot.21836] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
Abstract
Identification of protein interaction interfaces is very important for understanding the molecular mechanisms underlying biological phenomena. Here, we present a novel method for predicting protein interaction interfaces from sequences by using a PAM matrix (PIFPAM). Sequence alignments for interacting proteins were constructed and parsed into segments using sliding windows. By calculating a distance matrix for each segment, the correlation coefficients between segments were estimated. The interaction interfaces were predicted by extracting highly correlated segment pairs from the correlation map. The predictions achieved an accuracy of 0.41-0.71 for eight intraprotein interaction examples, and 0.07-0.60 for four interprotein interaction examples. Compared with three previously published methods, PIFPAM predicted more contacting site pairs for 11 out of the 12 example proteins, and predicted at least 34% more contacting site pairs for eight of them. The factors affecting the predictions were also analyzed. Since PIFPAM uses only the alignments of the two interacting proteins as input, it is especially useful when no three-dimensional protein structure data are available.
Affiliation(s)
- Bin-Bin Xie
- State Key Lab of Microbial Technology, Shandong University, Jinan 250100, People's Republic of China
|
234
|
Li B, Turuvekere S, Agrawal M, La D, Ramani K, Kihara D. Characterization of local geometry of protein surfaces with the visibility criterion. Proteins 2008; 71:670-83. [PMID: 17975834 DOI: 10.1002/prot.21732] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Indexed: 11/08/2022]
Abstract
Experimentally determined protein tertiary structures are rapidly accumulating in a database, partly due to the structural genomics projects. Included are proteins of unknown function, whose function has not been investigated by experiments and could not be predicted by conventional sequence-based search. Those uncharacterized protein structures highlight the urgent need for computational methods for annotating proteins from tertiary structures, which include function annotation methods through characterizing protein local surfaces. Toward structure-based protein annotation, we have developed the VisGrid algorithm, which uses the visibility criterion to characterize local geometric features of protein surfaces. Unlike existing methods, which are concerned only with identifying pockets that could be potential ligand-binding sites in proteins, VisGrid also aims to identify large protrusions, hollows, and flat regions, which can characterize geometric features of a protein structure. The visibility used in VisGrid is defined as the fraction of visible directions from a target position on a protein surface. A pocket or a hollow is recognized as a cluster of positions with a small visibility. A large protrusion in a protein structure is recognized as a pocket in the negative image of the structure. VisGrid correctly identified 95.0% of ligand-binding sites as one of the three largest pockets in 5616 benchmark proteins. To examine how natural flexibility of proteins affects pocket identification, VisGrid was tested on structures distorted by molecular dynamics simulation. Sensitivity decreased approximately 20% for structures with a root mean square deviation of 2.0 Å to the original crystal structure, but specificity was not much affected. Because of its intuitiveness and simplicity, the visibility criterion will lay the foundation for characterization and function annotation of local shape of proteins.
Affiliation(s)
- Bin Li
- Department of Computer Science, College of Science, Purdue University, West Lafayette, Indiana 47907, USA
|
235
|
Liu ZP, Wu LY, Wang Y, Zhang XS, Chen L. Bridging protein local structures and protein functions. Amino Acids 2008; 35:627-50. [PMID: 18421562 PMCID: PMC7088341 DOI: 10.1007/s00726-008-0088-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 02/21/2008] [Accepted: 03/10/2008] [Indexed: 12/11/2022]
Abstract
One of the major goals of molecular and evolutionary biology is to understand the functions of proteins by extracting functional information from protein sequences, structures and interactions. In this review, we summarize the repertoire of methods currently being applied and report recent progress in the field of in silico annotation of protein function based on the accumulation of vast amounts of sequence and structure data. In particular, we emphasize the newly developed structure-based methods, which are able to identify local structural motifs and reveal their relationship with protein functions. These methods include computational tools to identify the structural motifs and reveal the strong relationship between these pre-computed local structures and protein functions. We also discuss remaining problems and possible directions for this exciting and challenging area.
Affiliation(s)
- Zhi-Ping Liu
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 100080, Beijing, China
|
236
|
Kong L, Ranganathan S. Tandem duplication, circular permutation, molecular adaptation: how Solanaceae resist pests via inhibitors. BMC Bioinformatics 2008; 9 Suppl 1:S22. [PMID: 18315854 PMCID: PMC2259423 DOI: 10.1186/1471-2105-9-s1-s22] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 12/04/2022]
Abstract
Background The Potato type II (Pot II) family of proteinase inhibitors plays critical roles in the defense system of plants from Solanaceae family against pests. To better understand the evolution of this family, we investigated the correlation between sequence and structural repeats within this family and the evolution and molecular adaptation of Pot II genes through computational analysis, using the putative ancestral domain sequence as the basic repeat unit. Results Our analysis discovered the following interesting findings in Pot II family. (1) We classified the structural domains in Pot II family into three types (original repeat domain, circularly permuted domain, the two-chain domain) according to the existence of two linkers between the two domain components, which clearly show the circular permutation relationship between the original repeat domain and circularly permuted domain. (2) The permuted domains appear more stable than original repeat domain, from available structural information. Therefore, we proposed a multiple-repeat sequence is likely to adopt the permuted domain from contiguous sequence segments, with the N- and C-termini forming a single non-contiguous structural domain, linking the bracelet of tandem repeats. (3) The analysis of nonsynonymous/synonymous substitution rates ratio in Pot II domain revealed heterogeneous selective pressures among amino acid sites: the reactive site is under positive Darwinian selection (providing different specificity to target varieties of proteinases) while the cysteine scaffold is under purifying selection (essential for maintaining the fold). (4) For multi-repeat Pot II genes from Nicotiana genus, the proteolytic processing site is under positive Darwinian selection (which may improve the cleavage efficiency). 
Conclusion This paper provides a comprehensive analysis and characterization of the Pot II family, and improves our understanding of the strategies (gene and domain duplication, structural circular permutation and molecular adaptation) that Solanaceae plants use to defend against pathogenic attacks through the evolution of Pot II genes.
Affiliation(s)
- Lesheng Kong
- Computational Biology Group, Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, Singapore 117604.
|
237
|
Abstract
To evaluate the evolutionary constraints placed on viral proteins by the structure and assembly of the capsid, we calculate Shannon entropies in the aligned sequences of 45 polypeptide chains in 32 icosahedral viruses, and relate these entropies to the residue location in the three-dimensional structure of the capsids. Three categories of residues have entropies lower than the chain average implying that they are better conserved than average: residues that are buried within a subunit (the protein core), residues that contain atoms buried at an interface between subunits (the interface core), and residues that contribute to several such interfaces. The interface core is also conserved in homomeric proteins and in transient protein-protein complexes, which have only one interface whereas capsids have many. In capsids, the subunit interfaces implicate most of the polypeptide chain: on average, 66% of the capsid residues are at an interface, 34% at more than one, and 47% at the interface core. Nevertheless, we observe that the degree of residue conservation can vary widely between interfaces within a capsid and between regions within an interface. The interfaces and regions of interfaces that show a low sequence variability are likely to play major roles in the self-assembly of the capsid, with implications on its mechanism that we discuss taking adeno-associated virus as an example.
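The per-position entropy calculation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the toy alignment, the gap handling (gaps are simply ignored) and the function names `column_entropy`, `alignment_entropies` and `below_average_columns` are assumptions made here for illustration only.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [column_entropy(col) for col in zip(*sequences)]


def below_average_columns(entropies):
    """Indices of columns whose entropy is below the chain average."""
    mean = sum(entropies) / len(entropies)
    return [i for i, h in enumerate(entropies) if h < mean]


# Hypothetical four-sequence alignment, used only to exercise the functions.
toy_alignment = ["MKVA", "MKIA", "MRLA", "MKVG"]
toy_entropies = alignment_entropies(toy_alignment)
better_conserved = below_average_columns(toy_entropies)
```

Columns returned by `below_average_columns` correspond to what the abstract calls positions "better conserved than average"; relating them to buried or interface residues would additionally require structural data, which this sketch does not model.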
Affiliation(s)
- Ranjit P Bahadur
- Yeast Structural Genomics, IBBMC Université Paris-Sud, CNRS UMR 8619, 91405-Orsay, France
|
238
|
Thomas J, Ramakrishnan N, Bailey-Kellogg C. Graphical models of residue coupling in protein families. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008; 5:183-197. [PMID: 18451428 DOI: 10.1109/tcbb.2007.70225] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Indexed: 05/26/2023]
Abstract
Many statistical measures and algorithmic techniques have been proposed for studying residue coupling in protein families. Generally speaking, two residue positions are considered coupled if, in the sequence record, some of their amino acid type combinations are significantly more common than others. While the proposed approaches have proven useful in finding and describing coupling, a significant missing component is a formal probabilistic model that explicates and compactly represents the coupling, integrates information about sequence, structure, and function, and supports inferential procedures for analysis, diagnosis, and prediction. We present an approach to learning and using probabilistic graphical models of residue coupling. These models capture significant conservation and coupling constraints observable in a multiply-aligned set of sequences. Our approach can place a structural prior on considered couplings, so that all identified relationships have direct mechanistic explanations. It can also incorporate information about functional classes, and thereby learn a differential graphical model that distinguishes constraints common to all classes from those unique to individual classes. Such differential models separately account for class-specific conservation and family-wide coupling, two different sources of sequence covariation. They are then able to perform interpretable functional classification of new sequences, explaining classification decisions in terms of the underlying conservation and coupling constraints. We apply our approach in studies of both G protein-coupled receptors and PDZ domains, identifying and analyzing family-wide and class-specific constraints, and performing functional classification. The results demonstrate that graphical models of residue coupling provide a powerful tool for uncovering, representing, and utilizing significant sequence-structure-function relationships in protein families.
Affiliation(s)
- John Thomas
- Department of Computer Science, Dartmouth College, Sudikoff Laboratory, Hanover, NH 03755, USA.
|
239
|
An efficient conserved region detection method for multiple protein sequences using principal component analysis and wavelet transform. Pattern Recognit Lett 2008. [DOI: 10.1016/j.patrec.2007.11.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022]
|
240
|
Rodriguez SM, Panjikar S, Van Belle K, Wyns L, Messens J, Loris R. Nonspecific base recognition mediated by water bridges and hydrophobic stacking in ribonuclease I from Escherichia coli. Protein Sci 2008; 17:681-90. [PMID: 18305191 PMCID: PMC2271172 DOI: 10.1110/ps.073420708] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/21/2007] [Revised: 01/21/2008] [Accepted: 01/21/2008] [Indexed: 10/22/2022]
Abstract
The crystal structure of Escherichia coli ribonuclease I (EcRNase I) reveals an RNase T2-type fold consisting of a conserved core of six beta-strands and three alpha-helices. The overall architecture of the catalytic residues is very similar to the plant and fungal RNase T2 family members, but the perimeter surrounding the active site is characterized by structural elements specific for E. coli. In the structure of EcRNase I in complex with a substrate-mimicking decadeoxynucleotide d(CGCGATCGCG), we observe a cytosine bound in the B2 base binding site and mixed binding of thymine and guanine in the B1 base binding site. The active site residues His55, His133, and Glu129 interact with the phosphodiester linkage only through a set of water molecules. Residues forming the B2 base recognition site are well conserved among bacterial homologs and may generate limited base specificity. On the other hand, the B1 binding cleft acquires true base aspecificity by combining hydrophobic van der Waals contacts at its sides with a water-mediated hydrogen-bonding network at the bottom. This B1 base recognition site is highly variable among bacterial sequences and the observed interactions are unique to EcRNaseI and a few close relatives.
Affiliation(s)
- Sergio Martinez Rodriguez
- Laboratorium voor Ultrastructuur, Vrije Universiteit Brussels, Pleinlaan 2, B-1050 Brussels, Belgium
|
241
|
Prediction of functional nonsynonymous single nucleotide polymorphisms in human G-protein-coupled receptors. J Hum Genet 2008; 53:379-389. [PMID: 18299956 DOI: 10.1007/s10038-008-0260-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/16/2007] [Accepted: 01/24/2008] [Indexed: 10/22/2022]
Abstract
G-protein-coupled receptors (GPCRs) are found in a wide range of organisms and are central to a cellular signaling network that regulates many basic physiological processes. GPCRs are the focus of a significant amount of current pharmaceutical research because they play a key role in many diseases. In this paper, we predict the functional nonsynonymous single nucleotide polymorphisms (nsSNPs) in human GPCRs by defining optimal attributes and using a decision tree method. The predictive power of each attribute was evaluated. A subset of sequences with optimal attributes was obtained using the decision tree method combined with a genetic search algorithm. The subset contains both sequence-based and structure-based information, and the information for each subset consists of a conservation score, the location of the mutation, the BLOSUM62 substitution matrix score, as well as the hydrophobicity change, the solvent accessibility, and the buried charge. Seven important rules were derived from the decision tree. A total of 166 functional nsSNPs in human GPCRs from the dbSNP have been predicted using the optimal attributes subset.
|
242
|
Lee TS. Reverse conservation analysis reveals the specificity determining residues of cytochrome P450 family 2 (CYP 2). Evol Bioinform Online 2008; 4:7-16. [PMID: 19204803 PMCID: PMC2614186 DOI: 10.4137/ebo.s291] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
Abstract
The concept of conservation of amino acids is widely used to identify important alignment positions of orthologs. The assumption is that important amino acid residues will be conserved in the protein family during the evolutionary process. For paralog alignment, on the other hand, the opposite concept can be used to identify residues that are responsible for specificity. Assuming that the function-specific or ligand-specific residue positions will have higher diversity since they are under evolutionary pressure to fit the target specificity, these positions will have a lower degree of conservation than other positions in a highly conserved paralog alignment. This study assessed the ability of reverse conservation analysis to identify function-specific and ligand-specific residue positions in closely related paralogs. Reverse conservation analysis of paralog alignments successfully identified all six previously reported substrate recognition sites (SRSs) in cytochrome P450 family 2 (CYP 2). Further analysis of each subfamily identified the specificity-determining residues (SDRs) that have been experimentally found. New potential SDRs were also predicted and await confirmation by further experiments or modeling calculations. This concept may also be applied to identify SDRs in other protein families.
Affiliation(s)
- Tai-Sung Lee
- Consortium for Bioinformatics and Computational Biology, and Department of Chemistry, University of Minnesota, 207 Pleasant St. SE, Minneapolis, MN 55455, USA.
|
243
|
Tong Y, Hota PK, Hamaneh MB, Buck M. Insights into oncogenic mutations of plexin-B1 based on the solution structure of the Rho GTPase binding domain. Structure 2008; 16:246-58. [PMID: 18275816 PMCID: PMC2358926 DOI: 10.1016/j.str.2007.12.012] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Received: 09/14/2007] [Revised: 11/19/2007] [Accepted: 12/06/2007] [Indexed: 10/22/2022]
Abstract
The plexin family of transmembrane receptors is important in axon guidance and angiogenesis, but also in cancer. Recently, plexin-B1 somatic missense mutations were found in both primary tumors and metastases of breast and prostate cancers, with several mutations mapping to the Rho GTPase binding domain (RBD) in the cytoplasmic region of the receptor. Here we present the NMR solution structure of this domain, confirming that the protein has both a ubiquitin-like fold and surface features. Oncogenic mutations T1795A and T1802A are located in a loop region, perturb the average structure locally, and have no effect on Rho GTPase binding affinity. Mutations L1815F and L1815P are located at the Rho GTPase binding site and are associated with a complete loss of binding for Rac1 and Rnd1. Both are found to disturb the conformation of the beta3-beta4 sheet and the orientation of surrounding side chains. Our study suggests that the oncogenic behavior of the mutants can be rationalized with reference to the structure of the RBD of plexin-B1.
Affiliation(s)
- Yufeng Tong
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, 10900 Euclid Avenue, Cleveland, Ohio 44106, USA
|
244
|
Manning JR, Jefferson ER, Barton GJ. The contrasting properties of conservation and correlated phylogeny in protein functional residue prediction. BMC Bioinformatics 2008; 9:51. [PMID: 18221517 PMCID: PMC2267696 DOI: 10.1186/1471-2105-9-51] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Received: 09/05/2007] [Accepted: 01/25/2008] [Indexed: 11/21/2022]
Abstract
Background Amino acids responsible for structure, core function or specificity may be inferred from multiple protein sequence alignments where a limited set of residue types are tolerated. The rise in available protein sequences continues to increase the power of techniques based on this principle. Results A new algorithm, SMERFS, for predicting protein functional sites from multiple sequence alignments was compared to 14 conservation measures and to the MINER algorithm. Validation was performed on an automatically generated dataset of 1457 families derived from the protein interactions database SNAPPI-DB, and a smaller manually curated set of 148 families. The best performing measure overall was Williamson property entropy, with ROC0.1 scores of 0.0087 and 0.0114 for domain and small molecule contact prediction, respectively. The Lancet method performed worse than random on protein-protein interaction site prediction (ROC0.1 score of 0.0008). The SMERFS algorithm gave similar accuracy to the phylogenetic tree-based MINER algorithm but was superior to Williamson in prediction of non-catalytic transient complex interfaces. SMERFS predicts sites that are significantly more solvent accessible compared to Williamson. Conclusion Williamson property entropy is the best performing of the 14 conservation measures examined. The difference in performance of SMERFS relative to Williamson in manually defined complexes was dependent on complex type. The best choice of analysis method is therefore dependent on the system of interest. Additional computation employed by MINER in calculation of phylogenetic trees did not produce improved results over SMERFS. SMERFS performance was improved by use of windows over alignment columns, illustrating the necessity of considering the local environment of positions when assessing their functional significance.
|
245
|
Liu XS, Guo WL. Robustness of the residue conservation score reflecting both frequencies and physicochemistries. Amino Acids 2008; 34:643-52. [PMID: 18175048 DOI: 10.1007/s00726-007-0017-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 10/22/2007] [Accepted: 12/07/2007] [Indexed: 10/22/2022]
Abstract
Measuring residue conservation at aligned positions has many applications in biology. Recently, a new conservation score has been defined. Unlike the previous methods, the new approach considers both residue frequencies and physicochemistries. Specifically, it measures physicochemistries based on BLOSUM matrices disregarding the meaning of the entries in such matrices, which may involve the problem of log-log probability. In this paper we present a conservation measure that also reflects both frequencies and physicochemistries while considering the fact that the entries of BLOSUM matrices are already interpreted as log probability. When the supposed score is applied to 14 protein examples, the results show that these two conservation scores are equivalent aside from the different score ranges. The method is also used to score the functional sites of three protein families. Compared with the widely used entropy-based methods, the resulting scores are more robust and consistent in the sense that the functional sites are much more conserved because of functional constraints.
Affiliation(s)
- X-S Liu
- Institute of Nanoscience, Academy of Frontier Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China.
|
246
|
Maffini M, Denes V, Sonnenschein C, Soto A, Geck P. APRIN is a unique Pds5 paralog with features of a chromatin regulator in hormonal differentiation. J Steroid Biochem Mol Biol 2008; 108:32-43. [PMID: 17997301 PMCID: PMC3966471 DOI: 10.1016/j.jsbmb.2007.05.034] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 03/08/2007] [Accepted: 05/28/2007] [Indexed: 11/26/2022]
Abstract
Activation of steroid receptors results in global changes of gene expression patterns. Recent studies showed that steroid receptors control only a portion of their target genes directly, by promoter binding. The majority of the changes are indirect, through chromatin rearrangements. The mediators that relay the hormonal signals to large-scale chromatin changes are, however, unknown. We report here that APRIN, a novel hormone-induced nuclear phosphoprotein has the characteristics of a chromatin regulator and may link endocrine pathways to chromatin. We showed earlier that APRIN is involved in the hormonal regulation of proliferative arrest in cancer cells. To investigate its function we cloned and characterized APRIN orthologs and performed homology and expression studies. APRIN is a paralog of the cohesin-associated Pds5 gene lineage and arose by gene-duplication in early vertebrates. The conservation and domain differences we found suggest, however, that APRIN acquired novel chromatin-related functions (e.g. the HMG-like domains in APRIN, the hallmarks of chromatin regulators, are absent in the Pds5 family). Our results suggest that in interphase nuclei APRIN localizes in the euchromatin/heterochromatin interface and we also identified its DNA-binding and nuclear import signal domains. The results indicate that APRIN, in addition to its Pds5 similarity, has the features and localization of a hormone-induced chromatin regulator.
Affiliation(s)
- Carlos Sonnenschein
- Department of Anatomy and Cellular Biology, Tufts University School of Medicine, Boston, Massachusetts 02111
- Ana Soto
- Department of Anatomy and Cellular Biology, Tufts University School of Medicine, Boston, Massachusetts 02111
- Peter Geck
- To whom correspondence should be addressed: Peter Geck, M.D., Department of Anatomy and Cellular Biology, Tufts University School of Medicine, 136 Harrison Avenue, Boston, Massachusetts 02111, Tel: (617) 636-2796, Fax: (617) 636-6536, E-mail:
|
247
|
Kakuta M, Nakamura S, Shimizu K. Prediction of Protein-Protein Interaction Sites Using Only Sequence Information and Using Both Sequence and Structural Information. IPSJ Digital Courier 2008. [DOI: 10.2197/ipsjdc.4.217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022]
|
248
|
Predicting Selectivity and Druggability in Drug Discovery. Annual Reports in Computational Chemistry 2008. [DOI: 10.1016/s1574-1400(08)00002-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023]
|
249
|
Sterner B, Singh R, Berger B. Predicting and annotating catalytic residues: an information theoretic approach. J Comput Biol 2007; 14:1058-73. [PMID: 17887954 DOI: 10.1089/cmb.2007.0042] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 01/05/2023]
Abstract
We introduce a computational method to predict and annotate the catalytic residues of a protein using only its sequence information, so that we describe both the residues' sequence locations (prediction) and their specific biochemical roles in the catalyzed reaction (annotation). While knowing the chemistry of an enzyme's catalytic residues is essential to understanding its function, the challenges of prediction and annotation have remained difficult, especially when only the enzyme's sequence and no homologous structures are available. Our sequence-based approach follows the guiding principle that catalytic residues performing the same biochemical function should have similar chemical environments; it detects specific conservation patterns near in sequence to known catalytic residues and accordingly constrains what combination of amino acids can be present near a predicted catalytic residue. We associate with each catalytic residue a short sequence profile and define a Kullback-Leibler (KL) distance measure between these profiles, which, as we show, effectively captures even subtle biochemical variations. We apply the method to the class of glycohydrolase enzymes. This class includes proteins from 96 families with very different sequences and folds, many of which perform important functions. In a cross-validation test, our approach correctly predicts the location of the enzymes' catalytic residues with a sensitivity of 80% at a specificity of 99.4%, and in a separate cross-validation we also correctly annotate the biochemical role of 80% of the catalytic residues. Our results compare favorably to existing methods. Moreover, our method is more broadly applicable because it relies on sequence and not structure information; it may, furthermore, be used in conjunction with structure-based methods.
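As a rough illustration of comparing short sequence profiles with a Kullback-Leibler measure, the following Python sketch builds position-specific frequency profiles from aligned segments and sums a symmetrized KL divergence over positions. The abstract does not specify the exact form of the distance, so the symmetrization, the pseudocount of 1.0, the example segments and the function names are all assumptions made here, not the published method.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(segments, pseudocount=1.0):
    """Per-position amino acid frequencies for equal-length segments.

    The pseudocount (an assumption of this sketch) keeps every probability
    non-zero so that the KL divergence is always finite.
    """
    profile = []
    for position in range(len(segments[0])):
        counts = {aa: pseudocount for aa in ALPHABET}
        for segment in segments:
            counts[segment[position]] += 1.0
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()})
    return profile


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats between two residue distributions."""
    return sum(p[aa] * math.log(p[aa] / q[aa]) for aa in ALPHABET)


def profile_distance(profile_a, profile_b):
    """Symmetrized KL divergence summed over aligned profile positions."""
    return sum(
        kl_divergence(pa, pb) + kl_divergence(pb, pa)
        for pa, pb in zip(profile_a, profile_b)
    )


# Hypothetical segments flanking two different kinds of catalytic residue.
site_one = build_profile(["GEDGS", "GEDGT", "GEDAS"])
site_two = build_profile(["AHKLV", "AHRLV", "SHKLV"])
distance = profile_distance(site_one, site_two)
```

A small distance would suggest similar local sequence environments; any threshold for calling two residues functionally equivalent would have to be calibrated on real data.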
Affiliation(s)
- Beckett Sterner
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
|
250
|
Tian J, Wu N, Guo X, Guo J, Zhang J, Fan Y. Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines. BMC Bioinformatics 2007; 8:450. [PMID: 18005451 PMCID: PMC2216041 DOI: 10.1186/1471-2105-8-450] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Received: 04/26/2007] [Accepted: 11/16/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Human genetic variations primarily result from single nucleotide polymorphisms (SNPs) that occur approximately every 1000 bases in the overall human population. The non-synonymous SNPs (nsSNPs) that lead to amino acid changes in the protein product may account for nearly half of the known genetic variations linked to inherited human diseases. One of the key problems of medical genetics today is to identify nsSNPs that underlie disease-related phenotypes in humans. As such, the development of computational tools that can identify such nsSNPs would enhance our understanding of genetic diseases and help predict disease. RESULTS: We propose a method, named Parepro (Predicting the amino acid replacement probability), to identify nsSNPs having either deleterious or neutral effects on the resulting protein function. Two independent datasets, HumVar and NewHumVar, taken from the PhD-SNP server, were used to train the model and test the robustness of Parepro. Using a 20-fold cross-validation test on the HumVar dataset, Parepro achieved a Matthews correlation coefficient (MCC) of 50% and an overall accuracy (Q2) of 76%, both of which were higher than those obtained by methods such as PolyPhen, SIFT, and HybridMeth. Further analysis on an additional dataset (NewHumVar) using Parepro yielded similar results. CONCLUSION: The performance of Parepro indicates that it is a powerful tool for predicting the effect of nsSNPs on protein function and would be useful for large-scale analysis of genomic nsSNP data.
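For readers unfamiliar with the two metrics quoted in the abstract, the sketch below computes the overall accuracy (Q2) and the Matthews correlation coefficient (MCC) from the four cells of a binary confusion matrix, using their standard definitions. The function names and the example counts are hypothetical and are not taken from the paper's data.

```python
import math


def q2_accuracy(tp, tn, fp, fn):
    """Overall two-state accuracy: fraction of predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1.

    Convention assumed here: return 0.0 when any marginal total is zero,
    because the coefficient is undefined in that case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Illustrative counts only: deleterious = positive class, neutral = negative class.
    tp, tn, fp, fn = 45, 35, 15, 5
    print(f"Q2  = {q2_accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC = {matthews_cc(tp, tn, fp, fn):.3f}")
```

Unlike Q2, the MCC stays informative when the deleterious and neutral classes are unbalanced, which is presumably why the abstract reports both.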
Affiliation(s)
- Jian Tian
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
|