1
Moldenhauer HJ, Tammen K, Meredith AL. Structural mapping of patient-associated KCNMA1 gene variants. Biophys J 2024; 123:1984-2000. [PMID: 38042986 PMCID: PMC11309989 DOI: 10.1016/j.bpj.2023.11.3404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 11/30/2023] [Accepted: 11/30/2023] [Indexed: 12/04/2023]
Abstract
KCNMA1-linked channelopathy is a neurological disorder characterized by seizures, motor abnormalities, and neurodevelopmental disabilities. The disease mechanisms are predicted to result from alterations in KCNMA1-encoded BK K+ channel activity; however, only a subset of the patient-associated variants have been functionally studied. The localization of these variants within the tertiary structure or evaluation by pathogenicity algorithms has not been systematically assessed. In this study, 82 nonsynonymous patient-associated KCNMA1 variants were mapped within the BK channel protein. Fifty-three variants localized within cryoelectron microscopy-resolved structures, including 21 classified as either gain of function (GOF) or loss of function (LOF) in BK channel activity. Clusters of LOF variants were identified in the pore, the AC region (RCK1), and near the Ca2+ bowl (RCK2), overlapping with sites of pharmacological or endogenous modulation. However, no clustering was found for GOF variants. To further understand variants of uncertain significance (VUSs), assessments by multiple standard pathogenicity algorithms were compared, and new thresholds for sensitivity and specificity were established from confirmed GOF and LOF variants. An ensemble algorithm was constructed (KCNMA1 meta score (KMS)), consisting of a weighted summation of this trained dataset combined with a structural component derived from the Ca2+-bound and unbound BK channels. KMS assessment differed from the highest-performing individual algorithm (REVEL) at 10 VUS residues, and a subset were studied further by electrophysiology in HEK293 cells. M578T, E656A, and D965V (KMS+;REVEL-) were confirmed to alter BK channel properties in voltage-clamp recordings, and D800Y (KMS-;REVEL+) was assessed as benign under the test conditions. However, KMS failed to accurately assess K457E. 
These combined results reveal the distribution of potentially disease-causing KCNMA1 variants within BK channel functional domains and pathogenicity evaluation for VUSs, suggesting strategies for improving channel-level predictions in future studies by building on ensemble algorithms such as KMS.
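The weighted-summation idea behind an ensemble score such as KMS can be illustrated with a minimal sketch. Everything below is a hypothetical placeholder chosen for illustration: the algorithm names other than REVEL, the weights, the structural term, and the 0.5 threshold are assumptions, not the values or the implementation used in the study.

```python
from typing import Dict

def meta_score(scores: Dict[str, float], weights: Dict[str, float],
               structural: float, structural_weight: float) -> float:
    """Weighted sum of per-algorithm scores plus a structural term,
    normalized by the total weight so the result stays in [0, 1]."""
    total_weight = sum(weights[name] for name in scores) + structural_weight
    weighted = sum(weights[name] * value for name, value in scores.items())
    return (weighted + structural_weight * structural) / total_weight

# Hypothetical inputs, each assumed to be pre-scaled to [0, 1].
example_scores = {"REVEL": 0.40, "algo_b": 0.80, "algo_c": 0.90}
example_weights = {"REVEL": 2.0, "algo_b": 1.0, "algo_c": 1.0}

score = meta_score(example_scores, example_weights,
                   structural=1.0, structural_weight=1.0)
predicted_pathogenic = score >= 0.5  # hypothetical decision threshold
```

In this toy case a variant that a single algorithm scores below threshold can still be flagged once the other components are combined, which is the kind of disagreement (KMS+;REVEL-) the abstract describes.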
Affiliation(s)
- Hans J Moldenhauer
- Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
- Kelly Tammen
- Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
- Andrea L Meredith
- Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland.
2
Moldenhauer HJ, Tammen K, Meredith AL. Structural mapping of patient-associated KCNMA1 gene variants. bioRxiv 2023:2023.07.27.550850. [PMID: 37546746 PMCID: PMC10402178 DOI: 10.1101/2023.07.27.550850] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/08/2023]
Abstract
KCNMA1-linked channelopathy is a neurological disorder characterized by seizures, motor abnormalities, and neurodevelopmental disabilities. The disease mechanisms are predicted to result from alterations in KCNMA1-encoded BK K+ channel activity; however, only a subset of the patient-associated variants have been functionally studied. The localization of these variants within the tertiary structure or evaluation by pathogenicity algorithms has not been systematically assessed. In this study, 82 nonsynonymous patient-associated KCNMA1 variants were mapped within the BK channel protein. Fifty-three variants localized within cryo-EM resolved structures, including 21 classified as either gain-of-function (GOF) or loss-of-function (LOF) in BK channel activity. Clusters of LOF variants were identified in the pore, the AC region (RCK1), and near the Ca2+ bowl (RCK2), overlapping with sites of pharmacological or endogenous modulation. However, no clustering was found for GOF variants. To further understand variants of uncertain significance (VUS), assessments by multiple standard pathogenicity algorithms were compared, and new thresholds for sensitivity and specificity were established from confirmed GOF and LOF variants. An ensemble algorithm was constructed (KCNMA1 Meta Score, KMS), consisting of a weighted summation of this trained dataset combined with a structural component derived from the Ca2+-bound and unbound BK channels. KMS assessment differed from the highest performing individual algorithm (REVEL) at 10 VUS residues, and a subset were studied further by electrophysiology in HEK293 cells. M578T, E656A, and D965V (KMS+;REVEL-) were confirmed to alter BK channel properties in voltage-clamp recordings, and D800Y (KMS-;REVEL+) was assessed as benign under the test conditions. However, KMS failed to accurately assess K457E. 
These combined results reveal the distribution of potentially disease-causing KCNMA1 variants within BK channel functional domains and pathogenicity evaluation for VUS, suggesting strategies for improving channel-level predictions in future studies by building on ensemble algorithms such as KMS.
3
Mortensen JC, Damjanovic J, Miao J, Hui T, Lin Y. A backbone-dependent rotamer library with high (ϕ, ψ) coverage using metadynamics simulations. Protein Sci 2022; 31:e4491. [PMID: 36327064 PMCID: PMC9679973 DOI: 10.1002/pro.4491] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/13/2022] [Revised: 10/26/2022] [Accepted: 10/28/2022] [Indexed: 12/06/2023]
Abstract
Backbone-dependent rotamer libraries are commonly used to assign the side chain dihedral angles of amino acids when modeling protein structures. Most rotamer libraries are created by curating protein crystal structure data and using various methods to extrapolate the existing data to cover all possible backbone conformations. However, these rotamer libraries may not be suitable for modeling the structures of cyclic peptides and other constrained peptides because these molecules frequently sample backbone conformations rarely seen in the crystal structures of linear proteins. To provide backbone-dependent side chain information beyond the α-helix, β-sheet, and PPII regions, we used explicit-solvent metadynamics simulations of model dipeptides to create a new rotamer library that has high coverage in the (ϕ, ψ) space. Furthermore, this approach can be applied to build high-coverage rotamer libraries for noncanonical amino acids. The resulting Metadynamics of Dipeptides for Rotamer Distribution (MEDFORD) rotamer library predicts the side chain conformations of high-resolution protein crystal structures with similar accuracy (~80%) to a state-of-the-art rotamer library. Our ability to test the accuracy of MEDFORD at predicting the side chain dihedral angles of amino acids in noncanonical backbone conformation is restricted by the limited structural data available for cyclic peptides. For the cyclic peptide data that are currently available, MEDFORD and the state-of-the-art rotamer library perform comparably. However, the two rotamer libraries indeed make different rotamer predictions in noncanonical (ϕ, ψ) regions. For noncanonical amino acids, the MEDFORD rotamer library predicts the χ1 values with approximately 75% accuracy.
Affiliation(s)
- Jiayuan Miao
- Department of Chemistry, Tufts University, Medford, Massachusetts, USA
- Tiffani Hui
- Department of Chemistry, Tufts University, Medford, Massachusetts, USA
- Yu‐Shan Lin
- Department of Chemistry, Tufts University, Medford, Massachusetts, USA
4
Muhammed MT, Aki-Yalcin E. Homology modeling in drug discovery: Overview, current applications, and future perspectives. Chem Biol Drug Des 2018; 93:12-20. [PMID: 30187647 DOI: 10.1111/cbdd.13388] [Citation(s) in RCA: 180] [Impact Index Per Article: 30.0] [Received: 05/09/2018] [Revised: 06/29/2018] [Accepted: 08/04/2018] [Indexed: 02/06/2023]
Abstract
Homology modeling is one of the computational structure prediction methods that are used to determine protein 3D structure from its amino acid sequence. It is considered to be the most accurate of the computational structure prediction methods. It consists of multiple steps that are straightforward and easy to apply. There are many tools and servers that are used for homology modeling. There is no single modeling program or server which is superior in every aspect to others. Since the functionality of the model depends on the quality of the generated protein 3D structure, maximizing the quality of homology modeling is crucial. Homology modeling has many applications in the drug discovery process. Since drugs interact with receptors that consist mainly of proteins, protein 3D structure determination, and thus homology modeling is important in drug discovery. Accordingly, there has been the clarification of protein interactions using 3D structures of proteins that are built with homology modeling. This contributes to the identification of novel drug candidates. Homology modeling plays an important role in making drug discovery faster, easier, cheaper, and more practical. As new modeling methods and combinations are introduced, the scope of its applications widens.
Affiliation(s)
- Muhammed Tilahun Muhammed
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Suleyman Demirel University, Isparta, Turkey; Department of Basic Biotechnology, Institute of Biotechnology, Ankara University, Ankara, Turkey
- Esin Aki-Yalcin
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Ankara University, Ankara, Turkey
5
Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA. Protein side-chain packing problem: is there still room for improvement? Brief Bioinform 2018; 18:1033-1043. [PMID: 27567382 DOI: 10.1093/bib/bbw079] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/23/2016] [Indexed: 11/12/2022]
Abstract
The protein side-chain packing problem (PSCPP) is an important subproblem of both protein structure prediction and protein design. During the past two decades, a large number of methods have been proposed to tackle this problem. These methods consist of three main components: a rotamer library, a scoring function and a search strategy. The average overall accuracy level obtained by these methods is approximately 87%. Whether a better accuracy level could be achieved remains to be answered. To address this question, we calculated the maximum accuracy level attainable using a simple rotamer library, independently of the energy function or the search method. Using 2883 different structures from the Protein Data Bank, we compared this accuracy level with the accuracy level of five state-of-the-art methods. These comparisons indicated that, for buried residues in the protein, we are already close to the best possible accuracy results. In addition, for exposed residues, we found that a significant gap exists between the possible improvement and the maximum accuracy level achievable with current methods. After determining that an improvement is possible, the next step is to understand what limitations are preventing us from obtaining such an improvement. Previous works on protein structure prediction and protein design have shown that scoring function inaccuracies may represent the main obstacle to achieving better results for these problems. To show that the same is true for the PSCPP, we evaluated the quality of two scoring functions used by some state-of-the-art algorithms. Our results indicate that neither of these scoring functions can guide the search method correctly, thereby reinforcing the idea that efforts to solve the PSCPP must also focus on developing better scoring functions.
6
Colbes J, Aguila SA, Brizuela CA. Scoring of Side-Chain Packings: An Analysis of Weight Factors and Molecular Dynamics Structures. J Chem Inf Model 2018; 58:443-452. [PMID: 29368924 DOI: 10.1021/acs.jcim.7b00679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The protein side-chain packing problem (PSCPP) is a central task in computational protein design. The problem is usually modeled as a combinatorial optimization problem, which consists of searching for a set of rotamers, from a given rotamer library, that minimizes a scoring function (SF). The SF is a weighted sum of terms, which can be decomposed into physics-based and knowledge-based terms. Although there are many methods to obtain approximate solutions for this problem, all of them have similar performances and there has not been a significant improvement in recent years. Studies on protein structure prediction and protein design revealed the limitations of current SFs to achieve further improvements for these two problems. In the same line, a recent work reported a similar result for the PSCPP. In this work, we ask whether or not this negative result regarding further improvements in performance is due to (i) an incorrect weighting of the SF terms or (ii) the constrained conformation resulting from the protein crystallization process. To analyze these questions, we (i) modeled the PSCPP as a bi-objective combinatorial optimization problem, optimizing, at the same time, the two most important terms of two SFs of state-of-the-art algorithms and (ii) performed a preprocessing relaxation of the crystal structure through molecular dynamics to simulate the protein in the solvent and evaluated the performance of these two state-of-the-art SFs under these conditions. Our results indicate that (i) no matter what combination of weight factors we use, the current SFs will not lead to better performances and (ii) the evaluated SFs will not be able to improve performance on relaxed structures. Furthermore, the experiments revealed that the SFs and the methods are biased toward crystallized structures.
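The combinatorial formulation described in this abstract (choose one rotamer per residue so that a weighted sum of score terms is minimal) can be sketched on a toy instance. The energies, the two-term decomposition into self and pairwise terms, and the weights `w_self` and `w_pair` are hypothetical; exhaustive enumeration is used here only because the example is tiny and is not one of the search strategies evaluated in the paper.

```python
from itertools import product

def total_score(assignment, self_energy, pair_energy, w_self=1.0, w_pair=1.0):
    """Weighted sum of a per-residue term and a pairwise term."""
    e_self = sum(self_energy[i][r] for i, r in enumerate(assignment))
    e_pair = sum(table[assignment[i]][assignment[j]]
                 for (i, j), table in pair_energy.items())
    return w_self * e_self + w_pair * e_pair

def best_packing(self_energy, pair_energy, **weights):
    """Enumerate every rotamer combination and return the lowest-scoring one."""
    choices = [range(len(energies)) for energies in self_energy]
    return min(product(*choices),
               key=lambda a: total_score(a, self_energy, pair_energy, **weights))

# Toy instance: 3 residues with 2 rotamers each (all numbers are made up).
self_energy = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
pair_energy = {
    (0, 1): [[2.0, 0.0], [0.0, 2.0]],
    (1, 2): [[0.0, 1.0], [1.0, 0.0]],
}

best = best_packing(self_energy, pair_energy)
best_energy = total_score(best, self_energy, pair_energy)
```

Changing `w_self` and `w_pair` changes which assignment wins, which is the weighting question the abstract raises; real instances are far too large for enumeration and need heuristic or exact pruning methods.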
Affiliation(s)
- Jose Colbes
- Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
- Sergio A Aguila
- Centro de Nanociencias y Nanotecnologia, Universidad Nacional Autonoma de Mexico, Km. 107 Carretera Tijuana-Ensenada, Ensenada, Baja California, Mexico, C.P. 22860
- Carlos A Brizuela
- Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
7
Chopra G, Samudrala R. Exploring Polypharmacology in Drug Discovery and Repurposing Using the CANDO Platform. Curr Pharm Des 2017; 22:3109-23. [PMID: 27013226 DOI: 10.2174/1381612822666160325121943] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 01/06/2016] [Accepted: 03/01/2015] [Indexed: 01/05/2023]
Abstract
BACKGROUND Traditional drug discovery approaches focus on a limited set of target molecules for treatment against specific indications/diseases. However, drug absorption, dispersion, metabolism, and excretion (ADME) involve interactions with multiple protein systems. Drugs approved for particular indication(s) may be repurposed as novel therapeutics for others. The severely declining rate of discovery and increasing costs of new drugs illustrate the limitations of the traditional reductionist paradigm in drug discovery. METHODS We developed the Computational Analysis of Novel Drug Opportunities (CANDO) platform based on a hypothesis that drugs function by interacting with multiple protein targets to create a molecular interaction signature that can be exploited for therapeutic repurposing and discovery. We compiled a library of compounds that are human ingestible with minimal side effects, followed by an 'all-compounds' vs 'all-proteins' fragment-based multitarget docking with dynamics screen to construct compound-proteome interaction matrices that were then analyzed to determine similarity of drug behavior. The proteomic signature similarity of drugs is then ranked to make putative drug predictions for all indications in a shotgun manner. RESULTS We have previously applied this platform with success in both retrospective benchmarking and prospective validation, and to understand the effect of druggable protein classes on repurposing accuracy. Here we use the CANDO platform to analyze and determine the contribution of multitargeting (polypharmacology) to drug repurposing benchmarking accuracy. Taken together with the previous work, our results indicate that a large number of protein structures with diverse fold space and a specific polypharmacological interactome is necessary for accurate drug predictions using our proteomic and evolutionary drug discovery and repurposing platform. 
CONCLUSION These results have implications for future drug development and repurposing in the context of polypharmacology.
Affiliation(s)
- Gaurav Chopra
- Department of Chemistry, Purdue University, West Lafayette, IN, USA.
- Ram Samudrala
- Department of Biomedical Informatics, SUNY, Buffalo, NY, USA.
8
Computational Approaches and Resources in Single Amino Acid Substitutions Analysis Toward Clinical Research. Adv Protein Chem Struct Biol 2014; 94:365-423. [DOI: 10.1016/b978-0-12-800168-4.00010-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 02/08/2023]
9
Joo H, Qu X, Swanson R, McCallum CM, Tsai J. Fine grained sampling of residue characteristics using molecular dynamics simulation. Comput Biol Chem 2010; 34:172-83. [PMID: 20621565 DOI: 10.1016/j.compbiolchem.2010.06.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/23/2010] [Revised: 06/11/2010] [Accepted: 06/11/2010] [Indexed: 11/19/2022]
Abstract
In a fine-grained computational analysis of protein structure, we investigated the relationships between a residue's backbone conformations and its side-chain packing as well as conformations. To produce continuous distributions in high resolution, we ran molecular dynamics simulations over a set of protein folds (dynameome). In effect, the dynameome dataset samples not only the states well represented in the PDB but also the known states that are not well represented in the structural database. In our analysis, we characterized the mutual influence among the backbone (ϕ, ψ) angles with the first side-chain torsion angles (χ1) and the volumes occupied by the side-chains. The dependencies of these relationships on side-chain environment and amino acids are further explored. We found that residue volumes exhibit dependency on backbone secondary structure conformation: side-chains pack more densely in extended β-sheet than in α-helical structures. As expected, residue volumes on the protein surface were larger than those in the interior. The first side-chain torsion angles are found to be dependent on the backbone conformations in agreement with previous studies, but the dynameome dataset provides higher resolution of rotamer preferences based on the backbone conformation. All three gauche−, gauche+, and trans rotamers show different patterns of (ϕ, ψ) dependency, and variations in χ1 value are skewed from their canonical values to relieve the steric strains. By demonstrating the utility of dynameomic modeling on the native state ensemble, this study reveals details of the interplay among backbone conformations, residue volumes and side-chain conformations.
Affiliation(s)
- Hyun Joo
- Chemistry Department, University of the Pacific, 3601 Pacific Avenue, Stockton, CA 95211, United States.
10
Cohen M, Potapov V, Schreiber G. Four distances between pairs of amino acids provide a precise description of their interaction. PLoS Comput Biol 2009; 5:e1000470. [PMID: 19680437 PMCID: PMC2715887 DOI: 10.1371/journal.pcbi.1000470] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 05/06/2009] [Accepted: 07/15/2009] [Indexed: 11/18/2022]
Abstract
The three-dimensional structures of proteins are stabilized by the interactions between amino acid residues. Here we report a method where four distances are calculated between any two side chains to provide an exact spatial definition of their bonds. The data were binned into a four-dimensional grid and compared to a random model, from which the preference for specific four-distances was calculated. A clear relation between the quality of the experimental data and the tightness of the distance distribution was observed, with crystal structure data providing far tighter distance distributions than NMR data. Since the four-distance data have higher information content than classical bond descriptions, we were able to identify many unique inter-residue features not found previously in proteins. For example, we found that the side chains of Arg, Glu, Val and Leu are not symmetrical in respect to the interactions of their head groups. The described method may be developed into a function, which computationally models accurately protein structures.
Affiliation(s)
- Mati Cohen
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Vladimir Potapov
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Gideon Schreiber
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
11
Vadivel K, Namasivayam G. An estimate of the numbers and density of low-energy structures (or decoys) in the conformational landscape of proteins. PLoS One 2009; 4:e5148. [PMID: 19357778 PMCID: PMC2663821 DOI: 10.1371/journal.pone.0005148] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/11/2008] [Accepted: 03/02/2009] [Indexed: 11/19/2022]
Abstract
BACKGROUND The conformational energy landscape of a protein, as calculated by known potential energy functions, has several minima, and one of these corresponds to its native structure. It is however difficult to comprehensively estimate the actual numbers of low energy structures (or decoys), the relationships between them, and how the numbers scale with the size of the protein. METHODOLOGY We have developed an algorithm to rapidly and efficiently identify the low energy conformers of oligo peptides by using mutually orthogonal Latin squares to sample the potential energy hyper surface. Using this algorithm, and the ECEPP/3 potential function, we have made an exhaustive enumeration of the low-energy structures of peptides of different lengths, and have extrapolated these results to larger polypeptides. CONCLUSIONS AND SIGNIFICANCE We show that the number of native-like structures for a polypeptide is, in general, an exponential function of its sequence length. The density of these structures in conformational space remains more or less constant and all the increase appears to come from an expansion in the volume of the space. These results are consistent with earlier reports that were based on other models and techniques.
Affiliation(s)
- Kanagasabai Vadivel
- Centre of Advanced Study in Crystallography & Biophysics, University of Madras, Tamilnadu, India
- Gautham Namasivayam
- Centre of Advanced Study in Crystallography & Biophysics, University of Madras, Tamilnadu, India
12
Solis AD, Rackovsky S. Information and discrimination in pairwise contact potentials. Proteins 2008; 71:1071-87. [PMID: 18004788 DOI: 10.1002/prot.21733] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/05/2022]
Abstract
We examine the information-theoretic characteristics of statistical potentials that describe pairwise long-range contacts between amino acid residues in proteins. In our work, we seek to map out an efficient information-based strategy to detect and optimally utilize the structural information latent in empirical data, to make contact potentials, and other statistically derived folding potentials, more effective tools in protein structure prediction. Foremost, we establish fundamental connections between basic information-theoretic quantities (including the ubiquitous Z-score) and contact "energies" or scores used routinely in protein structure prediction, and demonstrate that the informatic quantity that mediates fold discrimination is the total divergence. We find that pairwise contacts between residues bear a moderate amount of fold information, and if optimized, can assist in the discrimination of native conformations from large ensembles of native-like decoys. Using an extensive battery of threading tests, we demonstrate that parameters that affect the information content of contact potentials (e.g., choice of atoms to define residue location and the cut-off distance between pairs) have a significant influence in their performance in fold recognition. We conclude that potentials that have been optimized for mutual information and that have high number of score events per sequence-structure alignment are superior in identifying the correct fold. We derive the quantity "information product" that embodies these two critical factors. We demonstrate that the information product, which does not require explicit threading to compute, is as effective as the Z-score, which requires expensive decoy threading to evaluate. This new objective function may be able to speed up the multidimensional parameter search for better statistical potentials. 
Lastly, by demonstrating the functional equivalence of quasi-chemically approximated "energies" to fundamental informatic quantities, we make statistical potentials less dependent on theoretically tenuous biophysical formalisms and more amenable to direct bioinformatic optimization.
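For readers unfamiliar with the Z-score mentioned in this abstract, a minimal sketch of the usual fold-discrimination definition follows: the native score measured in standard deviations from the mean decoy score. The use of the population standard deviation and the toy energies are assumptions made for illustration; the paper's exact convention is not given in the abstract.

```python
from statistics import mean, pstdev

def z_score(native_score, decoy_scores):
    """Distance of the native score from the decoy mean, in decoy standard deviations.
    With energy-like scores, a more negative value means better discrimination."""
    return (native_score - mean(decoy_scores)) / pstdev(decoy_scores)

# Made-up contact "energies" for one native structure and four threaded decoys.
native = -10.0
decoys = [-4.0, -6.0, -5.0, -5.0]

z = z_score(native, decoys)
```

Evaluating this quantity requires threading the sequence onto every decoy, which is the cost the abstract contrasts with the threading-free information product.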
Affiliation(s)
- Armando D Solis
- Department of Pharmacology and Systems Therapeutics, Mount Sinai School of Medicine, New York, New York 10029, USA
13
14
Rykunov D, Fiser A. Effects of amino acid composition, finite size of proteins, and sparse statistics on distance-dependent statistical pair potentials. Proteins 2007; 67:559-68. [PMID: 17335003 DOI: 10.1002/prot.21279] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 11/06/2022]
Abstract
Statistical distance-dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and only at most approximately 80% of them were significant at larger pair separations. A new definition for reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.
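As background for the reference-state discussion in this abstract, the sketch below shows the commonly used inverse-Boltzmann form of a distance-dependent pair potential, E(d) = -ln(N_obs(d) / N_exp(d)) in units of kT. This is an assumed textbook form, not the new reference state proposed in the paper; the counts and the `pseudocount` parameter are hypothetical.

```python
import math

def pair_potential(observed, expected, pseudocount=1e-6):
    """Per-distance-bin potential in kT units: -ln(observed / expected).
    The pseudocount only guards against log(0) in empty bins."""
    return [-math.log((o + pseudocount) / (e + pseudocount))
            for o, e in zip(observed, expected)]

# Made-up contact counts for one residue pair in three distance bins.
observed = [5, 40, 100]   # counts seen in the structure database
expected = [20, 40, 50]   # counts predicted by the reference state

potential = pair_potential(observed, expected)
```

A bin where observed and expected counts agree gives a potential of zero, which is why a randomized model is expected to yield zero on average and why sparse counts or a biased reference state show up as spurious nonzero signal.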
Affiliation(s)
- Dmitry Rykunov
- Department of Biochemistry, Seaver Center for Bioinformatics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
15
Hartmann C, Antes I, Lengauer T. IRECS: a new algorithm for the selection of most probable ensembles of side-chain conformations in protein models. Protein Sci 2007; 16:1294-307. [PMID: 17567749 PMCID: PMC2206697 DOI: 10.1110/ps.062658307] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 10/23/2022]
Abstract
We introduce a new algorithm, IRECS (Iterative REduction of Conformational Space), for identifying ensembles of most probable side-chain conformations for homology modeling. On the basis of a given rotamer library, IRECS ranks all side-chain rotamers of a protein according to the probability with which each side chain adopts the respective rotamer conformation. This ranking enables the user to select small rotamer sets that are most likely to contain a near-native rotamer for each side chain. IRECS can therefore act as a fast heuristic alternative to the Dead-End-Elimination algorithm (DEE). In contrast to DEE, IRECS allows for the selection of rotamer subsets of arbitrary size, thus being able to define structure ensembles for a protein. We show that the selection of more than one rotamer per side chain is generally meaningful, since the selected rotamers represent the conformational space of flexible side chains. A knowledge-based statistical potential ROTA was constructed for the IRECS algorithm. The potential was optimized to discriminate between side-chain conformations of native and rotameric decoys of protein structures. By restricting the number of rotamers per side chain to one, IRECS can optimize side chains for a single conformation model. The average accuracy of IRECS for the χ1 and χ1+2 dihedral angles amounts to 84.7% and 71.6%, respectively, using a 40° cutoff. When we compared IRECS with SCWRL and SCAP, the performance of IRECS was comparable to that of both methods. IRECS and the ROTA potential are available for download from the URL http://irecs.bioinf.mpi-inf.mpg.de.
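The accuracy figures quoted in this abstract rely on a dihedral-angle cutoff. A minimal sketch of how such a χ1 accuracy is typically computed follows, assuming that a prediction counts as correct when its periodic angular deviation from the native value is at most the cutoff; the angle values are made up and the inclusive comparison is an assumption.

```python
def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees, respecting 360° periodicity."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def chi_accuracy(predicted, native, cutoff=40.0):
    """Fraction of residues whose predicted dihedral lies within the cutoff of the native one."""
    hits = sum(1 for p, n in zip(predicted, native) if angular_diff(p, n) <= cutoff)
    return hits / len(native)

# Made-up chi1 angles (degrees) for four residues.
predicted_chi1 = [-60.0, 175.0, 65.0, -170.0]
native_chi1 = [-75.0, -178.0, 180.0, 60.0]

accuracy = chi_accuracy(predicted_chi1, native_chi1)
```

The periodic difference matters: 175° and -178° are only 7° apart, so a naive subtraction would wrongly count that residue as a miss.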
16
Abstract
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function.
Affiliation(s)
- Zhexin Xiang
- Center for Molecular Modeling, Center for Information Technology, National Institutes of Health, Building 12A Room 2051, 12 South Drive, Bethesda, Maryland 20892-5624, USA.
|
17
|
Riemann RN, Zacharias M. Refinement of protein cores and protein–peptide interfaces using a potential scaling approach. Protein Eng Des Sel 2005; 18:465-76. [PMID: 16155119 DOI: 10.1093/protein/gzi052] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Refinement of side chain conformations in protein model structures and at the interface of predicted protein-protein or protein-peptide complexes is an important step during protein structural modelling and docking. A common approach for side chain prediction is to assume a rigid protein main chain for both docking partners and search for an optimal set of side chain rotamers to optimize the steric fit. However, depending on the target-template similarity in the case of comparative protein modelling and on the accuracy of an initially docked complex, the main chain template structure is only an approximation of a realistic target main chain. An inaccurate rigid main chain conformation can in turn interfere with the prediction of side chain conformations. In the present study, a potential scaling approach (PS-MD) during a molecular dynamics (MD) simulation that also allows the inclusion of explicit solvent has been used to predict side chain conformations on semi-flexible protein main chains. The PS-MD method converges much faster to realistic protein-peptide interface structures or protein core structures than standard MD simulations. Depending on the accuracy of the protein main chain, it also gives significantly better results compared with the standard rotamer search method.
Affiliation(s)
- Ralph Nico Riemann
- International University Bremen, School of Engineering and Science, D-28759 Bremen, Germany
|
18
|
Hung LH, Ngan SC, Liu T, Samudrala R. PROTINFO: new algorithms for enhanced protein structure predictions. Nucleic Acids Res 2005; 33:W77-80. [PMID: 15980581 PMCID: PMC1160164 DOI: 10.1093/nar/gki403] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Abstract
We describe new algorithms and modules for protein structure prediction available as part of the PROTINFO web server. The modules, comparative and de novo modelling, have significantly improved back-end algorithms that were rigorously evaluated at the sixth meeting on the Critical Assessment of Protein Structure Prediction methods. We were one of four server groups invited to make an oral presentation (only the best performing groups are asked to do so). These two modules allow a user to submit a protein sequence and return atomic coordinates representing the tertiary structure of that protein. The PROTINFO server is available at .
Affiliation(s)
- Ram Samudrala
- To whom correspondence should be addressed. Tel: +1 206 732 6122; Fax: +1 206 732 6055;
|
19
|
Rapid Protein Side-Chain Packing via Tree Decomposition. LECTURE NOTES IN COMPUTER SCIENCE 2005. [DOI: 10.1007/11415770_32] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
|
20
|
Canutescu AA, Shelenkov AA, Dunbrack RL. A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 2003; 12:2001-14. [PMID: 12930999 PMCID: PMC2323997 DOI: 10.1110/ps.03154503] [Citation(s) in RCA: 743] [Impact Index Per Article: 35.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in the side-chain prediction problem. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total chi(1) and chi(1 + 2) dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
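The first step of the decomposition described above (residues as vertices, an edge wherever two residues have rotamers with nonzero interaction energy, then independent minimization per connected subgraph) can be sketched as follows. This is a simplified illustration, not the SCWRL implementation: it stops at connected components and solves each one by brute force, omitting the further split into biconnected components. The function names and the energy-table layout are assumptions.

```python
from itertools import product


def connected_components(n, edges):
    """Return the connected components of an undirected graph on vertices 0..n-1."""
    adjacency = {i: set() for i in range(n)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


def minimize_by_components(self_energy, pair_energy):
    """Find the minimum-energy rotamer assignment, one component at a time.

    self_energy[i][r]: energy of rotamer r at residue i (hypothetical layout).
    pair_energy[(i, j)][ri][rj]: interaction energy between rotamers of i and j.
    """
    n = len(self_energy)
    # Two residues share an edge only if some rotamer pair interacts.
    edges = [pair for pair, table in pair_energy.items()
             if any(e != 0.0 for row in table for e in row)]
    best = [None] * n
    total = 0.0
    for comp in connected_components(n, edges):
        comp_best, comp_best_e = None, None
        for choice in product(*(range(len(self_energy[i])) for i in comp)):
            assign = dict(zip(comp, choice))
            e = sum(self_energy[i][assign[i]] for i in comp)
            for (i, j), table in pair_energy.items():
                if i in assign and j in assign:
                    e += table[assign[i]][assign[j]]
            if comp_best_e is None or e < comp_best_e:
                comp_best, comp_best_e = assign, e
        total += comp_best_e
        for i, r in comp_best.items():
            best[i] = r
    return best, total


# Toy system: residues 0 and 1 clash in one rotamer pair; residue 2 is isolated.
self_energy = [[0.0, 1.0], [0.5, 0.0], [2.0, 0.3, 1.0]]
pair_energy = {
    (0, 1): [[5.0, 0.0], [0.0, 0.0]],
    (1, 2): [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # all zero, so no edge
}
best, total = minimize_by_components(self_energy, pair_energy)
```

Because components share no nonzero interactions, the sum of their individual minima equals the global minimum, which is why the search cost depends on the largest component rather than on the whole protein.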
Affiliation(s)
- Adrian A Canutescu
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA
|
21
|
Hung LH, Samudrala R. PROTINFO: Secondary and tertiary protein structure prediction. Nucleic Acids Res 2003; 31:3296-9. [PMID: 12824311 PMCID: PMC168948 DOI: 10.1093/nar/gkg541] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2003] [Revised: 03/31/2003] [Accepted: 03/31/2003] [Indexed: 11/14/2022] Open
Abstract
Information about the secondary and tertiary structure of a protein sequence can greatly assist biologists in the generation and testing of hypotheses, as well as design of experiments. The PROTINFO server enables users to submit a protein sequence and request a prediction of the three-dimensional (tertiary) structure based on comparative modeling, fold generation and de novo methods developed by the authors. In addition, users can submit NMR chemical shift data and request protein secondary structure assignment that is based on using neural networks to combine the chemical shifts with secondary structure predictions. The server is available at http://protinfo.compbio.washington.edu.
Affiliation(s)
- Ling-Hong Hung
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
|
22
|
Eyal E, Najmanovich R, Edelman M, Sobolev V. Protein side-chain rearrangement in regions of point mutations. Proteins 2003; 50:272-82. [PMID: 12486721 DOI: 10.1002/prot.10276] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
A major problem in predicting amino acid side-chain rearrangements following point mutations is the potentially large search space. We analyzed a nonredundant data set of 393 Protein Data Bank protein pairs, each consisting of structures differing in one amino acid, to determine the number of residues changing conformation in the region of mutation. In 91-95% of cases, two or fewer residues underwent side-chain conformational change. If mutation sites with backbone displacements were excluded, the number increased to 97%. The majority of rearrangements (over 60%) were due to the inherent flexibility of side-chains, as derived from analysis of a control set of protein subunits whose crystal structures were determined more than once. Different amino acids demonstrated different degrees of flexibility near mutation sites. Large polar or charged residues, and serine, are more flexible, while the aromatic amino acids, and cysteine, are less so. This pattern is common to the inherent side-chain flexibility, as well as the increased flexibility at ligand binding sites and mutation sites. The probability for conformational change was correlated with B-factor, frequency of the side-chain conformation in proteins and solvent accessibility. The last trend was stronger for aromatic and hydrophilic residues than for hydrophobic ones. We conclude that the search space for predicting side-chain conformations in the region of mutation can be effectively restricted. However, the overall ability to predict a particular side-chain conformation, or to check predictions according to individual existing structures, is limited. These findings may be useful in deriving empirical rules for modeling side-chain conformations.
Affiliation(s)
- Eran Eyal
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel.
|
23
|
Samudrala R, Levitt M. A comprehensive analysis of 40 blind protein structure predictions. BMC STRUCTURAL BIOLOGY 2002; 2:3. [PMID: 12150712 PMCID: PMC122083 DOI: 10.1186/1472-6807-2-3] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/09/2002] [Accepted: 08/01/2002] [Indexed: 11/21/2022]
Abstract
BACKGROUND: We thoroughly analyse the results of 40 blind predictions for which an experimental answer was made available at the fourth meeting on the critical assessment of protein structure methods (CASP4). Using our comparative modelling and fold recognition methodologies, we made 29 predictions for targets that had sequence identities ranging from 50% to 10% to the nearest related protein with known structure. Using our ab initio methodologies, we made eleven predictions for targets that had no detectable sequence relationships. RESULTS: For 23 of these proteins, we produced models ranging from 1.0 to 6.0 Å root mean square deviation (RMSD) for the Cα atoms between the model and the corresponding experimental structure for all or large parts of the protein, with model accuracies scaling fairly linearly with respect to sequence identity (i.e., the higher the sequence identity, the better the prediction). We produced nine models with accuracies ranging from 4.0 to 6.0 Å Cα RMSD for 60-100 residue proteins (or large fragments of a protein), with a prediction accuracy of 4.0 Å Cα RMSD for residues 1-80 for T110/rbfa. CONCLUSIONS: The areas of protein structure prediction that work well, and areas that need improvement, are discernable by examining how our methods have performed over the past four CASP experiments. These results have implications for modelling the structure of all tractable proteins encoded by the genome of an organism.
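The model accuracies above are reported as Cα (Calpha) root mean square deviations, i.e. the square root of the mean squared distance between matched Cα atoms in the model and the experimental structure. A minimal sketch of that formula follows. It assumes the two coordinate sets are already optimally superimposed and listed in the same residue order; the superposition step itself is not shown, and the function name is hypothetical.

```python
import math


def ca_rmsd(model, reference):
    """RMSD between two equally long lists of (x, y, z) Calpha coordinates.

    Assumes the structures are already superimposed and residue-matched;
    no fitting is performed here.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (mx, my, mz), (rx, ry, rz) in zip(model, reference):
        squared += (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
    return math.sqrt(squared / len(model))


# Toy example: every atom is displaced by exactly 1 unit along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = ca_rmsd(model, reference)
```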
Affiliation(s)
- Ram Samudrala
- Department of Microbiology, University of Washington, School of Medicine, Seattle, WA 98195, USA
- Michael Levitt
- Department of Structural Biology, Stanford University, School of Medicine, Stanford, CA 94305, USA
|
24
|
Yang JM, Tsai CH, Hwang MJ, Tsai HK, Hwang JK, Kao CY. GEM: a Gaussian Evolutionary Method for predicting protein side-chain conformations. Protein Sci 2002; 11:1897-907. [PMID: 12142444 PMCID: PMC2373689 DOI: 10.1110/ps.4940102] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
We have developed an evolutionary approach to predicting protein side-chain conformations. This approach, referred to as the Gaussian Evolutionary Method (GEM), combines both discrete and continuous global search mechanisms. The former helps speed up convergence by reducing the size of rotamer space, whereas the latter, integrating decreasing-based Gaussian mutations and self-adaptive Gaussian mutations, continuously adapts dihedrals to optimal conformations. We tested our approach on 38 proteins ranging in size from 46 to 325 residues and showed that the results were comparable to those using other methods. The average accuracies of our predictions were 80% for chi(1), 66% for chi(1 + 2), and 1.36 Å for the root mean square deviation of side-chain positions. We found that if our scoring function was perfect, the prediction accuracy was also essentially perfect. However, perfect prediction could not be achieved if only a discrete search mechanism was applied. These results suggest that GEM is robust and can be used to examine the factors limiting the accuracy of protein side-chain prediction methods. Furthermore, it can be used to systematically evaluate and thus improve scoring functions.
Affiliation(s)
- Jinn-Moon Yang
- Department of Biological Science and Technology and Institute of Bioinformatics, National Chiao Tung University, Hsinchu, 30050, Taiwan.
|
25
|
Desmet J, Spriet J, Lasters I. Fast and accurate side-chain topology and energy refinement (FASTER) as a new method for protein structure optimization. Proteins 2002; 48:31-43. [PMID: 12012335 DOI: 10.1002/prot.10131] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
We have developed an original method for global optimization of protein side-chain conformations, called the Fast and Accurate Side-Chain Topology and Energy Refinement (FASTER) method. The method operates by systematically overcoming local minima of increasing order. Comparison of the FASTER results with those of the dead-end elimination (DEE) algorithm showed that both methods produce nearly identical results, but the FASTER algorithm is 100-1000 times faster than the DEE method and scales in a stable and favorable way as a function of protein size. We also show that low-order local minima may be almost as accurate as the global minimum when evaluated against experimentally determined structures. In addition, the new algorithm provides significant information about the conformational flexibility of individual side-chains. We observed that strictly rigid side-chains are concentrated mainly in the core of the protein, whereas highly flexible side-chains are found almost exclusively among solvent-oriented residues.
|
26
|
Van Loy CP, Sokurenko EV, Samudrala R, Moseley SL. Identification of amino acids in the Dr adhesin required for binding to decay-accelerating factor. Mol Microbiol 2002; 45:439-52. [PMID: 12123455 DOI: 10.1046/j.1365-2958.2002.03022.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
Members of the Dr family of adhesins of Escherichia coli recognize as a receptor the Dr(a) blood-group antigen present on the complement regulatory and signalling molecule, decay-accelerating factor (DAF). One member of this family, the Dr haemagglutinin, also binds to a second receptor, type IV collagen. Structure/function information regarding these adhesins has been limited and domains directly involved in the interaction with DAF have not been determined. We devised a strategy to identify amino acids in the Dr haemagglutinin that are specifically involved in the interaction with DAF. The gene encoding the adhesive subunit, draE, was subjected to random mutagenesis and used to complement a strain defective for its expression. The resulting mutants were enriched and screened to obtain those that do not bind to DAF, but retain binding to type IV collagen. Individual amino acid changes at positions 10, 63, 65, 75, 77, 79 and 131 of the mature DraE sequence significantly reduced the ability of the DraE adhesin to bind DAF, but not collagen. Over half of the mutants obtained had substitutions within amino acids 63-81. Analysis of predicted structures of DraE suggests that these proximal residues may cluster to form a binding domain for DAF.
Affiliation(s)
- Cristina P Van Loy
- University of Washington, Department of Microbiology, Box 357242, Seattle, WA 98195-7242, USA
|
27
|
Abstract
Modeling side-chain conformations on a fixed protein backbone has a wide application in structure prediction and molecular design. Each effort in this field requires decisions about a rotamer set, scoring function, and search strategy. We have developed a new and simple scoring function, which operates on side-chain rotamers and consists of the following energy terms: contact surface, volume overlap, backbone dependency, electrostatic interactions, and desolvation energy. The weights of these energy terms were optimized to achieve the minimal average root mean square (rms) deviation between the lowest energy rotamer and real side-chain conformation on a training set of high-resolution protein structures. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. We obtained prediction accuracy of 90.4% for chi(1), 78.3% for chi(1 + 2), and 1.18 Å overall rms deviation. Furthermore, the derived scoring function combined with a Monte Carlo search algorithm was used to place all side chains onto a protein backbone simultaneously. The average prediction accuracy was 87.9% for chi(1), 73.2% for chi(1 + 2), and 1.34 Å rms deviation for 30 protein structures. Our approach was compared with available side-chain construction methods and showed improvement over the best among them: 4.4% for chi(1), 4.7% for chi(1 + 2), and 0.21 Å for rms deviation. We hypothesize that the scoring function instead of the search strategy is the main obstacle in side-chain modeling. Additionally, we show that a more detailed rotamer library is expected to increase chi(1 + 2) prediction accuracy but may have little effect on chi(1) prediction accuracy.
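The two ingredients named in this abstract, a scoring function built as a weighted sum of energy terms and a Monte Carlo search over rotamer assignments, can be illustrated with the sketch below. It is not the authors' implementation: the abstract does not specify the acceptance rule, so a standard Metropolis criterion is assumed, and the term names, weights, toy energy function and parameter defaults are all hypothetical.

```python
import math
import random


def weighted_score(terms, weights):
    """Weighted sum of named energy terms (names here are placeholders)."""
    return sum(weights[name] * value for name, value in terms.items())


def monte_carlo_pack(n_rotamers, energy_fn, steps=2000, temperature=1.0, seed=0):
    """Metropolis Monte Carlo over rotamer indices (assumed acceptance rule).

    n_rotamers[i]: number of rotamers available at residue i.
    energy_fn(assignment): total energy of a full list of rotamer indices.
    Returns the lowest-energy assignment visited and its energy.
    """
    rng = random.Random(seed)
    current = [0] * len(n_rotamers)
    current_e = energy_fn(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        i = rng.randrange(len(n_rotamers))       # pick one residue
        trial = list(current)
        trial[i] = rng.randrange(n_rotamers[i])  # propose a new rotamer for it
        trial_e = energy_fn(trial)
        delta = trial_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


# Toy energy: each residue prefers a target rotamer; neighbours with equal indices clash.
targets = [2, 0, 1, 3]


def toy_energy(assignment):
    e = sum(abs(r - t) for r, t in zip(assignment, targets))
    e += sum(0.5 for a, b in zip(assignment, assignment[1:]) if a == b)
    return float(e)


score = weighted_score({"contact": -2.0, "overlap": 1.0},
                       {"contact": 0.5, "overlap": 2.0})
best, best_e = monte_carlo_pack([4, 4, 4, 4], toy_energy)
```

Tracking the best assignment separately from the current one means the result never gets worse than the starting point, even when uphill moves are accepted along the way.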
Affiliation(s)
- Shide Liang
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
|
28
|
Mendes J, Nagarajaram HA, Soares CM, Blundell TL, Carrondo MA. Incorporating knowledge-based biases into an energy-based side-chain modeling method: application to comparative modeling of protein structure. Biopolymers 2001; 59:72-86. [PMID: 11373721 DOI: 10.1002/1097-0282(200108)59:2<72::aid-bip1007>3.0.co;2-s] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
The performance of the self-consistent mean field theory (SCMFT) method for side-chain modeling, employing rotamer energies calculated with the flexible rotamer model (FRM), is evaluated in the context of comparative modeling of protein structure. Predictions were carried out on a test set of 56 model backbones of varying accuracy, to allow side-chain prediction accuracy to be analyzed as a function of backbone accuracy. A progressive decrease in the accuracy of prediction was observed as backbone accuracy decreased. However, even for very low backbone accuracy, prediction was substantially higher than random, indicating that the FRM can, in part, compensate for the errors in the modeled tertiary environment. It was also investigated whether the introduction in the FRM-SCMFT method of knowledge-based biases, derived from a backbone-dependent rotamer library, could enhance its performance. A bias derived from the backbone-dependent rotamer conformations alone did not improve prediction accuracy. However, a bias derived from the backbone-dependent rotamer probabilities improved prediction accuracy considerably. This bias was incorporated through two different strategies. In one (the indirect strategy), rotamer probabilities were used to reject unlikely rotamers a priori, thus restricting prediction by FRM-SCMFT to a subset containing only the most probable rotamers in the library. In the other (the direct strategy), rotamer energies were transformed into pseudo-energies that were added to the average potential energies of the respective rotamers, thereby creating hybrid energy-based/knowledge-based average rotamer energies, which were used by the FRM-SCMFT method for prediction. For all degrees of backbone accuracy, an optimal strength of the knowledge-based bias existed for both strategies for which predictions were more accurate than pure energy-based predictions, and also than pure knowledge-based predictions. 
Hybrid knowledge-based/energy-based methods were obtained from both strategies and compared with the SCWRL method, a hybrid method based on the same backbone-dependent rotamer library. The accuracy of the indirect method was approximately the same as that of the SCWRL method, but that of the direct method was significantly higher.
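The two strategies described in this abstract can be sketched in a few lines. The indirect strategy discards improbable rotamers before the energy-based prediction; the direct strategy adds a knowledge-based pseudo-energy to each rotamer's average potential energy. The abstract does not give the transformation, so the negative-log form used here is an assumption, as are the function names, the probability threshold and the `bias_strength` parameter standing in for the "strength of the knowledge-based bias".

```python
import math


def indirect_filter(probabilities, min_probability):
    """Indirect strategy: keep only the indices of sufficiently probable rotamers."""
    return [idx for idx, p in enumerate(probabilities) if p >= min_probability]


def direct_hybrid_energies(physical_energies, probabilities, bias_strength=1.0):
    """Direct strategy: add a scaled pseudo-energy to each rotamer energy.

    The pseudo-energy -ln(p) is an assumed, Boltzmann-like form chosen for
    illustration; the original transformation may differ.
    """
    if bias_strength == 0:
        return list(physical_energies)  # pure energy-based prediction
    hybrid = []
    for energy, p in zip(physical_energies, probabilities):
        pseudo = -math.log(p) if p > 0 else math.inf
        hybrid.append(energy + bias_strength * pseudo)
    return hybrid


# Toy rotamer set for one residue: library probabilities and average energies.
probabilities = [0.70, 0.25, 0.05]
physical = [1.0, 0.5, 0.0]

kept = indirect_filter(probabilities, min_probability=0.10)
hybrid = direct_hybrid_energies(physical, probabilities, bias_strength=1.0)
pure_choice = physical.index(min(physical))
hybrid_choice = hybrid.index(min(hybrid))
```

In this toy case the pure energy-based choice is the rarest rotamer, while the hybrid energies favour the most probable one; varying `bias_strength` moves the prediction between the two extremes.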
Affiliation(s)
- J Mendes
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Apartado 127, Av. da República, 2781-901, Oeiras, Portugal
|
29
|
Abstract
Current techniques for the prediction of side-chain conformations on a fixed backbone have an accuracy limit of about 1.0-1.5 Å rmsd for core residues. We have carried out a detailed and systematic analysis of the factors that influence the prediction of side-chain conformation and, on this basis, have succeeded in extending the limits of side-chain prediction for core residues to about 0.7 Å rmsd from native, and 94% and 89% of chi(1) and chi(1+2) dihedral angles correctly predicted to within 20 degrees of native, respectively. These results are obtained using a force-field that accounts for only van der Waals interactions and torsional potentials. Prediction accuracy is strongly dependent on the rotamer library used. That is, a complete and detailed rotamer library is essential. The greatest accuracy was obtained with an extensive rotamer library, containing over 7560 members, in which bond lengths and bond angles were taken from the database rather than simply assuming idealized values. Perhaps the most surprising finding is that the combinatorial problem normally associated with the prediction of the side-chain conformation does not appear to be important. This conclusion is based on the fact that the prediction of the conformation of a single side-chain with all others fixed in their native conformations is only slightly more accurate than the simultaneous prediction of all side-chain dihedral angles.
Affiliation(s)
- Z Xiang
- Department of Biochemistry and Molecular Biophysics BB221, Columbia University, New York, NY 10032, USA
|
30
|
Abstract
The prediction of protein structure, based primarily on sequence and structure homology, has become an increasingly important activity. Homology models have become more accurate and their range of applicability has increased. Progress has come, in part, from the flood of sequence and structure information that has appeared over the past few years, and also from improvements in analysis tools. These include profile methods for sequence searches, the use of three-dimensional structure information in sequence alignment and new homology modeling tools, specifically in the prediction of loop and side-chain conformations. There have also been important advances in understanding the physical chemical basis of protein stability and the corresponding use of physical chemical potential functions to distinguish correctly folded from incorrectly folded protein conformations.
Affiliation(s)
- B Al-Lazikani
- Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Columbia University, 630 West 168th Street, New York, NY 10032, USA
|
31
|
Samudrala R, Huang ES, Koehl P, Levitt M. Constructing side chains on near-native main chains for ab initio protein structure prediction. PROTEIN ENGINEERING 2000; 13:453-7. [PMID: 10906341 DOI: 10.1093/protein/13.7.453] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Is there value in constructing side chains while searching protein conformational space during an ab initio simulation? If so, what is the most computationally efficient method for constructing these side chains? To answer these questions, four published approaches were used to construct side chain conformations on a range of near-native main chains generated by ab initio protein structure prediction methods. The accuracy of these approaches was compared with a naive approach that selects the most frequently observed rotamer for a given amino acid to construct side chains. An all-atom conditional probability discriminatory function is useful for selecting conformations with overall low all-atom root mean square deviation (r.m.s.d.) and the discrimination improves on sets that are closer to the native conformation. In addition, the naive approach performs as well as more sophisticated methods in terms of the percentage of chi(1) angles built accurately and the all-atom r.m.s.d. between the native and near-native conformations. The results suggest that the naive method would be extremely useful for fast and efficient side chain construction on vast numbers of conformations for ab initio prediction of protein structure.
Affiliation(s)
- R Samudrala
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA.
|
32
|
Abstract
Ligand binding may involve a wide range of structural changes in the receptor protein, from hinge movement of entire domains to small side-chain rearrangements in the binding pocket residues. The analysis of side chain flexibility gives insights valuable to improve docking algorithms and can provide an index of amino-acid side-chain flexibility potentially useful in molecular biology and protein engineering studies. In this study we analyzed side-chain rearrangements upon ligand binding. We constructed two non-redundant databases (980 and 353 entries) of "paired" protein structures in complexed (holo-protein) and uncomplexed (apo-protein) forms from the PDB macromolecular structural database. The number and identity of binding pocket residues that undergo side-chain conformational changes were determined. We show that, in general, only a small number of residues in the pocket undergo such changes (e.g., approximately 85% of cases show changes in three residues or fewer). The flexibility scale has the following order: Lys > Arg, Gln, Met > Glu, Ile, Leu > Asn, Thr, Val, Tyr, Ser, His, Asp > Cys, Trp, Phe; thus, Lys side chains in binding pockets flex 25 times more often than do the Phe side chains. Normalizing for the number of flexible dihedral bonds in each amino acid attenuates the scale somewhat; however, the clear trend of large, polar amino acids being more flexible in the pocket than aromatic ones remains. We found no correlation between backbone movement of a residue upon ligand binding and the flexibility of its side chain. These results are relevant to (1) reduction of search space in docking algorithms by inclusion of side-chain flexibility for a limited number of binding pocket residues; and (2) utilization of the amino acid flexibility scale in protein engineering studies to alter the flexibility of binding pockets.
Affiliation(s)
- R Najmanovich
- Plant Sciences Department, Weizmann Institute of Science, Rehovot, Israel.
|
33
|
Abstract
The current state of the art in modeling protein structure has been assessed, based on the results of the CASP (Critical Assessment of protein Structure Prediction) experiments. In comparative modeling, improvements have been made in sequence alignment, sidechain orientation and loop building. Refinement of the models remains a serious challenge. Improved sequence profile methods have had a large impact in fold recognition. Although there has been some progress in alignment quality, this factor still limits model usefulness. In ab initio structure prediction, there has been notable progress in building approximately correct structures of 40-60 residue-long protein fragments. There is still a long way to go before the general ab initio prediction problem is solved. Overall, the field is maturing into a practical technology, able to deliver useful models for a large number of sequences.
Affiliation(s)
- J Moult
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, MD 20850, USA.
|