1
|
Francis-Lyon P, Koehl P. Protein side-chain modeling with a protein-dependent optimized rotamer library. Proteins 2014; 82:2000-17. [PMID: 24623614 DOI: 10.1002/prot.24555] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 09/29/2013] [Revised: 02/28/2014] [Accepted: 03/07/2014] [Indexed: 12/16/2022]
Abstract
Despite years of effort, the problem of predicting the conformations of protein side chains remains a subject of inquiry. This problem has three major issues, namely defining the conformations that a side chain may adopt within a protein, developing a sampling procedure for generating possible side-chain packings, and defining a scoring function that can rank these possible packings. To solve the first of these issues, most procedures rely on a rotamer library derived from databases of known protein structures. We introduce an alternative method that is free of statistics. We begin with a rotamer library that is based only on stereochemical considerations; this rotamer library is then optimized independently for each protein under study. We show that this optimization step restores the diversity of conformations observed in native proteins. We combine this protein-dependent rotamer library (PDRL) method with the self-consistent mean field (SCMF) sampling approach and a physics-based scoring function into a new side-chain prediction method, SCMF-PDRL. Using two large test sets of 831 and 378 proteins, respectively, we show that this new method compares favorably with competing methods such as SCAP, OPUS-Rota, and SCWRL4 for energy-minimized structures.
Affiliation(s)
- Patricia Francis-Lyon
- Department of Computer Science, University of San Francisco, San Francisco, California, 94117
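The self-consistent mean field (SCMF) sampling step named in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' SCMF-PDRL code: the energy tables, the data layout (`self_e`, `pair_e`), the temperature `kT`, and the damping factor are all assumptions made for the example. Each residue holds a probability for every rotamer, and those probabilities are updated from the Boltzmann weights of the mean-field energies until they stop changing.

```python
import math

def scmf(self_e, pair_e, kT=0.6, damping=0.5, max_iter=500, tol=1e-8):
    """Toy self-consistent mean field iteration (illustrative layout).

    self_e[i][r]         : self energy of rotamer r at residue i
    pair_e[(i, j)][r][s] : pair energy of rotamer r at i and s at j, for i < j
    Returns probs[i][r], the converged rotamer probabilities.
    """
    n = len(self_e)

    def pair(i, r, j, s):
        # Look up the pair energy regardless of index order; missing pairs are 0.
        if i < j:
            table = pair_e.get((i, j))
            return table[r][s] if table else 0.0
        table = pair_e.get((j, i))
        return table[s][r] if table else 0.0

    # Start from uniform rotamer probabilities.
    probs = [[1.0 / len(e)] * len(e) for e in self_e]
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Mean-field energy: self term plus probability-weighted pair terms.
            mean_field = []
            for r in range(len(self_e[i])):
                e = self_e[i][r]
                for j in range(n):
                    if j != i:
                        e += sum(p * pair(i, r, j, s)
                                 for s, p in enumerate(probs[j]))
                mean_field.append(e)
            # Boltzmann weights, shifted by the minimum for numerical stability.
            e_min = min(mean_field)
            weights = [math.exp(-(e - e_min) / kT) for e in mean_field]
            z = sum(weights)
            # Damped update to avoid oscillations between iterations.
            updated = [damping * old + (1.0 - damping) * w / z
                       for old, w in zip(probs[i], weights)]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(updated, probs[i])))
            probs[i] = updated
        if max_change < tol:
            break
    return probs

# Two residues with two rotamers each; rotamer 0 of residue 0 is strongly
# favoured but clashes with rotamer 0 of residue 1 (made-up energies).
self_e = [[-5.0, 0.0], [0.0, 0.0]]
pair_e = {(0, 1): [[10.0, 0.0], [0.0, 0.0]]}
probs = scmf(self_e, pair_e)
prediction = [max(range(len(p)), key=p.__getitem__) for p in probs]
print(prediction)
```

In this toy case the most probable rotamer at each residue is taken as the prediction; residue 1 moves away from the clashing rotamer once residue 0 commits to its favoured one.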
|
2
|
Huang YM, Bystroff C. Expanded explorations into the optimization of an energy function for protein design. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:1176-1187. [PMID: 24384706 PMCID: PMC3919130 DOI: 10.1109/tcbb.2013.113] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/03/2023]
Abstract
Nature possesses a secret formula for the energy as a function of the structure of a protein. In protein design, approximations are made to both the structural representation of the molecule and to the form of the energy equation, such that the existence of a general energy function for proteins is by no means guaranteed. Here, we present new insights toward the application of machine learning to the problem of finding a general energy function for protein design. Machine learning requires the definition of an objective function, which carries with it the implied definition of success in protein design. We explored four functions, consisting of two functional forms, each with two criteria for success. Optimization was carried out by a Monte Carlo search through the space of all variable parameters. Cross-validation of the optimized energy function against a test set gave significantly different results depending on the choice of objective function, pointing to relative correctness of the built-in assumptions. Novel energy cross terms correct for the observed nonadditivity of energy terms and an imbalance in the distribution of predicted amino acids. This paper expands on the work presented at the 2012 ACM-BCB.
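A Monte Carlo search through a parameter space, as mentioned in the abstract above, might look like the following sketch. The objective here is a made-up quadratic standing in for a real protein-design objective, and the step size, temperature, and function names are assumptions for illustration only; the paper's actual objective functions and parameters are not reproduced.

```python
import math
import random

def monte_carlo_search(objective, start, steps=5000, step_size=0.1,
                       temperature=0.01, seed=0):
    """Metropolis search over a parameter vector, minimising `objective`."""
    rng = random.Random(seed)
    current = list(start)
    current_score = objective(current)
    best, best_score = list(current), current_score
    for _ in range(steps):
        # Perturb one randomly chosen parameter.
        trial = list(current)
        k = rng.randrange(len(trial))
        trial[k] += rng.gauss(0.0, step_size)
        trial_score = objective(trial)
        delta = trial_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score

# Hypothetical objective: squared distance to a hidden "ideal" weight vector.
TARGET = [1.0, -2.0, 0.5]

def toy_objective(weights):
    return sum((w - t) ** 2 for w, t in zip(weights, TARGET))

start = [0.0, 0.0, 0.0]
best, best_score = monte_carlo_search(toy_objective, start)
print(best, best_score)
```

The choice of objective is the part that carries the "definition of success"; swapping `toy_objective` for a different function changes what the search converges to without changing the search itself.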
|
3
|
Li SC, Bu D, Li M. Residues with similar hexagon neighborhoods share similar side-chain conformations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012; 9:240-248. [PMID: 21519113 DOI: 10.1109/tcbb.2011.74] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/30/2023]
Abstract
We present in this study a new approach to code protein side-chain conformations into hexagon substructures. Classical side-chain packing methods consist of two steps: first, side-chain conformations, known as rotamers, are extracted from known protein structures as candidates for each residue; second, a searching method along with an energy function is used to resolve conflicts among residues and to optimize the combinations of side-chain conformations for all residues. These methods benefit from the fact that the number of possible side-chain conformations is limited, and the rotamer candidates are readily extracted; however, these methods also suffer from the inaccuracy of energy functions. Inspired by threading and ab initio approaches to protein structure prediction, we propose to use hexagon substructures to implicitly capture subtle issues of energy functions. Our initial results indicate that even without guidance from an energy function, hexagon structures alone can capture side-chain conformations at an accuracy of 83.8 percent, higher than the 82.6 percent achieved by state-of-the-art side-chain packing methods.
|
4
|
Heath AP, Kavraki LE, Clementi C. From coarse-grain to all-atom: Toward multiscale analysis of protein landscapes. Proteins 2007; 68:646-61. [PMID: 17523187 DOI: 10.1002/prot.21371] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.9] [Indexed: 11/10/2022]
Abstract
Multiscale methods are becoming increasingly promising as a way to characterize the dynamics of large protein systems on biologically relevant time-scales. The underlying assumption in multiscale simulations is that it is possible to move reliably between different resolutions. We present a method that efficiently generates realistic all-atom protein structures starting from the C(alpha) atom positions, as obtained for instance from extensive coarse-grain simulations. The method, a reconstruction algorithm for coarse-grain structures (RACOGS), is validated by reconstructing ensembles of coarse-grain structures obtained during folding simulations of the proteins src-SH3 and S6. The results show that RACOGS consistently produces low energy, all-atom structures. A comparison of the free energy landscapes calculated using the coarse-grain structures versus the all-atom structures shows good correspondence and little distortion in the protein folding landscape.
Affiliation(s)
- Allison P Heath
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
|
5
|
Alberts IL, Todorov NP, Dean PM. Receptor Flexibility in de Novo Ligand Design and Docking. J Med Chem 2005; 48:6585-96. [PMID: 16220975 DOI: 10.1021/jm050196j] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Abstract
One of the major problems in computational drug design is incorporation of the intrinsic flexibility of protein binding sites. This is particularly crucial in ligand binding events, when induced fit can lead to protein structure rearrangements. As a consequence of the huge conformational space available to protein structures, receptor flexibility is rarely considered in ligand design procedures. In this work, we present an algorithm for integrating protein binding-site flexibility into de novo ligand design and docking processes. The approach allows dynamic rearrangement of amino acid side chains during the docking and design simulations. The impact of protein conformational flexibility is investigated in the docking of highly active inhibitors in the binding sites of acetylcholinesterase and human collagenase (matrix metalloproteinase-1) and in the design of ligands in the S1' pocket of MMP-1. The results of corresponding simulations for both rigid and flexible binding sites are compared in order to gauge the influence of receptor flexibility in drug discovery protocols.
Affiliation(s)
- Ian L Alberts
- De Novo Pharmaceuticals, Compass House, Vision Park, Histon, Cambridge CB4 9ZR, U.K.
|
6
|
Tramontano A, Morea V. Exploiting evolutionary relationships for predicting protein structures. Biotechnol Bioeng 2004; 84:756-62. [PMID: 14708116 DOI: 10.1002/bit.10850] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
In the last few years there have been many developments in computational biology, particularly with regard to novel, imaginative exploitation of genomic data. Disappointingly, there has been a lack of progress in the methodology for prediction of protein structures. In the last several years, however, promising new methods have finally begun to emerge. These methods are increasing the power and scope of the methodology, but, most importantly, they are generating new areas of investigation that we believe will accelerate progress in the field. In this review we describe recent developments and highlight the implications of their success as well as areas where efforts should be focused.
Affiliation(s)
- Anna Tramontano
- Department of Biochemical Sciences A. Rossi Fanelli, University La Sapienza, P. le Aldo Moro 5, 00185 Rome, Italy.
|
8
|
Canutescu AA, Shelenkov AA, Dunbrack RL. A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 2003; 12:2001-14. [PMID: 12930999 PMCID: PMC2323997 DOI: 10.1110/ps.03154503] [Citation(s) in RCA: 743] [Impact Index Per Article: 35.4] [Indexed: 10/27/2022]
Abstract
Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in the side-chain prediction problem. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total chi(1) and chi(1 + 2) dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
Affiliation(s)
- Adrian A Canutescu
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA
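The graph decomposition described in the abstract above can be illustrated with a small sketch. It is not the SCWRL implementation: the residue numbers and the interaction edges are invented, and the function name is hypothetical. The code splits an undirected residue-interaction graph into biconnected components using the standard depth-first search with an edge stack; residues that appear in more than one component are the articulation points that link the subproblems.

```python
from collections import Counter, defaultdict

def biconnected_components(edges):
    """Return the biconnected components of an undirected graph as vertex sets."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    disc = {}        # discovery order of each vertex
    low = {}         # lowest discovery index reachable from the vertex's subtree
    stack = []       # edges visited but not yet assigned to a component
    components = []
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        for v in sorted(adj[u]):
            if v == parent:
                continue
            if v not in disc:
                stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:
                    # u separates v's subtree: pop one complete component.
                    comp = set()
                    while True:
                        a, b = stack.pop()
                        comp.update((a, b))
                        if (a, b) == (u, v):
                            break
                    components.append(comp)
            elif disc[v] < disc[u]:
                # Back edge to an ancestor.
                stack.append((u, v))
                low[u] = min(low[u], disc[v])

    for node in sorted(adj):
        if node not in disc:
            dfs(node, None)
    return components

# Hypothetical interaction graph: an edge means two residues have rotamers
# with nonzero interaction energy.
edges = [(1, 2), (2, 3), (3, 1),   # triangle 1-2-3
         (3, 4),                   # bridge
         (4, 5), (5, 6), (6, 4),   # triangle 4-5-6
         (7, 8)]                   # separate connected subgraph
components = biconnected_components(edges)
membership = Counter(v for comp in components for v in comp)
articulation = sorted(v for v, count in membership.items() if count > 1)
print(sorted(sorted(c) for c in components), articulation)
```

Recursion is used for brevity, which is adequate for small graphs; a production version for large proteins would likely use an explicit stack.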
|
9
|
Eyal E, Najmanovich R, Edelman M, Sobolev V. Protein side-chain rearrangement in regions of point mutations. Proteins 2003; 50:272-82. [PMID: 12486721 DOI: 10.1002/prot.10276] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022]
Abstract
A major problem in predicting amino acid side-chain rearrangements following point mutations is the potentially large search space. We analyzed a nonredundant data set of 393 Protein Data Bank protein pairs, each consisting of structures differing in one amino acid, to determine the number of residues changing conformation in the region of mutation. In 91-95% of cases, two or fewer residues underwent side-chain conformational change. If mutation sites with backbone displacements were excluded, the number increased to 97%. The majority of rearrangements (over 60%) were due to the inherent flexibility of side-chains, as derived from analysis of a control set of protein subunits whose crystal structures were determined more than once. Different amino acids demonstrated different degrees of flexibility near mutation sites. Large polar or charged residues, and serine, are more flexible, while the aromatic amino acids, and cysteine, are less so. This pattern is common to the inherent side-chain flexibility, as well as the increased flexibility at ligand binding sites and mutation sites. The probability for conformational change was correlated with B-factor, frequency of the side-chain conformation in proteins and solvent accessibility. The last trend was stronger for aromatic and hydrophilic residues than for hydrophobic ones. We conclude that the search space for predicting side-chain conformations in the region of mutation can be effectively restricted. However, the overall ability to predict a particular side-chain conformation, or to check predictions according to individual existing structures, is limited. These findings may be useful in deriving empirical rules for modeling side-chain conformations.
Affiliation(s)
- Eran Eyal
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel.
|
10
|
Liu Z, Jiang L, Gao Y, Liang S, Chen H, Han Y, Lai L. Beyond the rotamer library: genetic algorithm combined with the disturbing mutation process for upbuilding protein side-chains. Proteins 2003; 50:49-62. [PMID: 12471599 DOI: 10.1002/prot.10253] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022]
Abstract
The disturbing genetic algorithm, incorporating the disturbing mutation process into the genetic algorithm flow, has been developed to extend the searching space of side-chain conformations and to improve the quality of the rotamer library. Moreover, the growing generation amount idea, simulating the real situation of the natural evolution, is introduced to improve the searching speed. In the calculations using the pseudo energy scoring function of the root mean squared deviation, the disturbing genetic algorithm method has been shown to be highly efficient. With the real energy function based on AMBER force field, the program has been applied to rebuilding side-chain conformations of 25 high-quality crystallographic structures of single-protein and protein-protein complexes. The average root-mean-square deviation of side-chain atom coordinates and the accuracies of the chi(1) and chi(1) + chi(2) torsion angles are 1.165 Å, 88.2%, and 72.9% for the buried residues, respectively, and 1.493 Å, 79.2%, and 64.7% for all residues, showing that the method has equal precision to the program SCWRL, whereas it performs better in the prediction of buried residues and protein-protein interfaces. This method has been successfully used in redesigning the interface of the Barnase-Barstar complex, indicating that it will have extensive application in protein design, protein sequence and structure relationship studies, and research on protein-protein interaction.
Affiliation(s)
- Zhijie Liu
- State Key Laboratory for Structural Chemistry of Stable and Unstable Species, Beijing, China
|
11
|
Abstract
Non-rotameric ("off-rotamer") conformations are commonly observed for the side-chains of protein crystal structures. This study examines whether such conformations are real or artifactual by comparing the energetics of on and off-rotamer side-chain conformations calculated with the CHARMM energy function. Energy-based predictions of side-chain orientation are carried out by rigid-geometry mapping in the presence of the fixed protein environment for 1709 non-polar side-chains in 24 proteins for which high-resolution (2.0 Å or better) structures are available. For on-rotamer conformations, 97.6 % are correctly predicted; i.e. they correspond to the absolute minima of their local side-chain energy maps (generally to within 10 degrees or less). By contrast, for the observed off-rotamer side-chain conformations, 63.8 % are predicted correctly. This difference is statistically significant (P<0.001) and suggests that while most of the observed off-rotamer conformations are real, many of the erroneously predicted ones are likely to be artifacts of the X-ray refinements. Probabilities for off-rotamer conformations of the non-polar side-chains are calculated to be 5.0-6.1 % by adaptive umbrella-sampled molecular dynamics trajectories of individual amino acid residues in vacuum and in the presence of an average protein or aqueous dielectric environment. These results correspond closely to the 5.7 % off-rotamer fraction predicted by the rigid-geometry mapping studies. Since these values are about one-half of the 10.2 % off-rotamer fraction observed in the X-ray structures, they support the conclusion that many of the latter are artifacts. In both the rigid-geometry mapping and the molecular dynamics studies, the discrepancies between the predicted and observed fractions of off-rotamer conformations are largest for leucine residues (approximately 6 % versus 16.6 %). The simulations for the isolated amino acid residues indicate that the real off-rotamer frequency of 5-6 % is consistent with the internal side-chain and local side-chain-backbone energetics and does not originate from shifts due to the protein. The present results suggest that energy-based rotation maps can be used to find side-chain positional artifacts that appear in crystal structures based on refinements in the 2 Å resolution range.
Affiliation(s)
- R J Petrella
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
|
12
|
Abstract
Current techniques for the prediction of side-chain conformations on a fixed backbone have an accuracy limit of about 1.0-1.5 Å rmsd for core residues. We have carried out a detailed and systematic analysis of the factors that influence the prediction of side-chain conformation and, on this basis, have succeeded in extending the limits of side-chain prediction for core residues to about 0.7 Å rmsd from native, and 94 % and 89 % of chi(1) and chi(1+2) dihedral angles correctly predicted to within 20 degrees of native, respectively. These results are obtained using a force-field that accounts for only van der Waals interactions and torsional potentials. Prediction accuracy is strongly dependent on the rotamer library used. That is, a complete and detailed rotamer library is essential. The greatest accuracy was obtained with an extensive rotamer library, containing over 7560 members, in which bond lengths and bond angles were taken from the database rather than simply assuming idealized values. Perhaps the most surprising finding is that the combinatorial problem normally associated with the prediction of the side-chain conformation does not appear to be important. This conclusion is based on the fact that the prediction of the conformation of a single side-chain with all others fixed in their native conformations is only slightly more accurate than the simultaneous prediction of all side-chain dihedral angles.
Affiliation(s)
- Z Xiang
- Department of Biochemistry and Molecular Biophysics BB221, Columbia University, New York, NY 10032, USA
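The accuracy measure quoted in the abstract above (chi(1) and chi(1+2) dihedral angles within 20 degrees of native) can be computed along the following lines. This is a simplified sketch under stated assumptions: the angle values are invented, the data layout is hypothetical, and the symmetry of residues such as Phe and Tyr (where chi2 flipped by 180 degrees is equivalent) is ignored.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)

def chi_accuracy(predicted, native, tol=20.0):
    """Fractions of residues with chi1, and with both chi1 and chi2, within tol.

    predicted, native: lists of (chi1, chi2) tuples; chi2 is None when the
    residue has no second side-chain dihedral. chi(1+2) is scored only over
    residues that have a chi2, and counts as correct only if chi1 is too.
    """
    chi1_ok = chi12_ok = chi12_total = 0
    for (p1, p2), (n1, n2) in zip(predicted, native):
        ok1 = angle_diff(p1, n1) <= tol
        chi1_ok += ok1
        if n2 is not None:
            chi12_total += 1
            chi12_ok += ok1 and angle_diff(p2, n2) <= tol
    return chi1_ok / len(native), chi12_ok / chi12_total

# Made-up dihedral angles for four residues.
native = [(-60.0, 180.0), (65.0, None), (178.0, -70.0), (-65.0, 90.0)]
predicted = [(-55.0, 170.0), (70.0, None), (-175.0, 60.0), (-170.0, 95.0)]
chi1_acc, chi12_acc = chi_accuracy(predicted, native)
print(chi1_acc, chi12_acc)
```

The wrap-around in `angle_diff` matters: 178 and -175 degrees are only 7 degrees apart, so a naive subtraction would wrongly count that residue as a miss.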
|
13
|
Abstract
Ligand binding may involve a wide range of structural changes in the receptor protein, from hinge movement of entire domains to small side-chain rearrangements in the binding pocket residues. The analysis of side-chain flexibility gives insights valuable to improve docking algorithms and can provide an index of amino-acid side-chain flexibility potentially useful in molecular biology and protein engineering studies. In this study we analyzed side-chain rearrangements upon ligand binding. We constructed two non-redundant databases (980 and 353 entries) of "paired" protein structures in complexed (holo-protein) and uncomplexed (apo-protein) forms from the PDB macromolecular structural database. The number and identity of binding pocket residues that undergo side-chain conformational changes were determined. We show that, in general, only a small number of residues in the pocket undergo such changes (e.g., approximately 85% of cases show changes in three residues or less). The flexibility scale has the following order: Lys > Arg, Gln, Met > Glu, Ile, Leu > Asn, Thr, Val, Tyr, Ser, His, Asp > Cys, Trp, Phe; thus, Lys side chains in binding pockets flex 25 times more often than do the Phe side chains. Normalizing for the number of flexible dihedral bonds in each amino acid attenuates the scale somewhat; however, the clear trend of large, polar amino acids being more flexible in the pocket than aromatic ones remains. We found no correlation between backbone movement of a residue upon ligand binding and the flexibility of its side chain. These results are relevant to (1) reduction of search space in docking algorithms by inclusion of side-chain flexibility for a limited number of binding pocket residues; and (2) utilization of the amino acid flexibility scale in protein engineering studies to alter the flexibility of binding pockets.
Affiliation(s)
- R Najmanovich
- Plant Sciences Department, Weizmann Institute of Science, Rehovot, Israel.
|
14
|
Petrella RJ, Lazaridis T, Karplus M. Protein sidechain conformer prediction: a test of the energy function. FOLDING & DESIGN 1998; 3:353-77. [PMID: 9806937 DOI: 10.1016/s1359-0278(98)00050-9] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.8] [Indexed: 11/20/2022]
Abstract
BACKGROUND: Homology modeling is an important technique for making use of the rapidly increasing number of protein sequences in the absence of structural information. The major problems in such modeling, once the alignment has been made, concern the positions of loops and the orientations of sidechains. Although progress has been made in recent years for sidechain prediction, current methods appear to have a limit on the order of 70% in their accuracy. It is important to have an understanding of this limitation, which for energy-based methods could arise from inaccuracies of the potential function.
RESULTS: A test of the CHARMM function for sidechain prediction was performed. To eliminate the multiple-residue search problem, the minimum energy positions of individual sidechains in ten proteins were calculated in the presence of all other sidechains in their crystal orientations. This test provides a necessary condition that any energy function useful for sidechain placement must satisfy. For chi1 x chi2 rotations, the accuracies were 77.4% and 89.5%, respectively, and in the presence of crystal waters were 86.5% and 94.9%, respectively. If there was an error, the crystal structure usually corresponded to an alternative local minimum on the calculated energy map. Prediction accuracy correlated with the size of the energy gap between primary and secondary minima.
CONCLUSIONS: The results indicate that the errors in current sidechain prediction schemes cannot be attributed to the potential energy function per se. The test used here establishes a necessary condition that any proposed energy-based sidechain prediction method, as well as many statistically based methods, must satisfy.
Affiliation(s)
- R J Petrella
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
|
15
|
Schaffer L, Verkhivker GM. Predicting structural effects in HIV-1 protease mutant complexes with flexible ligand docking and protein side-chain optimization. Proteins 1998; 33:295-310. [PMID: 9779795 DOI: 10.1002/(sici)1097-0134(19981101)33:2<295::aid-prot12>3.0.co;2-f] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.2] [Indexed: 01/28/2023]
Abstract
We present a computational approach for predicting structures of ligand-protein complexes and analyzing binding energy landscapes that combines a Monte Carlo simulated annealing technique, used to determine the ligand bound conformation, with the dead-end elimination algorithm for side-chain optimization of the protein active site residues. Flexible ligand docking and optimization of mobile protein side-chains have been performed to predict structural effects in the V32I/I47V/V82I HIV-1 protease mutant bound with the SB203386 ligand and in the V82A HIV-1 protease mutant bound with the A77003 ligand. The computational structure predictions are consistent with the crystal structures of these ligand-protein complexes. The emerging relationships between ligand docking and side-chain optimization of the active site residues are rationalized based on the analysis of the ligand-protein binding energy landscape.
Affiliation(s)
- L Schaffer
- Agouron Pharmaceuticals, Inc., La Jolla, California 92037, USA
|
16
|
Abstract
The computer-aided design of protein sequences requires efficient search algorithms to handle the enormous combinatorial complexity involved. A variety of different algorithms have now been applied with some success. The choice of algorithm can influence the representation of the problem in several important ways: the discreteness of the configuration, the types of energy terms that can be used and the ability to find the global minimum energy configuration. The use of dead-end elimination to design the complete sequence for a small protein motif and the use of genetic and mean-field algorithms to design hydrophobic cores for proteins represent the major themes of the past year.
Affiliation(s)
- J R Desjarlais
- Department of Chemistry, Pennsylvania State University, University Park 16802, USA.
|
17
|
Lasters I, Desmet J, De Maeyer M. Dead-end based modeling tools to explore the sequence space that is compatible with a given scaffold. JOURNAL OF PROTEIN CHEMISTRY 1997; 16:449-52. [PMID: 9246627 DOI: 10.1023/a:1026301208920] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 02/04/2023]
Abstract
The dead-end elimination algorithm has proven to be a powerful tool in protein homology modeling since it allows one to determine rapidly the global minimum-energy conformation (GMEC) of an arbitrarily large collection of side chains, given fixed backbone coordinates. After introducing briefly the necessary background, we focus on logic arguments that increase the efficacy of the dead-end elimination process. Second, we present new theoretical considerations on the use of the dead-end elimination method as a tool to identify sequences that are compatible with a given scaffold structure. Third, we initiate a search for properties derived from the computed GMEC structure to predict whether a given sequence can be well packed in the core of a protein. Three properties will be considered: the nonbonded energy, the accessible surface area, and the extent by which the GMEC side-chain conformations deviate from a locally optimal conformation.
Affiliation(s)
- I Lasters
- Center for Transgene Technology and Gene Therapy, Flanders Interuniversity Institute for Biotechnology, K.U. Leuven, Belgium
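The dead-end elimination idea discussed in the abstract above can be sketched with the basic single-rotamer criterion: a rotamer is discarded when its best-case energy is still worse than the worst-case energy of some alternative rotamer at the same residue. The sketch below uses only that basic criterion, not the additional logic arguments the abstract refers to, and the energy tables and data layout are invented for illustration.

```python
def dead_end_elimination(self_e, pair_e):
    """Iteratively remove rotamers that cannot be part of the GMEC.

    self_e[i][r]         : self energy of rotamer r at residue i
    pair_e[(i, j)][r][s] : pair energy of rotamer r at i and s at j, for i < j
    Returns one set of surviving rotamer indices per residue.
    """
    n = len(self_e)

    def pair(i, r, j, s):
        # Pair-energy lookup independent of index order; missing pairs are 0.
        if i < j:
            table = pair_e.get((i, j))
            return table[r][s] if table else 0.0
        table = pair_e.get((j, i))
        return table[s][r] if table else 0.0

    alive = [set(range(len(e))) for e in self_e]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case.
        return self_e[i][r] + sum(
            pick(pair(i, r, j, s) for s in alive[j])
            for j in range(n) if j != i)

    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                best_r = bound(i, r, min)
                # r is a dead end if some other rotamer t is always better.
                if any(best_r > bound(i, t, max)
                       for t in alive[i] if t != r):
                    alive[i].discard(r)
                    changed = True
    return alive

# Made-up energies: residue 0 has three rotamers, residue 1 has two.
self_e = [[0.0, 5.0, 1.5], [0.0, 0.5]]
pair_e = {(0, 1): [[1.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 2.0]]}
survivors = dead_end_elimination(self_e, pair_e)
print(survivors)
```

In this toy case the elimination converges to a single rotamer per residue, which matches the lowest total energy found by enumerating all six combinations. In general the criterion only prunes the search space, and a further search over the survivors is still needed.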
|
18
|
Abstract
Comparative modelling of protein 3D structure can now be applied with reasonable accuracy to ten times more protein sequences than the number of experimentally determined protein structures. A protein sequence that has at least 40% identity to a known structure can be modelled automatically with an accuracy approaching that of a low resolution X-ray structure or a medium resolution NMR structure. Currently, the errors in comparative models include mistakes in the packing of sidechains, in the conformation and shifts of the core segments and loops, and, most importantly, in an incorrect alignment of the modelled sequence with related known structures. Nevertheless, the number of applications in which comparative modelling has been proven to be useful has grown rapidly.
Affiliation(s)
- R Sánchez
- Box 270, The Rockefeller University 1230 York Avenue, New York, NY 10021-6399, USA
|
19
|
De Maeyer M, Desmet J, Lasters I. All in one: a highly detailed rotamer library improves both accuracy and speed in the modelling of sidechains by dead-end elimination. FOLDING & DESIGN 1997; 2:53-66. [PMID: 9080199 DOI: 10.1016/s1359-0278(97)00006-0] [Citation(s) in RCA: 97] [Impact Index Per Article: 3.6] [Indexed: 02/04/2023]
Abstract
BACKGROUND: About a decade ago, the concept of rotamer libraries was introduced to model sidechains given known mainchain coordinates. Since then, several groups have developed methods to handle the challenging combinatorial problem that is faced when searching rotamer libraries. To avoid a combinatorial explosion, the dead-end elimination method detects and eliminates rotamers that cannot be members of the global minimum energy conformation (GMEC). Several groups have applied and further developed this method in the fields of homology modelling and protein design.
RESULTS: This work addresses at the same time increased prediction accuracy and calculation speed improvements. The proposed enhancements allow the elimination of more than one-third of the possible rotameric states before applying the dead-end elimination method. This is achieved by using a highly detailed rotamer library allowing the safe application of an energy-based rejection criterion without risking the elimination of a GMEC rotamer. As a result, we gain both in modelling accuracy and in computational speed. Being completely automated, the current implementation of the dead-end elimination prediction of protein sidechains can be applied to the modelling of sidechains of proteins of any size on the high-end computer systems currently used in molecular modelling. The improved accuracy is highlighted in a comparative study on a collection of proteins of varying size for which score results have previously been published by multiple groups. Furthermore, we propose a new validation method for the scoring of the modelled structure versus the experimental data based upon the volume overlap of the predicted and observed sidechains. This overlap criterion is discussed in relation to the classic RMSD and the frequently used +/- 40 degrees window in comparing chi 1 and chi 2 angles.
CONCLUSIONS: We have shown that a very detailed library allows the introduction of a safe energy threshold rejection criterion, thereby increasing both the execution speed and the accuracy of the modelling program. We speculate that the current method will allow the sidechain prediction of medium-sized proteins and complex protein interfaces involving up to 150 residues on low-end desktop computers.
Affiliation(s)
- M De Maeyer
- Center for Transgene Technology and Gene Therapy, Flanders Interuniversity Institute for Biotechnology, KU Leuven, Belgium
|
20
|
Abstract
Over the past few years, a number of methods for the calculation of side-chain conformations in proteins have been described. More recent studies have considered the effect of combinatorial packing, deviations from idealized rotameric structures and, to a limited extent, backbone flexibility on the quality and efficiency of calculations of protein side-chain conformation. Although further work is needed to address the issue of backbone displacements, the recent progress solves the packing problem to a significant degree. This opens the way for fruitful incorporation of these methods into general procedures for homology modeling and studies of ligand-protein interactions.
Affiliation(s)
- M Vásquez
- Protein Design Labs Inc, Mountain View, CA 94043, USA.
|