1
Pedraza-González L, Barneschi L, Padula D, De Vico L, Olivucci M. Evolution of the Automatic Rhodopsin Modeling (ARM) Protocol. Top Curr Chem (Cham) 2022; 380:21. [PMID: 35291019 PMCID: PMC8924150 DOI: 10.1007/s41061-022-00374-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/03/2021] [Accepted: 01/29/2022] [Indexed: 10/27/2022]
Abstract
In recent years, photoactive proteins such as rhodopsins have become a common target for cutting-edge research in the field of optogenetics. Alongside wet-lab research, computational methods are also developing rapidly to provide the necessary tools to analyze and rationalize experimental results and, most of all, drive the design of novel systems. The Automatic Rhodopsin Modeling (ARM) protocol focuses on providing the computational tools needed to study rhodopsins, whether natural or resulting from mutations. The code has evolved over the years to provide results that are reproducible by any user and accurate and reliable enough to replicate experimental trends. Furthermore, the code is efficient in terms of computing resources and time, and scalable in terms of both the number of concurrent calculations and features. In this review, we show how the code underlying ARM achieved each of these properties.
Affiliation(s)
- Laura Pedraza-González
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy; Department of Chemistry and Industrial Chemistry, University of Pisa, Via Moruzzi 13, 56124, Pisa, Italy
- Leonardo Barneschi
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
- Daniele Padula
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
- Luca De Vico
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
- Massimo Olivucci
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy; Department of Chemistry, Bowling Green State University, Bowling Green, OH, 43403, USA
2
Applying Bioinformatic Platforms, In Vitro, and In Vivo Functional Assays in the Characterization of Genetic Variants in the GH/IGF Pathway Affecting Growth and Development. Cells 2021; 10:cells10082063. [PMID: 34440832 PMCID: PMC8392544 DOI: 10.3390/cells10082063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/10/2021] [Revised: 08/06/2021] [Accepted: 08/09/2021] [Indexed: 02/07/2023] Open
Abstract
Heritability accounts for over 80% of adult human height, indicating that genetic variability is the main determinant of stature. The rapid technological development of Next-Generation Sequencing (NGS), particularly Whole Exome Sequencing (WES), has resulted in the characterization of several genetic conditions affecting growth and development. The greatest challenge of NGS remains the high number of candidate variants identified. In silico bioinformatic tools represent the first approach for classifying these variants. However, solving the complicated problem of variant interpretation requires the use of experimental approaches such as in vitro and, when needed, in vivo functional assays. In this review, we will discuss a rational approach to apply to the gene variants identified in children with growth and developmental defects including: (i) bioinformatic tools; (ii) in silico modeling tools; (iii) in vitro functional assays; and (iv) the development of in vivo models. While bioinformatic tools are useful for a preliminary selection of potentially pathogenic variants, in vitro—and sometimes also in vivo—functional assays are further required to unequivocally determine the pathogenicity of a novel genetic variant. This long, time-consuming, and expensive process is the only scientifically proven method to determine causality between a genetic variant and a human genetic disease.
3
A Critical Note on Symmetry Contact Artifacts and the Evaluation of the Quality of Homology Models. Symmetry (Basel) 2018. [DOI: 10.3390/sym10010025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022] Open
4
Villanelo F, Escalona Y, Pareja-Barrueto C, Garate JA, Skerrett IM, Perez-Acle T. Accessing gap-junction channel structure-function relationships through molecular modeling and simulations. BMC Cell Biol 2017; 18:5. [PMID: 28124624 PMCID: PMC5267332 DOI: 10.1186/s12860-016-0121-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/04/2023] Open
Abstract
Background: Gap junction channels (GJCs) are massive protein channels connecting the cytoplasm of adjacent cells. These channels allow intercellular transfer of molecules up to ~1 kDa, including water, ions and other metabolites. Unveiling structure-function relationships coded into the molecular architecture of these channels is necessary to gain insight into their vast biological function, including electrical synapse, inflammation, development and tissular homeostasis. From early works, computational methods have been critical to analyze and interpret experimental observations. Upon the availability of crystallographic structures, molecular modeling and simulations have become a valuable tool to assess structure-function relationships in GJCs. Modeling different connexin isoforms, simulating the transport process, and exploring molecular variants have provided new hypotheses and out-of-the-box approaches to the study of these important channels. Methods: Here, we review foundational structural studies and recent developments on GJCs using molecular modeling and simulation techniques, highlighting the methods and the cross-talk with experimental evidence. Results and discussion: By comparing results obtained by molecular modeling and simulation techniques with structural and functional information obtained from both recent literature and structural databases, we provide a critical assessment of structure-function relationships that can be obtained from the junction between theoretical and experimental evidence.
Affiliation(s)
- F Villanelo
  - Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- Y Escalona
  - Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- C Pareja-Barrueto
  - Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- J A Garate
  - Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile; Centro Interdisciplinario de Neurociencia de Valparaíso, Universidad de Valparaíso, Playa Ancha, Valparaíso, Chile
- I M Skerrett
  - State University of New York (SUNY) Buffalo State, Buffalo, NY, 14222, USA
- T Perez-Acle
  - Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile; Centro Interdisciplinario de Neurociencia de Valparaíso, Universidad de Valparaíso, Playa Ancha, Valparaíso, Chile
5
Gaillard T, Panel N, Simonson T. Protein side chain conformation predictions with an MMGBSA energy function. Proteins 2016; 84:803-19. [PMID: 26948696 DOI: 10.1002/prot.25030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 10/02/2015] [Revised: 02/22/2016] [Accepted: 02/27/2016] [Indexed: 12/17/2022]
Abstract
The prediction of protein side chain conformations from backbone coordinates is an important task in structural biology, with applications in structure prediction and protein design. It is a difficult problem due to its combinatorial nature. We study the performance of an "MMGBSA" energy function, implemented in our protein design program Proteus, which combines molecular mechanics terms, a Generalized Born and Surface Area (GBSA) solvent model, with approximations that make the model pairwise additive. Proteus is not a competitor to specialized side chain prediction programs due to its cost, but it allows protein design applications, where side chain prediction is an important step and MMGBSA an effective energy model. We predict the side chain conformations for 18 proteins. The side chains are first predicted individually, with the rest of the protein in its crystallographic conformation. Next, all side chains are predicted together. The contributions of individual energy terms are evaluated and various parameterizations are compared. We find that the GB and SA terms, with an appropriate choice of the dielectric constant and surface energy coefficients, are beneficial for single side chain predictions. For the prediction of all side chains, however, errors due to the pairwise additive approximation overcome the improvement brought by these terms. We also show the crucial contribution of side chain minimization to alleviate the rigid rotamer approximation. Even without GB and SA terms, we obtain accuracies comparable to SCWRL4, a specialized side chain prediction program. In particular, we obtain a better RMSD than SCWRL4 for core residues (at a higher cost), despite our simpler rotamer library.
Affiliation(s)
- Thomas Gaillard
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Nicolas Panel
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Thomas Simonson
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
6
Khan FI, Wei DQ, Gu KR, Hassan MI, Tabrez S. Current updates on computer aided protein modeling and designing. Int J Biol Macromol 2016; 85:48-62. [DOI: 10.1016/j.ijbiomac.2015.12.072] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Received: 08/21/2015] [Revised: 12/17/2015] [Accepted: 12/21/2015] [Indexed: 12/15/2022]
7
Rosso C, Ermondi G, Caron G. GRID/BIOCUBE4mf to rank the influence of mutations on biological processes to design ad hoc mutants. Med Chem Res 2015. [DOI: 10.1007/s00044-015-1333-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
8
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 11.0] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
9
Olivares-Quiroz L. Thermodynamics of ideal proteinogenic homopolymer chains as a function of the energy spectrum E, helical propensity ω and enthalpic energy barrier. Journal of Physics: Condensed Matter 2013; 25:155103. [PMID: 23515207 DOI: 10.1088/0953-8984/25/15/155103] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/01/2023]
Abstract
A reformulation and generalization of the Zwanzig model (ZW model) for ideal homopolymer chains poly-X, where X represents any of the twenty naturally occurring proteinogenic amino acid residues, is presented. This reformulation and generalization provides a direct connection between coarse-grained parameters originally proposed in the ZW model and variables from the Lifson-Roig (LR) theory, such as the helical propensity per residue ω, and new variables introduced here, such as the energy gap Δ between unfolded and folded structures, as well as the ratio f of the energy scales involved. This enables us to discover the relevance of the energy spectrum E to the onset of configurational phase transitions. From the configurational partition function Q, thermodynamic properties such as the configurational entropy S, specific heat Cv and average energy <E> are calculated in terms of the number of residues K, temperature T, helical propensity ω and energy barrier ΔH for different poly-X chains in vacuo. Results obtained here provide substantial evidence that configurational phase transitions for ideal poly-X chains correspond to first-order phase transitions. An anomalous behavior of the thermodynamic functions <E>, Cv, S with respect to the number K of residues is also highlighted. On-going methods of solution are outlined.
Affiliation(s)
- L Olivares-Quiroz
  - Universidad Autónoma de la Ciudad de México, Campus Cuautepec, Av La Corona 320, Col Loma Alta CP 07160 DF, Mexico
10
Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Sebastian Kmiecik
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Maciej Blaszczyk
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Dariusz Ekonomiuk
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Andrzej Koliński
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
11
Swain MT, Brooks AJ, Kemp GJL. Predicting peptide interactions with model class II MHC structures. Int J Artif Intell T 2011. [DOI: 10.1142/s0218213005002260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
An automated method for constructing 3D models of class II MHC structures that uses constraint logic programming to select side-chain conformations is described. This method follows a comparative modeling approach in basing the model structures on experimentally determined MHC-peptide structures, but it uses constraints to ease open the peptide binding groove so that the modeled MHC structure is a less specific fit for the co-crystallized peptide in the starting structure. The resulting models are used by a "peptide threading" program that attempts to predict peptides from a protein sequence that will bind strongly to particular MHC alleles. Our results indicate that MHC models that have been constructed in this way enable the peptide threading program to make binding predictions that are comparable with those obtained when using experimentally determined MHC structures, suggesting that a combined modeling and peptide threading approach is worth pursuing for MHC molecules for which experimentally determined structures are not available.
Affiliation(s)
- Martin T. Swain
  - Department of Computing Science, University of Aberdeen, King's College, Aberdeen, Scotland, AB24 3UE, UK
- Anthony J. Brooks
  - Department of Computing Science, University of Aberdeen, King's College, Aberdeen, Scotland, AB24 3UE, UK
- Graham J. L. Kemp
  - Department of Computing Science, University of Aberdeen, King's College, Aberdeen, Scotland, AB24 3UE, UK
12
Kim HJ, di Luccio E, Kong ANT, Kim JS. Nrf2-mediated induction of phase 2 detoxifying enzymes by glyceollins derived from soybean exposed to Aspergillus sojae. Biotechnol J 2011; 6:525-36. [PMID: 21538894 DOI: 10.1002/biot.201100010] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/16/2023]
Abstract
Numerous antioxidants have been reported to cause transcriptional activation of several antioxidant enzymes by binding to the antioxidant-response element in their promoter regions. We therefore examined whether glyceollins, which share common structural features with many phase 2 enzyme inducers and have antioxidant activity, could induce detoxifying/antioxidant enzymes. Glyceollins induced NAD(P)H:quinone oxidoreductase activity in a dose-dependent manner in both mouse hepatoma Hepa1c1c7 and its mutant BPRc1 cells. The compounds also increased the expression of some representative antioxidant enzymes, such as heme oxygenase 1, gamma-glutamylcysteine synthase, and glutathione reductase, by promoting nuclear translocation of the NF-E2-related factor-2 (Nrf2). Furthermore, phosphorylation of Akt and antioxidant response element-mediated reporter gene expression were enhanced by glyceollins but suppressed by LY294002, an inhibitor of phosphoinositide 3-kinases (PI3K). This suggests that glyceollins may cause Nrf2-mediated phase 2 enzyme induction through activation of the PI3K signaling pathway as well as interaction with Keap1. Our molecular docking simulations also suggest that the glyceollin isomers bind tightly in the binding pocket around Cys151, preventing Nrf2 from docking to Keap1. In conclusion, the current data suggest that glyceollins induce phase 2 detoxifying enzymes, likely by promoting nuclear translocation of Nrf2, which is known to be regulated by phosphorylation of Nrf2 and/or disruption of Keap1-Nrf2 complex formation.
Affiliation(s)
- Hyo Jung Kim
  - School of Applied Biosciences, Kyungpook National University, Daegu, Republic of Korea
13
di Luccio E, Koehl P. A quality metric for homology modeling: the H-factor. BMC Bioinformatics 2011; 12:48. [PMID: 21291572 PMCID: PMC3213331 DOI: 10.1186/1471-2105-12-48] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/03/2010] [Accepted: 02/04/2011] [Indexed: 11/10/2022] Open
Abstract
Background: The analysis of protein structures provides fundamental insight into most biochemical functions and consequently into the cause and possible treatment of diseases. As the structures of most known proteins cannot be solved experimentally because of technical or sometimes simply time constraints, in silico protein structure prediction is expected to step in and generate a more complete picture of the protein structure universe. Molecular modeling of protein structures is a fast-growing field, and a tremendous amount of work has been done since the publication of the very first model. The growth of modeling techniques, and more specifically of those that rely on existing experimental knowledge of protein structures, is intimately linked to the development of high-resolution experimental techniques such as NMR, X-ray crystallography and electron microscopy. This strong connection between experimental and in silico methods is, however, not devoid of criticisms and concerns among modelers as well as among experimentalists. Results: In this paper, we focus on homology modeling and, more specifically, we review how it is perceived by the structural biology community and what can be done to impress on experimentalists that it can be a valuable resource to them. We review the common practices and provide a set of guidelines for building better models. For that purpose, we introduce the H-factor, a new indicator for assessing the quality of homology models, mimicking the R-factor in X-ray crystallography. The method for computing the H-factor is fully described and validated on a series of test cases. Conclusions: We have developed a web service for computing the H-factor for models of a protein structure. This service is freely accessible at http://koehllab.genomecenter.ucdavis.edu/toolkit/h-factor.
Affiliation(s)
- Eric di Luccio
  - Computer Science Department, Room 4337, Genome Center, GBSF University of California Davis 451 East Health Sciences Drive Davis, CA 95616, USA
14
Štrancar J, Kavalenka A, Urbančič I, Ljubetič A, Hemminga MA. SDSL-ESR-based protein structure characterization. European Biophysics Journal 2009; 39:499-511. [DOI: 10.1007/s00249-009-0510-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 05/15/2009] [Accepted: 06/23/2009] [Indexed: 10/20/2022]
15
Vila JA, Scheraga HA. Factors affecting the use of 13Cα chemical shifts to determine, refine, and validate protein structures. Proteins 2008; 71:641-54. [PMID: 17975838 DOI: 10.1002/prot.21726] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Abstract
Interest centers here on the analysis of two different, but related, phenomena that affect side-chain conformations and consequently 13Cα chemical shifts and their applications to determine, refine, and validate protein structures. The first is whether 13Cα chemical shifts computed at the DFT level of approximation with charged residues are a better approximation of observed 13Cα chemical shifts than those computed with neutral residues for proteins in solution. Accurate computation of 13Cα chemical shifts requires a proper representation of the charges, which might not take on integral values. For this analysis, the charges for 139 conformations of the protein ubiquitin were determined by explicit consideration of protein binding equilibria, at a given pH, that is, by exploring the 2^ξ possible ionization states of the whole molecule, with ξ being the number of ionizable groups. The results of this analysis, as revealed by the shielding/deshielding of the 13Cα nucleus, indicated that: (i) there is a significant difference in the computed 13Cα chemical shifts, between basic and acidic groups, as a function of the degree of charge of the side chain; (ii) this difference is attributed to the distance between the ionizable groups and the 13Cα nucleus, which is shorter for the acidic Asp and Glu groups as compared with that for the basic Lys and Arg groups; and (iii) the use of neutral, rather than charged, basic and acidic groups is a better approximation of the observed 13Cα chemical shifts of a protein in solution. The second is how side-chain flexibility influences computed 13Cα chemical shifts in an additional set of ubiquitin conformations, in which the side chains are generated from an NMR-derived structure with the backbone conformation assumed to be fixed.
The 13Cα chemical shift of a given amino acid residue in a protein is determined, mainly, by its own backbone and side-chain torsional angles, independent of the neighboring residues; the conformation of a given residue itself, however, depends on the environment of this residue and, hence, on the whole protein structure. As a consequence, this analysis reveals the role and impact of an accurate side-chain computation in the determination and refinement of protein conformation. The results of this analysis are: (i) a lower error between computed and observed 13Cα chemical shifts (by up to 3.7 ppm) was found for approximately 68% and approximately 63% of all ionizable residues and all non-Ala/Pro/Gly residues, respectively, in the additional set of conformations, compared with results for the model from which the set was derived; and (ii) all the additional conformations exhibit a lower root-mean-square deviation (1.97 ppm ≤ rmsd ≤ 2.13 ppm) between computed and observed 13Cα chemical shifts than the rmsd (2.32 ppm) computed for the starting conformation from which this additional set was derived. As a validation test, an analysis of the additional set of ubiquitin conformations, comparing computed and observed values of both 13Cα chemical shifts and χ1 torsional angles (given by the vicinal coupling constants 3J(N-Cγ) and 3J(C'-Cγ)), is discussed.
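The abstract above mentions exploring every possible ionization state of a molecule, a number that doubles with each ionizable group. The following is only a rough sketch of that enumeration, not the authors' procedure: it assumes non-interacting sites, and the pKa values, the function names `enumerate_states` and `mean_charges`, and the two-group example are all hypothetical.

```python
from itertools import product


def enumerate_states(n_groups):
    """All 2**n_groups protonation states; 1 = protonated, 0 = deprotonated."""
    return list(product((0, 1), repeat=n_groups))


def mean_charges(groups, pH):
    """Boltzmann-averaged charge of each group over all microstates.

    groups: list of (pKa, charge_when_protonated), e.g. (4.0, 0) for an
    acid and (10.5, 1) for a base. The pKa values are hypothetical and
    the sites are assumed not to interact with each other.
    """
    states = enumerate_states(len(groups))
    weights = []
    for state in states:
        w = 1.0
        for (pKa, _), protonated in zip(groups, state):
            if protonated:
                # Protonated/deprotonated population ratio of an isolated site
                w *= 10.0 ** (pKa - pH)
        weights.append(w)
    total = sum(weights)

    means = []
    for i, (_, q_prot) in enumerate(groups):
        # Losing the proton lowers the charge of the group by one unit
        q = sum(w * (q_prot if s[i] else q_prot - 1)
                for s, w in zip(states, weights))
        means.append(q / total)
    return means


groups = [(4.0, 0), (10.5, 1)]  # hypothetical acid and base
charges = mean_charges(groups, pH=4.0)
```

Averaging over microstates gives fractional charges, which is consistent with the abstract's remark that the charges "might not take on integral values". A realistic treatment would add site-site interaction energies to each microstate weight.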
Affiliation(s)
- Jorge A Vila
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853-1301, USA
16
Faure G, Bornot A, de Brevern AG. Protein contacts, inter-residue interactions and side-chain modelling. Biochimie 2008; 90:626-39. [DOI: 10.1016/j.biochi.2007.11.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 06/14/2007] [Accepted: 11/22/2007] [Indexed: 10/22/2022]
17
Abstract
We describe an automated method for the modeling of point mutations in protein structures. The protein is represented by all non-hydrogen atoms. The scoring function consists of several types of physical potential energy terms and homology-derived restraints. The optimization method implements a combination of conjugate gradient minimization and molecular dynamics with simulated annealing. The testing set consists of 717 pairs of known protein structures differing by a single mutation. Twelve variations of the scoring function were tested in three different environments of the mutated residue. The best-performing protocol optimizes all the atoms of the mutated residue, with respect to a scoring function that includes molecular mechanics energy terms for bond distances, angles, dihedral angles, peptide bond planarity, and non-bonded atomic contacts represented by Lennard-Jones potential, dihedral angle restraints derived from the aligned homologous structure, and a statistical potential for non-bonded atomic interactions extracted from a large set of known protein structures. The current method compares favorably with other tested approaches, especially when predicting long and flexible side-chains. In addition to the thoroughness of the conformational search, sampled degrees of freedom, and the scoring function type, the accuracy of the method was also evaluated as a function of the flexibility of the mutated side-chain, the relative volume change of the mutated residue, and its residue type. The results suggest that further improvement is likely to be achieved by concentrating on the improvement of the scoring function, in addition to or instead of increasing the variety of sampled conformations.
Affiliation(s)
- Eric Feyfant
  - Wyeth Research, Chemical and Screening Sciences, Cambridge, Massachusetts 02421, USA
18
Zhu J, Xie L, Honig B. Structural refinement of protein segments containing secondary structure elements: Local sampling, knowledge-based potentials, and clustering. Proteins 2006; 65:463-79. [PMID: 16927337 DOI: 10.1002/prot.21085] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Abstract
In this article, we present an iterative, modular optimization (IMO) protocol for the local structure refinement of protein segments containing secondary structure elements (SSEs). The protocol is based on three modules: a torsion-space local sampling algorithm, a knowledge-based potential, and a conformational clustering algorithm. Alternative methods are tested for each module in the protocol. For each segment, random initial conformations were constructed by perturbing the native dihedral angles of loops (and SSEs) of the segment to be refined while keeping the protein body fixed. Two refinement procedures based on molecular mechanics force fields - using either energy minimization or molecular dynamics - were also tested but were found to be less successful than the IMO protocol. We found that DFIRE is a particularly effective knowledge-based potential and that clustering algorithms that are biased by the DFIRE energies improve the overall results. Results were further improved by adding an energy minimization step to the conformations generated with the IMO procedure, suggesting that hybrid strategies that combine both knowledge-based and physical effective energy functions may prove to be particularly effective in future applications.
Affiliation(s)
- Jiang Zhu
  - Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, Room 815, New York, New York 10032, USA
19
Zhang J, Liu JS. On side-chain conformational entropy of proteins. PLoS Comput Biol 2006; 2:e168. [PMID: 17154716 PMCID: PMC1676032 DOI: 10.1371/journal.pcbi.0020168] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Received: 07/31/2006] [Accepted: 10/26/2006] [Indexed: 11/19/2022] Open
Abstract
The role of side-chain entropy (SCE) in protein folding has long been speculated about but is still not fully understood. Utilizing a newly developed Monte Carlo method, we conducted a systematic investigation of how the SCE relates to the size of the protein and how it differs among a protein's X-ray, NMR, and decoy structures. We estimated the SCE for a set of 675 nonhomologous proteins, and observed that there is a significant SCE for both exposed and buried residues for all these proteins; the contribution of buried residues approaches 40% of the overall SCE. Furthermore, the SCE can be quite different for structures with similar compactness or even similar conformations. As a striking example, we found that proteins' X-ray structures appear to pack more "cleverly" than their NMR or decoy counterparts in the sense of retaining higher SCE while achieving comparable compactness, which suggests that the SCE plays an important role in favouring native protein structures. By including an SCE term in a simple free energy function, we can significantly improve the discrimination of native protein structures from decoys.
Affiliation(s)
- Jinfeng Zhang
  - Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America
- Jun S Liu
  - Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America
20
Jain T, Cerutti DS, McCammon JA. Configurational-bias sampling technique for predicting side-chain conformations in proteins. Protein Sci 2006; 15:2029-39. [PMID: 16943441 PMCID: PMC2242598 DOI: 10.1110/ps.062165906] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 10/24/2022]
Abstract
Prediction of side-chain conformations is an important component of several biological modeling applications. In this work, we have developed and tested an advanced Monte Carlo sampling strategy for predicting side-chain conformations. Our method is based on a cooperative rearrangement of atoms that belong to a group of neighboring side-chains. This rearrangement is accomplished by deleting groups of atoms from the side-chains in a particular region, and regrowing them with the generation of trial positions that depends on both a rotamer library and a molecular mechanics potential function. This method allows us to incorporate flexibility about the rotamers in the library and explore phase space in a continuous fashion about the primary rotamers. We have tested our algorithm on a set of 76 proteins using the all-atom AMBER99 force field and electrostatics that are governed by a distance-dependent dielectric function. When the tolerance for correct prediction of the dihedral angles is a <20 degrees deviation from the native state, our prediction accuracies for chi1 are 83.3% and for chi1 and chi2 are 65.4%. The accuracies of our predictions are comparable to the best results in the literature that often used Hamiltonians that have been specifically optimized for side-chain packing. We believe that the continuous exploration of phase space enables our method to overcome limitations inherent with using discrete rotamers as trials.
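The accuracy figures quoted in this abstract depend on how a "correct" dihedral is scored. As an illustration only, the sketch below applies a tolerance of less than 20 degrees with periodic wrap-around. The function names and the toy angles are hypothetical, and the sketch ignores the symmetry of residues such as Phe or Tyr, which published benchmarks usually handle separately.

```python
def angular_deviation(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, tolerance=20.0):
    """Return (chi1 accuracy, chi1+chi2 accuracy) as fractions.

    predicted, native: lists of per-residue tuples (chi1,) or (chi1, chi2),
    in degrees. chi1+chi2 is scored only on residues that have a chi2 and
    counts as correct only if both angles are within the tolerance.
    """
    chi1_ok = chi1_total = chi12_ok = chi12_total = 0
    for pred, nat in zip(predicted, native):
        ok1 = angular_deviation(pred[0], nat[0]) < tolerance
        chi1_total += 1
        chi1_ok += ok1
        if len(nat) > 1:
            ok2 = angular_deviation(pred[1], nat[1]) < tolerance
            chi12_total += 1
            chi12_ok += ok1 and ok2
    return chi1_ok / chi1_total, chi12_ok / chi12_total


# Toy data (hypothetical): three residues, the last one has no chi2.
predicted = [(60.0, 180.0), (-175.0, 70.0), (65.0,)]
native = [(55.0, 150.0), (178.0, 65.0), (-60.0,)]
chi1_acc, chi12_acc = chi_accuracy(predicted, native)
```

The wrap-around matters: -175 and 178 degrees differ by 7 degrees, not 353, so the second residue counts as correct for chi1.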
Affiliation(s)
- Tushar Jain
- Howard Hughes Medical Institute, University of California, San Diego, CA 92093-0365, USA.
21
Santana R, Larrañaga P, Lozano JA. Side chain placement using estimation of distribution algorithms. Artif Intell Med 2006; 39:49-63. [PMID: 16854574 DOI: 10.1016/j.artmed.2006.04.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 12/21/2005] [Revised: 04/26/2006] [Accepted: 04/28/2006] [Indexed: 11/29/2022]
Abstract
OBJECTIVE: This paper presents an algorithm for the solution of the side chain placement problem. METHODS AND MATERIALS: The algorithm combines the application of the Goldstein elimination criterion with the univariate marginal distribution algorithm (UMDA), which stochastically searches the space of possible solutions. The suitability of the algorithm to address the problem is investigated using a set of 425 proteins. RESULTS: For a number of difficult instances where inference algorithms do not converge, it has been shown that UMDA is able to find better structures. CONCLUSIONS: The results obtained show that the algorithm can achieve better structures than those obtained with other state-of-the-art methods like inference-based techniques. Additionally, a theoretical and empirical analysis of the computational cost of the algorithm introduced has been presented.
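The Goldstein elimination criterion mentioned above can be sketched as follows. Rotamer r at position i is removed when some other rotamer t at the same position satisfies E(i_r) - E(i_t) + sum over j of min over s of [E(i_r, j_s) - E(i_t, j_s)] > 0. The sketch covers only this pruning step, not the UMDA search, and the data layout (`rotamers` as a dict of lists, `self_e` as a dict keyed by (position, rotamer), `pair_e` as a callable) is an assumption made for illustration rather than the authors' implementation.

```python
def goldstein_eliminates(i, r, t, alive, self_e, pair_e):
    """True if rotamer r at position i can be discarded in favour of rotamer t."""
    total = self_e[(i, r)] - self_e[(i, t)]
    for j, candidates in alive.items():
        if j == i:
            continue
        # Worst case for t relative to r over the rotamers still alive at j.
        total += min(pair_e(i, r, j, s) - pair_e(i, t, j, s) for s in candidates)
    return total > 0.0


def goldstein_prune(rotamers, self_e, pair_e):
    """Apply the criterion repeatedly until no further rotamer can be removed."""
    alive = {i: list(rs) for i, rs in rotamers.items()}
    changed = True
    while changed:
        changed = False
        for i in alive:
            for r in list(alive[i]):
                if any(t != r and goldstein_eliminates(i, r, t, alive, self_e, pair_e)
                       for t in alive[i]):
                    alive[i].remove(r)
                    changed = True
    return alive
```

Because the inequality is strict, two rotamers can never eliminate each other, so every position keeps at least one rotamer. The surviving sets would then be the search space handed to a stochastic optimizer.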
Affiliation(s)
- Roberto Santana
- Department of Computer Science and Artificial Intelligence, University of the Basque Country, CP-20080, Donostia-San Sebastián, Spain.
22
Abstract
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function.
Affiliation(s)
- Zhexin Xiang
- Center for Molecular Modeling, Center for Information Technology, National Institutes of Health, Building 12A Room 2051, 12 South Drive, Bethesda, Maryland 20892-5624, USA.
23
Hu X, Kuhlman B. Protein design simulations suggest that side-chain conformational entropy is not a strong determinant of amino acid environmental preferences. Proteins 2006; 62:739-48. [PMID: 16317667 DOI: 10.1002/prot.20786] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Indexed: 11/11/2022]
Abstract
Loss of side-chain conformational entropy is an important force opposing protein folding and the relative preferences of the amino acids for being buried or solvent exposed may be partially determined by which amino acids lose more side-chain entropy when placed in the core of a protein. To investigate these preferences, we have incorporated explicit modeling of side-chain entropy into the protein design algorithm, RosettaDesign. In the standard version of the program, the energy of a particular sequence for a fixed backbone depends only on the lowest energy side-chain conformations that can be identified for that sequence. In the new model, the free energy of a single amino acid sequence is calculated by evaluating the average energy and entropy of an ensemble of structures generated by Monte Carlo sampling of amino acid side-chain conformations. To evaluate the impact of including explicit side-chain entropy, sequences were designed for 110 native protein backbones with and without the entropy model. In general, the differences between the two sets of sequences are modest, with the largest changes being observed for the longer amino acids: methionine and arginine. Overall, the identity between the designed sequences and the native sequences does not increase with the addition of entropy, unlike what is observed when other key terms are added to the model (hydrogen bonding, Lennard-Jones energies, and solvation energies). These results suggest that side-chain conformational entropy has a relatively small role in determining the preferred amino acid at each residue position in a protein.
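The relation between average energy, entropy, and free energy that underlies the ensemble model described above can be illustrated with a short sketch. It assumes a small, explicitly enumerated set of side-chain states with known energies in kcal/mol and Boltzmann weights at a chosen temperature; RosettaDesign samples conformations by Monte Carlo instead, so the function name, the units, and the enumeration are illustrative assumptions only.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def ensemble_thermodynamics(energies, temperature=298.15):
    """Return (average energy, entropy, free energy) for discrete states.

    Probabilities follow the Boltzmann distribution, the entropy is
    S = -k * sum(p * ln p), and the free energy is F = <E> - T * S.
    """
    if not energies:
        raise ValueError("at least one state is required")
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    avg_energy = sum(p * e for p, e in zip(probs, energies))
    entropy = -K_B * sum(p * math.log(p) for p in probs if p > 0.0)
    free_energy = avg_energy - temperature * entropy
    return avg_energy, entropy, free_energy
```

For a single state the entropy is zero and the free energy equals that state's energy, which corresponds to the lowest-energy-conformation picture of the standard model; adding accessible states lowers the free energy through the entropy term.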
Affiliation(s)
- Xiaozhen Hu
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill 27599, USA
24
Zhang W, Duan Y. Grow to Fit Molecular Dynamics (G2FMD): an ab initio method for protein side-chain assignment and refinement. Protein Eng Des Sel 2006; 19:55-65. [PMID: 16401632 DOI: 10.1093/protein/gzj001] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
The rough energy landscapes and tight packing of protein interiors are two of the critical factors that have prevented the wide application of physics-based models in protein side-chain assignment and protein structure prediction in general. Complementing the rotamer-based methods, we propose an ab initio method that utilizes molecular mechanics simulations for protein side-chain assignment and refinement. By reducing the side-chain size, a smooth energy landscape was obtained owing to the increased distances between the side chains. The side chains then gradually grow back during molecular dynamics simulations while adjusting to their surroundings, driven by the interaction energies. The method overcomes the barriers due to tight packing that limit conformational sampling of physics-based models. A key feature of this approach is that the resulting structures are free from steric collisions and allow the application of all-atom models in the subsequent refinement. Tests on a small set of proteins showed nearly 100% accuracy on both chi1 and chi2 of buried residues: 94% of them were within 20 degrees of the native conformation, 79% were within 10 degrees, and 42% were within 5 degrees. However, the accuracy decreased when exposed side chains were involved. Further improvement and application of the method and the possible reasons that affect the accuracy on the exposed side chains are discussed.
Affiliation(s)
- Wei Zhang
- Department of Chemistry and Biochemistry, University of Delaware, Newark, DE 19716, USA
25
Centeno NB, Planas-Iglesias J, Oliva B. Comparative modelling of protein structure and its impact on microbial cell factories. Microb Cell Fact 2005; 4:20. [PMID: 15989691 PMCID: PMC1183243 DOI: 10.1186/1475-2859-4-20] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 05/03/2005] [Accepted: 06/30/2005] [Indexed: 11/22/2022] Open
Abstract
Comparative modeling is becoming an increasingly helpful technique in microbial cell factories as the knowledge of the three-dimensional structure of a protein would be an invaluable aid to solve problems on protein production. For this reason, an introduction to comparative modeling is presented, with special emphasis on the basic concepts, opportunities and challenges of protein structure prediction. This review is intended to serve as a guide for the biologist who has no special expertise and who is not involved in the determination of protein structure. Selected applications of comparative modeling in microbial cell factories are outlined, and the role of microbial cell factories in the structural genomics initiative is discussed.
Affiliation(s)
- Nuria B Centeno
- Structural Bioinformatics Laboratory, Research Group on Biomedical Informatics (GRIB), IMIM/UPF. c/ Dr. Aiguader 80. 08003 Barcelona, Spain
- Joan Planas-Iglesias
- Structural Bioinformatics Laboratory, Research Group on Biomedical Informatics (GRIB), IMIM/UPF. c/ Dr. Aiguader 80. 08003 Barcelona, Spain
- Baldomero Oliva
- Structural Bioinformatics Laboratory, Research Group on Biomedical Informatics (GRIB), IMIM/UPF. c/ Dr. Aiguader 80. 08003 Barcelona, Spain
26
27
Abstract
The success of structural genomics initiatives requires the development and application of tools for structure analysis, prediction, and annotation. In this paper we review recent developments in these areas; specifically structure alignment, the detection of remote homologs and analogs, homology modeling and the use of structures to predict function. We also discuss various rationales for structural genomics initiatives. These include the structure-based clustering of sequence space and genome-wide function assignment. It is also argued that structural genomics can be integrated into more traditional biological research if specific biological questions are included in target selection strategies.
Affiliation(s)
- Sharon Goldsmith-Fischman
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
28
Hillisch A, Hilgenfeld R. The role of protein 3D-structures in the drug discovery process. EXS 2003:157-81. [PMID: 12613176 DOI: 10.1007/978-3-0348-7997-2_8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 04/20/2023]
29
Desmet J, Spriet J, Lasters I. Fast and accurate side-chain topology and energy refinement (FASTER) as a new method for protein structure optimization. Proteins 2002; 48:31-43. [PMID: 12012335 DOI: 10.1002/prot.10131] [Citation(s) in RCA: 90] [Impact Index Per Article: 3.9] [Indexed: 11/06/2022]
Abstract
We have developed an original method for global optimization of protein side-chain conformations, called the Fast and Accurate Side-Chain Topology and Energy Refinement (FASTER) method. The method operates by systematically overcoming local minima of increasing order. Comparison of the FASTER results with those of the dead-end elimination (DEE) algorithm showed that both methods produce nearly identical results, but the FASTER algorithm is 100-1000 times faster than the DEE method and scales in a stable and favorable way as a function of protein size. We also show that low-order local minima may be almost as accurate as the global minimum when evaluated against experimentally determined structures. In addition, the new algorithm provides significant information about the conformational flexibility of individual side-chains. We observed that strictly rigid side-chains are concentrated mainly in the core of the protein, whereas highly flexible side-chains are found almost exclusively among solvent-oriented residues.
30
Abstract
Modeling side-chain conformations on a fixed protein backbone has a wide application in structure prediction and molecular design. Each effort in this field requires decisions about a rotamer set, scoring function, and search strategy. We have developed a new and simple scoring function, which operates on side-chain rotamers and consists of the following energy terms: contact surface, volume overlap, backbone dependency, electrostatic interactions, and desolvation energy. The weights of these energy terms were optimized to achieve the minimal average root mean square (rms) deviation between the lowest energy rotamer and real side-chain conformation on a training set of high-resolution protein structures. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. We obtained prediction accuracy of 90.4% for chi(1), 78.3% for chi(1 + 2), and 1.18 Å overall rms deviation. Furthermore, the derived scoring function combined with a Monte Carlo search algorithm was used to place all side chains onto a protein backbone simultaneously. The average prediction accuracy was 87.9% for chi(1), 73.2% for chi(1 + 2), and 1.34 Å rms deviation for 30 protein structures. Our approach was compared with available side-chain construction methods and showed improvement over the best among them: 4.4% for chi(1), 4.7% for chi(1 + 2), and 0.21 Å for rms deviation. We hypothesize that the scoring function instead of the search strategy is the main obstacle in side-chain modeling. Additionally, we show that a more detailed rotamer library is expected to increase chi(1 + 2) prediction accuracy but may have little effect on chi(1) prediction accuracy.
Affiliation(s)
- Shide Liang
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
31
Glick M, Rayan A, Goldblum A. A stochastic algorithm for global optimization and for best populations: a test case of side chains in proteins. Proc Natl Acad Sci U S A 2002; 99:703-8. [PMID: 11792838 PMCID: PMC117369 DOI: 10.1073/pnas.022418199] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022] Open
Abstract
The problem of global optimization is pivotal in a variety of scientific fields. Here, we present a robust stochastic search method that is able to find the global minimum for a given cost function, as well as, in most cases, any number of best solutions for very large combinatorial "explosive" systems. The algorithm iteratively eliminates variable values that contribute consistently to the highest end of a cost function's spectrum of values for the full system. Values that have not been eliminated are retained for a full, exhaustive search, allowing the creation of an ordered population of best solutions, which includes the global minimum. We demonstrate the ability of the algorithm to explore the conformational space of side chains in eight proteins, with 54 to 263 residues, to reproduce a population of their low energy conformations. The 1,000 lowest energy solutions are identical in the stochastic (with two different seed numbers) and full, exhaustive searches for six of eight proteins. The others retain the lowest 141 and 213 (of 1,000) conformations, depending on the seed number, and the maximal difference between stochastic and exhaustive is only about 0.15 kcal/mol. The energy gap between the lowest and highest of the 1,000 low-energy conformers in eight proteins is between 0.55 and 3.64 kcal/mol. This algorithm offers real opportunities for solving problems of high complexity in structural biology and in other fields of science and technology.
Affiliation(s)
- Meir Glick
- Department of Medicinal Chemistry and the David R. Bloom Center for Pharmacy, School of Pharmacy, Hebrew University of Jerusalem, Jerusalem 91120, Israel
32
Swain MT, Kemp GJ. Modelling protein side-chain conformations using constraint logic programming. Comput Chem 2001; 26:85-95. [PMID: 11765856 DOI: 10.1016/s0097-8485(01)00103-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/17/2022]
Abstract
Side-chain placement is an important sub-task in protein modelling. Selecting conformations for side-chains is a difficult problem because of the large search space to be explored. This problem can be addressed using constraint logic programming (CLP), which is an artificial intelligence technique developed to solve large combinatorial search problems. The side-chain placement problem can be expressed as a CLP program in which rotamer conformations are used as values for finite domain variables, and bad steric contacts involving rotamers are represented as constraints. This paper introduces the concept of null rotamers, and shows how these can be used in implementing a novel iterative approach. We present results that compare the accuracy of models constructed using different rotamer libraries and different domain variable enumeration heuristics. The results obtained using this CLP-based approach compare favourably with those obtained by other methods.
Affiliation(s)
- M T Swain
- Department of Computing Science, King's College, University of Aberdeen, Aberdeen, Scotland AB24 3UE, UK.
33
Abstract
The mapping of the human genome was completed earlier this year and efforts are underway to understand the role of gene products (i.e. proteins) in biological pathways and human disease and to exploit their functional roles to derive protein therapeutics and protein-based drugs. A key component to the next revolution in the 'post-genomic' era will be the increasingly widespread use of protein structure in rational experimental design. Improvements in quality, availability and utility of large-scale three- and four-dimensional protein structural information are enabling a revolution in rational design, having particular impact on drug discovery and optimization. New computational methodologies now yield modeled structures that are, in many cases, quantitatively comparable with crystal structures, at a fraction of the cost.
Affiliation(s)
- E T. Maggio
- Structural Bioinformatics, 92127, Tel: +1 858 675 2400 fax: +1 858 618 1040, San Diego, CA, USA
34
Abstract
Current techniques for the prediction of side-chain conformations on a fixed backbone have an accuracy limit of about 1.0-1.5 Å rmsd for core residues. We have carried out a detailed and systematic analysis of the factors that influence the prediction of side-chain conformation and, on this basis, have succeeded in extending the limits of side-chain prediction for core residues to about 0.7 Å rmsd from native, and 94% and 89% of chi(1) and chi(1+2) dihedral angles correctly predicted to within 20 degrees of native, respectively. These results are obtained using a force-field that accounts for only van der Waals interactions and torsional potentials. Prediction accuracy is strongly dependent on the rotamer library used. That is, a complete and detailed rotamer library is essential. The greatest accuracy was obtained with an extensive rotamer library, containing over 7560 members, in which bond lengths and bond angles were taken from the database rather than simply assuming idealized values. Perhaps the most surprising finding is that the combinatorial problem normally associated with the prediction of the side-chain conformation does not appear to be important. This conclusion is based on the fact that the prediction of the conformation of a single side-chain with all others fixed in their native conformations is only slightly more accurate than the simultaneous prediction of all side-chain dihedral angles.
Affiliation(s)
- Z Xiang
- Department of Biochemistry and Molecular Biophysics BB221, Columbia University, New York, NY 10032, USA
35
Abstract
The mapping of the human genome was completed earlier this year and efforts are underway to understand the role of gene products (i.e. proteins) in biological pathways and human disease and to exploit their functional roles to derive protein therapeutics and protein-based drugs. A key component to the next revolution in the 'post-genomic' era will be the increasingly widespread use of protein structure in rational experimental design. Improvements in quality, availability and utility of large-scale 3D and 4D protein structural information are enabling a revolution in rational design, having particular impact on drug discovery and optimization. New computational methodologies now yield modeled structures that are, in many cases, quantitatively comparable with crystal structures, at a fraction of the cost.
Affiliation(s)
- E T Maggio
- Structural Bioinformatics Inc., 92127, San Diego, CA, USA.
36
Völkel AR, Noolandi J. Meanfield approach to the thermodynamics of protein-solvent systems with application to p53. Biophys J 2001; 80:1524-37. [PMID: 11222313 PMCID: PMC1301344 DOI: 10.1016/s0006-3495(01)76125-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/21/2022] Open
Abstract
We present a meanfield theoretical approach for studying protein-solvent interactions. Starting with the partition function of the system, we develop a field theory by introducing densities for the different components of the system. At this point, protein-solvent interactions are introduced following the inhomogeneous Flory-Huggins model for polymers. Finally, we calculate the free energy in a meanfield approximation. We apply this method to study the stability of the tetramerization domain of the tumor suppressor protein p53 when subjected to site-directed mutagenesis. The four chains of this protein are held together by hydrophobic interactions, and some mutations can weaken this bond while preserving the secondary structure of the single protein chains. We find good qualitative agreement between our numerical results and experimental data, thus encouraging the use of this method as a guide in designing experiments.
Affiliation(s)
- A R Völkel
- Xerox Research Centre of Canada, Mississauga, Ontario L5K 2L1, Canada.
37
Martí-Renom MA, Stuart AC, Fiser A, Sánchez R, Melo F, Sali A. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 2001; 29:291-325. [PMID: 10940251 DOI: 10.1146/annurev.biophys.29.1.291] [Citation(s) in RCA: 2376] [Impact Index Per Article: 99.0] [Indexed: 11/09/2022]
Abstract
Comparative modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. The number of protein sequences that can be modeled and the accuracy of the predictions are increasing steadily because of the growth in the number of known protein structures and because of the improvements in the modeling software. Further advances are necessary in recognizing weak sequence-structure similarities, aligning sequences with structures, modeling of rigid body shifts, distortions, loops and side chains, as well as detecting errors in a model. Despite these problems, it is currently possible to model with useful accuracy significant parts of approximately one third of all known protein sequences. The use of individual comparative models in biology is already rewarding and increasingly widespread. A major new challenge for comparative modeling is the integration of it with the torrents of data from genome sequencing projects as well as from functional and structural genomics. In particular, there is a need to develop an automated, rapid, robust, sensitive, and accurate comparative modeling pipeline applicable to whole genomes. Such large-scale modeling is likely to encourage new kinds of applications for the many resulting models, based on their large number and completeness at the level of the family, organism, or functional network.
Affiliation(s)
- M A Martí-Renom
- Laboratories of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, Rockefeller University, New York, NY 10021, USA
38
Abstract
The prediction of protein structure, based primarily on sequence and structure homology, has become an increasingly important activity. Homology models have become more accurate and their range of applicability has increased. Progress has come, in part, from the flood of sequence and structure information that has appeared over the past few years, and also from improvements in analysis tools. These include profile methods for sequence searches, the use of three-dimensional structure information in sequence alignment and new homology modeling tools, specifically in the prediction of loop and side-chain conformations. There have also been important advances in understanding the physical chemical basis of protein stability and the corresponding use of physical chemical potential functions to identify correctly folded from incorrectly folded protein conformations.
Affiliation(s)
- B Al-Lazikani
- Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Columbia University, 630 West 168th Street, New York, NY 10032, USA
39
Feig M, Rotkiewicz P, Kolinski A, Skolnick J, Brooks CL. Accurate reconstruction of all-atom protein representations from side-chain-based low-resolution models. Proteins 2000; 41:86-97. [PMID: 10944396 DOI: 10.1002/1097-0134(20001001)41:1<86::aid-prot110>3.0.co;2-y] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.2] [Indexed: 11/06/2022]
Abstract
A procedure for the reconstruction of all-atom protein structures from side-chain center-based low-resolution models is introduced and applied to a set of test proteins with high-resolution X-ray structures. The accuracy of the rebuilt all-atom models is measured by root mean square deviations to the corresponding X-ray structures and percentages of correct chi(1) and chi(2) side-chain dihedrals. The benefit of including C(alpha) positions in the low-resolution model is examined, and the effect of lattice-based models on the reconstruction accuracy is discussed. Programs and scripts implementing the reconstruction procedure are made available through the NIH research resource for Multiscale Modeling Tools in Structural Biology (http://mmtsb.scripps.edu).
Affiliation(s)
- M Feig
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
40
Abstract
Our understanding of the rules relating sequence to structure in antibodies has led to the development of accurate knowledge-based procedures for antibody modeling. Information gained from the analysis of antibody structures has been successfully exploited to engineer antibody-like molecules endowed with prescribed properties, such as increased stability or different specificity, many of which have a broad spectrum of applications both in therapy and in research. Here we describe a knowledge-based procedure for the prediction of the antibody-variable domains, based on the canonical structures method for the antigen-binding site, and discuss its expected accuracy and limitations. The rational design of antibody-based molecules is illustrated using as an example one of the most widely employed modifications of antibody structures: the humanization of animal-derived antibodies to reduce their immunogenicity for serotherapy in humans.
Affiliation(s)
- V Morea
- IRBM "P. Angeletti", Via Pontina Km. 30.600, Pomezia, 00040, Italy
41
Lemak AS, Gunn JR. Rotamer-Specific Potentials of Mean Force for Residue Pair Interactions. J Phys Chem B 2000. [DOI: 10.1021/jp9919157] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Affiliation(s)
- Alexandre S. Lemak
- Département de Chimie, Centre de Recherche en Calcul Appliqué, and Protein Engineering Network of Centers of Excellence, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec H3C 3J7, Canada
- John R. Gunn
- Département de Chimie, Centre de Recherche en Calcul Appliqué, and Protein Engineering Network of Centers of Excellence, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec H3C 3J7, Canada
42
Mendes J, Baptista AM, Carrondo MA, Soares CM. Improved modeling of side-chains in proteins with rotamer-based methods: a flexible rotamer model. Proteins 1999; 37:530-43. [PMID: 10651269 DOI: 10.1002/(sici)1097-0134(19991201)37:4<530::aid-prot4>3.0.co;2-h] [Citation(s) in RCA: 63] [Impact Index Per Article: 2.4] [Indexed: 11/10/2022]
Abstract
Side-chain modeling has a widespread application in many current methods for protein tertiary structure determination, prediction, and design. Of the existing side-chain modeling methods, rotamer-based methods are the fastest and most efficient. Classically, a rotamer is conceived as a single, rigid conformation of an amino acid side-chain. Here, we present a flexible rotamer model in which a rotamer is a continuous ensemble of conformations that cluster around the classic rigid rotamer. We have developed a thermodynamically based method for calculating effective energies for the flexible rotamer. These energies have a one-to-one correspondence with the potential energies of the rigid rotamer. Therefore, the flexible rotamer model is completely general and may be used with any rotamer-based method in substitution of the rigid rotamer model. We have compared the performance of the flexible and rigid rotamer models with one side-chain modeling method in particular (the self-consistent mean field theory method) on a set of 20 high quality crystallographic protein structures. For the flexible rotamer model, we obtained average predictions of 85.8% for chi1, 76.5% for chi1+2 and 1.34 Å for root-mean-square deviation (RMSD); the corresponding values for core residues were 93.0%, 87.7% and 0.70 Å, respectively. These values represent improvements of 7.3% for chi1, 8.1% for chi1+2 and 0.23 Å for RMSD over the predictions obtained with the rigid rotamer model under otherwise identical conditions; the corresponding improvements for core residues were 6.9%, 10.5% and 0.43 Å, respectively. We found that the predictions obtained with the flexible rotamer model were also significantly better than those obtained for the same set of proteins with another state-of-the-art side-chain placement method in the literature, especially for core residues. The flexible rotamer model represents a considerable improvement over the classic rigid rotamer model. It can, therefore, be used with considerable advantage in all rotamer-based methods commonly applied to protein tertiary structure determination, prediction, and design and also in predictions of free energies in mutational studies.
Affiliation(s)
- J Mendes
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Oeiras, Portugal
|
43
|
Abstract
We present the recursive dynamic programming (RDP) method for the threading approach to three-dimensional protein structure prediction. RDP is based on the divide-and-conquer paradigm and maps the protein sequence whose backbone structure is to be found (the protein target) onto the known backbone structure of a model protein (the protein template) in a stepwise fashion, a technique that is similar to computing local alignments but utilising different cost functions. We begin by mapping parts of the target onto the template that show statistically significant similarity with the template sequence. After mapping, the template structure is modified in order to account for the mapped target residues. Then significant similarities between the yet unmapped parts of the target and the modified template are searched, and the resulting segments of the target are mapped onto the template. This recursive process of identifying segments in the target to be mapped onto the template and modifying the template is continued until no significant similarities between the remaining parts of target and template are found. Those parts which are left unmapped by the procedure are interpreted as gaps. The RDP method is robust in the sense that different local alignment methods can be used, several alternatives of mapping parts of the target onto the template can be handled and compared in the process, and the cost functions can be dynamically adapted to biological needs. Our computer experiments show that the RDP procedure is efficient and effective. We can thread a typical protein sequence against a database of 887 template domains in about 12 hours even on a low-cost workstation (SUN Ultra 5). In statistical evaluations on databases of known protein structures, RDP significantly outperforms competing methods. 
RDP has been especially valuable in providing accurate alignments for modeling active sites of proteins. RDP is part of the ToPLign system (GMD Toolbox for protein alignment) and can be accessed via the WWW independently or in concert with other ToPLign tools at http://cartan.gmd.de/ToPLign.html.
Affiliation(s)
- R Thiele
- German National Research Center for Information Technology (GMD), Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, D-53754, Germany
|
44
|
Philippopoulos M, Lim C. Exploring the dynamic information content of a protein NMR structure: comparison of a molecular dynamics simulation with the NMR and X-ray structures of Escherichia coli ribonuclease HI. Proteins 1999; 36:87-110. [PMID: 10373009 DOI: 10.1002/(sici)1097-0134(19990701)36:1<87::aid-prot8>3.0.co;2-r] [Citation(s) in RCA: 49] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
The multiconformer nature of solution nuclear magnetic resonance (NMR) structures of proteins results from the effects of intramolecular dynamics, spin diffusion and an uneven distribution of structural restraints throughout the molecule. A delineation of the former from the latter two contributions is attempted in this work for an ensemble of 15 NMR structures of the protein Escherichia coli ribonuclease HI (RNase HI). Exploration of the dynamic information content of the NMR ensemble is carried out through correlation with data from two crystal structures and a 1.7-ns molecular dynamics (MD) trajectory of RNase HI in explicit solvent. Assessment of the consistency of the crystal and mean MD structures with nuclear Overhauser effect (NOE) data showed that the NMR ensemble is overall more compatible with the high-resolution (1.48 Å) crystal structure than with either the lower-resolution (2.05 Å) crystal structure or the MD simulation. Furthermore, the NMR ensemble is found to span more conformational space than the MD simulation for both the backbone and the sidechains of RNase HI. Nonetheless, the backbone conformational variability of both the NMR ensemble and the simulation is especially consistent with NMR relaxation measurements of two loop regions that are putative sites of substrate recognition. Plausible side-chain dynamic information is extracted from the NMR ensemble on the basis of (i) rotamericity and syn-pentane character of variable torsion angles, (ii) comparison of the magnitude of atomic mean-square fluctuations (msf) with those deduced from crystallographic thermal factors, and (iii) comparison of torsion angle conformational behavior in the NMR ensemble and the simulation. Several heterogeneous torsion angles, while adopting non-rotameric/syn-pentane conformations in the NMR ensemble, exist in a unique conformation in the simulation and display low X-ray thermal factors. These torsions are identified as sites whose variability is likely to be an artifact of the NMR structure determination procedure. A number of other torsions show a close correspondence between the conformations sampled in the NMR and MD ensembles, as well as significant correlations among crystallographic thermal factors and atomic msf calculated from the NMR ensemble and the simulation. These results indicate that a significant amount of dynamic information is contained in the NMR ensemble. The relevance of the present findings for the biological function of RNase HI, protein recognition studies, and previous investigations of the motional content of protein NMR structures are discussed.
|
45
|
Morea V, Leplae R, Tramontano A. Protein structure prediction and design. BIOTECHNOLOGY ANNUAL REVIEW 1999; 4:177-214. [PMID: 9890141 DOI: 10.1016/s1387-2656(08)70070-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
Proteins have a unique native conformation, which can be proven in many instances to be determined by the amino acid sequence alone. The folding problem, that is the understanding of how the amino acid sequence directs folding, is still unsolved, despite more than 30 years of effort. However, many new methods have appeared in the past few years. This chapter describes the different principles underlying them and tries to give an overview of their successes and pitfalls.
Affiliation(s)
- V Morea
- IRBM P. Angeletti, Pomezia, Rome, Italy
|
46
|
Schueler-Furman O, Elber R, Margalit H. Knowledge-based structure prediction of MHC class I bound peptides: a study of 23 complexes. FOLDING & DESIGN 1999; 3:549-64. [PMID: 9889166 DOI: 10.1016/s1359-0278(98)00070-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
Abstract
BACKGROUND: The binding of T-cell antigenic peptides to MHC molecules is a prerequisite for their immunogenicity. The ability to identify binding peptides based on the protein sequence is of great importance to the rational design of peptide vaccines. As the requirements for peptide binding cannot be fully explained by the peptide sequence per se, structural considerations should be taken into account and are expected to improve predictive algorithms. The first step in such an algorithm requires accurate and fast modeling of the peptide structure in the MHC-binding groove. RESULTS: We have used 23 solved peptide-MHC class I complexes as a source of structural information in the development of a modeling algorithm. The peptide backbones and MHC structures were used as the templates for prediction. Sidechain conformations were built based on a rotamer library, using the 'dead end elimination' approach. A simple energy function selects the favorable combination of rotamers for a given sequence. It further selects the correct backbone structure from a limited library. The influence of different parameters on the prediction quality was assessed. With a specific rotamer library that incorporates information from the peptide sidechains in the solved complexes, the algorithm correctly identifies 85% (92%) of all (buried) sidechains and selects the correct backbones. Under cross-validation, 70% (78%) of all (buried) residues are correctly predicted and most of all backbones. The interaction between peptide sidechains has a negligible effect on the prediction quality. CONCLUSIONS: The structure of the peptide sidechains follows from the interactions with the MHC and the peptide backbone, as the prediction is hardly influenced by sidechain interactions. The proposed methodology was able to select the correct backbone from a limited set. The impairment in performance under cross-validation suggests that, currently, the specific rotamer library is not satisfactorily representative. The predictions might improve with an increase in the data.
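The 'dead end elimination' step mentioned in this abstract can be sketched with the classic singles criterion: a rotamer is discarded when even its best-case energy is worse than the worst-case energy of some alternative rotamer at the same position. The code below is a minimal sketch under stated assumptions, not the authors' implementation; the names `dead_end_elimination`, `self_e`, and `pair`, and the toy energies, are hypothetical.

```python
from typing import Callable, List


def dead_end_elimination(
    self_e: List[List[float]],
    pair: Callable[[int, int, int, int], float],
) -> List[List[int]]:
    """Prune rotamers that cannot be part of the global minimum.

    self_e[i][r]     : self energy of rotamer r at position i (hypothetical input).
    pair(i, r, j, s) : symmetric pair energy between rotamer r at position i
                       and rotamer s at position j (hypothetical callback).
    Returns the indices of surviving rotamers for each position.
    """
    n = len(self_e)
    alive = [list(range(len(rots))) for rots in self_e]
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for r in list(alive[i]):
                # Best case for r: most favourable partner at every other position.
                best_r = self_e[i][r] + sum(
                    min(pair(i, r, j, s) for s in alive[j]) for j in others
                )
                for t in alive[i]:
                    if t == r:
                        continue
                    # Worst case for t: least favourable partner everywhere.
                    worst_t = self_e[i][t] + sum(
                        max(pair(i, t, j, s) for s in alive[j]) for j in others
                    )
                    if best_r > worst_t:
                        alive[i].remove(r)  # r is a dead end
                        changed = True
                        break
    return alive


# Toy example: rotamer 1 at position 0 has a high self energy and is pruned.
print(dead_end_elimination([[0.0, 10.0], [0.0, 0.0]], lambda i, r, j, s: 0.0))
```

A rotamer is only removed when a different surviving rotamer dominates it, so every position keeps at least one candidate and the remaining combinations can then be searched exhaustively or scored with the energy function.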
Affiliation(s)
- O Schueler-Furman
- Department of Molecular Genetics and Biotechnology, The Hebrew University, Hadassah Medical School, Jerusalem, Israel.
|
47
|
Huang ES, Koehl P, Levitt M, Pappu RV, Ponder JW. Accuracy of side-chain prediction upon near-native protein backbones generated by Ab initio folding methods. Proteins 1998; 33:204-17. [PMID: 9779788 DOI: 10.1002/(sici)1097-0134(19981101)33:2<204::aid-prot5>3.0.co;2-i] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The ab initio folding problem can be divided into two sequential tasks of approximately equal computational complexity: the generation of native-like backbone folds and the positioning of side chains upon these backbones. The prediction of side-chain conformation in this context is challenging, because at best only the near-native global fold of the protein is known. To test the effect of displacements in the protein backbones on side-chain prediction for folds generated ab initio, sets of near-native backbones (≤ 4 Å Cα RMS error) for four small proteins were generated by two methods. The steric environment surrounding each residue was probed by placing the side chains in the native conformation on each of these decoys, followed by torsion-space optimization to remove steric clashes on a rigid backbone. We observe that on average 40% of the chi1 angles were displaced by 40 degrees or more, effectively setting the limits in accuracy for side-chain modeling under these conditions. Three different algorithms were subsequently used for prediction of side-chain conformation. The average prediction accuracy for the three methods was remarkably similar: 49% to 51% of the chi1 angles were predicted correctly overall (33% to 36% of the chi1+2 angles). Interestingly, when the inter-side-chain interactions were disregarded, the mean accuracy increased. A consensus approach is described, in which side-chain conformations are defined based on the most frequently predicted chi angles for a given method upon each set of near-native backbones. We find that consensus modeling, which de facto includes backbone flexibility, improves side-chain prediction: chi1 accuracy improved to 51-54% (36-42% of chi1+2). Implications of a consensus method for ab initio protein structure prediction are discussed.
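The consensus idea described above, taking the most frequently predicted chi angle across a set of near-native backbones, can be sketched as a simple vote. This is an illustrative sketch only: grouping angles into three 120-degree wells centred on 60, 180, and 300 degrees is an assumption, as are the function names `rotamer_well` and `consensus_chi`; the abstract does not specify how the authors grouped predictions.

```python
from collections import Counter


def rotamer_well(chi_deg):
    """Map a chi angle in degrees to the centre of its 120-degree well (assumed binning)."""
    a = chi_deg % 360.0  # wrap into [0, 360)
    if a < 120.0:
        return 60
    if a < 240.0:
        return 180
    return 300


def consensus_chi(predictions):
    """Return the most frequently predicted well for one residue.

    predictions: chi angles (degrees) predicted for the same residue,
                 one value per near-native backbone decoy (hypothetical input).
    """
    votes = Counter(rotamer_well(c) for c in predictions)
    return votes.most_common(1)[0][0]


# Five decoys: three vote for the 60-degree well, one each for 300 and 180.
print(consensus_chi([55.0, 70.0, -65.0, 62.0, 178.0]))
```

Voting across decoys is what lets the consensus absorb backbone variation: a prediction that is only favoured on one distorted backbone is outvoted by the well that recurs across the set.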
Affiliation(s)
- E S Huang
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
|
48
|
Schaffer L, Verkhivker GM. Predicting structural effects in HIV-1 protease mutant complexes with flexible ligand docking and protein side-chain optimization. Proteins 1998; 33:295-310. [PMID: 9779795 DOI: 10.1002/(sici)1097-0134(19981101)33:2<295::aid-prot12>3.0.co;2-f] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
We present a computational approach for predicting structures of ligand-protein complexes and analyzing binding energy landscapes that combines a Monte Carlo simulated annealing technique to determine the ligand-bound conformation with the dead-end elimination algorithm for side-chain optimization of the protein active site residues. Flexible ligand docking and optimization of mobile protein side-chains have been performed to predict structural effects in the V32I/I47V/V82I HIV-1 protease mutant bound with the SB203386 ligand and in the V82A HIV-1 protease mutant bound with the A77003 ligand. The computational structure predictions are consistent with the crystal structures of these ligand-protein complexes. The emerging relationships between ligand docking and side-chain optimization of the active site residues are rationalized based on the analysis of the ligand-protein binding energy landscape.
Affiliation(s)
- L Schaffer
- Agouron Pharmaceuticals, Inc., La Jolla, California 92037, USA
|
49
|
Abstract
The field of protein structure prediction is evolving rapidly and in the last few years a number of new methods have been developed and evaluated. However, comparative modeling, or modeling by homology, is still the method of choice when the unknown protein shares any significant sequence similarity with a protein of known structure. The accuracy of the method is highly dependent on the degree of similarity between the target protein and that used as a template. Nevertheless, careful consideration of all the steps performed in the modeling procedure allows useful information to be obtained also from a model based on very low sequence identity.
|
50
|
Bower MJ, Cohen FE, Dunbrack RL. Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. J Mol Biol 1997; 267:1268-82. [PMID: 9150411 DOI: 10.1006/jmbi.1997.0926] [Citation(s) in RCA: 425] [Impact Index Per Article: 15.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Modeling by homology is the most accurate computational method for translating an amino acid sequence into a protein structure. Homology modeling can be divided into two sub-problems, placing the polypeptide backbone and adding side-chains. We present a method for rapidly predicting the conformations of protein side-chains, starting from main-chain coordinates alone. The method involves using fewer than ten rotamers per residue from a backbone-dependent rotamer library and a search to remove steric conflicts. The method is initially tested on 299 high resolution crystal structures by rebuilding side-chains onto the experimentally determined backbone structures. A total of 77% of chi1 and 66% of chi(1 + 2) dihedral angles are predicted within 40 degrees of their crystal structure values. We then tested the method on the entire database of known structures in the Protein Data Bank. The predictive accuracy of the algorithm was strongly correlated with the resolution of the structures. In an effort to simulate a realistic homology modeling problem, 9424 homology models were created using three different modeling strategies. For prediction purposes, pairs of structures were identified which shared between 30% and 90% sequence identity. One strategy results in 82% of chi1 and 72% chi(1 + 2) dihedral angles predicted within 40 degrees of the target crystal structure values, suggesting that movements of the backbone associated with this degree of sequence identity are not large enough to disrupt the predictive ability of our method for non-native backbones. These results compared favorably with existing methods over a comprehensive data set.
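The accuracy figures quoted in this abstract (chi1 and chi(1 + 2) within 40 degrees of the crystal structure) can be computed along the following lines. This is a minimal sketch, assuming each residue is given as a `(chi1, chi2)` pair in degrees and that "within 40 degrees" includes the boundary; the function names `angle_diff` and `chi_accuracy` are hypothetical, and residues with fewer than two chi angles are not handled.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (periodic)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(pred, ref, tol=40.0):
    """Fraction of residues with chi1, and with both chi1 and chi2, within tol.

    pred, ref: equal-length lists of (chi1, chi2) tuples in degrees
               (hypothetical input layout).
    Returns (chi1_accuracy, chi1_plus_2_accuracy).
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("pred and ref must be non-empty and of equal length")
    chi1_ok = 0
    chi12_ok = 0
    for (p1, p2), (r1, r2) in zip(pred, ref):
        if angle_diff(p1, r1) <= tol:
            chi1_ok += 1
            # chi(1 + 2) counts only when chi1 is already correct.
            if angle_diff(p2, r2) <= tol:
                chi12_ok += 1
    n = len(pred)
    return chi1_ok / n, chi12_ok / n


predicted = [(60.0, 180.0), (-170.0, 60.0), (60.0, 60.0)]
crystal = [(65.0, 175.0), (175.0, -60.0), (180.0, 60.0)]
print(chi_accuracy(predicted, crystal))
```

The periodic difference matters in practice: a prediction of -170 degrees against a reference of 175 degrees is only 15 degrees off, and a naive subtraction would wrongly count it as an error.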
Affiliation(s)
- M J Bower
- Department of Pharmaceutical Chemistry, University of California San Francisco, 94143-0450, USA
|