1. Gokcan H, Isayev O. Prediction of protein pKa with representation learning. Chem Sci 2022; 13:2462-2474. [PMID: 35310485 PMCID: PMC8864681 DOI: 10.1039/d1sc05610g] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/12/2021] [Accepted: 01/29/2022] [Indexed: 11/21/2022]
Abstract
The behavior of proteins is closely related to the protonation states of the residues. Therefore, prediction and measurement of pKa are essential to understand the basic functions of proteins. In this work, we develop a new empirical scheme for protein pKa prediction that is based on deep representation learning. It combines machine learning with atomic environment vector (AEV) and learned quantum mechanical representation from ANI-2x neural network potential (J. Chem. Theory Comput. 2020, 16, 4192). The scheme requires only the coordinate information of a protein as the input and separately estimates the pKa for all five titratable amino acid types. The accuracy of the approach was analyzed with both cross-validation and an external test set of proteins. Obtained results were compared with the widely used empirical approach PROPKA. The new empirical model provides accuracy with MAEs below 0.5 for all amino acid types. It surpasses the accuracy of PROPKA and performs significantly better than the null model. Our model is also sensitive to the local conformational changes and molecular interactions.
Affiliation(s)
- Hatice Gokcan
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA, USA
2. Opuu V, Sun YJ, Hou T, Panel N, Fuentes EJ, Simonson T. A physics-based energy function allows the computational redesign of a PDZ domain. Sci Rep 2020; 10:11150. [PMID: 32636412 PMCID: PMC7341745 DOI: 10.1038/s41598-020-67972-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2020] [Accepted: 06/08/2020] [Indexed: 11/30/2022]
Abstract
Computational protein design (CPD) can address the inverse folding problem, exploring a large space of sequences and selecting ones predicted to fold. CPD was used previously to redesign several proteins, employing a knowledge-based energy function for both the folded and unfolded states. We show that a PDZ domain can be entirely redesigned using a "physics-based" energy for the folded state and a knowledge-based energy for the unfolded state. Thousands of sequences were generated by Monte Carlo simulation. Three were chosen for experimental testing, based on their low energies and several empirical criteria. All three could be overexpressed and had native-like circular dichroism spectra and 1D-NMR spectra typical of folded structures. Two had upshifted thermal denaturation curves when a peptide ligand was present, indicating binding and suggesting folding to a correct, PDZ structure. Evidently, the physical principles that govern folded proteins, with a dash of empirical post-filtering, can allow successful whole-protein redesign.
Affiliation(s)
- Vaitea Opuu
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Young Joo Sun
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Titus Hou
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Nicolas Panel
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Ernesto J Fuentes
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Thomas Simonson
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
3. Mahdizadeh SJ, Carlesso A, Eriksson LA. Deciphering the selectivity of inhibitor MKC9989 towards residue K907 in IRE1α; a multiscale in silico approach. RSC Adv 2020; 10:19720-19729. [PMID: 35515428 PMCID: PMC9054218 DOI: 10.1039/d0ra01895c] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/28/2020] [Accepted: 05/14/2020] [Indexed: 02/05/2023]
Abstract
The selectivity of the ligand MKC9989, as inhibitor of the Inositol-Requiring Enzyme 1α (IRE1α) transmembrane kinase/ribonuclease protein, towards the residue K907 in the context of Schiff base formation, has been investigated by employing an array of in silico techniques including Multi-Conformation Continuum Electrostatics (MCCE) simulations, Quantum Mechanics/Molecular Mechanics (QM/MM) calculations, covalent docking, and Molecular Dynamics (MD) simulations. According to the MCCE results, K907 displays the lowest pKa value among all 23 lysine residues in IRE1α. The MCCE simulations also indicate a critical interaction between K907 and D885 within the hydrophobic pocket which increases significantly at low protein dielectric constants. The QM/MM calculations reveal a spontaneous proton transfer from K907 to D885, consistent with the low pKa value of K907. A Potential Energy Surface (PES) scan confirms the lack of energy barrier and transition state associated with this proton transfer reaction. Covalent docking and MD simulations verify that the protein pocket containing K907 can effectively stabilize the inhibitor by strong π–π and hydrogen bonding interactions. In addition, Radial Distribution Function (RDF) analysis shows that the imine group formed in the chemical reaction between MKC9989 and K907 is inaccessible to water molecules and thus the probability of imine hydrolysis is almost zero. The results of the current study explain the high selectivity of the MKC9989 inhibitor towards the K907 residue of IRE1α. The high selectivity of inhibitor MKC9989 towards Lys907 of IRE1α is explained by the unique pKa properties of the lysine.
Affiliation(s)
- Antonio Carlesso
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Göteborg, Sweden
- Leif A. Eriksson
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Göteborg, Sweden
4. Cvitkovic JP, Pauplis CD, Kaminski GA. PKA17-A Coarse-Grain Grid-Based Methodology and Web-Based Software for Predicting Protein pKa Shifts. J Comput Chem 2019; 40:1718-1726. [PMID: 30895643 DOI: 10.1002/jcc.25826] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/18/2018] [Accepted: 02/26/2019] [Indexed: 11/09/2022]
Abstract
We have developed and tested PKA17, a coarse-grain grid-based model for predicting protein pKa shifts. Our pKa predictor is currently deployed via a website interface. We have carried out parameter fitting using 442 Asp, Glu, His, Lys, and Arg residues for which experimental results are available in the literature. PROPKA software has been used for benchmarking. The average unsigned error and root-mean-square deviation (RMSD) have been found to be 0.628 and 0.831 pH units, respectively, for PKA17. The corresponding results with PROPKA are 0.761 and 1.063 units. We have assessed the robustness of the developed PKA17 methodology with a number of tests and have also explored the possibility of using a combination of PROPKA and PKA17 calculations in order to improve the accuracy of predicted pKa values for protein residues. We have also once again confirmed that protein acidity constants are influenced almost entirely by residues in the immediate spatial proximity of the ionizable amino acids. The resulting PKA17 software has been deployed online with a web-based interface at http://users.wpi.edu/~jpcvitkovic/pka_calc.html. © 2019 Wiley Periodicals, Inc.
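The two accuracy measures quoted in this abstract, the average unsigned error and the RMSD, can be sketched as follows. This is a minimal illustration using the standard definitions of both quantities; the function names and the pKa values are hypothetical and are not taken from the paper or from the PKA17 software.

```python
import math


def average_unsigned_error(predicted, experimental):
    """Mean of |predicted - experimental| over all residues (pH units)."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) for p, e in zip(predicted, experimental))
    return total / len(predicted)


def rmsd(predicted, experimental):
    """Root-mean-square deviation between predicted and experimental pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(total / len(predicted))


# Hypothetical pKa values, for illustration only (not data from the paper).
predicted = [3.9, 4.4, 6.8, 10.1]
experimental = [3.5, 4.4, 6.3, 10.4]

aue = average_unsigned_error(predicted, experimental)
rms = rmsd(predicted, experimental)
print(f"AUE = {aue:.3f} pH units, RMSD = {rms:.3f} pH units")
```

Because the RMSD squares each deviation, it is never smaller than the average unsigned error and is more sensitive to a few badly predicted residues, which is consistent with the RMSD values in the abstract exceeding the corresponding unsigned errors.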
Affiliation(s)
- John P Cvitkovic
  - Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, 100 Institute Rd., Worcester, Massachusetts, 01609
- Connor D Pauplis
  - Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, 100 Institute Rd., Worcester, Massachusetts, 01609
- George A Kaminski
  - Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, 100 Institute Rd., Worcester, Massachusetts, 01609
5. Vila JA, Arnautova YA. 13C Chemical Shifts in Proteins: A Rich Source of Encoded Structural Information. Springer Series on Bio- and Neurosystems 2019. [PMCID: PMC7123919 DOI: 10.1007/978-3-319-95843-9_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Abstract
Despite the formidable progress in Nuclear Magnetic Resonance (NMR) spectroscopy, quality assessment of NMR-derived structures remains as an important problem. Thus, validation of protein structures is essential for the spectroscopists, since it could enable them to detect structural flaws and potentially guide their efforts in further refinement. Moreover, availability of accurate and efficient validation tools would help molecular biologists and computational chemists to evaluate quality of available experimental structures and to select a protein model which is the most suitable for a given scientific problem. The 13Cα nuclei are ubiquitous in proteins, moreover, their shieldings are easily obtainable from NMR experiments and represent a rich source of encoded structural information that makes 13Cα chemical shifts an attractive candidate for use in computational methods aimed at determination and validation of protein structures. In this chapter, the basis of a novel methodology of computing, at the quantum chemical level of theory, the 13Cα shielding for the amino acid residues in proteins is described. We also identify and examine the main factors affecting the 13Cα-shielding computation. Finally, we illustrate how the information encoded in the 13C chemical shifts can be used for a number of applications, viz., from protein structure prediction of both α-helical and β-sheet conformations, to determination of the fraction of the tautomeric forms of the imidazole ring of histidine in proteins as a function of pH or to accurate detection of structural flaws, at a residue-level, in NMR-determined protein models.
6. Villa F, Mignon D, Polydorides S, Simonson T. Comparing pairwise-additive and many-body generalized Born models for acid/base calculations and protein design. J Comput Chem 2017; 38:2396-2410. [PMID: 28749575 DOI: 10.1002/jcc.24898] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/07/2017] [Revised: 06/30/2017] [Accepted: 07/06/2017] [Indexed: 12/13/2022]
Abstract
Generalized Born (GB) solvent models are common in acid/base calculations and protein design. With GB, the interaction between a pair of solute atoms depends on the shape of the protein/solvent boundary and, therefore, the positions of all solute atoms, so that GB is a many-body potential. For compute-intensive applications, the model is often simplified further, by introducing a mean, native-like protein/solvent boundary, which removes the many-body property. We investigate a method for both acid/base calculations and protein design that uses Monte Carlo simulations in which side chains can explore rotamers, bind/release protons, or mutate. The fluctuating protein/solvent dielectric boundary is treated in a way that is numerically exact (within the GB framework), in contrast to a mean boundary. Its originality is that it captures the many-body character while retaining the residue-pairwise complexity given by a fixed boundary. The method is implemented in the Proteus protein design software. It yields a slight but systematic improvement for acid/base constants in nine proteins and a significant improvement for the computational design of three PDZ domains. It eliminates a source of model uncertainty, which will facilitate the analysis of other model limitations. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Francesco Villa
  - Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
- David Mignon
  - Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
- Savvas Polydorides
  - Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
- Thomas Simonson
  - Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
7. Sharma I, Kaminski GA. Using polarizable POSSIM force field and fuzzy-border continuum solvent model to calculate pK(a) shifts of protein residues. J Comput Chem 2017; 38:65-80. [PMID: 27785788 PMCID: PMC5123858 DOI: 10.1002/jcc.24519] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/28/2016] [Revised: 09/22/2016] [Accepted: 10/02/2016] [Indexed: 12/26/2022]
Abstract
Our Fuzzy-Border (FB) continuum solvent model has been extended and modified to produce hydration parameters for small molecules using POlarizable Simulations Second-order Interaction Model (POSSIM) framework with an average error of 0.136 kcal/mol. It was then used to compute pKa shifts for carboxylic and basic residues of the turkey ovomucoid third domain (OMTKY3) protein. The average unsigned errors in the acid and base pKa values were 0.37 and 0.4 pH units, respectively, versus 0.58 and 0.7 pH units as calculated with a previous version of polarizable protein force field and Poisson-Boltzmann continuum solvent. This POSSIM/FB result is produced with explicit refitting of the hydration parameters to the pKa values of the carboxylic and basic residues of the OMTKY3 protein; thus, the values of the acidity constants can be viewed as additional fitting target data. In addition to calculating pKa shifts for the OMTKY3 residues, we have studied aspartic acid residues of RNase Sa. This was done without any further refitting of the parameters and agreement with the experimental pKa values is within an average unsigned error of 0.65 pH units. This result included the Asp79 residue that is buried and thus has a high experimental pKa value of 7.37 units. Thus, the presented model is capable of reproducing pKa results for residues in an environment that is significantly different from the solvated protein surface used in the fitting. Therefore, the POSSIM force field and the FB continuum solvent parameters have been demonstrated to be sufficiently robust and transferable. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Ity Sharma
  - Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, Worcester, MA 01609
- George A. Kaminski
  - Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, Worcester, MA 01609
8. Koehl P, Poitevin F, Orland H, Delarue M. Modified Poisson–Boltzmann equations for characterizing biomolecular solvation. Journal of Theoretical & Computational Chemistry 2014. [DOI: 10.1142/s021963361440001x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Methods for computing electrostatic interactions often account implicitly for the solvent, due to the much smaller number of degrees of freedom involved. In the Poisson–Boltzmann (PB) approach the electrostatic potential is obtained by solving the Poisson–Boltzmann equation (PBE), where the solvent region is modeled as a homogeneous medium with a high dielectric constant. PB however is not exempt of problems. It does not take into account for example the sizes of the ions in the atmosphere surrounding the solute, nor does it take into account the inhomogeneous dielectric response of water due to the presence of a highly charged surface. In this paper we review two major modifications of PB that circumvent these problems, namely the size-modified PB (SMPB) equation and the Dipolar Poisson–Boltzmann Langevin (DPBL) model. In SMPB, steric effects between ions are accounted for with a lattice gas model. In DPBL, the solvent region is no longer modeled as a homogeneous dielectric media but rather as an assembly of self-orienting interacting dipoles of variable density. This model results in a dielectric profile that transits smoothly from the solute to the solvent region as well as in a variable solvent density that depends on the charges of the solute. We show successful applications of the DPBL formalism to computing the solvation free energies of isolated ions in water. Further developments of more accurately modified PB models are discussed.
Affiliation(s)
- Patrice Koehl
  - Department of Computer Science and Genome Center, University of California, Davis, CA 95616, USA
- Frederic Poitevin
  - Unité de Dynamique Structurale des Macromolécules, UMR 3528 du CNRS, Institut Pasteur, 75015 Paris, France
- Henri Orland
  - Service de Physique Théorique, CEA-Saclay, 91191 Gif/Yvette Cedex, France
- Marc Delarue
  - Unité de Dynamique Structurale des Macromolécules, UMR 3528 du CNRS, Institut Pasteur, 75015 Paris, France
9. Francis-Lyon P, Koehl P. Protein side-chain modeling with a protein-dependent optimized rotamer library. Proteins 2014; 82:2000-17. [PMID: 24623614 DOI: 10.1002/prot.24555] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 09/29/2013] [Revised: 02/28/2014] [Accepted: 03/07/2014] [Indexed: 12/16/2022]
Abstract
Despite years of effort, the problem of predicting the conformations of protein side chains remains a subject of inquiry. This problem has three major issues, namely defining the conformations that a side chain may adopt within a protein, developing a sampling procedure for generating possible side-chain packings, and defining a scoring function that can rank these possible packings. To solve the first of these issues, most procedures rely on a rotamer library derived from databases of known protein structures. We introduce an alternative method that is free of statistics. We begin with a rotamer library that is based only on stereochemical considerations; this rotamer library is then optimized independently for each protein under study. We show that this optimization step restores the diversity of conformations observed in native proteins. We combine this protein-dependent rotamer library (PDRL) method with the self-consistent mean field (SCMF) sampling approach and a physics-based scoring function into a new side-chain prediction method, SCMF-PDRL. Using two large test sets of 831 and 378 proteins, respectively, we show that this new method compares favorably with competing methods such as SCAP, OPUS-Rota, and SCWRL4 for energy-minimized structures.
Affiliation(s)
- Patricia Francis-Lyon
  - Department of Computer Science, University of San Francisco, San Francisco, California, 94117
10. Vila JA, Arnautova YA. 13C Chemical Shifts in Proteins: A Rich Source of Encoded Structural Information. Computational Methods to Study the Structure and Dynamics of Biomolecules and Biomolecular Processes 2014. [PMCID: PMC7121069 DOI: 10.1007/978-3-642-28554-7_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Despite the formidable progress in Nuclear Magnetic Resonance (NMR) spectroscopy, quality assessment of NMR-derived structures remains as an important problem. Thus, validation of protein structures is essential for the spectroscopists, since it could enable them to detect structural flaws and potentially guide their efforts in further refinement. Moreover, availability of accurate and efficient validation tools would help molecular biologists and computational chemists to evaluate quality of available experimental structures and to select a protein model which is the most suitable for a given scientific problem. The 13Cα nuclei are ubiquitous in proteins, moreover, their shieldings are easily obtainable from NMR experiments and represent a rich source of encoded structural information that makes 13Cα chemical shifts an attractive candidate for use in computational methods aimed at determination and validation of protein structures. In this chapter, the basis of a novel methodology of computing, at the quantum chemical level of theory, the 13Cα shielding for the amino acid residues in proteins is described. We also identify and examine the main factors affecting the 13Cα-shielding computation. Finally, we illustrate how the information encoded in the 13C chemical shifts can be used for a number of applications, viz., from protein structure prediction of both α-helical and β-sheet conformations, to determination of the fraction of the tautomeric forms of the imidazole ring of histidine in proteins as a function of pH or to accurate detection of structural flaws, at a residue-level, in NMR-determined protein models.
11. Polydorides S, Simonson T. Monte Carlo simulations of proteins at constant pH with generalized Born solvent, flexible sidechains, and an effective dielectric boundary. J Comput Chem 2013; 34:2742-56. [PMID: 24122878 DOI: 10.1002/jcc.23450] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 05/14/2013] [Revised: 09/04/2013] [Accepted: 09/08/2013] [Indexed: 12/11/2022]
Abstract
Titratable residues determine the acid/base behavior of proteins, strongly influencing their function; in addition, proton binding is a valuable reporter on electrostatic interactions. We describe a method for pK(a) calculations, using constant-pH Monte Carlo (MC) simulations to explore the space of sidechain conformations and protonation states, with an efficient and accurate generalized Born model (GB) for the solvent effects. To overcome the many-body dependency of the GB model, we use a "Native Environment" approximation, whose accuracy is shown to be good. It allows the precalculation and storage of interactions between all sidechain pairs, a strategy borrowed from computational protein design, which makes the MC simulations themselves very fast. The method is tested for 12 proteins and 167 titratable sidechains. It gives an rms error of 1.1 pH units, similar to the trivial "Null" model. The only adjustable parameter is the protein dielectric constant. The best accuracy is achieved for values between 4 and 8, a range that is physically plausible for a protein interior. For sidechains with large pKa shifts, ≥2, the rms error is 1.6, compared to 2.5 with the Null model and 1.5 with the empirical PROPKA method.
Affiliation(s)
- Savvas Polydorides
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
12. Simonson T. What Is the Dielectric Constant of a Protein When Its Backbone Is Fixed? J Chem Theory Comput 2013; 9:4603-8. [DOI: 10.1021/ct400398e] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
13. Simonson T, Gaillard T, Mignon D, Schmidt am Busch M, Lopes A, Amara N, Polydorides S, Sedano A, Druart K, Archontis G. Computational protein design: the Proteus software and selected applications. J Comput Chem 2013; 34:2472-84. [PMID: 24037756 DOI: 10.1002/jcc.23418] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 05/13/2013] [Revised: 07/08/2013] [Accepted: 07/28/2013] [Indexed: 12/13/2022]
Abstract
We describe an automated procedure for protein design, implemented in a flexible software package, called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, supporting several force fields and solvent models. A second program provides algorithms to search sequence space. It allows a decomposition of the system into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. Complete redesign of the Crk SH3 domain is presented. The top 10000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of applications, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to other current software packages.
Affiliation(s)
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, 91128, France
14. Sabri Dashti D, Meng Y, Roitberg AE. pH-replica exchange molecular dynamics in proteins using a discrete protonation method. J Phys Chem B 2012; 116:8805-11. [PMID: 22694266 DOI: 10.1021/jp303385x] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/29/2022]
Abstract
Protonation equilibria in biological molecules modulate structure, dynamics, and function. A pH-replica exchange molecular dynamics (pH-REMD) method is described here to improve the coupling between conformational and protonation sampling. Under a Hamiltonian replica exchange setup, conformations are swapped between two neighboring replicas, which themselves are at different pHs. The method has been validated on a series of biological systems. We applied pH-REMD to a series of model compounds, to a terminally charged ADFDA pentapeptide, and to a heptapeptide derived from the ovomucoid third domain (OMTKY3). In all of those systems, the predicted pK(a) by pH-REMD is very close to the experimental value and almost identical to the ones obtained by constant pH molecular dynamics (CpH MD). The method presented here, pH-REMD, has the advantage of faster convergence properties due to enhanced sampling of both conformation and protonation spaces.
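The swap between neighboring pH replicas described above can be illustrated with a Metropolis test. The abstract does not give the acceptance rule, so the sketch below assumes a form that follows from a semigrand-canonical weight exp(-N ln10 pH) for a state with N bound titratable protons: P = min{1, exp[ln10 (N_i - N_j)(pH_i - pH_j)]}. The function names are hypothetical and this is not the authors' implementation.

```python
import math
import random

LN10 = math.log(10.0)


def exchange_probability(n_prot_i, ph_i, n_prot_j, ph_j):
    """Assumed Metropolis probability of swapping the structures (with their
    protonation states) of two replicas held at pH_i and pH_j.

    n_prot_i, n_prot_j: number of titratable protons currently bound in the
    structure sitting at pH_i and pH_j, respectively.
    """
    exponent = LN10 * (n_prot_i - n_prot_j) * (ph_i - ph_j)
    # Favorable or neutral swaps are always accepted.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def attempt_exchange(n_prot_i, ph_i, n_prot_j, ph_j, rng=random):
    """Return True if the swap is accepted for one random draw."""
    return rng.random() < exchange_probability(n_prot_i, ph_i, n_prot_j, ph_j)


# A swap that moves the more protonated structure to the lower pH is always
# accepted; the reverse swap, one pH unit apart and one proton different,
# is accepted about one time in ten under this assumed rule.
p_favorable = exchange_probability(2, 4.0, 3, 5.0)
p_unfavorable = exchange_probability(3, 4.0, 2, 5.0)
```

Under this assumed rule the potential energy cancels out of the acceptance ratio, because both replicas share the same force field and differ only in pH, so the test depends only on proton counts and the pH gap.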
Affiliation(s)
- Danial Sabri Dashti
  - Department of Physics and Quantum Theory Project, University of Florida, Gainesville, Florida 32611-8435, USA
15. Characterization and pH-dependent substrate specificity of alkalophilic xylanase from Bacillus alcalophilus. J Ind Microbiol Biotechnol 2012; 39:1465-75. [PMID: 22763748 DOI: 10.1007/s10295-012-1159-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/25/2012] [Accepted: 06/11/2012] [Indexed: 10/28/2022]
Abstract
The gene of endo-beta-1-4 xylanase, xynT, was cloned from Bacillus alcalophilus AX2000 and expressed in Escherichia coli. This XynT, which belongs to glycoside hydrolase (GH) family 10, was found to have a molecular weight of approximately 37 kDa and exhibit optimal activity at pH 7-9 and 50 °C. It exhibits a high activity towards birchwood xylan and has the ability to bind avicel. Under optimal conditions, XynT hydrolyzes all xylooligomers into xylobiose as an end product with a preference for cleavage sites at the second or third glycosidic bond from the reducing end. XynT has a different substrate affinity on xylooligomers at pH 5.0, which contributes to its low activity toward xylotriose and its derived intermediate products. This low activity may be due to an unstable interaction with the amino acids that constitute subsites of the active site. Interestingly, the addition of Co(2+) and Mn(2+) led to a significant increase in activity by up to 40 and 50 %, respectively. XynT possesses a high binding affinity and hydrolytic activity toward the insoluble xylan, for which it exhibits high activity at pH 7-9, giving rise to its efficient biobleaching effect on Pinus densiflora kraft pulp.
16. Chen TS, Keating AE. Designing specific protein-protein interactions using computation, experimental library screening, or integrated methods. Protein Sci 2012; 21:949-63. [PMID: 22593041 DOI: 10.1002/pro.2096] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 05/01/2012] [Accepted: 05/11/2012] [Indexed: 11/11/2022]
Abstract
Given the importance of protein-protein interactions for nearly all biological processes, the design of protein affinity reagents for use in research, diagnosis or therapy is an important endeavor. Engineered proteins would ideally have high specificities for their intended targets, but achieving interaction specificity by design can be challenging. There are two major approaches to protein design or redesign. Most commonly, proteins and peptides are engineered using experimental library screening and/or in vitro evolution. An alternative approach involves using protein structure and computational modeling to rationally choose sequences predicted to have desirable properties. Computational design has successfully produced novel proteins with enhanced stability, desired interactions and enzymatic function. Here we review the strengths and limitations of experimental library screening and computational structure-based design, giving examples where these methods have been applied to designing protein interaction specificity. We highlight recent studies that demonstrate strategies for combining computational modeling with library screening. The computational methods provide focused libraries predicted to be enriched in sequences with the properties of interest. Such integrated approaches represent a promising way to increase the efficiency of protein design and to engineer complex functionality such as interaction specificity.
Collapse
Affiliation(s)
- T Scott Chen
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
17
|
Koehl P, Orland H, Delarue M. Adapting Poisson-Boltzmann to the self-consistent mean field theory: application to protein side-chain modeling. J Chem Phys 2011; 135:055104. [PMID: 21823735 DOI: 10.1063/1.3621831] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
We present an extension of the self-consistent mean field theory for protein side-chain modeling in which solvation effects are included based on the Poisson-Boltzmann (PB) theory. In this approach, the protein is represented with multiple copies of its side chains. Each copy is assigned a weight that is refined iteratively based on the mean field energy generated by the rest of the protein, until self-consistency is reached. At each cycle, the variational free energy of the multi-copy system is computed; this free energy includes the internal energy of the protein, which accounts for van der Waals and electrostatic interactions, and a solvation free energy term that is computed using the PB equation. The method converges in only a few cycles and takes only minutes of central processing unit time on a commodity personal computer. The predicted conformation of each residue is then set to be its copy with the highest weight after convergence. We have tested this method on a database of one hundred highly refined NMR structures to circumvent the problems of crystal packing inherent to X-ray structures. The use of the PB-derived solvation free energy significantly improves prediction accuracy for surface side chains. For example, the prediction accuracies for χ1 for surface cysteine, serine, and threonine residues improve from 68%, 35%, and 43% to 80%, 53%, and 57%, respectively. A comparison with other side-chain prediction algorithms demonstrates that our approach is consistently better in predicting the conformations of exposed side chains.
Affiliation(s)
- Patrice Koehl
- Department of Biological Sciences, National University of Singapore, Singapore.
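The iterative weight refinement described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scmf_side_chains`, the damping factor, the `kT` value, and the toy energy tables are all assumptions made for illustration, and the Poisson-Boltzmann solvation term and the variational free energy are omitted entirely. Each residue carries a weight per side-chain copy, the weights are updated from the mean field generated by the other residues until they stop changing, and the copy with the highest weight is reported.

```python
import math

def scmf_side_chains(self_energy, pair_energy, kT=0.6, damping=0.5,
                     tol=1e-6, max_cycles=500):
    """Toy self-consistent mean field over discrete side-chain copies.

    self_energy[i][a]        : energy of copy a of residue i (hypothetical units)
    pair_energy[(i, j)][a][b]: interaction of copy a of i with copy b of j, i < j
    Returns (weights, index of the highest-weight copy per residue).
    """
    n = len(self_energy)
    # Start from uniform weights over the copies of each residue.
    weights = [[1.0 / len(e)] * len(e) for e in self_energy]

    def pair(i, a, j, b):
        # Each residue pair is stored once, keyed (i, j) with i < j.
        if i < j:
            return pair_energy[(i, j)][a][b]
        return pair_energy[(j, i)][b][a]

    for _ in range(max_cycles):
        updated = []
        for i in range(n):
            field = []
            for a in range(len(self_energy[i])):
                # Mean field energy: self term plus weight-averaged pair terms.
                e = self_energy[i][a]
                for j in range(n):
                    if j != i:
                        e += sum(w * pair(i, a, j, b)
                                 for b, w in enumerate(weights[j]))
                field.append(e)
            lowest = min(field)
            boltz = [math.exp(-(e - lowest) / kT) for e in field]
            total = sum(boltz)
            updated.append([x / total for x in boltz])
        change = max(abs(u - w)
                     for us, ws in zip(updated, weights)
                     for u, w in zip(us, ws))
        # Damped update to help the iteration settle.
        weights = [[damping * u + (1.0 - damping) * w for u, w in zip(us, ws)]
                   for us, ws in zip(updated, weights)]
        if change < tol:
            break
    best = [max(range(len(w)), key=w.__getitem__) for w in weights]
    return weights, best


# Two residues with two copies each; copy 0 of each interacts favorably.
toy_self = [[0.0, 0.0], [0.0, 0.0]]
toy_pair = {(0, 1): [[-1.0, 0.0], [0.0, 0.5]]}
toy_weights, toy_best = scmf_side_chains(toy_self, toy_pair)
print(toy_best)
```

The damped update is a design choice of this sketch, not something stated in the abstract; undamped mean field iterations can oscillate on frustrated systems.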
|
18
|
Alexov E, Mehler EL, Baker N, Baptista AM, Huang Y, Milletti F, Nielsen JE, Farrell D, Carstensen T, Olsson MHM, Shen JK, Warwicker J, Williams S, Word JM. Progress in the prediction of pKa values in proteins. Proteins 2011; 79:3260-75. [PMID: 22002859 PMCID: PMC3243943 DOI: 10.1002/prot.23189] [Citation(s) in RCA: 198] [Impact Index Per Article: 15.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2011] [Accepted: 09/12/2011] [Indexed: 01/18/2023]
Abstract
The pKa-cooperative aims to provide a forum for experimental and theoretical researchers interested in protein pKa values and protein electrostatics in general. The first round of the pKa-cooperative, which challenged computational labs to carry out blind predictions against pKa values experimentally determined in the laboratory of Bertrand Garcia-Moreno, was completed and results discussed at the Telluride meeting (July 6-10, 2009). This article serves as an introduction to the reports submitted by the blind prediction participants that will be published in a special issue of PROTEINS: Structure, Function and Bioinformatics. Here, we briefly outline existing approaches for pKa calculations, emphasizing methods that were used by the participants in calculating the blind pKa values in the first round of the cooperative. We then point out some of the difficulties encountered by the participating groups in making their blind predictions, and finally try to provide some insights for future developments aimed at improving the accuracy of pKa calculations.
Affiliation(s)
- Emil Alexov
- Department of Physics, Clemson University, Clemson, SC, USA.
|
19
|
Stroganov OV, Novikov FN, Zeifman AA, Stroylov VS, Chilov GG. TSAR, a new graph-theoretical approach to computational modeling of protein side-chain flexibility: Modeling of ionization properties of proteins. Proteins 2011; 79:2693-710. [DOI: 10.1002/prot.23099] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2010] [Revised: 05/16/2011] [Accepted: 05/27/2011] [Indexed: 11/09/2022]
|
20
|
Nilsson L, Karshikoff A. Multiple pH regime molecular dynamics simulation for pK calculations. PLoS One 2011; 6:e20116. [PMID: 21647418 PMCID: PMC3103538 DOI: 10.1371/journal.pone.0020116] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2010] [Accepted: 04/25/2011] [Indexed: 11/25/2022] Open
Abstract
Ionisation equilibria in proteins are influenced by conformational flexibility, which can in principle be accounted for by molecular dynamics simulation. One problem in this method is the bias arising from the fixed protonation state during the simulation. Its effect is mostly exhibited when the ionisation behaviour of the titratable groups is extrapolated to pH regions where the predetermined protonation state of the protein may not be statistically relevant, leading to conformational sampling that is not representative of the true state. In this work we consider a simple approach which can essentially reduce this problem. Three molecular dynamics structure sets are generated, each with a different protonation state of the protein molecule expected to be relevant at three pH regions, and pK calculations from the three sets are combined to predict pK over the entire pH range of interest. This multiple pH molecular dynamics approach was tested on the GCN4 leucine zipper, a protein for which a full data set of experimental data is available. The pK values were predicted with a mean deviation from the experimental data of 0.29 pH units, and with a precision of 0.13 pH units, evaluated on the basis of equivalent sites in the dimeric GCN4 leucine zipper.
Affiliation(s)
- Lennart Nilsson
- Department of Biosciences and Nutrition, Center for Biosciences, Karolinska Institutet, Huddinge, Sweden.
|
21
|
Wang L, Ye Y, Lykourinou V, Angerhofer A, Ming LJ, Zhao Y. Metal Complexes of a Multidentate Cyclophosphazene with Imidazole-Containing Side Chains for Hydrolyses of Phosphoesters - Bimolecular vs. Intramolecular Dinuclear Pathway. Eur J Inorg Chem 2011. [DOI: 10.1002/ejic.201000668] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
22
|
Aleksandrov A, Polydorides S, Archontis G, Simonson T. Predicting the Acid/Base Behavior of Proteins: A Constant-pH Monte Carlo Approach with Generalized Born Solvent. J Phys Chem B 2010; 114:10634-48. [DOI: 10.1021/jp104406x] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Alexey Aleksandrov
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Savvas Polydorides
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Georgios Archontis
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
|
23
|
Anderson JS, Hernández G, LeMaster DM. Sidechain conformational dependence of hydrogen exchange in model peptides. Biophys Chem 2010; 151:61-70. [PMID: 20627534 DOI: 10.1016/j.bpc.2010.05.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2010] [Revised: 05/10/2010] [Accepted: 05/12/2010] [Indexed: 10/19/2022]
Abstract
Peptide hydrogens that are exposed to solvent in protein X-ray structures exhibit a billion-fold range in hydroxide-catalyzed exchange rates, and these rates have previously been shown to be predictable by continuum dielectric methods to within a factor of 7, based on single protein conformations. When using a protein coil library to model the Boltzmann-weighted conformational distribution for the various N-acetyl-[X-Ala]-N-methylamides and N-acetyl-[Ala-Y]-N-methylamides, the acidity of the central amide in the individual conformers of each peptide spans nearly a million-fold range. Nevertheless, population averaging of these conformer acidities predicts the standard sidechain-dependent hydrogen exchange correction factors for nonpolar model peptides to within a factor of 30% (10^0.11) with a correlation coefficient r = 0.91. Comparison with the analogous continuum dielectric calculations for the other N-acetyl-[X-Y]-N-methylamides indicates that deviations from the isolated residue hypothesis of classical polymer theory predict appreciable errors in the exchange rates for conformationally disordered peptides when the standard sidechain-dependent hydrogen exchange rate correction factors are assumed to be independently additive. Although electronic polarizability generally dominates the dielectric shielding for the approximately 10 ps lifetime of peptide ionization, evidence is presented for modest contributions from rapid intrarotamer conformational reorganization of Asn and Gln sidechains.
Affiliation(s)
- Janet S Anderson
- Department of Chemistry, Union College, Schenectady, NY 12308, USA
|
24
|
Aleksandrov A, Thompson D, Simonson T. Alchemical free energy simulations for biological complexes: powerful but temperamental.... J Mol Recognit 2010; 23:117-27. [PMID: 19693787 DOI: 10.1002/jmr.980] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]
Abstract
Free energy simulations compare multiple ligand:receptor complexes by "alchemically" transforming one into another, yielding binding free energy differences. Since their introduction in the 1980s, many technical and theoretical obstacles were surmounted, and the method ("MDFE," since molecular dynamics are often used) has matured into a powerful tool. We describe its current status, its effectiveness, and the challenges it faces. MDFE has provided chemical accuracy for many systems but remains expensive, with significant human overhead costs. The bottlenecks have shifted, partly due to increased computer power. To study diverse sets of ligands, force field availability and accuracy can be a major difficulty. Another difficulty is the frequent need to consider multiple states, related to sidechain protonation or buried waters, for example. Sophisticated, automated methods to sample these states are maturing, such as constant pH simulations. Meanwhile, combinations of MDFE and simpler approaches, like continuum dielectric models, can be very effective. As illustrations, we show how, with careful force field parameterization, MDFE accurately predicts binding specificities between complex tetracycline ligands and their targets. We describe substrate binding to the aspartyl-tRNA synthetase enzyme, where many distinct electrostatic states play a role, and a histidine and a Mg²⁺ ion act as coupled switches that help enforce a strict preference for the aspartate substrate, relative to several analogs. Overall, MDFE has achieved a predictive status, where novel ligands can be studied and molecular recognition elucidated in depth. It should play an increasing role in the analysis of complex cellular processes and biomolecular engineering.
Affiliation(s)
- Alexey Aleksandrov
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
|
25
|
Abstract
Among the most important physicochemical properties of small molecules and macromolecules are the dissociation constants for any weakly acidic or basic groups, generally expressed as the pKa of each group. These constants are a major factor in the pharmacokinetics of drugs and in the interactions of proteins with other molecules. For both the protein and small molecule cases, we survey the sources of experimental pKa values and then focus on current methods for predicting them. Of particular concern is an analysis of the scope, statistical validity, and predictive power of methods as well as their accuracy.
Affiliation(s)
- Adam C Lee
- Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, Michigan 48109, USA
|
26
|
Click TH, Kaminski GA. Reproducing basic pKa values for turkey ovomucoid third domain using a polarizable force field. J Phys Chem B 2009; 113:7844-50. [PMID: 19432439 DOI: 10.1021/jp809412e] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
We have extended our previous studies of calculating acidity constants for the acidic residues found in the turkey ovomucoid third domain protein (OMTKY3) by determining the relative pKa values for the basic residues (Lys13, Arg21, Lys29, Lys34, His52, and Lys55). A polarizable force field (PFF) was employed. The values of the pKa were found by direct comparison of energies of solvated protonated and deprotonated forms of the protein. Poisson-Boltzmann (PBF) and surface generalized Born (SGB) continuum solvation models represent the hydration, and a nonpolarizable fixed-charge OPLS-AA force field was used for comparison. Our results indicate that (i) the pKa values of the basic residues can be found in close agreement with the experimental values when a PFF is used in conjunction with the PBF solvation model, (ii) it is sufficient to take into account only the residues which are in close proximity (hydrogen bonded) to the residue in question, and (iii) the PBF solvation model is superior to the SGB solvation model for these pKa calculations. The average error with the PBF/PFF model is only 0.7 pH units, compared with 2.2 and 6.1 units for the PBF/OPLS and SGB/OPLS, respectively. The maximum deviation of the PBF/PFF results from the experimental values is 1.7 pH units compared with 6.0 pH units for the PBF/OPLS. Moreover, the best results were obtained while using an advanced nonpolar energy calculation scheme. The overall conclusion is that this methodology and force field are suitable for the accurate assessment of pKa shifts for both acidic and basic protein residues.
Affiliation(s)
- Timothy H Click
- Department of Chemistry, Central Michigan University, Mt. Pleasant, Michigan 48859, USA
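One common way to turn a comparison of protonated and deprotonated energies into a relative pKa is a thermodynamic cycle against a model compound of known pKa. The sketch below shows only that arithmetic; the abstract does not give the authors' exact protocol, so the function names, the use of free energies in kcal/mol, and the numbers in the usage lines are assumptions made for illustration.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def pka_from_deprotonation_energies(dg_protein, dg_model, pka_model,
                                    temperature=298.15):
    """Relative pKa from deprotonation free energies (kcal/mol).

    dg_protein, dg_model: G(deprotonated) - G(protonated) for the residue in
    the protein and for the reference model compound. A deprotonation that
    costs more in the protein than in the model raises the pKa.
    """
    pk_unit = math.log(10.0) * R_KCAL * temperature  # kcal/mol per pK unit
    return pka_model + (dg_protein - dg_model) / pk_unit

def mean_unsigned_error(predicted, experimental):
    """Average absolute deviation between predicted and measured pKa values."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


# Hypothetical numbers: deprotonation is 1.364 kcal/mol harder in the protein,
# which corresponds to roughly one pK unit at 298 K.
print(round(pka_from_deprotonation_energies(101.364, 100.0, 10.4), 2))
print(mean_unsigned_error([10.0, 11.5], [10.5, 11.0]))
```

At 298.15 K one pK unit corresponds to about 1.36 kcal/mol, which is why errors of a few kcal/mol in the computed energies translate into errors of several pH units.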
|
27
|
Makowska J, Bagińska K, Liwo A, Chmurzyński L, Scheraga HA. Acidic-basic properties of three alanine-based peptides containing acidic and basic side chains: comparison between theory and experiment. Biopolymers 2008; 90:724-32. [PMID: 18618612 DOI: 10.1002/bip.21046] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
The purpose of this work was to evaluate the effect of the nature of the ionizable end groups, and the solvent, on their acid-base properties in alanine-based peptides. Hence, the acid-base properties of three alanine-based peptides: Ac-KK-(A)7-KK-NH2 (KAK), Ac-OO-(A)7-DD-NH2 (OAD), Ac-KK-(A)7-EE-NH2 (KAE), where A, D, E, K, and O denote alanine, aspartic acid, glutamic acid, lysine, and ornithine, respectively, were determined in water and in methanol by potentiometry. With the availability of these data, the ability of two theoretical methods to simulate pH-metric titration of those peptides was assessed: (i) the electrostatically driven Monte Carlo method with the ECEPP/3 force field and the Poisson-Boltzmann approach to compute solvation energy (EDMC/PB/pH), and (ii) the molecular dynamics method with the AMBER force field and the Generalized Born model (MD/GB/pH). For OAD and KAE, pKa1 and pKa2 correspond to the acidic side chains. For all three compounds in both solvents, the pKa1 value is remarkably lower than the pKa of a compound modeling the respective isolated side chain, which can be explained by the influence of the electrostatic field from positively charged ornithine or lysine side chains. The experimental titration curves are reproduced well by the MD/GB/pH approach, the agreement being better if restraints derived from NMR measurements are incorporated in the conformational search. Poorer agreement is achieved by the EDMC/PB/pH method.
Affiliation(s)
- Joanna Makowska
- Faculty of Chemistry, University of Gdańsk, Sobieskiego 18, 80-952 Gdańsk, Poland
|
28
|
Hydration dynamics in a partially denatured ensemble of the globular protein human alpha-lactalbumin investigated with molecular dynamics simulations. Biophys J 2008; 95:5257-67. [PMID: 18775960 DOI: 10.1529/biophysj.108.136531] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Atomistic molecular dynamics simulations are used to probe changes in the nature and subnanosecond dynamical behavior of solvation waters that accompany partial denaturation of the globular protein, human alpha-lactalbumin. A simulated ensemble of subcompact conformers, similar to the molten globule state of human alpha-lactalbumin, demonstrates a marginal increase in the amount of surface solvation relative to the native state. This increase is accompanied by subtle but distinct enhancement in surface water dynamics, less favorable protein-water interactions, and a marginal decrease in the anomalous behavior of solvation water dynamics. The extent of solvent influx is not proportional to the increased surface area, and the partially denatured conformers are less uniformly solvated compared to their native counterpart. The observed solvation in partially denatured conformers is lesser in extent compared to earlier experimental estimates in molten globule states, and is consistent with more recent descriptions based on nuclear magnetic relaxation dispersion studies.
|
29
|
Abstract
We report a very fast and accurate physics-based method to calculate pH-dependent electrostatic effects in protein molecules and to predict the pK values of individual sites of titration. In addition, a CHARMm-based algorithm is included to construct and refine the spatial coordinates of all hydrogen atoms at a given pH. The present method combines electrostatic energy calculations based on the Generalized Born approximation with an iterative mobile clustering approach to calculate the equilibria of proton binding to multiple titration sites in protein molecules. The use of the GBIM (Generalized Born with Implicit Membrane) CHARMm module makes it possible to model not only water-soluble proteins but membrane proteins as well. The method includes a novel algorithm for preliminary refinement of hydrogen coordinates. Another difference from existing approaches is that, instead of monopeptides, a set of relaxed pentapeptide structures is used as model compounds. Tests on a set of 24 proteins demonstrate the high accuracy of the method. On average, the RMSD between predicted and experimental pK values is close to 0.5 pK units on this data set, and the accuracy is achieved at very low computational cost. The pH-dependent assignment of hydrogen atoms also shows very good agreement with protonation states and hydrogen-bond networks observed in neutron-diffraction structures. The method is implemented as a computational protocol in Accelrys Discovery Studio and provides a fast and easy way to study the effect of pH on many important mechanisms such as enzyme catalysis, ligand binding, protein-protein interactions, and protein stability.
|
30
|
Barth P, Schoeffler A, Alber T. Targeting Metastable Coiled-Coil Domains by Computational Design. J Am Chem Soc 2008; 130:12038-44. [DOI: 10.1021/ja802447e] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Patrick Barth
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3220
- Allyn Schoeffler
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3220
- Tom Alber
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3220
|
31
|
Bas DC, Rogers DM, Jensen JH. Very fast prediction and rationalization of pKa values for protein-ligand complexes. Proteins 2008; 73:765-83. [PMID: 18498103 DOI: 10.1002/prot.22102] [Citation(s) in RCA: 876] [Impact Index Per Article: 54.8] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Affiliation(s)
- Delphine C Bas
- Equipe de Chimie et Biochimie Théoriques, UMR 7565 - CNRS, Université Henri Poincaré, Nancy I, Boulevard des Aiguillettes BP 239, 54506 Vandoeuvre-lès-Nancy Cedex, France
|
32
|
Vila JA, Scheraga HA. Factors affecting the use of 13C(alpha) chemical shifts to determine, refine, and validate protein structures. Proteins 2008; 71:641-54. [PMID: 17975838 DOI: 10.1002/prot.21726] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
Interest centers here on the analysis of two different, but related, phenomena that affect side-chain conformations and consequently ¹³Cα chemical shifts and their applications to determine, refine, and validate protein structures. The first is whether ¹³Cα chemical shifts computed at the DFT level of approximation with charged residues are a better approximation of observed ¹³Cα chemical shifts than those computed with neutral residues for proteins in solution. Accurate computation of ¹³Cα chemical shifts requires a proper representation of the charges, which might not take on integral values. For this analysis, the charges for 139 conformations of the protein ubiquitin were determined by explicit consideration of protein binding equilibria, at a given pH, that is, by exploring the 2^ξ possible ionization states of the whole molecule, with ξ being the number of ionizable groups. The results of this analysis, as revealed by the shielding/deshielding of the ¹³Cα nucleus, indicated that: (i) there is a significant difference in the computed ¹³Cα chemical shifts, between basic and acidic groups, as a function of the degree of charge of the side chain; (ii) this difference is attributed to the distance between the ionizable groups and the ¹³Cα nucleus, which is shorter for the acidic Asp and Glu groups as compared with that for the basic Lys and Arg groups; and (iii) the use of neutral, rather than charged, basic and acidic groups is a better approximation of the observed ¹³Cα chemical shifts of a protein in solution. The second is how side-chain flexibility influences computed ¹³Cα chemical shifts in an additional set of ubiquitin conformations, in which the side chains are generated from an NMR-derived structure with the backbone conformation assumed to be fixed.
The ¹³Cα chemical shift of a given amino acid residue in a protein is determined, mainly, by its own backbone and side-chain torsional angles, independent of the neighboring residues; the conformation of a given residue itself, however, depends on the environment of this residue and, hence, on the whole protein structure. As a consequence, this analysis reveals the role and impact of an accurate side-chain computation in the determination and refinement of protein conformation. The results of this analysis are: (i) a lower error between computed and observed ¹³Cα chemical shifts (by up to 3.7 ppm) was found for approximately 68% and approximately 63% of all ionizable residues and all non-Ala/Pro/Gly residues, respectively, in the additional set of conformations, compared with results for the model from which the set was derived; and (ii) all the additional conformations exhibit a lower root-mean-square deviation (1.97 ppm ≤ rmsd ≤ 2.13 ppm), between computed and observed ¹³Cα chemical shifts, than the rmsd (2.32 ppm) computed for the starting conformation from which this additional set was derived. As a validation test, an analysis of the additional set of ubiquitin conformations, comparing computed and observed values of both ¹³Cα chemical shifts and χ1 torsional angles (given by the vicinal coupling constants ³J(N-Cγ) and ³J(C'-Cγ)), is discussed.
Affiliation(s)
- Jorge A Vila
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853-1301, USA
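The exhaustive exploration of 2^ξ ionization states mentioned in the abstract can be sketched as a Boltzmann sum over all protonation microstates. The energy model below (intrinsic pK values plus pairwise charge-charge couplings in units of kT) is a standard textbook form chosen here for illustration and is an assumption, as are the function name `mean_protonation` and the toy parameters; the abstract does not specify the authors' energy function. The averages it returns are fractional, which is one way side-chain charges can end up non-integral.

```python
import itertools
import math

def mean_protonation(pk_int, deprot_charge, pair_w, pH):
    """Average protonation of each site from all 2**n microstates.

    pk_int[i]        : intrinsic pK of site i (hypothetical inputs)
    deprot_charge[i] : charge of site i when deprotonated (-1 acid, 0 base)
    pair_w[(i, j)]   : interaction between unit charges on i and j, in kT
    """
    n = len(pk_int)
    ln10 = math.log(10.0)
    partition = 0.0
    occupancy = [0.0] * n
    for state in itertools.product((0, 1), repeat=n):  # 1 = protonated
        charges = [deprot_charge[i] + state[i] for i in range(n)]
        # Free energy of the microstate in kT: proton binding plus coupling.
        g = ln10 * sum(x * (pH - pk) for x, pk in zip(state, pk_int))
        g += sum(w * charges[i] * charges[j] for (i, j), w in pair_w.items())
        weight = math.exp(-g)
        partition += weight
        for i in range(n):
            occupancy[i] += state[i] * weight
    return [o / partition for o in occupancy]


# Two coupled acids with intrinsic pK 4.0, evaluated at pH 4.0.
fractions = mean_protonation([4.0, 4.0], [-1, -1], {(0, 1): 1.0}, pH=4.0)
mean_charges = [-1 + f for f in fractions]
print(fractions, mean_charges)
```

Because the number of microstates doubles with every added site, this brute-force sum is practical only for small ξ; larger systems usually need Monte Carlo sampling or clustering of strongly coupled sites.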
|
33
|
Baran KL, Chimenti MS, Schlessman JL, Fitch CA, Herbst KJ, Garcia-Moreno BE. Electrostatic effects in a network of polar and ionizable groups in staphylococcal nuclease. J Mol Biol 2008; 379:1045-62. [PMID: 18499123 DOI: 10.1016/j.jmb.2008.04.021] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2007] [Revised: 02/27/2008] [Accepted: 04/09/2008] [Indexed: 12/01/2022]
Abstract
His121 and His124 are embedded in a network of polar and ionizable groups on the surface of staphylococcal nuclease. To examine how membership in a network affects the electrostatic properties of ionizable groups, the tautomeric state and the pKa values of these histidines were measured with NMR spectroscopy in the wild-type nuclease and in 13 variants designed to disrupt the network. In the background protein, His121 and His124 titrate with pKa values of 5.2 and 5.6, respectively. In the variants, where the network was disrupted, the pKa values range from 4.03 to 6.46 for His121, and 5.04 to 5.99 for His124. The largest decrease in a pKa was observed when the favorable Coulomb interaction between His121 and Glu75 was eliminated; the largest increase was observed when Tyr91 or Tyr93 was substituted with Ala or Phe. In all variants, the dominant tautomeric state at neutral pH was the Nε2 state. At one level the network behaves as a rigid unit that does not readily reorganize when disrupted: crystal structures of the E75A or E75Q variants show that even when the pivotal Glu75 is removed, the overall configuration of the network was unaffected. On the other hand, a few key hydrogen bonds appear to govern the conformation of the network, and when these bonds are disrupted the network reorganizes. Coulomb interactions within the network report an effective dielectric constant of 20, whereas a dielectric constant of 80 is more consistent with the magnitude of medium to long-range Coulomb interactions in this protein. The data demonstrate that when structures are treated as static, rigid bodies, structure-based pKa calculations with continuum electrostatics methods are not useful to treat ionizable groups in cases where pKa values are governed by short-range polar and Coulomb interactions.
Affiliation(s)
- Kelli L Baran
- Department of Biophysics, The Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA
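To give a feel for what an effective dielectric constant of 20 versus 80 means for a pKa, the sketch below evaluates Coulomb's law for two unit charges and converts the energy into pK units. The 5 Å separation is a hypothetical distance chosen for illustration, not a value from the paper, and the helper names are assumptions.

```python
import math

COULOMB_KCAL = 332.06371  # Coulomb constant in kcal Å / (mol e^2)
R_KCAL = 0.0019872041     # gas constant in kcal/(mol K)

def coulomb_energy(q1, q2, distance, dielectric):
    """Coulomb energy (kcal/mol) of charges q1, q2 (in e) at distance (Å)."""
    return COULOMB_KCAL * q1 * q2 / (dielectric * distance)

def energy_in_pk_units(energy_kcal, temperature=298.15):
    """Express an energy in pK units (energy divided by ln(10) RT)."""
    return energy_kcal / (math.log(10.0) * R_KCAL * temperature)


# A protonated His (+1) and a Glu (-1) at a hypothetical 5 Å separation.
for eps in (20.0, 80.0):
    e = coulomb_energy(+1, -1, 5.0, eps)
    print(eps, round(e, 2), round(energy_in_pk_units(e), 2))
```

Under these assumptions the interaction is worth about 2.4 pK units at a dielectric constant of 20 and about 0.6 at 80, so the choice of dielectric changes the predicted shift fourfold.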
|
34
|
Design of protein-ligand binding based on the molecular-mechanics energy model. J Mol Biol 2008; 380:415-24. [PMID: 18514737 DOI: 10.1016/j.jmb.2008.04.001] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2007] [Revised: 03/25/2008] [Accepted: 04/01/2008] [Indexed: 11/22/2022]
Abstract
While the molecular-mechanics field has standardized on a few potential energy functions, computational protein design efforts are based on potentials that are unique to individual laboratories. Here we show that a standard molecular-mechanics potential energy function without any modifications can be used to engineer protein-ligand binding. A molecular-mechanics potential is used to reconstruct the coordinates of various binding sites with an average root-mean-square error of 0.61 Å and to reproduce known ligand-induced side-chain conformational shifts. Within a series of 34 mutants, the calculation can always distinguish between weak (Kd > 1 mM) and tight (Kd < 10 µM) binding sequences. Starting from partial coordinates of the ribose-binding protein lacking the ligand and the 10 primary contact residues, the molecular-mechanics potential is used to redesign a ribose-binding site. Out of a search space of 2 × 10^12 sequences, the calculation selects a point mutant of the native protein as the top solution (experimental Kd = 17 µM) and the native protein as the second best solution (experimental Kd = 210 nM). The quality of the predictions depends on the accuracy of the generalized Born electrostatics model, treatment of protonation equilibria, high-resolution rotamer sampling, a final local energy minimization step, and explicit modeling of the bound, unbound, and unfolded states. The application of unmodified molecular-mechanics potentials to protein design links two fields in a mutually beneficial way. Design provides a new avenue for testing molecular-mechanics energy functions, and future improvements in these energy functions will presumably lead to more accurate design results.
|
35
|
Simonov NA, Mascagni M, Fenley MO. Monte Carlo-based linear Poisson-Boltzmann approach makes accurate salt-dependent solvation free energy predictions possible. J Chem Phys 2007; 127:185105. [DOI: 10.1063/1.2803189] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
36
|
Barth P, Schonbrun J, Baker D. Toward high-resolution prediction and design of transmembrane helical protein structures. Proc Natl Acad Sci U S A 2007; 104:15682-7. [PMID: 17905872 PMCID: PMC2000396 DOI: 10.1073/pnas.0702515104] [Citation(s) in RCA: 174] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
The prediction and design at the atomic level of membrane protein structures and interactions is a critical but unsolved challenge. To address this problem, we have developed an all-atom physical model that describes intraprotein and protein-solvent interactions in the membrane environment. We evaluated the ability of the model to recapitulate the energetics and structural specificities of polytopic membrane proteins by using a battery of in silico prediction and design tests. First, in side-chain packing and design tests, the model successfully predicts the side-chain conformations at 73% of nonexposed positions and the native amino acid identities at 34% of positions in naturally occurring membrane proteins. Second, the model predicts significant energy gaps between native and nonnative structures of transmembrane helical interfaces and polytopic membrane proteins. Third, distortions in transmembrane helices are successfully recapitulated in docking experiments by using fragments of ideal helices judiciously defined around helical kinks. Finally, de novo structure prediction reaches near-atomic accuracy (<2.5 Å) for several small membrane protein domains (<150 residues). The success of the model highlights the critical role of van der Waals and hydrogen-bonding interactions in the stability and structural specificity of membrane protein structures and sets the stage for the high-resolution prediction and design of complex membrane protein architectures.
Affiliation(s)
- P. Barth
- Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
- J. Schonbrun
- Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
- D. Baker
- Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
- To whom correspondence should be addressed. E-mail:
|