1
|
Opuu V, Simonson T. Enzyme redesign and genetic code expansion. Protein Eng Des Sel 2023; 36:gzad017. [PMID: 37879093 DOI: 10.1093/protein/gzad017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2023] [Revised: 09/10/2023] [Accepted: 09/19/2023] [Indexed: 10/27/2023] Open
Abstract
Enzyme design is an important application of computational protein design (CPD). It can benefit enormously from the additional chemistries provided by noncanonical amino acids (ncAAs). These can be incorporated into an 'expanded' genetic code, and introduced in vivo into target proteins. The key step for genetic code expansion is to engineer an aminoacyl-transfer RNA (tRNA) synthetase (aaRS) and an associated tRNA that handles the ncAA. Experimental directed evolution has been successfully used to engineer aaRSs and incorporate over 200 ncAAs into expanded codes. But directed evolution has severe limits, and is not yet applicable to noncanonical AA backbones. CPD can help address several of its limitations, and has begun to be applied to this problem. We review efforts to redesign aaRSs, studies that designed new proteins and functionalities with the help of ncAAs, and some of the method developments that have been used, such as adaptive landscape flattening Monte Carlo, which allows an enzyme to be redesigned with substrate or transition state binding as the design target.
Collapse
Affiliation(s)
- Vaitea Opuu
- Institut Chimie Biologie Innovation (CNRS UMR8231), Ecole Supérieure de Physique et Chimie de Paris (ESPCI), 75005 Paris, France
| | - Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, 91128 Palaiseau, France
| |
Collapse
|
2
|
Polydorides S, Archontis G. Computational optimization of the SARS-CoV-2 receptor-binding-motif affinity for human ACE2. Biophys J 2021; 120:2859-2871. [PMID: 33984310 PMCID: PMC8110322 DOI: 10.1016/j.bpj.2021.02.049] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Revised: 01/19/2021] [Accepted: 02/15/2021] [Indexed: 01/15/2023] Open
Abstract
The coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is responsible for the coronavirus disease 2019 pandemic, and the closely related SARS-CoV coronavirus enter cells by binding at the human angiotensin converting enzyme 2 (hACE2). The stronger hACE2 affinity of SARS-CoV-2 has been connected with its higher infectivity. In this work, we study hACE2 complexes with the receptor-binding domains (RBDs) of the human SARS-CoV-2 and human SARS-CoV viruses, using all-atom molecular dynamics simulations and computational protein design with a physics-based energy function. The molecular dynamics simulations identify charge-modifying substitutions between the CoV-2 and CoV RBDs, which either increase or decrease the hACE2 affinity of the SARS-CoV-2 RBD. The combined effect of these mutations is small, and the relative affinity is mainly determined by substitutions at residues in contact with hACE2. Many of these findings are in line and interpret recent experiments. Our computational protein design calculations redesign positions 455, 493, 494, and 501 of the SARS-CoV-2 receptor binding motif, which contact hACE2 in the complex and are important for ACE2 recognition. Sampling is enhanced by an adaptive importance sampling Monte Carlo method. Sequences with increased affinity replace CoV-2 glutamine by a negative residue at position 493; serine by a nonpolar or aromatic residue or an asparagine at position 494; and asparagine by valine or threonine at position 501. Substitutions at positions 455 and 501 have a smaller effect on affinity. Substitutions suggested by our design are seen in viral sequences encountered in other species, including bat and pangolin. Our results might be used to identify potential virus strains with higher human infectivity and assist in the design of peptide-based or peptidomimetic compounds with the potential to inhibit SARS-CoV-2 binding at hACE2.
Collapse
|
3
|
Michael E, Polydorides S, Simonson T, Archontis G. Hybrid MC/MD for protein design. J Chem Phys 2021; 153:054113. [PMID: 32770896 DOI: 10.1063/5.0013320] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023] Open
Abstract
Computational protein design relies on simulations of a protein structure, where selected amino acids can mutate randomly, and mutations are selected to enhance a target property, such as stability. Often, the protein backbone is held fixed and its degrees of freedom are modeled implicitly to reduce the complexity of the conformational space. We present a hybrid method where short molecular dynamics (MD) segments are used to explore conformations and alternate with Monte Carlo (MC) moves that apply mutations to side chains. The backbone is fully flexible during MD. As a test, we computed side chain acid/base constants or pKa's in five proteins. This problem can be considered a special case of protein design, with protonation/deprotonation playing the role of mutations. The solvent was modeled as a dielectric continuum. Due to cost, in each protein we allowed just one side chain position to change its protonation state and the other position to change its type or mutate. The pKa's were computed with a standard method that scans a range of pH values and with a new method that uses adaptive landscape flattening (ALF) to sample all protonation states in a single simulation. The hybrid method gave notably better accuracy than standard, fixed-backbone MC. ALF decreased the computational cost a factor of 13.
Collapse
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
| |
Collapse
|
4
|
Adaptive landscape flattening allows the design of both enzyme: Substrate binding and catalytic power. PLoS Comput Biol 2020; 16:e1007600. [PMID: 31917825 PMCID: PMC7041857 DOI: 10.1371/journal.pcbi.1007600] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2019] [Revised: 02/25/2020] [Accepted: 12/11/2019] [Indexed: 01/30/2023] Open
Abstract
Designed enzymes are of fundamental and technological interest. Experimental directed evolution still has significant limitations, and computational approaches are a complementary route. A designed enzyme should satisfy multiple criteria: stability, substrate binding, transition state binding. Such multi-objective design is computationally challenging. Two recent studies used adaptive importance sampling Monte Carlo to redesign proteins for ligand binding. By first flattening the energy landscape of the apo protein, they obtained positive design for the bound state and negative design for the unbound. We have now extended the method to design an enzyme for specific transition state binding, i.e., for its catalytic power. We considered methionyl-tRNA synthetase (MetRS), which attaches methionine (Met) to its cognate tRNA, establishing codon identity. Previously, MetRS and other synthetases have been redesigned by experimental directed evolution to accept noncanonical amino acids as substrates, leading to genetic code expansion. Here, we have redesigned MetRS computationally to bind several ligands: the Met analog azidonorleucine, methionyl-adenylate (MetAMP), and the activated ligands that form the transition state for MetAMP production. Enzyme mutants known to have azidonorleucine activity were recovered by the design calculations, and 17 mutants predicted to bind MetAMP were characterized experimentally and all found to be active. Mutants predicted to have low activation free energies for MetAMP production were found to be active and the predicted reaction rates agreed well with the experimental values. We suggest the present method should become the paradigm for computational enzyme design. Designed enzymes are of major interest. Experimental directed evolution still has significant limitations, and computational approaches are another route. Enzymes must be stable, bind substrates, and be powerful catalysts. It is challenging to design for all these properties. A method to design substrate binding was proposed recently. It used an adaptive Monte Carlo method to explore mutations of a few amino acids near the substrate. A bias energy was gradually “learned” such that, in the absence of the ligand, the simulation visited most of the possible protein mutations with comparable probabilities. Remarkably, a simulation of the protein:ligand complex, including the bias, will then preferentially sample tight-binding sequences. We generalized the method to design binding specificity. We tested it for the methionyl-tRNA synthetase enzyme, which has been engineered in order to expand the genetic code. We redesigned the enzyme to obtain variants with low activation free energies for the catalytic step. The variants proposed by the simulations were shown experimentally to be active, and the predicted activation free energies were in reasonable agreement with the experimental values. We expect the new method will become the paradigm for computational enzyme design.
Collapse
|
5
|
Gaillard T, Simonson T. Full Protein Sequence Redesign with an MMGBSA Energy Function. J Chem Theory Comput 2017; 13:4932-4943. [DOI: 10.1021/acs.jctc.7b00202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Gaillard
- Laboratoire de Biochimie
(CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie
(CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
| |
Collapse
|
6
|
Michael E, Polydorides S, Simonson T, Archontis G. Simple models for nonpolar solvation: Parameterization and testing. J Comput Chem 2017; 38:2509-2519. [PMID: 28786118 DOI: 10.1002/jcc.24910] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2017] [Revised: 07/19/2017] [Accepted: 07/20/2017] [Indexed: 12/13/2022]
Abstract
Implicit solvent models are important for many biomolecular simulations. The polarity of aqueous solvent is essential and qualitatively captured by continuum electrostatics methods like Generalized Born (GB). However, GB does not account for the solvent-induced interactions between exposed hydrophobic sidechains or solute-solvent dispersion interactions. These "nonpolar" effects are often modeled through surface area (SA) energy terms, which lack realism, create mathematical singularities, and have a many-body character. We have explored an alternate, Lazaridis-Karplus (LK) gaussian energy density for nonpolar effects and a dispersion (DI) energy term proposed earlier, associated with GB electrostatics. We parameterized several combinations of GB, SA, LK, and DI energy terms, to reproduce 62 small molecule solvation free energies, 387 protein stability changes due to point mutations, and the structures of 8 protein loops. With optimized parameters, the models all gave similar results, with GBLK and GBDILK giving no performance loss compared to GBSA, and mean errors of 1.7 kcal/mol for the stability changes and 2 Å deviations for the loop conformations. The optimized GBLK model gave poor results in MD of the Trpcage mini-protein, but parameters optimized specifically for MD performed well for Trpcage and three other small proteins. Overall, the LK and DI nonpolar terms are valid alternatives to SA treatments for a range of applications. © 2017 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus.,Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus
| |
Collapse
|
7
|
Villa F, Mignon D, Polydorides S, Simonson T. Comparing pairwise-additive and many-body generalized Born models for acid/base calculations and protein design. J Comput Chem 2017; 38:2396-2410. [PMID: 28749575 DOI: 10.1002/jcc.24898] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2017] [Revised: 06/30/2017] [Accepted: 07/06/2017] [Indexed: 12/13/2022]
Abstract
Generalized Born (GB) solvent models are common in acid/base calculations and protein design. With GB, the interaction between a pair of solute atoms depends on the shape of the protein/solvent boundary and, therefore, the positions of all solute atoms, so that GB is a many-body potential. For compute-intensive applications, the model is often simplified further, by introducing a mean, native-like protein/solvent boundary, which removes the many-body property. We investigate a method for both acid/base calculations and protein design that uses Monte Carlo simulations in which side chains can explore rotamers, bind/release protons, or mutate. The fluctuating protein/solvent dielectric boundary is treated in a way that is numerically exact (within the GB framework), in contrast to a mean boundary. Its originality is that it captures the many-body character while retaining the residue-pairwise complexity given by a fixed boundary. The method is implemented in the Proteus protein design software. It yields a slight but systematic improvement for acid/base constants in nine proteins and a significant improvement for the computational design of three PDZ domains. It eliminates a source of model uncertainty, which will facilitate the analysis of other model limitations. © 2017 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Francesco Villa
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| | - David Mignon
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| | - Savvas Polydorides
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| | - Thomas Simonson
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| |
Collapse
|
8
|
Mignon D, Simonson T. Comparing three stochastic search algorithms for computational protein design: Monte Carlo, replica exchange Monte Carlo, and a multistart, steepest-descent heuristic. J Comput Chem 2016; 37:1781-93. [PMID: 27197555 DOI: 10.1002/jcc.24393] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2015] [Revised: 02/26/2016] [Accepted: 03/27/2016] [Indexed: 01/11/2023]
Abstract
Computational protein design depends on an energy function and an algorithm to search the sequence/conformation space. We compare three stochastic search algorithms: a heuristic, Monte Carlo (MC), and a Replica Exchange Monte Carlo method (REMC). The heuristic performs a steepest-descent minimization starting from thousands of random starting points. The methods are applied to nine test proteins from three structural families, with a fixed backbone structure, a molecular mechanics energy function, and with 1, 5, 10, 20, 30, or all amino acids allowed to mutate. Results are compared to an exact, "Cost Function Network" method that identifies the global minimum energy conformation (GMEC) in favorable cases. The designed sequences accurately reproduce experimental sequences in the hydrophobic core. The heuristic and REMC agree closely and reproduce the GMEC when it is known, with a few exceptions. Plain MC performs well for most cases, occasionally departing from the GMEC by 3-4 kcal/mol. With REMC, the diversity of the sequences sampled agrees with exact enumeration where the latter is possible: up to 2 kcal/mol above the GMEC. Beyond, room temperature replicas sample sequences up to 10 kcal/mol above the GMEC, providing thermal averages and a solution to the inverse protein folding problem. © 2016 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- David Mignon
- Laboratoire De Biochimie (UMR CNRS 7654), Department Of Biology, Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire De Biochimie (UMR CNRS 7654), Department Of Biology, Ecole Polytechnique, Palaiseau, France
| |
Collapse
|
9
|
Gaillard T, Panel N, Simonson T. Protein side chain conformation predictions with an MMGBSA energy function. Proteins 2016; 84:803-19. [PMID: 26948696 DOI: 10.1002/prot.25030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2015] [Revised: 02/22/2016] [Accepted: 02/27/2016] [Indexed: 12/17/2022]
Abstract
The prediction of protein side chain conformations from backbone coordinates is an important task in structural biology, with applications in structure prediction and protein design. It is a difficult problem due to its combinatorial nature. We study the performance of an "MMGBSA" energy function, implemented in our protein design program Proteus, which combines molecular mechanics terms, a Generalized Born and Surface Area (GBSA) solvent model, with approximations that make the model pairwise additive. Proteus is not a competitor to specialized side chain prediction programs due to its cost, but it allows protein design applications, where side chain prediction is an important step and MMGBSA an effective energy model. We predict the side chain conformations for 18 proteins. The side chains are first predicted individually, with the rest of the protein in its crystallographic conformation. Next, all side chains are predicted together. The contributions of individual energy terms are evaluated and various parameterizations are compared. We find that the GB and SA terms, with an appropriate choice of the dielectric constant and surface energy coefficients, are beneficial for single side chain predictions. For the prediction of all side chains, however, errors due to the pairwise additive approximation overcome the improvement brought by these terms. We also show the crucial contribution of side chain minimization to alleviate the rigid rotamer approximation. Even without GB and SA terms, we obtain accuracies comparable to SCWRL4, a specialized side chain prediction program. In particular, we obtain a better RMSD than SCWRL4 for core residues (at a higher cost), despite our simpler rotamer library. Proteins 2016; 84:803-819. © 2016 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Nicolas Panel
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| |
Collapse
|
10
|
Simonson T, Ye-Lehmann S, Palmai Z, Amara N, Wydau-Dematteis S, Bigan E, Druart K, Moch C, Plateau P. Redesigning the stereospecificity of tyrosyl-tRNA synthetase. Proteins 2016; 84:240-53. [PMID: 26676967 DOI: 10.1002/prot.24972] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2015] [Revised: 09/30/2015] [Accepted: 11/26/2015] [Indexed: 12/14/2022]
Abstract
D-Amino acids are largely excluded from protein synthesis, yet they are of great interest in biotechnology. Unnatural amino acids have been introduced into proteins using engineered aminoacyl-tRNA synthetases (aaRSs), and this strategy might be applicable to D-amino acids. Several aaRSs can aminoacylate their tRNA with a D-amino acid; of these, tyrosyl-tRNA synthetase (TyrRS) has the weakest stereospecificity. We use computational protein design to suggest active site mutations in Escherichia coli TyrRS that could increase its D-Tyr binding further, relative to L-Tyr. The mutations selected all modify one or more sidechain charges in the Tyr binding pocket. We test their effect by probing the aminoacyl-adenylation reaction through pyrophosphate exchange experiments. We also perform extensive alchemical free energy simulations to obtain L-Tyr/D-Tyr binding free energy differences. Agreement with experiment is good, validating the structural models and detailed thermodynamic predictions the simulations provide. The TyrRS stereospecificity proves hard to engineer through charge-altering mutations in the first and second coordination shells of the Tyr ammonium group. Of six mutants tested, two are active towards D-Tyr; one of these has an inverted stereospecificity, with a large preference for D-Tyr. However, its activity is low. Evidently, the TyrRS stereospecificity is robust towards charge rearrangements near the ligand. Future design may have to consider more distant and/or electrically neutral target mutations, and possibly design for binding of the transition state, whose structure however can only be modeled.
Collapse
Affiliation(s)
- Thomas Simonson
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | | | - Zoltan Palmai
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Najette Amara
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Sandra Wydau-Dematteis
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Erwan Bigan
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Karen Druart
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Clara Moch
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Pierre Plateau
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| |
Collapse
|
11
|
Polydorides S, Michael E, Mignon D, Druart K, Archontis G, Simonson T. Proteus and the Design of Ligand Binding Sites. Methods Mol Biol 2016; 1414:77-97. [PMID: 27094287 DOI: 10.1007/978-1-4939-3569-7_6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
This chapter describes the organization and use of Proteus, a multitool computational suite for the optimization of protein and ligand conformations and sequences, and the calculation of pK α shifts and relative binding affinities. The software offers the use of several molecular mechanics force fields and solvent models, including two generalized Born variants, and a large range of scoring functions, which can combine protein stability, ligand affinity, and ligand specificity terms, for positive and negative design. We present in detail the steps for structure preparation, system setup, construction of the interaction energy matrix, protein sequence and structure optimizations, pK α calculations, and ligand titration calculations. We discuss illustrative examples, including the chemical/structural optimization of a complex between the MHC class II protein HLA-DQ8 and the vinculin epitope, and the chemical optimization of the compstatin analog Ac-Val4Trp/His9Ala, which regulates the function of protein C3 of the complement system.
Collapse
Affiliation(s)
- Savvas Polydorides
- Theoretical and Computational Biophysics Group, Department of Physics, University of Cyprus, 1678, Nicosia, Cyprus
| | - Eleni Michael
- Theoretical and Computational Biophysics Group, Department of Physics, University of Cyprus, 1678, Nicosia, Cyprus
| | - David Mignon
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
| | - Karen Druart
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
| | - Georgios Archontis
- Theoretical and Computational Biophysics Group, Department of Physics, University of Cyprus, 1678, Nicosia, Cyprus.
| | - Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France.
| |
Collapse
|
12
|
Druart K, Palmai Z, Omarjee E, Simonson T. Protein:Ligand binding free energies: A stringent test for computational protein design. J Comput Chem 2015; 37:404-15. [PMID: 26503829 DOI: 10.1002/jcc.24230] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2015] [Revised: 10/01/2015] [Accepted: 10/02/2015] [Indexed: 01/29/2023]
Abstract
A computational protein design method is extended to allow Monte Carlo simulations where two ligands are titrated into a protein binding pocket, yielding binding free energy differences. These provide a stringent test of the physical model, including the energy surface and sidechain rotamer definition. As a test, we consider tyrosyl-tRNA synthetase (TyrRS), which has been extensively redesigned experimentally. We consider its specificity for its substrate l-tyrosine (l-Tyr), compared to the analogs d-Tyr, p-acetyl-, and p-azido-phenylalanine (ac-Phe, az-Phe). We simulate l- and d-Tyr binding to TyrRS and six mutants, and compare the structures and binding free energies to a more rigorous "MD/GBSA" procedure: molecular dynamics with explicit solvent for structures and a Generalized Born + Surface Area model for binding free energies. Next, we consider l-Tyr, ac- and az-Phe binding to six other TyrRS variants. The titration results are sensitive to the precise rotamer definition, which involves a short energy minimization for each sidechain pair to help relax bad contacts induced by the discrete rotamer set. However, when designed mutant structures are rescored with a standard GBSA energy model, results agree well with the more rigorous MD/GBSA. As a third test, we redesign three amino acid positions in the substrate coordination sphere, with either l-Tyr or d-Tyr as the ligand. For two, we obtain good agreement with experiment, recovering the wildtype residue when l-Tyr is the ligand and a d-Tyr specific mutant when d-Tyr is the ligand. For the third, we recover His with either ligand, instead of wildtype Gln.
Collapse
Affiliation(s)
- Karen Druart
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Zoltan Palmai
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Eyaz Omarjee
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| |
Collapse
|
13
|
MD Simulations of tRNA and Aminoacyl-tRNA Synthetases: Dynamics, Folding, Binding, and Allostery. Int J Mol Sci 2015; 16:15872-902. [PMID: 26184179 PMCID: PMC4519929 DOI: 10.3390/ijms160715872] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2015] [Revised: 07/07/2015] [Accepted: 07/08/2015] [Indexed: 12/21/2022] Open
Abstract
While tRNA and aminoacyl-tRNA synthetases are classes of biomolecules that have been extensively studied for decades, the finer details of how they carry out their fundamental biological functions in protein synthesis remain a challenge. Recent molecular dynamics (MD) simulations are verifying experimental observations and providing new insight that cannot be addressed from experiments alone. Throughout the review, we briefly discuss important historical events to provide a context for how far the field has progressed over the past few decades. We then review the background of tRNA molecules, aminoacyl-tRNA synthetases, and current state of the art MD simulation techniques for those who may be unfamiliar with any of those fields. Recent MD simulations of tRNA dynamics and folding and of aminoacyl-tRNA synthetase dynamics and mechanistic characterizations are discussed. We highlight the recent successes and discuss how important questions can be addressed using current MD simulations techniques. We also outline several natural next steps for computational studies of AARS:tRNA complexes.
Collapse
|
14
|
Gaillard T, Simonson T. Pairwise decomposition of an MMGBSA energy function for computational protein design. J Comput Chem 2014; 35:1371-87. [PMID: 24854675 DOI: 10.1002/jcc.23637] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2014] [Revised: 04/14/2014] [Accepted: 05/01/2014] [Indexed: 02/02/2023]
Abstract
Computational protein design (CPD) aims at predicting new proteins or modifying existing ones. The computational challenge is huge as it requires exploring an enormous sequence and conformation space. The difficulty can be reduced by considering a fixed backbone and a discrete set of sidechain conformations. Another common strategy consists in precalculating a pairwise energy matrix, from which the energy of any sequence/conformation can be quickly obtained. In this work, we examine the pairwise decomposition of protein MMGBSA energy functions from a general theoretical perspective, and an implementation proposed earlier for CPD. It includes a Generalized Born term, whose many-body character is overcome using an effective dielectric environment, and a Surface Area term, for which we present an improved pairwise decomposition. A detailed evaluation of the error introduced by the decomposition on the different energy components is performed. We show that the error remains reasonable, compared to other uncertainties.
Collapse
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
| | | |
Collapse
|
15
|
Simonson T. What Is the Dielectric Constant of a Protein When Its Backbone Is Fixed? J Chem Theory Comput 2013; 9:4603-8. [DOI: 10.1021/ct400398e] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie
(CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
| |
Collapse
|
16
|
Simonson T, Gaillard T, Mignon D, Schmidt am Busch M, Lopes A, Amara N, Polydorides S, Sedano A, Druart K, Archontis G. Computational protein design: the Proteus software and selected applications. J Comput Chem 2013; 34:2472-84. [PMID: 24037756 DOI: 10.1002/jcc.23418] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2013] [Revised: 07/08/2013] [Accepted: 07/28/2013] [Indexed: 12/13/2022]
Abstract
We describe an automated procedure for protein design, implemented in a flexible software package, called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, supporting several force fields and solvent models. A second program provides algorithms to search sequence space. It allows a decomposition of the system into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. Complete redesign of the Crk SH3 domain is presented. The top 10000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of application, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to other current software packages.
Collapse
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, 91128, France
| | | | | | | | | | | | | | | | | | | |
Collapse
|
17
|
Designing electrostatic interactions in biological systems via charge optimization or combinatorial approaches: insights and challenges with a continuum electrostatic framework. Theor Chem Acc 2012. [DOI: 10.1007/s00214-012-1252-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
18
|
Nielsen JE, Gunner MR, Bertrand García-Moreno E. The pKa Cooperative: a collaborative effort to advance structure-based calculations of pKa values and electrostatic effects in proteins. Proteins 2011; 79:3249-59. [PMID: 22002877 PMCID: PMC3375608 DOI: 10.1002/prot.23194] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2011] [Accepted: 09/13/2011] [Indexed: 12/13/2022]
Abstract
The pK(a) Cooperative (http://www.pkacoop.org) was organized to advance development of accurate and useful computational methods for structure-based calculation of pK(a) values and electrostatic energies in proteins. The Cooperative brings together laboratories with expertise and interest in theoretical, computational, and experimental studies of protein electrostatics. To improve structure-based energy calculations, it is necessary to better understand the physical character and molecular determinants of electrostatic effects. Thus, the Cooperative intends to foment experimental research into fundamental aspects of proteins that depend on electrostatic interactions. It will maintain a depository for experimental data useful for critical assessment of methods for structure-based electrostatics calculations. To help guide the development of computational methods, the Cooperative will organize blind prediction exercises. As a first step, computational laboratories were invited to reproduce an unpublished set of experimental pK(a) values of acidic and basic residues introduced in the interior of staphylococcal nuclease by site-directed mutagenesis. The pK(a) values of these groups are unique and challenging to simulate owing to the large magnitude of their shifts relative to normal pK(a) values in water. Many computational methods were tested in this first Blind Prediction Challenge and critical assessment exercise. A workshop was organized in the Telluride Science Research Center to objectively assess the performance of many computational methods tested on this one extensive data set. This volume of Proteins: Structure, Function, and Bioinformatics introduces the pK(a) Cooperative, presents reports submitted by participants in the Blind Prediction Challenge, and highlights some of the problems in structure-based calculations identified during this exercise.
Collapse
Affiliation(s)
- Jens E. Nielsen
- School of Biomolecular and Biomedical Science, Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland
| | - M. R. Gunner
- Department of Physics, City College of New York, New York, NY 10031
| | | |
Collapse
|