1
Hoch SY, Netzer R, Weinstein JY, Krauss L, Hakeny K, Fleishman SJ. GGAssembler: Precise and economical design and synthesis of combinatorial mutation libraries. Protein Sci 2024; 33:e5169. [PMID: 39283039 PMCID: PMC11403590 DOI: 10.1002/pro.5169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 08/21/2024] [Accepted: 08/26/2024] [Indexed: 09/22/2024]
Abstract
Golden Gate assembly (GGA) can seamlessly generate full-length genes from DNA fragments. In principle, GGA could be used to design combinatorial mutation libraries for protein engineering, but creating accurate, complex, and cost-effective libraries has been challenging. We present GGAssembler, a graph-theoretical method for economical design of DNA fragments that assemble a combinatorial library that encodes any desired diversity. We used GGAssembler for one-pot in vitro assembly of camelid antibody libraries comprising >10^5 variants with DNA costs <$0.007 per variant and dropping significantly with increased library complexity. >93% of the desired variants were present in the assembly product and >99% were represented within the expected order of magnitude as verified by deep sequencing. The GGAssembler workflow is, therefore, an accurate approach for generating complex variant libraries that may drastically reduce costs and accelerate discovery and optimization of antibodies, enzymes and other proteins. The workflow is accessible through a Google Colab notebook at https://github.com/Fleishman-Lab/GGAssembler.
Affiliation(s)
- Shlomo Yakir Hoch
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Ravit Netzer
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Lucas Krauss
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Karen Hakeny
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
2
Papamichail D, Febinger M, Almeda S, Aberbach T, Papamichail G. Synthesis cost-optimal targeted mutant protein libraries. Comput Biol Chem 2024; 110:108068. [PMID: 38669847 DOI: 10.1016/j.compbiolchem.2024.108068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 03/20/2024] [Accepted: 04/03/2024] [Indexed: 04/28/2024]
Abstract
Protein variant libraries produced by site-directed mutagenesis are a useful tool utilized by protein engineers to explore variants with potentially improved properties, such as activity and stability. These libraries are commonly built by selecting residue positions and alternative beneficial mutations for each position. All possible combinations are then constructed and screened, by incorporating degenerate codons at mutation sites. These degenerate codons often encode additional unwanted amino acids or even STOP codons. Our study aims to take advantage of annealing based recombination of oligonucleotides during synthesis and utilize multiple degenerate codons per mutation site to produce targeted protein libraries devoid of unwanted variants. Toward this goal we created an algorithm to calculate the minimum number of degenerate codons necessary to specify any given amino acid set, and a dynamic programming method that uses this algorithm to optimally partition a DNA target sequence with degeneracies into overlapping oligonucleotides, such that the total cost of synthesis of the target mutant protein library is minimized. Computational experiments show that, for a modest increase in DNA synthesis costs, beneficial variant yields in produced mutant libraries are increased by orders of magnitude, an effect particularly pronounced in large combinatorial libraries.
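The problem this abstract describes, degenerate codons that encode unwanted amino acids or STOP codons, can be illustrated with a minimal Python sketch. It only expands a degenerate codon under the standard genetic code; it is not the authors' minimization or partitioning algorithm, and the helper name `encoded_amino_acids` is a hypothetical one introduced here.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code with codons enumerated in TCAG order; "*" marks STOP.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def encoded_amino_acids(degenerate_codon):
    """Return the set of amino acids ('*' for STOP) a degenerate codon encodes."""
    choices = [IUPAC[base] for base in degenerate_codon.upper()]
    return {CODON_TABLE["".join(bases)] for bases in product(*choices)}


if __name__ == "__main__":
    for dc in ("NNK", "NDT", "GYT"):
        aas = encoded_amino_acids(dc)
        print(dc, len(aas), "".join(sorted(aas)))
```

In this sketch, NNK covers all 20 amino acids but also one STOP codon (TAG), whereas GYT encodes exactly alanine and valine. That difference is the kind of gap between a requested amino acid set and what a single degenerate codon delivers.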
Affiliation(s)
- Dimitris Papamichail
  - Department of Computer Science, The College of New Jersey, 2000 Pennington Road, Ewing, 08628, NJ, USA
- Madeline Febinger
  - Department of Computer Science, The College of New Jersey, 2000 Pennington Road, Ewing, 08628, NJ, USA
- Shm Almeda
  - Department of Computer Science, The College of New Jersey, 2000 Pennington Road, Ewing, 08628, NJ, USA
- Tomer Aberbach
  - Department of Computer Science, The College of New Jersey, 2000 Pennington Road, Ewing, 08628, NJ, USA
3
Haloi N, Huang S, Nichols AL, Fine EJ, Friesenhahn NJ, Marotta CB, Dougherty DA, Lindahl E, Howard RJ, Mayo SL, Lester HA. Interactive computational and experimental approaches improve the sensitivity of periplasmic binding protein-based nicotine biosensors for measurements in biofluids. Protein Eng Des Sel 2024; 37:gzae003. [PMID: 38302088 PMCID: PMC10896302 DOI: 10.1093/protein/gzae003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 01/17/2024] [Accepted: 01/29/2024] [Indexed: 02/03/2024] [Open Access]
Abstract
We developed fluorescent protein sensors for nicotine with improved sensitivity. For iNicSnFR12 at pH 7.4, the proportionality constant for ∆F/F0 vs. [nicotine] (δ-slope, 2.7 μM^-1) is 6.1-fold higher than the previously reported iNicSnFR3a. The activated state of iNicSnFR12 has a fluorescence quantum yield of at least 0.6. We measured similar dose-response relations for the nicotine-induced absorbance increase and fluorescence increase, suggesting that the absorbance increase leads to the fluorescence increase via the previously described nicotine-induced conformational change, the 'candle snuffer' mechanism. Molecular dynamics (MD) simulations identified a binding pose for nicotine, previously indeterminate from experimental data. MD simulations also showed that Helix 4 of the periplasmic binding protein (PBP) domain appears tilted in iNicSnFR12 relative to iNicSnFR3a, likely altering allosteric network(s) that link the ligand binding site to the fluorophore. In thermal melt experiments, nicotine stabilized the PBP of the tested iNicSnFR variants. iNicSnFR12 resolved nicotine in diluted mouse and human serum at 100 nM, the peak [nicotine] that occurs during smoking or vaping, and possibly at the decreasing levels during intervals between sessions. iNicSnFR12 was also partially activated by unidentified endogenous ligand(s) in biofluids. Improved iNicSnFR12 variants could become the molecular sensors in continuous nicotine monitors for animal and human biofluids.
Affiliation(s)
- Nandan Haloi
  - Department of Applied Physics, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm 10044, Sweden
- Shan Huang
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Aaron L Nichols
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Eve J Fine
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Nicholas J Friesenhahn
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Christopher B Marotta
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Dennis A Dougherty
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Erik Lindahl
  - Department of Applied Physics, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm 10044, Sweden
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm 10691, Sweden
- Rebecca J Howard
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm 10691, Sweden
- Stephen L Mayo
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Henry A Lester
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
4
Gisdon FJ, Kynast JP, Ayyildiz M, Hine AV, Plückthun A, Höcker B. Modular peptide binders - development of a predictive technology as alternative for reagent antibodies. Biol Chem 2022; 403:535-543. [PMID: 35089661 DOI: 10.1515/hsz-2021-0384] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/30/2021] [Accepted: 01/11/2022] [Indexed: 11/15/2022]
Abstract
Current biomedical research and diagnostics critically depend on detection agents for specific recognition and quantification of protein molecules. Monoclonal antibodies have been used for this purpose over decades and facilitated numerous biological and biomedical investigations. Recently, however, it has become apparent that many commercial reagent antibodies lack specificity or do not recognize their target at all. Thus, synthetic alternatives are needed whose complex designs are facilitated by multidisciplinary approaches incorporating experimental protein engineering with computational modeling. Here, we review the status of such an engineering endeavor based on the modular armadillo repeat protein scaffold and discuss challenges in its implementation.
Affiliation(s)
- Florian J Gisdon
  - Department of Biochemistry, University of Bayreuth, D-95447 Bayreuth, Germany
- Josef P Kynast
  - Department of Biochemistry, University of Bayreuth, D-95447 Bayreuth, Germany
- Merve Ayyildiz
  - Department of Biochemistry, University of Bayreuth, D-95447 Bayreuth, Germany
- Anna V Hine
  - College of Health and Life Sciences, Aston University, Birmingham B4 7ET, UK
- Andreas Plückthun
  - Department of Biochemistry, University of Zurich, CH-8057 Zürich, Switzerland
- Birte Höcker
  - Department of Biochemistry, University of Bayreuth, D-95447 Bayreuth, Germany
5
Khrenova MG, Mulashkin FD, Nemukhin AV. Modeling Spectral Tuning in Red Fluorescent Proteins Using the Dipole Moment Variation upon Excitation. J Chem Inf Model 2021; 61:5125-5132. [PMID: 34601882 DOI: 10.1021/acs.jcim.1c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
We describe a model for spectral tuning in red fluorescent proteins (RFPs) based on the relation between an electronic structure descriptor, the dipole moment variation upon excitation (DMV), and the excitation energy of a protein. This approach aims to overcome the problem of accurate prediction of excitation energies in RFPs, which span a very narrow window of band maxima. The latter roughly corresponds to the energy range of 0.1 eV, which is comparable with typical errors in calculations of the excitation energy by conventional quantum chemistry methods. In this work, we demonstrate a strong quantitative correlation between DMV values, obtained computationally with modest efforts, and excitation energies ΔEex at the experimental excitation band maxima for a series of RFPs with bands between 570 and 605 nm. Protein models are constructed by motifs of the relevant crystal structures, and atomic coordinates are optimized in quantum mechanics/molecular mechanics (QM/MM) calculations with QM-subsystems composed of large chromophore-containing regions. DMV values are evaluated with the electron density computed at the time-dependent density functional theory (TDDFT) level using several functionals and basis sets. We show that the results obtained with the CAM-B3LYP, BHHLYP, and M06-2X functionals demonstrate favorable correlations between DMV and ΔEex with the mean absolute error less than 0.01 eV. Taking into account the solid theoretical grounds of the relation between the DMV and the excitation energy in fluorescent proteins, the described modeling strategy presents a rational tool for spectral tuning in these efficient markers for in vivo imaging.
Affiliation(s)
- Maria G Khrenova
  - Department of Chemistry, Lomonosov Moscow State University, Moscow 119991, Russian Federation
  - Bach Institute of Biochemistry, Federal Research Centre "Fundamentals of Biotechnology", Russian Academy of Sciences, Moscow 119071, Russian Federation
- Fedor D Mulashkin
  - Department of Chemistry, Lomonosov Moscow State University, Moscow 119991, Russian Federation
- Alexander V Nemukhin
  - Department of Chemistry, Lomonosov Moscow State University, Moscow 119991, Russian Federation
  - Emanuel Institute of Biochemical Physics, Russian Academy of Sciences, Moscow 119334, Russian Federation
6
Chen S, Sun Z, Lin L, Liu Z, Liu X, Chong Y, Lu Y, Zhao H, Yang Y. To Improve Protein Sequence Profile Prediction through Image Captioning on Pairwise Residue Distance Map. J Chem Inf Model 2019; 60:391-399. [PMID: 31800243 DOI: 10.1021/acs.jcim.9b00438] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 11/30/2022]
Abstract
Protein sequence profile prediction aims to generate multiple sequences from structural information to advance protein design. Protein sequence profiles can be computationally predicted by energy-based or fragment-based methods. By integrating these methods with neural networks, our previous method, SPIN2, has achieved a sequence recovery rate of 34%. However, SPIN2 employed only one-dimensional (1D) structural properties that are not sufficient to represent three-dimensional (3D) structures. In this study, we represented 3D structures by 2D maps of pairwise residue distances and developed a new method (SPROF) to predict protein sequence profiles based on an image captioning learning frame. To the best of our knowledge, this is the first method to employ a 2D distance map for predicting protein properties. SPROF achieved 39.8% in sequence recovery of residues on the independent test set, representing a 5.2% improvement over SPIN2. We also found that sequence recovery increased with the number of neighboring residues in 3D structural space, indicating that our method can effectively learn long-range information from the 2D distance map. Thus, such a network architecture using a 2D distance map is expected to be useful for other 3D structure-based applications, such as binding site prediction, protein function prediction, and protein interaction prediction. The online server and the source code are available at http://biomed.nscc-gz.cn and https://github.com/biomed-AI/SPROF, respectively.
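The 2D representation mentioned in this abstract, a map of pairwise residue distances, can be sketched in a few lines of Python. This is only an illustration of what such a map is, not SPROF's preprocessing: the choice of one 3D point per residue (for example a Cα atom) and the function name `distance_map` are assumptions made here.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one representative point per residue
    (assumed here to be, e.g., the C-alpha atom position).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = dmap[j][i] = d  # the map is symmetric by construction
    return dmap


if __name__ == "__main__":
    # Hypothetical coordinates for three residues, chosen for easy arithmetic.
    toy = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
    for row in distance_map(toy):
        print(row)
```

An L-residue chain yields an L x L matrix, so the structure can be handled like a single-channel image, which is the property an image-captioning architecture would rely on.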
Affiliation(s)
- Sheng Chen
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Zhe Sun
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Lihua Lin
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Zifeng Liu
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Xun Liu
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Yutian Chong
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Yutong Lu
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
  - Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University) of the Ministry of Education, Guangzhou 510000, China
7
Loshbaugh AL, Kortemme T. Comparison of Rosetta flexible-backbone computational protein design methods on binding interactions. Proteins 2019; 88:206-226. [PMID: 31344278 DOI: 10.1002/prot.25790] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/05/2019] [Revised: 07/15/2019] [Accepted: 07/19/2019] [Indexed: 01/03/2023]
Abstract
Computational design of binding sites in proteins remains difficult, in part due to limitations in our current ability to sample backbone conformations that enable precise and accurate geometric positioning of side chains during sequence design. Here we present a benchmark framework for comparison between flexible-backbone design methods applied to binding interactions. We quantify the ability of different flexible backbone design methods in the widely used protein design software Rosetta to recapitulate observed protein sequence profiles assumed to represent functional protein/protein and protein/small molecule binding interactions. The CoupledMoves method, which combines backbone flexibility and sequence exploration into a single acceptance step during the sampling trajectory, better recapitulates observed sequence profiles than the BackrubEnsemble and FastDesign methods, which separate backbone flexibility and sequence design into separate acceptance steps during the sampling trajectory. Flexible-backbone design with the CoupledMoves method is a powerful strategy for reducing sequence space to generate targeted libraries for experimental screening and selection.
Affiliation(s)
- Amanda L Loshbaugh
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California
  - Biophysics Graduate Program, University of California San Francisco, San Francisco, California
- Tanja Kortemme
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California
  - Biophysics Graduate Program, University of California San Francisco, San Francisco, California
  - Quantitative Biosciences Institute, University of California San Francisco, San Francisco, California
  - Chan Zuckerberg Biohub, San Francisco, California
8
Verma D, Grigoryan G, Bailey-Kellogg C. Pareto Optimization of Combinatorial Mutagenesis Libraries. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:1143-1153. [PMID: 30040654 PMCID: PMC8262366 DOI: 10.1109/tcbb.2018.2858794] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 05/20/2023]
Abstract
In order to increase the hit rate of discovering diverse, beneficial protein variants via high-throughput screening, we have developed a computational method to optimize combinatorial mutagenesis libraries for overall enrichment in two distinct properties of interest. Given scoring functions for evaluating individual variants, POCoM (Pareto Optimal Combinatorial Mutagenesis) scores entire libraries in terms of averages over their constituent members, and designs optimal libraries as sets of mutations whose combinations make the best trade-offs between average scores. This represents the first general-purpose method to directly design combinatorial libraries for multiple objectives characterizing their constituent members. Despite being rigorous in mapping out the Pareto frontier, it is also very fast even for very large libraries (e.g., designing 30 mutation, billion-member libraries in only hours). We here instantiate POCoM with scores based on a target's protein structure and its homologs' sequences, enabling the design of libraries containing variants balancing these two important yet quite different types of information. We demonstrate POCoM's generality and power in case study applications to green fluorescent protein, cytochrome P450, and β-lactamase. Analysis of the POCoM library designs provides insights into the trade-offs between structure- and sequence-based scores, as well as the impacts of experimental constraints on library designs. POCoM libraries incorporate mutations that have previously been found favorable experimentally, while diversifying the contexts in which these mutations are situated and maintaining overall variant quality.
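The notion of a Pareto frontier used in this abstract can be made concrete with a short Python sketch. It filters a list of two-objective scores down to the non-dominated ones; it is a generic illustration, not POCoM's optimization procedure, and both the function name `pareto_front` and the convention that lower scores are better for both objectives are assumptions made here.

```python
def pareto_front(points):
    """Return the non-dominated (score_a, score_b) pairs.

    Assumes lower is better for both objectives. A point is kept only if no
    other point is at least as good in both scores and better in one.
    """
    front = []
    best_b = float("inf")
    # After sorting by score_a (ties broken by score_b), a point is
    # non-dominated exactly when its score_b beats every earlier point's.
    for a, b in sorted(points):
        if b < best_b:
            front.append((a, b))
            best_b = b
    return front


if __name__ == "__main__":
    # Hypothetical (structure-based, sequence-based) average library scores.
    libraries = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)]
    print(pareto_front(libraries))
```

Each library left on the front represents a different trade-off between the two scores; none of them can be improved in one objective without giving up something in the other.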
9
Foight GW, Chen TS, Richman D, Keating AE. Enriching Peptide Libraries for Binding Affinity and Specificity Through Computationally Directed Library Design. Methods Mol Biol 2018; 1561:213-232. [PMID: 28236241 DOI: 10.1007/978-1-4939-6798-8_13] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]
Abstract
Peptide reagents with high affinity or specificity for their target protein interaction partner are of utility for many important applications. Optimization of peptide binding by screening large libraries is a proven and powerful approach. Libraries designed to be enriched in peptide sequences that are predicted to have desired affinity or specificity characteristics are more likely to yield success than random mutagenesis. We present a library optimization method in which the choice of amino acids to encode at each peptide position can be guided by available experimental data or structure-based predictions. We discuss how to use analysis of predicted library performance to inform rounds of library design. Finally, we include protocols for more complex library design procedures that consider the chemical diversity of the amino acids at each peptide position and optimize a library score based on a user-specified input model.
Affiliation(s)
- Glenna Wink Foight
  - Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Ave., Bldg., 68-622, Cambridge, MA, 02139, USA
  - Department of Chemistry, University of Washington, Seattle, WA, 98195, USA
- T Scott Chen
  - Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Ave., Bldg., 68-622, Cambridge, MA, 02139, USA
  - Google Inc., Mountain View, CA, 94043, USA
- Daniel Richman
  - Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Ave., Bldg., 68-622, Cambridge, MA, 02139, USA
- Amy E Keating
  - Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Ave., Bldg., 68-622, Cambridge, MA, 02139, USA
  - Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave., Bldg., 68-622, Cambridge, MA, 02139, USA
10
Computational protein design with backbone plasticity. Biochem Soc Trans 2016; 44:1523-1529. [PMID: 27911735 PMCID: PMC5264498 DOI: 10.1042/bst20160155] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/03/2016] [Revised: 08/01/2016] [Accepted: 08/03/2016] [Indexed: 11/17/2022]
Abstract
The computational algorithms used in the design of artificial proteins have become increasingly sophisticated in recent years, producing a series of remarkable successes. The most dramatic of these is the de novo design of artificial enzymes. The majority of these designs have reused naturally occurring protein structures as ‘scaffolds’ onto which novel functionality can be grafted without having to redesign the backbone structure. The incorporation of backbone flexibility into protein design is a much more computationally challenging problem due to the greatly increased search space, but promises to remove the limitations of reusing natural protein scaffolds. In this review, we outline the principles of computational protein design methods and discuss recent efforts to consider backbone plasticity in the design process.
11
Sun MGF, Seo MH, Nim S, Corbi-Verge C, Kim PM. Protein engineering by highly parallel screening of computationally designed variants. Sci Adv 2016; 2:e1600692. [PMID: 27453948 PMCID: PMC4956399 DOI: 10.1126/sciadv.1600692] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 03/31/2016] [Accepted: 06/23/2016] [Indexed: 06/06/2023]
Abstract
Current combinatorial selection strategies for protein engineering have been successful at generating binders against a range of targets; however, the combinatorial nature of the libraries and their vast undersampling of sequence space inherently limit these methods due to the difficulty in finely controlling protein properties of the engineered region. Meanwhile, great advances in computational protein design that can address these issues have largely been underutilized. We describe an integrated approach that computationally designs thousands of individual protein binders for high-throughput synthesis and selection to engineer high-affinity binders. We show that a computationally designed library enriches for tight-binding variants by many orders of magnitude as compared to conventional randomization strategies. We thus demonstrate the feasibility of our approach in a proof-of-concept study and successfully obtain low-nanomolar binders using in vitro and in vivo selection systems.
Affiliation(s)
- Mark G. F. Sun
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Moon-Hyeong Seo
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Satra Nim
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Carles Corbi-Verge
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Philip M. Kim
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
12
Sun Z, Wikmark Y, Bäckvall JE, Reetz MT. New Concepts for Increasing the Efficiency in Directed Evolution of Stereoselective Enzymes. Chemistry 2016; 22:5046-54. [DOI: 10.1002/chem.201504406] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 11/02/2015] [Indexed: 01/28/2023]
Affiliation(s)
- Zhoutong Sun
  - Max-Planck-Institut für Kohlenforschung; Kaiser-Wilhelm-Platz 1 45470 Mülheim an der Ruhr Germany
  - Fachbereich Chemie; Philipps-Universität Marburg; Hans-Meerwein-Strasse 4 35032 Marburg Germany
- Ylva Wikmark
  - Department of Organic Chemistry; Arrhenius Laboratory; Stockholm University; 106 91 Stockholm Sweden
- Jan-E. Bäckvall
  - Department of Organic Chemistry; Arrhenius Laboratory; Stockholm University; 106 91 Stockholm Sweden
- Manfred T. Reetz
  - Max-Planck-Institut für Kohlenforschung; Kaiser-Wilhelm-Platz 1 45470 Mülheim an der Ruhr Germany
  - Fachbereich Chemie; Philipps-Universität Marburg; Hans-Meerwein-Strasse 4 35032 Marburg Germany
13
Dean KM, Davis LM, Lubbeck JL, Manna P, Friis P, Palmer AE, Jimenez R. High-speed multiparameter photophysical analyses of fluorophore libraries. Anal Chem 2015; 87:5026-30. [PMID: 25898152 DOI: 10.1021/acs.analchem.5b00607] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 11/28/2022]
Abstract
There is a critical need for high-speed multiparameter photophysical measurements of large libraries of fluorescent probe variants for imaging and biosensor development. We present a microfluidic flow cytometer that rapidly assays 10^4-10^5 member cell-based fluorophore libraries, simultaneously measuring fluorescence lifetime and photobleaching. Together, these photophysical characteristics determine imaging performance. We demonstrate the ability to resolve the diverse photophysical characteristics of different library types and the ability to identify rare populations.
Affiliation(s)
- Kevin M Dean
  - BioFrontiers Institute, University of Colorado, Boulder, Colorado 80309, United States
  - Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, United States
- Lloyd M Davis
  - Department of Physics, University of Tennessee Knoxville, Knoxville, Tennessee 37996, United States
  - Center for Laser Applications, University of Tennessee Space Institute, Tullahoma, Tennessee 37388, United States
- Jennifer L Lubbeck
  - Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, United States
  - JILA, NIST, and University of Colorado, Boulder, Colorado 80309, United States
- Premashis Manna
  - Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, United States
  - JILA, NIST, and University of Colorado, Boulder, Colorado 80309, United States
- Pia Friis
  - Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, United States
  - JILA, NIST, and University of Colorado, Boulder, Colorado 80309, United States
- Amy E Palmer
  - BioFrontiers Institute, University of Colorado, Boulder, Colorado 80309, United States
  - Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, United States
- Ralph Jimenez
  - Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, United States
  - JILA, NIST, and University of Colorado, Boulder, Colorado 80309, United States
14
Verma D, Grigoryan G, Bailey-Kellogg C. Structure-based design of combinatorial mutagenesis libraries. Protein Sci 2015; 24:895-908. [PMID: 25611189 DOI: 10.1002/pro.2642] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 10/15/2014] [Revised: 12/14/2014] [Accepted: 01/11/2015] [Indexed: 01/27/2023]
Abstract
The development of protein variants with improved properties (thermostability, binding affinity, catalytic activity, etc.) has greatly benefited from the application of high-throughput screens evaluating large, diverse combinatorial libraries. At the same time, since only a very limited portion of sequence space can be experimentally constructed and tested, an attractive possibility is to use computational protein design to focus libraries on a productive portion of the space. We present a general-purpose method, called "Structure-based Optimization of Combinatorial Mutagenesis" (SOCoM), which can optimize arbitrarily large combinatorial mutagenesis libraries directly based on structural energies of their constituents. SOCoM chooses both positions and substitutions, employing a combinatorial optimization framework based on library-averaged energy potentials in order to avoid explicitly modeling every variant in every possible library. In case study applications to green fluorescent protein, β-lactamase, and lipase A, SOCoM optimizes relatively small, focused libraries whose variants achieve energies comparable to or better than previous library design efforts, as well as larger libraries (previously not designable by structure-based methods) whose variants cover greater diversity while still maintaining substantially better energies than would be achieved by representative random library approaches. By allowing the creation of large-scale combinatorial libraries based on structural calculations, SOCoM promises to increase the scope of applicability of computational protein design and improve the hit rate of discovering beneficial variants. While designs presented here focus on variant stability (predicted by total energy), SOCoM can readily incorporate other structure-based assessments, such as the energy gap between alternative conformational or bound states.
Affiliation(s)
- Deeptak Verma
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire
|
15
|
Jacobs TM, Yumerefendi H, Kuhlman B, Leaver-Fay A. SwiftLib: rapid degenerate-codon-library optimization through dynamic programming. Nucleic Acids Res 2014; 43:e34. [PMID: 25539925 PMCID: PMC4357694 DOI: 10.1093/nar/gku1323] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022]
Abstract
Degenerate codon (DC) libraries efficiently address the experimental library-size limitations of directed evolution by focusing diversity toward the positions and toward the amino acids (AAs) that are most likely to generate hits; however, manually constructing DC libraries is challenging, error prone and time consuming. This paper provides a dynamic programming solution to the task of finding the best DCs while keeping the size of the library beneath some given limit, improving on the existing integer-linear programming formulation. It then extends the algorithm to consider multiple DCs at each position, a heretofore unsolved problem, while adhering to a constraint on the number of primers needed to synthesize the library. In the two library-design problems examined here, the use of multiple DCs produces libraries that very nearly cover the set of desired AAs while still staying within the experimental size limits. Surprisingly, the algorithm is able to find near-perfect libraries where the ratio of amino-acid sequences to nucleic-acid sequences approaches 1; it effectively side-steps the degeneracy of the genetic code. Our algorithm is freely available through our web server and solves most design problems in about a second.
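The idea of choosing one degenerate codon per position under a library-size limit can be illustrated with a small dynamic program. The sketch below is a simplified illustration and not SwiftLib's actual algorithm or scoring: it assumes a single DC per position, a user-supplied candidate list, and a score that only counts desired amino acids left uncovered. The function names `encoded` and `choose_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}


def encoded(dc):
    """Return (number of codons, set of amino acids) for a degenerate codon."""
    codons = ["".join(p) for p in product(*(IUPAC[x] for x in dc))]
    return len(codons), {CODON_TABLE[c] for c in codons}


def choose_codons(desired, candidates, max_size):
    """Pick one candidate DC per position, minimising missed desired AAs.

    Assumes at least one stop-free assignment fits within max_size.
    """
    # best[size] = (missed AAs so far, DCs chosen so far) at that DNA-level size
    best = {1: (0, [])}
    for wanted in desired:
        nxt = {}
        for size, (missed, chosen) in best.items():
            for dc in candidates:
                n, aas = encoded(dc)
                new_size = size * n
                if "*" in aas or new_size > max_size:
                    continue  # skip stop codons and oversized libraries
                new_missed = missed + len(set(wanted) - aas)
                if new_size not in nxt or new_missed < nxt[new_size][0]:
                    nxt[new_size] = (new_missed, chosen + [dc])
        best = nxt
    size, (missed, chosen) = min(best.items(), key=lambda kv: (kv[1][0], kv[0]))
    return missed, size, chosen


# Toy problem: three positions, each with two desired amino acids.
desired = ["AV", "DE", "KR"]
candidates = ["GCT", "GYT", "GAT", "GAW", "AAA", "ARA"]
print(choose_codons(desired, candidates, max_size=8))
print(choose_codons(desired, candidates, max_size=4))
```

Keying the table on the exact library size works here because the remaining positions depend on earlier choices only through the size already spent. With a limit of 8 every desired residue is covered; tightening the limit to 4 forces one residue to be dropped.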
Affiliation(s)
- Timothy M Jacobs
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hayretin Yumerefendi
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Brian Kuhlman
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Andrew Leaver-Fay
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
16
|
Li Z, Yang Y, Faraggi E, Zhan J, Zhou Y. Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles. Proteins 2014; 82:2565-73. [PMID: 24898915 DOI: 10.1002/prot.24620] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 03/11/2014] [Revised: 05/28/2014] [Accepted: 05/30/2014] [Indexed: 12/13/2022]
Abstract
Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or a profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because it lacks tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7 percentage points (from 23.6% to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low-complexity regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at the protein surface. The accuracy of the sequence profiles obtained is comparable to that of profiles generated by the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org.
Affiliation(s)
- Zhixiu Li
- School of Informatics and Computing, Indiana University-Purdue University, Indianapolis, Indiana, 46202
|
17
|
Recent advances in engineering proteins for biocatalysis. Biotechnol Bioeng 2014; 111:1273-87. [DOI: 10.1002/bit.25240] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Received: 09/13/2013] [Revised: 02/10/2014] [Accepted: 03/19/2014] [Indexed: 01/14/2023]
|
18
|
Ai HW, Baird MA, Shen Y, Davidson MW, Campbell RE. Engineering and characterizing monomeric fluorescent proteins for live-cell imaging applications. Nat Protoc 2014; 9:910-28. [DOI: 10.1038/nprot.2014.054] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 12/18/2022]
|
19
|
Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 2014; 24:63-71. [DOI: 10.1016/j.sbi.2013.12.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/05/2013] [Revised: 11/25/2013] [Accepted: 12/03/2013] [Indexed: 12/23/2022]
|
20
|
Ollikainen N, Kortemme T. Computational protein design quantifies structural constraints on amino acid covariation. PLoS Comput Biol 2013; 9:e1003313. [PMID: 24244128 PMCID: PMC3828131 DOI: 10.1371/journal.pcbi.1003313] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 06/20/2013] [Accepted: 09/20/2013] [Indexed: 02/02/2023]
Abstract
Amino acid covariation, where the identities of amino acids at different sequence positions are correlated, is a hallmark of naturally occurring proteins. This covariation can arise from multiple factors, including selective pressures for maintaining protein structure, requirements imposed by a specific function, or from phylogenetic sampling bias. Here we employed flexible backbone computational protein design to quantify the extent to which protein structure has constrained amino acid covariation for 40 diverse protein domains. We find significant similarities between the amino acid covariation in alignments of natural protein sequences and sequences optimized for their structures by computational protein design methods. These results indicate that the structural constraints imposed by protein architecture play a dominant role in shaping amino acid covariation and that computational protein design methods can capture these effects. We also find that the similarity between natural and designed covariation is sensitive to the magnitude and mechanism of backbone flexibility used in computational protein design. Our results thus highlight the necessity of including backbone flexibility to correctly model precise details of correlated amino acid changes and give insights into the pressures underlying these correlations.
Proteins generally fold into specific three-dimensional structures to perform their cellular functions, and the presence of misfolded proteins is often deleterious for cellular and organismal fitness. For these reasons, maintenance of protein structure is thought to be one of the major fitness pressures acting on proteins. Consequently, the sequences of today's naturally occurring proteins contain signatures reflecting the constraints imposed by protein structure. Here we test the ability of computational protein design methods to recapitulate and explain these signatures. We focus on the physical basis of evolutionary pressures that act on interactions between amino acids in folded proteins, which are critical in determining protein structure and function. Such pressures can be observed from the appearance of amino acid covariation, where the amino acids at certain positions in protein sequences are correlated with each other. We find similar patterns of amino acid covariation in natural sequences and sequences optimized for their structures using computational protein design, demonstrating the importance of structural constraints in protein molecular evolution and providing insights into the structural mechanisms leading to covariation. In addition, these results characterize the ability of computational methods to model the precise details of correlated amino acid changes, which is critical for engineering new proteins with useful functions beyond those seen in nature.
Affiliation(s)
- Noah Ollikainen
- Graduate Program in Bioinformatics, University of California San Francisco, San Francisco, California, United States of America
- Tanja Kortemme
- Graduate Program in Bioinformatics, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Science, University of California San Francisco, San Francisco, California, United States of America
|
21
|
Lyskov S, Chou FC, Conchúir SÓ, Der BS, Drew K, Kuroda D, Xu J, Weitzner BD, Renfrew PD, Sripakdeevong P, Borgo B, Havranek JJ, Kuhlman B, Kortemme T, Bonneau R, Gray JJ, Das R. Serverification of molecular modeling applications: the Rosetta Online Server that Includes Everyone (ROSIE). PLoS One 2013; 8:e63906. [PMID: 23717507 PMCID: PMC3661552 DOI: 10.1371/journal.pone.0063906] [Citation(s) in RCA: 275] [Impact Index Per Article: 25.0] [Received: 01/28/2013] [Accepted: 04/04/2013] [Indexed: 11/21/2022]
Abstract
The Rosetta molecular modeling software package provides experimentally tested and rapidly evolving tools for the 3D structure prediction and high-resolution design of proteins, nucleic acids, and a growing number of non-natural polymers. Despite its free availability to academic users and improving documentation, use of Rosetta has largely remained confined to developers and their immediate collaborators due to the code's difficulty of use, the requirement for large computational resources, and the unavailability of servers for most of the Rosetta applications. Here, we present a unified web framework for Rosetta applications called ROSIE (Rosetta Online Server that Includes Everyone). ROSIE provides (a) a common user interface for Rosetta protocols, (b) a stable application programming interface for developers to add additional protocols, (c) a flexible back-end to allow leveraging of computer cluster resources shared by RosettaCommons member institutions, and (d) centralized administration by the RosettaCommons to ensure continuous maintenance. This paper describes the ROSIE server infrastructure, a step-by-step 'serverification' protocol for use by Rosetta developers, and the deployment of the first nine ROSIE applications by six separate developer teams: Docking, RNA de novo, ERRASER, Antibody, Sequence Tolerance, Supercharge, Beta peptide design, NCBB design, and VIP redesign. As illustrated by the number and diversity of these applications, ROSIE offers a general and speedy paradigm for serverification of Rosetta applications that incurs negligible cost to developers and lowers barriers to Rosetta use for the broader biological community. ROSIE is available at http://rosie.rosettacommons.org.
Affiliation(s)
- Sergey Lyskov
- Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Fang-Chieh Chou
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California, United States of America
- Shane Ó. Conchúir
- California Institute for Quantitative Biomedical Research, University of California San Francisco, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- Bryan S. Der
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Kevin Drew
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Daisuke Kuroda
- Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Jianqing Xu
- Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Brian D. Weitzner
- Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, Maryland, United States of America
- P. Douglas Renfrew
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Parin Sripakdeevong
- Biophysics Program, Stanford University, Stanford, California, United States of America
- Benjamin Borgo
- Department of Genetics, Washington University in St. Louis, St. Louis, Missouri, United States of America
- James J. Havranek
- Department of Genetics, Washington University in St. Louis, St. Louis, Missouri, United States of America
- Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Tanja Kortemme
- California Institute for Quantitative Biomedical Research, University of California San Francisco, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- Graduate Group in Biophysics, University of California San Francisco, San Francisco, California, United States of America
- Richard Bonneau
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, New York, United States of America
- Jeffrey J. Gray
- Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California, United States of America
- Department of Physics, Stanford University, Stanford, California, United States of America
|
22
|
Kille S, Acevedo-Rocha CG, Parra LP, Zhang ZG, Opperman DJ, Reetz MT, Acevedo JP. Reducing codon redundancy and screening effort of combinatorial protein libraries created by saturation mutagenesis. ACS Synth Biol 2013; 2:83-92. [PMID: 23656371 DOI: 10.1021/sb300037w] [Citation(s) in RCA: 202] [Impact Index Per Article: 18.4] [Indexed: 01/27/2023]
Abstract
Saturation mutagenesis probes define sections of the vast protein sequence space. However, even if randomization is limited this way, the combinatorial numbers problem is severe. Because diversity is created at the codon level, codon redundancy is a crucial factor determining the necessary effort for library screening. Additionally, due to the probabilistic nature of the sampling process, oversampling is required to ensure library completeness as well as a high probability to encounter all unique variants. Our trick employs a special mixture of three primers, creating a degeneracy of 22 unique codons coding for the 20 canonical amino acids. Therefore, codon redundancy and subsequent screening effort is significantly reduced, and a balanced distribution of codon per amino acid is achieved, as demonstrated exemplarily for a library of cyclohexanone monooxygenase. We show that this strategy is suitable for any saturation mutagenesis methodology to generate less-redundant libraries.
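The codon arithmetic in this abstract can be checked with a short script. The abstract does not name the three primers. The sketch below assumes the degenerate codons NDT, VHG and TGG, the set commonly associated with this "22c-trick", and compares them with the conventional NNK codon. The oversampling estimate treats all codons as equally likely, which is a simplifying assumption, and the helper names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}


def summarize(mixture):
    """Return (codon count, codons per amino acid) for a mix of degenerate codons."""
    codons = ["".join(p) for dc in mixture
              for p in product(*(IUPAC[x] for x in dc))]
    return len(codons), Counter(CODON_TABLE[c] for c in codons)


def clones_for_coverage(n_variants, p=0.95):
    """Clones to sample so a given variant appears at least once with probability p."""
    return math.ceil(math.log(1 - p) / math.log(1 - 1 / n_variants))


for name, mix in [("NDT+VHG+TGG (assumed 22c-trick)", ["NDT", "VHG", "TGG"]),
                  ("NNK", ["NNK"])]:
    n, per_aa = summarize(mix)
    n_amino_acids = len(per_aa) - ("*" in per_aa)  # do not count stop as an amino acid
    print(name, "| codons:", n, "| amino acids:", n_amino_acids,
          "| stop codons:", per_aa.get("*", 0),
          "| clones for 95%:", clones_for_coverage(n))
```

Under these assumptions the three-codon mixture yields 22 codons for 20 amino acids with no stop codon, whereas NNK yields 32 codons including one stop, so fewer clones are needed per randomized position.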
Affiliation(s)
- Sabrina Kille
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Straße, 35043 Marburg, Germany
- Carlos G. Acevedo-Rocha
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Straße, 35043 Marburg, Germany
- Loreto P. Parra
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Straße, 35043 Marburg, Germany
- Zhi-Gang Zhang
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Straße, 35043 Marburg, Germany
- Diederik J. Opperman
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Manfred T. Reetz
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Straße, 35043 Marburg, Germany
- Juan Pablo Acevedo
- Facultad de Medicina y Facultad de Ingeniería de la Universidad de los Andes, Santiago, Chile
|
23
|
Smith C, Shi C, Chroust M, Bliska T, Kelly M, Jacobson M, Kortemme T. Design of a Phosphorylatable PDZ Domain with Peptide-Specific Affinity Changes. Structure 2013; 21:54-64. [DOI: 10.1016/j.str.2012.10.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 07/18/2012] [Revised: 10/13/2012] [Accepted: 10/18/2012] [Indexed: 01/06/2023]
|
24
|
Chen TS, Palacios H, Keating AE. Structure-based redesign of the binding specificity of anti-apoptotic Bcl-x(L). J Mol Biol 2012; 425:171-85. [PMID: 23154169 DOI: 10.1016/j.jmb.2012.11.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 08/28/2012] [Revised: 11/05/2012] [Accepted: 11/06/2012] [Indexed: 12/29/2022]
Abstract
Many native proteins are multi-specific and interact with numerous partners, which can confound analysis of their functions. Protein design provides a potential route to generating synthetic variants of native proteins with more selective binding profiles. Redesigned proteins could be used as research tools, diagnostics or therapeutics. In this work, we used a library screening approach to reengineer the multi-specific anti-apoptotic protein Bcl-x(L) to remove its interactions with many of its binding partners, making it a high-affinity and selective binder of the BH3 region of pro-apoptotic protein Bad. To overcome the enormity of the potential Bcl-x(L) sequence space, we developed and applied a computational/experimental framework that used protein structure information to generate focused combinatorial libraries. Sequence features were identified using structure-based modeling, and an optimization algorithm based on integer programming was used to select degenerate codons that maximally covered these features. A constraint on library size was used to ensure thorough sampling. Using yeast surface display to screen a designed library of Bcl-x(L) variants, we successfully identified a protein with ~1000-fold improvement in binding specificity for the BH3 region of Bad over the BH3 region of Bim. Although negative design was targeted only against the BH3 region of Bim, the best redesigned protein was globally specific against binding to 10 other peptides corresponding to native BH3 motifs. Our design framework demonstrates an efficient route to highly specific protein binders and may readily be adapted for application to other design problems.
Affiliation(s)
- T Scott Chen
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
25
|
Steiner K, Schwab H. Recent advances in rational approaches for enzyme engineering. Comput Struct Biotechnol J 2012; 2:e201209010. [PMID: 24688651 PMCID: PMC3962183 DOI: 10.5936/csbj.201209010] [Citation(s) in RCA: 100] [Impact Index Per Article: 8.3] [Received: 08/01/2012] [Revised: 10/16/2012] [Accepted: 10/18/2012] [Indexed: 11/29/2022]
Abstract
Enzymes are an attractive alternative in the asymmetric syntheses of chiral building blocks. To meet the requirements of industrial biotechnology and to introduce new functionalities, the enzymes need to be optimized by protein engineering. This article specifically reviews rational approaches for enzyme engineering and de novo enzyme design involving structure-based approaches developed in recent years for improvement of the enzymes’ performance, broadened substrate range, and creation of novel functionalities to obtain products with high added value for industrial applications.
Affiliation(s)
- Kerstin Steiner
- ACIB GmbH (Austrian Centre of Industrial Biotechnology), c/o TU Graz, 8010 Graz, Austria
- Helmut Schwab
- ACIB GmbH (Austrian Centre of Industrial Biotechnology), c/o TU Graz, 8010 Graz, Austria
- Institute of Molecular Biotechnology, TU Graz, 8010 Graz, Austria
|
26
|
Goldsmith M, Tawfik DS. Directed enzyme evolution: beyond the low-hanging fruit. Curr Opin Struct Biol 2012; 22:406-12. [DOI: 10.1016/j.sbi.2012.03.010] [Citation(s) in RCA: 148] [Impact Index Per Article: 12.3] [Received: 02/21/2012] [Revised: 03/14/2012] [Accepted: 03/14/2012] [Indexed: 12/26/2022]
|
27
|
Chen TS, Keating AE. Designing specific protein-protein interactions using computation, experimental library screening, or integrated methods. Protein Sci 2012; 21:949-63. [PMID: 22593041 DOI: 10.1002/pro.2096] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 05/01/2012] [Accepted: 05/11/2012] [Indexed: 11/11/2022]
Abstract
Given the importance of protein-protein interactions for nearly all biological processes, the design of protein affinity reagents for use in research, diagnosis or therapy is an important endeavor. Engineered proteins would ideally have high specificities for their intended targets, but achieving interaction specificity by design can be challenging. There are two major approaches to protein design or redesign. Most commonly, proteins and peptides are engineered using experimental library screening and/or in vitro evolution. An alternative approach involves using protein structure and computational modeling to rationally choose sequences predicted to have desirable properties. Computational design has successfully produced novel proteins with enhanced stability, desired interactions and enzymatic function. Here we review the strengths and limitations of experimental library screening and computational structure-based design, giving examples where these methods have been applied to designing protein interaction specificity. We highlight recent studies that demonstrate strategies for combining computational modeling with library screening. The computational methods provide focused libraries predicted to be enriched in sequences with the properties of interest. Such integrated approaches represent a promising way to increase the efficiency of protein design and to engineer complex functionality such as interaction specificity.
Affiliation(s)
- T Scott Chen
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
28
|
Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 2012; 422:124-44. [PMID: 22617328 DOI: 10.1016/j.jmb.2012.05.022] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 03/12/2012] [Revised: 05/10/2012] [Accepted: 05/13/2012] [Indexed: 11/23/2022]
Abstract
Proteins of the Bcl-2 family either enhance or suppress programmed cell death and are centrally involved in cancer development and resistance to chemotherapy. BH3 (Bcl-2 homology 3)-only Bcl-2 proteins promote cell death by docking an α-helix into a hydrophobic groove on the surface of one or more of five pro-survival Bcl-2 receptor proteins. There is high structural homology within the pro-death and pro-survival families, yet a high degree of interaction specificity is nevertheless encoded, posing an interesting and important molecular recognition problem. Understanding protein features that dictate Bcl-2 interaction specificity is critical for designing peptide-based cancer therapeutics and diagnostics. In this study, we present peptide SPOT arrays and deep sequencing data from yeast display screening experiments that significantly expand the BH3 sequence space that has been experimentally tested for interaction with five human anti-apoptotic receptors. These data provide rich information about the determinants of Bcl-2 family specificity. To interpret and use the information, we constructed two simple data-based models that can predict affinity and specificity when evaluated on independent data sets within a limited sequence space. We also constructed a novel structure-based statistical potential, called STATIUM, which is remarkably good at predicting Bcl-2 affinity and specificity, especially considering it is not trained on experimental data. We compare the performance of our three models to each other and to alternative structure-based methods and discuss how such tools can guide prediction and design of new Bcl-2 family complexes.
|
29
|
Chen MMY, Snow CD, Vizcarra CL, Mayo SL, Arnold FH. Comparison of random mutagenesis and semi-rational designed libraries for improved cytochrome P450 BM3-catalyzed hydroxylation of small alkanes. Protein Eng Des Sel 2012; 25:171-8. [DOI: 10.1093/protein/gzs004] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 01/23/2023]
|
30
|
Sandström AG, Wikmark Y, Engström K, Nyhlén J, Bäckvall JE. Combinatorial reshaping of the Candida antarctica lipase A substrate pocket for enantioselectivity using an extremely condensed library. Proc Natl Acad Sci U S A 2012; 109:78-83. [PMID: 22178758 PMCID: PMC3252943 DOI: 10.1073/pnas.1111537108] [Citation(s) in RCA: 99] [Impact Index Per Article: 8.3] [Indexed: 11/18/2022]
Abstract
A highly combinatorial structure-based protein engineering method for obtaining enantioselectivity is reported that results in a thorough modification of the substrate binding pocket of Candida antarctica lipase A (CALA). Nine amino acid residues surrounding the entire pocket were simultaneously mutated, contributing to a reshaping of the substrate pocket to give increased enantioselectivity and activity for a sterically demanding substrate. This approach seems to be powerful for developing enantioselectivity when a complete reshaping of the active site is required. Screening toward ibuprofen ester 1, a substrate for which previously used methods had failed, gave variants with a significantly increased enantioselectivity and activity. Wild-type CALA has a moderate activity with an E value of only 3.4 toward this substrate. The best variant had an E value of 100 and it also displayed a high activity. The variation at each mutated position was highly reduced, comprising only the wild type and an alternative residue, preferably a smaller one with similar properties. These minimal binary variations allow for an extremely condensed protein library. With this highly combinatorial method synergistic effects are accounted for and the protein fitness landscape is explored efficiently.
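The size of such a binary library follows directly from the design: nine positions, each either wild type or one alternative, give 2^9 = 512 variants. The sketch below enumerates them. The positions and residues in `SITES` are hypothetical placeholders, since the abstract does not list the actual CALA positions.

```python
from collections import Counter
from itertools import product
from math import comb

# Hypothetical (position, wild-type residue, alternative residue) triples.
# These are illustrative placeholders, not the CALA sites from the study.
SITES = [(10, "F", "A"), (25, "L", "V"), (40, "I", "V"),
         (58, "Y", "F"), (73, "T", "S"), (91, "F", "L"),
         (110, "L", "A"), (132, "V", "G"), (150, "I", "A")]


def binary_library(sites):
    """Yield each variant as a tuple of 'F10A'-style substitution labels."""
    for picks in product((False, True), repeat=len(sites)):
        yield tuple(f"{wt}{pos}{alt}"
                    for (pos, wt, alt), mutated in zip(sites, picks) if mutated)


library = list(binary_library(SITES))
print(len(library))  # 2**9 = 512 variants, including the wild type

# Number of variants carrying k substitutions follows the binomial coefficients.
by_order = Counter(len(variant) for variant in library)
print([by_order[k] for k in range(len(SITES) + 1)])
print([comb(len(SITES), k) for k in range(len(SITES) + 1)])
```

For comparison, randomizing the same nine positions to all 20 amino acids would give 20^9 (about 5 × 10^11) protein variants, which is why restricting each position to two residues makes full combinatorial coverage practical.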
Affiliation(s)
- Anders G. Sandström
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- Ylva Wikmark
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- Karin Engström
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- Jonas Nyhlén
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- Jan-E. Bäckvall
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
|
31
|
Zimmer M. What does it take to improve existing fluorescent proteins for in vivo imaging applications? Methods Mol Biol 2012; 872:235-241. [PMID: 22700415 DOI: 10.1007/978-1-61779-797-2_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Although fluorescent proteins are ubiquitously used as genetic tracers and imaging agents, there is significant room for improvement. This chapter discusses how new improved fluorescent proteins can be designed. It focuses on the design of far-red and infrared fluorescent proteins, since the currently-available red fluorescent proteins are not optimal for in vivo applications.
Affiliation(s)
- Marc Zimmer
- Hale Laboratory, Connecticut College, New London, CT, USA.
|
32
|
Parker AS, Griswold KE, Bailey-Kellogg C. Optimization of combinatorial mutagenesis. J Comput Biol 2011; 18:1743-56. [PMID: 21923411 DOI: 10.1089/cmb.2011.0152] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
Protein engineering by combinatorial site-directed mutagenesis evaluates a portion of the sequence space near a target protein, seeking variants with improved properties (e.g., stability, activity, immunogenicity). In order to improve the hit-rate of beneficial variants in such mutagenesis libraries, we develop methods to select optimal positions and corresponding sets of the mutations that will be used, in all combinations, in constructing a library for experimental evaluation. Our approach, OCoM (Optimization of Combinatorial Mutagenesis), encompasses both degenerate oligonucleotides and specified point mutations, and can be directed accordingly by requirements of experimental cost and library size. It evaluates the quality of the resulting library by one- and two-body sequence potentials, averaged over the variants. To ensure that it is not simply recapitulating extant sequences, it balances the quality of a library with an explicit evaluation of the novelty of its members. We show that, despite dealing with a combinatorial set of variants, in our approach the resulting library optimization problem is actually isomorphic to single-variant optimization. By the same token, this means that the two-body sequence potential results in an NP-hard optimization problem. We present an efficient dynamic programming algorithm for the one-body case and a practically-efficient integer programming approach for the general two-body case. We demonstrate the effectiveness of our approach in designing libraries for three different case study proteins targeted by previous combinatorial libraries: a green fluorescent protein, a cytochrome P450, and a beta lactamase. We found that OCoM worked quite efficiently in practice, requiring only 1 hour even for the massive design problem of selecting 18 mutations to generate 10⁷ variants of a 443-residue P450. We demonstrate the general ability of OCoM in enabling the protein engineer to explore and evaluate trade-offs between quality and novelty as well as library construction technique, and identify optimal libraries for experimental evaluation.
Affiliation(s)
- Andrew S Parker
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
|
33
|
Huang PS, Ban YEA, Richter F, Andre I, Vernon R, Schief WR, Baker D. RosettaRemodel: a generalized framework for flexible backbone protein design. PLoS One 2011; 6:e24109. [PMID: 21909381 PMCID: PMC3166072 DOI: 10.1371/journal.pone.0024109] [Citation(s) in RCA: 241] [Impact Index Per Article: 18.5] [Received: 05/16/2011] [Accepted: 07/29/2011] [Indexed: 12/12/2022]
Abstract
We describe RosettaRemodel, a generalized framework for flexible protein design that provides a versatile and convenient interface to the Rosetta modeling suite. RosettaRemodel employs a unified interface, called a blueprint, which allows detailed control over many aspects of flexible backbone protein design calculations. RosettaRemodel allows the construction and elaboration of customized protocols for a wide range of design problems ranging from loop insertion and deletion, disulfide engineering, domain assembly, loop remodeling, motif grafting, symmetrical units, to de novo structure modeling.
Affiliation(s)
- Po-Ssu Huang
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Yih-En Andrew Ban
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Florian Richter
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Interdisciplinary Program in Biomolecular Structure and Design, University of Washington, Seattle, Washington, United States of America
- Ingemar Andre
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
- Robert Vernon
- Program in Molecular Structure and Function, Hospital for Sick Children, Toronto, Ontario, Canada
- William R. Schief
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- * E-mail: (WRS); (DB)
- David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- * E-mail: (WRS); (DB)
|
34
|
Strategy and success for the directed evolution of enzymes. Curr Opin Struct Biol 2011; 21:473-80. [DOI: 10.1016/j.sbi.2011.05.003] [Citation(s) in RCA: 139] [Impact Index Per Article: 10.7] [Received: 04/20/2011] [Accepted: 05/25/2011] [Indexed: 11/20/2022]
|
35
|
Smith CA, Kortemme T. Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design. PLoS One 2011; 6:e20451. [PMID: 21789164 PMCID: PMC3138746 DOI: 10.1371/journal.pone.0020451] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Received: 02/15/2011] [Accepted: 04/20/2011] [Indexed: 11/18/2022]
Abstract
Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.
Affiliation(s)
- Colin A. Smith
- Graduate Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- Tanja Kortemme
- Graduate Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
|
36
|
Samish I, MacDermaid CM, Perez-Aguilar JM, Saven JG. Theoretical and Computational Protein Design. Annu Rev Phys Chem 2011; 62:129-49. [DOI: 10.1146/annurev-physchem-032210-103509] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.2] [Indexed: 11/09/2022]
Affiliation(s)
- Jeffery G. Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104
|
37
|
Babor M, Mandell DJ, Kortemme T. Assessment of flexible backbone protein design methods for sequence library prediction in the therapeutic antibody Herceptin-HER2 interface. Protein Sci 2011; 20:1082-9. [PMID: 21465611 DOI: 10.1002/pro.632] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 12/30/2010] [Revised: 03/15/2011] [Accepted: 03/16/2011] [Indexed: 01/28/2023]
Abstract
Computational protein design methods can complement experimental screening and selection techniques by predicting libraries of low-energy sequences compatible with a desired structure and function. Incorporating backbone flexibility in computational design allows conformational adjustments that should broaden the range of predicted low-energy sequences. Here, we evaluate computational predictions of sequence libraries from different protocols for modeling backbone flexibility using the complex between the therapeutic antibody Herceptin and its target human epidermal growth factor receptor 2 (HER2) as a model system. Within the program RosettaDesign, three methods are compared: The first two use ensembles of structures generated by Monte Carlo protocols for near-native conformational sampling: kinematic closure (KIC) and backrub, and the third method uses snapshots from molecular dynamics (MD) simulations. KIC or backrub methods were better able to identify the amino acid residues experimentally observed by phage display in the Herceptin-HER2 interface than MD snapshots, which generated much larger conformational and sequence diversity. KIC and backrub, as well as fixed backbone simulations, captured the key mutation Asp98Trp in Herceptin, which leads to a further threefold affinity improvement of the already subnanomolar parental Herceptin-HER2 interface. Modeling subtle backbone conformational changes may assist in the design of sequence libraries for improving the affinity of antibody-antigen interfaces and could be suitable for other protein complexes for which structural information is available.
Affiliation(s)
- Mariana Babor
- California Institute for Quantitative Biomedical Research, University of California, San Francisco, San Francisco, California 94158-2330, USA
|
38
|
Saven JG. Computational protein design: engineering molecular diversity, nonnatural enzymes, nonbiological cofactor complexes, and membrane proteins. Curr Opin Chem Biol 2011; 15:452-7. [PMID: 21493122 DOI: 10.1016/j.cbpa.2011.03.014] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Received: 03/15/2011] [Revised: 03/18/2011] [Accepted: 03/18/2011] [Indexed: 11/18/2022]
Abstract
Computational and theoretical methods are advancing protein design as a means to create and investigate proteins. Such efforts further our capacity to control, design and understand biomolecular structure, sequence and function. Herein, the focus is on some recent applications that involve using theoretical and computational methods to guide the design of protein sequence ensembles, new enzymes, proteins with novel cofactors, and membrane proteins.
Affiliation(s)
- Jeffery G Saven
- Department of Chemistry, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104, USA
|
39
|
Lippow SM, Moon TS, Basu S, Yoon SH, Li X, Chapman BA, Robison K, Lipovšek D, Prather KLJ. Engineering enzyme specificity using computational design of a defined-sequence library. Chem Biol 2011; 17:1306-15. [PMID: 21168766 DOI: 10.1016/j.chembiol.2010.10.012] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 03/21/2010] [Revised: 10/19/2010] [Accepted: 10/19/2010] [Indexed: 02/07/2023]
Abstract
Engineered biosynthetic pathways have the potential to produce high-value molecules from inexpensive feedstocks, but a key limitation is engineering enzymes with high activity and specificity for new reactions. Here, we developed a method for combining structure-based computational protein design with library-based enzyme screening, in which inter-residue correlations favored by the design are encoded into a defined-sequence library. We validated this approach by engineering a glucose 6-oxidase enzyme for use in a proposed pathway to convert D-glucose into D-glucaric acid. The most active variant, identified after only one round of diversification and screening of only 10,000 wells, is approximately 400-fold more active on glucose than is the wild-type enzyme. We anticipate that this strategy will be broadly applicable to the discovery of new enzymes for engineered biological pathways.
Affiliation(s)
- Shaun M Lippow
- Codon Devices, Inc., 99 Erie Street, Cambridge, MA 02139, USA.
|
40
|
Lewis JC, Mantovani SM, Fu Y, Snow CD, Komor RS, Wong CH, Arnold FH. Combinatorial alanine substitution enables rapid optimization of cytochrome P450BM3 for selective hydroxylation of large substrates. Chembiochem 2011; 11:2502-5. [PMID: 21108271 DOI: 10.1002/cbic.201000565] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.1] [Indexed: 11/06/2022]
Affiliation(s)
- Jared C Lewis
- Department of Chemistry, University of Chicago, Chicago, IL 60637, USA
|
41
|
|
42
|
MacDonald JT, Barnes C, Kitney RI, Freemont PS, Stan GBV. Computational design approaches and tools for synthetic biology. Integr Biol (Camb) 2011; 3:97-108. [DOI: 10.1039/c0ib00077a] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.2] [Indexed: 12/12/2022]
|
43
|
Experimental library screening demonstrates the successful application of computational protein design to large structural ensembles. Proc Natl Acad Sci U S A 2010; 107:19838-43. [PMID: 21045132 DOI: 10.1073/pnas.1012985107] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Indexed: 11/18/2022]
Abstract
The stability, activity, and solubility of a protein sequence are determined by a delicate balance of molecular interactions in a variety of conformational states. Even so, most computational protein design methods model sequences in the context of a single native conformation. Simulations that model the native state as an ensemble have been mostly neglected due to the lack of sufficiently powerful optimization algorithms for multistate design. Here, we have applied our multistate design algorithm to study the potential utility of various forms of input structural data for design. To facilitate a more thorough analysis, we developed new methods for the design and high-throughput stability determination of combinatorial mutation libraries based on protein design calculations. The application of these methods to the core design of a small model system produced many variants with improved thermodynamic stability and showed that multistate design methods can be readily applied to large structural ensembles. We found that exhaustive screening of our designed libraries helped to clarify several sources of simulation error that would have otherwise been difficult to ascertain. Interestingly, the lack of correlation between our simulated and experimentally measured stability values shows clearly that a design procedure need not reproduce experimental data exactly to achieve success. This surprising result suggests potentially fruitful directions for the improvement of computational protein design technology.
|
44
|
Engineering a protein-protein interface using a computationally designed library. Proc Natl Acad Sci U S A 2010; 107:19296-301. [PMID: 20974935 DOI: 10.1073/pnas.1006528107] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 11/18/2022]
Abstract
Computational algorithms for protein design can sample large regions of sequence space, but suffer from undersampling of conformational space and energy function inaccuracies. Experimental screening of combinatorial protein libraries avoids the need for accurate energy functions, but has limited access to vast amounts of sequence space. Here, we test if these two traditionally alternative, but potentially complementary approaches can be combined to design a variant of the ubiquitin-ligase E6AP that will bind to a nonnatural partner, the NEDD8-conjugating enzyme Ubc12. Three E6AP libraries were constructed: (i) a naive library in which all 20 amino acids were allowed at every position on the target-binding surface of E6AP (13 positions), (ii) a semidirected library that varied the same residue positions as in the naive library but disallowed mutations computationally predicted to destabilize E6AP, and (iii) a directed library that used docking and sequence optimization simulations to identify mutations predicted to be favorable for binding Ubc12. Both of the directed libraries showed >30-fold enrichment over the naive library after the first round of screening with a split-dihydrofolate reductase complementation assay and produced multiple tight binders (Kd < 100 nM) after four rounds of selection. Four rounds of selection with the naive library failed to produce any binders with Kd values lower than 50 μM. These results indicate that protein design simulations can be used to create directed libraries that are enriched in tight binders and that in some cases it is sufficient to computationally screen for well-folded sequences without explicit binding calculations.
|
45
|
Lassila JK. Conformational diversity and computational enzyme design. Curr Opin Chem Biol 2010; 14:676-82. [PMID: 20829099 PMCID: PMC2953567 DOI: 10.1016/j.cbpa.2010.08.010] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 06/04/2010] [Revised: 08/06/2010] [Accepted: 08/06/2010] [Indexed: 11/22/2022]
Abstract
The application of computational protein design methods to the design of enzyme active sites offers potential routes to new catalysts and new reaction specificities. Computational design methods have typically treated the protein backbone as a rigid structure for the sake of computational tractability. However, this fixed-backbone approximation introduces its own special challenges for enzyme design and it contrasts with an emerging picture of natural enzymes as dynamic ensembles with multiple conformations and motions throughout a reaction cycle. This review considers the impact of conformational variation and dynamics on computational enzyme design and it highlights new approaches to addressing protein conformational diversity in enzyme design including recent advances in multi-state design, backbone flexibility, and computational library design.
Affiliation(s)
- Jonathan K Lassila
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA.
|
46
|
Barakat NH, Barakat NH, Love JJ. Combined use of experimental and computational screens to characterize protein stability. Protein Eng Des Sel 2010; 23:799-807. [PMID: 20805093 DOI: 10.1093/protein/gzq052] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
One of the primary goals of protein design is to engineer proteins with improved stability. Protein stability is a key issue for chemical, biotechnology and pharmaceutical industries. The development of robust proteins/enzymes with the ability to withstand the potentially harsh conditions of industrial operations is of high importance. A number of strategies are currently being employed to achieve this goal. Two particular approaches, (i) directed evolution and (ii) computational protein design, are quite powerful yet have only recently been combined or applied and analyzed in parallel. In directed evolution, libraries of variants are searched experimentally for clones possessing the desired properties. With computational methods, protein design algorithms are utilized to perform in silico screening for stable protein sequences. Here, we used gene libraries of an unstable variant of streptococcal protein G (Gbeta1) and an in vivo screening method to identify stabilized variants. Many variants with notably increased thermal stabilities were isolated and characterized. Concomitantly, computational techniques and protein design algorithms were used to perform in silico screening of the same destabilized variant of Gbeta1. The combined use, and critical analysis, of these methods promises to advance the field of protein design.
Affiliation(s)
- Nora H Barakat
- Department of Chemistry and Biochemistry, San Diego State University, 5500 Campanile Dr, San Diego, CA 92182-1030, USA
|
47
|
Smith CA, Kortemme T. Structure-based prediction of the peptide sequence space recognized by natural and synthetic PDZ domains. J Mol Biol 2010; 402:460-74. [PMID: 20654621 DOI: 10.1016/j.jmb.2010.07.032] [Citation(s) in RCA: 86] [Impact Index Per Article: 6.1] [Received: 05/19/2010] [Accepted: 07/07/2010] [Indexed: 11/27/2022]
Abstract
Protein-protein recognition, frequently mediated by members of large families of interaction domains, is one of the cornerstones of biological function. Here, we present a computational, structure-based method to predict the sequence space of peptides recognized by PDZ domains, one of the largest families of recognition proteins. As a test set, we use a considerable amount of recent phage display data that describe the peptide recognition preferences for 169 naturally occurring and engineered PDZ domains. For both wild-type PDZ domains and single point mutants, we find that 70-80% of the most frequently observed amino acids by phage display are predicted within the top five ranked amino acids. Phage display frequently identified recognition preferences for amino acids different from those present in the original crystal structure. Notably, in about half of these cases, our algorithm correctly captures these preferences, indicating that it can predict mutations that increase binding affinity relative to the starting structure. We also find that we can computationally recapitulate specificity changes upon mutation, a key test for successful forward design of protein-protein interface specificity. Across all evaluated data sets, we find that incorporating backbone sampling improves accuracy substantially, irrespective of using a crystal or NMR structure as the starting conformation. Finally, we report successful prediction of several amino acid specificity changes from blind tests in the DREAM4 peptide recognition domain specificity prediction challenge. Because the foundational methods developed here are structure based, these results suggest that the approach can be more generally applied to specificity prediction and redesign of other protein-protein interfaces that have structural information but lack phage display data.
Affiliation(s)
- Colin A Smith
- Graduate Program in Biological and Medical Informatics, University of California San Francisco, 600 16th Street, MC 2240, San Francisco, CA 94158, USA
|
48
|
Friedland GD, Kortemme T. Designing ensembles in conformational and sequence space to characterize and engineer proteins. Curr Opin Struct Biol 2010; 20:377-84. [DOI: 10.1016/j.sbi.2010.02.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 02/10/2010] [Accepted: 02/19/2010] [Indexed: 11/16/2022]
|
49
|
Butler MC, Itotia PN, Sullivan JM. A high-throughput biophotonics instrument to screen for novel ocular photosensitizing therapeutic agents. Invest Ophthalmol Vis Sci 2010; 51:2705-20. [PMID: 19834043 PMCID: PMC2868480 DOI: 10.1167/iovs.08-2862] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 09/11/2008] [Revised: 12/15/2008] [Accepted: 03/04/2010] [Indexed: 11/24/2022]
Abstract
PURPOSE: High-throughput techniques are needed to identify and optimize novel photodynamic therapy (PDT) agents with greater efficacy and lower toxicity. Novel agents with the capacity to completely ablate pathologic angiogenesis could be of substantial utility in diseases such as wet age-related macular degeneration (AMD). METHODS: An instrument and approach were developed based on light-emitting diode (LED) technology for high-throughput screening (HTS) of libraries of potential chemical and biological photosensitizing agents. Ninety-six-well LED arrays were generated at multiple wavelengths and under rigorous intensity control. Cell toxicity was measured in 96-well culture arrays with the nuclear dye SYTOX Green (Invitrogen-Molecular Probes, Eugene, OR). RESULTS: Rapid screening of photoactivatable chemicals or biological molecules has been realized in 96-well arrays of cultured human cells. This instrument can be used to identify new PDT agents that exert cell toxicity on presentation of light of the appropriate energy. The system is further demonstrated through determination of the dose dependence of model compounds having or lacking cellular phototoxicity. Killer Red (KR), a genetically encoded red fluorescent protein expressed from transfected plasmids, is examined as a potential cellular photosensitizing agent and offers unique opportunities as a cell-type-specific phototoxic protein. CONCLUSIONS: This instrument has the capacity to screen large chemical or biological libraries for rapid identification and optimization of potential novel phototoxic lead candidates. KR and its derivatives have unique potential in ocular gene therapy for pathologic angiogenesis or tumors.
Affiliation(s)
- Jack M. Sullivan
- Departments of Ophthalmology, Pharmacology and Toxicology, and Physiology and Biophysics, the Neuroscience Program, and the Ira G. Ross Eye Institute, SUNY University at Buffalo, Buffalo, New York; and the Veterans Administration Western New York Healthcare System, Buffalo, New York
|
50
|
Abstract
In recent years, there have been significant advances in the field of computational protein design including the successful computational design of enzymes based on backbone scaffolds from experimentally solved structures. It is likely that large-scale sampling of protein backbone conformations will become necessary as further progress is made on more complicated systems. Removing the constraint of having to use scaffolds based on known protein backbones is a potential method of solving the problem. With this application in mind, we describe a method to systematically construct a large number of de novo backbone structures from idealized topological forms in a top–down hierarchical approach. The structural properties of these novel backbone scaffolds were analyzed and compared with a set of high-resolution experimental structures from the protein data bank (PDB). It was found that the Ramachandran plot distribution and relative γ- and β-turn frequencies were similar to those found in the PDB. The de novo scaffolds were sequence designed with RosettaDesign, and the energy distributions and amino acid compositions were comparable with the results for redesigned experimentally solved backbones. Proteins 2010. © 2009 Wiley-Liss, Inc.
Affiliation(s)
- James T MacDonald
- Division of Mathematical Biology, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA
|