51
Trudeau DL, Kaltenbach M, Tawfik DS. On the Potential Origins of the High Stability of Reconstructed Ancestral Proteins. Mol Biol Evol 2016; 33:2633-41. [PMID: 27413048 DOI: 10.1093/molbev/msw138] [Citation(s) in RCA: 82] [Impact Index Per Article: 10.3] [Indexed: 12/13/2022] Open
Abstract
Ancestral reconstruction provides instrumental insights regarding the biochemical and biophysical characteristics of past proteins. A striking observation relates to the remarkably high thermostability of reconstructed ancestors. The latter has been linked to high environmental temperatures in the Precambrian era, the era relating to most reconstructed proteins. We found that inferred ancestors of the serum paraoxonase (PON) enzyme family, including the mammalian ancestor, exhibit dramatically increased thermostabilities compared with the extant, human enzyme (up to 30 °C higher melting temperature). However, the environmental temperature at the time of emergence of mammals is presumed to be similar to the present one. Additionally, the mammalian PON ancestor has superior folding properties (kinetic stability): unlike the extant mammalian PONs, it expresses in E. coli in a soluble and functional form, and at a high yield. We discuss two potential origins of this unexpectedly high stability. First, ancestral stability may be overestimated by a "consensus effect," whereby replacing amino acids that are rare in contemporary sequences with the amino acid most common in the family increases protein stability. Comparison to other reconstructed ancestors indicates that the consensus effect may bias some but not all reconstructions. Second, we note that high stability may relate to factors other than high environmental temperature such as oxidative stress or high radiation levels. Foremost, intrinsic factors such as high rates of genetic mutations and/or of transcriptional and translational errors, and less efficient protein quality control systems, may underlie the high kinetic and thermodynamic stability of past proteins.
Affiliation(s)
- Devin L Trudeau
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Miriam Kaltenbach
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Dan S Tawfik
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
52
Abstract
A popular and successful strategy in semi-rational design of protein stability is the use of evolutionary information encapsulated in homologous protein sequences. Consensus design is based on the hypothesis that at a given position, the respective consensus amino acid contributes more than average to the stability of the protein than non-conserved amino acids. Here, we review the consensus design approach, its theoretical underpinnings, successes, limitations and challenges, as well as providing a detailed guide to its application in protein engineering.
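As a rough illustration of the idea described in this abstract, the sketch below derives a per-column consensus from a multiple sequence alignment and lists the substitutions that would move a query sequence toward it. It is a minimal sketch, not the procedure from the review: the function names, the gap symbol, the first-seen tie-breaking rule, and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol in the alignment


def consensus_sequence(alignment):
    """Return the most frequent non-gap residue for every alignment column."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    consensus = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != GAP)
        # An all-gap column stays a gap; ties go to the residue seen first.
        consensus.append(counts.most_common(1)[0][0] if counts else GAP)
    return "".join(consensus)


def consensus_mutations(aligned_query, alignment):
    """List substitutions (e.g. 'R2K') that move the query toward the consensus."""
    consensus = consensus_sequence(alignment)
    if len(aligned_query) != len(consensus):
        raise ValueError("query must be aligned to the same columns")
    mutations = []
    position = 0  # residue number in the ungapped query
    for query_res, cons_res in zip(aligned_query, consensus):
        if query_res == GAP:
            continue
        position += 1
        if cons_res != GAP and cons_res != query_res:
            mutations.append(f"{query_res}{position}{cons_res}")
    return mutations


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]
print(consensus_sequence(toy_alignment))
print(consensus_mutations("MRVLG", toy_alignment))
```

A plain majority vote ignores sequence weighting, phylogenetic bias, and correlated positions, so in practice the candidate list would only be a starting point for experimental testing.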
Affiliation(s)
- Benjamin T Porebski
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Clayton, Victoria 3800, Australia; Medical Research Council Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK
- Ashley M Buckle
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Clayton, Victoria 3800, Australia
53
Using natural sequences and modularity to design common and novel protein topologies. Curr Opin Struct Biol 2016; 38:26-36. [PMID: 27270240 DOI: 10.1016/j.sbi.2016.05.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/01/2016] [Revised: 05/13/2016] [Accepted: 05/18/2016] [Indexed: 02/07/2023]
Abstract
Protein design is still a challenging undertaking, often requiring multiple attempts or iterations for success. Typically, the source of failure is unclear, and scoring metrics appear similar between successful and failed cases. Nevertheless, the use of sequence statistics, modularity and symmetry from natural proteins, combined with computational design both at the coarse-grained and atomistic levels is propelling a new wave of design efforts to success. Here we highlight recent examples of design, showing how the wealth of natural protein sequence and topology data may be leveraged to reduce the search space and increase the likelihood of achieving desired outcomes.
54
Bendl J, Stourac J, Sebestova E, Vavra O, Musil M, Brezovsky J, Damborsky J. HotSpot Wizard 2.0: automated design of site-specific mutations and smart libraries in protein engineering. Nucleic Acids Res 2016; 44:W479-87. [PMID: 27174934 PMCID: PMC4987947 DOI: 10.1093/nar/gkw416] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 02/21/2016] [Accepted: 05/03/2016] [Indexed: 01/13/2023] Open
Abstract
HotSpot Wizard 2.0 is a web server for automated identification of hot spots and design of smart libraries for engineering proteins' stability, catalytic activity, substrate specificity and enantioselectivity. The server integrates sequence, structural and evolutionary information obtained from 3 databases and 20 computational tools. Users are guided through the processes of selecting hot spots using four different protein engineering strategies and optimizing the resulting library's size by narrowing down a set of substitutions at individual randomized positions. The only required input is a query protein structure. The results of the calculations are mapped onto the protein's structure and visualized with a JSmol applet. HotSpot Wizard lists annotated residues suitable for mutagenesis and can automatically design appropriate codons for each implemented strategy. Overall, HotSpot Wizard provides comprehensive annotations of protein structures and assists protein engineers with the rational design of site-specific mutations and focused libraries. It is freely available at http://loschmidt.chemi.muni.cz/hotspotwizard.
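To make the link between codon choice and library size concrete, the sketch below expands a degenerate codon into its concrete codons and reports which amino acids and how many stop codons it encodes. This is not HotSpot Wizard's algorithm or API; the function name `library_summary` and the NNK example are assumptions for illustration, and the translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_summary(degenerate_codon):
    """Summarize what a single degenerate codon (e.g. 'NNK') encodes."""
    if len(degenerate_codon) != 3:
        raise ValueError("a codon must have exactly three positions")
    options = [IUPAC[base] for base in degenerate_codon.upper()]
    codons = ["".join(choice) for choice in product(*options)]
    residues = [CODON_TABLE[codon] for codon in codons]
    return {
        "codons": len(codons),
        "amino_acids": sorted(set(residues) - {"*"}),
        "stop_codons": residues.count("*"),
    }


for codon in ("NNN", "NNK"):
    summary = library_summary(codon)
    print(codon, summary["codons"], len(summary["amino_acids"]), summary["stop_codons"])
```

Restricting the third position (NNK instead of NNN) halves the number of codons per randomized site while still covering all 20 amino acids, which is the kind of trade-off a library-size optimization weighs.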
Affiliation(s)
- Jaroslav Bendl
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic; Department of Information Systems, Faculty of Information Technology, Brno University of Technology, 612 66 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Jan Stourac
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic
- Eva Sebestova
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic
- Ondrej Vavra
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic
- Milos Musil
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic; Department of Information Systems, Faculty of Information Technology, Brno University of Technology, 612 66 Brno, Czech Republic
- Jan Brezovsky
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
55
Abstract
Using structure and sequence based analysis we can engineer proteins to increase their thermal stability.
Affiliation(s)
- H. Pezeshgi Modarres
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California Berkeley, Berkeley, USA
- M. R. Mofrad
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California Berkeley, Berkeley, USA
- A. Sanati-Nezhad
- BioMEMS and Bioinspired Microfluidic Laboratory, Department of Mechanical and Manufacturing Engineering, University of Calgary, Calgary, Canada
56
Parra RG, Espada R, Verstraete N, Ferreiro DU. Structural and Energetic Characterization of the Ankyrin Repeat Protein Family. PLoS Comput Biol 2015; 11:e1004659. [PMID: 26691182 PMCID: PMC4687027 DOI: 10.1371/journal.pcbi.1004659] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 08/11/2015] [Accepted: 11/10/2015] [Indexed: 11/21/2022] Open
Abstract
Ankyrin repeat containing proteins are one of the most abundant solenoid folds. Usually implicated in specific protein-protein interactions, these proteins are readily amenable for design, with promising biotechnological and biomedical applications. Studying repeat protein families presents technical challenges due to the high sequence divergence among the repeating units. We developed and applied a systematic method to consistently identify and annotate the structural repetitions over the members of the complete Ankyrin Repeat Protein Family, with increased sensitivity over previous studies. We statistically characterized the number of repeats, the folding of the repeat-arrays, their structural variations, insertions and deletions. An energetic analysis of the local frustration patterns reveals the basic features underlying fold stability and its relation to the functional binding regions. We found a strong linear correlation between the conservation of the energetic features in the repeat arrays and their sequence variations, and discuss new insights into the organization and function of these ubiquitous proteins.
Some natural proteins are formed with repetitions of similar amino acid stretches. Ankyrin-repeat proteins constitute one of the most abundant families of this class of proteins that serve as model systems to analyze how variations in sequences exert effects in structures and biological functions. We present an in-depth analysis of the ankyrin repeat protein family, characterizing the variations in the repeating arrays both at the structural and energetic level. We introduce a consistent annotation for the repeat characteristics and describe how the structural differences are related to the sequences by their underlying energetic signatures.
Affiliation(s)
- R. Gonzalo Parra
- Protein Physiology Lab, Dep de Química Biológica, Facultad de Ciencias Exactas y Naturales, UBA-CONICET-IQUIBICEN, Buenos Aires, Argentina
- Rocío Espada
- Protein Physiology Lab, Dep de Química Biológica, Facultad de Ciencias Exactas y Naturales, UBA-CONICET-IQUIBICEN, Buenos Aires, Argentina
- Nina Verstraete
- Protein Physiology Lab, Dep de Química Biológica, Facultad de Ciencias Exactas y Naturales, UBA-CONICET-IQUIBICEN, Buenos Aires, Argentina
- Diego U. Ferreiro
- Protein Physiology Lab, Dep de Química Biológica, Facultad de Ciencias Exactas y Naturales, UBA-CONICET-IQUIBICEN, Buenos Aires, Argentina
57
Li SF, Xu JY, Bao YJ, Zheng HC, Song H. Structure and sequence analysis-based engineering of pullulanase from Anoxybacillus sp. LM18-11 for improved thermostability. J Biotechnol 2015; 210:8-14. [DOI: 10.1016/j.jbiotec.2015.06.406] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 02/04/2015] [Revised: 06/16/2015] [Accepted: 06/19/2015] [Indexed: 10/23/2022]
58
Bar-Rogovsky H, Stern A, Penn O, Kobl I, Pupko T, Tawfik DS. Assessing the prediction fidelity of ancestral reconstruction by a library approach. Protein Eng Des Sel 2015; 28:507-18. [DOI: 10.1093/protein/gzv038] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 07/16/2015] [Accepted: 07/20/2015] [Indexed: 11/13/2022] Open
59
Bandyopadhyay D, Murthy MRN, Balaram H, Balaram P. Probing the role of highly conserved residues in triosephosphate isomerase - analysis of site specific mutants at positions 64 and 75 in the Plasmodial enzyme. FEBS J 2015. [DOI: 10.1111/febs.13384] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
Affiliation(s)
- Hemalatha Balaram
- Molecular Biology and Genetics Unit; Jawaharlal Nehru Centre for Advanced Scientific Research; Bangalore India
60
Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 2015; 112:7159-64. [PMID: 26040002 DOI: 10.1073/pnas.1422285112] [Citation(s) in RCA: 138] [Impact Index Per Article: 15.3] [Indexed: 11/18/2022] Open
Abstract
Natural enzymes are incredibly proficient catalysts, but engineering them to have new or improved functions is challenging due to the complexity of how an enzyme's sequence relates to its biochemical properties. Here, we present an ultrahigh-throughput method for mapping enzyme sequence-function relationships that combines droplet microfluidic screening with next-generation DNA sequencing. We apply our method to map the activity of millions of glycosidase sequence variants. Microfluidic-based deep mutational scanning provides a comprehensive and unbiased view of the enzyme function landscape. The mapping displays expected patterns of mutational tolerance and a strong correspondence to sequence variation within the enzyme family, but also reveals previously unreported sites that are crucial for glycosidase function. We modified the screening protocol to include a high-temperature incubation step, and the resulting thermotolerance landscape allowed the discovery of mutations that enhance enzyme thermostability. Droplet microfluidics provides a general platform for enzyme screening that, when combined with DNA-sequencing technologies, enables high-throughput mapping of enzyme sequence space.
61
Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software. Sci Rep 2015; 5:8193. [PMID: 25645341 PMCID: PMC4648443 DOI: 10.1038/srep08193] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 10/03/2014] [Accepted: 01/12/2015] [Indexed: 01/05/2023] Open
Abstract
Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
62
Suplatov D, Voevodin V, Švedas V. Robust enzyme design: bioinformatic tools for improved protein stability. Biotechnol J 2014; 10:344-55. [PMID: 25524647 DOI: 10.1002/biot.201400150] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 06/30/2014] [Revised: 09/30/2014] [Accepted: 11/04/2014] [Indexed: 01/22/2023]
Abstract
The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.
Affiliation(s)
- Dmitry Suplatov
- Belozersky Institute of Physicochemical Biology and Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
63
Parmeggiani F, Huang PS, Vorobiev S, Xiao R, Park K, Caprari S, Su M, Seetharaman J, Mao L, Janjua H, Montelione GT, Hunt J, Baker D. A general computational approach for repeat protein design. J Mol Biol 2014; 427:563-75. [PMID: 25451037 PMCID: PMC4303030 DOI: 10.1016/j.jmb.2014.11.005] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 07/23/2014] [Revised: 10/08/2014] [Accepted: 11/07/2014] [Indexed: 01/12/2023]
Abstract
Repeat proteins have considerable potential for use as modular binding reagents or biomaterials in biomedical and nanotechnology applications. Here we describe a general computational method for building idealized repeats that integrates available family sequences and structural information with Rosetta de novo protein design calculations. Idealized designs from six different repeat families were generated and experimentally characterized; 80% of the proteins were expressed and soluble and more than 40% were folded and monomeric with high thermal stability. Crystal structures determined for members of three families are within 1 Å root-mean-square deviation of the design models. The method provides a general approach for fast and reliable generation of stable modular repeat protein scaffolds.
Affiliation(s)
- Fabio Parmeggiani
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Po-Ssu Huang
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Sergey Vorobiev
- Department of Biological Sciences, Northeast Structural Genomics Consortium, Columbia University, New York, NY 10027, USA
- Rong Xiao
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry and Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Keunwan Park
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Silvia Caprari
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Min Su
- Department of Biological Sciences, Northeast Structural Genomics Consortium, Columbia University, New York, NY 10027, USA
- Jayaraman Seetharaman
- Department of Biological Sciences, Northeast Structural Genomics Consortium, Columbia University, New York, NY 10027, USA
- Lei Mao
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry and Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Haleema Janjua
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry and Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry and Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- John Hunt
- Department of Biological Sciences, Northeast Structural Genomics Consortium, Columbia University, New York, NY 10027, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
64
Goyal VD, Yadav P, Kumar A, Ghosh B, Makde RD. Crystallization and preliminary X-ray crystallographic analysis of an artificial molten-globular-like triosephosphate isomerase protein of mixed phylogenetic origin. Acta Crystallogr F Struct Biol Commun 2014; 70:1521-5. [PMID: 25372821 PMCID: PMC4231856 DOI: 10.1107/s2053230x14020755] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/23/2014] [Accepted: 09/16/2014] [Indexed: 11/10/2022] Open
Abstract
A bioinformatics-based protein-engineering approach called consensus design led to the construction of a chimeric triosephosphate isomerase (TIM) protein called ccTIM (curated consensus TIM) which is as active as Saccharomyces cerevisiae TIM despite sharing only 58% sequence identity with it. The amino-acid sequence of this novel protein is as identical to native sequences from eukaryotes as to those from prokaryotes and shares some biophysical traits with a molten globular protein. Solving its crystal structure would help in understanding the physical implications of its bioinformatics-based sequence. In this report, the ccTIM protein was successfully crystallized using the microbatch-under-oil method and a full X-ray diffraction data set was collected to 2.2 Å resolution using a synchrotron-radiation source. The crystals belonged to space group C222₁, with unit-cell parameters a = 107.97, b = 187.21, c = 288.22 Å. Matthews coefficient calculations indicated the presence of six dimers in the asymmetric unit, with an approximate solvent content of 46.2%.
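The Matthews coefficient estimate mentioned in this abstract can be sketched as a short calculation. The unit-cell edges, the space group, and the six dimers per asymmetric unit come from the abstract; the monomer mass of 26,500 Da is an assumed, illustrative value that the abstract does not report, and the function names are hypothetical. The sketch uses the standard relations V_M = V_cell / (Z · n · M) and solvent fraction ≈ 1 − 1.23 / V_M, with Z = 8 asymmetric units per cell for space group C222₁.

```python
def matthews_coefficient(a, b, c, asu_per_cell, copies_per_asu, monomer_mass_da):
    """Matthews coefficient V_M in cubic angstroms per dalton (orthorhombic cell)."""
    cell_volume = a * b * c  # all cell angles are 90 degrees in an orthorhombic cell
    return cell_volume / (asu_per_cell * copies_per_asu * monomer_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M using the usual 1.23 constant."""
    return 1.0 - 1.23 / vm


# Cell edges (angstroms) and contents from the abstract; the mass is an assumption.
A, B, C = 107.97, 187.21, 288.22
ASU_PER_CELL = 8            # general positions in space group C222(1)
MONOMERS_PER_ASU = 6 * 2    # six dimers in the asymmetric unit
ASSUMED_MONOMER_MASS_DA = 26_500.0  # hypothetical value, not given in the abstract

vm = matthews_coefficient(A, B, C, ASU_PER_CELL, MONOMERS_PER_ASU, ASSUMED_MONOMER_MASS_DA)
print(f"V_M = {vm:.2f} A^3/Da, solvent fraction = {solvent_fraction(vm):.1%}")
```

With the assumed mass, the result is close to the roughly 46% solvent content quoted in the abstract; a different monomer mass or copy number would shift the estimate accordingly.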
Affiliation(s)
- Pooja Yadav
- High Pressure and Synchrotron Radiation Physics Division, Bhabha Atomic Research Centre, Trombay, Mumbai 400 085, India
- Ashwani Kumar
- High Pressure and Synchrotron Radiation Physics Division, Bhabha Atomic Research Centre, Trombay, Mumbai 400 085, India
- Biplab Ghosh
- High Pressure and Synchrotron Radiation Physics Division, Bhabha Atomic Research Centre, Trombay, Mumbai 400 085, India
- Ravindra D. Makde
- High Pressure and Synchrotron Radiation Physics Division, Bhabha Atomic Research Centre, Trombay, Mumbai 400 085, India
65
Yang G, Ding Y. Recent advances in biocatalyst discovery, development and applications. Bioorg Med Chem 2014; 22:5604-12. [DOI: 10.1016/j.bmc.2014.06.033] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 04/14/2014] [Revised: 06/13/2014] [Accepted: 06/17/2014] [Indexed: 12/25/2022]
66
Ray WC, Rumpf RW, Sullivan B, Callahan N, Magliery T, Machiraju R, Wong B, Krzywinski M, Bartlett CW. Understanding the sequence requirements of protein families: insights from the BioVis 2013 contests. BMC Proc 2014; 8:S1. [PMID: 25237388 PMCID: PMC4155613 DOI: 10.1186/1753-6561-8-s2-s1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
Introduction: In 2011, the BioVis symposium of the IEEE VisWeek conferences inaugurated a new variety of data analysis contest. Aimed at fostering collaborations between computational scientists and biologists, the BioVis contest provided real data from biological domains with emerging visualization needs, in the hope that novel approaches would result in powerful new tools for the community. In 2011 and 2012 the theme of these contests was expression Quantitative Trait Locus analysis, within and across tissues respectively. In 2013 the topic was updated to protein sequence and mutation visualization.
Methods: The contest was framed in the context of a real protein with numerous mutations that had lost function, and the question posed "what minimal set of changes would you propose to rescue function, or how could you support a biologist attempting to answer that question?". The data was grounded in actual experimental results in triosephosphate isomerase (TIM) enzymes. Seven teams composed of 36 individuals submitted entries with proposed solutions and approaches to the challenge. Their contributions ranged from careful analysis of the visualization and analytical requirements for the problem through integration of existing tools for analyzing the context and consequences of protein mutations, to completely new tools addressing the problem.
Results: Judges found valuable and novel contributions in each of the entries, including interesting ways to hierarchicalize the protein into domains of informational interaction, tools for simultaneously understanding both sequential and spatial order, and approaches for conveying some types of inter-residue dependencies. In this manuscript we document the problem presented to the contestants, summarize the biological contributions of their entries, and suggest opportunities that this work has highlighted for even more improved tools in the future.
Affiliation(s)
- William C Ray
- Nationwide Children's Hospital, 575 Children's Crossroad, 43215, Columbus, OH, USA ; The Ohio State University, 100 W. 18th Ave, 43210, Columbus, OH, USA ; Contest Chairs
- R Wolfgang Rumpf
- Nationwide Children's Hospital, 575 Children's Crossroad, 43215, Columbus, OH, USA
- Brandon Sullivan
- The Ohio State University, 100 W. 18th Ave, 43210, Columbus, OH, USA ; Domain Experts
- Nicholas Callahan
- The Ohio State University, 100 W. 18th Ave, 43210, Columbus, OH, USA ; Domain Experts
- Thomas Magliery
- The Ohio State University, 100 W. 18th Ave, 43210, Columbus, OH, USA ; Domain Experts
- Raghu Machiraju
- The Ohio State University, 100 W. 18th Ave, 43210, Columbus, OH, USA ; Contest Chairs
- Bang Wong
- The Broad Institute, 7 Cambridge Center, 02142, Cambridge, MA, USA ; Contest Chairs
- Martin Krzywinski
- Genome Sciences Centre, 570 W, 7th Avenue, V5Z 4S6, Vancouver, BC, Canada ; Contest Chairs
- Christopher W Bartlett
- Nationwide Children's Hospital, 575 Children's Crossroad, 43215, Columbus, OH, USA ; The Ohio State University, 100 W. 18th Ave, 43210, Columbus, OH, USA ; Contest Chairs
67
Longo LM, Blaber M. Symmetric protein architecture in protein design: top-down symmetric deconstruction. Methods Mol Biol 2014; 1216:161-182. [PMID: 25213415 DOI: 10.1007/978-1-4939-1486-9_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Top-down symmetric deconstruction (TDSD) is a joint experimental and computational approach to generate a highly stable, functionally benign protein scaffold for intended application in subsequent functional design studies. By focusing on symmetric protein folds, TDSD can leverage the dramatic reduction in sequence space achieved by applying a primary structure symmetric constraint to the design process. Fundamentally, TDSD is an iterative symmetrization process, in which the goal is to maintain or improve properties of thermodynamic stability and folding cooperativity inherent to a starting sequence (the "proxy"). As such, TDSD does not attempt to solve the inverse protein folding problem directly, which is computationally intractable. The present chapter will take the reader through all of the primary steps of TDSD (selecting a proxy, identifying potential mutations, and establishing a stability/folding cooperativity screen), relying heavily on a successful TDSD solution for the common β-trefoil fold.
Affiliation(s)
- Liam M Longo
- Department of Biomedical Sciences, College of Medicine, Florida State University, 1115 West Call Street, Tallahassee, FL, 32306-4300, USA
68
Abstract
The genomic revolution promises great advances in the search for useful biocatalysts. Function-based metagenomic approaches have identified several enzymes with properties that make them useful candidates for a variety of bioprocesses. As DNA sequencing costs continue to decline, the volume of genomic data, along with their corresponding predicted protein sequences, will continue to increase dramatically, necessitating new approaches to leverage this information for gene-based bioprospecting efforts. Additionally, as new functions are discovered and correlated with this sequence information, the knowledge of the often complex relationship between a protein's sequence and function will improve. This in turn will lead to better gene-based bioprospecting approaches and facilitate the tailoring of desired properties through protein engineering projects. In this chapter, we discuss a number of recent advances in bioprospecting within the context of the genomic age.
Affiliation(s)
- Michael A Hicks
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Kristala L J Prather
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; Synthetic Biology Engineering Research Center (SynBERC), Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
|
69
|
Computational tools for designing and engineering enzymes. Curr Opin Chem Biol 2013; 19:8-16. [PMID: 24780274 DOI: 10.1016/j.cbpa.2013.12.003] [Citation(s) in RCA: 132] [Impact Index Per Article: 12.0] [Received: 11/04/2013] [Revised: 12/04/2013] [Accepted: 12/04/2013] [Indexed: 01/23/2023]
Abstract
Protein engineering strategies aimed at constructing enzymes with novel or improved activities, specificities, and stabilities greatly benefit from in silico methods. Computational methods can be principally grouped into three main categories: bioinformatics; molecular modelling; and de novo design. Particularly de novo protein design is experiencing rapid development, resulting in more robust and reliable predictions. A recent trend in the field is to combine several computational approaches in an interactive manner and to complement them with structural analysis and directed evolution. A detailed investigation of designed catalysts provides valuable information on the structural basis of molecular recognition, biochemical catalysis, and natural protein evolution.
|
70
|
Tanwar AS, Goyal VD, Choudhary D, Panjikar S, Anand R. Importance of hydrophobic cavities in allosteric regulation of formylglycinamide synthetase: insight from xenon trapping and statistical coupling analysis. PLoS One 2013; 8:e77781. [PMID: 24223728 PMCID: PMC3815217 DOI: 10.1371/journal.pone.0077781] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 07/12/2013] [Accepted: 09/12/2013] [Indexed: 11/19/2022] Open
Abstract
Formylglycinamide ribonucleotide amidotransferase (FGAR-AT) is a 140 kDa bi-functional enzyme involved in a coupled reaction, where the glutaminase active site produces ammonia that is subsequently utilized to convert FGAR to its corresponding amidine in an ATP assisted fashion. The structure of FGAR-AT has been previously determined in an inactive state and the mechanism of activation remains largely unknown. In the current study, hydrophobic cavities were used as markers to identify regions involved in domain movements that facilitate catalytic coupling and subsequent activation of the enzyme. Three internal hydrophobic cavities were located by xenon trapping experiments on FGAR-AT crystals and further, these cavities were perturbed via site-directed mutagenesis. Biophysical characterization of the mutants demonstrated that two of these three voids are crucial for stability and function of the protein, although being ∼20 Å from the active centers. Interestingly, correlation analysis corroborated the experimental findings, and revealed that amino acids lining the functionally important cavities form correlated sets (co-evolving residues) that connect these regions to the amidotransferase active center. It was further proposed that the first cavity is transient and allows for breathing motion to occur and thereby serves as an allosteric hotspot. In contrast, the third cavity which lacks correlated residues was found to be highly plastic and accommodated steric congestion by local adjustment of the structure without affecting either stability or activity.
Affiliation(s)
- Ajay Singh Tanwar
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
- Venuka Durani Goyal
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
- Deepanshu Choudhary
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
- Santosh Panjikar
- Australian Synchrotron, Clayton, Australia
- Department of Biochemistry and Molecular Biology, Monash University, Victoria, Australia
- Ruchi Anand
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
|
71
|
Abstract
Theoretical studies have focused on the environmental temperature of the universal common ancestor of life with conflicting conclusions. Here we provide experimental support for the existence of a thermophilic universal common ancestor. We present the thermal stabilities and catalytic efficiencies of nucleoside diphosphate kinases (NDK), designed using the information contained in predictive phylogenetic trees, that seem to represent the last common ancestors of Archaea and of Bacteria. These enzymes display extreme thermal stabilities, suggesting thermophilic ancestries for Archaea and Bacteria. The results are robust to the uncertainties associated with the sequence predictions and to the tree topologies used to infer the ancestral sequences. Moreover, mutagenesis experiments suggest that the universal ancestor also possessed a very thermostable NDK. Because, as we show, the stability of an NDK is directly related to the environmental temperature of its host organism, our results indicate that the last common ancestor of extant life was a thermophile that flourished at a very high temperature.
|
72
|
Wijma HJ, Floor RJ, Janssen DB. Structure- and sequence-analysis inspired engineering of proteins for enhanced thermostability. Curr Opin Struct Biol 2013; 23:588-94. [PMID: 23683520 DOI: 10.1016/j.sbi.2013.04.008] [Citation(s) in RCA: 143] [Impact Index Per Article: 13.0] [Received: 01/30/2013] [Accepted: 04/15/2013] [Indexed: 01/03/2023]
Abstract
Protein engineering strategies for increasing stability can be improved by replacing random mutagenesis and high-throughput screening by approaches that include bioinformatics and computational design. Mutations can be focused on regions in the structure that are most flexible and involved in the early steps of thermal unfolding. Sequence analysis can often predict the position and nature of stabilizing mutations, and may allow the reconstruction of thermostable ancestral sequences. Various computational tools make it possible to design stabilizing features, such as hydrophobic clusters and surface charges. Different methods for designing chimeric enzymes can also support the engineering of more stable proteins without the need of high-throughput screening.
Affiliation(s)
- Hein J Wijma
- Department of Biochemistry, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands
|
73
|
Aerts D, Verhaeghe T, Joosten HJ, Vriend G, Soetaert W, Desmet T. Consensus engineering of sucrose phosphorylase: The outcome reflects the sequence input. Biotechnol Bioeng 2013; 110:2563-72. [DOI: 10.1002/bit.24940] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 01/18/2013] [Revised: 03/30/2013] [Accepted: 04/08/2013] [Indexed: 11/10/2022]
Affiliation(s)
- Dirk Aerts
- Department of Biochemical and Microbial Technology; Centre for Industrial Biotechnology and Biocatalysis; Ghent University; Coupure Links 653; B-9000; Ghent; Belgium
- Tom Verhaeghe
- Department of Biochemical and Microbial Technology; Centre for Industrial Biotechnology and Biocatalysis; Ghent University; Coupure Links 653; B-9000; Ghent; Belgium
- Henk-Jan Joosten
- Bio-Prodict; Castellastraat 116; Nijmegen; 6512 EZ; The Netherlands
- Gert Vriend
- Centre for Molecular and Biomolecular Informatics; Radboud University Nijmegen Medical Centre; PO Box 9101; Nijmegen; 6500 HB; The Netherlands
- Wim Soetaert
- Department of Biochemical and Microbial Technology; Centre for Industrial Biotechnology and Biocatalysis; Ghent University; Coupure Links 653; B-9000; Ghent; Belgium
- Tom Desmet
- Department of Biochemical and Microbial Technology; Centre for Industrial Biotechnology and Biocatalysis; Ghent University; Coupure Links 653; B-9000; Ghent; Belgium
|
74
|
Eisenberg M, Shumacher I, Cohen-Luria R, Ashkenasy G. Dynamic combinatorial libraries of artificial repeat proteins. Bioorg Med Chem 2013; 21:3450-7. [PMID: 23582443 DOI: 10.1016/j.bmc.2013.03.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/04/2013] [Revised: 03/10/2013] [Accepted: 03/11/2013] [Indexed: 10/27/2022]
Abstract
Repeat proteins are found in almost all cellular systems, where they are involved in diverse molecular recognition processes. Recent studies have suggested that de novo designed repeat proteins may serve as universal binders, and might potentially be used as practical alternative to antibodies. We describe here a novel chemical methodology for producing small libraries of repeat proteins, and screening in parallel the ligand binding of library members. The first stage of this research involved the total synthesis of a consensus-based three-repeat tetratricopeptide (TPR) protein (~14 kDa), via sequential attachment of the respective peptides. Despite the effectiveness of the synthesis and ligation steps, this method was found to be too demanding for the production of proteins containing variable number of repeats. Additionally, the analysis of binding of the individual proteins was time consuming. Therefore, we designed and prepared novel dynamic combinatorial libraries (DCLs), and show that their equilibration can facilitate the formation of TPR proteins containing up to eight repeating units. Interestingly, equilibration of the library building blocks in the presence of the biologically relevant ligands, Hsp90 and Hsp70, induced their oligomerization into forming more of the proteins with large recognition surfaces. We suggest that this work presents a novel simple and rapid tool for the simultaneous screening of protein mixtures with variable binding surfaces, and for identifying new binders for ligands of interest.
Affiliation(s)
- Margarita Eisenberg
- Department of Chemistry, Ben Gurion University of the Negev, Beer Sheva 84105, Israel
|
75
|
Sawyer N, Chen J, Regan L. All repeats are not equal: a module-based approach to guide repeat protein design. J Mol Biol 2013; 425:1826-1838. [PMID: 23434848 DOI: 10.1016/j.jmb.2013.02.013] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 12/05/2012] [Revised: 02/11/2013] [Accepted: 02/12/2013] [Indexed: 12/30/2022]
Abstract
Repeat proteins composed of tandem arrays of a short structural motif often mediate protein-protein interactions. Past efforts to design repeat protein-based molecular recognition tools have focused on the creation of templates from the consensus of individual repeats, regardless of their natural context. Such an approach assumes that all repeats are essentially equivalent. In this study, we present the results of a "module-based" approach in which modules composed of tandem repeats are aligned to identify repeat-specific features. Using this approach to analyze tetratricopeptide repeat modules that contain three tandem repeats (3TPRs), we identify two classes of 3TPR modules with distinct structural signatures that are correlated with different sets of functional residues. Our analyses also reveal a high degree of correlation between positions across the entire ligand-binding surface, indicative of a coordinated, coevolving binding surface. Extension of our analyses to different repeat protein modules reveals more examples of repeat-specific features, especially in armadillo repeat modules. In summary, the module-based analyses that we present effectively capture key repeat-specific features that will be important to include in future repeat protein design templates.
Affiliation(s)
- Nicholas Sawyer
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA; Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA
- Jieming Chen
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA; Program in Computational Biology and Bioinformatics, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA
- Lynne Regan
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA; Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA; Program in Computational Biology and Bioinformatics, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA; Department of Chemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06511, USA
|
76
|
Reetz MT. The Importance of Additive and Non-Additive Mutational Effects in Protein Engineering. Angew Chem Int Ed Engl 2013; 52:2658-66. [DOI: 10.1002/anie.201207842] [Citation(s) in RCA: 132] [Impact Index Per Article: 12.0] [Received: 09/28/2012] [Revised: 12/19/2012] [Indexed: 01/01/2023]
|
77
|
Die Bedeutung von additiven und nicht-additiven Mutationseffekten beim Protein-Engineering. Angew Chem Int Ed Engl 2013. [DOI: 10.1002/ange.201207842] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 12/30/2022]
|
78
|
Durani V, Magliery TJ. Protein engineering and stabilization from sequence statistics: variation and covariation analysis. Methods Enzymol 2013; 523:237-56. [PMID: 23422433 DOI: 10.1016/b978-0-12-394292-0.00011-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/20/2023]
Abstract
The concepts of consensus and correlation in multiple sequence alignments (MSAs) have been used in the past to understand and engineer proteins. However, there are multiple ways of acquiring MSA databases and also numerous mathematical metrics that can be applied to calculate each of the parameters. This chapter describes an overall methodology that we have chosen to employ for acquiring and statistically analyzing MSAs. We have provided a step-by-step protocol for calculating relative entropy and mutual information metrics and describe how they can be used to predict mutations that have a high probability of stabilizing a protein. This protocol allows for flexibility for modification of formulae and parameters without using anything more complicated than Microsoft Excel. We have also demonstrated various aspects of data analysis by carrying out a sample analysis on the BPTI-Kunitz family of proteins and identified mutations that would be predicted to stabilize this protein based on consensus and correlation values.
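As an illustration of the two metrics named in this abstract, the minimal Python sketch below computes per-column relative entropy and pairwise mutual information for a toy alignment. It is not the chapter's Excel protocol: the function names, the uniform 1/20 background distribution, the natural-log base, and the gap-free four-sequence alignment are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumed background: every amino acid equally likely.
UNIFORM_BACKGROUND = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def column(msa, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]


def relative_entropy(msa, i, background=UNIFORM_BACKGROUND):
    """Kullback-Leibler divergence of column i from the background (nats)."""
    n = len(msa)
    counts = Counter(column(msa, i))
    return sum((c / n) * math.log((c / n) / background[aa])
               for aa, c in counts.items())


def mutual_information(msa, i, j):
    """Mutual information between columns i and j (nats)."""
    n = len(msa)
    ci = Counter(column(msa, i))
    cj = Counter(column(msa, j))
    cij = Counter(zip(column(msa, i), column(msa, j)))
    return sum((c / n) * math.log((c / n) / ((ci[a] / n) * (cj[b] / n)))
               for (a, b), c in cij.items())


# Toy gap-free alignment: column 0 is invariant, columns 1 and 2 co-vary.
toy_msa = ["AKV", "AKV", "ARL", "ARL"]
print(relative_entropy(toy_msa, 0))       # high: fully conserved column
print(mutual_information(toy_msa, 1, 2))  # high: perfectly correlated pair
print(mutual_information(toy_msa, 0, 1))  # zero: invariant column carries no covariation
```

In a real analysis one would also need to decide how to treat gaps, sequence redundancy, and small-sample bias, none of which this sketch addresses.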
Affiliation(s)
- Venuka Durani
- Department of Chemistry, The Ohio State University, Columbus, Ohio, USA
|
79
|
Dietrich S, Borst N, Schlee S, Schneider D, Janda JO, Sterner R, Merkl R. Experimental assessment of the importance of amino acid positions identified by an entropy-based correlation analysis of multiple-sequence alignments. Biochemistry 2012; 51:5633-41. [PMID: 22737967 DOI: 10.1021/bi300747r] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
The analysis of a multiple-sequence alignment (MSA) with correlation methods identifies pairs of residue positions whose occupation with amino acids changes in a concerted manner. It is plausible to assume that positions that are part of many such correlation pairs are important for protein function or stability. We have used the algorithm H2r to identify positions k in the MSAs of the enzymes anthranilate phosphoribosyl transferase (AnPRT) and indole-3-glycerol phosphate synthase (IGPS) that show a high conn(k) value, i.e., a large number of significant correlations in which k is involved. The importance of the identified residues was experimentally validated by performing mutagenesis studies with sAnPRT and sIGPS from the archaeon Sulfolobus solfataricus. For sAnPRT, five H2r mutant proteins were generated by replacing nonconserved residues with alanine or the prevalent residue of the MSA. As a control, five residues with conn(k) values of zero were chosen randomly and replaced with alanine. The catalytic activities and conformational stabilities of the H2r and control mutant proteins were analyzed by steady-state enzyme kinetics and thermal unfolding studies. Compared to wild-type sAnPRT, the catalytic efficiencies (k_cat/K_M) were largely unaltered. In contrast, the apparent thermal unfolding temperature (T_M^app) was lowered in most proteins. Remarkably, the strongest observed destabilization (ΔT_M^app = 14 °C) was caused by the V284A exchange, which pertains to the position with the highest correlation signal [conn(k) = 11]. For sIGPS, six H2r mutant and four control proteins with alanine exchanges were generated and characterized. The k_cat/K_M values of four H2r mutant proteins were reduced between 13- and 120-fold, and their T_M^app values were decreased by up to 5 °C. For the sIGPS control proteins, the observed activity and stability decreases were much less severe. Our findings demonstrate that positions with high conn(k) values have an increased probability of being important for enzyme function or stability.
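The conn(k) value described here, the number of significant correlations in which position k is involved, can be sketched as a simple count over a list of significant position pairs. The Python below is an illustrative assumption, not the H2r implementation: the function name `conn_values` and the pair list are hypothetical, and deciding which pairs are significant (the actual work of H2r) is taken as given.

```python
from collections import Counter


def conn_values(significant_pairs):
    """Count, for each position k, the significant pairs that include k."""
    counts = Counter()
    for k, l in significant_pairs:
        counts[k] += 1
        counts[l] += 1
    return counts


# Hypothetical significant pairs of alignment positions (not real data).
pairs = [(1, 2), (1, 3), (2, 3), (1, 4)]
conn = conn_values(pairs)

# Rank positions from most to least connected.
for position, value in conn.most_common():
    print(position, value)
```

Because `Counter` returns 0 for missing keys, positions that appear in no significant pair have conn(k) = 0, which matches how the control residues are described in the abstract.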
Affiliation(s)
- Susanne Dietrich
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Universitätsstrasse 31, D-93053 Regensburg, Germany
|