1
Joerger AC, Stiewe T, Soussi T. TP53: the unluckiest of genes? Cell Death Differ 2025; 32:219-224. [PMID: 39443700 PMCID: PMC11803090 DOI: 10.1038/s41418-024-01391-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 09/22/2024] [Accepted: 09/27/2024] [Indexed: 10/25/2024] Open
Abstract
The transcription factor p53 plays a key role in the cellular defense against cancer development. It is inactivated in virtually every tumor, and in every second tumor this inactivation is due to a mutation in the TP53 gene. In this perspective, we show that this diverse mutational spectrum is unique among all other cancer-associated proteins and discuss what drives the selection of TP53 mutations in cancer. We highlight that several factors conspire to make the p53 protein particularly vulnerable to inactivation by the mutations that constantly plague our genome. It appears that the TP53 gene has emerged as a victim of its own evolutionary past that shaped its structure and function towards a pluripotent tumor suppressor, but came with an increased structural fragility of its DNA-binding domain. TP53 loss of function - with associated dominant-negative effects - is the main mechanism that will impair TP53 tumor suppressive function, regardless of whether a neomorphic phenotype is associated with some of these variants.
Affiliation(s)
- Andreas C Joerger
  - Institute of Pharmaceutical Chemistry, Goethe University, Frankfurt am Main, Germany.
  - Structural Genomics Consortium (SGC), Buchmann Institute for Molecular Life Sciences, Frankfurt am Main, Germany.
- Thorsten Stiewe
  - Institute of Molecular Oncology, Universities of Giessen and Marburg Lung Center (UGMLC), German Center for Lung Research (DZL), Philipps University, Marburg, Germany.
  - Institute for Lung Health (ILH), Justus Liebig University, Giessen, Germany.
- Thierry Soussi
  - Equipe « Hematopoietic and Leukemic Development », Sorbonne Université, INSERM, Centre de Recherche Saint-Antoine, CRSA, AP-HP, SIRIC CURAMUS, Paris, France.
  - Dept. of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Clinical Genetics, Uppsala University Hospital, Uppsala, Sweden.
2
Chen Z, Ji M, Qian J, Zhang Z, Zhang X, Gao H, Wang H, Wang R, Qi Y. ProBID-Net: a deep learning model for protein-protein binding interface design. Chem Sci 2024; 15:19977-19990. [PMID: 39568891 PMCID: PMC11575592 DOI: 10.1039/d4sc02233e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 10/11/2024] [Indexed: 11/22/2024] Open
Abstract
Protein-protein interactions are pivotal in numerous biological processes. The computational design of these interactions facilitates the creation of novel binding proteins, crucial for advancing biopharmaceutical products. With the evolution of artificial intelligence (AI), protein design tools have swiftly transitioned from scoring-function-based to AI-based models. However, many AI models for protein design are constrained by assuming complete unfamiliarity with the amino acid sequence of the input protein, a feature most suited for de novo design but posing challenges in designing protein-protein interactions when the receptor sequence is known. To bridge this gap in computational protein design, we introduce ProBID-Net. Trained using natural protein-protein complex structures and protein domain-domain interface structures, ProBID-Net can discern features from known target protein structures to design specific binding proteins based on their binding sites. In independent tests, ProBID-Net achieved interface sequence recovery rates of 52.7%, 43.9%, and 37.6%, surpassing or being on par with ProteinMPNN in binding protein design. Validated using AlphaFold-Multimer, the sequences designed by ProBID-Net demonstrated a close correspondence between the design target and the predicted structure. Moreover, the model's output can predict changes in binding affinity upon mutations in protein complexes, even in scenarios where no data on such mutations were provided during training (zero-shot prediction). In summary, the ProBID-Net model is poised to significantly advance the design of protein-protein interactions.
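The interface sequence recovery rate quoted in this abstract can be illustrated with a minimal sketch. It assumes the common definition (the fraction of interface positions at which the designed residue equals the native one); the abstract itself does not spell out the formula, and the function name, sequences, and positions below are hypothetical.

```python
def interface_recovery(native: str, designed: str, interface_positions: list[int]) -> float:
    """Fraction of interface positions where the designed residue matches the native one.

    Assumed definition; the sequences must already be aligned position by position.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must be aligned and of equal length")
    if not interface_positions:
        raise ValueError("no interface positions given")
    matches = sum(native[i] == designed[i] for i in interface_positions)
    return matches / len(interface_positions)


# Hypothetical toy data: positions 2-5 (0-based) form the interface.
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
interface = [2, 3, 4, 5]
print(interface_recovery(native, designed, interface))  # 0.5
```

Two of the four interface positions match in this toy case, so the recovery is 0.5.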
Affiliation(s)
- Zhihang Chen
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Menglin Ji
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Jie Qian
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Zhe Zhang
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Xiangying Zhang
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Haotian Gao
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Haojie Wang
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Renxiao Wang
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Yifei Qi
  - Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
3
Vila JA. Analysis of proteins in the light of mutations. Eur Biophys J 2024; 53:255-265. [PMID: 38955858 DOI: 10.1007/s00249-024-01714-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 05/23/2024] [Accepted: 06/18/2024] [Indexed: 07/04/2024]
Abstract
Proteins have evolved through mutations (amino acid substitutions) since life appeared on Earth, some 10^9 years ago. The study of these phenomena has been of particular significance because of their impact on protein stability, function, and structure. This study offers a new viewpoint on how the most recent findings in these areas can be used to explore the impact of mutations on protein sequence, stability, and evolvability. Preliminary results indicate that: (1) mutations can be viewed as sensitive probes to identify 'typos' in the amino-acid sequence, and also to assess the resistance of naturally occurring proteins to unwanted sequence alterations; (2) the presence of 'typos' in the amino acid sequence, rather than being an evolutionary obstacle, could promote faster evolvability and, in turn, increase the likelihood of higher protein stability; (3) the mutation site is far more important than the substituted amino acid in terms of the marginal stability changes of the protein, and (4) the unpredictability of protein evolution at the molecular level (by mutations) exists even in the absence of epistasis effects. Finally, the Darwinian concept of evolution "descent with modification" and experimental evidence endorse one of the results of this study, which suggests that some regions of any protein sequence are susceptible to mutations while others are not. This work contributes to our general understanding of protein responses to mutations and may spur significant progress in our efforts to develop methods to accurately forecast changes in protein stability, their propensity for metamorphism, and their ability to evolve.
Affiliation(s)
- Jorge A Vila
  - IMASL-CONICET, Universidad Nacional de San Luis, Ejército de los Andes 950, 5700, San Luis, Argentina.
4
Gaur SK, Chaudhary Y, Jain J, Singh R, Kaul R. Structural and functional characterization of peste des petits ruminants virus coded hemagglutinin protein using various in-silico approaches. Front Microbiol 2024; 15:1427606. [PMID: 38966393 PMCID: PMC11222573 DOI: 10.3389/fmicb.2024.1427606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Accepted: 06/10/2024] [Indexed: 07/06/2024] Open
Abstract
Peste des petits ruminants (PPR), a disease of socioeconomic importance, has been a serious threat to small ruminants. The causative agent of this disease is PPR virus (PPRV), which belongs to the genus Morbillivirus. Hemagglutinin (H) is a PPRV-coded transmembrane protein that is embedded in the viral envelope and plays a vital role in mediating the entry of the virion particle into the cell. The infected host mounts an effective humoral response against the H protein, which is important for the host to overcome the infection. In the present study, we have investigated structural, physiological and functional properties of the hemagglutinin protein using various computational tools. The sequence analysis and structure prediction analysis show that the hemagglutinin protein comprises beta sheets as the predominant secondary structure, and may lack neuraminidase activity. PPRV-H consists of several important domains and motifs that form an essential scaffold, which imparts various critical roles to the protein. Comparative modeling predicted the protein to exist as a homo-tetramer that binds to its cognate cellular receptors. Certain amino acid substitutions identified by multiple sequence alignment were found to alter the predicted structure of the protein. PPRV-H, through its predicted interaction with the TLR-2 molecule, may drive the expression of CD150, which could further propagate the virus into the host. Together, our study provides new insights into PPRV-H protein structure and its predicted functions.
Affiliation(s)
- Rajeev Kaul
  - Department of Microbiology, University of Delhi South Campus, New Delhi, India
5
Panigrahi R, Kailasam S. Mapping allosteric pathway in NIa-Pro using computational approach. Quant Biol 2023. [DOI: 10.15302/j-qb-022-0296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
6
Carpentier M, Chomilier J. Analyses of Mutation Displacements from Homology Models. Methods Mol Biol 2023; 2627:195-210. [PMID: 36959449 DOI: 10.1007/978-1-0716-2974-1_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Evaluation of the structural perturbations introduced by a single amino acid mutation is the main issue for protein structural biology. We propose here to present some recent advances in methods that allow the distortion to be split between the actual substitution effect and the contribution of the local flexibility of the position where the mutation occurs. The main drawback of this approach is the need for many structures with a single mutation in each of them. To bypass this difficulty, we propose to use molecular modeling tools, with several software packages enabling us to build a model from a template, given the sequence. As a proof of concept, we rely on a gold standard, the human lysozyme. Both wild-type and three mutant structures are available in the PDB. Two of these mutations result in amyloid fibril formation, and the last one is neutral. In conclusion, irrespective of the algorithm used for modeling, side chain conformations at the site of mutation are reliable, although long-range effects are out of reach of these tools.
Affiliation(s)
- Mathilde Carpentier
  - Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France.
- Jacques Chomilier
  - Sorbonne Université, BiBiP, IMPMC, UMR 7590, CNRS, MNHN, Paris, France
7
Peng M, Siebert DL, Engqvist MKM, Niemeyer CM, Rabe KS. Modeling-Assisted Design of Thermostable Benzaldehyde Lyases from Rhodococcus erythropolis for Continuous Production of α-Hydroxy Ketones. Chembiochem 2022; 23:e202100468. [PMID: 34558792 PMCID: PMC9293332 DOI: 10.1002/cbic.202100468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/02/2021] [Revised: 09/23/2021] [Indexed: 12/18/2022]
Abstract
Enantiopure α-hydroxy ketones are important building blocks of active pharmaceutical ingredients (APIs), which can be produced by thiamine-diphosphate-dependent lyases, such as benzaldehyde lyase. Here we report the discovery of a novel thermostable benzaldehyde lyase from Rhodococcus erythropolis R138 (ReBAL). While the overall sequence identity to the only experimentally confirmed benzaldehyde lyase from Pseudomonas fluorescens Biovar I (PfBAL) was only 65%, comparison of a structural model of ReBAL with the crystal structure of PfBAL revealed only four divergent amino acids in the substrate binding cavity. Based on rational design, we generated two ReBAL variants, which were characterized along with the wild-type enzyme in terms of their substrate spectrum, thermostability and biocatalytic performance in the presence of different co-solvents. We found that the new enzyme variants have a significantly higher thermostability (up to 22 °C increase in T50) and a different co-solvent-dependent activity. Using the most stable variant immobilized in packed-bed reactors via the SpyCatcher/SpyTag system, (R)-benzoin was synthesized from benzaldehyde over a period of seven days with a stable space-time-yield of 9.3 mmol ⋅ L⁻¹ ⋅ d⁻¹. Our work expands the important class of benzaldehyde lyases and therefore contributes to the development of continuous biocatalytic processes for the production of α-hydroxy ketones and APIs.
Affiliation(s)
- Martin Peng
  - Karlsruhe Institute of Technology (KIT), Institute for Biological Interfaces (IBG 1), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Dominik L. Siebert
  - Karlsruhe Institute of Technology (KIT), Institute for Biological Interfaces (IBG 1), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Martin K. M. Engqvist
  - Chalmers University of Technology, Department of Biology and Biological Engineering, Division of Systems and Synthetic Biology, Kemivägen 10, 412 96 Gothenburg, Sweden
- Christof M. Niemeyer
  - Karlsruhe Institute of Technology (KIT), Institute for Biological Interfaces (IBG 1), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Kersten S. Rabe
  - Karlsruhe Institute of Technology (KIT), Institute for Biological Interfaces (IBG 1), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
8
Abstract
The reconstruction of genetic material of ancestral organisms constitutes a powerful application of evolutionary biology. A fundamental step in this inference is the ancestral sequence reconstruction (ASR), which can be performed with diverse methodologies implemented in computer frameworks. However, most of these methodologies ignore evolutionary properties frequently observed in microbes, such as genetic recombination and complex selection processes, that can bias the traditional ASR. From a practical perspective, here I review methodologies for the reconstruction of ancestral DNA and protein sequences, with particular focus on microbes, and including biases, recommendations, and software implementations. I conclude that microbial ASR is a complex analysis that should be carefully performed and that there is a need for methods to infer more realistic ancestral microbial sequences.
Affiliation(s)
- Miguel Arenas
  - Biomedical Research Center (CINBIO), University of Vigo, Vigo, Spain.
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain.
  - Galicia Sur Health Research Institute (IIS Galicia Sur), Vigo, Spain.
9
Gutman T, Goren G, Efroni O, Tuller T. Estimating the predictive power of silent mutations on cancer classification and prognosis. NPJ Genom Med 2021; 6:67. [PMID: 34385450 PMCID: PMC8361094 DOI: 10.1038/s41525-021-00229-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/13/2021] [Accepted: 06/24/2021] [Indexed: 02/07/2023] Open
Abstract
In recent years it has been shown that silent mutations, in and out of the coding region, can affect gene expression and may be related to tumorigenesis and cancer cell fitness. However, the predictive ability of these mutations for cancer type diagnosis and prognosis has not been evaluated yet. In the current study, based on the analysis of 9,915 cancer genomes and approximately three million mutations, we provide a comprehensive quantitative evaluation of the predictive power of various types of silent and non-silent mutations over cancer classification and prognosis. The results indicate that silent-mutation models outperform the equivalent null models in classifying all examined cancer types and in estimating the probability of survival 10 years after the initial diagnosis. Additionally, combining both non-silent and silent mutations achieved the best classification results for 68% of the cancer types and the best survival estimation results for up to nine years after the diagnosis. Thus, silent mutations hold considerable predictive power over both cancer classification and prognosis, most likely due to their effect on gene expression. It is highly advised that silent mutations are integrated in cancer research in order to unravel the full genomic landscape of cancer and its ramifications on cancer fitness.
Affiliation(s)
- Tal Gutman
  - Department of Biomedical Engineering, the Engineering Faculty, Tel Aviv University, Tel-Aviv, Israel
- Guy Goren
  - Department of Electrical Engineering, the Engineering Faculty, Tel Aviv University, Tel-Aviv, Israel
- Omri Efroni
  - Department of Electrical Engineering, the Engineering Faculty, Tel Aviv University, Tel-Aviv, Israel
- Tamir Tuller
  - Department of Biomedical Engineering, the Engineering Faculty, Tel Aviv University, Tel-Aviv, Israel
10
Prabantu VM, Naveenkumar N, Srinivasan N. Influence of Disease-Causing Mutations on Protein Structural Networks. Front Mol Biosci 2021; 7:620554. [PMID: 33778000 PMCID: PMC7987782 DOI: 10.3389/fmolb.2020.620554] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 10/23/2020] [Accepted: 12/17/2020] [Indexed: 01/18/2023] Open
Abstract
The interactions between residues in a protein tertiary structure can be studied effectively using the approach of protein structure network (PSN). A PSN is a node-edge representation of the structure with nodes representing residues and interactions between residues represented by edges. In this study, we have employed weighted PSNs to understand the influence of disease-causing mutations on proteins of known 3D structures. We have used manually curated information on disease mutations from UniProtKB/Swiss-Prot and their corresponding protein structures of wildtype and disease variant from the protein data bank. The PSNs of the wildtype and disease-causing mutant are compared to analyse variation of global and local dissimilarity in the overall network and at specific sites. We study how a mutation at a given site can affect the structural network at a distant site which may be involved in the function of the protein. We have discussed specific examples of the disease cases where the protein structure undergoes limited structural divergence in their backbone but have large dissimilarity in their all atom networks and vice versa, wherein large conformational alterations are observed while retaining overall network. We analyse the effect of variation of network parameters that characterize alteration of function or stability.
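The node-edge idea in this abstract can be sketched in a few lines. This is a simplified illustration, not the authors' method: it assumes one coordinate per residue, an unweighted network with a hypothetical 7 Å distance cutoff, and Jaccard distance as the dissimilarity measure, whereas the study uses weighted all-atom networks. All coordinates are made up.

```python
import math


def contact_network(coords, cutoff=7.0):
    """Return the set of residue pairs (i, j), i < j, lying within `cutoff` of each other.

    Nodes are residue indices; an edge stands for an assumed distance-based interaction.
    """
    edges = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges.add((i, j))
    return edges


def edge_dissimilarity(edges_a, edges_b):
    """Jaccard distance between two edge sets: 0 means identical, 1 means no shared edge."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)


# Hypothetical wild-type and mutant coordinates (one point per residue, in angstroms).
wildtype = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0)]
mutant = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (7, 5, 0)]  # last residue displaced

wt_edges = contact_network(wildtype)
mut_edges = contact_network(mutant)
print(sorted(mut_edges - wt_edges))             # [(1, 3)]
print(edge_dissimilarity(wt_edges, mut_edges))  # 0.25
```

Displacing one residue adds a single contact, (1, 3), so the two networks share three of four edges overall. A per-residue (local) comparison could restrict both edge sets to the edges touching a given node.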
Affiliation(s)
- Nagarajan Naveenkumar
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India.
  - National Centre for Biological Sciences, TIFR, Bangalore, India.
  - Bharathidasan University, Tiruchirappalli, India
11
Zhang C, Wang X, Liu X, Fan Y, Zhang Y, Zhou X, Li W. A Novel 'Candidatus Liberibacter asiaticus'-Encoded Sec-Dependent Secretory Protein Suppresses Programmed Cell Death in Nicotiana benthamiana. Int J Mol Sci 2019; 20:E5802. [PMID: 31752214 PMCID: PMC6888338 DOI: 10.3390/ijms20225802] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/04/2019] [Revised: 11/12/2019] [Accepted: 11/15/2019] [Indexed: 12/28/2022] Open
Abstract
'Candidatus Liberibacter asiaticus' (CLas) is one of the causal agents of citrus Huanglongbing (HLB), a bacterial disease of citrus trees that greatly reduces fruit yield and quality. CLas strains produce an array of currently uncharacterized Sec-dependent secretory proteins. In this study, the conserved chromosomally encoded protein CLIBASIA_03875 was identified as a novel Sec-dependent secreted protein. We show that CLIBASIA_03875 contains a putative Sec-secretion signal peptide (SP) of 29 amino acid residues located at the N-terminus, with a mature protein (m3875) of 22 amino acids found to localize in multiple subcellular components of the leaf epidermal cells of Nicotiana benthamiana. When overexpressed via a Potato virus X (PVX)-based expression vector in N. benthamiana, m3875 suppressed programmed cell death (PCD) and the H2O2 accumulation triggered by the pro-apoptotic mouse protein BAX and the Phytophthora infestans elicitin INF1. Overexpression also resulted in a phenotype of dwarfing, leaf deformation and mosaics, suggesting that m3875 has roles in plant immune response, growth, and development. Substitution mutagenesis of the charged amino acids (D7, R9, R11, and K22) with alanine within m3875 did not recover the phenotypes for PCD and normal growth. In addition, the transiently overexpressed m3875 regulated the transcriptional levels of N. benthamiana orthologs of CNGCs (cyclic nucleotide-gated channels), BI-1 (Bax-inhibitor 1), and WRKY33 that are involved in plant defense mechanisms. To our knowledge, m3875 is the first PCD suppressor identified from CLas. Studying the function of this protein provides insight as to how CLas attenuates the host immune responses to proliferate and cause Huanglongbing disease in citrus plants.
Affiliation(s)
- Chao Zhang
  - Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100094, China
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Xuefeng Wang
  - Citrus Research Institute, Southwest University, Chongqing 400712, China
- Xuelu Liu
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  - Citrus Research Institute, Southwest University, Chongqing 400712, China
- Yanyan Fan
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  - College of Life Science, Shandong Normal University, Jinan 250014, China
- Yongqiang Zhang
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Xueping Zhou
  - Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100094, China
- Weimin Li
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
12
Zabel WJ, Hagner KP, Livesey BJ, Marsh JA, Setayeshgar S, Lynch M, Higgs PG. Evolution of protein interfaces in multimers and fibrils. J Chem Phys 2019; 150:225102. [PMID: 31202237 PMCID: PMC6561775 DOI: 10.1063/1.5086042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
A majority of cellular proteins function as part of multimeric complexes of two or more subunits. Multimer formation requires interactions between protein surfaces that lead to closed structures, such as dimers and tetramers. If proteins interact in an open-ended way, uncontrolled growth of fibrils can occur, which is likely to be detrimental in most cases. We present a statistical physics model that allows aggregation of proteins as either closed dimers or open fibrils of all lengths. We use pairwise amino-acid contact energies to calculate the energies of interacting protein surfaces. The probabilities of all possible aggregate configurations can be calculated for any given sequence of surface amino acids. We link the statistical physics model to a population genetics model that describes the evolution of the surface residues. When proteins evolve neutrally, without selection for or against multimer formation, we find that a majority of proteins remain as monomers at moderate concentrations, but strong dimer-forming or fibril-forming sequences are also possible. If selection is applied in favor of dimers or in favor of fibrils, then it is easy to select either dimer-forming or fibril-forming sequences. It is also possible to select for oriented fibrils with protein subunits all aligned in the same direction. We measure the propensities of amino acids to occur at interfaces relative to noninteracting surfaces and show that the propensities in our model are strongly correlated with those that have been measured in real protein structures. We also show that there are significant differences between amino acid frequencies at isologous and heterologous interfaces in our model, and we observe that similar effects occur in real protein structures.
Affiliation(s)
- W Jeffrey Zabel
  - Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada
- Kyle P Hagner
  - Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
- Benjamin J Livesey
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom
- Sima Setayeshgar
  - Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
- Michael Lynch
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona 85287, USA
- Paul G Higgs
  - Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada
13
Marinko J, Huang H, Penn WD, Capra JA, Schlebach JP, Sanders CR. Folding and Misfolding of Human Membrane Proteins in Health and Disease: From Single Molecules to Cellular Proteostasis. Chem Rev 2019; 119:5537-5606. [PMID: 30608666 PMCID: PMC6506414 DOI: 10.1021/acs.chemrev.8b00532] [Citation(s) in RCA: 180] [Impact Index Per Article: 30.0] [Received: 08/25/2018] [Indexed: 12/13/2022]
Abstract
Advances over the past 25 years have revealed much about how the structural properties of membranes and associated proteins are linked to the thermodynamics and kinetics of membrane protein (MP) folding. At the same time biochemical progress has outlined how cellular proteostasis networks mediate MP folding and manage misfolding in the cell. When combined with results from genomic sequencing, these studies have established paradigms for how MP folding and misfolding are linked to the molecular etiologies of a variety of diseases. This emerging framework has paved the way for the development of a new class of small molecule "pharmacological chaperones" that bind to and stabilize misfolded MP variants, some of which are now in clinical use. In this review, we comprehensively outline current perspectives on the folding and misfolding of integral MPs as well as the mechanisms of cellular MP quality control. Based on these perspectives, we highlight new opportunities for innovations that bridge our molecular understanding of the energetics of MP folding with the nuanced complexity of biological systems. Given the many linkages between MP misfolding and human disease, we also examine some of the exciting opportunities to leverage these advances to address emerging challenges in the development of therapeutics and precision medicine.
Affiliation(s)
- Justin T. Marinko
  - Department of Biochemistry, Vanderbilt University, Nashville, Tennessee 37240, United States
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
- Hui Huang
  - Department of Biochemistry, Vanderbilt University, Nashville, Tennessee 37240, United States
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
- Wesley D. Penn
  - Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- John A. Capra
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37245, United States
- Jonathan P. Schlebach
  - Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Charles R. Sanders
  - Department of Biochemistry, Vanderbilt University, Nashville, Tennessee 37240, United States
14
Kumar A, Biswas P. Effect of site-directed point mutations on protein misfolding: A simulation study. Proteins 2019; 87:760-773. [PMID: 31017329 DOI: 10.1002/prot.25702] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/22/2018] [Revised: 03/19/2019] [Accepted: 04/22/2019] [Indexed: 11/09/2022]
Abstract
A Monte Carlo simulation-based sequence design method is proposed to investigate the role of site-directed point mutations in protein misfolding. Site-directed point mutations are incorporated in the designed sequences of selected proteins. While most mutated sequences correctly fold to their native conformation, some of them stabilize in other nonnative conformations and thus misfold/unfold. The results suggest that a critical number of hydrophobic amino acid residues must be present in the core of the correctly folded proteins, whereas proteins misfold/unfold if this number of hydrophobic residues falls below the critical limit. A protein can accommodate only a particular number of hydrophobic residues at the surface, provided a large number of hydrophilic residues are present at the surface and critical hydrophobicity of the core is preserved. Some surface sites are observed to be as sensitive to site-directed point mutations as the core sites. Point mutations with highly polar and charged amino acids increase the misfold/unfold propensity of proteins. Substitution of natural amino acids at sites with different numbers of nonbonded contacts suggests that both amino acid identity and its respective site-specificity determine the stability of a protein. A clash-match method is developed to calculate the number of matching and clashing interactions in the mutated protein sequences. While misfolded/unfolded sequences have a higher number of clashing and a lower number of matching interactions, the correctly folded sequences have a lower number of clashing and a higher number of matching interactions. These results are valid for different SCOP classes of proteins.
Affiliation(s)
- Adesh Kumar
- Department of Chemistry, University of Delhi, Delhi, India
| | - Parbati Biswas
- Department of Chemistry, University of Delhi, Delhi, India
|
15
|
Venev SV, Zeldovich KB. Thermophilic Adaptation in Prokaryotes Is Constrained by Metabolic Costs of Proteostasis. Mol Biol Evol 2019; 35:211-224. [PMID: 29106597 PMCID: PMC5850847 DOI: 10.1093/molbev/msx282] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022] Open
Abstract
Prokaryotes evolved to thrive in an extremely diverse set of habitats, and their proteomes bear signatures of environmental conditions. Although correlations between amino acid usage and environmental temperature are well-documented, understanding of the mechanisms of thermal adaptation remains incomplete. Here, we couple the energetic costs of protein folding and protein homeostasis to build a microscopic model explaining both the overall amino acid composition and its temperature trends. Low biosynthesis costs lead to low diversity of physical interactions between amino acid residues, which in turn makes proteins less stable and drives up chaperone activity to maintain appropriate levels of folded, functional proteins. Assuming that the cost of chaperone activity is proportional to the fraction of unfolded client proteins, we simulated thermal adaptation of model proteins subject to minimization of the total cost of amino acid synthesis and chaperone activity. For the first time, we predicted both the proteome-average amino acid abundances and their temperature trends simultaneously, and found strong correlations between model predictions and 402 genomes of bacteria and archaea. The energetic constraint on protein evolution is more apparent in highly expressed proteins, selected by codon adaptation index. We found that in bacteria, highly expressed proteins are similar in composition to thermophilic ones, whereas in archaea no correlation between predicted expression level and thermostability was observed. At the same time, thermal adaptations of highly expressed proteins in bacteria and archaea are nearly identical, suggesting that universal energetic constraints prevail over the phylogenetic differences between these domains of life.
Affiliation(s)
- Sergey V Venev
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, MA
| | - Konstantin B Zeldovich
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, MA
|
16
|
Yan Z, Wang J. Superfunneled Energy Landscape of Protein Evolution Unifies the Principles of Protein Evolution, Folding, and Design. Phys Rev Lett 2019; 122:018103. [PMID: 31012725 DOI: 10.1103/physrevlett.122.018103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/06/2017] [Revised: 11/08/2018] [Indexed: 06/09/2023]
Abstract
Evolution is essential for shaping biological functions. Darwin proposed selection as the driving force for evolution acting upon mutations. While mutations are clear, the quantification of the selection force is still challenging. In this study, we identified and quantified both thermodynamic stability and kinetic accessibility as the selection forces for protein evolution. Protein evolution can be viewed and quantified as a trajectory moving along a superfunneled energy landscape with a line attractor at the bottom. The resulting evolved sequences and structures show strong protein characteristics including the hydrophobic core, high designability, and fast folding. The evolution principle uncovered here is validated on real proteins and sheds light on protein design.
Affiliation(s)
- Zhiqiang Yan
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
| | - Jin Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York 11790, USA
|
17
|
Röder K, Joseph JA, Husic BE, Wales DJ. Energy Landscapes for Proteins: From Single Funnels to Multifunctional Systems. Adv Theory Simul 2019. [DOI: 10.1002/adts.201800175] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 01/19/2023]
Affiliation(s)
- Konstantin Röder
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Jerelle A. Joseph
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Brooke E. Husic
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - David J. Wales
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
|
18
|
Yang Q, Han XM, Gu JK, Liu YJ, Yang MJ, Zeng QY. Functional and structural profiles of GST gene family from three Populus species reveal the sequence-function decoupling of orthologous genes. New Phytol 2019; 221:1060-1073. [PMID: 30204242 DOI: 10.1111/nph.15430] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/26/2018] [Accepted: 08/08/2018] [Indexed: 05/07/2023]
Abstract
A common assumption in comparative genomics is that orthologous genes are functionally more similar than paralogous genes. However, the validity of this assumption needs to be assessed using robust experimental data. We conducted tissue-specific gene expression and protein function analyses of orthologous groups within the glutathione S-transferase (GST) gene family in three closely related Populus species: Populus trichocarpa, Populus euphratica and Populus yatungensis. This study identified 21 GST orthologous groups in the three Populus species. Although the sequences of the GST orthologous groups were highly conserved, the divergence in enzymatic functions was prevalent. Through site-directed mutagenesis of orthologous proteins, this study revealed that nonsynonymous substitutions at key amino acid sites played an important role in the divergence of enzymatic functions. In particular, a single amino acid mutation (Arg39→Trp39) contributed to P. euphratica PeGSTU30 possessing high enzymatic activity via increasing the hydrophobicity of the active cavity. This study provided experimental evidence showing that orthologues belonging to the gene family have functional divergences. The nonsynonymous substitutions at a few amino acid sites resulted in functional divergence of the orthologous genes. Our findings provide new insights into the evolution of orthologous genes in closely related species.
Affiliation(s)
- Qi Yang
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, 100091, China
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Xue-Min Han
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
| | - Jin-Ke Gu
- State Key Laboratory of Biomembrane and Membrane Biotechnology, Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Yan-Jing Liu
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, 100091, China
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
| | - Mao-Jun Yang
- State Key Laboratory of Biomembrane and Membrane Biotechnology, Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Qing-Yin Zeng
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, 100091, China
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
|
19
|
Dasmeh P, Serohijos AWR. Estimating the contribution of folding stability to nonspecific epistasis in protein evolution. Proteins 2018; 86:1242-1250. [DOI: 10.1002/prot.25588] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/28/2018] [Revised: 06/28/2018] [Accepted: 07/18/2018] [Indexed: 12/28/2022]
Affiliation(s)
- Pouria Dasmeh
- Department of Biochemistry, University of Montreal, Montreal, Quebec, Canada
- Cedergren Center for Bioinformatics and Genomics, University of Montreal, Montreal, Quebec, Canada
- Department of Biochemistry and Institute for Data Valorization (IVADO), University of Montreal, Montreal, Quebec, Canada
| | - Adrian W. R. Serohijos
- Department of Biochemistry, University of Montreal, Montreal, Quebec, Canada
- Cedergren Center for Bioinformatics and Genomics, University of Montreal, Montreal, Quebec, Canada
|
20
|
Finch AJ, Kim JR. Thermophilic Proteins as Versatile Scaffolds for Protein Engineering. Microorganisms 2018; 6:97. [PMID: 30257429 PMCID: PMC6313779 DOI: 10.3390/microorganisms6040097] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 09/02/2018] [Revised: 09/23/2018] [Accepted: 09/23/2018] [Indexed: 01/18/2023] Open
Abstract
Literature from the past two decades has outlined the existence of a trade-off between protein stability and function. This trade-off creates a unique challenge for protein engineers who seek to introduce new functionality to proteins. These engineers must carefully balance the mutation-mediated creation and/or optimization of function with the destabilizing effect of those mutations. Subsequent research has shown that protein stability is positively correlated with "evolvability" or the ability to support mutations which bestow new functionality on the protein. Since the ultimate goal of protein engineering is to create and/or optimize a protein's function, highly stable proteins are preferred as potential scaffolds for protein engineering. This review focuses on the application potential for thermophilic proteins as scaffolds for protein engineering. The relatively high inherent thermostability of these proteins grants them a great deal of mutational robustness, making them promising scaffolds for various protein engineering applications. Comparative studies on the evolvability of thermophilic and mesophilic proteins have strongly supported the argument that thermophilic proteins are more evolvable than mesophilic proteins. These findings indicate that thermophilic proteins may represent the scaffold of choice for protein engineering in the future.
Affiliation(s)
- Anthony J Finch
- Department of Chemical and Biomolecular Engineering, New York University, 6 MetroTech Center, Brooklyn, NY 11201, USA.
| | - Jin Ryoun Kim
- Department of Chemical and Biomolecular Engineering, New York University, 6 MetroTech Center, Brooklyn, NY 11201, USA.
|
21
|
Joseph JA, Röder K, Chakraborty D, Mantell RG, Wales DJ. Exploring biomolecular energy landscapes. Chem Commun (Camb) 2018; 53:6974-6988. [PMID: 28489083 DOI: 10.1039/c7cc02413d] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Indexed: 11/21/2022]
Abstract
The potential energy landscape perspective provides both a conceptual and a computational framework for predicting, understanding and designing molecular properties. In this Feature Article, we highlight some recent advances that greatly facilitate structure prediction and analysis of global thermodynamics and kinetics in proteins and nucleic acids. The geometry optimisation procedures, on which these calculations are based, can be accelerated significantly using local rigidification of selected degrees of freedom, and through implementations on graphics processing units. Results of progressive local rigidification are first summarised for trpzip1, including a systematic analysis of the heat capacity and rearrangement rates. Benchmarks for all the essential optimisation procedures are then provided for a variety of proteins. Applications are then illustrated from a study of how mutation affects the energy landscape for a coiled-coil protein, and for transitions in helix morphology for a DNA duplex. Both systems exhibit an intrinsically multifunnel landscape, with the potential to act as biomolecular switches.
Affiliation(s)
- Jerelle A Joseph
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Konstantin Röder
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Debayan Chakraborty
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK, and Department of Chemistry, The University of Texas at Austin, 24th Street Stop A5300, Austin, TX 78712, USA
| | - Rosemary G Mantell
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - David J Wales
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
|
22
|
Kubyshkin V, Acevedo-Rocha CG, Budisa N. On universal coding events in protein biogenesis. Biosystems 2018; 164:16-25. [DOI: 10.1016/j.biosystems.2017.10.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 07/10/2017] [Revised: 10/02/2017] [Accepted: 10/03/2017] [Indexed: 12/14/2022]
|
23
|
Effects of Distal Mutations on the Structure, Dynamics and Catalysis of Human Monoacylglycerol Lipase. Sci Rep 2018; 8:1719. [PMID: 29379013 PMCID: PMC5789057 DOI: 10.1038/s41598-017-19135-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/13/2017] [Accepted: 12/20/2017] [Indexed: 02/06/2023] Open
Abstract
An understanding of how conformational dynamics modulates function and catalysis of human monoacylglycerol lipase (hMGL), an important pharmaceutical target, can facilitate the development of novel ligands with potential therapeutic value. Here, we report the discovery and characterization of an allosteric, regulatory hMGL site comprised of residues Trp-289 and Leu-232 that reside over 18 Å away from the catalytic triad. These residues were identified as critical mediators of long-range communication and as important contributors to the integrity of the hMGL structure. Nonconservative replacements of Trp-289 or Leu-232 triggered concerted motions of structurally distinct regions with a significant conformational shift toward inactive states and dramatic loss in catalytic efficiency of the enzyme. Using a multimethod approach, we show that the dynamically relevant Trp-289 and Leu-232 residues serve as communication hubs within an allosteric protein network that controls signal propagation to the active site, and thus, regulates active-inactive interconversion of hMGL. Our findings provide new insights into the mechanism of allosteric regulation of lipase activity, in general, and may provide alternative drug design possibilities.
|
24
|
Röder K, Wales DJ. Transforming the Energy Landscape of a Coiled-Coil Peptide via Point Mutations. J Chem Theory Comput 2017; 13:1468-1477. [DOI: 10.1021/acs.jctc.7b00024] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 11/28/2022]
Affiliation(s)
- Konstantin Röder
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, U.K.
| | - David J. Wales
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, U.K.
|
25
|
Bershtein S, Serohijos AW, Shakhnovich EI. Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. Curr Opin Struct Biol 2016; 42:31-40. [PMID: 27810574 DOI: 10.1016/j.sbi.2016.10.013] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 08/25/2016] [Accepted: 10/14/2016] [Indexed: 01/11/2023]
Abstract
Bridging the gap between the molecular properties of proteins and organismal/population fitness is essential for understanding evolutionary processes. This task requires the integration of the several physical scales of biological organization, each defined by a distinct set of mechanisms and constraints, into a single unifying model. The molecular scale is dominated by the constraints imposed by the physico-chemical properties of proteins and their substrates, which give rise to trade-offs and epistatic (non-additive) effects of mutations. At the systems scale, biological networks modulate protein expression and can either buffer or enhance the fitness effects of mutations. The population scale is influenced by the mutational input, selection regimes, and stochastic changes affecting the size and structure of populations, which eventually determine the evolutionary fate of mutations. Here, we summarize the recent advances in theory, computer simulations, and experiments that advance our understanding of the links between various physical scales in biology.
Affiliation(s)
- Shimon Bershtein
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84501, Israel
| | - Adrian WR Serohijos
- Département de Biochimie, Centre Robert-Cedergren en Bioinformatique & Génomique, Université de Montréal, Montréal, QC H3T 1J4, Canada
| | - Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, United States.
|
26
|
Venev SV, Zeldovich KB. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation. J Chem Phys 2016; 143:055101. [PMID: 26254668 DOI: 10.1063/1.4927565] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/01/2023] Open
Abstract
Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.
Affiliation(s)
- Sergey V Venev
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, Massachusetts 01605, USA
| | - Konstantin B Zeldovich
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, Massachusetts 01605, USA
|
27
|
Berka K, Laskowski R, Riley KE, Hobza P, Vondrášek J. Representative Amino Acid Side Chain Interactions in Proteins. A Comparison of Highly Accurate Correlated ab Initio Quantum Chemical and Empirical Potential Procedures. J Chem Theory Comput 2015; 5:982-92. [PMID: 26609607 DOI: 10.1021/ct800508v] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 01/30/2023]
Abstract
Interactions between amino acid side chains play a crucial role both within a folded protein and between the interacting protein molecules. Here we have selected a representative set of 24 of the 400 (20 × 20) possible interacting side chain pairs based on data from Atlas of Protein Side-Chain Interactions. For each pair, we obtained its most favorable interaction geometry from the structural data and computed the interaction energy in the gas phase using several different, commonly used, ab initio and force field methods, namely Møller-Plesset perturbation theory (MP2), density functional theory combined with symmetry-adapted perturbation theory (DFT-SAPT), density functional theory empirically augmented with an empirical dispersion term (DFT-D), and empirical potentials using the OPLS-AA/L and Amber03 force fields. All the methods were compared against a reference method taken to be the CCSD(T) level of theory extrapolated to the complete basis set limit. We found a high degree of agreement between the different methods, even though the range of binding energies obtained was extremely large. The most computationally intensive methods yielded the best results. Among the less computationally time-consuming methods, the DFT-D method as well as parm03 force field provided consistently good results when compared to the reference values. We also tested how representative the chosen geometries of the side chains were and investigated the effect on the binding energies of the dielectric constant of the surrounding medium.
Affiliation(s)
- Karel Berka
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Complex Molecular Systems and Biomolecules, Flemingovo náměstí 2, Prague 6, 166 10 Czech Republic, Department of Physical and Macromolecular Chemistry, Faculty of Natural Sciences, Charles University in Prague, Hlavova 8, Prague 2, 128 43 Czech Republic, EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K., and Department of Chemistry, P.O. Box 23346, University of Puerto Rico, Rio Piedras, Puerto Rico 00931
| | - Roman Laskowski
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Complex Molecular Systems and Biomolecules, Flemingovo náměstí 2, Prague 6, 166 10 Czech Republic, Department of Physical and Macromolecular Chemistry, Faculty of Natural Sciences, Charles University in Prague, Hlavova 8, Prague 2, 128 43 Czech Republic, EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K., and Department of Chemistry, P.O. Box 23346, University of Puerto Rico, Rio Piedras, Puerto Rico 00931
| | - Kevin E Riley
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Complex Molecular Systems and Biomolecules, Flemingovo náměstí 2, Prague 6, 166 10 Czech Republic, Department of Physical and Macromolecular Chemistry, Faculty of Natural Sciences, Charles University in Prague, Hlavova 8, Prague 2, 128 43 Czech Republic, EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K., and Department of Chemistry, P.O. Box 23346, University of Puerto Rico, Rio Piedras, Puerto Rico 00931
| | - Pavel Hobza
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Complex Molecular Systems and Biomolecules, Flemingovo náměstí 2, Prague 6, 166 10 Czech Republic, Department of Physical and Macromolecular Chemistry, Faculty of Natural Sciences, Charles University in Prague, Hlavova 8, Prague 2, 128 43 Czech Republic, EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K., and Department of Chemistry, P.O. Box 23346, University of Puerto Rico, Rio Piedras, Puerto Rico 00931
| | - Jiří Vondrášek
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Complex Molecular Systems and Biomolecules, Flemingovo náměstí 2, Prague 6, 166 10 Czech Republic, Department of Physical and Macromolecular Chemistry, Faculty of Natural Sciences, Charles University in Prague, Hlavova 8, Prague 2, 128 43 Czech Republic, EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K., and Department of Chemistry, P.O. Box 23346, University of Puerto Rico, Rio Piedras, Puerto Rico 00931
|
28
|
Acceleration of protein folding by four orders of magnitude through a single amino acid substitution. Sci Rep 2015; 5:11840. [PMID: 26121966 PMCID: PMC4485320 DOI: 10.1038/srep11840] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 03/10/2015] [Accepted: 06/04/2015] [Indexed: 11/23/2022] Open
Abstract
Cis prolyl peptide bonds are conserved structural elements in numerous protein families, although their formation is energetically unfavorable, intrinsically slow and often rate-limiting for folding. Here we investigate the reasons underlying the conservation of the cis proline that is diagnostic for the fold of thioredoxin-like thiol-disulfide oxidoreductases. We show that replacement of the conserved cis proline in thioredoxin by alanine can accelerate spontaneous folding to the native, thermodynamically most stable state by more than four orders of magnitude. However, the resulting trans alanine bond leads to small structural rearrangements around the active site that impair the function of thioredoxin as catalyst of electron transfer reactions by more than 100-fold. Our data provide evidence for the absence of a strong evolutionary pressure to achieve intrinsically fast folding rates, which is most likely a consequence of proline isomerases and molecular chaperones that guarantee high in vivo folding rates and yields.
|
29
|
Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015; 11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 163] [Impact Index Per Article: 16.3] [Indexed: 01/05/2023] Open
Abstract
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence-structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by 'hidden' conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
Affiliation(s)
- Tobias Sikosek
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
| | - Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
|
30
|
Massey SE. Genetic code evolution reveals the neutral emergence of mutational robustness, and information as an evolutionary constraint. Life (Basel) 2015; 5:1301-32. [PMID: 25919033 PMCID: PMC4500140 DOI: 10.3390/life5021301] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 03/02/2015] [Revised: 04/02/2015] [Accepted: 04/03/2015] [Indexed: 01/09/2023] Open
Abstract
The standard genetic code (SGC) is central to molecular biology and its origin and evolution is a fundamental problem in evolutionary biology, the elucidation of which promises to reveal much about the origins of life. In addition, we propose that study of its origin can also reveal some fundamental and generalizable insights into mechanisms of molecular evolution, utilizing concepts from complexity theory. The first is that beneficial traits may arise by non-adaptive processes, via a process of "neutral emergence". The structure of the SGC is optimized for the property of error minimization, which reduces the deleterious impact of point mutations. Via simulation, it can be shown that genetic codes with error minimization superior to the SGC can emerge in a neutral fashion simply by a process of genetic code expansion via tRNA and aminoacyl-tRNA synthetase duplication, whereby similar amino acids are added to codons related to that of the parent amino acid. This process of neutral emergence has implications beyond that of the genetic code, as it suggests that not all beneficial traits have arisen by the direct action of natural selection; we term these "pseudaptations", and discuss a range of potential examples. Secondly, consideration of genetic code deviations (codon reassignments) reveals that these are mostly associated with a reduction in proteome size. This code malleability implies the existence of a proteomic constraint on the genetic code, proportional to the size of the proteome (P), and that its reduction in size leads to an "unfreezing" of the codon - amino acid mapping that defines the genetic code, consistent with Crick's Frozen Accident theory. 
The concept of a proteomic constraint may be extended to propose a general informational constraint on genetic fidelity, which may be used to explain, variously, differences in mutation rates in genomes with differing proteome sizes, differences in DNA repair capacity and genome GC content between organisms, a selective pressure in the evolution of sexual reproduction, and differences in translational fidelity. Lastly, the utility of the concept of an informational constraint to other diverse fields of research is explored.
Affiliation(s)
- Steven E Massey
- Biology Department, PO Box 23360, University of Puerto Rico-Rio Piedras, San Juan, PR 00931, USA.
|
31
|
Hemery M, Rivoire O. Evolution of sparsity and modularity in a model of protein allostery. Phys Rev E Stat Nonlin Soft Matter Phys 2015; 91:042704. [PMID: 25974524 DOI: 10.1103/physreve.91.042704] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/01/2014] [Indexed: 06/04/2023]
Abstract
The sequence of a protein is not only constrained by its physical and biochemical properties under current selection, but also by features of its past evolutionary history. Understanding the extent and the form that these evolutionary constraints may take is important to interpret the information in protein sequences. To study this problem, we introduce a simple but physical model of protein evolution where selection targets allostery, the functional coupling of distal sites on protein surfaces. This model shows how the geometrical organization of couplings between amino acids within a protein structure can depend crucially on its evolutionary history. In particular, two scenarios are found to generate a spatial concentration of functional constraints: high mutation rates and fluctuating selective pressures. This second scenario offers a plausible explanation for the high tolerance of natural proteins to mutations and for the spatial organization of their least tolerant amino acids, as revealed by sequence analysis and mutagenesis experiments. It also implies a faculty to adapt to new selective pressures that is consistent with observations. The model illustrates how several independent functional modules may emerge within the same protein structure, depending on the nature of past environmental fluctuations. Our model thus relates the evolutionary history of proteins to the geometry of their functional constraints, with implications for decoding and engineering protein sequences.
Affiliation(s)
- Mathieu Hemery
- ESPCI ParisTech, PCT, Gulliver, F-75005, Paris, France
- CNRS, LIPhy, F-38000 Grenoble, France
- Univ. Grenoble Alpes, LIPhy, F-38000 Grenoble, France
| | - Olivier Rivoire
- CNRS, LIPhy, F-38000 Grenoble, France
- Univ. Grenoble Alpes, LIPhy, F-38000 Grenoble, France
|
32
|
Currin A, Swainston N, Day PJ, Kell DB. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. Chem Soc Rev 2015; 44:1172-239. [PMID: 25503938 PMCID: PMC4349129 DOI: 10.1039/c4cs00351a] [Citation(s) in RCA: 258] [Impact Index Per Article: 25.8] [Received: 10/20/2014] [Indexed: 12/21/2022]
Abstract
The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the 'search space' of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the 'best' amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. 
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust.
Affiliation(s)
- Andrew Currin
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Neil Swainston
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- School of Computer Science, The University of Manchester, Manchester M13 9PL, UK
| | - Philip J. Day
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Faculty of Medical and Human Sciences, The University of Manchester, Manchester M13 9PT, UK
| | - Douglas B. Kell
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
|
33
|
Abstract
Tracking the evolution of thermostability in resurrected ancestors of a heat-tolerant extremophile protein and its less heat tolerant Escherichia coli homologue shows how thermostability has probably explored different mechanisms of protein stabilization over evolutionary time. Proteins from thermophiles are generally more thermostable than their mesophilic homologs, but little is known about the evolutionary process driving these differences. Here we attempt to understand how the diverse thermostabilities of bacterial ribonuclease H1 (RNH) proteins evolved. RNH proteins from Thermus thermophilus (ttRNH) and Escherichia coli (ecRNH) share similar structures but differ in melting temperature (Tm) by 20°C. ttRNH's greater stability is caused in part by the presence of residual structure in the unfolded state, which results in a low heat capacity of unfolding (ΔCp) relative to ecRNH. We first characterized RNH proteins from a variety of extant bacteria and found that Tm correlates with the species' growth temperatures, consistent with environmental selection for stability. We then used ancestral sequence reconstruction to statistically infer evolutionary intermediates along lineages leading to ecRNH and ttRNH from their common ancestor, which existed approximately 3 billion years ago. Finally, we synthesized and experimentally characterized these intermediates. The shared ancestor has a melting temperature between those of ttRNH and ecRNH; the Tms of intermediate ancestors along the ttRNH lineage increased gradually over time, while the ecRNH lineage exhibited an abrupt drop in Tm followed by relatively little change. To determine whether the underlying mechanisms for thermostability correlate with the changes in Tm, we measured the thermodynamic basis for stabilization—ΔCp and other thermodynamic parameters—for each of the ancestors. We observed that, while the Tm changes smoothly, the mechanistic basis for stability fluctuates over evolutionary time. 
Thus, even while overall stability appears to be strongly driven by selection, the proteins explored a wide variety of mechanisms of stabilization, a phenomenon we call “thermodynamic system drift.” This suggests that even on lineages with strong selection to increase stability, proteins have wide latitude to explore sequence space, generating biophysical diversity and potentially opening new evolutionary pathways. The biophysical properties of proteins must adjust to accommodate environmental temperatures because of the narrow range over which any given protein sequence can remain folded and functional. We compared the evolution of homologous bacterial enzymes (ribonucleases H1) from two lineages: one from Escherichia coli, which live at moderate temperatures, the other from Thermus thermophilus, which live at extremely high temperatures. Our aim was to investigate how these structurally homologous proteins can have such different thermostabilities, unfolding at temperatures that are 20°C apart. We used bioinformatics to reconstruct the sequences of ancestral proteins along each lineage, synthesized the proteins in the lab, and experimentally traced the evolution of ribonuclease H1 stability. While thermostability appears to have been strongly shaped by selection, the biophysical mechanisms used to tune protein stability appear to have varied throughout evolutionary history; this suggests that proteins have wide latitude to explore different mechanisms of stabilization, generating biophysical diversity and opening up new evolutionary pathways.
|
34
|
Dasmeh P, Serohijos AWR, Kepp KP, Shakhnovich EI. The influence of selection for protein stability on dN/dS estimations. Genome Biol Evol 2014; 6:2956-67. [PMID: 25355808 PMCID: PMC4224349 DOI: 10.1093/gbe/evu223] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022] Open
Abstract
Understanding the relative contributions of various evolutionary processes-purifying selection, neutral drift, and adaptation-is fundamental to evolutionary biology. A common metric to distinguish these processes is the ratio of nonsynonymous to synonymous substitutions (i.e., dN/dS) interpreted from the neutral theory as a null model. However, from biophysical considerations, mutations have non-negligible effects on the biophysical properties of proteins such as folding stability. In this work, we investigated how stability affects the rate of protein evolution in phylogenetic trees by using simulations that combine explicit protein sequences with associated stability changes. We first simulated myoglobin evolution in phylogenetic trees with a biophysically realistic approach that accounts for 3D structural information and estimates of changes in stability upon mutation. We then compared evolutionary rates inferred directly from simulation to those estimated using maximum-likelihood (ML) methods. We found that the dN/dS estimated by ML methods (ωML) is highly predictive of the per gene dN/dS inferred from the simulated phylogenetic trees. This agreement is strong in the regime of high stability where protein evolution is neutral. At low folding stabilities and under mutation-selection balance, we observe deviations from neutrality (per gene dN/dS > 1 and dN/dS < 1). We showed that although per gene dN/dS is robust to these deviations, ML tests for positive selection detect statistically significant per site dN/dS > 1. Altogether, we show how protein biophysics affects the dN/dS estimations and its subsequent interpretation. These results are important for improving the current approaches for detecting positive selection.
Affiliation(s)
- Pouria Dasmeh
- Department of Chemistry and Chemical Biology, Harvard University; DTU Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark; present address: Max Planck Institute of Immunobiology and Epigenetics, Stübeweg, Freiburg, Germany
| | | | - Kasper P Kepp
- DTU Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
|
35
|
Schafer NP, Kim BL, Zheng W, Wolynes PG. Learning To Fold Proteins Using Energy Landscape Theory. Isr J Chem 2014; 54:1311-1337. [PMID: 25308991 PMCID: PMC4189132 DOI: 10.1002/ijch.201300145] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Indexed: 11/10/2022]
Abstract
This review is a tutorial for scientists interested in the problem of protein structure prediction, particularly those interested in using coarse-grained molecular dynamics models that are optimized using lessons learned from the energy landscape theory of protein folding. We also present a review of the results of the AMH/AMC/AMW/AWSEM family of coarse-grained molecular dynamics protein folding models to illustrate the points covered in the first part of the article. Accurate coarse-grained structure prediction models can be used to investigate a wide range of conceptual and mechanistic issues outside of protein structure prediction; specifically, the paper concludes by reviewing how AWSEM has in recent years been able to elucidate questions related to the unusual kinetic behavior of artificially designed proteins, multidomain protein misfolding, and the initial stages of protein aggregation.
Affiliation(s)
- N P Schafer
- Department of Physics, Rice University, Houston, TX 77005, USA ; Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
| | - B L Kim
- Department of Chemistry, Rice University, Houston, TX 77005, USA ; Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
| | - W Zheng
- Department of Chemistry, Rice University, Houston, TX 77005, USA ; Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
| | - P G Wolynes
- Department of Physics, Rice University, Houston, TX 77005, USA ; Department of Chemistry, Rice University, Houston, TX 77005, USA ; Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
|
36
|
Kepp KP, Dasmeh P. A model of proteostatic energy cost and its use in analysis of proteome trends and sequence evolution. PLoS One 2014; 9:e90504. [PMID: 24587382 PMCID: PMC3938754 DOI: 10.1371/journal.pone.0090504] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 11/08/2013] [Accepted: 02/03/2014] [Indexed: 12/25/2022] Open
Abstract
A model of proteome-associated chemical energetic costs of cells is derived from protein-turnover kinetics and protein folding. Minimization of the proteostatic maintenance cost can explain a range of trends of proteomes and combines both protein function, stability, size, proteostatic cost, temperature, resource availability, and turnover rates in one simple framework. We then explore the ansatz that the chemical energy remaining after proteostatic maintenance is available for reproduction (or cell division) and thus, proportional to organism fitness. Selection for lower proteostatic costs is then shown to be significant vs. typical effective population sizes of yeast. The model explains and quantifies evolutionary conservation of highly abundant proteins as arising both from functional mutations and from changes in other properties such as stability, cost, or turnover rates. We show that typical hypomorphic mutations can be selected against due to increased cost of compensatory protein expression (both in the mutated gene and in related genes, i.e. epistasis) rather than compromised function itself, although this compensation depends on the protein's importance. Such mutations exhibit larger selective disadvantage in abundant, large, synthetically costly, and/or short-lived proteins. Selection against increased turnover costs of less stable proteins rather than misfolding toxicity per se can explain equilibrium protein stability distributions, in agreement with recent findings in E. coli. The proteostatic selection pressure is stronger at low metabolic rates (i.e. scarce environments) and in hot habitats, explaining proteome adaptations towards rough environments as a question of energy. The model may also explain several trade-offs observed in protein evolution and suggests how protein properties can coevolve to maintain low proteostatic cost.
Affiliation(s)
- Kasper P. Kepp
- Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Pouria Dasmeh
- Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
|
37
|
Pagan RF, Massey SE. A nonadaptive origin of a beneficial trait: in silico selection for free energy of folding leads to the neutral emergence of mutational robustness in single domain proteins. J Mol Evol 2013; 78:130-9. [PMID: 24362542 DOI: 10.1007/s00239-013-9606-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/06/2013] [Accepted: 12/04/2013] [Indexed: 10/25/2022]
Abstract
Proteins are regarded as being robust to the deleterious effects of mutations. Here, the neutral emergence of mutational robustness in a population of single domain proteins is explored using computer simulations. A pairwise contact model was used to calculate the ΔG of folding (ΔG folding) using the three dimensional protein structure of leech eglin C. A random amino acid sequence with low mutational robustness, defined as the average ΔΔG resulting from a point mutation (ΔΔG average), was threaded onto the structure. A population of 1,000 threaded sequences was evolved under selection for stability, using an upper and lower energy threshold. Under these conditions, mutational robustness increased over time in the most common sequence in the population. In contrast, when the wild type sequence was used it did not show an increase in robustness. This implies that the emergence of mutational robustness is sequence specific and that wild type sequences may be close to maximal robustness. In addition, an inverse relationship between ∆∆G average and protein stability is shown, resulting partly from a larger average effect of point mutations in more stable proteins. The emergence of mutational robustness was also observed in the Escherichia coli colE1 Rop and human CD59 proteins, implying that the property may be common in single domain proteins under certain simulation conditions. The results indicate that at least a portion of mutational robustness in small globular proteins might have arisen by a process of neutral emergence, and could be an example of a beneficial trait that has not been directly selected for, termed a "pseudaptation."
Affiliation(s)
- Rafael F Pagan
- Physics Department, University of Puerto Rico - Rio Piedras, San Juan, PR, USA
|
38
|
Zhang X, Perica T, Teichmann SA. Evolution of protein structures and interactions from the perspective of residue contact networks. Curr Opin Struct Biol 2013; 23:954-63. [DOI: 10.1016/j.sbi.2013.07.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 06/20/2013] [Revised: 07/02/2013] [Accepted: 07/04/2013] [Indexed: 10/26/2022]
|
39
|
Pica A, Merlino A, Buell AK, Knowles TPJ, Pizzo E, D'Alessio G, Sica F, Mazzarella L. Three-dimensional domain swapping and supramolecular protein assembly: insights from the X-ray structure of a dimeric swapped variant of human pancreatic RNase. Acta Crystallogr D Biol Crystallogr 2013; 69:2116-23. [PMID: 24100329 DOI: 10.1107/s0907444913020507] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 06/14/2013] [Accepted: 07/23/2013] [Indexed: 11/10/2022]
Abstract
The deletion of five residues in the loop connecting the N-terminal helix to the core of monomeric human pancreatic ribonuclease leads to the formation of an enzymatically active domain-swapped dimer (desHP). The crystal structure of desHP reveals the generation of an intriguing fibril-like aggregate of desHP molecules that extends along the c crystallographic axis. Dimers are formed by three-dimensional domain swapping. Tetramers are formed by the aggregation of swapped dimers with slightly different quaternary structures. The tetramers interact in such a way as to form an infinite rod-like structure that propagates throughout the crystal. The observed supramolecular assembly captured in the crystal predicts that desHP fibrils could form in solution; this has been confirmed by atomic force microscopy. These results provide new evidence that three-dimensional domain swapping can be a mechanism for the formation of elaborate large assemblies in which the protein, apart from the swapping, retains its original fold.
Affiliation(s)
- Andrea Pica
- Department of Chemical Sciences, University of Naples 'Federico II', Via Cintia, 80126 Naples, Italy
|
40
|
Arenas M, Dos Santos HG, Posada D, Bastolla U. Protein evolution along phylogenetic histories under structurally constrained substitution models. Bioinformatics 2013; 29:3020-8. [PMID: 24037213 DOI: 10.1093/bioinformatics/btt530] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Models of molecular evolution aim at describing the evolutionary processes at the molecular level. However, current models rarely incorporate information from protein structure. Conversely, structure-based models of protein evolution have not been commonly applied to simulate sequence evolution in a phylogenetic framework, and they often ignore relevant evolutionary processes such as recombination. A simulation evolutionary framework that integrates substitution models that account for protein structure stability should be able to generate more realistic in silico evolved proteins for a variety of purposes. RESULTS: We developed a method to simulate protein evolution that combines models of protein folding stability, such that the fitness depends on the stability of the native state both with respect to unfolding and misfolding, with phylogenetic histories that can be either specified by the user or simulated with the coalescent under complex evolutionary scenarios, including recombination, demographics and migration. We have implemented this framework in a computer program called ProteinEvolver. Remarkably, comparing these models with empirical amino acid replacement models, we found that the former produce amino acid distributions closer to distributions observed in real protein families, and proteins that are predicted to be more stable. Therefore, we conclude that evolutionary models that consider protein stability and realistic evolutionary histories constitute a better approximation of the real evolutionary process.
Affiliation(s)
- Miguel Arenas
- Centre for Molecular Biology 'Severo Ochoa', Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain and Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
|
41
|
Galán JC, González-Candelas F, Rolain JM, Cantón R. Antibiotics as selectors and accelerators of diversity in the mechanisms of resistance: from the resistome to genetic plasticity in the β-lactamases world. Front Microbiol 2013; 4:9. [PMID: 23404545 PMCID: PMC3567504 DOI: 10.3389/fmicb.2013.00009] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 12/15/2012] [Accepted: 01/09/2013] [Indexed: 11/13/2022] Open
Abstract
Antibiotics and antibiotic resistance determinants, natural molecules closely related to bacterial physiology and consistent with an ancient origin, are not only present in antibiotic-producing bacteria. High-throughput sequencing technologies have revealed an unexpected reservoir of antibiotic resistance in the environment. These data suggest that co-evolution between antibiotic and antibiotic resistance genes has occurred since the beginning of time. This evolutionary race has probably been slow because of highly regulated processes and low antibiotic concentrations. Therefore, to understand this global problem, a new variable must be introduced: that antibiotic resistance is a natural event, inherent to life. However, the industrial production of natural and synthetic antibiotics has dramatically accelerated this race, selecting some of the many resistance genes present in nature and contributing to their diversification. The β-lactamases are one of the best models available to understand the biological impact of selection and diversification. They constitute the most widespread mechanism of resistance, at least among pathogenic bacteria, with more than 1000 enzymes identified in the literature. In recent years, there has been growing concern about the description, spread, and diversification of β-lactamases with carbapenemase activity and AmpC-type in plasmids. Phylogenies of these enzymes help the understanding of the evolutionary forces driving their selection. Moreover, understanding the adaptive potential of β-lactamases contributes to the exploration of evolutionary antagonist trajectories through the design of more efficient synthetic molecules. In this review, we attempt to analyze the antibiotic resistance problem from intrinsic and environmental resistomes to the adaptive potential of resistance genes and the driving forces involved in their diversification, in order to provide a global perspective of the resistance problem.
Affiliation(s)
- Juan-Carlos Galán
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal Madrid, Spain ; Centros de Investigación Biomédica en Red en Epidemiología y Salud Pública, Instituto Ramón y Cajal de Investigación Sanitaria Madrid, Spain ; Unidad de Resistencia a Antibióticos y Virulencia Bacteriana Asociada al Consejo Superior de Investigaciones Científicas Madrid, Spain
|
42
|
Eick GN, Colucci JK, Harms MJ, Ortlund EA, Thornton JW. Evolution of minimal specificity and promiscuity in steroid hormone receptors. PLoS Genet 2012; 8:e1003072. [PMID: 23166518 PMCID: PMC3499368 DOI: 10.1371/journal.pgen.1003072] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Received: 07/23/2012] [Accepted: 09/21/2012] [Indexed: 01/02/2023] Open
Abstract
Most proteins are regulated by physical interactions with other molecules; some are highly specific, but others interact with many partners. Despite much speculation, we know little about how and why specificity/promiscuity evolves in natural proteins. It is widely assumed that specific proteins evolved from more promiscuous ancient forms and that most proteins' specificity has been tuned to an optimal state by selection. Here we use ancestral protein reconstruction to trace the evolutionary history of ligand recognition in the steroid hormone receptors (SRs), a family of hormone-regulated animal transcription factors. We resurrected the deepest ancestral proteins in the SR family and characterized the structure-activity relationships by which they distinguished among ligands. We found that the most ancient split in SR evolution involved a discrete switch from an ancient receptor for aromatized estrogens--including xenobiotics--to a derived receptor that recognized non-aromatized progestagens and corticosteroids. The family's history, viewed in relation to the evolution of their ligands, suggests that SRs evolved according to a principle of minimal specificity: at each point in time, receptors evolved ligand recognition criteria that were just specific enough to parse the set of endogenous substances to which they were exposed. By studying the atomic structures of resurrected SR proteins, we found that their promiscuity evolved because the ancestral binding cavity was larger than the primary ligand and contained excess hydrogen bonding capacity, allowing adventitious recognition of larger molecules with additional functional groups. Our findings provide an historical explanation for the sensitivity of modern SRs to natural and synthetic ligands--including endocrine-disrupting drugs and pollutants--and show that knowledge of history can contribute to ligand prediction.
They suggest that SR promiscuity may reflect the limited power of selection within real biological systems to discriminate between perfect and "good enough."
Affiliation(s)
- Geeta N. Eick
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, United States of America
- Howard Hughes Medical Institute, Eugene, Oregon, United States of America
| | - Jennifer K. Colucci
- Biochemistry Department, Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - Michael J. Harms
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, United States of America
| | - Eric A. Ortlund
- Biochemistry Department, Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - Joseph W. Thornton
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, United States of America
- Howard Hughes Medical Institute, Eugene, Oregon, United States of America
- Department of Human Genetics and Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
|
43
|
Mittenthal J, Caetano-Anollés D, Caetano-Anollés G. Biphasic patterns of diversification and the emergence of modules. Front Genet 2012; 3:147. [PMID: 22891076 PMCID: PMC3413098 DOI: 10.3389/fgene.2012.00147] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 05/22/2012] [Accepted: 07/19/2012] [Indexed: 01/08/2023] Open
Abstract
The intricate molecular and cellular structure of organisms converts energy to work, which builds and maintains structure. Evolving structure implements modules, in which parts are tightly linked. Each module performs characteristic functions. In this work we propose that a module can emerge through two phases of diversification of parts. Early in the first phase of this biphasic pattern, the parts have weak linkage: they interact weakly and associate variously. The parts diversify and compete. Under selection for performance, interactions among the parts increasingly constrain their structure and associations. As many variants are eliminated, parts self-organize into modules with tight linkage. Linkage may increase in response to exogenous stresses as well as endogenous processes. In the second phase of diversification, variants of the module and its functions evolve and become new parts for a new cycle of generation of higher-level modules. This linkage hypothesis can interpret biphasic patterns in the diversification of protein domain structure, RNA and protein shapes, and networks in metabolism, codes, and embryos, and can explain hierarchical levels of structural organization that are widespread in biology.
Affiliation(s)
- Jay Mittenthal
- Department of Cell and Developmental Biology, University of Illinois, Urbana-Champaign, IL, USA
- Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL, USA
| | - Derek Caetano-Anollés
- Department of Cell and Developmental Biology, University of Illinois, Urbana-Champaign, IL, USA
| | - Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, USA
- Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL, USA
| |
|
44
|
Rorick M. Quantifying protein modularity and evolvability: a comparison of different techniques. Biosystems 2012; 110:22-33. [PMID: 22796584 DOI: 10.1016/j.biosystems.2012.06.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/15/2011] [Revised: 06/20/2012] [Accepted: 06/27/2012] [Indexed: 10/28/2022]
Abstract
Modularity increases evolvability by reducing constraints on adaptation and by allowing preexisting parts to function in new contexts for novel uses. Protein evolution provides an excellent context to study the causes and consequences of biological modularity. In order to address such questions, however, an index for protein modularity is necessary. This paper proposes a simple index for protein modularity, "module density", which is the number of evolutionarily independent modules that compose a protein divided by the number of amino acids in the protein. The decomposition of proteins into constituent modules can be accomplished by either of two classes of methods. The first class of methods relies on "suppositional" criteria to assign amino acids to modules, whereas the second class of methods relies on "coevolutionary" criteria for this task. One simple and practical method from the first class consists of approximating the number of modules in a protein as the number of regular secondary structure elements (i.e., helices and sheets). Methods based on coevolutionary criteria require more elaborate data, but they have the advantage of being able to specify modules without prior assumptions about why they exist. Given the increasing availability of datasets sampling protein mutational spectra (e.g., from comparative genomics, experimental evolution, and computational prediction), methods based on coevolutionary criteria will likely become more promising in the near future. The ability to meaningfully quantify protein modularity via simple indices has the potential to aid future efforts to understand protein evolutionary rate determinants, improve molecular evolution models and engineer novel proteins.
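The "module density" index in the abstract above is a simple ratio, so a small worked sketch may help. The function names, the DSSP-like one-letter string, and the choice to count each run of `H` (helix) or `E` (strand) as one secondary structure element are all assumptions made for illustration; the paper's own counting procedure is not given in the abstract.

```python
from itertools import groupby


def count_secondary_structure_elements(ss: str) -> int:
    """Count contiguous runs of helix ('H') or strand ('E') residues.

    Assumption: `ss` is a per-residue string in a DSSP-like alphabet,
    and every uninterrupted run of H or E counts as one element.
    """
    return sum(1 for state, _run in groupby(ss) if state in ("H", "E"))


def module_density(n_modules: int, n_residues: int) -> float:
    """Module density = number of modules / number of amino acids."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_modules / n_residues


# Toy example (hypothetical annotation, not a real protein).
toy_ss = "CCHHHHHCCEEEECCHHHHCC"
n_elements = count_secondary_structure_elements(toy_ss)
density = module_density(n_elements, len(toy_ss))
```

Here the toy string has three elements over 21 residues, so the density is 3/21. A coevolution-based decomposition would replace only the module count; the ratio itself stays the same.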
Affiliation(s)
- Mary Rorick
- University of Michigan, Department of Ecology and Evolutionary Biology, Ann Arbor, MI 48109-1048, United States.
| |
|
45
|
Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg-Bauer E, Colwell LJ, de Koning APJ, Dokholyan NV, Echave J, Elofsson A, Gerloff DL, Goldstein RA, Grahnen JA, Holder MT, Lakner C, Lartillot N, Lovell SC, Naylor G, Perica T, Pollock DD, Pupko T, Regan L, Roger A, Rubinstein N, Shakhnovich E, Sjölander K, Sunyaev S, Teufel AI, Thorne JL, Thornton JW, Weinreich DM, Whelan S. The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci 2012; 21:769-85. [PMID: 22528593 PMCID: PMC3403413 DOI: 10.1002/pro.2071] [Citation(s) in RCA: 155] [Impact Index Per Article: 11.9] [Received: 03/02/2012] [Revised: 03/22/2012] [Accepted: 03/23/2012] [Indexed: 12/20/2022]
Abstract
The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction.
Affiliation(s)
- David A Liberles
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Sarah A Teichmann
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - Ivet Bahar
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
| | - Ugo Bastolla
- Bioinformatics Unit, Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autonoma de Madrid, 28049 Cantoblanco Madrid, Spain
| | - Jesse Bloom
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
| | - Erich Bornberg-Bauer
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of Muenster, Germany
| | - Lucy J Colwell
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - A P Jason de Koning
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, Colorado
| | - Nikolay V Dokholyan
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, North Carolina 27599
| | - Julian Echave
- Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, Martín de Irigoyen 3100, 1650 San Martín, Buenos Aires, Argentina
| | - Arne Elofsson
- Department of Biochemistry and Biophysics, Center for Biomembrane Research, Stockholm Bioinformatics Center, Science for Life Laboratory, Swedish E-science Research Center, Stockholm University, 106 91 Stockholm, Sweden
| | - Dietlind L Gerloff
- Biomolecular Engineering Department, University of California, Santa Cruz, California 95064
| | - Richard A Goldstein
- Division of Mathematical Biology, National Institute for Medical Research (MRC), Mill Hill, London NW7 1AA, United Kingdom
| | - Johan A Grahnen
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Mark T Holder
- Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045
| | - Clemens Lakner
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695
| | - Nicholas Lartillot
- Département de Biochimie, Faculté de Médecine, Université de Montréal, Montréal, QC H3T1J4, Canada
| | - Simon C Lovell
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, United Kingdom
| | - Gavin Naylor
- Department of Biology, College of Charleston, Charleston, South Carolina 29424
| | - Tina Perica
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - David D Pollock
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, Colorado
| | - Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Lynne Regan
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven 06511
| | - Andrew Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
| | - Nimrod Rubinstein
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Eugene Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138
| | - Kimmen Sjölander
- Department of Bioengineering, University of California, Berkeley, Berkeley, California 94720
| | - Shamil Sunyaev
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115
| | - Ashley I Teufel
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Jeffrey L Thorne
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695
| | - Joseph W Thornton
- Howard Hughes Medical Institute and Institute for Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
| | - Daniel M Weinreich
- Department of Ecology and Evolutionary Biology, and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912
| | - Simon Whelan
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, United Kingdom
| |
|
46
|
Hamza A, Wei NN, Johnson-Scalise T, Naftolin F, Cho H, Zhan CG. Unveiling the Unfolding Pathway of F5F8D Disorder-Associated D81H/V100D Mutant of MCFD2 via Multiple Molecular Dynamics Simulations. J Biomol Struct Dyn 2012; 29:699-714. [DOI: 10.1080/07391102.2012.10507410] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
|
47
|
Lobkovsky AE, Wolf YI, Koonin EV. Predictability of evolutionary trajectories in fitness landscapes. PLoS Comput Biol 2011; 7:e1002302. [PMID: 22194675 PMCID: PMC3240586 DOI: 10.1371/journal.pcbi.1002302] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 06/03/2011] [Accepted: 10/29/2011] [Indexed: 11/19/2022] Open
Abstract
Experimental studies on enzyme evolution show that only a small fraction of all possible mutation trajectories are accessible to evolution. However, these experiments deal with individual enzymes and explore a tiny part of the fitness landscape. We report an exhaustive analysis of fitness landscapes constructed with an off-lattice model of protein folding where fitness is equated with robustness to misfolding. This model mimics the essential features of the interactions between amino acids, is consistent with the key paradigms of protein folding and reproduces the universal distribution of evolutionary rates among orthologous proteins. We introduce mean path divergence as a quantitative measure of the degree to which the starting and ending points determine the path of evolution in fitness landscapes. Global measures of landscape roughness are good predictors of path divergence in all studied landscapes: the mean path divergence is greater in smooth landscapes than in rough ones. The model-derived and experimental landscapes are significantly smoother than random landscapes and resemble additive landscapes perturbed with moderate amounts of noise; thus, these landscapes are substantially robust to mutation. The model landscapes show a deficit of suboptimal peaks even compared with noisy additive landscapes with similar overall roughness. We suggest that smoothness and the substantial deficit of peaks in the fitness landscapes of protein evolution are fundamental consequences of the physics of protein folding. Is evolution deterministic, hence predictable, or stochastic, that is unpredictable? What would happen if one could “replay the tape of evolution”: will the outcomes of evolution be completely different or is evolution so constrained that history will be repeated? Arguably, these questions are among the most intriguing and most difficult in evolutionary biology. 
In other words, the predictability of evolution depends on the fraction of the trajectories on fitness landscapes that are accessible for evolutionary exploration. Because direct experimental investigation of fitness landscapes is technically challenging, the available studies only explore a minuscule portion of the landscape for individual enzymes. We therefore sought to investigate the topography of fitness landscapes within the framework of a previously developed model of protein folding and evolution where fitness is equated with robustness to misfolding. We show that model-derived and experimental landscapes are significantly smoother than random landscapes and resemble moderately perturbed additive landscapes; thus, these landscapes are substantially robust to mutation. The model landscapes show a deficit of suboptimal peaks even compared with noisy additive landscapes with similar overall roughness. Thus, the smoothness and substantial deficit of peaks in fitness landscapes of protein evolution could be fundamental consequences of the physics of protein folding.
Affiliation(s)
- Alexander E. Lobkovsky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| |
|
48
|
The evolution of protein structures and structural ensembles under functional constraint. Genes (Basel) 2011; 2:748-62. [PMID: 24710290 PMCID: PMC3927589 DOI: 10.3390/genes2040748] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 09/24/2011] [Revised: 10/15/2011] [Accepted: 10/19/2011] [Indexed: 02/06/2023] Open
Abstract
Protein sequence, structure, and function are inherently linked through evolution and population genetics. Our knowledge of protein structure comes from solved structures in the Protein Data Bank (PDB), our knowledge of sequence through sequences found in the NCBI sequence databases (http://www.ncbi.nlm.nih.gov/), and our knowledge of function through a limited set of in-vitro biochemical studies. How these intersect through evolution is described in the first part of the review. In the second part, our understanding of a series of questions is addressed. This includes how sequences evolve within structures, how evolutionary processes enable structural transitions, how the folding process can change through evolution and what the fitness impacts of this might be. Moving beyond static structures, the evolution of protein kinetics (including normal modes) is discussed, as is the evolution of conformational ensembles and structurally disordered proteins. This ties back to a question of the role of neostructuralization and how it relates to selection on sequences for functions. The relationship between metastability, the fitness landscape, sequence divergence, and organismal effective population size is explored. Lastly, a brief discussion of modeling the evolution of sequences of ordered and disordered proteins is entertained.
|
49
|
Network models of TEM β-lactamase mutations coevolving under antibiotic selection show modular structure and anticipate evolutionary trajectories. PLoS Comput Biol 2011; 7:e1002184. [PMID: 21966264 PMCID: PMC3178621 DOI: 10.1371/journal.pcbi.1002184] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 02/02/2011] [Accepted: 07/19/2011] [Indexed: 01/13/2023] Open
Abstract
Understanding how novel functions evolve (genetic adaptation) is a critical goal of evolutionary biology. Among asexual organisms, genetic adaptation involves multiple mutations that frequently interact in a non-linear fashion (epistasis). Non-linear interactions pose a formidable challenge for the computational prediction of mutation effects. Here we use the recent evolution of β-lactamase under antibiotic selection as a model for genetic adaptation. We build a network of coevolving residues (possible functional interactions), in which nodes are mutant residue positions and links represent two positions found mutated together in the same sequence. Most often these pairs occur in the setting of more complex mutants. Focusing on extended-spectrum resistant sequences, we use network-theoretical tools to identify triple mutant trajectories of likely special significance for adaptation. We extrapolate evolutionary paths (n = 3) that increase resistance and that are longer than the units used to build the network (n = 2). These paths consist of a limited number of residue positions and are enriched for known triple mutant combinations that increase cefotaxime resistance. We find that the pairs of residues used to build the network frequently decrease resistance compared to their corresponding singlets. This is a surprising result, given that their coevolution suggests a selective advantage. Thus, β-lactamase adaptation is highly epistatic. Our method can identify triplets that increase resistance despite the underlying rugged fitness landscape and has the unique ability to make predictions by placing each mutant residue position in its functional context. Our approach requires only sequence information, sufficient genetic diversity, and discrete selective pressures. Thus, it can be used to analyze recent evolutionary events, where coevolution analysis methods that use phylogeny or statistical coupling are not possible. 
Improving our ability to assess evolutionary trajectories will help predict the evolution of clinically relevant genes and aid in protein design. Understanding how new biological activities evolve on the molecular level has critical implications for biotechnology and for human health. Here we collect a database of mutations that contribute to the evolution of β-lactamase resistance to inhibitors and to new β-lactam antibiotics in bacterial pathogens, such as Escherichia coli. We compiled a database of TEM β-lactamase sequences evolved under antibiotic pressure and identified functional interactions between individual residue positions. We visualized these complex molecular interactions as a network and used network theory to derive information regarding the origin of individual mutations and their contribution to the observed resistance. Our approach should help interpret sequence databases for clinically relevant proteins undergoing high mutation rates and under selective (drug, immune) pressure, such as surface proteins of pathogens (particularly of RNA viruses such as HIV) or targets for chemotherapy in microbial pathogen or tumor cells. Notably, our approach only requires sequence data; detailed phylogenetic or tertiary structure information for the target gene is not necessary. Our analysis of how individual mutations work together to produce new biological activities should help anticipate evolution driven by a variety of clinically-relevant selections such as drug resistance, virulence, and immunity.
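The network construction described in this abstract (nodes are mutated residue positions, links join two positions found mutated together in the same sequence) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the input positions are toy values rather than TEM data, and enumerating three-position paths through a shared middle node is a simplification that stands in for the network-theoretical tools the authors actually used.

```python
from collections import Counter, defaultdict
from itertools import combinations


def comutation_network(mutants):
    """Build weighted links between positions mutated in the same sequence.

    `mutants` is an iterable of collections of mutated residue positions,
    one collection per sequence. Returns a Counter mapping each sorted
    pair (a, b) to the number of sequences carrying both mutations.
    """
    edges = Counter()
    for positions in mutants:
        for a, b in combinations(sorted(set(positions)), 2):
            edges[(a, b)] += 1
    return edges


def three_position_paths(edges):
    """List paths a - mid - c in which both links exist in the network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    paths = []
    for mid in sorted(neighbours):
        for a, c in combinations(sorted(neighbours[mid]), 2):
            paths.append((a, mid, c))
    return paths


# Toy input: hypothetical mutated positions, not real TEM sequences.
toy_mutants = [{10, 20}, {10, 20, 30}, {30, 40}]
toy_edges = comutation_network(toy_mutants)
toy_paths = three_position_paths(toy_edges)
```

In this toy input the pair (10, 20) co-occurs in two sequences, and the path (10, 30, 40) links positions 10 and 40 even though they never appear together, which is the sense in which paths of three positions extend beyond the pairs used to build the network.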
|
50
|
Yang JR, Zhuang SM, Zhang J. Impact of translational error-induced and error-free misfolding on the rate of protein evolution. Mol Syst Biol 2011; 6:421. [PMID: 20959819 PMCID: PMC2990641 DOI: 10.1038/msb.2010.78] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 05/11/2010] [Accepted: 08/31/2010] [Indexed: 11/26/2022] Open
Abstract
Theoretical calculations suggest that, in addition to translational error-induced protein misfolding, a non-negligible fraction of misfolded proteins are error free. We propose that the anticorrelation between the expression level of a protein and its rate of sequence evolution be explained by an overarching protein-misfolding-avoidance hypothesis that includes selection against both error-induced and error-free protein misfolding, and verify this model by a molecular-level evolutionary simulation. We provide strong empirical evidence for the protein-misfolding-avoidance hypothesis, including a positive correlation between protein expression level and stability, enrichment of misfolding-minimizing codons and amino acids in highly expressed genes, and stronger evolutionary conservation of residues in which nonsynonymous changes are more likely to increase protein misfolding.
The rate of protein sequence evolution has long been of central interest to molecular evolutionists. Different proteins of the same species evolve at vastly different rates, which is commonly explained by a variation in functional constraint among different proteins (Kimura and Ohta, 1974). However, it is unclear how to quantify the functional constraint of a protein from the knowledge of its function. In the past decade, various types of genomic data from model organisms have been examined to look for the determinants of the rate of protein sequence evolution. The most unexpected discovery was a very strong anticorrelation between the expression level and evolutionary rate of a protein (E–R anticorrelation) (Pal et al, 2001). The prevailing explanation of the E–R anticorrelation is the translational robustness hypothesis (Drummond et al, 2005). This hypothesis posits that mistranslation induces protein misfolding, which is toxic to cells (Figure 1). Consequently, highly expressed proteins are under stronger pressures to be translationally robust and thus are more constrained in sequence evolution. However, the impact of the other source of misfolded proteins, translational error-free proteins (Figure 1), has not been evaluated. By theoretical calculation, computer simulation, and empirical data analysis, we examined the role of selection against both error-induced and error-free protein misfolding in creating the E–R correlation. Our theoretical calculations suggested that a non-negligible fraction of misfolded proteins are error free. We estimated that when a protein is not very stable, on average ∼20% of misfolded molecules are error free. However, when a protein is very stable, this fraction reduces to ∼5%, which is probably a result of natural selection against protein misfolding. 
We conducted a molecular-level evolutionary simulation (Figure 2A) using three different schemes: error-induced misfolding only, error-free misfolding only, and both types of misfolding. As expected, results from the first simulation are similar to those from a previous study that considers only error-induced misfolding (Drummond and Wilke, 2008). Interestingly, the second and third simulations can also generate the same patterns, including a positive correlation between the protein expression level and the unfolding energy (ΔG) of the error-free protein (Figure 2B), a negative correlation between the expression level and the fraction of protein molecules that misfold after being mistranslated (Figure 2C), a negative correlation between ΔG and the evolutionary rate (Figure 2D), and a negative correlation between the expression level and the evolutionary rate (i.e., the E–R anticorrelation) (Figure 2E). Furthermore, we found that selection against protein misfolding is more effective in reducing error-free misfolding than error-induced misfolding. Based on these results, we propose that an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the prevailing translational robustness hypothesis, which considers only error-induced misfolding. We tested three key predictions of the protein-misfolding-avoidance hypothesis using yeast data. First, we showed that, consistent with our prediction, a positive correlation exists between the protein expression level and stability, which is measured by the unfolding energy or melting temperature. In addition, protein expression level is negatively correlated with protein aggregation propensity. Second, we found that codons minimizing protein misfolding are used more frequently in highly expressed proteins than in lowly expressed ones.
Third, we showed that, within the same protein, amino acid residues in which random nonsynonymous mutations are more likely to increase protein misfolding are evolutionarily more conserved. Together, these results provide unambiguous evidence that avoidance of both error-induced and error-free protein misfolding is a major source of the E–R anticorrelation and that protein stability and mistranslation have important roles in protein evolution. What determines the rate of protein evolution is a fundamental question in biology. Recent genomic studies revealed a surprisingly strong anticorrelation between the expression level of a protein and its rate of sequence evolution. This observation is currently explained by the translational robustness hypothesis in which the toxicity of translational error-induced protein misfolding selects for higher translational robustness of more abundant proteins, which constrains sequence evolution. However, the impact of error-free protein misfolding has not been evaluated. We estimate that a non-negligible fraction of misfolded proteins are error free and demonstrate by a molecular-level evolutionary simulation that selection against protein misfolding results in a greater reduction of error-free misfolding than error-induced misfolding. Thus, an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the translational robustness hypothesis. We show that misfolding-minimizing amino acids are preferentially used in highly abundant yeast proteins and that these residues are evolutionarily more conserved than other residues of the same proteins. These findings provide unambiguous support to the role of protein-misfolding-avoidance in determining the rate of protein sequence evolution.
Affiliation(s)
- Jian-Rong Yang
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China
| | | | | |
|