1
Li Z, Chen M, Tian W, Wang L, Wu X. Investigating the role of polar amino acids driven by evolution in the active site architecture of GH11 xylanase. Int J Biol Macromol 2025; 315:144464. [PMID: 40403789 DOI: 10.1016/j.ijbiomac.2025.144464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2025] [Revised: 05/09/2025] [Accepted: 05/19/2025] [Indexed: 05/24/2025]
Abstract
Enzymes, as vital biomacromolecules, have developed significant plasticity, enabling adaptation to diverse environments and catalysis of numerous biochemical reactions. However, enzyme evolution is constrained by mutational limitations, as amino acid substitutions often impair structure or function, hampering optimization endeavors. To address this, we integrated structural bioinformatics with site-directed mutagenesis to investigate the evolutionary trends of four GH11 family xylanases (XynA, XynB, XynD, and XynE) from Aspergillus niger An76. Our analysis revealed that conserved residues in active sites are unevenly distributed, with highly conserved residues critical for catalysis and relatively conserved residues offering mutation potential. The mutation of Asp/Asn near catalytic residues at -1 subsite could not only alter the catalytic activity, but also shift the optimal pH by one unit. Additional mutants, including XynB-A143P, XynA-F142W, XynD-E20T, and XynD-E192Q, increased enzymatic activity by 17%, 46%, 82%, and 26%, respectively. More importantly, ancestral sequence reconstruction highlighted the importance of Arg at the -1 subsite of GH11 xylanases, and combinatorial mutation based on Y160R reinstated the pseudo-enzyme XynE's activity to 417.6 IU/mg. This study demonstrates the efficacy of evolutionary-informed mutagenesis for precise enzyme design, providing insights for optimizing GH11 family and other enzymes.
Affiliation(s)
- Zhaoran Li
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Muyang Chen
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Wenya Tian
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Lushan Wang
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Xiuyun Wu
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
2
Picolo F, Bardin J, Laurin M, Piégu B, Monget P. Genes Encoding Intracellular Signaling Proteins in Animals Originated Along with Metazoa and Chordata: Chance or Necessity? Genome Biol Evol 2025; 17:evaf034. [PMID: 40200633 PMCID: PMC11979099 DOI: 10.1093/gbe/evaf034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/19/2025] [Indexed: 04/10/2025] Open
Abstract
In this work, we investigate whether the construction of signaling pathways during evolution follows a deterministic law through a study of the eventual link between age of appearance in the tree of life and position in the signaling pathway of genes involved in these pathways. We use the 47 human signaling pathways described in the Kyoto Encyclopedia of Genes and Genomes and investigate the orthologs of these genes in 315 animal species plus a yeast taxon, representing 15 large clades. Many genes appear on two key branches: those between the last common ancestor of Opisthokonta and Metazoa and between Deuterostomia and Chordata. We look for a link between the age of appearance of an upstream A gene and that of its downstream B partner. We observe that for all the interactions of two partners, only 20.6% of the corresponding genes arose simultaneously in the tree of life, 40.7% being called "backward" (i.e. B appearing before A) and 38.7% "forward" (A appearing before B). For 16 of the 47 pathways, there is a positive correlation between the age rank difference between interacting partner genes and the position of the corresponding proteins in the pathway: the more upstream a protein is involved in the pathway, the greater the rank difference is (the correlation, positive or negative, is not significant for 30 pathways). For the sole insulin signaling pathway, this correlation is negative. Moreover, by permutation test, we find that 14 of the 47 observed pathways contained larger modules (subset respecting a homogeneous appearance pattern) than expected by chance alone. Finally, for 20 of the 47 pathways, the construction scenario appears to be random, as these pathways do not validate any of our statistical tests (permutation tests on interaction direction and module sizes as well as correlation test on pathway position and age rank). Given that only 14.9% of the tests are significant and that significant effects are different among pathways, we conclude that there is no deterministic rule in the establishment of the pathways herein studied or that the patterns have been obscured by subsequent transformations.
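The "simultaneous", "forward" and "backward" labels used in this abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the convention that a smaller rank means an older branch, and the toy rank pairs are all assumptions made for illustration.

```python
from collections import Counter


def classify_interaction(rank_upstream: int, rank_downstream: int) -> str:
    """Label an A -> B interaction by the order in which the two genes arose.

    Assumed convention (hypothetical): a smaller rank means an older branch.
    """
    if rank_upstream == rank_downstream:
        return "simultaneous"
    if rank_upstream < rank_downstream:
        return "forward"   # upstream gene A appeared before downstream gene B
    return "backward"      # downstream gene B appeared before upstream gene A


def summarize(pairs):
    """Return the percentage of interactions falling in each category."""
    if not pairs:
        raise ValueError("at least one interaction is required")
    counts = Counter(classify_interaction(a, b) for a, b in pairs)
    total = sum(counts.values())
    return {
        label: 100.0 * counts[label] / total
        for label in ("simultaneous", "forward", "backward")
    }


# Toy data only: (age rank of upstream gene A, age rank of downstream gene B).
toy_pairs = [(1, 1), (1, 3), (2, 5), (4, 2), (6, 3)]
print(summarize(toy_pairs))
```

With real data, each pair would come from the age of appearance assigned to the two interacting genes in a pathway.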
Affiliation(s)
- Floriane Picolo
  - PCR, UMR85, INRAE, CNRS, IFCE, Université de Tours, Nouzilly F-37380, France
- Jérémie Bardin
  - CR2P “Centre de Recherches sur la Paléo-biodiversité et les Paléo-environnements”, UMR 7207, CNRS/MNHN, Muséum National d'Histoire Naturelle, Sorbonne Université, Paris, France
- Michel Laurin
  - CR2P “Centre de Recherches sur la Paléo-biodiversité et les Paléo-environnements”, UMR 7207, CNRS/MNHN, Muséum National d'Histoire Naturelle, Sorbonne Université, Paris, France
- Benoît Piégu
  - PCR, UMR85, INRAE, CNRS, IFCE, Université de Tours, Nouzilly F-37380, France
- Philippe Monget
  - PCR, UMR85, INRAE, CNRS, IFCE, Université de Tours, Nouzilly F-37380, France
3
Pan T, Bi Y, Wang X, Zhang Y, Webb GI, Gasser RB, Kurgan L, Song J. SCREEN: A Graph-based Contrastive Learning Tool to Infer Catalytic Residues and Assess Enzyme Mutations. Genomics, Proteomics & Bioinformatics 2025; 22:qzae094. [PMID: 39724324 PMCID: PMC11961199 DOI: 10.1093/gpbjnl/qzae094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Revised: 12/05/2024] [Accepted: 12/06/2024] [Indexed: 12/28/2024]
Abstract
The accurate identification of catalytic residues contributes to our understanding of enzyme functions in biological processes and pathways. The increasing number of protein sequences necessitates computational tools for the automated prediction of catalytic residues in enzymes. Here, we introduce SCREEN, a graph neural network for the high-throughput prediction of catalytic residues via the integration of enzyme functional and structural information. SCREEN constructs residue representations based on spatial arrangements and incorporates enzyme function priors into such representations through contrastive learning. We demonstrate that SCREEN (1) consistently outperforms currently-available predictors; (2) provides accurate results when applied to inferred enzyme structures; and (3) generalizes well to enzymes dissimilar from those in the training set. We also show that the putative catalytic residues predicted by SCREEN mimic key structural and biophysical characteristics of native catalytic residues. Moreover, using experimental datasets, we show that SCREEN's predictions can be used to distinguish residues with a high mutation tolerance from those likely to cause functional loss when mutated, indicating that this tool might be used to infer disease-associated mutations. SCREEN is publicly available at https://github.com/BioColLab/SCREEN and https://ngdc.cncb.ac.cn/biocode/tool/7580.
Affiliation(s)
- Tong Pan
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  - Monash Biomedicine Discovery Institute-Wenzhou Medical University Alliance in Clinical and Experimental Biomedicine, Monash University, Clayton, VIC 3800, Australia
- Yue Bi
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  - Monash Biomedicine Discovery Institute-Wenzhou Medical University Alliance in Clinical and Experimental Biomedicine, Monash University, Clayton, VIC 3800, Australia
- Xiaoyu Wang
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  - Monash Biomedicine Discovery Institute-Wenzhou Medical University Alliance in Clinical and Experimental Biomedicine, Monash University, Clayton, VIC 3800, Australia
- Ying Zhang
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Geoffrey I Webb
  - Department of Data Science and Artificial Intelligence, Monash University, Clayton, VIC 3800, Australia
- Robin B Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, VIC 3010, Australia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  - Monash Biomedicine Discovery Institute-Wenzhou Medical University Alliance in Clinical and Experimental Biomedicine, Monash University, Clayton, VIC 3800, Australia
  - Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325015, China
4
Usmanova DR, Plata G, Vitkup D. Functional Optimization in Distinct Tissues and Conditions Constrains the Rate of Protein Evolution. Mol Biol Evol 2024; 41:msae200. [PMID: 39431545 PMCID: PMC11523136 DOI: 10.1093/molbev/msae200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 07/29/2024] [Accepted: 08/05/2024] [Indexed: 10/22/2024] Open
Abstract
Understanding the main determinants of protein evolution is a fundamental challenge in biology. Despite many decades of active research, the molecular and cellular mechanisms underlying the substantial variability of evolutionary rates across cellular proteins are not currently well understood. It also remains unclear how protein molecular function is optimized in the context of multicellular species and why many proteins, such as enzymes, are only moderately efficient on average. Our analysis of genomics and functional datasets reveals in multiple organisms a strong inverse relationship between the optimality of protein molecular function and the rate of protein evolution. Furthermore, we find that highly expressed proteins tend to be substantially more functionally optimized. These results suggest that cellular expression costs lead to more pronounced functional optimization of abundant proteins and that the purifying selection to maintain high levels of functional optimality significantly slows protein evolution. We observe that in multicellular species both the rate of protein evolution and the degree of protein functional efficiency are primarily affected by expression in several distinct cell types and tissues, specifically, in developed neurons with upregulated synaptic processes in animals and in young and fast-growing tissues in plants. Overall, our analysis reveals how various constraints from the molecular, cellular, and species' levels of biological organization jointly affect the rate of protein evolution and the level of protein functional adaptation.
Affiliation(s)
- Dinara R Usmanova
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Germán Plata
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - BiomEdit, Fishers, IN 46037, USA
- Dennis Vitkup
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
5
Yehorova D, Crean RM, Kasson PM, Kamerlin SCL. Key interaction networks: Identifying evolutionarily conserved non-covalent interaction networks across protein families. Protein Sci 2024; 33:e4911. [PMID: 38358258 PMCID: PMC10868456 DOI: 10.1002/pro.4911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 01/08/2024] [Accepted: 01/10/2024] [Indexed: 02/16/2024]
Abstract
Protein structure (and thus function) is dictated by non-covalent interaction networks. These can be highly evolutionarily conserved across protein families, the members of which can diverge in sequence and evolutionary history. Here we present KIN, a tool to identify and analyze conserved non-covalent interaction networks across evolutionarily related groups of proteins. KIN is available for download under a GNU General Public License, version 2, from https://www.github.com/kamerlinlab/KIN. KIN can operate on experimentally determined structures, predicted structures, or molecular dynamics trajectories, providing insight into both conserved and missing interactions across evolutionarily related proteins. This provides useful insight both into protein evolution, as well as a tool that can be exploited for protein engineering efforts. As a showcase system, we demonstrate applications of this tool to understanding the evolutionary-relevant conserved interaction networks across the class A β-lactamases.
Affiliation(s)
- Dariia Yehorova
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA
- Rory M. Crean
  - Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Peter M. Kasson
  - Department of Molecular Physiology, University of Virginia, Charlottesville, Virginia, USA
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, USA
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Shina C. L. Kamerlin
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA
  - Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
6
Ferreiro D, Khalil R, Sousa SF, Arenas M. Substitution Models of Protein Evolution with Selection on Enzymatic Activity. Mol Biol Evol 2024; 41:msae026. [PMID: 38314876 PMCID: PMC10873502 DOI: 10.1093/molbev/msae026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 01/25/2024] [Accepted: 01/31/2024] [Indexed: 02/07/2024] Open
Abstract
Substitution models of evolution are necessary for diverse evolutionary analyses including phylogenetic tree and ancestral sequence reconstructions. At the protein level, empirical substitution models are traditionally used due to their simplicity, but they ignore the variability of substitution patterns among protein sites. Next, in order to improve the realism of the modeling of protein evolution, a series of structurally constrained substitution models were presented, but still they usually ignore constraints on the protein activity. Here, we present a substitution model of protein evolution with selection on both protein structure and enzymatic activity, and that can be applied to phylogenetics. In particular, the model considers the binding affinity of the enzyme-substrate complex as well as structural constraints that include the flexibility of structural flaps, hydrogen bonds, amino acids backbone radius of gyration, and solvent-accessible surface area that are quantified through molecular dynamics simulations. We applied the model to the HIV-1 protease and evaluated it by phylogenetic likelihood in comparison with the best-fitting empirical substitution model and a structurally constrained substitution model that ignores the enzymatic activity. We found that accounting for selection on the protein activity improves the fitting of the modeled functional regions with the real observations, especially in data with high molecular identity, which recommends considering constraints on the protein activity in the development of substitution models of evolution.
Affiliation(s)
- David Ferreiro
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Ruqaiya Khalil
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Sergio F Sousa
  - UCIBIO/REQUIMTE, BioSIM, Departamento de Biomedicina, Faculdade de Medicina da Universidade do Porto, 4200-319 Porto, Portugal
- Miguel Arenas
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
7
Xie WJ, Warshel A. Harnessing generative AI to decode enzyme catalysis and evolution for enhanced engineering. Natl Sci Rev 2023; 10:nwad331. [PMID: 38299119 PMCID: PMC10829072 DOI: 10.1093/nsr/nwad331] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/04/2023] [Revised: 09/27/2023] [Accepted: 10/13/2023] [Indexed: 02/02/2024] Open
Abstract
Enzymes, as paramount protein catalysts, occupy a central role in fostering remarkable progress across numerous fields. However, the intricacy of sequence-function relationships continues to obscure our grasp of enzyme behaviors and curtails our capabilities in rational enzyme engineering. Generative artificial intelligence (AI), known for its proficiency in handling intricate data distributions, holds the potential to offer novel perspectives in enzyme research. Generative models could discern elusive patterns within the vast sequence space and uncover new functional enzyme sequences. This review highlights the recent advancements in employing generative AI for enzyme sequence analysis. We delve into the impact of generative AI in predicting mutation effects on enzyme fitness, catalytic activity and stability, rationalizing the laboratory evolution of de novo enzymes, and decoding protein sequence semantics and their application in enzyme engineering. Notably, the prediction of catalytic activity and stability of enzymes using natural protein sequences serves as a vital link, indicating how enzyme catalysis shapes enzyme evolution. Overall, we foresee that the integration of generative AI into enzyme studies will remarkably enhance our knowledge of enzymes and expedite the creation of superior biocatalysts.
Affiliation(s)
- Wen Jun Xie
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, Genetics Institute, University of Florida, Gainesville, FL 32610, USA
- Arieh Warshel
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
8
Ma S, Xi W, Wang S, Chen H, Guo S, Mo T, Chen W, Deng Z, Chen F, Ding W, Zhang Q. Substrate-Controlled Catalysis in the Ether Cross-Link-Forming Radical SAM Enzymes. J Am Chem Soc 2023; 145:22945-22953. [PMID: 37769281 DOI: 10.1021/jacs.3c04355] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 09/30/2023]
Abstract
Darobactin is a heptapeptide antibiotic featuring an ether cross-link and a C-C cross-link, and both cross-links are installed by a radical S-adenosylmethionine (rSAM) enzyme DarE. How a single DarE enzyme affords the two chemically distinct cross-links remains largely obscure. Herein, by mapping the biosynthetic landscape for darobactin-like RiPP (daropeptide), we identified and characterized two novel daropeptides that lack the C-C cross-link present in darobactin and instead are solely composed of ether cross-links. Phylogenetic and mutagenesis analyses reveal that the daropeptide maturases possess intrinsic multifunctionality, catalyzing not only the formation of ether cross-link but also C-C cross-linking and Ser oxidation. Intriguingly, the different chemical outcomes are controlled by the exact substrate motifs. Our work not only provides a roadmap for the discovery of new daropeptide natural products but also offers insights into the regulatory mechanisms that govern these remarkably versatile ether cross-link-forming rSAM enzymes.
Affiliation(s)
- Suze Ma
  - Department of Chemistry, Fudan University, Shanghai 200433, China
- Wenhui Xi
  - Department of Chemistry, Fudan University, Shanghai 200433, China
- Shu Wang
  - Department of Chemistry, Fudan University, Shanghai 200433, China
- Heng Chen
  - Department of Chemistry, Fudan University, Shanghai 200433, China
- Sijia Guo
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Tianlu Mo
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Wenxue Chen
  - Department of Chemistry, Fudan University, Shanghai 200433, China
- Zixin Deng
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Fener Chen
  - Department of Chemistry, Fudan University, Shanghai 200433, China
  - National Engineering Research Center for Carbohydrate Synthesis, Jiangxi Normal University, Nanchang 330022, China
- Wei Ding
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Qi Zhang
  - Department of Chemistry, Fudan University, Shanghai 200433, China
9
Xie WJ, Warshel A. Harnessing Generative AI to Decode Enzyme Catalysis and Evolution for Enhanced Engineering. bioRxiv 2023:2023.10.10.561808. [PMID: 37873334 PMCID: PMC10592750 DOI: 10.1101/2023.10.10.561808] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/25/2023]
Abstract
Enzymes, as paramount protein catalysts, occupy a central role in fostering remarkable progress across numerous fields. However, the intricacy of sequence-function relationships continues to obscure our grasp of enzyme behaviors and curtails our capabilities in rational enzyme engineering. Generative artificial intelligence (AI), known for its proficiency in handling intricate data distributions, holds the potential to offer novel perspectives in enzyme research. By applying generative models, we could discern elusive patterns within the vast sequence space and uncover new functional enzyme sequences. This review highlights the recent advancements in employing generative AI for enzyme sequence analysis. We delve into the impact of generative AI in predicting mutation effects on enzyme fitness, activity, and stability, rationalizing the laboratory evolution of de novo enzymes, decoding protein sequence semantics, and its applications in enzyme engineering. Notably, the prediction of enzyme activity and stability using natural enzyme sequences serves as a vital link, indicating how enzyme catalysis shapes enzyme evolution. Overall, we foresee that the integration of generative AI into enzyme studies will remarkably enhance our knowledge of enzymes and expedite the creation of superior biocatalysts.
Affiliation(s)
- Wen Jun Xie
  - Department of Chemistry, University of Southern California, Los Angeles, CA, USA
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development (CNPD3), Genetics Institute, University of Florida, Gainesville, FL, USA
- Arieh Warshel
  - Department of Chemistry, University of Southern California, Los Angeles, CA, USA
10
Cagiada M, Bottaro S, Lindemose S, Schenstrøm SM, Stein A, Hartmann-Petersen R, Lindorff-Larsen K. Discovering functionally important sites in proteins. Nat Commun 2023; 14:4175. [PMID: 37443362 PMCID: PMC10345196 DOI: 10.1038/s41467-023-39909-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 05/19/2023] [Accepted: 07/02/2023] [Indexed: 07/15/2023] Open
Abstract
Proteins play important roles in biology, biotechnology and pharmacology, and missense variants are a common cause of disease. Discovering functionally important sites in proteins is a central but difficult problem because of the lack of large, systematic data sets. Sequence conservation can highlight residues that are functionally important but is often convoluted with a signal for preserving structural stability. We here present a machine learning method to predict functional sites by combining statistical models for protein sequences with biophysical models of stability. We train the model using multiplexed experimental data on variant effects and validate it broadly. We show how the model can be used to discover active sites, as well as regulatory and binding sites. We illustrate the utility of the model by prospective prediction and subsequent experimental validation on the functional consequences of missense variants in HPRT1 which may cause Lesch-Nyhan syndrome, and pinpoint the molecular mechanisms by which they cause disease.
Affiliation(s)
- Matteo Cagiada
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Sandro Bottaro
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Søren Lindemose
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Signe M Schenstrøm
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Amelie Stein
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Rasmus Hartmann-Petersen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
11
Precord T, Ramesh S, Dommaraju SR, Harris LA, Kille BL, Mitchell DA. Catalytic Site Proximity Profiling for Functional Unification of Sequence-Diverse Radical S-Adenosylmethionine Enzymes. ACS Bio & Med Chem Au 2023; 3:240-251. [PMID: 37363077 PMCID: PMC10288494 DOI: 10.1021/acsbiomedchemau.2c00085] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/24/2022] [Revised: 02/08/2023] [Accepted: 02/10/2023] [Indexed: 06/28/2023]
Abstract
The radical S-adenosylmethionine (rSAM) superfamily has become a wellspring for discovering new enzyme chemistry, especially regarding ribosomally synthesized and post-translationally modified peptides (RiPPs). Here, we report a compendium of nearly 15,000 rSAM proteins with high-confidence involvement in RiPP biosynthesis. While recent bioinformatics advances have unveiled the broad sequence space covered by rSAM proteins, the significant challenge of functional annotation remains unsolved. Through a combination of sequence analysis and protein structural predictions, we identified a set of catalytic site proximity residues with functional predictive power, especially among the diverse rSAM proteins that form sulfur-to-α carbon thioether (sactionine) linkages. As a case study, we report that an rSAM protein from Streptomyces sparsogenes (StsB) shares higher full-length similarity with MftC (mycofactocin biosynthesis) than any other characterized enzyme. However, a comparative analysis of StsB to known rSAM proteins using "catalytic site proximity" predicted that StsB would be distinct from MftC and instead form sactionine bonds. The prediction was confirmed by mass spectrometry, targeted mutagenesis, and chemical degradation. We further used "catalytic site proximity" analysis to identify six new sactipeptide groups undetectable by traditional genome-mining strategies. Additional catalytic site proximity profiling of cyclophane-forming rSAM proteins suggests that this approach will be more broadly applicable and enhance, if not outright correct, protein functional predictions based on traditional genomic enzymology principles.
Affiliation(s)
- Timothy W. Precord
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Sangeetha Ramesh
  - Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Shravan R. Dommaraju
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Lonnie A. Harris
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Bryce L. Kille
  - Department of Computer Science, Rice University, Houston, Texas 77005, United States
- Douglas A. Mitchell
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
12
Hou Q, Rooman M, Pucci F. Enzyme Stability-Activity Trade-Off: New Insights from Protein Stability Weaknesses and Evolutionary Conservation. J Chem Theory Comput 2023. [PMID: 37276063 DOI: 10.1021/acs.jctc.3c00036] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 06/07/2023]
Abstract
A general limitation of the use of enzymes in biotechnological processes under sometimes nonphysiological conditions is the complex interplay between two key quantities, enzyme activity and stability, where the increase of one is often associated with the decrease of the other. A precise stability-activity trade-off is necessary for the enzymes to be fully functional, but its weight in different protein regions and its dependence on environmental conditions is not yet elucidated. To advance this issue, we used the formalism that we have recently developed to effectively identify stability strength and weakness regions in protein structures and applied it to a large set of globular enzymes with known experimental structure and catalytic sites. Our analysis showed a striking oscillatory pattern of free energy compensation centered on the catalytic region. Indeed, catalytic residues are usually nonoptimal with respect to stability, but residues in the first shell around the catalytic site are, on the average, stability strengths and thus compensate for this lack of stability; residues in the second shell are weaker again, and so on. This trend is consistent across all enzyme families. It is accompanied by a similar, but less pronounced, pattern of residue conservation across evolution. In addition, we analyzed cold- and heat-adapted enzymes separately and highlighted different patterns of stability strengths and weaknesses, which provide insight into the longstanding problem of catalytic rate enhancement in cold environments. The successful comparison of our stability and conservation results with experimental fitness data, obtained by deep mutagenesis scanning, led us to propose criteria for improving catalytic activity while maintaining enzyme stability, a key goal in enzyme design.
Affiliation(s)
- Qingzhen Hou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
  - National Institute of Health Data Science of China, Shandong University, Jinan, Shandong 250002, China
- Marianne Rooman
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
13
Kiefl E, Esen OC, Miller SE, Kroll KL, Willis AD, Rappé MS, Pan T, Eren AM. Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution. Sci Adv 2023; 9:eabq4632. [PMID: 36812328 DOI: 10.1126/sciadv.abq4632] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/09/2022] [Accepted: 01/18/2023] [Indexed: 06/18/2023]
Abstract
Comprehensive sampling of natural genetic diversity with metagenomics enables highly resolved insights into the interplay between ecology and evolution. However, resolving adaptive, neutral, or purifying processes of evolution from intrapopulation genomic variation remains a challenge, partly due to the sole reliance on gene sequences to interpret variants. Here, we describe an approach to analyze genetic variation in the context of predicted protein structures and apply it to a marine microbial population within the SAR11 subclade 1a.3.V, which dominates low-latitude surface oceans. Our analyses reveal a tight association between genetic variation and protein structure. In a central gene in nitrogen metabolism, we observe decreased occurrence of nonsynonymous variants from ligand-binding sites as a function of nitrate concentrations, revealing genetic targets of distinct evolutionary pressures maintained by nutrient availability. Our work yields insights into the governing principles of evolution and enables structure-aware investigations of microbial population genetics.
Affiliation(s)
- Evan Kiefl
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637, USA
- Ozcan C Esen
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Samuel E Miller
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA
- Kourtney L Kroll
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637, USA
- Amy D Willis
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Michael S Rappé
  - Hawai'i Institute of Marine Biology, University of Hawai'i at Mānoa, Kāne'ohe, HI 96822, USA
- Tao Pan
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637, USA
- A Murat Eren
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA
  - Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Oldenburg, Germany
  - Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany
  - Helmholtz Institute for Functional Marine Biodiversity, Oldenburg, Germany
14
Pillai AS, Hochberg GK, Thornton JW. Simple mechanisms for the evolution of protein complexity. Protein Sci 2022; 31:e4449. [PMID: 36107026 PMCID: PMC9601886 DOI: 10.1002/pro.4449] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/23/2022] [Revised: 09/01/2022] [Accepted: 09/10/2022] [Indexed: 01/26/2023]
Abstract
Proteins are tiny models of biological complexity: specific interactions among their many amino acids cause proteins to fold into elaborate structures, assemble with other proteins into higher-order complexes, and change their functions and structures upon binding other molecules. These complex features are classically thought to evolve via long and gradual trajectories driven by persistent natural selection. But a growing body of evidence from biochemistry, protein engineering, and molecular evolution shows that naturally occurring proteins often exist at or near the genetic edge of multimerization, allostery, and even new folds, so just one or a few mutations can trigger acquisition of these properties. These sudden transitions can occur because many of the physical properties that underlie these features are present in simpler proteins as fortuitous by-products of their architecture. Moreover, complex features of proteins can be encoded by huge arrays of sequences, so they are accessible from many different starting points via many possible paths. Because the bridges to these features are both short and numerous, random chance can join selection as a key factor in explaining the evolution of molecular complexity.
Affiliation(s)
- Arvind S. Pillai
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Georg K.A. Hochberg
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Department of Chemistry, Center for Synthetic Microbiology, Philipps University Marburg, Marburg, Germany
- Joseph W. Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
  - Departments of Human Genetics and Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
15
Melo-Filho CC, Bobrowski T, Martin HJ, Sessions Z, Popov KI, Moorman NJ, Baric RS, Muratov EN, Tropsha A. Conserved coronavirus proteins as targets of broad-spectrum antivirals. Antiviral Res 2022; 204:105360. [PMID: 35691424 PMCID: PMC9183392 DOI: 10.1016/j.antiviral.2022.105360] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/27/2022] [Revised: 06/02/2022] [Accepted: 06/06/2022] [Indexed: 11/16/2022]
Abstract
Coronaviruses are a class of single-stranded, positive-sense RNA viruses that have caused three major outbreaks over the past two decades: Middle East respiratory syndrome-related coronavirus (MERS-CoV), severe acute respiratory syndrome coronavirus (SARS-CoV), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). All outbreaks have been associated with significant morbidity and mortality. In this study, we have identified and explored conserved binding sites in the key coronavirus proteins for the development of broad-spectrum direct-acting anti-coronaviral compounds and validated the significance of this conservation for drug discovery with existing experimental data. We have identified four coronaviral proteins with highly conserved binding site sequence and 3D structure similarity: PLpro, Mpro, the nsp10-nsp16 complex (methyltransferase), and nsp15 endoribonuclease. We have compiled all available experimental data for known antiviral medications inhibiting these targets and identified compounds active against multiple coronaviruses. The identified compounds representing potential broad-spectrum antivirals include: GC376, which is active against six viral Mpro (out of six tested, as described in research literature); mycophenolic acid, which is active against four viral PLpro (out of four); and emetine, which is active against four viral RdRp (out of four). The approach described in this study for coronaviruses, which combines the assessment of sequence and structure conservation across a viral family with the analysis of accessible chemical structure-antiviral activity data, can be explored for the development of broad-spectrum drugs for multiple viral families.
Affiliation(s)
- Cleber C Melo-Filho
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Tesia Bobrowski
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Holli-Joi Martin
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Zoe Sessions
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Konstantin I Popov
  - Department of Biochemistry and Biophysics, School of Medicine, University of North Carolina, Chapel Hill, NC, 27599, USA
- Nathaniel J Moorman
  - Department of Microbiology and Immunology, School of Medicine, University of North Carolina, Chapel Hill, NC, 27599, USA
- Ralph S Baric
  - Department of Epidemiology, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, 27599, USA
- Eugene N Muratov
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Alexander Tropsha
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
16
Sen N, Anishchenko I, Bordin N, Sillitoe I, Velankar S, Baker D, Orengo C. Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs. Brief Bioinform 2022; 23:bbac187. [PMID: 35641150 PMCID: PMC9294430 DOI: 10.1093/bib/bbac187] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/05/2021] [Revised: 04/23/2022] [Accepted: 04/27/2022] [Indexed: 12/12/2022]
Abstract
Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologs. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologs in the Protein Data Bank. We noticed that the model quality was higher and the root mean square deviation (RMSD) between AlphaFold and RoseTTAFold models was lower for domains that could be assigned to CATH families than for those that could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein-protein interfaces and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, and predicted to be destabilizing and pathogenic. Using models from the two state-of-the-art techniques provides better confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models which could not be explained based solely on AlphaFold models.
Affiliation(s)
- Neeladri Sen
  - Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Ivan Anishchenko
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Nicola Bordin
  - Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Ian Sillitoe
  - Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Sameer Velankar
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Christine Orengo
  - Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
17
Vincent T, Gaillet B, Garnier A. Oleic acid based experimental evolution of Bacillus megaterium yielding an enhanced P450 BM3 variant. BMC Biotechnol 2022; 22:20. [PMID: 35831844 PMCID: PMC9281120 DOI: 10.1186/s12896-022-00750-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/10/2022] [Accepted: 06/28/2022] [Indexed: 12/02/2022]
Abstract
Background: Unlike most other P450 cytochrome monooxygenases, CYP102A1 from Bacillus megaterium (BM3) is both soluble and fused to its redox partner, forming a single polypeptide chain. Like other monooxygenases, it can catalyze the insertion of oxygen into a carbon-hydrogen bond, which can yield a wide variety of commercially relevant products for the pharmaceutical and fine chemical industries. However, the instability of the enzyme holds back the implementation of BM3-based industrial biocatalytic processes because of the high enzyme cost it would entail. Results: In this work, we sought to enhance BM3’s total specific product output by using experimental evolution, an approach not yet reported to improve this enzyme. By exploiting B. megaterium’s own oleic acid metabolism, we pressed the evolution of a new variant of BM3, harbouring 34 new amino acid substitutions. The resulting variant, dubbed DE, increased the conversion of the substrate 10-pNCA to its product p-nitrophenolate 1.23-fold and 1.76-fold when using NADPH or NADH, respectively, as a cofactor, compared to wild-type BM3. Conclusions: This new DE variant showed increased organic cosolvent tolerance, increased product output, and increased versatility in the use of either nicotinamide cofactor, NADPH or NADH. Experimental evolution can be used to evolve or to create libraries of evolved BM3 variants with increased productivity and cosolvent tolerance. Such libraries could in turn be used in bioinformatics to further evolve BM3 more precisely. The experimental evolution results also support the hypothesis that one of the roles of BM3 in Bacillus megaterium is to protect it from exogenous unsaturated fatty acids by breaking them down. Supplementary Information: The online version contains supplementary material available at 10.1186/s12896-022-00750-w.
Affiliation(s)
- Thierry Vincent
  - Department of Chemical Engineering, Université Laval, Québec, Québec, G1V 0A6, Canada
- Bruno Gaillet
  - Department of Chemical Engineering, Université Laval, Québec, Québec, G1V 0A6, Canada
- Alain Garnier
  - Department of Chemical Engineering, Université Laval, Québec, Québec, G1V 0A6, Canada
18
Bzówka M, Mitusińska K, Raczyńska A, Skalski T, Samol A, Bagrowska W, Magdziarz T, Góra A. Evolution of tunnels in α/β-hydrolase fold proteins—What can we learn from studying epoxide hydrolases? PLoS Comput Biol 2022; 18:e1010119. [PMID: 35580137 PMCID: PMC9140254 DOI: 10.1371/journal.pcbi.1010119] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/25/2021] [Revised: 05/27/2022] [Accepted: 04/19/2022] [Indexed: 12/27/2022]
Abstract
The evolutionary variability of a protein’s residues is highly dependent on protein region and function. Solvent-exposed residues, excluding those at interaction interfaces, are more variable than buried residues, whereas active site residues are considered to be conserved. These rules also apply to α/β-hydrolase fold proteins, one of the oldest and largest superfamilies of enzymes with buried active sites equipped with tunnels linking the reaction site with the exterior. We selected soluble epoxide hydrolases as representatives of this family to conduct the first systematic study on the evolution of tunnels. We hypothesised that tunnels are lined by mostly conserved residues, and are equipped with a number of specific variable residues that are able to respond to evolutionary pressure. The hypothesis was confirmed, and we suggested a general and a detailed way of analysing tunnel evolution, based on entropy values calculated for the tunnels’ residues. We also found three different cases of entropy distribution among tunnel-lining residues. These observations can be applied to protein reengineering that mimics the natural evolution process. We propose a ‘perforation’ mechanism for the design of new tunnels via the merging of internal cavities or protein surface perforation. Based on the literature data, such a strategy of new tunnel design could significantly improve the enzyme’s performance and can be applied widely for enzymes with buried active sites. So far, very little is known about the evolution of protein tunnels. The goal of this study is to evaluate the evolution of tunnels in the family of soluble epoxide hydrolases, representatives of numerous α/β-hydrolase fold enzymes. As a result, two types of tunnel evolution analysis were proposed (a general and a detailed approach), as well as a ‘perforation’ mechanism which can mimic native evolution in proteins and can be used as an additional strategy for enzyme redesign.
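The abstract above does not state which entropy measure was used for the tunnel-lining residues. As a rough illustration only, the sketch below assumes per-column Shannon entropy over a multiple sequence alignment; the function names, the toy alignment, and the choice of "tunnel" columns are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != gap]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def tunnel_entropies(alignment, tunnel_columns):
    """Entropy for each alignment column assumed to line a tunnel."""
    return {
        col: column_entropy([seq[col] for seq in alignment])
        for col in tunnel_columns
    }


# Toy alignment (hypothetical): four sequences, four aligned positions.
alignment = ["ADKW", "ADRW", "AEKW", "ADQW"]
entropies = tunnel_entropies(alignment, [0, 1, 2])
for col, value in entropies.items():
    print(f"column {col}: {value:.3f} bits")
```

Under this assumption, an entropy of 0 marks a fully conserved position, while higher values mark variable positions of the kind the abstract describes as able to respond to evolutionary pressure.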
Affiliation(s)
- Maria Bzówka
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Karolina Mitusińska
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Agata Raczyńska
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Tomasz Skalski
  - Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Aleksandra Samol
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Weronika Bagrowska
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Tomasz Magdziarz
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Artur Góra
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
19
Secretory quality control constrains functional selection-associated protein structure innovation. Commun Biol 2022; 5:268. [PMID: 35338247 PMCID: PMC8956723 DOI: 10.1038/s42003-022-03220-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/14/2021] [Accepted: 03/03/2022] [Indexed: 12/26/2022]
Abstract
Biophysical models suggest a dominant role of structural over functional constraints in shaping protein evolution. Selection on structural constraints is linked closely to expression levels of proteins, which together with structure-associated activities determine in vivo functions of proteins. Here we show that despite the up to two orders of magnitude differences in levels of C-reactive protein (CRP) in distinct species, the in vivo functions of CRP are paradoxically conserved. Such a pronounced level-function mismatch cannot be explained by activities associated with the conserved native structure, but is coupled to hidden activities associated with the unfolded, activated conformation. This is not the result of selection on structural constraints like foldability and stability, but is achieved by folding determinants-mediated functional selection that keeps a confined carrier structure to pass the stringent eukaryotic quality control on secretion. Further analysis suggests a folding threshold model which may partly explain the mismatch between the vast sequence space and the limited structure space of proteins. The mismatch in the conserved structure but different expression levels of C-reactive protein (CRP) in distinct species is reconciled by functional selection on hidden activities of unfolded CRPs.
20
Palenchar PM. The Influence of Codon Usage, Protein Abundance, and Protein Stability on Protein Evolution Vary by Evolutionary Distance and the Type of Protein. Protein J 2022; 41:216-229. [PMID: 35147896 DOI: 10.1007/s10930-022-10045-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 02/01/2022] [Indexed: 12/01/2022]
Abstract
In general, the evolutionary rate of proteins is not primarily related to protein and amino acid functions, and factors such as protein abundance, codon usage, and the protein's TM are more important. To better understand the factors that affect protein evolution, E. coli MG1655 orthologs were compared to those in closely related bacteria and to more distantly related prokaryotes, eukaryotes, and archaea. Also, the evolution of different types of proteins was studied. The analyses indicate that the amino acid conservation of enzymes that do not use macromolecules (e.g. DNA, RNA, and proteins) as substrates and that carry out metabolic processes involving small molecules (i.e. small molecule enzymes) differs from that of other enzymes. For example, the small molecule enzymes have a lower percent identity than other enzymes when sequences from closely related bacteria are compared. Analyses indicate that the lower percent identity is not a result of the amino acid or codon usage of the small molecule enzymes. The small molecule enzymes also do not have a significantly lower protein abundance, indicating that abundance is also unlikely to be an important factor driving differences in amino acid conservation. Analyses indicate that different methods of measuring the TM of proteins show different relationships with amino acid conservation over different evolutionary distances. In totality, the results demonstrate that the relationships between the factors thought to affect protein evolution (protein abundance, codon usage, and protein TMs) and protein evolution are complex and depend on the factor, the organisms, and the type of proteins being analyzed.
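Percent identity, the conservation measure named in the abstract above, can be computed in several ways, and the abstract does not say which convention was used. The sketch below assumes one common convention: identical positions divided by the positions where neither sequence has a gap, for two sequences that are already aligned. The function name and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity of two pre-aligned sequences.

    Assumed convention: matches / positions where neither sequence has a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# Hypothetical pair of aligned ortholog fragments.
identity = percent_identity("MKT-LLV", "MRTALIV")
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values are only comparable when the same denominator is used throughout.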
Affiliation(s)
- Peter M Palenchar
  - Department of Chemistry, Villanova University, 800 E. Lancaster Ave, Villanova, PA, 19805, USA
21
Nagar M, Hayden JA, Sagey E, Worthen G, Park M, Sharma AN, Fetter CM, Kuehm OP, Bearne SL. Altering the binding determinant on the interdigitating loop of mandelate racemase shifts specificity towards that of d-tartrate dehydratase. Arch Biochem Biophys 2022; 718:109119. [DOI: 10.1016/j.abb.2022.109119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Revised: 01/05/2022] [Accepted: 01/06/2022] [Indexed: 11/02/2022]
22
Truong DP, Rousseau S, Machala BW, Huddleston JP, Zhu M, Hull KG, Romo D, Raushel FM, Sacchettini JC, Glasner ME. Second-Shell Amino Acid R266 Helps Determine N-Succinylamino Acid Racemase Reaction Specificity in Promiscuous N-Succinylamino Acid Racemase/o-Succinylbenzoate Synthase Enzymes. Biochemistry 2021; 60:3829-3840. [PMID: 34845903 DOI: 10.1021/acs.biochem.1c00627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Catalytic promiscuity is the coincidental ability to catalyze nonbiological reactions in the same active site as the native biological reaction. Several lines of evidence show that catalytic promiscuity plays a role in the evolution of new enzyme functions. Thus, studying catalytic promiscuity can help identify structural features that predispose an enzyme to evolve new functions. This study identifies a potentially preadaptive residue in a promiscuous N-succinylamino acid racemase/o-succinylbenzoate synthase (NSAR/OSBS) enzyme from Amycolatopsis sp. T-1-60. This enzyme belongs to a branch of the OSBS family which includes many catalytically promiscuous NSAR/OSBS enzymes. R266 is conserved in all members of the NSAR/OSBS subfamily. However, the homologous position is usually hydrophobic in other OSBS subfamilies, whose enzymes lack NSAR activity. The second-shell amino acid R266 is close to the catalytic acid/base K263, but it does not contact the substrate, suggesting that R266 could affect the catalytic mechanism. Mutating R266 to glutamine in Amycolatopsis NSAR/OSBS profoundly reduces NSAR activity but moderately reduces OSBS activity. This is due to a 1000-fold decrease in the rate of proton exchange between the substrate and the general acid/base catalyst K263. This mutation is less deleterious for the OSBS reaction because K263 forms a cation-π interaction with the OSBS substrate and/or the intermediate, rather than acting as a general acid/base catalyst. Together, the data explain how R266 contributes to NSAR reaction specificity and was likely an essential preadaptation for the evolution of NSAR activity.
Affiliation(s)
- Dat P Truong
  - Department of Biochemistry and Biophysics, Texas A&M University, 2128 TAMU, College Station, Texas 77843-2128, United States
- Simon Rousseau
  - Department of Biochemistry and Biophysics, Texas A&M University, 2128 TAMU, College Station, Texas 77843-2128, United States
- Benjamin W Machala
  - Department of Biochemistry and Biophysics, Texas A&M University, 2128 TAMU, College Station, Texas 77843-2128, United States
- Jamison P Huddleston
  - Department of Chemistry, Texas A&M University, 3255 TAMU, College Station, Texas 77843-3255, United States
- Mingzhao Zhu
  - Baylor Synthesis and Drug-Lead Discovery Laboratory, Department of Chemistry and Biochemistry, Baylor University, One Bear Place, Waco, Texas 76798-7348, United States
- Kenneth G Hull
  - Baylor Synthesis and Drug-Lead Discovery Laboratory, Department of Chemistry and Biochemistry, Baylor University, One Bear Place, Waco, Texas 76798-7348, United States
- Daniel Romo
  - Baylor Synthesis and Drug-Lead Discovery Laboratory, Department of Chemistry and Biochemistry, Baylor University, One Bear Place, Waco, Texas 76798-7348, United States
- Frank M Raushel
  - Department of Biochemistry and Biophysics, Texas A&M University, 2128 TAMU, College Station, Texas 77843-2128, United States
  - Department of Chemistry, Texas A&M University, 3255 TAMU, College Station, Texas 77843-3255, United States
- James C Sacchettini
  - Department of Chemistry, Texas A&M University, 3255 TAMU, College Station, Texas 77843-3255, United States
- Margaret E Glasner
  - Department of Biochemistry and Biophysics, Texas A&M University, 2128 TAMU, College Station, Texas 77843-2128, United States
23
Caetano-Anollés G, Aziz MF, Mughal F, Caetano-Anollés D. Tracing protein and proteome history with chronologies and networks: folding recapitulates evolution. Expert Rev Proteomics 2021; 18:863-880. [PMID: 34628994 DOI: 10.1080/14789450.2021.1992277] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/24/2023]
Abstract
Introduction: While the origin and evolution of proteins remain mysterious, advances in evolutionary genomics and systems biology are facilitating the historical exploration of the structure, function and organization of proteins and proteomes. Molecular chronologies are series of time events describing the history of biological systems and subsystems and the rise of biological innovations. Together with time-varying networks, these chronologies provide a window into the past. Areas covered: Here, we review molecular chronologies and networks built with modern methods of phylogeny reconstruction. We discuss how chronologies of structural domain families uncover the explosive emergence of metabolism, the late rise of translation, the co-evolution of ribosomal proteins and rRNA, and the late development of the ribosomal exit tunnel; events that coincided with a tendency to shorten folding time. Evolving networks described the early emergence of domains and a late 'big bang' of domain combinations. Expert opinion: Two processes, folding and recruitment, appear central to the evolutionary progression. The former increases protein persistence. The latter fosters diversity. Chronologically, protein evolution mirrors folding by combining supersecondary structures into domains, developing translation machinery to facilitate folding speed and stability, and enhancing structural complexity by establishing long-distance interactions in novel structural and architectural designs.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
  - C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- M Fayez Aziz
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Fizza Mughal
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Derek Caetano-Anollés
  - Data Science Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
24
Learning the local landscape of protein structures with convolutional neural networks. J Biol Phys 2021; 47:435-454. [PMID: 34751854 DOI: 10.1007/s10867-021-09593-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/17/2021] [Accepted: 10/18/2021] [Indexed: 10/19/2022]
Abstract
One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding the site of interest. We find that the network can predict the wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct. Predictions of the consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and likely targets for protein engineering.
25
Naganathan AN, Kannan A. A hierarchy of coupling free energies underlie the thermodynamic and functional architecture of protein structures. Curr Res Struct Biol 2021; 3:257-267. [PMID: 34704074 PMCID: PMC8526763 DOI: 10.1016/j.crstbi.2021.09.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/07/2021] [Revised: 09/08/2021] [Accepted: 09/30/2021] [Indexed: 12/22/2022]
Abstract
Protein sequences and structures evolve by satisfying varied physical and biochemical constraints. This multi-level selection is enabled not just by the patterning of amino acids on the sequence, but also via coupling between residues in the native structure. Here, we employ an energetically detailed statistical mechanical model with millions of microstates to extract such long-range structural correlations, i.e. thermodynamic coupling free energies, from a diverse family of protein structures. We find that despite the intricate and anisotropic distribution of coupling patterns, the majority of residues (>70%) are only marginally coupled contributing to functional motions and catalysis. Physical origins of ‘sectors’, determinants of native ensemble heterogeneity in extant, ancient and designed proteins, and the basis for allostery emerge naturally from coupling free energies. The statistical framework highlights how evolutionary selection and optimization occur at the level of global interaction network for a given protein fold impacting folding, function, and allosteric outputs.
Evolution of protein structures occurs at the level of global interaction network. More than 70% of the protein residues are weakly or marginally coupled. Functional ‘sector’ regions are a manifestation of marginal coupling. Coupling indices vary across the entire proteins in extant-ancient and natural-designed pairs. The proposed methodology can be used to understand allostery and epistasis.
Affiliation(s)
- Athi N Naganathan
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, India
- Adithi Kannan
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, India
26
Echave J. Evolutionary coupling range varies widely among enzymes depending on selection pressure. Biophys J 2021; 120:4320-4324. [PMID: 34480927 DOI: 10.1016/j.bpj.2021.08.042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/31/2021] [Revised: 07/19/2021] [Accepted: 08/30/2021] [Indexed: 10/20/2022]
Abstract
Recent studies proposed that enzyme active sites induce evolutionary constraints at long distances. The physical origin of such long-range evolutionary coupling is unknown. Here, I use a recent biophysical model of evolution to study the relationship between physical and evolutionary couplings on a diverse data set of monomeric enzymes. I show that evolutionary coupling is not universally long-range. Rather, its range varies widely among enzymes, from 2 to 20 Å. Furthermore, the evolutionary coupling range of an enzyme does not inform on the underlying physical coupling, which is short-range for all enzymes. Rather, evolutionary coupling range is determined by functional selection pressure.
Affiliation(s)
- Julian Echave
- Instituto de Ciencias Físicas, Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina.
27
Cea-Rama I, Coscolín C, Katsonis P, Bargiela R, Golyshin PN, Lichtarge O, Ferrer M, Sanz-Aparicio J. Structure and evolutionary trace-assisted screening of a residue swapping the substrate ambiguity and chiral specificity in an esterase. Comput Struct Biotechnol J 2021; 19:2307-2317. [PMID: 33995922 PMCID: PMC8105184 DOI: 10.1016/j.csbj.2021.04.041] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/27/2021] [Revised: 04/15/2021] [Accepted: 04/16/2021] [Indexed: 01/02/2023]
Abstract
Our understanding of enzymes with high substrate ambiguity remains limited because their large active sites allow substrate docking freedom to an extent that seems incompatible with stereospecificity. One possibility is that some of these enzymes evolved a set of evolutionarily fitted sequence positions that stringently allow switching substrate ambiguity and chiral specificity. To explore this hypothesis, we targeted for mutation a serine ester hydrolase (EH3) that exhibits an impressive 71-substrate repertoire but is not stereospecific (e.e. 50%). We used structural actions and the computational evolutionary trace method to explore specificity-swapping sequence positions and hypothesized that position I244 was critical. Driven by evolutionary action analysis, this position was substituted to leucine, which together with isoleucine appears to be the amino acid most commonly present in the closest homologous sequences (max. identity, ca. 67.1%), and to phenylalanine, which appears in distant homologues. While the I244L mutation did not have any functional consequences, the I244F mutation allowed the esterase to maintain a remarkable 53-substrate range while gaining stereospecificity properties (e.e. 99.99%). These data support the possibility that some enzymes evolve sequence positions that control the substrate scope and stereospecificity. Such residues, which can be evolutionarily screened, may serve as starting points for further designing substrate-ambiguous, yet chiral-specific, enzymes that are greatly appreciated in biotechnology and synthetic chemistry.
Affiliation(s)
- Isabel Cea-Rama
- Institute of Physical Chemistry “Rocasolano”, CSIC, 28006 Madrid, Spain
- Rafael Bargiela
- Centre for Environmental Biotechnology, Bangor University, LL57 2UW Bangor, UK
- Peter N. Golyshin
- Centre for Environmental Biotechnology, Bangor University, LL57 2UW Bangor, UK
- School of Natural Sciences, Bangor University, LL57 2UW Bangor, UK
28
Sharir-Ivry A, Xia Y. Quantifying evolutionary importance of protein sites: A Tale of two measures. PLoS Genet 2021; 17:e1009476. [PMID: 33826605 PMCID: PMC8026052 DOI: 10.1371/journal.pgen.1009476] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/18/2020] [Accepted: 03/09/2021] [Indexed: 12/05/2022]
Abstract
A key challenge in evolutionary biology is the accurate quantification of selective pressure on proteins and other biological macromolecules at single-site resolution. The evolutionary importance of a protein site under purifying selection is typically measured by the degree of conservation of the protein site itself. A possible alternative measure is the strength of the site-induced conservation gradient in the rest of the protein structure. However, the quantitative relationship between these two measures remains unknown. Here, we show that despite major differences, there is a strong linear relationship between the two measures such that more conserved protein sites also induce a stronger conservation gradient in the rest of the protein. This linear relationship is universal as it holds for different types of proteins and functional sites in proteins. Our results show that the strong selective pressure acting on the functional site in general percolates through the rest of the protein via residue-residue contacts. Surprisingly, however, catalytic sites in enzymes are the principal exception to this rule. Catalytic sites induce significantly stronger conservation gradients in the rest of the protein than expected from the degree of conservation of the site alone. The unique requirement for the active site to selectively stabilize the transition state of the catalyzed chemical reaction imposes additional selective constraints on the rest of the enzyme.
Sites within proteins which are important for stability or function are under stronger selective pressure and evolve more slowly than other sites. Catalytic sites in enzymes are such highly conserved sites with relatively low evolutionary rates. Recently, catalytic sites were shown to induce a strong gradient of conservation such that the closer a residue is to the catalytic site, the more conserved it is. Here we show that there is a universal linear relationship between the degree of evolutionary conservation of a protein site and the conservation gradient it induces in the protein tertiary structure, applicable to all types of sites. Our findings suggest that selective pressure acting on a protein site generally percolates through the rest of the protein via residue-residue contacts. Remarkably, however, catalytic sites induce significantly stronger conservation gradients than expected from their degree of conservation alone. Our results indicate that the strong conservation gradient induced by catalytic sites is driven by the unique function of enzyme catalysis, which requires the participation of many residues beyond the few key catalytic residues. Our results provide insights into evolutionary conservation patterns of and surrounding protein functional sites, with implications for functional site prediction and protein design.
Affiliation(s)
- Avital Sharir-Ivry
- Department of Bioengineering, McGill University, Montreal, Quebec, Canada
- Yu Xia
- Department of Bioengineering, McGill University, Montreal, Quebec, Canada
29
Cagiada M, Johansson KE, Valanciute A, Nielsen SV, Hartmann-Petersen R, Yang JJ, Fowler DM, Stein A, Lindorff-Larsen K. Understanding the Origins of Loss of Protein Function by Analyzing the Effects of Thousands of Variants on Activity and Abundance. Mol Biol Evol 2021; 38:3235-3246. [PMID: 33779753 PMCID: PMC8321532 DOI: 10.1093/molbev/msab095] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Indexed: 12/20/2022]
Abstract
Understanding and predicting how amino acid substitutions affect proteins are keys to our basic understanding of protein function and evolution. Amino acid changes may affect protein function in a number of ways including direct perturbations of activity or indirect effects on protein folding and stability. We have analyzed 6,749 experimentally determined variant effects from multiplexed assays on abundance and activity in two proteins (NUDT15 and PTEN) to quantify these effects and find that a third of the variants cause loss of function, and about half of loss-of-function variants also have low cellular abundance. We analyze the structural and mechanistic origins of loss of function and use the experimental data to find residues important for enzymatic activity. We performed computational analyses of protein stability and evolutionary conservation and show how we may predict positions where variants cause loss of activity or abundance. In this way, our results link thermodynamic stability and evolutionary conservation to experimental studies of different properties of protein fitness landscapes.
Affiliation(s)
- Matteo Cagiada
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kristoffer E Johansson
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Audrone Valanciute
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Sofie V Nielsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Rasmus Hartmann-Petersen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jun J Yang
- Department of Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN, USA; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA; Department of Bioengineering, University of Washington, Seattle, WA, USA
- Amelie Stein
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
30
Tanaka SI, Tsutaki M, Yamamoto S, Mizutani H, Kurahashi R, Hirata A, Takano K. Exploring mutable conserved sites and fatal non-conserved sites by random mutation of esterase from Sulfolobus tokodaii and subtilisin from Thermococcus kodakarensis. Int J Biol Macromol 2020; 170:343-353. [PMID: 33383075 DOI: 10.1016/j.ijbiomac.2020.12.171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2020] [Revised: 12/21/2020] [Accepted: 12/22/2020] [Indexed: 10/22/2022]
Abstract
Homologous proteins differ in their amino acid sequences at several positions. Generally, conserved sites are recognized as not suitable for amino acid substitution, and thus in evolutionary protein engineering, non-conserved sites are often selected as mutation sites. However, there have also been reports of possible mutations in conserved sites. In this study, we explored mutable conserved sites and immutable non-conserved sites by testing random mutations of two thermostable proteins, an esterase from Sulfolobus tokodaii (Sto-Est) and a subtilisin from Thermococcus kodakarensis (Tko-Sub). The subtilisin domain of Tko-Sub needs Ca2+ ions and the propeptide domain for stability, folding and maturation. The results from the two proteins showed that about one-third of the mutable sites were detected in conserved sites and some non-conserved sites lost enzymatic activity at high temperatures due to mutation. Of the conserved sites in Sto-Est, the sites on the loop, on the surface, and far from the active site are more resistant to mutation. In Tko-Sub, the sites flanking Ca2+-binding sites and propeptide were undesirable for mutation. The results presented here serve as an index for selecting mutation sites and contribute to the expansion of available sequence range by introducing mutations at conserved sites.
Affiliation(s)
- Shun-Ichi Tanaka
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
- Minami Tsutaki
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
- Seira Yamamoto
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
- Hayate Mizutani
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
- Ryo Kurahashi
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
- Azumi Hirata
- Department of Anatomy and Cell Biology, Osaka Medical College, Daigaku-machi, Takatsuki, Osaka 569-8686, Japan
- Kazufumi Takano
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan.
31
Sruthi C, Balaram H, Prakash MK. Toward Developing Intuitive Rules for Protein Variant Effect Prediction Using Deep Mutational Scanning Data. ACS Omega 2020; 5:29667-29677. [PMID: 33251402 PMCID: PMC7689672 DOI: 10.1021/acsomega.0c02402] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/22/2020] [Accepted: 07/28/2020] [Indexed: 05/30/2023]
Abstract
Protein structure and function can be severely altered by even a single amino acid mutation. Predictions of mutational effects using extensive artificial intelligence (AI)-based models, although accurate, remain as enigmatic as the experimental observations in terms of improving intuitions about the contributions of various factors. Inspired by Lipinski's rules for drug-likeness, we devise simple thresholding criteria on five different descriptors such as conservation, which have so far been limited to qualitative interpretations such as high conservation implies high mutational effect. We analyze systematic deep mutational scanning data of all possible single amino acid substitutions on seven proteins (25153 mutations) to first define these thresholds and then to evaluate the scope and limits of the predictions. At this stage, the approach allows us to comment easily and with a low error rate on the subset of mutations classified as neutral or deleterious by all of the descriptors. We hope that complementary to the accurate AI predictions, these thresholding rules or their subsequent modifications will serve the purpose of codifying the knowledge about the effects of mutations.
Affiliation(s)
- Cheloor Kovilakam Sruthi
- Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India
- Hemalatha Balaram
- Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India
- Meher K. Prakash
- Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India
32
Subramanian S, Golla H, Divakar K, Kannan A, de Sancho D, Naganathan AN. Slow Folding of a Helical Protein: Large Barriers, Strong Internal Friction, or a Shallow, Bumpy Landscape? J Phys Chem B 2020; 124:8973-8983. [PMID: 32955882 PMCID: PMC7659034 DOI: 10.1021/acs.jpcb.0c05976] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
Abstract
The rate at which a protein molecule folds is determined by opposing energetic and entropic contributions to the free energy that shape the folding landscape. Delineating the extent to which they impact the diffusional barrier-crossing events, including the magnitude of internal friction and barrier height, has largely been a challenging task. In this work, we extract the underlying thermodynamic and dynamic contributions to the folding rate of an unusually slow-folding helical DNA-binding domain, PurR, which shares the characteristics of ultrafast downhill-folding proteins but nonetheless appears to exhibit an apparent two-state equilibrium. We combine equilibrium spectroscopy, temperature-viscosity-dependent kinetics, statistical mechanical modeling, and coarse-grained simulations to show that the conformational behavior of PurR is highly heterogeneous, characterized by a large spread in melting temperatures, marginal thermodynamic barriers, and populated partially structured states. PurR appears to be at the threshold of disorder arising from frustrated electrostatics and weak packing that in turn slows down folding due to a shallow, bumpy landscape and not due to large thermodynamic barriers or strong internal friction. Our work highlights how a strong temperature dependence on the pre-exponential could signal a shallow landscape and not necessarily a slow-folding diffusion coefficient, thus determining the folding timescales of even millisecond folding proteins and hints at possible structural origins for the shallow landscape.
Affiliation(s)
- Sandhyaa Subramanian
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Hemashree Golla
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Kalivarathan Divakar
- Department of Biotechnology, National Institute of Technology Warangal, Warangal 506004, India
- Adithi Kannan
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- David de Sancho
- Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, Euskal Herriko Unibertsitatea UPV/EHU, Donostia-San Sebastián 20080, Spain; Donostia International Physics Center (DIPC), PK 1072, Donostia-San Sebastián 20080, Spain
- Athi N Naganathan
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
33
Chan YH, Zeldovich KB, Matthews CR. An allosteric pathway explains beneficial fitness in yeast for long-range mutations in an essential TIM barrel enzyme. Protein Sci 2020; 29:1911-1923. [PMID: 32643222 PMCID: PMC7454521 DOI: 10.1002/pro.3911] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/21/2020] [Revised: 07/03/2020] [Accepted: 07/07/2020] [Indexed: 11/06/2022]
Abstract
Protein evolution proceeds by a complex response of organismal fitness to mutations that can simultaneously affect protein stability, structure, and enzymatic activity. To probe the relationship between genotype and phenotype, we chose a fundamental paradigm for protein evolution, folding, and design, the (βα)8 TIM barrel fold. Here, we demonstrate the role of long-range allosteric interactions in the adaptation of an essential hyperthermophilic TIM barrel enzyme to mesophilic conditions in a yeast host. Beneficial fitness effects observed with single and double mutations of the canonical βα-hairpin clamps and the α-helical shell distal to the active site revealed an underlying energy network between opposite faces of the cylindrical β-barrel. We experimentally determined the fitness of multiple mutants in the energetic phase plane, contrasting the energy barrier of the chemical reaction and the folding free energy of the protein. For the system studied, the reaction energy barrier was the primary determinant of organism fitness. Our observations of long-range epistatic interactions uncovered an allosteric pathway in an ancient and ubiquitous enzyme that may provide a novel way of designing proteins with a desired activity and stability profile.
Affiliation(s)
- Yvonne H Chan
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, USA; Sanofi Pasteur, Cambridge, Massachusetts, USA
- Konstantin B Zeldovich
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, USA; Sanofi Pasteur, Cambridge, Massachusetts, USA
- Charles R Matthews
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, USA
34
Paul A, Srinivasan N. Genome-wide and structural analyses of pseudokinases encoded in the genome of Arabidopsis thaliana provide functional insights. Proteins 2020; 88:1620-1638. [PMID: 32667690 DOI: 10.1002/prot.25981] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/18/2020] [Revised: 05/26/2020] [Accepted: 07/12/2020] [Indexed: 12/31/2022]
Abstract
Protein Kinase-Like Non-Kinases (PKLNKs), commonly known as "pseudokinases", are homologous to eukaryotic Ser/Thr/Tyr protein kinases (PKs) but lack the crucial aspartate residue in the catalytic loop, which is indispensable for phosphotransferase activity. Therefore, they are predicted to be "catalytically inactive" enzyme homologs. Analysis of protein kinase-like sequences from Arabidopsis thaliana led to the identification of more than 120 pseudokinases lacking the catalytic aspartate, the majority of which are closely related to the plant-specific receptor-like kinase family. These pseudokinases engage in different biological processes, enabled by their diverse domain architectures and specific subcellular localizations. Structural comparison of pseudokinases with active and inactive conformations of canonical PKs, of both plant and animal origin, revealed unique structural differences. The currently available crystal structures of pseudokinases show that the loop topologically equivalent to the activation segment of PKs adopts a distinct folded conformation, packing against the pseudoenzyme core, in contrast to the extended and inhibitory geometries observed for active and inactive states, respectively, of catalytic PKs. A salt bridge between the ATP-binding Lys and DFG-Asp, as well as hydrophobic interactions between the conserved nonpolar residue C-terminal to the equivalent DFG motif and nonpolar residues in the C-helix, mediate such a conformation in pseudokinases. This results in enhanced solvent accessibility of the pseudocatalytic loop in pseudokinases, which can possibly serve as an interacting surface while associating with other proteins. Specifically, our analysis identified several residues that may be involved in pseudokinase regulation and hints at the repurposing of pseudocatalytic residues to achieve mechanistic control over noncatalytic functions of pseudoenzymes.
Affiliation(s)
- Anindita Paul
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
35
Alfayate A, Rodriguez Caceres C, Gomes Dos Santos H, Bastolla U. Predicted dynamical couplings of protein residues characterize catalysis, transport and allostery. Bioinformatics 2020; 35:4971-4978. [PMID: 31038697 DOI: 10.1093/bioinformatics/btz301] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/10/2018] [Revised: 03/21/2019] [Accepted: 04/19/2019] [Indexed: 02/06/2023]
Abstract
MOTIVATION: Protein function is intrinsically linked to native dynamics, but the systematic characterization of functionally relevant dynamics remains elusive besides specific examples. Here we exhaustively characterize three types of dynamical couplings between protein residues: co-directionality (moving along collinear directions), coordination (small fluctuations of the interatomic distance) and deformation (the extent by which perturbations applied at one residue modify the local structure of the other one), which we analytically compute through the torsional network model.
RESULTS: We find that ligand binding sites are characterized by large within-site coordination and co-directionality, much larger than expected for generic sets of residues with equivalent sequence distances. In addition, catalytic sites are characterized by high coordination couplings with other residues in the protein, supporting the view that the overall protein structure facilitates the catalytic dynamics. The binding sites of allosteric effectors are characterized by comparably smaller coordination and higher within-site deformation than other ligands, which supports their dynamic nature. Allosteric inhibitors are coupled to the active site more frequently through deformation than through coordination, while the contrary holds for activators. We characterize the dynamical couplings of the sodium-dependent Leucine transporter protein (LeuT). The couplings between and within sites progress consistently along the transport cycle, providing a mechanistic description of the coupling between the uptake and release of ions and substrate, and they highlight qualitative differences between the wild-type and a mutant for which chloride is necessary for transport.
AVAILABILITY AND IMPLEMENTATION: The program tnm is freely available at https://github.com/ugobas/tnm.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alvaro Alfayate
- Centro de Biologia Molecular "Severo Ochoa" CSIC-UAM Cantoblanco, Madrid, Spain
- Ugo Bastolla
- Centro de Biologia Molecular "Severo Ochoa" CSIC-UAM Cantoblanco, Madrid, Spain
36
Mazmanian K, Sargsyan K, Lim C. How the Local Environment of Functional Sites Regulates Protein Function. J Am Chem Soc 2020; 142:9861-9871. [PMID: 32407086 DOI: 10.1021/jacs.0c02430] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/12/2022]
Abstract
Proteins form complex biological machineries whose functions in the cell are highly regulated at both the cellular and molecular levels. Cellular regulation of protein functions involves differential gene expression, post-translational modifications, and signaling cascades. Molecular regulation, on the other hand, involves tuning an optimal local protein environment for the functional site. Precisely how a protein achieves such an optimal environment around a given functional site is not well understood. Herein, by surveying the literature, we first summarize the various reported strategies used by certain proteins to ensure their correct functioning. We then formulate three key physicochemical factors for regulating a protein's functional site, namely, (i) its immediate interactions, (ii) its solvent accessibility, and (iii) its conformational flexibility. We illustrate how these factors are applied to regulate the functions of free/metal-bound Cys and Zn sites in proteins.
Affiliation(s)
- Karine Mazmanian
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Karen Sargsyan
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Carmay Lim
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan; Department of Chemistry, National Tsing Hua University, Hsinchu 300, Taiwan
37
Integrated structural and evolutionary analysis reveals common mechanisms underlying adaptive evolution in mammals. Proc Natl Acad Sci U S A 2020; 117:5977-5986. [PMID: 32123117 DOI: 10.1073/pnas.1916786117] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/12/2022]
Abstract
Understanding the molecular basis of adaptation to the environment is a central question in evolutionary biology, yet linking detected signatures of positive selection to molecular mechanisms remains challenging. Here we demonstrate that combining sequence-based phylogenetic methods with structural information assists in making such mechanistic interpretations on a genomic scale. Our integrative analysis shows that positively selected sites tend to colocalize on protein structures and that positively selected clusters are found in functionally important regions of proteins, indicating that positive selection can contravene the well-known principle of evolutionary conservation of functionally important regions. This unexpected finding, along with our discovery that positive selection acts on structural clusters, opens previously unexplored strategies for the development of better models of protein evolution. Remarkably, proteins where we detect the strongest evidence of clustering belong to just two functional groups: components of the immune response and metabolic enzymes. This gives a coherent picture of pathogens and xenobiotics as important drivers of adaptive evolution of mammals.
|
38
|
Faber MS, Wrenbeck EE, Azouz LR, Steiner PJ, Whitehead TA. Impact of In Vivo Protein Folding Probability on Local Fitness Landscapes. Mol Biol Evol 2020; 36:2764-2777. [PMID: 31400199 DOI: 10.1093/molbev/msz184] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/14/2022] Open
Abstract
It is incompletely understood how biophysical properties like protein stability impact molecular evolution and epistasis. Epistasis is defined as specific when a mutation exclusively influences the phenotypic effect of another mutation, often at physically interacting residues. In contrast, nonspecific epistasis results when a mutation is influenced by a large number of nonlocal mutations. As most mutations are pleiotropic, the in vivo folding probability, governed by basal protein stability, is thought to determine activity-enhancing mutational tolerance, implying that nonspecific epistasis is dominant. However, evidence exists for both specific and nonspecific epistasis as the prevalent factor, with limited comprehensive data sets to support either claim. Here, we use deep mutational scanning to probe how in vivo enzyme folding probability impacts local fitness landscapes. We computationally designed two different variants of the amidase AmiE with statistically indistinguishable catalytic efficiencies but lower probabilities of folding in vivo compared with wild-type. Local fitness landscapes show slight alterations among variants, with essentially the same global distribution of fitness effects. However, specific epistasis was predominant for the subset of mutations exhibiting positive sign epistasis. These mutations mapped to spatially distinct locations on AmiE near the initial mutation or proximal to the active site. Intriguingly, the majority of specific epistatic mutations were codon dependent, with different synonymous codons resulting in fitness sign reversals. Together, these results offer a nuanced view of how protein folding probability impacts local fitness landscapes and suggest that transcriptional-translational effects are as important as stability in determining evolutionary outcomes.
Affiliation(s)
- Matthew S Faber
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI
| | - Emily E Wrenbeck
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI
| | - Laura R Azouz
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI
| | - Paul J Steiner
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO
| | - Timothy A Whitehead
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI; Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO; E.E.W. Ginkgo Bioworks, L.R.A. McKetta Department of Chemical Engineering, University of Texas at Austin, Austin, TX
|
39
|
Mayorov A, Dal Peraro M, Abriata LA. Active Site-Induced Evolutionary Constraints Follow Fold Polarity Principles in Soluble Globular Enzymes. Mol Biol Evol 2020; 36:1728-1733. [PMID: 31004173 DOI: 10.1093/molbev/msz096] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022] Open
Abstract
A recent analysis of evolutionary rates in >500 globular soluble enzymes revealed pervasive conservation gradients toward catalytic residues. By looking at amino acid preference profiles rather than evolutionary rates in the same data set, we quantified the effects of active sites on site-specific constraints for physicochemical traits. We found that conservation gradients respond to constraints for polarity, hydrophobicity, flexibility, rigidity and structure in ways consistent with fold polarity principles; while sites far from active sites seem to experience no physicochemical constraint, rather being highly variable and favoring amino acids of low metabolic cost. Globally, our results highlight that amino acid variation contains finer information about protein structure than usually regarded in evolutionary models, and that this information is retrievable automatically with simple fits. We propose that analyses of the kind presented here incorporated into models of protein evolution should allow for better description of the physical chemistry that underlies molecular evolution.
Affiliation(s)
- Alexander Mayorov
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Matteo Dal Peraro
- Laboratory for Biomolecular Modeling, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Luciano A Abriata
- Laboratory for Biomolecular Modeling, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland; Protein Production and Structure Core Facility, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
40
|
Differences in protein structural regions that impact functional specificity in GT2 family β-glucan synthases. PLoS One 2019; 14:e0224442. [PMID: 31665152 PMCID: PMC6821405 DOI: 10.1371/journal.pone.0224442] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/01/2019] [Accepted: 10/14/2019] [Indexed: 12/16/2022] Open
Abstract
Most cell wall and secreted β-glucans are synthesised by the CAZy Glycosyltransferase 2 family (www.cazy.org), with different members catalysing the formation of (1,4)-β-, (1,3)-β-, or both (1,4)- and (1,3)-β-glucosidic linkages. Given that the distinct physicochemical properties of each of the resultant β-glucans (cellulose, curdlan, and mixed linkage glucan, respectively) are crucial to their biological and biotechnological functions, there is a desire to understand the molecular evolution of synthesis and how linkage specificity is determined. With structural studies hamstrung by the instability of these proteins to solubilisation, we have utilised in silico techniques and the crystal structure for a bacterial cellulose synthase to further understand how these enzymes have evolved distinct functions. Sequence and phylogenetic analyses were performed to determine amino acid conservation, both family-wide and within each sub-family. Further structural analysis centred on comparison of a bacterial curdlan synthase homology model with the bacterial cellulose synthase crystal structure, with molecular dynamics simulations performed with their respective β-glucan products bound in the trans-membrane channel. Key residues that differentially interact with the different β-glucan chains and have sub-family-specific conservation were found to reside at the entrance of the trans-membrane channel. The linkage-specific catalytic activity of these enzymes and hence the type of β-glucan chain built is thus likely determined by the different interactions between the proteins and the first few glucose residues in the channel, which in turn dictates the position of the acceptor glucose. The sequence-function relationships for the bacterial β-glucan synthases pave the way for extending this understanding to other kingdoms, such as plants.
|
41
|
Konaté MM, Plata G, Park J, Usmanova DR, Wang H, Vitkup D. Molecular function limits divergent protein evolution on planetary timescales. eLife 2019; 8:e39705. [PMID: 31532392 PMCID: PMC6750897 DOI: 10.7554/elife.39705] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/29/2018] [Accepted: 08/07/2019] [Indexed: 01/25/2023] Open
Abstract
Functional conservation is known to constrain protein evolution. Nevertheless, the long-term divergence patterns of proteins maintaining the same molecular function and the possible limits of this divergence have not been explored in detail. We investigate these fundamental questions by characterizing the divergence between ancient protein orthologs with conserved molecular function. Our results demonstrate that the decline of sequence and structural similarities between such orthologs significantly slows down after ~1-2 billion years of independent evolution. As a result, the sequence and structural similarities between ancient orthologs have not substantially decreased for the past billion years. The effective divergence limit (>25% sequence identity) is not primarily due to protein sites universally conserved in all lineages. Instead, less than four amino acid types are accepted, on average, per site across orthologous protein sequences. Our analysis also reveals different divergence patterns for protein sites with experimentally determined small and large fitness effects of mutations.
Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
Affiliation(s)
- Mariam M Konaté
- Department of Systems Biology, Columbia University, New York, United States
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, United States
| | - Germán Plata
- Department of Systems Biology, Columbia University, New York, United States
| | - Jimin Park
- Department of Systems Biology, Columbia University, New York, United States
- Department of Pathology and Cell Biology, Columbia University, New York, United States
| | - Dinara R Usmanova
- Department of Systems Biology, Columbia University, New York, United States
| | - Harris Wang
- Department of Systems Biology, Columbia University, New York, United States
- Department of Pathology and Cell Biology, Columbia University, New York, United States
| | - Dennis Vitkup
- Department of Systems Biology, Columbia University, New York, United States
- Department of Biomedical Informatics, Columbia University, New York, United States
|
42
|
Sharir-Ivry A, Xia Y. Non-catalytic Binding Sites Induce Weaker Long-Range Evolutionary Rate Gradients than Catalytic Sites in Enzymes. J Mol Biol 2019; 431:3860-3870. [DOI: 10.1016/j.jmb.2019.07.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/29/2019] [Revised: 06/26/2019] [Accepted: 07/11/2019] [Indexed: 01/02/2023]
|
43
|
Agarwal N, Walvekar AS, Punekar NS. 2-Oxoglutarate cooperativity and biphasic ammonium saturation of Aspergillus niger NADP-glutamate dehydrogenase are structurally coupled. Arch Biochem Biophys 2019; 669:50-60. [PMID: 31136734 DOI: 10.1016/j.abb.2019.05.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/18/2019] [Revised: 05/22/2019] [Accepted: 05/24/2019] [Indexed: 11/18/2022]
Abstract
NADP-glutamate dehydrogenase from Aspergillus niger (AnGDH) exhibits sigmoidal 2-oxoglutarate saturation. Despite sharing 88% amino acid identity, the homologous enzyme from Aspergillus terreus (AtGDH) shows hyperbolic 2-oxoglutarate saturation. In order to address the structural origins of this phenomenon, six AnGDH-AtGDH chimeras were constructed and characterized. The C-terminal sequence (residues 315-460, named the D-segment) was implicated in the AnGDH cooperativity. The D-segment residues largely contribute to the monomer-monomer interface of each trimer in the native hexamer and are far removed from the enzyme active site. The D-segment appears to be a part of the allosteric network responsible for 2-oxoglutarate homotropic interactions in AnGDH. AnGDH and its C415S mutant, but not AtGDH, also showed atypical, biphasic ammonium saturation, particularly at sub-saturating 2-oxoglutarate concentrations. We found that the sigmoidal 2-oxoglutarate saturation and the biphasic ammonium response are tightly coupled; the analysis of AnGDH-AtGDH chimeras ascribes the two features to the AnGDH D-segment. The two non-Michaelis-Menten substrate saturations of AnGDH were influenced by ionic strength. Increase in ionic strength reduced the nH of 2-oxoglutarate saturation as well as abolished the biphasic response, suggesting that polar/ionic interactions determine the allosteric, inter-subunit communications. The biochemical analysis in the context of available structural data implicates the D-segment of AnGDH in the allosteric feature of this enzyme. The coupling of sigmoidal 2-oxoglutarate saturation and the biphasic ammonium response could possibly confer growth advantage to A. niger experiencing carbon and/or nitrogen limitation.
Affiliation(s)
- Nupur Agarwal
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, 400076, Maharashtra, India
| | - Adhish S Walvekar
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, 400076, Maharashtra, India
| | - Narayan S Punekar
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, 400076, Maharashtra, India.
|
44
|
Mojica MF, Rutter JD, Taracila M, Abriata LA, Fouts DE, Papp-Wallace KM, Walsh TJ, LiPuma JJ, Vila AJ, Bonomo RA. Population Structure, Molecular Epidemiology, and β-Lactamase Diversity among Stenotrophomonas maltophilia Isolates in the United States. mBio 2019; 10:e00405-19. [PMID: 31266860 PMCID: PMC6606795 DOI: 10.1128/mbio.00405-19] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 02/13/2019] [Accepted: 06/03/2019] [Indexed: 01/06/2023] Open
Abstract
Stenotrophomonas maltophilia is a Gram-negative, nonfermenting, environmental bacillus that is an important cause of nosocomial infections, primarily associated with the respiratory tract in the immunocompromised population. Aiming to understand the population structure, microbiological characteristics and impact of allelic variation on β-lactamase structure and function, we collected 130 clinical isolates from across the United States. Identification of 90 different sequence types (STs), of which 63 are new allelic combinations, demonstrates the high diversity of this species. The majority of the isolates (45%) belong to genomic group 6. We also report excellent activity of the ceftazidime-avibactam and aztreonam combination, especially against strains recovered from blood and respiratory infections for which the susceptibility is higher than the susceptibility to trimethoprim-sulfamethoxazole, considered the "first-line" antibiotic to treat S. maltophilia. Analysis of 73 blaL1 and 116 blaL2 genes identified 35 and 43 novel variants of L1 and L2 β-lactamases, respectively. Investigation of the derived amino acid sequences showed that substitutions are mostly conservative and scattered throughout the protein, preferentially affecting positions that do not compromise enzyme function but that may have an impact on substrate and inhibitor binding. Interestingly, we detected a probable association between a specific type of L1 and L2 and genomic group 6. Taken together, our results provide an overview of the molecular epidemiology of S. maltophilia clinical strains from the United States. In particular, the discovery of new L1 and L2 variants warrants further study to fully understand the relationship between them and the β-lactam resistance phenotype in this pathogen.
IMPORTANCE: Multiple antibiotic resistance mechanisms, including two β-lactamases, L1, a metallo-β-lactamase, and L2, a class A cephalosporinase, make S. maltophilia naturally multidrug resistant. Thus, infections caused by S. maltophilia pose a big therapeutic challenge. Our study aims to understand the microbiological and molecular characteristics of S. maltophilia isolates recovered from human sources. A highlight of the resistance profile of this collection is the excellent activity of the ceftazidime-avibactam and aztreonam combination. We hope this result prompts controlled and observational studies to add clinical data on the utility and safety of this therapy. We also identify 35 and 43 novel variants of L1 and L2, respectively, some of which harbor novel substitutions that could potentially affect substrate and/or inhibitor binding. We believe our results provide valuable knowledge to understand the epidemiology of this species and to advance mechanism-based inhibitor design to add to the limited arsenal of antibiotics active against this pathogen.
Affiliation(s)
- Maria F Mojica
- Department of Biochemistry, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Research Service, Louis Stokes Veterans Affairs Medical Center, Cleveland, Ohio, USA
| | - Joseph D Rutter
- Research Service, Louis Stokes Veterans Affairs Medical Center, Cleveland, Ohio, USA
| | - Magdalena Taracila
- Research Service, Louis Stokes Veterans Affairs Medical Center, Cleveland, Ohio, USA
- Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - Luciano A Abriata
- Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | - Krisztina M Papp-Wallace
- Department of Biochemistry, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Research Service, Louis Stokes Veterans Affairs Medical Center, Cleveland, Ohio, USA
- Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - Thomas J Walsh
- Transplantation Oncology Infectious Diseases Program, Weill Cornell Medical Center, New York, New York, USA
| | - John J LiPuma
- Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Alejandro J Vila
- Instituto de Biología Molecular y Celular de Rosario (IBR, CONICET-UNR), Rosario, Argentina
| | - Robert A Bonomo
- Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Department of Pharmacology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Department of Molecular Biology and Microbiology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Department of Biochemistry, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Center for Proteomics and Bioinformatics, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Medical Service, Louis Stokes Cleveland Veterans Affairs Medical Center, Cleveland, Ohio, USA
- GRECC, Louis Stokes Cleveland Veterans Affairs Medical Center, Cleveland, Ohio, USA
- CWRU-Cleveland VAMC Center for Antimicrobial Resistance and Epidemiology (Case VA CARES), Cleveland, Ohio, USA
|
45
|
Sharir-Ivry A, Xia Y. Nature of Long-Range Evolutionary Constraint in Enzymes: Insights from Comparison to Pseudoenzymes with Similar Structures. Mol Biol Evol 2019; 35:2597-2606. [PMID: 30202983 DOI: 10.1093/molbev/msy177] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022] Open
Abstract
Enzymes are known to fine-tune their sequences to optimize catalytic function, yet quantitative evolutionary design principles of enzymes remain elusive on the proteomic scale. Recently, it was found that the catalytic site in enzymes induces long-range evolutionary constraint, where even sites distant to the catalytic site are more conserved than expected. Given that protein-fold usage is generally different between enzymes and nonenzymes, it remains an open question to what extent this long-range evolutionary constraint in enzymes is dictated, either directly or indirectly, by the special three-dimensional structure of the enzyme. To investigate this question, we have compared evolutionary properties of enzymes with those of counterpart pseudoenzymes that share the same protein fold but are catalytically inactive. We found that the long-range evolutionary constraint observed in enzymes is significantly reduced in pseudoenzyme counterparts, despite very high structural similarity (∼1.5 Å RMSD on average). Furthermore, this significant reduction in long-range evolutionary constraint is observed even in pseudoenzyme counterparts which retain the ligand-binding ability of enzymes. Finally, the distance between the site that induces the highest gradient of sequence conservation and the pseudocatalytic site in pseudoenzymes is significantly larger than the corresponding distance in enzymes. Taken together, our results suggest that the long-range evolutionary constraint in enzymes is induced mainly by the presence of the catalytic site rather than by the special three-dimensional structure of the enzyme, and that such long-range evolutionary constraint in enzymes depends mainly on the catalytic function of the active site rather than on the ligand-binding ability of the enzyme.
Affiliation(s)
| | - Yu Xia
- Department of Bioengineering, McGill University, Montreal, QC, Canada
|
46
|
Sharir-Ivry A, Xia Y. Using Pseudoenzymes to Probe Evolutionary Design Principles of Enzymes. Evol Bioinform Online 2019; 15:1176934319855937. [PMID: 31236007 PMCID: PMC6572901 DOI: 10.1177/1176934319855937] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/06/2019] [Accepted: 05/15/2019] [Indexed: 12/24/2022] Open
Abstract
Enzymes are governed by unique evolutionary design principles as their catalytic sites were shown to induce long-range evolutionary conservation gradients. We have recently used a comparative bioinformatics approach to disentangle structural determinants from other possible determinants of the evolutionary conservation gradients. The approach is based on comparing the evolutionary patterns of enzymes to those of pseudoenzymes with the same tertiary structure where the catalytic functionality is turned off. This approach provides a way to evaluate several hypotheses regarding the origin of the observed evolutionary conservation gradient in enzymes. The conclusions from such comparative analyses are important for a better understanding of the unique evolutionary design principles of enzymes, which can in turn potentially guide the design of new and improved enzymes.
Affiliation(s)
| | - Yu Xia
- Department of Bioengineering, McGill University, Montreal, QC, Canada
|
47
|
Timinskas K, Venclovas Č. New insights into the structures and interactions of bacterial Y-family DNA polymerases. Nucleic Acids Res 2019; 47:4393-4405. [PMID: 30916324 PMCID: PMC6511836 DOI: 10.1093/nar/gkz198] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/27/2018] [Revised: 03/09/2019] [Accepted: 03/19/2019] [Indexed: 11/15/2022] Open
Abstract
Bacterial Y-family DNA polymerases are usually classified into DinB (Pol IV), UmuC (the catalytic subunit of Pol V) and ImuB, a catalytically dead essential component of the ImuA-ImuB-DnaE2 mutasome. However, the true diversity of Y-family polymerases is unknown. Furthermore, for most of them the structures are unavailable and interactions are poorly characterized. To gain a better understanding of bacterial Y-family DNA polymerases, we performed a detailed computational study. It revealed substantial diversity, far exceeding traditional classification. We found that a large number of subfamilies feature a C-terminal extension next to the common Y-family region. Unexpectedly, in most C-terminal extensions we identified a region homologous to the N-terminal oligomerization motif of RecA. This finding implies a universal mode of interaction between Y-family members and RecA (or ImuA), in the case of Pol V strongly supported by experimental data. In gram-positive bacteria, we identified a putative Pol V counterpart composed of a Y-family polymerase, a YolD homolog and RecA. We also found ImuA-ImuB-DnaE2 variants lacking ImuA, but retaining active or inactive Y-family polymerase, a standalone ImuB C-terminal domain and/or DnaE2. In summary, our analyses revealed that, despite considerable diversity, bacterial Y-family polymerases share previously unanticipated similarities in their structural domains/motifs and interactions.
Affiliation(s)
- Kęstutis Timinskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio 7, Vilnius LT-10257, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio 7, Vilnius LT-10257, Lithuania
|
48
|
Beaver SK, Mesa-Torres N, Pey AL, Timson DJ. NQO1: A target for the treatment of cancer and neurological diseases, and a model to understand loss of function disease mechanisms. Biochim Biophys Acta Proteins Proteom 2019; 1867:663-676. [PMID: 31091472 DOI: 10.1016/j.bbapap.2019.05.002] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 04/15/2019] [Revised: 05/07/2019] [Accepted: 05/09/2019] [Indexed: 01/08/2023]
Abstract
NAD(P)H quinone oxidoreductase 1 (NQO1) is a multi-functional protein that catalyses the reduction of quinones (and other molecules), thus playing roles in xenobiotic detoxification and redox balance, and also has roles in stabilising apoptosis regulators such as p53. The structure and enzymology of NQO1 are well characterised, showing a substituted enzyme mechanism in which NAD(P)H binds first and reduces an FAD cofactor in the active site, assisted by a charge relay system involving Tyr-155 and His-161. Protein dynamics play an important role in physio-pathological aspects of this protein. NQO1 is a good target to treat cancer due to its overexpression in cancer cells. A polymorphic form of NQO1 (p.P187S) is associated with increased cancer risk and certain neurological disorders (such as multiple sclerosis and Alzheimer's disease), possibly due to its roles in the antioxidant defence. p.P187S has greatly reduced FAD affinity and stability, due to destabilization of the flavin binding site and the C-terminal domain, leading to reduced activity and enhanced degradation. Suppressor mutations partially restore the activity of p.P187S by local stabilization of these regions, showing long-range allosteric communication within the protein. Consequently, the correction of NQO1 misfolding by pharmacological chaperones is a viable strategy, which may be useful to treat cancer and some neurological conditions, targeting structural spots linked to specific disease-mechanisms. Thus, NQO1 emerges as a good model to investigate loss of function mechanisms in genetic diseases as well as to improve strategies to discriminate between neutral and pathogenic variants in genome-wide sequencing studies.
Affiliation(s)
- Sarah K Beaver
- School of Pharmacy and Biomolecular Sciences, University of Brighton, Huxley Building, Lewes Road, Brighton BN2 4GJ, UK
| | - Noel Mesa-Torres
- Department of Physical Chemistry, Faculty of Sciences, University of Granada, Av. Fuentenueva s/n, 18071, Spain
| | - Angel L Pey
- Department of Physical Chemistry, Faculty of Sciences, University of Granada, Av. Fuentenueva s/n, 18071, Spain.
| | - David J Timson
- School of Pharmacy and Biomolecular Sciences, University of Brighton, Huxley Building, Lewes Road, Brighton BN2 4GJ, UK.
|
49
|
Vats S, Shanker A. Groups of coevolving positions provide drug resistance to Mycobacterium tuberculosis: A study using targets of first-line antituberculosis drugs. Int J Antimicrob Agents 2019; 53:197-202. [DOI: 10.1016/j.ijantimicag.2018.10.027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/05/2018] [Revised: 10/13/2018] [Accepted: 10/20/2018] [Indexed: 01/19/2023]
|
50
|
Echave J. Beyond Stability Constraints: A Biophysical Model of Enzyme Evolution with Selection on Stability and Activity. Mol Biol Evol 2018; 36:613-620. [DOI: 10.1093/molbev/msy244] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 12/14/2022] Open
Affiliation(s)
- Julian Echave
- Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín (UNSAM), Buenos Aires, Argentina
|