1
Miotto M, Di Rienzo L, Bo’ L, Ruocco G, Milanetti E. Zepyros: a webserver to evaluate the shape complementarity of protein-protein interfaces. Bioinformatics Advances 2025; 5:vbaf051. [PMID: 40191547; PMCID: PMC11968322; DOI: 10.1093/bioadv/vbaf051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2024] [Revised: 02/26/2025] [Accepted: 03/06/2025] [Indexed: 04/09/2025]
Abstract
Motivation: Shape complementarity of molecular surfaces at the interfaces is a well-known characteristic of protein-protein binding regions, and it is critical in influencing the stability of the complex. Measuring such complementarity has a number of theoretical and practical implications; however, only a limited number of tools are currently available to assess it efficiently and rapidly. Results: Here, we introduce Zepyros (ZErnike Polynomials analYsis of pROtein Shapes), a webserver for fast measurement of the shape complementarity between two molecular interfaces of a given protein-protein complex using structural information. Zepyros is implemented as a publicly available tool with a user-friendly interface. Availability and implementation: Our server can be found at the following link (all major browsers supported): https://zepyros.bio-groups.com.
Affiliation(s)
- Mattia Miotto
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome 00161, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome 00161, Italy
- Leonardo Bo’
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome 00161, Italy
- Giancarlo Ruocco
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome 00161, Italy
  - Department of Physics, Sapienza University of Rome, Rome 00185, Italy
- Edoardo Milanetti
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome 00161, Italy
  - Department of Physics, Sapienza University of Rome, Rome 00185, Italy
2
Schulz S, Tan TJC, Wu NC, Wang S. Epistatic hotspots organize antibody fitness landscape and boost evolvability. Proc Natl Acad Sci U S A 2025; 122:e2413884122. [PMID: 39773024; PMCID: PMC11745389; DOI: 10.1073/pnas.2413884122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2024] [Accepted: 12/06/2024] [Indexed: 01/11/2025]
Abstract
The course of evolution is strongly shaped by interaction between mutations. Such epistasis can yield rugged sequence-function maps and constrain the availability of adaptive paths. While theoretical intuition is often built on global statistics of large, homogeneous model landscapes, mutagenesis measurements necessarily probe a limited neighborhood of a reference genotype. It is unclear to what extent local topography of a real epistatic landscape represents its global shape. Here, we demonstrate that epistatic landscapes can be heterogeneously rugged and this heterogeneity may render biomolecules more evolvable. By characterizing a multipeaked fitness landscape of a SARS-CoV-2 antibody mutant library, we show that heterogeneous ruggedness arises from sparse epistatic hotspots, whose mutation impacts the fitness effect of numerous sequence sites. Surprisingly, mutating an epistatic hotspot may enhance, rather than reduce, the accessibility of the fittest genotype, while increasing the overall ruggedness. Further, migratory constraints in real space alleviate mutational constraints in sequence space, which not only diversify direct paths taken but may also turn a road-blocking fitness peak into a stepping stone leading toward the global optimum. Our results suggest that a hierarchy of epistatic hotspots may organize the fitness landscape in such a way that path-orienting ruggedness confers global smoothness.
Affiliation(s)
- Steven Schulz
  - Department of Physics and Astronomy, University of California, Los Angeles, CA 90095
- Timothy J. C. Tan
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- Nicholas C. Wu
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- Shenshen Wang
  - Department of Physics and Astronomy, University of California, Los Angeles, CA 90095
3
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and Methods for Predicting Viral Evolution. Methods Mol Biol 2025; 2890:253-290. [PMID: 39890732; DOI: 10.1007/978-1-0716-4326-6_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2025]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein hemagglutinin targeted by human antibodies. Here, we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to 1 year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available at https://previr.app.
Affiliation(s)
- Matthijs Meijers
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Denis Ruchnewitz
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Jan Eberhardt
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Malancha Karmakar
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Marta Łuksza
  - Departments of Oncological Sciences and Genetics and Genomic Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
  - Institute for Biological Physics, University of Cologne, Köln, Germany
4
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. bioRxiv 2024:2024.03.19.585703. [PMID: 38746108; PMCID: PMC11092427; DOI: 10.1101/2024.03.19.585703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
Affiliation(s)
- Matthijs Meijers
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Denis Ruchnewitz
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Jan Eberhardt
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Malancha Karmakar
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Marta Łuksza
  - Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
5
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Luksza M, Lässig M. Concepts and methods for predicting viral evolution. arXiv 2024:arXiv:2403.12684v3. [PMID: 38745695; PMCID: PMC11092678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
Affiliation(s)
- Matthijs Meijers
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Denis Ruchnewitz
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Jan Eberhardt
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Malancha Karmakar
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Marta Luksza
  - Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
6
Abrusán G, Zelezniak A. Cellular location shapes quaternary structure of enzymes. Nat Commun 2024; 15:8505. [PMID: 39353940; PMCID: PMC11445431; DOI: 10.1038/s41467-024-52662-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Accepted: 09/18/2024] [Indexed: 10/03/2024]
Abstract
The main forces driving protein complex evolution are currently not well understood, especially in homomers, where quaternary structure might frequently evolve neutrally. Here we examine the factors determining oligomerisation by analysing the evolution of enzymes in circumstances where homomers rarely evolve. We show that (1) in extracellular environments, most enzymes with known structure are monomers, while in the cytoplasm most are homomers, indicating that the evolution of oligomers is cellular environment dependent; and (2) the evolution of quaternary structure within protein orthogroups is more consistent with the predictions of constructive neutral evolution than with an adaptive process: quaternary structure is gained more easily than it is lost, and most extracellular monomers evolved from proteins that were also monomers in their ancestral state, without the loss of interfaces. Our results indicate that oligomerisation is context-dependent, and even when adaptive, in many cases it is probably not driven by the intrinsic properties of enzymes, like their biochemical function, but rather by the properties of the environment where the enzyme is active. These factors might be macromolecular crowding and excluded volume effects facilitating the evolution of interfaces, and the maintenance of cellular homeostasis through shaping cytoplasm fluidity, protein degradation, or diffusion rates.
Affiliation(s)
- György Abrusán
  - Randall Centre for Cell and Molecular Biophysics, School of Basic and Medical Biosciences, King's College London, New Hunt's House, London, UK
- Aleksej Zelezniak
  - Randall Centre for Cell and Molecular Biophysics, School of Basic and Medical Biosciences, King's College London, New Hunt's House, London, UK
  - Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden
  - Institute of Biotechnology, Life Sciences Centre, Vilnius University, Vilnius, Lithuania
7
Barnes JE, América Chi L, Marty Ytreberg F, Patel JS. Leveraging neural networks to correct FoldX free energy estimates. bioRxiv 2024:2024.09.23.614615. [PMID: 39386633; PMCID: PMC11463479; DOI: 10.1101/2024.09.23.614615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Proteins play a pivotal role in many biological processes, and changes in their amino acid sequences can lead to dysfunction and disease. These changes can affect protein folding or interaction with other biomolecules, such as preventing antibodies from inhibiting a viral infection or causing proteins to misfold. The ability to predict the effects of mutations in proteins is crucial. Although experimental techniques can accurately quantify the effect of mutations on protein folding free energies and protein-protein binding free energies, they are often time-consuming and costly. By contrast, computational techniques offer fast and cost-effective alternatives for estimating free energies, but they typically suffer from lower accuracy. Enhancing the accuracy of computational predictions is therefore of high importance, with the potential to greatly impact fields ranging from drug design to understanding disease mechanisms. One such widely used computational method, FoldX, is capable of rapidly predicting the relative folding stability ($\Delta\Delta G_{\mathrm{fold}}$) for a protein as well as the relative binding affinity ($\Delta\Delta G_{\mathrm{bind}}$) between proteins using a single protein structure as input. However, it can suffer from low accuracy, especially for antibody-antigen systems. In this work, we trained a neural network on FoldX output to enhance its prediction accuracy. We first performed FoldX calculations on the largest datasets available for mutations that affect binding (SKEMPIv2) and folding (ProTherm4) with experimentally measured $\Delta\Delta G$. Features were then extracted from the FoldX output files, including its prediction for $\Delta\Delta G$. We then developed and optimized a neural network framework to predict the difference between FoldX's estimated $\Delta\Delta G$ and the experimental data, creating a model capable of producing a correction factor. Our approach showed significant improvements in Pearson correlation performance.
For single mutations affecting folding, the correlation improved from a baseline of 0.3 to 0.66. In terms of binding, performance increased from 0.37 to 0.61 for single mutations and from 0.52 to 0.81 for double mutations. For epistasis, the correlation for binding affinity (both singles and doubles) improved from 0.19 to 0.59. Our results also indicated that models trained on double mutations enhanced accuracy when predicting higher-order mutations (such as triple or quadruple mutations), whereas models trained on singles did not. This suggests that interaction energy and epistasis effects present in the FoldX output are not fully utilized by FoldX itself. Once trained, these models add minimal computational time but provide a substantial increase in performance, especially for higher-order mutations and epistasis. This makes them a valuable addition to any free energy prediction pipeline using FoldX. Furthermore, we believe this technique can be further optimized and tested for predicting antibody escape, aiding in the efficient development of watch lists.
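The correction scheme described in this abstract (learn the gap between FoldX's estimate and experiment, then add the predicted gap back) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: a one-feature least-squares fit stands in for their neural network, and the function names, the single `feature` input, and the toy numbers are all hypothetical.

```python
from statistics import mean


def fit_residual_model(features, foldx_ddg, experimental_ddg):
    """Fit residual = slope * feature + intercept by ordinary least squares.

    The residual is (experimental - FoldX). A linear fit is a stand-in for
    the neural network named in the abstract (assumption for illustration).
    """
    residuals = [e - f for e, f in zip(experimental_ddg, foldx_ddg)]
    mean_x, mean_r = mean(features), mean(residuals)
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(features, residuals))
    slope = sxr / sxx
    intercept = mean_r - slope * mean_x
    return slope, intercept


def corrected_ddg(feature, foldx_value, model):
    """Add the predicted residual (the correction) to the FoldX estimate."""
    slope, intercept = model
    return foldx_value + slope * feature + intercept


# Toy data (hypothetical): experiment = FoldX + 2 * feature + 1.
features = [0.0, 1.0, 2.0, 3.0]
foldx = [1.0, 0.5, 2.0, -1.0]
experiment = [2.0, 3.5, 7.0, 6.0]

model = fit_residual_model(features, foldx, experiment)
print(model)
print(corrected_ddg(2.0, 2.0, model))
```

Predicting the residual rather than the free energy itself keeps FoldX's own estimate as the baseline, so the learned model only has to capture what FoldX systematically misses.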
Affiliation(s)
- Jonathan E Barnes
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, USA
- L América Chi
  - Department of Chemical and Biological Engineering, University of Idaho, Moscow, ID, USA
- F Marty Ytreberg
  - Department of Physics, University of Idaho, Moscow, ID, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, USA
- Jagdish Suresh Patel
  - Department of Chemical and Biological Engineering, University of Idaho, Moscow, ID, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, USA
8
Kim I, Dubrow A, Zuniga B, Zhao B, Sherer N, Bastiray A, Li P, Cho JH. Energy landscape reshaped by strain-specific mutations underlies epistasis in NS1 evolution of influenza A virus. Nat Commun 2022; 13:5775. [PMID: 36182933; PMCID: PMC9526705; DOI: 10.1038/s41467-022-33554-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/06/2022] [Accepted: 09/22/2022] [Indexed: 11/24/2022]
Abstract
Elucidating how individual mutations affect the protein energy landscape is crucial for understanding how proteins evolve. However, predicting mutational effects remains challenging because of epistasis, the nonadditive interactions between mutations. Here, we investigate the biophysical mechanism of strain-specific epistasis in the nonstructural protein 1 (NS1) of influenza A viruses (IAVs). We integrate structural, kinetic, thermodynamic, and conformational dynamics analyses of four NS1s of influenza strains that emerged between 1918 and 2004. Although functionally near-neutral, strain-specific NS1 mutations exhibit long-range epistatic interactions with residues at the p85β-binding interface. We reveal that strain-specific mutations reshaped the NS1 energy landscape during evolution. Using NMR spin dynamics, we find that the strain-specific mutations altered the conformational dynamics of the hidden network of tightly packed residues, underlying the evolution of long-range epistasis. This work shows how near-neutral mutations silently alter the biophysical energy landscapes, resulting in diverse background effects during molecular evolution.
Influenza A virus (IAV) nonstructural protein 1 (NS1) is a multifunctional virulence factor that interacts with several host factors such as phosphatidylinositol-3-kinase (PI3K). NS1 binds specifically to the p85β regulatory subunit of PI3K and subsequently activates PI3K signaling. Here, Kim et al. show that functionally near-neutral, strain-specific NS1 mutations lead to variations in binding kinetics to p85β and exhibit long-range epistatic interactions. Applying NMR, they provide evidence that the structural dynamics of the NS1 hydrophobic core have evolved over time and contributed to epistasis.
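The notion of epistasis as a nonadditive interaction can be made concrete with a small sketch. Under an additive null model, the effect of a double mutant equals the sum of the two single-mutant effects, and any deviation is the pairwise epistasis. The function name and the numbers below are hypothetical, and the quantity the authors actually measure (for example binding free energies or kinetic parameters) may be defined differently.

```python
def pairwise_epistasis(effect_a, effect_b, effect_ab):
    """Deviation of the double-mutant effect from additivity.

    All three effects are measured relative to the same reference
    background (for example as free-energy changes). Zero means the
    two mutations combine additively.
    """
    return effect_ab - (effect_a + effect_b)


# Hypothetical example: single mutants shift the free energy by 1.0 and 0.5,
# while the double mutant shifts it by 2.0 (same units throughout).
print(pairwise_epistasis(1.0, 0.5, 2.0))  # 0.5
```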
Affiliation(s)
- Iktae Kim
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Alyssa Dubrow
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Bryan Zuniga
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Baoyu Zhao
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Noah Sherer
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Abhishek Bastiray
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Pingwei Li
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Jae-Hyun Cho
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
9
Desantis F, Miotto M, Di Rienzo L, Milanetti E, Ruocco G. Spatial organization of hydrophobic and charged residues affects protein thermal stability and binding affinity. Sci Rep 2022; 12:12087. [PMID: 35840609; PMCID: PMC9287411; DOI: 10.1038/s41598-022-16338-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 04/16/2022] [Accepted: 07/08/2022] [Indexed: 11/12/2022]
Abstract
What the molecular determinants of protein–protein binding affinity are, and whether they are similar to those regulating fold stability, are two major questions of molecular biology whose answers have important implications from both a theoretical and an applied point of view. Here, we analyze chemical and physical features on a large dataset of protein–protein complexes with reliable experimental binding affinity data and compare them with a set of monomeric proteins for which melting temperature data were available. In particular, we probed the spatial organization of protein (1) intramolecular and intermolecular interaction energies among residues, (2) amino acid composition, and (3) hydropathy features. Analyzing the interaction energies, we found that strong Coulombic interactions are preferentially associated with a high protein thermal stability, while strong intermolecular van der Waals energies correlate with stronger protein–protein binding affinity. Statistical analysis of the abundances of amino acids exposed on the molecular surface and/or interacting with the molecular partner confirmed that hydrophobic residues present on the protein surfaces are preferentially located in the binding regions, while charged residues behave oppositely. Leveraging the important role of van der Waals interface interactions in binding affinity, we focused on the molecular surfaces in the binding regions and evaluated their shape complementarity, decomposing the molecular patches in the 2D Zernike basis. For the first time, we quantified the correlation between local shape complementarity and binding affinity via the Zernike formalism. In addition, considering the solvent interactions via the residue hydropathy, we found that the hydrophobicity of the binding regions dictates their shape complementarity as much as the correlation between van der Waals energy and binding affinity. In turn, these relationships pave the way to the fast and accurate prediction and design of optimal binding regions, as the 2D Zernike formalism allows a rapid and superposition-free comparison between possible binding surfaces.
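For readers unfamiliar with the 2D Zernike basis mentioned in this abstract, the sketch below evaluates the standard radial part of a Zernike polynomial on the unit disc. It covers only this textbook building block. How the authors project surface patches onto the disc, normalize the expansion coefficients, or compare the resulting descriptors is not specified here, and the function name is hypothetical.

```python
from math import factorial


def zernike_radial(n, m, r):
    """Standard radial Zernike polynomial R_n^m(r) for 0 <= r <= 1.

    Requires |m| <= n and n - |m| even, as in the usual definition.
    """
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for k in range((n - m) // 2 + 1):
        numerator = (-1) ** k * factorial(n - k)
        denominator = (
            factorial(k)
            * factorial((n + m) // 2 - k)
            * factorial((n - m) // 2 - k)
        )
        total += numerator / denominator * r ** (n - 2 * k)
    return total


# R_2^0(r) = 2 r^2 - 1, so the value at r = 0.5 is -0.5.
print(zernike_radial(2, 0, 0.5))
```

The full basis function multiplies this radial part by an angular factor in the polar angle, and the moduli of the expansion coefficients do not change when the patch is rotated about the disc axis, which is the usual reason such descriptors can be compared without superposition.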
Affiliation(s)
- Fausta Desantis
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
  - The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Via Morego, 30, 16163, Genoa, Italy
- Mattia Miotto
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
- Edoardo Milanetti
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185, Rome, Italy
- Giancarlo Ruocco
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185, Rome, Italy
10
Jayaraman V, Toledo‐Patiño S, Noda‐García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715; PMCID: PMC9214755; DOI: 10.1002/pro.4362] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Affiliation(s)
- Vijay Jayaraman
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Saacnicteh Toledo‐Patiño
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Lianet Noda‐García
  - Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Paola Laurino
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
11
Rodriguez Gama A, Miller T, Lange JJ, Unruh JR, Halfmann R. A nucleation barrier spring-loads the CBM signalosome for binary activation. eLife 2022; 11:79826. [PMID: 35727133; PMCID: PMC9342958; DOI: 10.7554/elife.79826] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/28/2022] [Accepted: 06/20/2022] [Indexed: 11/26/2022]
Abstract
Immune cells activate in binary, switch-like fashion via large protein assemblies known as signalosomes, but the molecular mechanism of the switch is not yet understood. Here, we employed an in-cell biophysical approach to dissect the assembly mechanism of the CARD-BCL10-MALT1 (CBM) signalosome, which governs nuclear transcription factor-κB activation in both innate and adaptive immunity. We found that the switch consists of a sequence-encoded and deeply conserved nucleation barrier to ordered polymerization by the adaptor protein BCL10. The particular structure of the BCL10 polymers did not matter for activity. Using optogenetic tools and single-cell transcriptional reporters, we discovered that endogenous BCL10 is functionally supersaturated even in unstimulated human cells, and this results in a predetermined response to stimulation upon nucleation by activated CARD multimers. Our findings may inform on the progressive nature of age-associated inflammation, and suggest that signalosome structure has evolved via selection for kinetic rather than equilibrium properties of the proteins.
The innate immune system is the body’s first line of defence against pathogens. Although innate immune cells do not recognize specific disease-causing agents, they can detect extremely low levels of harmful organisms or substances. In response, they activate signals that lead to inflammation, which tells other cells that there is an infection. Innate immune cells are turned on in a switch-like fashion, becoming active very quickly after interacting with a pathogen. This is due to the action of signalosomes, large complexes made up of several proteins that clump together to form long chains that activate the cell. But how do these large protein complexes assemble quickly enough to create the switch-like activation observed in innate immune cells? To answer this question, Rodríguez Gama et al. focused on the CBM signalosome, which is involved in triggering inflammation through the activation of a protein called NF-κB. First, Rodríguez Gama et al. used genetic tools to determine that activating the CBM signalosome drives a switch-like activation of NF-κB in cells. This means that individual cells in a population either become fully activated or not at all in response to minute amounts of harmful substances. Once they had established this, Rodríguez Gama et al. wanted to know which protein in the CBM signalosome was responsible for the switch. They found that one of the proteins in the signalosome, called BCL10, has a ‘nucleation barrier’ encoded in its sequence. This means that it is very hard for BCL10 to start clumping together, but once it does, the clumps grow on their own. The nucleation barrier describes exactly how hard it is for these clumps to get started, and is determined by how disorganized the protein is. When a pathogen ‘stimulates’ an immune cell, a tiny template is formed that lowers the nucleation barrier so that BCL10 can then aggregate, leading to the switch-like behaviour observed. The nucleation barrier allows there to be more than enough BCL10 present in the cell at all times, ready to clump together at a moment’s notice, and this permits the cell to detect very low levels of a pathogen. Rodríguez Gama et al. then tested whether BCL10 from other animals also has a nucleation barrier. They found that this feature is conserved from cnidarians, such as corals or jellyfish, to mammals, including humans. This suggests that the use of nucleation barriers to regulate innate immune signalling has existed for a long time throughout evolution. The work by Rodríguez Gama et al. broadens our understanding of how the innate immune system senses and responds to extremely low levels of pathogens. That BCL10 is always ready to clump together suggests it may be a driving force for chronic and age-associated inflammation. Additionally, the findings of Rodríguez Gama et al. offer insights into how other signalosomes may become activated, and offer the possibility of new drugs aimed at modifying nucleation barriers.
Affiliation(s)
| | - Tayla Miller
- Stowers Institute for Medical Research, Kansas City, United States
| | - Jeffrey J Lange
- Stowers Institute for Medical Research, Kansas City, United States
| | - Jay R Unruh
- Stowers Institute for Medical Research, Kansas City, United States
| | - Randal Halfmann
- Stowers Institute for Medical Research, Kansas City, United States
|
12
|
Tareen A, Kooshkbaghi M, Posfai A, Ireland WT, McCandlish DM, Kinney JB. MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. Genome Biol 2022; 23:98. [PMID: 35428271 PMCID: PMC9011994 DOI: 10.1186/s13059-022-02661-7] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 07/16/2021] [Revised: 03/21/2022] [Accepted: 03/24/2022] [Indexed: 12/17/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general strategy for inferring quantitative models of genotype-phenotype maps from MAVE data is lacking. Here we introduce MAVE-NN, a neural-network-based Python package that implements a broadly applicable information-theoretic framework for learning genotype-phenotype maps, including biophysically interpretable models, from MAVE datasets. We demonstrate MAVE-NN in multiple biological contexts, and highlight the ability of our approach to deconvolve mutational effects from otherwise confounding experimental nonlinearities and noise.
Affiliation(s)
- Ammar Tareen
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
- Present Address: Regeneron Pharmaceuticals, Inc., Tarrytown, 10591, NY, USA
| | - Mahdi Kooshkbaghi
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
| | - Anna Posfai
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
| | - William T Ireland
- Department of Physics, California Institute of Technology, Pasadena, 91125, CA, USA
- Present Address: Department of Applied Physics, Harvard University, Cambridge, 02134, MA, USA
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
| | - Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA.
|
13
|
Schulz L, Sendker FL, Hochberg GKA. Non-adaptive complexity and biochemical function. Curr Opin Struct Biol 2022; 73:102339. [PMID: 35247750 DOI: 10.1016/j.sbi.2022.102339] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/11/2021] [Revised: 12/06/2021] [Accepted: 01/24/2022] [Indexed: 11/25/2022]
Abstract
Intricate biochemical structures are usually thought to be useful, because natural selection preserves them from degradation by a constant hail of destructive mutations. Biochemists therefore often deliberately disrupt them to understand how complexity improves protein function or fitness. However, evolutionary theory suggests that even useless complexity that never improved fitness can become completely essential if a simple set of evolutionary conditions is fulfilled. We review evidence that stable protein complexes, protein-chaperone interactions, and complexes consisting of several paralogs all fulfill these conditions. This makes reverse genetics or destructive mutagenesis unsuitable for assigning functions to these kinds of complexity. Instead, we advocate that incorporating evolutionary approaches into biochemistry overcomes this difficulty and allows us to distinguish useless from useful biochemical complexity.
Affiliation(s)
- Luca Schulz
- Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch Straße 10, 35043 Marburg, Germany. https://twitter.com/schulluc
| | - Franziska L Sendker
- Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch Straße 10, 35043 Marburg, Germany. https://twitter.com/SendkerFL
| | - Georg K A Hochberg
- Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch Straße 10, 35043 Marburg, Germany; Department of Chemistry, University of Marburg, Hans-Meerwein-Straße 4, 35032 Marburg, Germany; Center for Synthetic Microbiology (SYNMIKRO), Hans-Meerwein-Straße 6, 35032 Marburg, Germany.
|
14
|
Sangesland M, Lingwood D. Public Immunity: Evolutionary Spandrels for Pathway-Amplifying Protective Antibodies. Front Immunol 2021; 12:708882. [PMID: 34956170 PMCID: PMC8696009 DOI: 10.3389/fimmu.2021.708882] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/12/2021] [Accepted: 11/23/2021] [Indexed: 12/14/2022] Open
Abstract
Humoral immunity is seeded by affinity between the B cell receptor (BCR) and cognate antigen. While the BCR is a chimeric display of diverse antigen engagement solutions, we discuss its functional activity as an ‘innate-like’ immune receptor, wherein genetically hardwired antigen complementarity can serve as reproducible templates for pathway-amplifying otherwise immunologically recessive antibody responses. We propose that the capacity for germline reactivity to new antigen emerged as a set of evolutionary spandrels or coupled traits, which can now be exploited by rational vaccine design to focus humoral immunity upon conventionally immune-subdominant antibody targets. Accordingly, we suggest that evolutionary spandrels account for the necessary but unanticipated antigen reactivity of the germline antibody repertoire.
Affiliation(s)
- Maya Sangesland
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology, and Harvard University, Cambridge, MA, United States
| | - Daniel Lingwood
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology, and Harvard University, Cambridge, MA, United States
|
15
|
Latrille T, Lartillot N. Quantifying the impact of changes in effective population size and expression level on the rate of coding sequence evolution. Theor Popul Biol 2021; 142:57-66. [PMID: 34563555 DOI: 10.1016/j.tpb.2021.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Revised: 09/08/2021] [Accepted: 09/11/2021] [Indexed: 02/07/2023]
Abstract
Molecular sequences are shaped by selection, where the strength of selection relative to drift is determined by effective population size (Ne). Populations with high Ne are expected to undergo stronger purifying selection, and consequently to show a lower substitution rate for selected mutations relative to the substitution rate for neutral mutations (ω). However, computational models based on biophysics of protein stability have suggested that ω can also be independent of Ne. Together, the response of ω to changes in Ne depends on the specific mapping from sequence to fitness. Importantly, an increase in protein expression level has been found empirically to result in a decrease of ω, an observation predicted by theoretical models assuming selection for protein stability. Here, we derive a theoretical approximation for the response of ω to changes in Ne and expression level, under an explicit genotype-phenotype-fitness map. The method is generally valid for additive traits and log-concave fitness functions. We applied these results to proteins undergoing selection for their conformational stability and corroborate our findings with simulations under more complex models. We predict a weak response of ω to changes in either Ne or expression level, which are interchangeable. Based on empirical data, we propose that fitness based on the conformational stability may not be a sufficient mechanism to explain the empirically observed variation in ω across species. Other aspects of protein biophysics might be explored, such as protein-protein interactions, which can lead to a stronger response of ω to changes in Ne.
Affiliation(s)
- T Latrille
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622 Villeurbanne, France; École Normale Supérieure de Lyon, Université de Lyon, Université Lyon 1, Lyon, France.
| | - N Lartillot
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622 Villeurbanne, France
|
16
|
Sharpe DJ, Wales DJ. Numerical analysis of first-passage processes in finite Markov chains exhibiting metastability. Phys Rev E 2021; 104:015301. [PMID: 34412280 DOI: 10.1103/physreve.104.015301] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/09/2021] [Accepted: 05/29/2021] [Indexed: 12/19/2022]
Abstract
We describe state-reduction algorithms for the analysis of first-passage processes in discrete- and continuous-time finite Markov chains. We present a formulation of the graph transformation algorithm that allows for the evaluation of exact mean first-passage times, stationary probabilities, and committor probabilities for all nonabsorbing nodes of a Markov chain in a single computation. Calculation of the committor probabilities within the state-reduction formalism is readily generalizable to the first hitting problem for any number of alternative target states. We then show that a state-reduction algorithm can be formulated to compute the expected number of times that each node is visited along a first-passage path. Hence, all properties required to analyze the first-passage path ensemble (FPPE) at both a microscopic and macroscopic level of detail, including the mean and variance of the first-passage time distribution, can be computed using state-reduction methods. In particular, we derive expressions for the probability that a node is visited along a direct transition path, which proceeds without returning to the initial state, considering both the nonequilibrium and equilibrium (steady-state) FPPEs. The reactive visitation probability provides a rigorous metric to quantify the dynamical importance of a node for the productive transition between two endpoint states and thus allows the local states that facilitate the dominant transition mechanisms to be readily identified. The state-reduction procedures remain numerically stable even for Markov chains exhibiting metastability, which can be severely ill-conditioned. The rare event regime is frequently encountered in realistic models of dynamical processes, and our methodology therefore provides valuable tools for the analysis of Markov chains in practical applications. We illustrate our approach with numerical results for a kinetic network representing a structural transition in an atomic cluster.
Affiliation(s)
- Daniel J Sharpe
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - David J Wales
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
17
|
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models or synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
- Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain.
| | - José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
| | - Jacobo Aguirre
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
| | - Sebastian E Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
| | | | - Alejandro V Cano
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
| | - Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
| | - Santiago F Elena
- Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
| | | | - Paulien Hogeweg
- Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
| | - Bhavin S Khatri
- The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Köln, Germany
| | - Ard A Louis
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
| | - Nora S Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
| | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | - Marcel Weiß
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
|
18
|
Hochberg GKA, Liu Y, Marklund EG, Metzger BPH, Laganowsky A, Thornton JW. A hydrophobic ratchet entrenches molecular complexes. Nature 2020; 588:503-508. [PMID: 33299178 PMCID: PMC8168016 DOI: 10.1038/s41586-020-3021-2] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 11/28/2019] [Accepted: 10/20/2020] [Indexed: 02/07/2023]
Abstract
Most proteins assemble into multisubunit complexes [1]. The persistence of these complexes across evolutionary time is usually explained as the result of natural selection for functional properties that depend on multimerization, such as intersubunit allostery or the capacity to do mechanical work [2]. In many complexes, however, multimerization does not enable any known function [3]. An alternative explanation is that multimers could become entrenched if substitutions accumulate that are neutral in multimers but deleterious in monomers; purifying selection would then prevent reversion to the unassembled form, even if assembly per se does not enhance biological function [3-7]. Here we show that a hydrophobic mutational ratchet systematically entrenches molecular complexes. By applying ancestral protein reconstruction and biochemical assays to the evolution of steroid hormone receptors, we show that an ancient hydrophobic interface, conserved for hundreds of millions of years, is entrenched because exposure of this interface to solvent reduces protein stability and causes aggregation, even though the interface makes no detectable contribution to function. Using structural bioinformatics, we show that a universal mutational propensity drives sites that are buried in multimeric interfaces to accumulate hydrophobic substitutions to levels that are not tolerated in monomers. In a database of hundreds of families of multimers, most show signatures of long-term hydrophobic entrenchment. It is therefore likely that many protein complexes persist because a simple ratchet-like mechanism entrenches them across evolutionary time, even when they are functionally gratuitous.
Affiliation(s)
- Georg K A Hochberg
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
| | - Yang Liu
- Department of Chemistry, Texas A&M University, College Station, TX, USA
| | - Erik G Marklund
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
| | - Brian P H Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
| | - Arthur Laganowsky
- Department of Chemistry, Texas A&M University, College Station, TX, USA
| | - Joseph W Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA.
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.
|
19
|
Swinburne TD, Kannan D, Sharpe DJ, Wales DJ. Rare events and first passage time statistics from the energy landscape. J Chem Phys 2020; 153:134115. [PMID: 33032418 DOI: 10.1063/5.0016244] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/31/2022] Open
Abstract
We analyze the probability distribution of rare first passage times corresponding to transitions between product and reactant states in a kinetic transition network. The mean first passage times and the corresponding rate constants are analyzed in detail for two model landscapes and the double funnel landscape corresponding to an atomic cluster. Evaluation schemes based on eigendecomposition and kinetic path sampling, which both allow access to the first passage time distribution, are benchmarked against mean first passage times calculated using graph transformation. Numerical precision issues severely limit the useful temperature range for eigendecomposition, but kinetic path sampling is capable of extending the first passage time analysis to lower temperatures, where the kinetics of interest constitute rare events. We then investigate the influence of free energy based state regrouping schemes for the underlying network. Alternative formulations of the effective transition rates for a given regrouping are compared in detail to determine their numerical stability and capability to reproduce the true kinetics, including recent coarse-graining approaches that preserve occupancy cross correlation functions. We find that appropriate regrouping of states under the simplest local equilibrium approximation can provide reduced transition networks with useful accuracy at somewhat lower temperatures. Finally, a method is provided to systematically interpolate between the local equilibrium approximation and exact intergroup dynamics. Spectral analysis is applied to each grouping of states, employing a moment-based mode selection criterion to produce a reduced state space, which does not require any spectral gap to exist, but reduces to gap-based coarse graining as a special case. Implementations of the developed methods are freely available online.
Affiliation(s)
- Thomas D Swinburne
- Aix-Marseille Université, CNRS, CINaM UMR 7325, Campus de Luminy, 13288 Marseille, France
| | - Deepti Kannan
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Daniel J Sharpe
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - David J Wales
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
20
|
Tian P, Best RB. Exploring the sequence fitness landscape of a bridge between protein folds. PLoS Comput Biol 2020; 16:e1008285. [PMID: 33048928 PMCID: PMC7553338 DOI: 10.1371/journal.pcbi.1008285] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/22/2020] [Accepted: 08/24/2020] [Indexed: 12/15/2022] Open
Abstract
Most foldable protein sequences adopt only a single native fold. Recent protein design studies have, however, created protein sequences which fold into different structures upon changes of environment, or single point mutation, the best characterized example being the switch between the folds of the GA and GB binding domains of streptococcal protein G. To obtain further insight into the design of sequences which can switch folds, we have used a computational model for the fitness landscape of a single fold, built from the observed sequence variation of protein homologues. We have recently shown that such coevolutionary models can be used to design novel foldable sequences. By appropriately combining two of these models to describe the joint fitness landscape of GA and GB, we are able to describe the propensity of a given sequence for each of the two folds. We have successfully tested the combined model against the known series of designed GA/GB hybrids. Using Monte Carlo simulations on this landscape, we are able to identify pathways of mutations connecting the two folds. In the absence of a requirement for domain stability, the most frequent paths go via sequences in which neither domain is stably folded, reminiscent of the propensity for certain intrinsically disordered proteins to fold into different structures according to context. Even if the folded state is required to be stable, we find that there is nonetheless still a wide range of sequences which are close to the transition region and therefore likely fold switches, consistent with recent estimates that fold switching may be more widespread than had been thought.
Affiliation(s)
- Pengfei Tian
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, U.S.A
| | - Robert B. Best
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, U.S.A
|
21
|
Ballal A, Laurendon C, Salmon M, Vardakou M, Cheema J, Defernez M, O'Maille PE, Morozov AV. Sparse Epistatic Patterns in the Evolution of Terpene Synthases. Mol Biol Evol 2020; 37:1907-1924. [PMID: 32119077 DOI: 10.1093/molbev/msaa052] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/22/2022] Open
Abstract
We explore sequence determinants of enzyme activity and specificity in a major enzyme family of terpene synthases. Most enzymes in this family catalyze reactions that produce cyclic terpenes, complex hydrocarbons widely used by plants and insects in diverse biological processes such as defense, communication, and symbiosis. To analyze the molecular mechanisms of emergence of terpene cyclization, we have carried out in-depth examination of mutational space around (E)-β-farnesene synthase, an Artemisia annua enzyme which catalyzes production of a linear hydrocarbon chain. Each mutant enzyme in our synthetic libraries was characterized biochemically, and the resulting reaction rate data were used as input to the Michaelis-Menten model of enzyme kinetics, in which free energies were represented as sums of one-amino-acid contributions and two-amino-acid couplings. Our model predicts measured reaction rates with high accuracy and yields free energy landscapes characterized by relatively few coupling terms. As a result, the Michaelis-Menten free energy landscapes have simple, interpretable structure and exhibit little epistasis. We have also developed biophysical fitness models based on the assumption that highly fit enzymes have evolved to maximize the output of correct products, such as cyclic products or a specific product of interest, while minimizing the output of byproducts. This approach results in nonlinear fitness landscapes that are considerably more epistatic. Overall, our experimental and computational framework provides focused characterization of evolutionary emergence of novel enzymatic functions in the context of microevolutionary exploration of sequence space around naturally occurring enzymes.
Affiliation(s)
- Aditya Ballal
- Department of Physics & Astronomy and Center for Quantitative Biology, Rutgers University, Piscataway, NJ
| | - Caroline Laurendon
- John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom
| | - Melissa Salmon
- John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom; Earlham Institute, Norwich Research Park, Norwich, United Kingdom
| | - Maria Vardakou
- John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom; School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
| | - Jitender Cheema
- John Innes Centre, Department of Computational and Systems Biology, Norwich Research Park, Norwich, United Kingdom
| | - Marianne Defernez
- Core Science Resources, Quadram Institute, Norwich Research Park, Norwich, United Kingdom
| | - Paul E O'Maille
- John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom; SRI International, Menlo Park, CA
| | - Alexandre V Morozov
- Department of Physics & Astronomy and Center for Quantitative Biology, Rutgers University, Piscataway, NJ
|
22
|
Rivoire O. Parsimonious evolutionary scenario for the origin of allostery and coevolution patterns in proteins. Phys Rev E 2020; 100:032411. [PMID: 31640027 DOI: 10.1103/physreve.100.032411] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 12/10/2018] [Indexed: 12/16/2022]
Abstract
Proteins display generic properties that are challenging to explain by direct selection, notably allostery, the capacity to be regulated through long-range effects, and evolvability, the capacity to adapt to new selective pressures. An evolutionary scenario is proposed where proteins acquire these two features indirectly as a by-product of their selection for a more fundamental property, exquisite discrimination, the capacity to bind discriminatively very similar ligands. Achieving this task is shown to typically require proteins to undergo a conformational change. We argue that physical and evolutionary constraints impel this change to be controlled by a group of sites extending from the binding site. Proteins can thus acquire a latent potential for allosteric regulation and evolutionary adaptation because of long-range effects that initially arise as evolutionary spandrels. This scenario accounts for the groups of conserved and coevolving residues observed in multiple sequence alignments. However, we propose that most pairs of coevolving and contacting residues inferred from such alignments have a different origin, related to thermal stability. A physical model is presented that illustrates this evolutionary scenario and its implications. The scenario can be implemented in experiments of protein evolution to directly test its predictions.
Affiliation(s)
- Olivier Rivoire
- Center for Interdisciplinary Research in Biology, Collège de France, Centre National de la Recherche Scientifique, INSERM, PSL Research University, 75005 Paris, France
|
23
|
Zabel WJ, Hagner KP, Livesey BJ, Marsh JA, Setayeshgar S, Lynch M, Higgs PG. Evolution of protein interfaces in multimers and fibrils. J Chem Phys 2019; 150:225102. [PMID: 31202237 PMCID: PMC6561775 DOI: 10.1063/1.5086042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
A majority of cellular proteins function as part of multimeric complexes of two or more subunits. Multimer formation requires interactions between protein surfaces that lead to closed structures, such as dimers and tetramers. If proteins interact in an open-ended way, uncontrolled growth of fibrils can occur, which is likely to be detrimental in most cases. We present a statistical physics model that allows aggregation of proteins as either closed dimers or open fibrils of all lengths. We use pairwise amino-acid contact energies to calculate the energies of interacting protein surfaces. The probabilities of all possible aggregate configurations can be calculated for any given sequence of surface amino acids. We link the statistical physics model to a population genetics model that describes the evolution of the surface residues. When proteins evolve neutrally, without selection for or against multimer formation, we find that a majority of proteins remain as monomers at moderate concentrations, but strong dimer-forming or fibril-forming sequences are also possible. If selection is applied in favor of dimers or in favor of fibrils, then it is easy to select either dimer-forming or fibril-forming sequences. It is also possible to select for oriented fibrils with protein subunits all aligned in the same direction. We measure the propensities of amino acids to occur at interfaces relative to noninteracting surfaces and show that the propensities in our model are strongly correlated with those that have been measured in real protein structures. We also show that there are significant differences between amino acid frequencies at isologous and heterologous interfaces in our model, and we observe that similar effects occur in real protein structures.
Affiliation(s)
- W Jeffrey Zabel
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada
| | - Kyle P Hagner
- Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
| | - Benjamin J Livesey
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom
| | - Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom
| | - Sima Setayeshgar
- Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
| | - Michael Lynch
- Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona 85287, USA
| | - Paul G Higgs
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada
| |
|
24
|
Held T, Klemmer D, Lässig M. Survival of the simplest in microbial evolution. Nat Commun 2019; 10:2472. [PMID: 31171781 PMCID: PMC6554311 DOI: 10.1038/s41467-019-10413-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/02/2018] [Accepted: 05/10/2019] [Indexed: 01/09/2023]
Abstract
The evolution of microbial and viral organisms often generates clonal interference, a mode of competition between genetic clades within a population. Here we show how interference impacts systems biology by constraining genetic and phenotypic complexity. Our analysis uses biophysically grounded evolutionary models for molecular phenotypes, such as fold stability and enzymatic activity of genes. We find a generic mode of phenotypic interference that couples the function of individual genes and the population’s global evolutionary dynamics. Biological implications of phenotypic interference include rapid collateral system degradation in adaptation experiments and long-term selection against genome complexity: each additional gene carries a cost proportional to the total number of genes. Recombination above a threshold rate can eliminate this cost, which establishes a universal, biophysically grounded scenario for the evolution of sex. In a broader context, our analysis suggests that the systems biology of microbes is strongly intertwined with their mode of evolution. In asexual populations selection at different genomic loci can interfere with each other. Here, using a biophysical model of molecular evolution the authors show that interference results in long-term degradation of molecular function, an effect that strongly depends on genome size.
Affiliation(s)
- Torsten Held
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
| | - Daniel Klemmer
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
| | - Michael Lässig
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
| |
|
25
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| |
|
26
|
Gauthier L, Di Franco R, Serohijos AWR. SodaPop: a forward simulation suite for the evolutionary dynamics of asexual populations on protein fitness landscapes. Bioinformatics 2019; 35:4053-4062. [DOI: 10.1093/bioinformatics/btz175] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/28/2018] [Revised: 01/21/2019] [Accepted: 03/12/2019] [Indexed: 11/14/2022]
Abstract
Motivation
Protein evolution is determined by forces at multiple levels of biological organization. Random mutations have an immediate effect on the biophysical properties, structure and function of proteins. These same mutations also affect the fitness of the organism. However, the evolutionary fate of mutations, whether they succeed to fixation or are purged, also depends on population size and dynamics. There is an emerging interest, both theoretically and experimentally, to integrate these two factors in protein evolution. Although there are several tools available for simulating protein evolution, most of them focus on either the biophysical or the population-level determinants, but not both. Hence, there is a need for a publicly available computational tool to explore both the effects of protein biophysics and population dynamics on protein evolution.
Results
To address this need, we developed SodaPop, a computational suite to simulate protein evolution in the context of the population dynamics of asexual populations. SodaPop accepts as input several fitness landscapes based on protein biochemistry or other user-defined fitness functions. The user can also provide as input experimental fitness landscapes derived from deep mutational scanning approaches or theoretical landscapes derived from physical force field estimates. Here, we demonstrate the broad utility of SodaPop with different applications describing the interplay of selection for protein properties and population dynamics. SodaPop is designed such that population geneticists can explore the influence of protein biochemistry on patterns of genetic variation, and that biochemists and biophysicists can explore the role of population size and demography on protein evolution.
Availability and implementation
Source code and binaries are freely available at https://github.com/louisgt/SodaPop under the GNU GPLv3 license. The software is implemented in C++ and supported on Linux, Mac OS/X and Windows.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Louis Gauthier
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
- Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, QC, Canada
| | - Rémicia Di Franco
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
- Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, QC, Canada
- Enseirb-Matmeca, Bordeaux Institute of Technology, Talence, France
| | - Adrian W R Serohijos
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
- Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, QC, Canada
| |
|
27
|
Yan Z, Wang J. Superfunneled Energy Landscape of Protein Evolution Unifies the Principles of Protein Evolution, Folding, and Design. PHYSICAL REVIEW LETTERS 2019; 122:018103. [PMID: 31012725 DOI: 10.1103/physrevlett.122.018103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/06/2017] [Revised: 11/08/2018] [Indexed: 06/09/2023]
Abstract
Evolution is essential for shaping biological functions. Darwin proposed selection acting upon mutations as the driving force of evolution. While mutations are well understood, quantifying the selection force remains challenging. In this study, we identified and quantified both thermodynamic stability and kinetic accessibility as the selection forces for protein evolution. Protein evolution can be viewed and quantified as a trajectory moving along a superfunneled energy landscape with a line attractor at the bottom. The resulting evolved sequences and structures show strong protein characteristics, including a hydrophobic core, high designability, and fast folding. The evolution principle uncovered here is validated on real proteins and sheds light on protein design.
Affiliation(s)
- Zhiqiang Yan
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
| | - Jin Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York 11790, USA
| |
|
28
|
Data-driven supervised learning of a viral protease specificity landscape from deep sequencing and molecular simulations. Proc Natl Acad Sci U S A 2018; 116:168-176. [PMID: 30587591 DOI: 10.1073/pnas.1805256116] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 12/28/2022]
Abstract
Biophysical interactions between proteins and peptides are key determinants of molecular recognition specificity landscapes. However, an understanding of how molecular structure and residue-level energetics at protein-peptide interfaces shape these landscapes remains elusive. We combine information from yeast-based library screening, next-generation sequencing, and structure-based modeling in a supervised machine learning approach to report the comprehensive sequence-energetics-function mapping of the specificity landscape of the hepatitis C virus (HCV) NS3/4A protease, whose function, site-specific cleavages of the viral polyprotein, is a key determinant of viral fitness. We screened a library of substrates in which five residue positions were randomized and measured cleavability of ∼30,000 substrates (∼1% of the library) using yeast display and fluorescence-activated cell sorting followed by deep sequencing. Structure-based models of a subset of experimentally derived sequences were used in a supervised learning procedure to train a support vector machine to predict the cleavability of 3.2 million substrate variants by the HCV protease. The resulting landscape allows identification of previously unidentified HCV protease substrates, and graph-theoretic analyses reveal extensive clustering of cleavable and uncleavable motifs in sequence space. Specificity landscapes of known drug-resistant variants are similarly clustered. The described approach should enable the elucidation and redesign of specificity landscapes of a wide variety of proteases, including human-origin enzymes. Our results also suggest a possible role for residue-level energetics in shaping plateau-like functional landscapes predicted from viral quasispecies theory.
|
29
|
Abstract
Many proteins assemble into homomultimeric structures, with a number of subunits that can vary substantially among phylogenetic lineages. As protein-protein interactions require productive encounters among subunits, such variation might partially be explained by variation in cellular protein abundance. Protein abundance in turn depends on the intrinsic rates of production and decay of mRNA and protein molecules, as well as rates of cell growth and division. Using a stochastic framework for prediction of the multimeric state of a protein as a function of these processes and the free energy associated with interface-interface binding, we demonstrate agreement with a wide class of proteins using E. coli proteome data. As such, this platform, which links protein quaternary structure with biochemical rates governing gene expression, protein association and dissociation, and cell growth and division, can be extended to evolutionary models for the emergence and diversification of multimers. While it is tempting to think of multimerization as adaptive, the diversity of multimeric states raises the question of its functional role and impact on fitness. As a force driving selection, we consider the possible increase in enzymatic activity of proteins arising strictly as a consequence of interface-interface binding, namely, enhanced stability to degradation, substrate binding affinity, or catalytic rate of multimers with respect to monomers, without invoking further conformational changes, as in allostery. For fixed cost of protein production, we find a benefit conferred by multimers that is dependent on context and can therefore become different in diverging lineages.
Affiliation(s)
- Kyle Hagner
- Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
| | - Sima Setayeshgar
- Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
| | - Michael Lynch
- Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona 85287, USA
| |
|
30
|
Otwinowski J. Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function. Mol Biol Evol 2018; 35:2345-2354. [PMID: 30085303 PMCID: PMC6188545 DOI: 10.1093/molbev/msy141] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Indexed: 12/14/2022]
Abstract
Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. The essential function of many proteins that fold into a specific structure is their ability to bind to a ligand, which can be assayed for thousands of mutated variants. However, binding assays do not distinguish whether mutations affect the stability of the binding interface or the overall fold. Here, we introduce a statistical method to infer a detailed energy landscape of how a protein folds and binds to a ligand by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high quality data of protein G domain B1 binding to IgG-Fc. We infer distinct folding and binding energies for each mutation, providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer the folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due to its nonlinear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.
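The point that additive energies can still yield an epistatic binding fraction can be illustrated with a minimal sketch. It assumes a generic three-state Boltzmann model (unfolded, folded-unbound, folded-bound) with energies in units of kT; the function name, the choice of the unfolded state as reference, and all numerical values are illustrative assumptions, not the parameterization or data of the cited paper.

```python
import math

def bound_fraction(g_fold, g_bind):
    """Fraction of protein in the bound state (assumed three-state model).

    Energies are in units of kT. The unfolded state is the reference (energy 0),
    the folded-unbound state has energy g_fold, and the folded-bound state has
    energy g_fold + g_bind.
    """
    w_unbound = math.exp(-g_fold)
    w_bound = math.exp(-(g_fold + g_bind))
    return w_bound / (1.0 + w_unbound + w_bound)

# Hypothetical wild-type energies and additive mutational effects (ddG_fold, ddG_bind).
wild_type = (-2.0, -3.0)
mut_a = (1.5, 0.0)
mut_b = (1.0, 0.5)

def mutate(base, *mutations):
    """Add mutational energy changes to the base energies (strictly additive)."""
    g_fold = base[0] + sum(m[0] for m in mutations)
    g_bind = base[1] + sum(m[1] for m in mutations)
    return g_fold, g_bind

p_wt = bound_fraction(*wild_type)
p_a = bound_fraction(*mutate(wild_type, mut_a))
p_b = bound_fraction(*mutate(wild_type, mut_b))
p_ab = bound_fraction(*mutate(wild_type, mut_a, mut_b))

# Pairwise epistasis measured on the binding-fraction scale.
epistasis = p_ab - p_a - p_b + p_wt
print(f"p_wt={p_wt:.3f} p_a={p_a:.3f} p_b={p_b:.3f} p_ab={p_ab:.3f}")
print(f"epistasis in binding fraction: {epistasis:.3f}")
```

Although the energies combine additively by construction, the printed epistasis is nonzero because the binding fraction is a nonlinear function of the energies.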
Affiliation(s)
- Jakub Otwinowski
- Biology Department, University of Pennsylvania, Philadelphia, PA
| |
|
31
|
Abstract
Genotype-phenotype relationships are notoriously complicated. Idiosyncratic interactions between specific combinations of mutations occur and are difficult to predict. Yet it is increasingly clear that many interactions can be understood in terms of global epistasis. That is, mutations may act additively on some underlying, unobserved trait, and this trait is then transformed via a nonlinear function to the observed phenotype as a result of subsequent biophysical and cellular processes. Here we infer the shape of such global epistasis in three proteins, based on published high-throughput mutagenesis data. To do so, we develop a maximum-likelihood inference procedure using a flexible family of monotonic nonlinear functions spanned by an I-spline basis. Our analysis uncovers dramatic nonlinearities in all three proteins; in some proteins a model with global epistasis accounts for virtually all of the measured variation, whereas in others we find substantial local epistasis as well. This method allows us to test hypotheses about the form of global epistasis and to distinguish variance components attributable to global epistasis, local epistasis, and measurement error.
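The idea of global epistasis described above can be sketched in a few lines. The example assumes a logistic function as the monotonic nonlinearity, which is a stand-in chosen for simplicity and not the I-spline family used in the abstract; the latent effects and names are hypothetical.

```python
import math

def g(phi):
    """Assumed monotonic nonlinearity mapping the latent trait to the phenotype."""
    return 1.0 / (1.0 + math.exp(-phi))

def g_inv(y):
    """Inverse of g (logit), recovering the latent trait from a phenotype in (0, 1)."""
    return math.log(y / (1.0 - y))

# Hypothetical latent trait of the wild type and additive effects of two mutations.
phi_wt = 2.0
effect_a = -1.5
effect_b = -1.0

# Observed phenotypes of wild type, single mutants, and the double mutant.
y_wt = g(phi_wt)
y_a = g(phi_wt + effect_a)
y_b = g(phi_wt + effect_b)
y_ab = g(phi_wt + effect_a + effect_b)

# Prediction assuming additivity directly on the observed scale.
naive_prediction = y_a + y_b - y_wt

# Prediction assuming additivity on the latent scale, then mapping through g.
latent_prediction = g(g_inv(y_a) + g_inv(y_b) - g_inv(y_wt))

print(f"observed double mutant: {y_ab:.4f}")
print(f"additive on observed scale: {naive_prediction:.4f}")
print(f"additive on latent scale: {latent_prediction:.4f}")
```

In this toy setting the latent-scale prediction reproduces the double mutant, while the observed-scale prediction does not. In practice the nonlinearity is unknown and must be inferred from data, which is the problem the abstract addresses.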
|
32
|
Causes and evolutionary consequences of primordial germ-cell specification mode in metazoans. Proc Natl Acad Sci U S A 2018; 114:5784-5791. [PMID: 28584112 DOI: 10.1073/pnas.1610600114] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Indexed: 01/13/2023]
Abstract
In animals, primordial germ cells (PGCs) give rise to the germ lines, the cell lineages that produce sperm and eggs. PGCs form in embryogenesis, typically by one of two modes: a likely ancestral mode wherein germ cells are induced during embryogenesis by cell-cell signaling (induction) or a derived mechanism whereby germ cells are specified by using germ plasm, that is, maternally specified germ-line determinants (inheritance). The causes of the shift to germ plasm for PGC specification in some animal clades remain largely unknown, but its repeated convergent evolution raises the question of whether it may result from or confer an innate selective advantage. It has been hypothesized that the acquisition of germ plasm confers enhanced evolvability, resulting from the release of selective constraint on somatic gene networks in embryogenesis, thus leading to acceleration of an organism's protein-sequence evolution, particularly for genes expressed at early developmental stages, and resulting in high speciation rates in germ plasm-containing lineages (denoted herein as the "PGC-specification hypothesis"). Although that hypothesis, if supported, could have major implications for animal evolution, our recent large-scale coding-sequence analyses from vertebrates and invertebrates provided important examples of genera that do not support the hypothesis of liberated constraint under germ plasm. Here, we consider reasons why germ plasm might be neither a direct target of selection nor causally linked to accelerated animal evolution. We explore alternate scenarios that could explain the repeated evolution of germ plasm and propose potential consequences of the inheritance and induction modes to animal evolutionary biology.
|
33
|
Dandage R, Pandey R, Jayaraj G, Rai M, Berger D, Chakraborty K. Differential strengths of molecular determinants guide environment specific mutational fates. PLoS Genet 2018; 14:e1007419. [PMID: 29813059 PMCID: PMC5993328 DOI: 10.1371/journal.pgen.1007419] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/11/2018] [Revised: 06/08/2018] [Accepted: 05/16/2018] [Indexed: 01/14/2023]
Abstract
Organisms maintain competitive fitness in the face of environmental challenges through molecular evolution. However, it remains largely unknown how different biophysical factors constrain molecular evolution in a given environment. Here, using deep mutational scanning, we quantified empirical fitness of >2000 single site mutants of the Gentamicin-resistant gene (GmR) in Escherichia coli, in a representative set of physical (non-native temperatures) and chemical (small molecule supplements) environments. From this, we could infer how different biophysical parameters of the mutations constrain molecular function in different environments. We find ligand binding and protein stability to be the best predictors of mutants' fitness, but their relative predictive power differs across environments. While protein folding emerges as the strongest predictor at minimal antibiotic concentration, ligand binding becomes a stronger predictor of mutant fitness at higher concentration. Remarkably, strengths of environment-specific selection pressures were largely predictable from the degree of mutational perturbation of protein folding and ligand binding. By identifying structural constraints that act as determinants of fitness, our study thus provides coarse mechanistic insights into the environment-specific accessibility of mutational fates.
Affiliation(s)
- Rohan Dandage
- CSIR- Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
| | - Rajesh Pandey
- CSIR Ayurgenomics Unit—TRISUTRA, CSIR- Institute of Genomics and Integrative Biology, New Delhi, India
| | - Gopal Jayaraj
- CSIR- Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
| | - Manish Rai
- CSIR- Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
| | - David Berger
- Department of Ecology and Genetics, Animal Ecology, Evolutionary Biology Centre at Uppsala University, Uppsala, Sweden
| | - Kausik Chakraborty
- CSIR- Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
| |
|
34
|
Hane FT, Lee BY, Leonenko Z. Recent Progress in Alzheimer's Disease Research, Part 1: Pathology. J Alzheimers Dis 2018; 57:1-28. [PMID: 28222507 DOI: 10.3233/jad-160882] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Indexed: 02/07/2023]
Abstract
The field of Alzheimer's disease (AD) research has grown exponentially over the past few decades, especially since the isolation and identification of amyloid-β from postmortem examination of the brains of AD patients. Recently, the Journal of Alzheimer's Disease (JAD) put forth approximately 300 research reports which were deemed to be the most influential research reports in the field of AD since 2010. JAD readers were asked to vote on these most influential reports. In this 3-part review, we review the results of the 300 most influential AD research reports to provide JAD readers with a readily accessible, yet comprehensive review of the state of contemporary research. Notably, this multi-part review identifies the "hottest" fields of AD research, providing guidance for both senior investigators and investigators new to the field on which fields within AD research are the most pressing. Part 1 of this review covers pathogenesis, both on a molecular and macro scale. Part 2 reviews genetics and epidemiology, and Part 3 covers diagnosis and treatment. This part of the review, pathology, reviews amyloid-β, tau, prions, brain structure, and functional changes with AD and the neuroimmune response of AD.
Affiliation(s)
- Francis T Hane
- Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Department of Chemistry, Lakehead University, Thunder Bay, ON, Canada
| | - Brenda Y Lee
- Department of Biology, University of Waterloo, Waterloo, ON, Canada
| | - Zoya Leonenko
- Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Department of Physics and Astronomy, University of Waterloo, Waterloo, ON, Canada
| |
|
35
|
François P, Hemery M, Johnson KA, Saunders LN. Phenotypic spandrel: absolute discrimination and ligand antagonism. Phys Biol 2016; 13:066011. [PMID: 27922826 DOI: 10.1088/1478-3975/13/6/066011] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
We consider the general problem of sensitive and specific discrimination between biochemical species. An important instance is immune discrimination between self and not-self, where it is also observed experimentally that ligands just below the discrimination threshold negatively impact response, a phenomenon called antagonism. We characterize mathematically the generic properties of such discrimination, first relating it to biochemical adaptation. Then, based on basic biochemical rules, we establish that, surprisingly, antagonism is a generic consequence of any strictly specific discrimination made independently from ligand concentration. Thus antagonism constitutes a 'phenotypic spandrel': a phenotype existing as a necessary by-product of another phenotype. We exhibit a simple analytic model of discrimination displaying antagonism, where antagonism strength is linear in distance from the detection threshold. This contrasts with traditional proofreading based models where antagonism vanishes far from threshold and thus displays an inverted hierarchy of antagonism compared to simpler models. The phenotypic spandrel studied here is expected to structure many decision pathways such as immune detection mediated by TCRs and FCϵRIs, as well as endocrine signalling/disruption.
|
36
|
Bershtein S, Serohijos AW, Shakhnovich EI. Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. Curr Opin Struct Biol 2016; 42:31-40. [PMID: 27810574 DOI: 10.1016/j.sbi.2016.10.013] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 08/25/2016] [Accepted: 10/14/2016] [Indexed: 01/11/2023]
Abstract
Bridging the gap between the molecular properties of proteins and organismal/population fitness is essential for understanding evolutionary processes. This task requires the integration of the several physical scales of biological organization, each defined by a distinct set of mechanisms and constraints, into a single unifying model. The molecular scale is dominated by the constraints imposed by the physico-chemical properties of proteins and their substrates, which give rise to trade-offs and epistatic (non-additive) effects of mutations. At the systems scale, biological networks modulate protein expression and can either buffer or enhance the fitness effects of mutations. The population scale is influenced by the mutational input, selection regimes, and stochastic changes affecting the size and structure of populations, which eventually determine the evolutionary fate of mutations. Here, we summarize the recent advances in theory, computer simulations, and experiments that advance our understanding of the links between various physical scales in biology.
Affiliation(s)
- Shimon Bershtein
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84501, Israel
| | - Adrian W R Serohijos
- Département de Biochimie, Centre Robert-Cedergren en Bioinformatique & Génomique, Université de Montréal, Montréal, QC H3T 1J4, Canada
| | - Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, United States
| |
|
37
|
Manhart M, Kion-Crosby W, Morozov AV. Path statistics, memory, and coarse-graining of continuous-time random walks on networks. J Chem Phys 2016; 143:214106. [PMID: 26646868 DOI: 10.1063/1.4935968] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 12/31/2022]
Abstract
Continuous-time random walks (CTRWs) on discrete state spaces, ranging from regular lattices to complex networks, are ubiquitous across physics, chemistry, and biology. Models with coarse-grained states (for example, those employed in studies of molecular kinetics) or spatial disorder can give rise to memory and non-exponential distributions of waiting times and first-passage statistics. However, existing methods for analyzing CTRWs on complex energy landscapes do not address these effects. Here we use statistical mechanics of the nonequilibrium path ensemble to characterize first-passage CTRWs on networks with arbitrary connectivity, energy landscape, and waiting time distributions. Our approach can be applied to calculating higher moments (beyond the mean) of path length, time, and action, as well as statistics of any conservative or non-conservative force along a path. For homogeneous networks, we derive exact relations between length and time moments, quantifying the validity of approximating a continuous-time process with its discrete-time projection. For more general models, we obtain recursion relations, reminiscent of transfer matrix and exact enumeration techniques, to efficiently calculate path statistics numerically. We have implemented our algorithm in PathMAN (Path Matrix Algorithm for Networks), a Python script that users can apply to their model of choice. We demonstrate the algorithm on a few representative examples which underscore the importance of non-exponential distributions, memory, and coarse-graining in CTRWs.
Affiliation(s)
- Michael Manhart
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey 08854, USA
| | - Willow Kion-Crosby
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey 08854, USA
| | - Alexandre V Morozov
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey 08854, USA
| |
|
38
|
Chéron N, Serohijos AWR, Choi JM, Shakhnovich EI. Evolutionary dynamics of viral escape under antibodies stress: A biophysical model. Protein Sci 2016; 25:1332-40. [PMID: 26939576 PMCID: PMC4918420 DOI: 10.1002/pro.2915] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/18/2015] [Revised: 02/23/2016] [Accepted: 03/02/2016] [Indexed: 12/12/2022]
Abstract
Viruses constantly face the selection pressure of antibodies, either from the innate immune response of the host or from antibodies administered for treatment. We explore the interplay between the biophysical properties of viral proteins and the population and demographic variables in viral escape. The demographic and population genetics aspects of viral escape have been explored before; however, one important assumption was the a priori distribution of fitness effects (DFE). Here, we relax this assumption by instead considering a realistic biophysics-based genotype-phenotype relationship for RNA viruses escaping antibodies stress. In this model the DFE is itself an evolvable property that depends on the genetic background (epistasis) and the distribution of biophysical effects of mutations, which is informed by biochemical experiments and theoretical calculations in protein engineering. We quantitatively explore in silico the viability of viral populations under antibodies pressure and derive the phase diagram that defines the fate of the virus population (extinction or escape from stress) in a range of viral mutation rates and antibodies concentrations. We find that viruses are most resistant to stress at an optimal mutation rate (OMR) determined by the competition between the supply of beneficial mutations that facilitate escape from stressors and lethal mutagenesis caused by an excess of destabilizing mutations. We then show the quantitative dependence of the OMR on genome length and viral burst size. We also recapitulate the experimental observation that viruses with longer genomes have a smaller mutation rate per nucleotide.
Affiliation(s)
- Nicolas Chéron
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, 02138
  - Département de Biochimie et Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, Quebec, Canada, H3T 1J4
- Adrian W R Serohijos
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, 02138
- Jeong-Mo Choi
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, 02138
- Eugene I Shakhnovich
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, 02138
|
39
|
Springer SA, Manhart M, Morozov AV. Separating Spandrels from Phenotypic Targets of Selection in Adaptive Molecular Evolution. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
40
|
Hunter P. Nature's origami: Understanding folding helps to analyze the self-structuring of molecules, organs and surfaces. EMBO Rep 2015; 16:1435-8. [PMID: 26474903 DOI: 10.15252/embr.201541390] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
|
41
|
Manhart M, Morozov AV. Scaling properties of evolutionary paths in a biophysical model of protein adaptation. Phys Biol 2015; 12:045001. [PMID: 26020812 DOI: 10.1088/1478-3975/12/4/045001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
The enormous size and complexity of genotypic sequence space frequently require consideration of coarse-grained sequences in empirical models. We develop scaling relations to quantify the effect of this coarse-graining on properties of fitness landscapes and evolutionary paths. We first consider evolution on a simple Mount Fuji fitness landscape, focusing on how the length and predictability of evolutionary paths scale with the coarse-grained sequence length and alphabet. We obtain simple scaling relations for both the weak- and strong-selection limits, with a non-trivial crossover regime at intermediate selection strengths. We apply these results to evolution on a biophysical fitness landscape that describes how proteins evolve new binding interactions while maintaining their folding stability. We combine the scaling relations with numerical calculations for coarse-grained protein sequences to obtain quantitative properties of the model for realistic binding interfaces and a full amino acid alphabet.
Affiliation(s)
- Michael Manhart
  - Department of Physics and Astronomy, Rutgers University, Piscataway, NJ 08854, USA
|