1
Grah R, Guet CC, Tkačik G, Lagator M. Linking molecular mechanisms to their evolutionary consequences: a primer. Genetics 2025; 229:iyae191. [PMID: 39601269 PMCID: PMC11796464 DOI: 10.1093/genetics/iyae191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Accepted: 11/13/2024] [Indexed: 11/29/2024]
Abstract
A major obstacle to predictive understanding of evolution stems from the complexity of biological systems, which prevents detailed characterization of key evolutionary properties. Here, we highlight some of the major sources of complexity that arise when relating molecular mechanisms to their evolutionary consequences and ask whether accounting for every mechanistic detail is important to accurately predict evolutionary outcomes. To do this, we developed a mechanistic model of a bacterial promoter regulated by 2 proteins, allowing us to connect any promoter genotype to 6 phenotypes that capture the dynamics of gene expression following an environmental switch. Accounting for the mechanisms that govern how this system works enabled us to provide an in-depth picture of how regulated bacterial promoters might evolve. More importantly, we used the model to explore which factors that contribute to the complexity of this system are essential for understanding its evolution, and which can be simplified without information loss. We found that several key evolutionary properties (the distribution of phenotypic and fitness effects of mutations, the evolutionary trajectories during selection for regulation) can be accurately captured without accounting for all, or even most, parameters of the system. Our findings point to the need for a mechanistic approach to studying evolution, as it enables tackling biological complexity and in doing so improves the ability to predict evolutionary outcomes.
Affiliation(s)
- Rok Grah
- Institute of Science and Technology Austria, Klosterneuburg AT-3400, Austria
- Calin C Guet
- Institute of Science and Technology Austria, Klosterneuburg AT-3400, Austria
- Gasper Tkačik
- Institute of Science and Technology Austria, Klosterneuburg AT-3400, Austria
- Mato Lagator
- Division of Evolution, Infection and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9PL, UK
2
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and Methods for Predicting Viral Evolution. Methods Mol Biol 2025; 2890:253-290. [PMID: 39890732 DOI: 10.1007/978-1-0716-4326-6_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2025]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein hemagglutinin targeted by human antibodies. Here, we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to 1 year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available at https://previr.app .
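As a rough illustration of how relative fitness estimates can be turned into clade frequency predictions, the sketch below propagates frequencies with the standard exponential-growth relation x_i(t) ∝ x_i(0)·exp(f_i·t), normalized over clades. This is a textbook population-genetics relation, not necessarily the authors' pipeline; the function name `predict_frequencies`, the clade labels, and all numbers are hypothetical.

```python
import math

def predict_frequencies(freqs, fitness, t):
    """Propagate clade frequencies forward by time t.

    Assumption: each clade grows exponentially at its (constant) relative
    fitness, and frequencies are renormalized to sum to one.
    """
    weights = {clade: x * math.exp(fitness[clade] * t) for clade, x in freqs.items()}
    total = sum(weights.values())
    return {clade: w / total for clade, w in weights.items()}

# Hypothetical starting frequencies and relative fitness values (per year).
freqs = {"clade_A": 0.8, "clade_B": 0.2}
fitness = {"clade_A": 0.0, "clade_B": 1.5}

# Predicted frequencies one year ahead; the fitter clade_B gains frequency.
predicted = predict_frequencies(freqs, fitness, t=1.0)
print(predicted)
```

In practice the fitness values would themselves be estimated from sequence, epidemiological, and antigenic data, and would not stay constant over the prediction period; the sketch only shows the final propagation step under that simplification.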
Affiliation(s)
- Matthijs Meijers
- Institute for Biological Physics, University of Cologne, Köln, Germany
- Denis Ruchnewitz
- Institute for Biological Physics, University of Cologne, Köln, Germany
- Jan Eberhardt
- Institute for Biological Physics, University of Cologne, Köln, Germany
- Malancha Karmakar
- Institute for Biological Physics, University of Cologne, Köln, Germany
- Marta Łuksza
- Departments of Oncological Sciences and Genetics and Genomic Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
- Institute for Biological Physics, University of Cologne, Köln, Germany
3
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. bioRxiv 2024:2024.03.19.585703. [PMID: 38746108 PMCID: PMC11092427 DOI: 10.1101/2024.03.19.585703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
Affiliation(s)
- Matthijs Meijers
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Denis Ruchnewitz
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Jan Eberhardt
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Malancha Karmakar
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Marta Łuksza
- Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
4
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. arXiv 2024:arXiv:2403.12684v3. [PMID: 38745695 PMCID: PMC11092678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
Affiliation(s)
- Matthijs Meijers
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Denis Ruchnewitz
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Jan Eberhardt
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Malancha Karmakar
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Marta Łuksza
- Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
- Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
5
Collesano L, Łuksza M, Lässig M. Energy landscapes of peptide-MHC binding. PLoS Comput Biol 2024; 20:e1012380. [PMID: 39226310 PMCID: PMC11398667 DOI: 10.1371/journal.pcbi.1012380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 09/13/2024] [Accepted: 07/31/2024] [Indexed: 09/05/2024]
Abstract
Molecules of the Major Histocompatibility Complex (MHC) present short protein fragments on the cell surface, an important step in T cell immune recognition. MHC-I molecules process peptides from intracellular proteins; MHC-II molecules act in antigen-presenting cells and present peptides derived from extracellular proteins. Here we show that the sequence-dependent energy landscapes of MHC-peptide binding encode class-specific nonlinearities (epistasis). MHC-I has a smooth landscape with global epistasis; the binding energy is a simple deformation of an underlying linear trait. This form of epistasis enhances the discrimination between strong-binding peptides. In contrast, MHC-II has a rugged landscape with idiosyncratic epistasis: binding depends on detailed amino acid combinations at multiple positions of the peptide sequence. The form of epistasis affects the learning of energy landscapes from training data. For MHC-I, a low-complexity problem, we derive a simple matrix model of binding energies that outperforms current models trained by machine learning. For MHC-II, higher complexity prevents learning by simple regression methods. Epistasis also affects the energy and fitness effects of mutations in antigen-derived peptides (epitopes). In MHC-I, large-effect mutations occur predominantly in anchor positions of strong-binding epitopes. In MHC-II, large effects depend on the background epitope sequence but are broadly distributed over the epitope, generating a bigger target for escape mutations due to loss of presentation. Together, our analysis shows how an energy landscape of protein-protein binding constrains the target of escape mutations from T cell immunity, linking the complexity of the molecular interactions to the dynamics of adaptive immune response.
Affiliation(s)
- Laura Collesano
- Institute for Biological Physics, University of Cologne, Cologne, Germany
- Marta Łuksza
- Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Michael Lässig
- Institute for Biological Physics, University of Cologne, Cologne, Germany
6
Selvakumar P, Siddharthan R. Position-specific evolution in transcription factor binding sites, and a fast likelihood calculation for the F81 model. Royal Society Open Science 2024; 11:231088. [PMID: 38269075 PMCID: PMC10805598 DOI: 10.1098/rsos.231088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 12/20/2023] [Indexed: 01/26/2024]
Abstract
Transcription factor binding sites (TFBS), like other DNA sequence, evolve via mutation and selection relating to their function. Models of nucleotide evolution describe DNA evolution via single-nucleotide mutation. A stationary vector of such a model is the long-term distribution of nucleotides, unchanging under the model. Neutrally evolving sites may have uniform stationary vectors, but one expects that sites within a TFBS instead have stationary vectors reflective of the fitness of various nucleotides at those positions. We introduce 'position-specific stationary vectors' (PSSVs), the collection of stationary vectors at each site in a TFBS locus, analogous to the position weight matrix (PWM) commonly used to describe TFBS. We infer PSSVs for human TFs using two evolutionary models (Felsenstein 1981 and Hasegawa-Kishino-Yano 1985). We find that PSSVs reflect the nucleotide distribution from PWMs, but with reduced specificity. We infer ancestral nucleotide distributions at individual positions and calculate 'conditional PSSVs' conditioned on specific choices of majority ancestral nucleotide. We find that certain ancestral nucleotides exert a strong evolutionary pressure on neighbouring sequence while others have a negligible effect. Finally, we present a fast likelihood calculation for the F81 model on moderate-sized trees that makes this approach feasible for large-scale studies along these lines.
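To make the notion of a stationary vector concrete, here is a minimal sketch of the F81 model in its standard closed form, P_ij(t) = e^(-μt)·δ_ij + (1 - e^(-μt))·π_j, where π is the stationary vector. It checks that π is unchanged by the model and that any starting distribution relaxes to π. The function names, the example vector, and the rate scaling `mu` are illustrative assumptions; this is not the authors' fast likelihood calculation.

```python
import math

def f81_transition(pi, t, mu=1.0):
    """F81 transition matrix P(t) for stationary vector pi.

    P[i][j] is the probability that nucleotide i has become j after time t.
    mu is an overall rate scaling (an assumption of this sketch).
    """
    decay = math.exp(-mu * t)
    n = len(pi)
    return [[decay * (1.0 if i == j else 0.0) + (1.0 - decay) * pi[j]
             for j in range(n)]
            for i in range(n)]

def propagate(dist, P):
    """Evolve a row distribution over nucleotides by one application of P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical stationary vector for one TFBS position (order A, C, G, T).
pi = [0.7, 0.1, 0.1, 0.1]

P = f81_transition(pi, t=0.5)
after = propagate(pi, P)  # equals pi: the vector is unchanging under the model

# A site that starts as pure 'C' approaches pi after a long time.
long_run = propagate([0.0, 1.0, 0.0, 0.0], f81_transition(pi, t=50.0))
```

A position-specific stationary vector collection would then simply hold one such `pi` per site in the binding-site locus.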
Affiliation(s)
- Pavitra Selvakumar
- The Institute of Mathematical Sciences, Chennai, India
- Homi Bhabha National Institute, Mumbai, India
- Rahul Siddharthan
- The Institute of Mathematical Sciences, Chennai, India
- Homi Bhabha National Institute, Mumbai, India
7
Liu X, Chen M, Qu X, Liu W, Dou Y, Liu Q, Shi D, Jiang M, Li H. Cis-Regulatory Elements in Mammals. Int J Mol Sci 2023; 25:343. [PMID: 38203513 PMCID: PMC10779164 DOI: 10.3390/ijms25010343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/29/2023] [Revised: 12/21/2023] [Accepted: 12/23/2023] [Indexed: 01/12/2024]
Abstract
In cis-regulatory elements, enhancers and promoters with complex molecular interactions are used to coordinate gene transcription through physical proximity and chemical modifications. These processes subsequently influence the phenotypic characteristics of an organism. An in-depth exploration of enhancers and promoters can substantially enhance our understanding of gene regulatory networks, shedding new light on mammalian development, evolution and disease pathways. In this review, we provide a comprehensive overview of the intrinsic structural attributes, detection methodologies as well as the operational mechanisms of enhancers and promoters, coupled with the relevant novel and innovative investigative techniques used to explore their actions. We further elucidated the state-of-the-art research on the roles of enhancers and promoters in the realms of mammalian development, evolution and disease, and we conclude with forward-looking insights into prospective research avenues.
Affiliation(s)
- Mingsheng Jiang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Laboratory of Animal Breeding, Disease Control and Prevention, College of Animal Science and Technology, Guangxi University, Nanning 530005, China
- Hui Li
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Laboratory of Animal Breeding, Disease Control and Prevention, College of Animal Science and Technology, Guangxi University, Nanning 530005, China
8
Srivastava M, Payne JL. On the incongruence of genotype-phenotype and fitness landscapes. PLoS Comput Biol 2022; 18:e1010524. [PMID: 36121840 PMCID: PMC9521842 DOI: 10.1371/journal.pcbi.1010524] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/05/2022] [Revised: 09/29/2022] [Accepted: 08/30/2022] [Indexed: 11/22/2022]
Abstract
The mapping from genotype to phenotype to fitness typically involves multiple nonlinearities that can transform the effects of mutations. For example, mutations may contribute additively to a phenotype, but their effects on fitness may combine non-additively because selection favors a low or intermediate value of that phenotype. This can cause incongruence between the topographical properties of a fitness landscape and its underlying genotype-phenotype landscape. Yet, genotype-phenotype landscapes are often used as a proxy for fitness landscapes to study the dynamics and predictability of evolution. Here, we use theoretical models and empirical data on transcription factor-DNA interactions to systematically study the incongruence of genotype-phenotype and fitness landscapes when selection favors a low or intermediate phenotypic value. Using the theoretical models, we prove a number of fundamental results. For example, selection for low or intermediate phenotypic values does not change simple sign epistasis into reciprocal sign epistasis, implying that genotype-phenotype landscapes with only simple sign epistasis motifs will always give rise to single-peaked fitness landscapes under such selection. More broadly, we show that such selection tends to create fitness landscapes that are more rugged than the underlying genotype-phenotype landscape, but this increased ruggedness typically does not frustrate adaptive evolution because the local adaptive peaks in the fitness landscape tend to be nearly as tall as the global peak. Many of these results carry forward to the empirical genotype-phenotype landscapes, which may help to explain why low- and intermediate-affinity transcription factor-DNA interactions are so prevalent in eukaryotic gene regulation.
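The incongruence described here can be seen in a two-locus toy case: mutations add up exactly on the phenotype, yet combine non-additively on fitness once selection favors an intermediate phenotypic value. The sketch below assumes a Gaussian fitness function around the optimum; that functional form, the effect sizes, and all names are illustrative assumptions rather than the authors' models.

```python
import math

def phenotype(genotype, effects=(1.0, 1.0), baseline=0.0):
    """Additive genotype-phenotype map: each mutated locus adds its effect."""
    return baseline + sum(e * g for e, g in zip(effects, genotype))

def fitness(p, optimum=1.0, width=1.0):
    """Assumed Gaussian selection favoring an intermediate phenotype."""
    return math.exp(-((p - optimum) ** 2) / (2.0 * width ** 2))

def epistasis(values):
    """Pairwise epistasis: deviation of the double mutant from additivity."""
    return values[(1, 1)] - values[(1, 0)] - values[(0, 1)] + values[(0, 0)]

genotypes = [(0, 0), (0, 1), (1, 0), (1, 1)]
phenotypes = {g: phenotype(g) for g in genotypes}
fitnesses = {g: fitness(p) for g, p in phenotypes.items()}

phenotype_epistasis = epistasis(phenotypes)  # zero: the phenotype is additive
fitness_epistasis = epistasis(fitnesses)     # negative: fitness is not additive
print(phenotype_epistasis, fitness_epistasis)
```

With these particular numbers both single mutants sit at the optimum while the wild type and the double mutant fall on either side of it, so the fitness landscape has two peaks even though the phenotype landscape has none of that structure.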
Affiliation(s)
- Malvika Srivastava
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Joshua L. Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
9
Krieger G, Lupo O, Wittkopp P, Barkai N. Evolution of transcription factor binding through sequence variations and turnover of binding sites. Genome Res 2022; 32:1099-1111. [PMID: 35618416 PMCID: PMC9248875 DOI: 10.1101/gr.276715.122] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/28/2022] [Accepted: 05/20/2022] [Indexed: 01/08/2023]
Abstract
Variations in noncoding regulatory sequences play a central role in evolution. Interpreting such variations, however, remains difficult even in the context of defined attributes such as transcription factor (TF) binding sites. Here, we systematically link variations in cis-regulatory sequences to TF binding by profiling the allele-specific binding of 27 TFs expressed in a yeast hybrid, in which two related genomes are present within the same nucleus. TFs localize preferentially to sites containing their known consensus motifs but occupy only a small fraction of the motif-containing sites available within the genomes. Differential binding of TFs to the orthologous alleles was well explained by variations that alter motif sequence, whereas differences in chromatin accessibility between alleles were of little apparent effect. Motif variations that abolished binding when present in only one allele were still bound when present in both alleles, suggesting evolutionary compensation, with a potential role for sequence conservation at the motif's vicinity. At the level of the full promoter, we identify cases of binding-site turnover, in which binding sites are reciprocally gained and lost, yet most interspecific differences remained uncompensated. Our results show the flexibility of TFs to bind imprecise motifs and the fast evolution of TF binding sites between related species.
Affiliation(s)
- Gat Krieger
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Offir Lupo
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Patricia Wittkopp
- Department of Ecology and Evolutionary Biology, Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
- Naama Barkai
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
10
The evolution, evolvability and engineering of gene regulatory DNA. Nature 2022; 603:455-463. [PMID: 35264797 DOI: 10.1038/s41586-022-04506-6] [Citation(s) in RCA: 124] [Impact Index Per Article: 41.3] [Received: 02/08/2021] [Accepted: 02/02/2022] [Indexed: 11/08/2022]
Abstract
Mutations in non-coding regulatory DNA sequences can alter gene expression, organismal phenotype and fitness [1-3]. Constructing complete fitness landscapes, in which DNA sequences are mapped to fitness, is a long-standing goal in biology, but has remained elusive because it is challenging to generalize reliably to vast sequence spaces [4-6]. Here we build sequence-to-expression models that capture fitness landscapes and use them to decipher principles of regulatory evolution. Using millions of randomly sampled promoter DNA sequences and their measured expression levels in the yeast Saccharomyces cerevisiae, we learn deep neural network models that generalize with excellent prediction performance, and enable sequence design for expression engineering. Using our models, we study expression divergence under genetic drift and strong-selection weak-mutation regimes to find that regulatory evolution is rapid and subject to diminishing returns epistasis; that conflicting expression objectives in different environments constrain expression adaptation; and that stabilizing selection on gene expression leads to the moderation of regulatory complexity. We present an approach for using such models to detect signatures of selection on expression from natural variation in regulatory sequences and use it to discover an instance of convergent regulatory evolution. We assess mutational robustness, finding that regulatory mutation effect sizes follow a power law, characterize regulatory evolvability, visualize promoter fitness landscapes, discover evolvability archetypes and illustrate the mutational robustness of natural regulatory sequence populations. Our work provides a general framework for designing regulatory sequences and addressing fundamental questions in regulatory evolution.
11
Lagator M, Sarikas S, Steinrueck M, Toledo-Aparicio D, Bollback JP, Guet CC, Tkačik G. Predicting bacterial promoter function and evolution from random sequences. eLife 2022; 11:64543. [PMID: 35080492 PMCID: PMC8791639 DOI: 10.7554/elife.64543] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 11/02/2020] [Accepted: 01/09/2022] [Indexed: 12/12/2022]
Abstract
Predicting function from sequence is a central problem of biology. Currently, this is possible only locally in a narrow mutational neighborhood around a wildtype sequence rather than globally from any sequence. Using random mutant libraries, we developed a biophysical model that accounts for multiple features of σ70 binding bacterial promoters to predict constitutive gene expression levels from any sequence. We experimentally and theoretically estimated that 10–20% of random sequences lead to expression and ~80% of non-expressing sequences are one mutation away from a functional promoter. The potential for generating expression from random sequences is so pervasive that selection acts against σ70-RNA polymerase binding sites even within inter-genic, promoter-containing regions. This pervasiveness of σ70-binding sites implies that emergence of promoters is not the limiting step in gene regulatory evolution. Ultimately, the inclusion of novel features of promoter function into a mechanistic model enabled not only more accurate predictions of gene expression levels, but also identified that promoters evolve more rapidly than previously thought.
Affiliation(s)
- Mato Lagator
- School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Srdjan Sarikas
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Center for Physiology and Pharmacology, Medical University of Vienna, Klosterneuburg, Austria
- Jonathan P Bollback
- Institute of Integrative Biology, Functional and Comparative Genomics, University of Liverpool, Liverpool, United Kingdom
- Calin C Guet
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gašper Tkačik
- Institute of Science and Technology Austria, Klosterneuburg, Austria
12
Wang Y, Lei R, Nourmohammad A, Wu NC. Antigenic evolution of human influenza H3N2 neuraminidase is constrained by charge balancing. eLife 2021; 10:e72516. [PMID: 34878407 PMCID: PMC8683081 DOI: 10.7554/elife.72516] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/27/2021] [Accepted: 12/07/2021] [Indexed: 11/13/2022]
Abstract
As one of the main influenza antigens, neuraminidase (NA) in H3N2 virus has evolved extensively for more than 50 years due to continuous immune pressure. While NA has recently emerged as an effective vaccine target, biophysical constraints on the antigenic evolution of NA remain largely elusive. Here, we apply combinatorial mutagenesis and next-generation sequencing to characterize the local fitness landscape in an antigenic region of NA in six different human H3N2 strains that were isolated around 10 years apart. The local fitness landscape correlates well among strains and the pairwise epistasis is highly conserved. Our analysis further demonstrates that local net charge governs the pairwise epistasis in this antigenic region. In addition, we show that residue coevolution in this antigenic region is correlated with the pairwise epistasis between charge states. Overall, this study demonstrates the importance of quantifying epistasis and the underlying biophysical constraint for building a model of influenza evolution.
Affiliation(s)
- Yiquan Wang
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Ruipeng Lei
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Armita Nourmohammad
- Department of Physics, University of Washington, Seattle, United States
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Fred Hutchinson Cancer Research Center, Seattle, United States
- Nicholas C Wu
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, United States
- Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, United States
13
Abstract
Because gene expression is important for evolutionary adaptation, its misregulation is an important cause of maladaptation. A misregulated gene can be incorrectly silent ("off") when a transcription factor (TF) that is required for its activation does not bind its regulatory region. Conversely, a misregulated gene can be incorrectly active ("on") when a TF not normally involved in its activation binds its regulatory region, a phenomenon also known as regulatory crosstalk. DNA mutations that destroy or create TF binding sites on DNA are an important source of misregulation and crosstalk. Although misregulation reduces fitness in an environment to which an organism is well-adapted, it may become adaptive in a new environment. Here, I derive simple yet general mathematical expressions that delimit the conditions under which misregulation can be adaptive. These expressions depend on the strength of selection against misregulation, on the fraction of DNA sequence space filled with TF binding sites, and on the fraction of genes that must be expressed for optimal adaptation. I then use empirical data from RNA sequencing, protein-binding microarrays, and genome evolution, together with population genetic simulations to ask when these conditions are likely to be met. I show that they can be met under realistic circumstances, but these circumstances may vary among organisms and environments. My analysis provides a framework in which improved theory and data collection can help us demonstrate the role of misregulation in adaptation. It also shows that misregulation, like DNA mutation, is one of life's many imperfections that can help propel Darwinian evolution.
Affiliation(s)
- Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, CH-8057, Switzerland
- The Santa Fe Institute, Santa Fe, NM 87501, USA
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
14
Meijers M, Vanshylla K, Gruell H, Klein F, Lässig M. Predicting in vivo escape dynamics of HIV-1 from a broadly neutralizing antibody. Proc Natl Acad Sci U S A 2021; 118:e2104651118. [PMID: 34301904 PMCID: PMC8325275 DOI: 10.1073/pnas.2104651118] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/31/2022]
Abstract
Broadly neutralizing antibodies are promising candidates for treatment and prevention of HIV-1 infections. Such antibodies can temporarily suppress viral load in infected individuals; however, the virus often rebounds by escape mutants that have evolved resistance. In this paper, we map a fitness model of HIV-1 interacting with broadly neutralizing antibodies using in vivo data from a recent clinical trial. We identify two fitness factors, antibody dosage and viral load, that determine viral reproduction rates reproducibly across different hosts. The model successfully predicts the escape dynamics of HIV-1 in the course of an antibody treatment, including a characteristic frequency turnover between sensitive and resistant strains. This turnover is governed by a dosage-dependent fitness ranking, resulting from an evolutionary trade-off between antibody resistance and its collateral cost in drug-free growth. Our analysis suggests resistance-cost trade-off curves as a measure of antibody performance in the presence of resistance evolution.
Affiliation(s)
- Matthijs Meijers
- Institut für Biologische Physik, University of Cologne, 50937 Cologne, Germany
- Kanika Vanshylla
- Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine, University of Cologne, 50931 Cologne, Germany
- Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital Cologne, University of Cologne, 50931 Cologne, Germany
- Henning Gruell
- Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine, University of Cologne, 50931 Cologne, Germany
- Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital Cologne, University of Cologne, 50931 Cologne, Germany
- Florian Klein
- Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine, University of Cologne, 50931 Cologne, Germany
- Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital Cologne, University of Cologne, 50931 Cologne, Germany
- Partner Site Bonn-Cologne, German Center for Infection Research, 50931 Cologne, Germany
- Center for Molecular Medicine, University of Cologne, 50931 Cologne, Germany
- Michael Lässig
- Institut für Biologische Physik, University of Cologne, 50937 Cologne, Germany
|
15
|
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models and synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
- Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain.
- José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
- Jacobo Aguirre
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Sebastian E Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
- Alejandro V Cano
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
- Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
- Santiago F Elena
- Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
- Paulien Hogeweg
- Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
- Bhavin S Khatri
- The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
- Joachim Krug
- Institute for Biological Physics, University of Cologne, Köln, Germany
- Ard A Louis
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
- Nora S Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
- Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marcel Weiß
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
|
16
|
Tang H, Wu Y, Deng J, Chen N, Zheng Z, Wei Y, Luo X, Keasling JD. Promoter Architecture and Promoter Engineering in Saccharomyces cerevisiae. Metabolites 2020; 10:metabo10080320. [PMID: 32781665 PMCID: PMC7466126 DOI: 10.3390/metabo10080320] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 06/30/2020] [Revised: 07/30/2020] [Accepted: 08/04/2020] [Indexed: 12/23/2022] Open
Abstract
Promoters play an essential role in the regulation of gene expression for fine-tuning genetic circuits and metabolic pathways in Saccharomyces cerevisiae (S. cerevisiae). However, native promoters in S. cerevisiae have several limitations which hinder their applications in metabolic engineering. These limitations include an inadequate number of well-characterized promoters, poor dynamic range, and insufficient orthogonality to endogenous regulations. Therefore, it is necessary to perform promoter engineering to create synthetic promoters with better properties. Here, we review recent advances related to promoter architecture, promoter engineering and synthetic promoter applications in S. cerevisiae. We also provide a perspective of future directions in this field with an emphasis on the recent advances of machine learning based promoter designs.
Affiliation(s)
- Hongting Tang
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Yanling Wu
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Jiliang Deng
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Nanzhu Chen
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Zhaohui Zheng
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Yongjun Wei
- School of Pharmaceutical Sciences, Key Laboratory of Advanced Drug Preparation Technologies, Ministry of Education, Zhengzhou University, Zhengzhou 450001, China
- Xiaozhou Luo
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Correspondence: X.L.; J.D.K.
- Jay D. Keasling
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Chinese Academy of Sciences, Shenzhen 518055, China
- Joint BioEnergy Institute, Emeryville, CA 94608, USA
- Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Department of Chemical and Biomolecular Engineering & Department of Bioengineering, University of California, Berkeley, CA 94720, USA
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Correspondence: X.L.; J.D.K.
|
17
|
Zhou J, McCandlish DM. Minimum epistasis interpolation for sequence-function relationships. Nat Commun 2020; 11:1782. [PMID: 32286265 PMCID: PMC7156698 DOI: 10.1038/s41467-020-15512-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/02/2019] [Accepted: 03/12/2020] [Indexed: 12/17/2022] Open
Abstract
Massively parallel phenotyping assays have provided unprecedented insight into how multiple mutations combine to determine biological function. While such assays can measure phenotypes for thousands to millions of genotypes in a single experiment, in practice these measurements are not exhaustive, so that there is a need for techniques to impute values for genotypes whose phenotypes have not been directly assayed. Here, we present an imputation method based on inferring the least epistatic possible sequence-function relationship compatible with the data. In particular, we infer the reconstruction where mutational effects change as little as possible across adjacent genetic backgrounds. The resulting models can capture complex higher-order genetic interactions near the data, but approach additivity where data is sparse or absent. We apply the method to high-throughput transcription factor binding assays and use it to explore a fitness landscape for protein G.
Affiliation(s)
- Juannan Zhou
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
|
18
|
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022] Open
Abstract
With the molecular revolution in biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS-ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS-ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
|
19
|
Khatri BS, Goldstein RA. Biophysics and population size constrains speciation in an evolutionary model of developmental system drift. PLoS Comput Biol 2019; 15:e1007177. [PMID: 31335870 PMCID: PMC6677325 DOI: 10.1371/journal.pcbi.1007177] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/30/2018] [Revised: 08/02/2019] [Accepted: 06/13/2019] [Indexed: 02/06/2023] Open
Abstract
Developmental system drift is a likely mechanism for the origin of hybrid incompatibilities between closely related species. We examine here the detailed mechanistic basis of hybrid incompatibilities between two allopatric lineages, for a genotype-phenotype map of developmental system drift under stabilising selection, where an organismal phenotype is conserved, but the underlying molecular phenotypes and genotype can drift. This leads to a number of emergent phenomena not obtainable by modelling genotype or phenotype alone. Our results show that: 1) speciation is more rapid at smaller population sizes, with a characteristic, Orr-like, power law, but is slow at large population sizes, characterised by a sub-diffusive growth law; 2) the molecular phenotypes under weakest selection contribute to the earliest incompatibilities; and 3) pair-wise incompatibilities dominate over higher order, contrary to previous predictions that the latter should dominate. The population size effect we find is consistent with previous results on allopatric divergence of transcription factor-DNA binding, where smaller populations have common ancestors with a larger drift load because genetic drift favours phenotypes which have a larger number of genotypes (higher sequence entropy) over more fit phenotypes which have far fewer genotypes; this means fewer substitutions are required in either lineage before incompatibilities arise. Overall, our results indicate that biophysics and population size provide a much stronger constraint to speciation than suggested by previous models, and point to a general mechanistic principle of how incompatibilities arise under stabilising selection for an organismal phenotype.
The process of speciation is of fundamental importance to the field of evolution as it is intimately connected to understanding the immense bio-diversity of life. There is still relatively little understanding of the underlying genetic mechanisms that give rise to hybrid incompatibilities, with results suggesting that divergence in transcription factor DNA binding and gene expression play an important role. A key finding from the field of evo-devo is that organismal phenotypes show developmental system drift, where species maintain the same phenotype, but diverge in developmental pathways; this is an important potential source of hybrid incompatibilities. Here, we explore a theoretical framework to understand how incompatibilities arise due to developmental system drift, using a tractable biophysically inspired genotype-phenotype map for spatial gene expression. Modelling the evolution of phenotypes in this way has the key advantage that it mirrors how selection works in nature, i.e. that selection acts on phenotypes, but variation (mutation) arises at the level of genotypes. This results, as we demonstrate, in a number of non-trivial and testable predictions concerning speciation due to developmental system drift, which would not be obtainable by modelling evolution of genotypes or phenotypes alone.
Affiliation(s)
- Richard A. Goldstein
- Division of Infection & Immunity, University College London, London, United Kingdom
|
20
|
Held T, Klemmer D, Lässig M. Survival of the simplest in microbial evolution. Nat Commun 2019; 10:2472. [PMID: 31171781 PMCID: PMC6554311 DOI: 10.1038/s41467-019-10413-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/02/2018] [Accepted: 05/10/2019] [Indexed: 01/09/2023] Open
Abstract
The evolution of microbial and viral organisms often generates clonal interference, a mode of competition between genetic clades within a population. Here we show how interference impacts systems biology by constraining genetic and phenotypic complexity. Our analysis uses biophysically grounded evolutionary models for molecular phenotypes, such as fold stability and enzymatic activity of genes. We find a generic mode of phenotypic interference that couples the function of individual genes and the population’s global evolutionary dynamics. Biological implications of phenotypic interference include rapid collateral system degradation in adaptation experiments and long-term selection against genome complexity: each additional gene carries a cost proportional to the total number of genes. Recombination above a threshold rate can eliminate this cost, which establishes a universal, biophysically grounded scenario for the evolution of sex. In a broader context, our analysis suggests that the systems biology of microbes is strongly intertwined with their mode of evolution. In asexual populations selection at different genomic loci can interfere with each other. Here, using a biophysical model of molecular evolution the authors show that interference results in long-term degradation of molecular function, an effect that strongly depends on genome size.
Affiliation(s)
- Torsten Held
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
- Daniel Klemmer
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
- Michael Lässig
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany.
|
21
|
Hill MS, Reuter M, Stewart AJ. Sexual antagonism drives the displacement of polymorphism across gene regulatory cascades. Proc Biol Sci 2019; 286:20190660. [PMID: 31161912 DOI: 10.1098/rspb.2019.0660] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022] Open
Abstract
Males and females have different reproductive roles and are often subject to contrasting selection pressures. This sexual antagonism can lead, at a given locus, to different alleles being favoured in each sex and, consequently, to genetic variation being maintained in a population. Although the presence of sexually antagonistic (SA) polymorphisms has been documented across a range of species, their evolutionary dynamics remain poorly understood. Here, we study SA selection on gene expression, which is fundamental to sexual dimorphism, via the evolution of regulatory binding sites. We show that for sites longer than 1 nucleotide, expression polymorphism is maintained only when intermediate expression levels are deleterious to both sexes. We then show that, in a regulatory cascade, expression polymorphism tends to become displaced over evolutionary time from the target of SA selection to upstream regulators. Our results have consequences for understanding the evolution of sexual dimorphism, and provide specific empirical predictions for the regulatory architecture of genes under SA selection.
Affiliation(s)
- Mark S Hill
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA; Research Department of Genetics, Evolution and Environment, University College London, London, UK
- Max Reuter
- Research Department of Genetics, Evolution and Environment, University College London, London, UK
- Alexander J Stewart
- Department of Biology and Biochemistry, University of Houston, Houston, TX, USA
|
22
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
23
|
Collins-Hed AI, Ardell DH. Match fitness landscapes for macromolecular interaction networks: Selection for translational accuracy and rate can displace tRNA-binding interfaces of non-cognate aminoacyl-tRNA synthetases. Theor Popul Biol 2019; 129:68-80. [PMID: 31042487 DOI: 10.1016/j.tpb.2019.03.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/01/2018] [Revised: 01/26/2019] [Accepted: 03/13/2019] [Indexed: 12/21/2022]
Abstract
Advances in structural biology of aminoacyl-tRNA synthetases (aaRSs) have revealed incredible diversity in how aaRSs bind their tRNA substrates. The causes of this diversity remain mysterious. We developed a new class of highly rugged fitness landscape models called match landscapes, through which genes encode the assortative interactions of their gene products through the complementarity and identifiability of their structural features. We used results from coding theory to prove bounds and equalities on fitness in match landscapes assuming additive interaction energies, macroscopic aminoacylation kinetics including proofreading, site-specific modifiers of interaction, and selection for translational accuracy in multiple, perfectly encoded site-types. Using genotypes based on extended Hamming codes we show that over a wide array of interface sizes and numbers of encoded cognate pairs, selection for translational accuracy alone is insufficient to displace the tRNA-binding interfaces of aaRSs. Yet, under combined selection for translational accuracy and rate, site-specific modifiers are selected to adaptively displace the tRNA-binding interfaces of non-cognate aaRS-tRNA pairs. We describe a remarkable correspondence between the lengths of perfect RNA (quaternary) codes and the modal sizes of small non-coding RNA families.
Affiliation(s)
- Andrea I Collins-Hed
- Quantitative and Systems Biology Program, University of California, Merced, CA, 95306, United States
- David H Ardell
- Quantitative and Systems Biology Program, University of California, Merced, CA, 95306, United States; Molecular and Cell Biology Department, School of Natural Sciences, University of California, Merced, CA, 95306, United States.
|
24
|
Djordjevic M, Rodic A, Graovac S. From biophysics to 'omics and systems biology. Eur Biophys J 2019; 48:413-424. [PMID: 30972433 DOI: 10.1007/s00249-019-01366-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/12/2018] [Revised: 02/12/2019] [Accepted: 04/03/2019] [Indexed: 01/03/2023]
Abstract
Recent decades brought a revolution to biology, driven mainly by exponentially increasing amounts of data coming from "'omics" sciences. To handle these data, bioinformatics often has to combine biologically heterogeneous signals, for which methods from statistics and engineering (e.g. machine learning) are often used. While such an approach is sometimes necessary, it effectively treats the underlying biological processes as a black box. Similarly, systems biology deals with inherently complex systems, characterized by a large number of degrees of freedom, and interactions that are highly non-linear. To deal with this complexity, the underlying physical interactions are often (over)simplified, such as in Boolean modelling of network dynamics. In this review, we argue for the utility of applying a biophysical approach in bioinformatics and systems biology, including discussion of two examples from our research which address sequence analysis and understanding intracellular gene expression dynamics.
Affiliation(s)
- Marko Djordjevic
- Faculty of Biology, Institute of Physiology and Biochemistry, University of Belgrade, Belgrade, Serbia.
- Andjela Rodic
- Faculty of Biology, Institute of Physiology and Biochemistry, University of Belgrade, Belgrade, Serbia; Interdisciplinary PhD Program in Biophysics, University of Belgrade, Belgrade, Serbia
- Stefan Graovac
- Faculty of Biology, Institute of Physiology and Biochemistry, University of Belgrade, Belgrade, Serbia; Interdisciplinary PhD Program in Biophysics, University of Belgrade, Belgrade, Serbia
|
25
|
Otwinowski J. Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function. Mol Biol Evol 2018; 35:2345-2354. [PMID: 30085303 PMCID: PMC6188545 DOI: 10.1093/molbev/msy141] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Indexed: 12/14/2022] Open
Abstract
Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. The essential function of many proteins that fold into a specific structure is their ability to bind to a ligand, which can be assayed for thousands of mutated variants. However, binding assays do not distinguish whether mutations affect the stability of the binding interface or the overall fold. Here, we introduce a statistical method to infer a detailed energy landscape of how a protein folds and binds to a ligand by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high quality data of protein G domain B1 binding to IgG-Fc. We infer distinct folding and binding energies for each mutation providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer the folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due to its nonlinear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.
Affiliation(s)
- Jakub Otwinowski
- Biology Department, University of Pennsylvania, Philadelphia, PA
|
26
|
Igler C, Lagator M, Tkačik G, Bollback JP, Guet CC. Evolutionary potential of transcription factors for gene regulatory rewiring. Nat Ecol Evol 2018; 2:1633-1643. [PMID: 30201966 DOI: 10.1038/s41559-018-0651-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/20/2018] [Accepted: 07/27/2018] [Indexed: 11/09/2022]
Abstract
Gene regulatory networks evolve through rewiring of individual components, that is, through changes in regulatory connections. However, the mechanistic basis of regulatory rewiring is poorly understood. Using a canonical gene regulatory system, we quantify the properties of transcription factors that determine the evolutionary potential for rewiring of regulatory connections: robustness, tunability and evolvability. In vivo repression measurements of two repressors at mutated operator sites reveal their contrasting evolutionary potential: while robustness and evolvability were positively correlated, both were in trade-off with tunability. Epistatic interactions between adjacent operators alleviated this trade-off. A thermodynamic model explains how the differences in robustness, tunability and evolvability arise from biophysical characteristics of repressor-DNA binding. The model also uncovers that the energy matrix, which describes how mutations affect repressor-DNA binding, encodes crucial information about the evolutionary potential of a repressor. The biophysical determinants of evolutionary potential for regulatory rewiring constitute a mechanistic framework for understanding network evolution.
Affiliation(s)
- Mato Lagator
- IST Austria, Am Campus 1, Klosterneuburg, Austria
- Jonathan P Bollback
- IST Austria, Am Campus 1, Klosterneuburg, Austria; Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Călin C Guet
- IST Austria, Am Campus 1, Klosterneuburg, Austria.
|
27
|
Systematic approach for dissecting the molecular mechanisms of transcriptional regulation in bacteria. Proc Natl Acad Sci U S A 2018; 115:E4796-E4805. [PMID: 29728462 PMCID: PMC6003448 DOI: 10.1073/pnas.1722055115] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Indexed: 12/21/2022] Open
Abstract
Organisms must constantly make regulatory decisions in response to a change in cellular state or environment. However, while the catalog of genomes expands rapidly, we remain ignorant about how the genes in these genomes are regulated. Here, we show how a massively parallel reporter assay, Sort-Seq, and information-theoretic modeling can be used to identify regulatory sequences. We then use chromatography and mass spectrometry to identify the regulatory proteins that bind these sequences. The approach results in quantitative base pair-resolution models of promoter mechanism and was shown in both well-characterized and unannotated promoters in Escherichia coli. Given the generality of the approach, it opens up the possibility of quantitatively dissecting the mechanisms of promoter function in a wide range of bacteria. Gene regulation is one of the most ubiquitous processes in biology. However, while the catalog of bacterial genomes continues to expand rapidly, we remain ignorant about how almost all of the genes in these genomes are regulated. At present, characterizing the molecular mechanisms by which individual regulatory sequences operate requires focused efforts using low-throughput methods. Here, we take a first step toward multipromoter dissection and show how a combination of massively parallel reporter assays, mass spectrometry, and information-theoretic modeling can be used to dissect multiple bacterial promoters in a systematic way. We show this approach on both well-studied and previously uncharacterized promoters in the enteric bacterium Escherichia coli. In all cases, we recover nucleotide-resolution models of promoter mechanism. For some promoters, including previously unannotated ones, the approach allowed us to further extract quantitative biophysical models describing input–output relationships. Given the generality of the approach presented here, it opens up the possibility of quantitatively dissecting the mechanisms of promoter function in E. coli and a wide range of other bacteria.
|
28
|
Comprehensive, high-resolution binding energy landscapes reveal context dependencies of transcription factor binding. Proc Natl Acad Sci U S A 2018; 115:E3702-E3711. [PMID: 29588420 PMCID: PMC5910820 DOI: 10.1073/pnas.1715888115] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Indexed: 11/18/2022] Open
Abstract
Transcription factors (TFs) are primary regulators of gene expression in cells, where they bind specific genomic target sites to control transcription. Quantitative measurements of TF-DNA binding energies can improve the accuracy of predictions of TF occupancy and downstream gene expression in vivo and shed light on how transcriptional networks are rewired throughout evolution. Here, we present a sequencing-based TF binding assay and analysis pipeline (BET-seq, for Binding Energy Topography by sequencing) capable of providing quantitative estimates of binding energies for more than one million DNA sequences in parallel at high energetic resolution. Using this platform, we measured the binding energies associated with all possible combinations of 10 nucleotides flanking the known consensus DNA target interacting with two model yeast TFs, Pho4 and Cbf1. A large fraction of these flanking mutations change overall binding energies by an amount equal to or greater than consensus site mutations, suggesting that current definitions of TF binding sites may be too restrictive. By systematically comparing estimates of binding energies output by deep neural networks (NNs) and biophysical models trained on these data, we establish that dinucleotide (DN) specificities are sufficient to explain essentially all variance in observed binding behavior, with Cbf1 binding exhibiting significantly more nonadditivity than Pho4. NN-derived binding energies agree with orthogonal biochemical measurements and reveal that dynamically occupied sites in vivo are both energetically and mutationally distant from the highest affinity sites.
|
29
|
Gursky VV, Kozlov KN, Kulakovskiy IV, Zubair A, Marjoram P, Lawrie DS, Nuzhdin SV, Samsonova MG. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network. PLoS One 2017; 12:e0184657. [PMID: 28898266 PMCID: PMC5595321 DOI: 10.1371/journal.pone.0184657] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/05/2017] [Accepted: 08/28/2017] [Indexed: 11/18/2022] Open
Abstract
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects.
Affiliation(s)
- Vitaly V. Gursky
  - Theoretical Department, Ioffe Institute, Saint Petersburg, Russia
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Konstantin N. Kozlov
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Ivan V. Kulakovskiy
  - Engelhardt Institute of Molecular Biology, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow, Russia
  - Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Asif Zubair
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Paul Marjoram
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- David S. Lawrie
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Sergey V. Nuzhdin
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Maria G. Samsonova
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
|
30
|
|
31
|
A thousand empirical adaptive landscapes and their navigability. Nat Ecol Evol 2017; 1:45. [PMID: 28812623 DOI: 10.1038/s41559-016-0045] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 06/06/2016] [Accepted: 12/05/2016] [Indexed: 01/22/2023]
Abstract
The adaptive landscape is an iconic metaphor that pervades evolutionary biology. It was mostly applied in theoretical models until recent years, when empirical data began to allow partial landscape reconstructions. Here, we exhaustively analyse 1,137 complete landscapes from 129 eukaryotic species, each describing the binding affinity of a transcription factor to all possible short DNA sequences. We find that the navigability of these landscapes through single mutations is intermediate to that of additive and shuffled null models, suggesting that binding affinity, and thereby gene expression, is readily fine-tuned via mutations in transcription factor binding sites. The landscapes have few peaks that vary in their accessibility and in the number of sequences they contain. Binding sites in the mouse genome are enriched in sequences found in the peaks of especially navigable landscapes and the genetic diversity of binding sites in yeast increases with the number of sequences in a peak. Our findings suggest that landscape navigability may have contributed to the enormous success of transcriptional regulation as a source of evolutionary adaptations and innovations.
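To make the notion of peaks in a complete sequence landscape concrete, here is a minimal sketch that enumerates all sequences of a fixed length and reports those with no single-mutation neighbour of equal or higher affinity. It is a simplification and an assumption on my part: the paper treats peaks as sets of sequences, whereas this sketch finds single-sequence local maxima, and the additive toy landscape and its per-base weights are invented for illustration.

```python
from itertools import product

BASES = "ACGT"

def neighbors(seq):
    """Yield every sequence that differs from seq by exactly one point mutation."""
    for i, old in enumerate(seq):
        for new in BASES:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]

def local_peaks(affinity):
    """Sequences whose affinity strictly exceeds that of all one-mutant neighbours."""
    return [s for s, a in affinity.items()
            if all(a > affinity[n] for n in neighbors(s))]

# Hypothetical additive landscape over all 3-mers: each base contributes
# a fixed weight, so the landscape should have exactly one peak.
weights = {"A": 3.0, "C": 2.0, "G": 1.0, "T": 0.0}
additive = {"".join(p): sum(weights[b] for b in p)
            for p in product(BASES, repeat=3)}

if __name__ == "__main__":
    print(local_peaks(additive))
```

An additive landscape like this one is the fully navigable extreme of the comparison described in the abstract; a shuffled version of the same affinity values would typically produce many more local peaks.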
|
32
|
Martin O, Krzywicki A, Zagorski M. Drivers of structural features in gene regulatory networks: From biophysical constraints to biological function. Phys Life Rev 2016; 17:124-58. [DOI: 10.1016/j.plrev.2016.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 12/21/2015] [Revised: 03/25/2016] [Accepted: 04/20/2016] [Indexed: 12/23/2022]
|
33
|
Tuğrul M, Paixão T, Barton NH, Tkačik G. Dynamics of Transcription Factor Binding Site Evolution. PLoS Genet 2015; 11:e1005639. [PMID: 26545200 PMCID: PMC4636380 DOI: 10.1371/journal.pgen.1005639] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 06/17/2015] [Accepted: 10/09/2015] [Indexed: 11/19/2022] Open
Abstract
Evolution of gene regulation is crucial for our understanding of the phenotypic differences between species, populations and individuals. Sequence-specific binding of transcription factors to the regulatory regions on the DNA is a key regulatory mechanism that determines gene expression and hence heritable phenotypic variation. We use a biophysical model for directional selection on gene expression to estimate the rates of gain and loss of transcription factor binding sites (TFBS) in finite populations under both point and insertion/deletion mutations. Our results show that these rates are typically slow for a single TFBS in an isolated DNA region, unless the selection is extremely strong. These rates decrease drastically with increasing TFBS length or increasingly specific protein-DNA interactions, making the evolution of sites longer than ∼ 10 bp unlikely on typical eukaryotic speciation timescales. Similarly, evolution converges to the stationary distribution of binding sequences very slowly, making the equilibrium assumption questionable. The availability of longer regulatory sequences in which multiple binding sites can evolve simultaneously, the presence of “pre-sites” or partially decayed old sites in the initial sequence, and biophysical cooperativity between transcription factors, can all facilitate gain of TFBS and reconcile theoretical calculations with timescales inferred from comparative genomics. Evolution has produced a remarkable diversity of living forms that manifests in qualitative differences as well as quantitative traits. An essential factor that underlies this variability is transcription factor binding sites, short pieces of DNA that control gene expression levels. Nevertheless, we lack a thorough theoretical understanding of the evolutionary times required for the appearance and disappearance of these sites. 
By combining a biophysically realistic model for how cells read out information in transcription factor binding sites with a model for DNA sequence evolution, we explore these timescales and ask what factors crucially affect them. We find that the emergence of binding sites from a random sequence is generically slow under point and insertion/deletion mutational mechanisms. Strong selection, sufficient genomic sequence in which the sites can evolve, the existence of partially decayed old binding sites in the sequence, as well as certain biophysical mechanisms such as cooperativity, can accelerate the binding site gain times and make them consistent with the timescales suggested by comparative analyses of genomic data.
Affiliation(s)
- Murat Tuğrul
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Tiago Paixão
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gašper Tkačik
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
|
34
|
Simple Biophysical Model Predicts Faster Accumulation of Hybrid Incompatibilities in Small Populations Under Stabilizing Selection. Genetics 2015; 201:1525-37. [PMID: 26434721 PMCID: PMC4676520 DOI: 10.1534/genetics.115.181685] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/05/2015] [Accepted: 09/23/2015] [Indexed: 01/07/2023] Open
Abstract
Speciation is fundamental to the process of generating the huge diversity of life on Earth. However, we are yet to have a clear understanding of its molecular-genetic basis. Here, we examine a computational model of reproductive isolation that explicitly incorporates a map from genotype to phenotype based on the biophysics of protein–DNA binding. In particular, we model the binding of a protein transcription factor to a DNA binding site and how their independent coevolution, in a stabilizing fitness landscape, of two allopatric lineages leads to incompatibilities. Complementing our previous coarse-grained theoretical results, our simulations give a new prediction for the monomorphic regime of evolution that smaller populations should develop incompatibilities more quickly. This arises as (1) smaller populations have a greater initial drift load, as there are more sequences that bind poorly than well, so fewer substitutions are needed to reach incompatible regions of phenotype space, and (2) slower divergence when the population size is larger than the inverse of discrete differences in fitness. Further, we find longer sequences develop incompatibilities more quickly at small population sizes, but more slowly at large population sizes. The biophysical model thus represents a robust mechanism of rapid reproductive isolation for small populations and large sequences that does not require peak shifts or positive selection. Finally, we show that the growth of DMIs with time is quadratic for small populations, agreeing with Orr’s model, but nonpower law for large populations, with a form consistent with our previous theoretical results.
|
35
|
Wunderlich Z, Bragdon MDJ, Vincent BJ, White JA, Estrada J, DePace AH. Krüppel Expression Levels Are Maintained through Compensatory Evolution of Shadow Enhancers. Cell Rep 2015; 12:1740-7. [PMID: 26344774 PMCID: PMC4581983 DOI: 10.1016/j.celrep.2015.08.021] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 01/23/2015] [Revised: 06/24/2015] [Accepted: 08/05/2015] [Indexed: 01/08/2023] Open
Abstract
Many developmental genes are controlled by shadow enhancers—pairs of enhancers that drive overlapping expression patterns. We hypothesized that compensatory evolution can maintain the total expression of a gene, while individual shadow enhancers diverge between species. To test this hypothesis, we analyzed expression driven by orthologous pairs of shadow enhancers from Drosophila melanogaster, Drosophila yakuba, and Drosophila pseudoobscura that control expression of Krüppel, a transcription factor that patterns the anterior-posterior axis of blastoderm embryos. We found that the expression driven by the pair of enhancers is conserved between these three species, but expression levels driven by the individual enhancers are not. Using sequence analysis and experimental perturbation, we show that each shadow enhancer is regulated by different transcription factors. These results support the hypothesis that compensatory evolution can occur between shadow enhancers, which has implications for mechanistic and evolutionary studies of gene regulation.
Affiliation(s)
- Zeba Wunderlich
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Meghan D J Bragdon
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Ben J Vincent
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Javier Estrada
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Angela H DePace
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
|
36
|
Khatri BS, Goldstein RA. A coarse-grained biophysical model of sequence evolution and the population size dependence of the speciation rate. J Theor Biol 2015; 378:56-64. [PMID: 25936759 PMCID: PMC4457359 DOI: 10.1016/j.jtbi.2015.04.027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 10/03/2014] [Revised: 02/20/2015] [Accepted: 04/20/2015] [Indexed: 11/29/2022]
Abstract
Speciation is fundamental to understanding the huge diversity of life on Earth. Although still controversial, empirical evidence suggests that the rate of speciation is larger for smaller populations. Here, we explore a biophysical model of speciation by developing a simple coarse-grained theory of transcription factor-DNA binding and how their co-evolution in two geographically isolated lineages leads to incompatibilities. To develop a tractable analytical theory, we derive a Smoluchowski equation for the dynamics of binding energy evolution that accounts for the fact that natural selection acts on phenotypes, but variation arises from mutations in sequences; the Smoluchowski equation includes selection due to both gradients in fitness and gradients in sequence entropy, which is the logarithm of the number of sequences that correspond to a particular binding energy. This simple consideration predicts that smaller populations develop incompatibilities more quickly in the weak mutation regime; this trend arises as sequence entropy poises smaller populations closer to incompatible regions of phenotype space. These results suggest a generic coarse-grained approach to evolutionary stochastic dynamics, allowing realistic modelling at the phenotypic level.
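The sequence entropy defined in the abstract above (the logarithm of the number of sequences that share a binding energy) can be illustrated with a simple mismatch model. The sketch assumes, purely for illustration, that binding energy depends only on the number of mismatches to a single best-binding sequence; under that assumption the count is a binomial coefficient times the number of wrong bases per position. The function names and the 10-bp site length are hypothetical and not taken from the paper.

```python
from math import comb, log

def sequences_with_mismatches(length, mismatches, alphabet=4):
    """Number of sequences that differ from the best binder at exactly `mismatches` positions."""
    return comb(length, mismatches) * (alphabet - 1) ** mismatches

def sequence_entropy(length, mismatches, alphabet=4):
    """Natural log of the number of sequences sharing the same mismatch count (hence the same energy)."""
    return log(sequences_with_mismatches(length, mismatches, alphabet))

if __name__ == "__main__":
    # Hypothetical 10-bp binding site: entropy rises steeply as binding weakens.
    for k in range(0, 6):
        print(k, sequences_with_mismatches(10, k), round(sequence_entropy(10, k), 2))
```

Because many more sequences bind poorly than bind well, entropy increases away from the best binder over the first several mismatches, which is the gradient that the abstract describes as poising small populations closer to incompatible regions.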
Affiliation(s)
- Bhavin S Khatri
  - The Francis Crick Institute, Mill Hill Laboratory, The Ridgeway, London NW7 1AA, UK; Division of Infection & Immunity, University College London, London WC1E 6BT, UK
- Richard A Goldstein
  - Division of Infection & Immunity, University College London, London WC1E 6BT, UK
|
37
|
Young RS, Hayashizaki Y, Andersson R, Sandelin A, Kawaji H, Itoh M, Lassmann T, Carninci P, Bickmore WA, Forrest AR, Taylor MS. The frequent evolutionary birth and death of functional promoters in mouse and human. Genome Res 2015; 25:1546-57. [PMID: 26228054 PMCID: PMC4579340 DOI: 10.1101/gr.190546.115] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 02/04/2015] [Accepted: 07/28/2015] [Indexed: 12/04/2022]
Abstract
Promoters are central to the regulation of gene expression. Changes in gene regulation are thought to underlie much of the adaptive diversification between species and phenotypic variation within populations. In contrast to earlier work emphasizing the importance of enhancer evolution and subtle sequence changes at promoters, we show that dramatic changes such as the complete gain and loss (collectively, turnover) of functional promoters are common. Using quantitative measures of transcription initiation in both humans and mice across 52 matched tissues, we discriminate promoter sequence gains from losses and resolve the lineage of changes. We also identify expression divergence and functional turnover between orthologous promoters, finding only the latter is associated with local sequence changes. Promoter turnover has occurred at the majority (>56%) of protein-coding genes since humans and mice diverged. Tissue-restricted promoters are the most evolutionarily volatile where retrotransposition is an important, but not the sole, source of innovation. There is considerable heterogeneity of turnover rates between promoters in different tissues, but the consistency of these in both lineages suggests that the same biological systems are similarly inclined to transcriptional rewiring. The genes affected by promoter turnover show evidence of adaptive evolution. In mice, promoters are primarily lost through deletion of the promoter containing sequence, whereas in humans, many promoters appear to be gradually decaying with weak transcriptional output and relaxed selective constraint. Our results suggest that promoter gain and loss is an important process in the evolutionary rewiring of gene regulation and may be a significant source of phenotypic diversification.
Affiliation(s)
- Robert S Young
  - MRC Human Genetics Unit, MRC Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, United Kingdom
- Yoshihide Hayashizaki
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama, 351-0198, Japan
- Robin Andersson
  - Department of Biology and Biotech Research and Innovation Centre, Copenhagen University, 2200 Copenhagen N, Denmark
- Albin Sandelin
  - Department of Biology and Biotech Research and Innovation Centre, Copenhagen University, 2200 Copenhagen N, Denmark
- Hideya Kawaji
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama, 351-0198, Japan; RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Tsurumi-ku, Yokohama, 230-0045, Japan
- Masayoshi Itoh
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama, 351-0198, Japan; RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Tsurumi-ku, Yokohama, 230-0045, Japan
- Timo Lassmann
  - RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Tsurumi-ku, Yokohama, 230-0045, Japan
- Piero Carninci
  - RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Tsurumi-ku, Yokohama, 230-0045, Japan
- Wendy A Bickmore
  - MRC Human Genetics Unit, MRC Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, United Kingdom
- Alistair R Forrest
  - RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Tsurumi-ku, Yokohama, 230-0045, Japan; Systems Biology and Genomics, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, Western Australia 6009, Australia
- Martin S Taylor
  - MRC Human Genetics Unit, MRC Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, United Kingdom
|
38
|
Multiple-Line Inference of Selection on Quantitative Traits. Genetics 2015; 201:305-22. [PMID: 26139839 DOI: 10.1534/genetics.115.178988] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/09/2014] [Accepted: 06/18/2015] [Indexed: 11/18/2022] Open
Abstract
Trait differences between species may be attributable to natural selection. However, quantifying the strength of evidence for selection acting on a particular trait is a difficult task. Here we develop a population genetics test for selection acting on a quantitative trait that is based on multiple-line crosses. We show that using multiple lines increases both the power and the scope of selection inferences. First, a test based on three or more lines detects selection with strongly increased statistical significance, and we show explicitly how the sensitivity of the test depends on the number of lines. Second, a multiple-line test can distinguish between different lineage-specific selection scenarios. Our analytical results are complemented by extensive numerical simulations. We then apply the multiple-line test to QTL data on floral character traits in plant species of the Mimulus genus and on photoperiodic traits in different maize strains, where we find a signature of lineage-specific selection not seen in two-line tests.
|
39
|
Evolutionary meandering of intermolecular interactions along the drift barrier. Proc Natl Acad Sci U S A 2014; 112:E30-8. [PMID: 25535374 DOI: 10.1073/pnas.1421641112] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 11/18/2022] Open
Abstract
Many cellular functions depend on highly specific intermolecular interactions, for example transcription factors and their DNA binding sites, microRNAs and their RNA binding sites, the interfaces between heterodimeric protein molecules, the stems in RNA molecules, and kinases and their response regulators in signal-transduction systems. Despite the need for complementarity between interacting partners, such pairwise systems seem to be capable of high levels of evolutionary divergence, even when subject to strong selection. Such behavior is a consequence of the diminishing advantages of increasing binding affinity between partners, the multiplicity of evolutionary pathways between selectively equivalent alternatives, and the stochastic nature of evolutionary processes. Because mutation pressure toward reduced affinity conflicts with selective pressure for greater interaction, situations can arise in which the expected distribution of the degree of matching between interacting partners is bimodal, even in the face of constant selection. Although biomolecules with larger numbers of interacting partners are subject to increased levels of evolutionary conservation, their more numerous partners need not converge on a single sequence motif or be increasingly constrained in more complex systems. These results suggest that most phylogenetic differences in the sequences of binding interfaces are not the result of adaptive fine tuning but a simple consequence of random genetic drift.
|
40
|
Siepel A, Arbiza L. Cis-regulatory elements and human evolution. Curr Opin Genet Dev 2014; 29:81-9. [PMID: 25218861 PMCID: PMC4258466 DOI: 10.1016/j.gde.2014.08.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 05/28/2014] [Revised: 08/17/2014] [Accepted: 08/23/2014] [Indexed: 11/20/2022]
Abstract
Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider 'new frontiers' in this field stemming from recent research on transcriptional regulation.
Affiliation(s)
- Adam Siepel
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Leonardo Arbiza
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
|
41
|
Hybrid incompatibility despite pleiotropic constraint in a sequence-based bioenergetic model of transcription factor binding. Genetics 2014; 198:1645-54. [PMID: 25313130 DOI: 10.1534/genetics.114.171397] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/07/2023] Open
Abstract
Hybrid incompatibility can result from gene misregulation produced by divergence in trans-acting regulatory factors and their cis-regulatory targets. However, change in trans-acting factors may be constrained by pleiotropy, which would in turn limit the evolution of incompatibility. We employed a mechanistically explicit bioenergetic model of gene expression wherein parameter combinations (number of transcription factor molecules, energetic properties of binding to the regulatory site, and genomic background size) determine the shape of the genotype-phenotype (G-P) map, and interacting allelic variants of mutable cis and trans sites determine the phenotype along that map. Misregulation occurs when the phenotype differs from its optimal value. We simulated a pleiotropic regulatory pathway involving a positively selected and a conserved trait regulated by a shared transcription factor (TF), with two populations evolving in parallel. Pleiotropic constraints shifted evolution in the positively selected trait to its cis-regulatory locus. We nevertheless found that the TF genotypes often evolved, accompanied by compensatory evolution in the conserved trait, and both traits contributed to hybrid misregulation. Compensatory evolution resulted in "developmental system drift," whereby the regulatory basis of the conserved phenotype changed although the phenotype itself did not. Pleiotropic constraints became stronger and in some cases prohibitive when the bioenergetic properties of the molecular interaction produced a G-P map that was too steep. Likewise, compensatory evolution slowed and hybrid misregulation was not evident when the G-P map was too shallow. A broad pleiotropic "sweet spot" nevertheless existed where evolutionary constraints were moderate to weak, permitting substantial hybrid misregulation in both traits. None of these pleiotropic constraints manifested when the TF contained nonrecombining domains independently regulating the respective traits.
|
42
|
McCandlish DM, Stoltzfus A. Modeling evolution using the probability of fixation: history and implications. Q Rev Biol 2014; 89:225-52. [PMID: 25195318 DOI: 10.1086/677571] [Citation(s) in RCA: 102] [Impact Index Per Article: 9.3] [Indexed: 11/03/2022]
Abstract
Many models of evolution calculate the rate of evolution by multiplying the rate at which new mutations originate within a population by a probability of fixation. Here we review the historical origins, contemporary applications, and evolutionary implications of these "origin-fixation" models, which are widely used in evolutionary genetics, molecular evolution, and phylogenetics. Origin-fixation models were first introduced in 1969, in association with an emerging view of "molecular" evolution. Early origin-fixation models were used to calculate an instantaneous rate of evolution across a large number of independently evolving loci; in the 1980s and 1990s, a second wave of origin-fixation models emerged to address a sequence of fixation events at a single locus. Although origin-fixation models have been applied to a broad array of problems in contemporary evolutionary research, their rise in popularity has not been accompanied by an increased appreciation of their restrictive assumptions or their distinctive implications. We argue that origin-fixation models constitute a coherent theory of mutation-limited evolution that contrasts sharply with theories of evolution that rely on the presence of standing genetic variation. A major unsolved question in evolutionary biology is the degree to which these models provide an accurate approximation of evolution in natural populations.
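The origin-fixation calculation described in this abstract (rate of origin of new mutations multiplied by a probability of fixation) can be illustrated with a minimal Python sketch. It assumes Kimura's diffusion approximation for a new mutation in a diploid population whose effective size equals its census size; that choice, and the names `fixation_probability` and `substitution_rate`, are illustrative assumptions and are not taken from the review.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's diffusion approximation for a single new mutation.

    Assumptions (not from the cited review): diploid population of size n,
    effective size equal to n, initial frequency 1/(2n), selection
    coefficient s acting additively.
    """
    if s == 0.0:
        # Neutral limit: fixation probability equals the initial frequency.
        return 1.0 / (2 * n)
    exponent = -4.0 * n * s
    if exponent > 700.0:
        # Strongly deleterious in a large population: effectively zero,
        # and this guard avoids floating-point overflow in expm1.
        return 0.0
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def substitution_rate(mu: float, s: float, n: int) -> float:
    """Origin-fixation rate: (2n * mu new mutations per generation) * P(fix)."""
    return 2 * n * mu * fixation_probability(s, n)


if __name__ == "__main__":
    for s in (-0.001, 0.0, 0.001, 0.01):
        print(s, substitution_rate(mu=1e-8, s=s, n=1000))
```

In the neutral case the population size cancels and the rate reduces to the mutation rate, which is a convenient sanity check on any implementation of this kind.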
|
43
|
Hybrid incompatibility arises in a sequence-based bioenergetic model of transcription factor binding. Genetics 2014; 198:1155-66. [PMID: 25173845 DOI: 10.1534/genetics.114.168112] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 11/18/2022] Open
Abstract
Postzygotic isolation between incipient species results from the accumulation of incompatibilities that arise as a consequence of genetic divergence. When phenotypes are determined by regulatory interactions, hybrid incompatibility can evolve even as a consequence of parallel adaptation in parental populations because interacting genes can produce the same phenotype through incompatible allelic combinations. We explore the evolutionary conditions that promote and constrain hybrid incompatibility in regulatory networks using a bioenergetic model (combining thermodynamics and kinetics) of transcriptional regulation, considering the bioenergetic basis of molecular interactions between transcription factors (TFs) and their binding sites. The bioenergetic parameters consider the free energy of formation of the bond between the TF and its binding site and the availability of TFs in the intracellular environment. Together these determine the fractional occupancy of the TF on the promoter site, the degree of subsequent gene expression, and, in diploids, the degree of dominance among allelic interactions. This results in a sigmoid genotype-phenotype map and fitness landscape, with the details of the shape determining the degree of bioenergetic evolutionary constraint on hybrid incompatibility. Using individual-based simulations, we subjected two allopatric populations to parallel directional or stabilizing selection. Misregulation of hybrid gene expression occurred under either type of selection, although it evolved faster under directional selection. Under directional selection, the extent of hybrid incompatibility increased with the slope of the genotype-phenotype map near the derived parental expression level. Under stabilizing selection, hybrid incompatibility arose from compensatory mutations and was greater when the bioenergetic properties of the interaction caused the space of nearly neutral genotypes around the stable expression level to be wide. F2s showed higher hybrid incompatibility than F1s to the extent that the bioenergetic properties favored dominant regulatory interactions. The present model is a mechanistically explicit case of the Bateson-Dobzhansky-Muller model, connecting environmental selective pressure to hybrid incompatibility through the molecular mechanism of regulatory divergence. The bioenergetic parameters that determine expression represent measurable properties of transcriptional regulation, providing a predictive framework for empirical studies of how phenotypic evolution results in epistatic incompatibility at the molecular level in hybrids.
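To make the sigmoid genotype-phenotype map mentioned in this abstract concrete, the sketch below computes fractional occupancy from TF availability and binding free energy. It uses a generic two-state thermodynamic form, which is an assumption and not necessarily the expression used in the cited paper; the function name, the default parameter values, and the rule that each mismatch adds a fixed energy penalty are all hypothetical.

```python
import math


def fractional_occupancy(
    n_mismatches: int,
    n_tf: float = 100.0,
    energy_per_mismatch: float = 2.0,
) -> float:
    """Generic two-state occupancy of a binding site by a TF.

    Assumptions (illustrative only): energies are in units of kT and are
    measured relative to the best-binding sequence; each mismatch between
    TF and site adds the same penalty; n_tf is the number of free TF
    molecules competing for the site.
    """
    binding_energy = energy_per_mismatch * n_mismatches
    return n_tf / (n_tf + math.exp(binding_energy))


if __name__ == "__main__":
    # Occupancy falls off sigmoidally as mismatches accumulate.
    for m in range(7):
        print(m, round(fractional_occupancy(m), 3))
```

Under these assumptions, raising `energy_per_mismatch` steepens the map and raising `n_tf` shifts it toward tolerating more mismatches, which is the kind of shape dependence the abstract refers to.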
|
44
|
Haldane A, Manhart M, Morozov AV. Biophysical fitness landscapes for transcription factor binding sites. PLoS Comput Biol 2014; 10:e1003683. [PMID: 25010228 PMCID: PMC4091707 DOI: 10.1371/journal.pcbi.1003683] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 12/03/2013] [Accepted: 05/11/2014] [Indexed: 11/18/2022] Open
Abstract
Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.
Specialized proteins called transcription factors turn genes on and off by binding to short stretches of DNA in their regulatory regions. Precise gene regulation is essential for cellular survival and proliferation, and its evolution and maintenance under mutational pressure are central issues in biology. Here we discuss how evolution of gene regulation is shaped by the need to maintain favorable binding energies between transcription factors and their genomic binding sites. We show that, surprisingly, transcription factor binding is not affected by many biological properties, such as the essentiality of the gene it regulates. Rather, all sites for a given factor appear to evolve under a universal set of constraints, which can be rationalized in terms of a simple model inspired by transcription factor-DNA binding thermodynamics.
Affiliation(s)
- Allan Haldane
  - Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Michael Manhart
  - Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Alexandre V. Morozov
  - Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
  - BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey, United States of America
|
45
|
Ivankov DN, Finkelstein AV, Kondrashov FA. A structural perspective of compensatory evolution. Curr Opin Struct Biol 2014; 26:104-12. [PMID: 24981969 PMCID: PMC4141909 DOI: 10.1016/j.sbi.2014.05.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 02/03/2014] [Revised: 04/11/2014] [Accepted: 05/16/2014] [Indexed: 11/25/2022]
Abstract
The study of molecular evolution is important because it reveals how protein functions emerge and evolve. Recently, several types of studies indicated that substitutions in molecular evolution occur in a compensatory manner, whereby the occurrence of a substitution depends on the amino acid residues at other sites. However, a molecular or structural basis behind the compensation often remains obscure. Here, we review studies on the interface of structural biology and molecular evolution that revealed novel aspects of compensatory evolution. In many cases structural studies benefit from evolutionary data while structural data often add a functional dimension to the study of molecular evolution.
Affiliation(s)
- Dmitry N Ivankov
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain; Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
- Alexei V Finkelstein
  - Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
- Fyodor A Kondrashov
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Pg. Lluís Companys, 08010 Barcelona, Spain
|
46
|
Abstract
The efficient recognition of pathogens by the adaptive immune system relies on the diversity of receptors displayed at the surface of immune cells. T-cell receptor diversity results from an initial random DNA editing process, called VDJ recombination, followed by functional selection of cells according to the interaction of their surface receptors with self and foreign antigenic peptides. Using high-throughput sequence data from the β-chain of human T-cell receptors, we infer factors that quantify the overall effect of selection on the elements of receptor sequence composition: the V and J gene choice and the length and amino acid composition of the variable region. We find a significant correlation between biases induced by VDJ recombination and our inferred selection factors together with a reduction of diversity during selection. Both effects suggest that natural selection acting on the recombination process has anticipated the selection pressures experienced during somatic evolution. The inferred selection factors differ little between donors or between naive and memory repertoires. The number of sequences shared between donors is well-predicted by our model, indicating a stochastic origin of such public sequences. Our approach is based on a probabilistic maximum likelihood method, which is necessary to disentangle the effects of selection from biases inherent in the recombination process.
|
47
|
de Visser JAGM, Krug J. Empirical fitness landscapes and the predictability of evolution. Nat Rev Genet 2014; 15:480-90. [PMID: 24913663 DOI: 10.1038/nrg3744] [Citation(s) in RCA: 438] [Impact Index Per Article: 39.8] [Indexed: 01/01/2023]
Abstract
The genotype-fitness map (that is, the fitness landscape) is a key determinant of evolution, yet it has mostly been used as a superficial metaphor because we know little about its structure. This is now changing, as real fitness landscapes are being analysed by constructing genotypes with all possible combinations of small sets of mutations observed in phylogenies or in evolution experiments. In turn, these first glimpses of empirical fitness landscapes inspire theoretical analyses of the predictability of evolution. Here, we review these recent empirical and theoretical developments, identify methodological issues and organizing principles, and discuss possibilities to develop more realistic fitness landscape models.
Affiliation(s)
- J Arjan G M de Visser
  - Laboratory of Genetics, Wageningen University, Droevendaalsesteeg 1, 6708PB Wageningen, The Netherlands
- Joachim Krug
  - Institute for Theoretical Physics, University of Cologne, Zülpicher Str. 77, 50937 Köln, Germany
|
48
|
Payne JL, Wagner A. Latent phenotypes pervade gene regulatory circuits. BMC Systems Biology 2014; 8:64. [PMID: 24884746 PMCID: PMC4061115 DOI: 10.1186/1752-0509-8-64] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 02/17/2014] [Accepted: 05/12/2014] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Latent phenotypes are non-adaptive byproducts of adaptive phenotypes. They exist in biological systems as different as promiscuous enzymes and genome-scale metabolic reaction networks, and can give rise to evolutionary adaptations and innovations. We know little about their prevalence in the gene expression phenotypes of regulatory circuits, important sources of evolutionary innovations.
RESULTS: Here, we study a space of more than sixteen million three-gene model regulatory circuits, where each circuit is represented by a genotype, and has one or more functions embodied in one or more gene expression phenotypes. We find that the majority of circuits with single functions have latent expression phenotypes. Moreover, the set of circuits with a given spectrum of functions has a repertoire of latent phenotypes that is much larger than that of any one circuit. Most of this latent repertoire can be easily accessed through a series of small genetic changes that preserve a circuit's main functions. Both circuits and gene expression phenotypes that are robust to genetic change are associated with a greater number of latent phenotypes.
CONCLUSIONS: Our observations suggest that latent phenotypes are pervasive in regulatory circuits, and may thus be an important source of evolutionary adaptations and innovations involving gene regulation.
|
49
|
Nourmohammad A, Held T, Lässig M. Universality and predictability in molecular quantitative genetics. Curr Opin Genet Dev 2013; 23:684-93. [PMID: 24291213 DOI: 10.1016/j.gde.2013.11.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 09/12/2013] [Revised: 10/14/2013] [Accepted: 11/01/2013] [Indexed: 12/15/2022]
Abstract
Molecular traits, such as gene expression levels or protein binding affinities, are increasingly accessible to quantitative measurement by modern high-throughput techniques. Such traits measure molecular functions and, from an evolutionary point of view, are important as targets of natural selection. We review recent developments in evolutionary theory and experiments that are expected to become building blocks of a quantitative genetics of molecular traits. We focus on universal evolutionary characteristics: these are largely independent of a trait's genetic basis, which is often at least partially unknown. We show that universal measurements can be used to infer selection on a quantitative trait, which determines its evolutionary mode of conservation or adaptation. Furthermore, universality is closely linked to predictability of trait evolution across lineages. We argue that universal trait statistics extends over a range of cellular scales and opens new avenues of quantitative evolutionary systems biology.
Affiliation(s)
- Armita Nourmohammad
  - Joseph Henry Laboratories of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
|
50
|
Serohijos AWR, Shakhnovich EI. Contribution of selection for protein folding stability in shaping the patterns of polymorphisms in coding regions. Mol Biol Evol 2013; 31:165-76. [PMID: 24124208 PMCID: PMC3879451 DOI: 10.1093/molbev/mst189] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Indexed: 12/22/2022] Open
Abstract
The patterns of polymorphisms in genomes are imprints of the evolutionary forces at play in nature. In particular, polymorphisms have been extensively used to infer the fitness effects of mutations and their dynamics of fixation. However, the role and contribution of molecular biophysics to these observations remain unclear. Here, we couple robust findings from protein biophysics, enzymatic flux theory, the selection against the cytotoxic effects of protein misfolding, and explicit population dynamics simulations in the polyclonal regime. First, we recapitulate results on the dynamics of clonal interference and on the shape of the DFE, thus providing them with a molecular and mechanistic foundation. Second, we predict that if evolution is indeed under the dynamic equilibrium of mutation-selection balance, the fraction of stabilizing and destabilizing mutations is almost equal among single-nucleotide polymorphisms segregating at high allele frequencies. This prediction is proven true for polymorphisms in the human coding region. Overall, our results show how selection for protein folding stability predominantly shapes the patterns of polymorphisms in coding regions.
|