1
Martin NS, Schaper S, Camargo CQ, Louis AA. Non-Poissonian Bursts in the Arrival of Phenotypic Variation Can Strongly Affect the Dynamics of Adaptation. Mol Biol Evol 2024; 41:msae085. [PMID: 38693911 PMCID: PMC11156200 DOI: 10.1093/molbev/msae085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 03/01/2024] [Accepted: 04/17/2024] [Indexed: 05/03/2024] Open
Abstract
Modeling the rate at which adaptive phenotypes appear in a population is a key to predicting evolutionary processes. Given random mutations, should this rate be modeled by a simple Poisson process, or is a more complex dynamics needed? Here we use analytic calculations and simulations of evolving populations on explicit genotype-phenotype maps to show that the introduction of novel phenotypes can be "bursty" or overdispersed. In other words, a novel phenotype either appears multiple times in quick succession or not at all for many generations. These bursts are fundamentally caused by statistical fluctuations and other structure in the map from genotypes to phenotypes. Their strength depends on population parameters, being highest for "monomorphic" populations with low mutation rates. They can also be enhanced by additional inhomogeneities in the mapping from genotypes to phenotypes. We mainly investigate the effect of bursts using the well-studied genotype-phenotype map for RNA secondary structure, but find similar behavior in a lattice protein model and in Richard Dawkins's biomorphs model of morphological development. Bursts can profoundly affect adaptive dynamics. Most notably, they imply that fitness differences play a smaller role in determining which phenotype fixes than would be the case for a Poisson process without bursts.
Affiliation(s)
- Nora S Martin
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
- Steffen Schaper
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
- Chico Q Camargo
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
  - Faculty of Environment, Science and Economy, University of Exeter, Exeter EX4 4QF, UK
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
2
Martin NS, Ahnert SE. The Boltzmann distributions of molecular structures predict likely changes through random mutations. Biophys J 2023; 122:4467-4475. [PMID: 37897043 PMCID: PMC10698324 DOI: 10.1016/j.bpj.2023.10.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 08/19/2023] [Accepted: 10/20/2023] [Indexed: 10/29/2023] Open
Abstract
New folded molecular structures can only evolve after arising through mutations. This aspect is modeled using genotype-phenotype maps, which connect sequence changes through mutations to changes in molecular structures. Previous work has shown that the likelihood of appearing through mutations can differ by orders of magnitude from structure to structure and that this can affect the outcomes of evolutionary processes. Thus, we focus on the phenotypic mutation probabilities φqp, i.e., the likelihood that a random mutation changes structure p into structure q. For both RNA secondary structures and the HP protein model, we show that a simple biophysical principle can explain and predict how this likelihood depends on the new structure q: φqp is high if sequences that fold into p as the minimum-free-energy structure are likely to have q as an alternative structure with high Boltzmann frequency. This generalizes the existing concept of plastogenetic congruence from individual sequences to the entire neutral spaces of structures. Our result helps us understand why some structural changes are more likely than others, may be useful for estimating these likelihoods via sampling and makes a connection to alternative structures with high Boltzmann frequency, which could be relevant in evolutionary processes.
Affiliation(s)
- Nora S Martin
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, United Kingdom; Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, United Kingdom; Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Sebastian E Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, United Kingdom; The Alan Turing Institute, London, United Kingdom
3
Manrubia S, Cuesta JA. Physics of diffusion in viral genome evolution. Proc Natl Acad Sci U S A 2023; 120:e2310999120. [PMID: 37556488 PMCID: PMC10450443 DOI: 10.1073/pnas.2310999120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/11/2023] Open
Affiliation(s)
- Susanna Manrubia
  - Departamento de Biología de Sistemas, Centro Nacional de Biotecnología (CSIC), 28049 Madrid, Spain
  - Grupo Interdisciplinar de Sistemas Complejos, 28911 Madrid, Spain
- José A. Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos, 28911 Madrid, Spain
  - Instituto de Biocomputación y Física de Sistemas Complejos, Campus Río Ebro, Universidad de Zaragoza, 50018 Zaragoza, Spain
  - Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911 Leganés, Spain
4
Martin NS, Ahnert SE. Insertions and deletions in the RNA sequence-structure map. J R Soc Interface 2021; 18:20210380. [PMID: 34610259 PMCID: PMC8492174 DOI: 10.1098/rsif.2021.0380] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/08/2021] [Accepted: 09/13/2021] [Indexed: 12/21/2022] Open
Abstract
Genotype-phenotype maps link genetic changes to their fitness effect and are thus an essential component of evolutionary models. The map between RNA sequences and their secondary structures is a key example and has applications in functional RNA evolution. For this map, the structural effect of substitutions is well understood, but models usually assume a constant sequence length and do not consider insertions or deletions. Here, we expand the sequence-structure map to include single nucleotide insertions and deletions by using the RNAshapes concept. To quantify the structural effect of insertions and deletions, we generalize existing definitions for robustness and non-neutral mutation probabilities. We find striking similarities between substitutions, deletions and insertions: robustness to substitutions is correlated with robustness to insertions and, for most structures, to deletions. In addition, frequent structural changes after substitutions also tend to be common for insertions and deletions. This is consistent with the connection between energetically suboptimal folds and possible structural transitions. The similarities observed hold both for genotypic and phenotypic robustness and mutation probabilities, i.e. for individual sequences and for averages over sequences with the same structure. Our results could have implications for the rate of neutral and non-neutral evolution.
Affiliation(s)
- Nora S. Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
  - Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
- Sebastian E. Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
5
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models or synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
  - Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
- Jacobo Aguirre
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Sebastian E Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
- Alejandro V Cano
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pablo Catalán
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
- Santiago F Elena
  - Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
- Paulien Hogeweg
  - Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
- Bhavin S Khatri
  - The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
- Joachim Krug
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
- Nora S Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
- Joshua L Payne
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marcel Weiß
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
6
Alexander B, Pushkar A, Girvan M. Phase transitions and assortativity in models of gene regulatory networks evolved under different selection processes. J R Soc Interface 2021; 18:20200790. [PMID: 33849335 DOI: 10.1098/rsif.2020.0790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
We study a simplified model of gene regulatory network evolution in which links (regulatory interactions) are added via various selection rules that are based on the structural and dynamical features of the network nodes (genes). Similar to well-studied models of 'explosive' percolation, in our approach, links are selectively added so as to delay the transition to large-scale damage propagation, i.e. to make the network robust to small perturbations of gene states. We find that when selection depends only on structure, evolved networks are resistant to widespread damage propagation, even without knowledge of individual gene propensities for becoming 'damaged'. We also observe that networks evolved to avoid damage propagation tend towards disassortativity (i.e. directed links preferentially connect high degree 'source' genes to low degree 'target' genes and vice versa). We compare our simulations to reconstructed gene regulatory networks for several different species, with genes and links added over evolutionary time, and we find a similar bias towards disassortativity in the reconstructed networks.
Affiliation(s)
- Brandon Alexander
  - Department of Mathematics, University of Maryland, College Park, MD 20740, USA; Institute for Physical Science and Technology, University of Maryland, College Park, MD 20740, USA; Program in Applied Mathematics & Statistics and Scientific Computation, University of Maryland, College Park, MD 20740, USA
- Alexandra Pushkar
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
- Michelle Girvan
  - Institute for Physical Science and Technology, University of Maryland, College Park, MD 20740, USA; Program in Applied Mathematics & Statistics and Scientific Computation, University of Maryland, College Park, MD 20740, USA; Department of Physics, University of Maryland, College Park, MD 20740, USA; Santa Fe Institute, Santa Fe, NM 87501, USA
7
Weiß M, Ahnert SE. Neutral components show a hierarchical community structure in the genotype-phenotype map of RNA secondary structure. J R Soc Interface 2020; 17:20200608. [PMID: 33081646 DOI: 10.1098/rsif.2020.0608] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Open
Abstract
Genotype-phenotype (GP) maps describe the relationship between biological sequences and structural or functional outcomes. They can be represented as networks in which genotypes are the nodes, and one-point mutations between them are the edges. The genotypes that map to the same phenotype form subnetworks consisting of one or multiple disjoint connected components, so-called neutral components (NCs). For the GP map of RNA secondary structure, the NCs have been found to exhibit distinctive network features that can affect the dynamical processes taking place on them. Here, we focus on the community structure of RNA secondary structure NCs. Building on previous findings, we introduce a method to reveal the hierarchical community structure solely from the sequence constraints and composition of the genotypes that form a given NC. Thereby, we obtain modularity values similar to common community detection algorithms, which are much more complex. From this knowledge, we endorse a sampling method that allows a fast exploration of the different communities of a given NC. Furthermore, we introduce a way to estimate the community structure from genotype samples, which is useful when an exhaustive analysis of the NC is not feasible, as is the case for longer sequence lengths.
Affiliation(s)
- Marcel Weiß
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK; Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
- Sebastian E Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
8
Weiß M, Ahnert SE. Using small samples to estimate neutral component size and robustness in the genotype-phenotype map of RNA secondary structure. J R Soc Interface 2020; 17:20190784. [PMID: 32429824 DOI: 10.1098/rsif.2019.0784] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/03/2023] Open
Abstract
In genotype-phenotype (GP) maps, the genotypes that map to the same phenotype are usually not randomly distributed across the space of genotypes, but instead are predominantly connected through one-point mutations, forming network components that are commonly referred to as neutral components (NCs). Because of their impact on evolutionary processes, the characteristics of these NCs, like their size or robustness, have been studied extensively. Here, we introduce a framework that allows the estimation of NC size and robustness in the GP map of RNA secondary structure. The advantage of this framework is that it only requires small samples of genotypes and their local environment, which also allows experimental realizations. We verify our framework by applying it to the exhaustively analysable GP map of RNA sequence length L = 15, and benchmark it against an existing method by applying it to longer, naturally occurring functional non-coding RNA sequences. Although it is specific to the RNA secondary structure GP map in the first place, our framework can probably be transferred and adapted to other sequence-to-structure GP maps.
Affiliation(s)
- Marcel Weiß
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK; Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
- Sebastian E Ahnert
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK; Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
9
Khatri BS, Goldstein RA. Biophysics and population size constrains speciation in an evolutionary model of developmental system drift. PLoS Comput Biol 2019; 15:e1007177. [PMID: 31335870 PMCID: PMC6677325 DOI: 10.1371/journal.pcbi.1007177] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/30/2018] [Revised: 08/02/2019] [Accepted: 06/13/2019] [Indexed: 02/06/2023] Open
Abstract
Developmental system drift is a likely mechanism for the origin of hybrid incompatibilities between closely related species. We examine here the detailed mechanistic basis of hybrid incompatibilities between two allopatric lineages, for a genotype-phenotype map of developmental system drift under stabilising selection, where an organismal phenotype is conserved, but the underlying molecular phenotypes and genotype can drift. This leads to a number of emergent phenomena not obtainable by modelling genotype or phenotype alone. Our results show that: 1) speciation is more rapid at smaller population sizes, with a characteristic, Orr-like, power law, but slow at large population sizes, characterised by a sub-diffusive growth law; 2) the molecular phenotypes under weakest selection contribute to the earliest incompatibilities; and 3) pair-wise incompatibilities dominate over higher order ones, contrary to previous predictions that the latter should dominate. The population size effect we find is consistent with previous results on allopatric divergence of transcription factor-DNA binding, where smaller populations have common ancestors with a larger drift load because genetic drift favours phenotypes which have a larger number of genotypes (higher sequence entropy) over more fit phenotypes which have far fewer genotypes; this means fewer substitutions are required in either lineage before incompatibilities arise. Overall, our results indicate that biophysics and population size provide a much stronger constraint to speciation than suggested by previous models, and point to a general mechanistic principle of how incompatibilities arise under stabilising selection for an organismal phenotype.
The process of speciation is of fundamental importance to the field of evolution as it is intimately connected to understanding the immense biodiversity of life. There is still relatively little understanding of the underlying genetic mechanisms that give rise to hybrid incompatibilities, with results suggesting that divergence in transcription factor DNA binding and gene expression play an important role. A key finding from the field of evo-devo is that organismal phenotypes show developmental system drift, where species maintain the same phenotype, but diverge in developmental pathways; this is an important potential source of hybrid incompatibilities. Here, we explore a theoretical framework to understand how incompatibilities arise due to developmental system drift, using a tractable biophysically inspired genotype-phenotype map for spatial gene expression. Modelling the evolution of phenotypes in this way has the key advantage that it mirrors how selection works in nature, i.e. that selection acts on phenotypes, but variation (mutation) arises at the level of genotypes. This results, as we demonstrate, in a number of non-trivial and testable predictions concerning speciation due to developmental system drift, which would not be obtainable by modelling evolution of genotypes or phenotypes alone.
Affiliation(s)
- Richard A. Goldstein
  - Division of Infection & Immunity, University College London, London, United Kingdom
10
Modelling and simulating Lenski’s long-term evolution experiment. Theor Popul Biol 2019; 127:58-74. [DOI: 10.1016/j.tpb.2019.03.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/30/2018] [Revised: 03/28/2019] [Accepted: 03/29/2019] [Indexed: 01/15/2023]
11
Ali A, Melcher U. Modeling of Mutational Events in the Evolution of Viruses. Viruses 2019; 11:v11050418. [PMID: 31060293 PMCID: PMC6563203 DOI: 10.3390/v11050418] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/04/2019] [Revised: 04/27/2019] [Accepted: 05/02/2019] [Indexed: 11/24/2022] Open
Abstract
Diverse studies of viral evolution have led to the recognition that the evolutionary rates of viral taxa observed are dependent on the time scale being investigated, with short-term studies giving fast substitution rates, and orders of magnitude lower rates for deep calibrations. Although each of these factors may contribute to this time dependent rate phenomenon, a more fundamental cause should be considered. We sought to test computationally whether the basic phenomena of virus evolution (mutation, replication, and selection) can explain the relationships between the evolutionary and phylogenetic distances. We tested, by computational inference, the hypothesis that the phylogenetic distances between the pairs of sequences are functions of the evolutionary path lengths between them. A basic simulation revealed that the relationship between simulated genetic and mutational distances is non-linear, and can be consistent with different rates of nucleotide substitution at different depths of branches in phylogenetic trees.
Affiliation(s)
- Akhtar Ali
  - Department of Biological Sciences, University of Tulsa, Tulsa, OK 74104, USA
- Ulrich Melcher
  - Department of Biochemistry & Molecular Biology, Oklahoma State University, Stillwater, OK 74078-3035, USA
12
Aguirre J, Catalán P, Cuesta JA, Manrubia S. On the networked architecture of genotype spaces and its critical effects on molecular evolution. Open Biol 2018; 8:180069. [PMID: 29973397 PMCID: PMC6070719 DOI: 10.1098/rsob.180069] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/18/2018] [Accepted: 06/12/2018] [Indexed: 12/26/2022] Open
Abstract
Evolutionary dynamics is often viewed as a subtle process of change accumulation that causes a divergence among organisms and their genomes. However, this interpretation is an inheritance of a gradualistic view that has been challenged at the macroevolutionary, ecological and molecular level. Actually, when the complex architecture of genotype spaces is taken into account, the evolutionary dynamics of molecular populations becomes intrinsically non-uniform, sharing deep qualitative and quantitative similarities with slowly driven physical systems: nonlinear responses analogous to critical transitions, sudden state changes or hysteresis, among others. Furthermore, the phenotypic plasticity inherent to genotypes transforms classical fitness landscapes into multiscapes where adaptation in response to an environmental change may be very fast. The quantitative nature of adaptive molecular processes is deeply dependent on a network-of-networks multilayered structure of the map from genotype to function that we begin to unveil.
Affiliation(s)
- Jacobo Aguirre
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Programa de Biología de Sistemas, Centro Nacional de Biotecnología (CSIC), Madrid, Spain
- Pablo Catalán
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
  - Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, Zaragoza, Spain
  - UC3M-BS Institute of Financial Big Data (IFiBiD), Universidad Carlos III de Madrid, Getafe, Madrid, Spain
- Susanna Manrubia
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Programa de Biología de Sistemas, Centro Nacional de Biotecnología (CSIC), Madrid, Spain
13
Aguilar-Rodríguez J, Peel L, Stella M, Wagner A, Payne JL. The architecture of an empirical genotype-phenotype map. Evolution 2018; 72:1242-1260. [PMID: 29676774 PMCID: PMC6055911 DOI: 10.1111/evo.13487] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/16/2017] [Accepted: 04/03/2018] [Indexed: 12/15/2022]
Abstract
Recent advances in high-throughput technologies are bringing the study of empirical genotype-phenotype (GP) maps to the fore. Here, we use data from protein-binding microarrays to study an empirical GP map of transcription factor (TF) -binding preferences. In this map, each genotype is a DNA sequence. The phenotype of this DNA sequence is its ability to bind one or more TFs. We study this GP map using genotype networks, in which nodes represent genotypes with the same phenotype, and edges connect nodes if their genotypes differ by a single small mutation. We describe the structure and arrangement of genotype networks within the space of all possible binding sites for 525 TFs from three eukaryotic species encompassing three kingdoms of life (animal, plant, and fungi). We thus provide a high-resolution depiction of the architecture of an empirical GP map. Among a number of findings, we show that these genotype networks are "small-world" and assortative, and that they ubiquitously overlap and interface with one another. We also use polymorphism data from Arabidopsis thaliana to show how genotype network structure influences the evolution of TF-binding sites in vivo. We discuss our findings in the context of regulatory evolution.
Affiliation(s)
- José Aguilar-Rodríguez
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Current Address: Department of Biology, Stanford University, Stanford, CA, USA; Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
- Leto Peel
  - Institute of Information and Communication Technologies, Electronics and Applied Mathematics, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
  - Namur Center for Complex Systems, University of Namur, Namur, Belgium
- Massimo Stella
  - Institute for Complex Systems Simulation, Department of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - The Santa Fe Institute, Santa Fe, New Mexico, USA
- Joshua L. Payne
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Institute for Integrative Biology, ETH, Zurich, Switzerland
14
Wagner A. Information theory, evolutionary innovations and evolvability. Philos Trans R Soc Lond B Biol Sci 2018; 372:rstb.2016.0416. [PMID: 29061889 DOI: 10.1098/rstb.2016.0416] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Accepted: 05/08/2017] [Indexed: 11/12/2022] Open
Abstract
How difficult is it to 'discover' an evolutionary adaptation or innovation? I here suggest that information theory, in combination with high-throughput DNA sequencing, can help answer this question by quantifying a new phenotype's information content. I apply this framework to compute the phenotypic information associated with novel gene regulation and with the ability to use novel carbon sources. The framework can also help quantify how DNA duplications affect evolvability, estimate the complexity of phenotypes and clarify the meaning of 'progress' in Darwinian evolution. This article is part of the themed issue 'Process and pattern in innovations from cells to societies'.
Collapse
Affiliation(s)
- Andreas Wagner
- Institute of Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland .,Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.,Santa Fe Institute, Santa Fe, NM 87501, USA
| |
|
15
|
Buchholz PCF, Fademrecht S, Pleiss J. Percolation in protein sequence space. PLoS One 2017; 12:e0189646. [PMID: 29261740 PMCID: PMC5738032 DOI: 10.1371/journal.pone.0189646] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/22/2017] [Accepted: 11/28/2017] [Indexed: 01/08/2023]
Abstract
The currently known protein sequences are not distributed equally in sequence space, but cluster into families. Analyzing the cluster size distribution gives a glimpse of the large and unknown extant protein sequence space, which has been explored during evolution. For six protein superfamilies with different fold and function, the cluster size distributions followed a power law with slopes between 2.4 and 3.3, which represent upper limits to the cluster distribution of extant sequences. The power law distribution of cluster sizes is in accordance with percolation theory and strongly supports connectedness of extant sequence space. Percolation of extant sequence space has three major consequences: (1) It transforms our view of sequence space as a highly connected network where each sequence has multiple neighbors, and each pair of sequences is connected by many different paths. A high degree of connectedness is a necessary condition of efficient evolution, because it overcomes the possible blockage by sign epistasis and reciprocal sign epistasis. (2) The Fisher exponent is an indicator of connectedness and saturation of sequence space of each protein superfamily. (3) All clusters are expected to be connected by extant sequences that become apparent as a higher portion of extant sequence space becomes known. Being linked to biochemically distinct homologous families, bridging sequences are promising enzyme candidates for applications in biotechnology because they are expected to have substrate ambiguity or catalytic promiscuity.
Affiliation(s)
- Patrick C. F. Buchholz
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany
| | - Silvia Fademrecht
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany
| | - Jürgen Pleiss
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany
| |
|
16
|
Yubero P, Manrubia S, Aguirre J. The space of genotypes is a network of networks: implications for evolutionary and extinction dynamics. Sci Rep 2017; 7:13813. [PMID: 29062002 PMCID: PMC5653773 DOI: 10.1038/s41598-017-14048-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 07/31/2017] [Accepted: 10/04/2017] [Indexed: 11/09/2022]
Abstract
The forcing that environmental variation exerts on populations causes continuous changes with only two possible evolutionary outcomes: adaptation or extinction. Here we address this topic by studying the transient dynamics of populations on complex fitness landscapes. There are three important features of realistic landscapes of relevance in the evolutionary process: fitness landscapes are rough but correlated, their fitness values depend on the current environment, and many (often most) genotypes do not yield viable phenotypes. We capture these properties by defining time-varying, holey, NK fitness landscapes. We show that the structure of the space of genotypes so generated is that of a network of networks: in a sufficiently holey landscape, populations are temporarily stuck in local networks of genotypes. Sudden jumps to neighbouring networks through narrow adaptive pathways (connector links) are possible, though strong enough local trapping may also cause decays in population growth and eventual extinction. A combination of analytical and numerical techniques to characterize complex networks and population dynamics on such networks makes it possible to derive several quantitative relationships between the topology of the space of genotypes and the fate of evolving populations.
Affiliation(s)
- Pablo Yubero
- Centro Nacional de Biotecnología, CSIC, c/Darwin 3, 28049, Madrid, Spain
| | - Susanna Manrubia
- Centro Nacional de Biotecnología, CSIC, c/Darwin 3, 28049, Madrid, Spain
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
| | - Jacobo Aguirre
- Centro Nacional de Biotecnología, CSIC, c/Darwin 3, 28049, Madrid, Spain
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
| |
|
17
|
Internal Disequilibria and Phenotypic Diversification during Replication of Hepatitis C Virus in a Noncoevolving Cellular Environment. J Virol 2017; 91:JVI.02505-16. [PMID: 28275194 DOI: 10.1128/jvi.02505-16] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 12/30/2016] [Accepted: 02/28/2017] [Indexed: 12/14/2022]
Abstract
Viral quasispecies evolution upon long-term virus replication in a noncoevolving cellular environment raises relevant general issues, such as the attainment of population equilibrium, compliance with the molecular-clock hypothesis, or stability of the phenotypic profile. Here, we evaluate the adaptation, mutant spectrum dynamics, and phenotypic diversification of hepatitis C virus (HCV) in the course of 200 passages in human hepatoma cells in an experimental design that precluded coevolution of the cells with the virus. Adaptation to the cells was evidenced by increase in progeny production. The rate of accumulation of mutations in the genomic consensus sequence deviated slightly from linearity, and mutant spectrum analyses revealed a complex dynamic of mutational waves, which was sustained beyond passage 100. The virus underwent several phenotypic changes, some of which impacted the virus-host relationship, such as enhanced cell killing, a shift toward higher virion density, and increased shutoff of host cell protein synthesis. Fluctuations in progeny production and failure to reach population equilibrium at the genomic level suggest internal instabilities that anticipate an unpredictable HCV evolution in the complex liver environment. IMPORTANCE: Long-term virus evolution in an unperturbed cellular environment can reveal features of virus evolution that cannot be explained by comparing natural viral isolates. In the present study, we investigate genetic and phenotypic changes that occur upon prolonged passage of hepatitis C virus (HCV) in human hepatoma cells in an experimental design in which host cell evolutionary change is prevented. Despite replication in a noncoevolving cellular environment, the virus exhibited internal population disequilibria that did not decline with increased adaptation to the host cells. The diversification of phenotypic traits suggests that disequilibria inherent to viral populations may provide a selective advantage to viruses that can be fully exploited in changing environments.
|
18
|
Catalán P, Arias CF, Cuesta JA, Manrubia S. Adaptive multiscapes: an up-to-date metaphor to visualize molecular adaptation. Biol Direct 2017; 12:7. [PMID: 28245845 PMCID: PMC5331743 DOI: 10.1186/s13062-017-0178-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/08/2016] [Accepted: 02/11/2017] [Indexed: 01/08/2023]
Abstract
Background: Wright’s metaphor of the fitness landscape has shaped and conditioned our view of the adaptation of populations for almost a century. Since its inception, and including criticism raised by Wright himself, the concept has been surrounded by controversy. Among others, the debate stems from the intrinsic difficulty to capture important features of the space of genotypes, such as its high dimensionality or the existence of abundant ridges, in a visually appealing two-dimensional picture. Two additional currently widespread observations come to further constrain the applicability of the original metaphor: the very skewed distribution of phenotype sizes (which may actively prevent, due to entropic effects, the achievement of fitness maxima), and functional promiscuity (i.e. the existence of secondary functions which entail partial adaptation to environments never encountered before by the population). Results: Here we revise some of the shortcomings of the fitness landscape metaphor and propose a new “scape” formed by interconnected layers, each layer containing the phenotypes viable in a given environment. Different phenotypes within a layer are accessible through mutations with selective value, while neutral mutations cause displacements of populations within a phenotype. A different environment is represented as a separated layer, where phenotypes may have new fitness values, other phenotypes may be viable, and the same genotype may yield a different phenotype, representing genotypic promiscuity. This scenario explicitly includes the many-to-many structure of the genotype-to-phenotype map. A number of empirical observations regarding the adaptation of populations in the light of adaptive multiscapes are reviewed. Conclusions: Several shortcomings of Wright’s visualization of fitness landscapes can be overcome through adaptive multiscapes. Relevant aspects of population adaptation, such as neutral drift, functional promiscuity or environment-dependent fitness, as well as entropic trapping and the concomitant impossibility to reach fitness peaks are visualized at once. Adaptive multiscapes should aid in the qualitative understanding of the multiple pathways involved in evolutionary dynamics. Reviewers: This article was reviewed by Eugene Koonin and Ricard Solé.
Affiliation(s)
- Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Madrid, Spain
| | - Clemente F Arias
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
| | - Jose A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Madrid, Spain; Institute for Biocomputation and Physics of Complex Systems, Zaragoza, Spain; UC3M-BS Institute of Financial Big Data (IFiBiD), Madrid, Spain
| | - Susanna Manrubia
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; National Biotechnology Centre (CSIC), c/ Darwin 3, Madrid, 28049, Spain
| |
|
19
|
Keller-Schmidt S, Tuğrul M, Eguíluz VM, Hernández-García E, Klemm K. Anomalous scaling in an age-dependent branching model. Phys Rev E Stat Nonlin Soft Matter Phys 2015; 91:022803. [PMID: 25768548 DOI: 10.1103/physreve.91.022803] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/04/2013] [Indexed: 06/04/2023]
Abstract
We introduce a one-parametric family of tree growth models, in which branching probabilities decrease with branch age τ as τ^(-α). Depending on the exponent α, the scaling of tree depth with tree size n displays a transition between the logarithmic scaling of random trees and an algebraic growth. At the transition (α=1) tree depth grows as (log n)^2. This anomalous scaling is in good agreement with the trend observed in evolution of biological species, thus providing a theoretical support for age-dependent speciation and associating it to the occurrence of a critical point.
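The growth rule summarized in this abstract can be illustrated with a small simulation. The sketch below is only one possible reading and is not taken from the paper: it assumes that at each discrete time step exactly one leaf is chosen with probability proportional to τ^(-α), where τ is the leaf's age in steps, and that the chosen leaf splits into two new leaves. The function names `grow_tree` and `mean_depth`, the starting tree and the time discretization are all hypothetical choices.

```python
import random


def grow_tree(n_leaves, alpha, seed=0):
    """Grow a binary tree to n_leaves leaves and return the leaf depths.

    Assumed rule (an illustration, not the authors' exact definition):
    one leaf branches per step, chosen with weight age**(-alpha).
    """
    rng = random.Random(seed)
    # Each leaf is stored as (depth, birth_time); start from a root
    # that has already split once, giving two leaves at depth 1.
    leaves = [(1, 0), (1, 0)]
    t = 0
    while len(leaves) < n_leaves:
        t += 1
        # Age is at least 1 because every leaf was born before step t.
        weights = [(t - born) ** (-alpha) for _, born in leaves]
        i = rng.choices(range(len(leaves)), weights=weights)[0]
        depth, _ = leaves.pop(i)
        # The chosen leaf becomes an internal node with two children.
        leaves.append((depth + 1, t))
        leaves.append((depth + 1, t))
    return [d for d, _ in leaves]


def mean_depth(n_leaves, alpha, seed=0):
    """Average leaf depth of one simulated tree."""
    depths = grow_tree(n_leaves, alpha, seed)
    return sum(depths) / len(depths)


if __name__ == "__main__":
    for alpha in (0.0, 1.0, 2.0):
        print(alpha, mean_depth(200, alpha))
```

Under these assumptions, α = 0 gives every leaf the same chance of branching, while larger α favors recently created leaves. Comparing `mean_depth` across tree sizes for different α is one way to explore the depth scaling the abstract refers to; this loop is quadratic in the number of leaves, so it is only suited to small trees.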
Affiliation(s)
- Stephanie Keller-Schmidt
- Bioinformatics, Institute of Computer Science, University Leipzig, Härtelstr. 16-18, 04107 Leipzig, Germany
| | - Murat Tuğrul
- IST Austria, Am Campus 1, 3400 Klosterneuburg, Austria
| | - Víctor M Eguíluz
- IFISC (CSIC-UIB), Instituto de Física Interdisciplinar y Sistemas Complejos, E-07122 Palma de Mallorca, Spain
| | - Emilio Hernández-García
- IFISC (CSIC-UIB), Instituto de Física Interdisciplinar y Sistemas Complejos, E-07122 Palma de Mallorca, Spain
| | - Konstantin Klemm
- Bioinformatics, Institute of Computer Science, University Leipzig, Härtelstr. 16-18, 04107 Leipzig, Germany
- Bioinformatics and Computational Biology, University of Vienna, Währingerstraße 29, 1090 Vienna, Austria
- Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Vienna, Austria
- School of Science and Technology, Nazarbayev University, Kabanbay Batyr Ave. 53, 010000 Astana, Kazakhstan
| |
|