1
Martin NS, Schaper S, Camargo CQ, Louis AA. Non-Poissonian Bursts in the Arrival of Phenotypic Variation Can Strongly Affect the Dynamics of Adaptation. Mol Biol Evol 2024; 41:msae085. [PMID: 38693911 PMCID: PMC11156200 DOI: 10.1093/molbev/msae085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 03/01/2024] [Accepted: 04/17/2024] [Indexed: 05/03/2024] Open
Abstract
Modeling the rate at which adaptive phenotypes appear in a population is a key to predicting evolutionary processes. Given random mutations, should this rate be modeled by a simple Poisson process, or is a more complex dynamics needed? Here we use analytic calculations and simulations of evolving populations on explicit genotype-phenotype maps to show that the introduction of novel phenotypes can be "bursty" or overdispersed. In other words, a novel phenotype either appears multiple times in quick succession or not at all for many generations. These bursts are fundamentally caused by statistical fluctuations and other structure in the map from genotypes to phenotypes. Their strength depends on population parameters, being highest for "monomorphic" populations with low mutation rates. They can also be enhanced by additional inhomogeneities in the mapping from genotypes to phenotypes. We mainly investigate the effect of bursts using the well-studied genotype-phenotype map for RNA secondary structure, but find similar behavior in a lattice protein model and in Richard Dawkins's biomorphs model of morphological development. Bursts can profoundly affect adaptive dynamics. Most notably, they imply that fitness differences play a smaller role in determining which phenotype fixes than would be the case for a Poisson process without bursts.
Affiliation(s)
- Nora S Martin
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
- Steffen Schaper
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
- Chico Q Camargo
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
  - Faculty of Environment, Science and Economy, University of Exeter, Exeter EX4 4QF, UK
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3PU, UK
2
Martin NS, Camargo CQ, Louis AA. Bias in the arrival of variation can dominate over natural selection in Richard Dawkins's biomorphs. PLoS Comput Biol 2024; 20:e1011893. [PMID: 38536880 PMCID: PMC10971585 DOI: 10.1371/journal.pcbi.1011893] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/23/2023] [Accepted: 02/02/2024] [Indexed: 11/12/2024] Open
Abstract
Biomorphs, Richard Dawkins's iconic model of morphological evolution, are traditionally used to demonstrate the power of natural selection to generate biological order from random mutations. Here we show that biomorphs can also be used to illustrate how developmental bias shapes adaptive evolutionary outcomes. In particular, we find that biomorphs exhibit phenotype bias, a type of developmental bias where certain phenotypes can be many orders of magnitude more likely than others to appear through random mutations. Moreover, this bias exhibits a strong preference for simpler phenotypes with low descriptional complexity. Such bias towards simplicity is formalised by an information-theoretic principle that can be intuitively understood from a picture of evolution randomly searching in the space of algorithms. By using population genetics simulations, we demonstrate how moderately adaptive phenotypic variation that appears more frequently upon random mutations can fix at the expense of more highly adaptive biomorph phenotypes that are less frequent. This result, as well as many other patterns found in the structure of variation for the biomorphs, such as high mutational robustness and a positive correlation between phenotype evolvability and robustness, closely resembles findings in molecular genotype-phenotype maps. Many of these patterns can be explained with an analytic model based on constrained and unconstrained sections of the genome. We postulate that the phenotype bias towards simplicity and other patterns biomorphs share with molecular genotype-phenotype maps may hold more widely for developmental systems.
Affiliation(s)
- Nora S. Martin
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, United Kingdom
- Chico Q. Camargo
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, United Kingdom
  - College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, United Kingdom
- Ard A. Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, United Kingdom
3
García-Galindo P, Ahnert SE, Martin NS. The non-deterministic genotype-phenotype map of RNA secondary structure. J R Soc Interface 2023; 20:20230132. [PMID: 37608711 PMCID: PMC10445035 DOI: 10.1098/rsif.2023.0132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 08/01/2023] [Indexed: 08/24/2023] Open
Abstract
Selection and variation are both key aspects in the evolutionary process. Previous research on the mapping between molecular sequence (genotype) and molecular fold (phenotype) has shown the presence of several structural properties in different biological contexts, implying that these might be universal in evolutionary spaces. The deterministic genotype-phenotype (GP) map that links short RNA sequences to minimum free energy secondary structures has been studied extensively because of its computational tractability and biologically realistic nature. However, this mapping ignores the phenotypic plasticity of RNA. We define a GP map that incorporates non-deterministic (ND) phenotypes, and take RNA as a case study; we use the Boltzmann probability distribution of folded structures and examine the structural properties of ND GP maps for RNA sequences of length 12 and coarse-grained RNA structures of length 30 (RNAshapes30). A framework is presented to study robustness, evolvability and neutral spaces in the ND map. This framework is validated by demonstrating close correspondence between the ND quantities and sample averages of their deterministic counterparts. When using the ND framework we observe the same structural properties as in the deterministic GP map, such as bias, negative correlation between genotypic robustness and evolvability, and positive correlation between phenotypic robustness and evolvability.
Affiliation(s)
- Paula García-Galindo
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
- Sebastian E. Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - The Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK
- Nora S. Martin
  - Rudolf Peierls Centre for Theoretical Physics, Beecroft Building, Parks Road, Oxford OX1 3PU, UK
4
Mohanty V, Greenbury SF, Sarkany T, Narayanan S, Dingle K, Ahnert SE, Louis AA. Maximum mutational robustness in genotype-phenotype maps follows a self-similar blancmange-like curve. J R Soc Interface 2023; 20:20230169. [PMID: 37491910 PMCID: PMC10369032 DOI: 10.1098/rsif.2023.0169] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/22/2023] [Accepted: 06/27/2023] [Indexed: 07/27/2023] Open
Abstract
Phenotype robustness, defined as the average mutational robustness of all the genotypes that map to a given phenotype, plays a key role in facilitating neutral exploration of novel phenotypic variation by an evolving population. By applying results from coding theory, we prove that the maximum phenotype robustness occurs when genotypes are organized as bricklayer's graphs, so-called because they resemble the way in which a bricklayer would fill in a Hamming graph. The value of the maximal robustness is given by a fractal continuous everywhere but differentiable nowhere sums-of-digits function from number theory. Interestingly, genotype-phenotype maps for RNA secondary structure and the hydrophobic-polar (HP) model for protein folding can exhibit phenotype robustness that exactly attains this upper bound. By exploiting properties of the sums-of-digits function, we prove a lower bound on the deviation of the maximum robustness of phenotypes with multiple neutral components from the bricklayer's graph bound, and show that RNA secondary structure phenotypes obey this bound. Finally, we show how robustness changes when phenotypes are coarse-grained and derive a formula and associated bounds for the transition probabilities between such phenotypes.
Affiliation(s)
- Vaibhav Mohanty
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
  - Program in Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA, USA
  - MD-PhD Program, Harvard Medical School, Boston, MA, USA and Massachusetts Institute of Technology, Cambridge, MA, USA
- Sam F. Greenbury
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK
  - The Alan Turing Institute, British Library, London, UK
- Tasmin Sarkany
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Shyam Narayanan
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Kamaludin Dingle
  - Department of Mathematics and Natural Sciences, Centre for Applied Mathematics and Bioinformatics (CAMB), Gulf University for Science and Technology, Kuwait
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
- Sebastian E. Ahnert
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK
  - Department of Chemical Engineering and Biotechnology, Cavendish Laboratory, University of Cambridge, Cambridge, UK
  - The Alan Turing Institute, British Library, London, UK
- Ard A. Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
5
Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts. Life (Basel) 2023; 13:life13030708. [PMID: 36983865 PMCID: PMC10054693 DOI: 10.3390/life13030708] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/16/2022] [Revised: 02/15/2023] [Accepted: 02/27/2023] [Indexed: 03/08/2023] Open
Abstract
An important question in evolutionary biology is whether (and in what ways) genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection and biases (and other factors) in shaping phenotypes can be difficult. Because the RNA secondary structure (SS) can be analyzed in detail mathematically and computationally, is biologically relevant, and a wealth of bioinformatic data are available, it offers a good model system for studying the role of bias. For quite short RNA (length L≤126), it has recently been shown that natural and random RNA types are structurally very similar, suggesting that bias strongly constrains evolutionary dynamics. Here, we extend these results with emphasis on much larger RNA with lengths up to 3000 nucleotides. By examining both abstract shapes and structural motif frequencies (i.e., the number of helices, bonds, bulges, junctions, and loops), we find that large natural and random structures are also very similar, especially when contrasted to typical structures sampled from the spaces of all possible RNA structures. Our motif frequency study yields another result, where the frequencies of different motifs can be used in machine learning algorithms to classify random and natural RNA with high accuracy, especially for longer RNA (e.g., ROC AUC 0.86 for L = 1000). The most important motifs for classification are the number of bulges, loops, and bonds. This finding may be useful in using SS to detect candidates for functional RNA within ‘junk’ DNA regions.
6
Dingle K, Novev JK, Ahnert SE, Louis AA. Predicting phenotype transition probabilities via conditional algorithmic probability approximations. J R Soc Interface 2022; 19:20220694. [PMID: 36514888 PMCID: PMC9748496 DOI: 10.1098/rsif.2022.0694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 11/18/2022] [Indexed: 12/15/2022] Open
Abstract
Unravelling the structure of genotype-phenotype (GP) maps is an important problem in biology. Recently, arguments inspired by algorithmic information theory (AIT) and Kolmogorov complexity have been invoked to uncover simplicity bias in GP maps, an exponentially decaying upper bound on phenotype probability with increasing phenotype descriptional complexity. This means that phenotypes with many genotypes assigned via the GP map must be simple, while complex phenotypes must have few genotypes assigned. Here, we use similar arguments to bound the probability P(x → y) that phenotype x, upon random genetic mutation, transitions to phenotype y. The bound is [Formula: see text], where [Formula: see text] is the estimated conditional complexity of y given x, quantifying how much extra information is required to make y given access to x. This upper bound is related to the conditional form of algorithmic probability from AIT. We demonstrate the practical applicability of our derived bound by predicting phenotype transition probabilities (and other related quantities) in simulations of RNA and protein secondary structures. Our work contributes to a general mathematical understanding of GP maps and may facilitate the prediction of transition probabilities directly from examining the phenotypes themselves, without utilizing detailed knowledge of the GP map.
Affiliation(s)
- Kamaludin Dingle
  - Department of Chemical Engineering and Biotechnology, Cambridge University, Cambridge CB2 1TN, UK
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
  - Department of Mathematics and Natural Sciences, Centre for Applied Mathematics and Bioinformatics (CAMB), Gulf University for Science and Technology, 32093, Kuwait
- Javor K. Novev
  - Department of Chemical Engineering and Biotechnology, Cambridge University, Cambridge CB2 1TN, UK
- Sebastian E. Ahnert
  - Department of Chemical Engineering and Biotechnology, Cambridge University, Cambridge CB2 1TN, UK
- Ard A. Louis
  - Department of Physics, Rudolf Peierls Centre for Theoretical Physics, Oxford University, Oxford OX1 2JD, UK
7
Dingle K, Ghaddar F, Šulc P, Louis AA. Phenotype Bias Determines How Natural RNA Structures Occupy the Morphospace of All Possible Shapes. Mol Biol Evol 2022; 39:msab280. [PMID: 34542628 PMCID: PMC8763027 DOI: 10.1093/molbev/msab280] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/15/2022] Open
Abstract
Morphospaces, representations of phenotypic characteristics, are often populated unevenly, leaving large parts unoccupied. Such patterns are typically ascribed to contingency, or else to natural selection disfavoring certain parts of the morphospace. The extent to which developmental bias, the tendency of certain phenotypes to preferentially appear as potential variation, also explains these patterns is hotly debated. Here we demonstrate quantitatively that developmental bias is the primary explanation for the occupation of the morphospace of RNA secondary structure (SS) shapes. Upon random mutations, some RNA SS shapes (the frequent ones) are much more likely to appear than others. By using the RNAshapes method to define coarse-grained SS classes, we can directly compare the frequencies with which noncoding RNA SS shapes appear in the RNAcentral database to frequencies obtained upon a random sampling of sequences. We show that: 1) only the most frequent structures appear in nature; the vast majority of possible structures in the morphospace have not yet been explored; 2) remarkably small numbers of random sequences are needed to produce all the RNA SS shapes found in nature so far; and 3) perhaps most surprisingly, the natural frequencies are accurately predicted, over several orders of magnitude in variation, by the likelihood that structures appear upon a uniform random sampling of sequences. The ultimate cause of these patterns is not natural selection, but rather a strong phenotype bias in the RNA genotype-phenotype map, a type of developmental bias or "findability constraint," which limits evolutionary dynamics to a hugely reduced subset of structures that are easy to "find."
Affiliation(s)
- Kamaludin Dingle
  - Centre for Applied Mathematics and Bioinformatics, Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Hawally, Kuwait
- Fatme Ghaddar
  - Centre for Applied Mathematics and Bioinformatics, Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Hawally, Kuwait
- Petr Šulc
  - School of Molecular Sciences and Center for Molecular Design and Biomimetics at the Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, United Kingdom
8
Martin NS, Ahnert SE. Insertions and deletions in the RNA sequence-structure map. J R Soc Interface 2021; 18:20210380. [PMID: 34610259 PMCID: PMC8492174 DOI: 10.1098/rsif.2021.0380] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/08/2021] [Accepted: 09/13/2021] [Indexed: 12/21/2022] Open
Abstract
Genotype-phenotype maps link genetic changes to their fitness effect and are thus an essential component of evolutionary models. The map between RNA sequences and their secondary structures is a key example and has applications in functional RNA evolution. For this map, the structural effect of substitutions is well understood, but models usually assume a constant sequence length and do not consider insertions or deletions. Here, we expand the sequence-structure map to include single nucleotide insertions and deletions by using the RNAshapes concept. To quantify the structural effect of insertions and deletions, we generalize existing definitions for robustness and non-neutral mutation probabilities. We find striking similarities between substitutions, deletions and insertions: robustness to substitutions is correlated with robustness to insertions and, for most structures, to deletions. In addition, frequent structural changes after substitutions also tend to be common for insertions and deletions. This is consistent with the connection between energetically suboptimal folds and possible structural transitions. The similarities observed hold both for genotypic and phenotypic robustness and mutation probabilities, i.e. for individual sequences and for averages over sequences with the same structure. Our results could have implications for the rate of neutral and non-neutral evolution.
Affiliation(s)
- Nora S. Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
  - Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
- Sebastian E. Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
9
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models or synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
  - Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
- Jacobo Aguirre
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Sebastian E Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
- Alejandro V Cano
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pablo Catalán
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
- Santiago F Elena
  - Instituto de Biología Integrativa de Sistemas, I²SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
- Paulien Hogeweg
  - Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
- Bhavin S Khatri
  - The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
- Joachim Krug
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
- Nora S Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
- Joshua L Payne
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marcel Weiß
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK