1
Mello B, Schrago CG. Modeling Substitution Rate Evolution across Lineages and Relaxing the Molecular Clock. Genome Biol Evol 2024; 16:evae199. [PMID: 39332907 PMCID: PMC11430275 DOI: 10.1093/gbe/evae199] [Accepted: 09/08/2024]
Abstract
Relaxing the molecular clock using models of how substitution rates change across lineages has become essential for addressing evolutionary problems. The diversity of rate evolution models and their implementations are substantial, and studies have demonstrated their impact on divergence time estimates can be as significant as that of calibration information. In this review, we trace the development of rate evolution models from the proposal of the molecular clock concept to the development of sophisticated Bayesian and non-Bayesian methods that handle rate variation in phylogenies. We discuss the various approaches to modeling rate evolution, provide a comprehensive list of available software, and examine the challenges and advancements of the prevalent Bayesian framework, contrasting them to faster non-Bayesian methods. Lastly, we offer insights into potential advancements in the field in the era of big data.
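To make the idea of relaxing the clock concrete, the following is a minimal sketch, not taken from the review, that contrasts a strict clock with two common families of rate evolution models: uncorrelated lognormal rates and autocorrelated rates. The function names, branch durations, and parameter values are all illustrative assumptions.

```python
import math
import random


def strict_clock_rates(n_branches, rate):
    """Strict clock: every branch shares one substitution rate."""
    return [rate] * n_branches


def uncorrelated_lognormal_rates(n_branches, mean_rate, sigma, rng):
    """Draw an independent lognormal rate for each branch.

    mu is chosen so the expected rate equals mean_rate, because
    E[exp(N(mu, sigma^2))] = exp(mu + sigma^2 / 2).
    """
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def autocorrelated_rates(parent_rate, n_steps, sigma, rng):
    """Rates along a chain of branches; each log-rate is centred on its parent's."""
    rates = []
    log_rate = math.log(parent_rate)
    for _ in range(n_steps):
        log_rate = rng.gauss(log_rate, sigma)
        rates.append(math.exp(log_rate))
    return rates


def branch_lengths(rates, durations):
    """Expected substitutions per site on each branch: rate x time."""
    return [r * t for r, t in zip(rates, durations)]


rng = random.Random(1)
durations = [5.0, 10.0, 2.5, 7.5]  # hypothetical branch durations (e.g. Myr)
strict = branch_lengths(strict_clock_rates(4, 0.01), durations)
relaxed = branch_lengths(uncorrelated_lognormal_rates(4, 0.01, 0.5, rng), durations)
print("strict :", strict)
print("relaxed:", relaxed)
```

Under the strict clock, branch lengths are proportional to durations; under the relaxed models the same durations can yield different branch lengths, which is why the choice of rate model can affect divergence time estimates.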
Affiliation(s)
- Beatriz Mello
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, RJ 21941-617, Brazil
- Carlos G Schrago
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, RJ 21941-617, Brazil
2
Evolutionary Shift from Purifying Selection towards Divergent Selection of SARS-CoV2 Favors its Invasion into Multiple Human Organs. Virus Res 2022; 313:198712. [PMID: 35176330 PMCID: PMC8843322 DOI: 10.1016/j.virusres.2022.198712] [Received: 12/15/2021] [Revised: 02/10/2022] [Accepted: 02/13/2022]
Abstract
SARS-CoV2 is believed to have originated from a closely related bat coronavirus lineage, RaTG13, and uses key entry-point residues in its S1 protein to attach to the human ACE2 receptor. SARS-CoV2 could have entered humans from bats in one of three ways: long before its known appearance, with poorly developed entry-point residues and a slower mutation rate; recently, with efficiently developed entry-point residues and a higher mutation rate; or through an intermediate host. Temporal analysis of the SARS-CoV2 genome shows that its nucleotide substitution rate is as low as 27 nt/year, corresponding to an evolutionary rate of 9 × 10⁻⁴/site/year, which is well within the range of other RNA viruses (10⁻⁴ to 10⁻⁶/site/year). The TMRCA of SARS-CoV2 and the bat RaTG13 lineage appears to be between 9 and 14 years. Evolution of the critical entry-point residue Y493Q requires two substitutions through an intermediate virus carrying Y493H (Y>H>Q), but no such intermediate has been identified among the twenty-nine known bat CoVs. Codon analysis indicates that SARS-CoV2 evolution during propagation in humans departs from neutral evolution, as nonsynonymous mutations outpace synonymous mutations and ω (dn/ds) increases. Taken together, the genetic data suggest that SARS-CoV2 originated long before its appearance in humans in 2019. The increase in ω signifies that SARS-CoV2 evolution is shifting from purifying selection towards diversifying selection, presumably enhancing its ability to invade multiple human organs.
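The two rates quoted in the abstract above can be related by a short calculation, and the ω ratio can be read with the conventional thresholds. The sketch below is illustrative only: the genome length of 29,903 nt is an assumption (the length of the Wuhan-Hu-1 reference genome, which the abstract does not state), the dn and ds values are hypothetical, and the function names are not from the paper.

```python
GENOME_LENGTH_NT = 29_903  # assumption: Wuhan-Hu-1 reference genome length


def per_site_rate(substitutions_per_year, genome_length):
    """Convert a whole-genome substitution rate to a per-site rate."""
    return substitutions_per_year / genome_length


def omega(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ds == 0:
        raise ValueError("ds must be non-zero to compute omega")
    return dn / ds


def classify_selection(w, tol=0.05):
    """Conventional reading of omega; tol is an arbitrary neutrality band."""
    if w < 1 - tol:
        return "purifying"
    if w > 1 + tol:
        return "diversifying"
    return "neutral"


rate = per_site_rate(27, GENOME_LENGTH_NT)
print(f"{rate:.2e} substitutions/site/year")  # about 9e-4, matching the abstract

# Hypothetical dn and ds values, for illustration only.
for dn, ds in [(0.002, 0.010), (0.010, 0.010), (0.015, 0.010)]:
    w = omega(dn, ds)
    print(f"omega = {w:.2f}: {classify_selection(w)}")
```

With these assumptions, 27 substitutions per year spread over roughly 30,000 sites gives about 9 × 10⁻⁴ per site per year, consistent with the figure reported in the abstract.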
3
Campbell CR, Tiley GP, Poelstra JW, Hunnicutt KE, Larsen PA, Lee HJ, Thorne JL, Dos Reis M, Yoder AD. Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur. Heredity (Edinb) 2021; 127:233-244. [PMID: 34272504 PMCID: PMC8322134 DOI: 10.1038/s41437-021-00446-5] [Received: 05/20/2020] [Revised: 05/25/2021] [Accepted: 05/26/2021]
Abstract
Mutations are the raw material on which evolution acts, and knowledge of their frequency and genomic distribution is crucial for understanding how evolution operates at both long and short timescales. At present, the rate and spectrum of de novo mutations have been directly characterized in relatively few lineages. Our study provides the first direct mutation-rate estimate for a strepsirrhine (i.e., the lemurs and lorises), which comprises nearly half of the primate clade. Using high-coverage linked-read sequencing for a focal quartet of gray mouse lemurs (Microcebus murinus), we estimated the mutation rate to be among the highest calculated for a mammal at 1.52 × 10⁻⁸ (95% credible interval: 1.28 × 10⁻⁸ to 1.78 × 10⁻⁸) mutations/site/generation. Further, we found an unexpectedly low count of paternal mutations, and only a modest overrepresentation of mutations at CpG sites. Despite the surprising nature of these results, we found both the rate and spectrum to be robust to the manipulation of a wide range of computational filtering criteria. We also sequenced a technical replicate to estimate a false-negative and false-positive rate for our data and show that any point estimate of a de novo mutation rate should be considered with a large degree of uncertainty. For validation, we conducted an independent analysis of context-dependent substitution types for gray mouse lemur and five additional primate species for which de novo mutation rates have also been estimated. These comparisons revealed general consistency of the mutation spectrum between the pedigree-based and the substitution-rate analyses for all species compared.
Affiliation(s)
- C Ryan Campbell
- Department of Biology, Duke University, Durham, NC, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC, USA
- Kelsie E Hunnicutt
- Department of Biology, Duke University, Durham, NC, USA
- Department of Biological Sciences, University of Denver, Denver, CO, USA
- Peter A Larsen
- Department of Biology, Duke University, Durham, NC, USA
- Department of Veterinary and Biomedical Sciences, University of Minnesota, St. Paul, MN, USA
- Hui-Jie Lee
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Jeffrey L Thorne
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Mario Dos Reis
- School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
- Anne D Yoder
- Department of Biology, Duke University, Durham, NC, USA.
4
Laurin-Lemay S, Rodrigue N, Lartillot N, Philippe H. Conditional Approximate Bayesian Computation: A New Approach for Across-Site Dependency in High-Dimensional Mutation-Selection Models. Mol Biol Evol 2019; 35:2819-2834. [PMID: 30203003 DOI: 10.1093/molbev/msy173]
Abstract
A key question in molecular evolutionary biology concerns the relative roles of mutation and selection in shaping genomic data. Moreover, features of mutation and selection are heterogeneous along the genome and over time. Mechanistic codon substitution models based on the mutation-selection framework are promising approaches to separating these effects. In practice, however, several complications arise, since accounting for such heterogeneities often implies handling models of high dimensionality (e.g., amino acid preferences), or leads to across-site dependence (e.g., CpG hypermutability), making the likelihood function intractable. Approximate Bayesian Computation (ABC) could address this latter issue. Here, we propose a new approach, named Conditional ABC (CABC), which combines the sampling efficiency of MCMC and the flexibility of ABC. To illustrate the potential of the CABC approach, we apply it to the study of mammalian CpG hypermutability based on a new mutation-level parameter implying dependence across adjacent sites, combined with site-specific purifying selection on amino-acids captured by a Dirichlet process. Our proof-of-concept of the CABC methodology opens new modeling perspectives. Our application of the method reveals a high level of heterogeneity of CpG hypermutability across loci and mild heterogeneity across taxonomic groups; and finally, we show that CpG hypermutability is an important evolutionary factor in rendering relative synonymous codon usage. All source code is available as a GitHub repository (https://github.com/Simonll/LikelihoodFreePhylogenetics.git).
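For readers unfamiliar with ABC, the sketch below shows plain rejection ABC on a toy problem. It is not the CABC method described in the abstract above and does not use the authors' repository; the simulator, the uniform prior, the tolerance, and all names are assumptions chosen only to show how a likelihood-free method replaces likelihood evaluation with simulation.

```python
import random


def simulate_differences(p, n_sites, rng):
    """Toy simulator: number of sites that differ when each site changes with probability p."""
    return sum(1 for _ in range(n_sites) if rng.random() < p)


def abc_rejection(observed, n_sites, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the observed one."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # assumed prior: Uniform(0, 1)
        simulated = simulate_differences(p, n_sites, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


rng = random.Random(42)
observed = 20  # hypothetical count of differing sites out of 100
posterior = abc_rejection(observed, n_sites=100, n_draws=3000, tolerance=2, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} of 3000 draws; posterior mean p = {posterior_mean:.3f}")
```

The accepted draws approximate the posterior without the likelihood ever being computed, which is the property that makes ABC attractive when across-site dependence renders the likelihood intractable.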
Affiliation(s)
- Simon Laurin-Lemay
- Robert-Cedergren Center for Bioinformatics and Genomics, Department of Biochemistry and Molecular Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada
- Nicolas Rodrigue
- Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, Ottawa, ON, Canada
- Nicolas Lartillot
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Lyon 1, Lyon, France
- Hervé Philippe
- Robert-Cedergren Center for Bioinformatics and Genomics, Department of Biochemistry and Molecular Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada
- Centre de Théorisation et de Modélisation de la Biodiversité, Station d'Écologie Théorique et Expérimentale, UMR CNRS 5321, Moulis, France
5
Lee HJ, Kishino H, Rodrigue N, Thorne JL. Grouping substitution types into different relaxed molecular clocks. Philos Trans R Soc Lond B Biol Sci 2017; 371:rstb.2015.0141. [PMID: 27325837 DOI: 10.1098/rstb.2015.0141] [Accepted: 01/07/2016]
Abstract
Different types of nucleotide substitutions experience different patterns of rate change over time. We propose clustering context-dependent (or context-independent) nucleotide substitution types according to how their rates change and then using the grouping for divergence time estimation. With our models, relative rates among types that are in the same group are fixed, whereas absolute rates of the types within a group change over time according to a shared relaxed molecular clock. We illustrate our procedure by analysing a 0.15 Mb intergenic region to infer divergence times relating eight primates. The different groupings of substitution types that we explore have little effect on the posterior means of divergence times, but the widths of the credibility intervals decrease as the number of groups increases. This article is part of the themed issue 'Dating species divergences using rocks and clocks'.
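The grouping idea in the abstract above can be illustrated with a small sketch: within a group, relative rates are fixed, and a single per-branch multiplier (the group's relaxed clock) scales every type in that group. The group names, substitution types, relative rates, and multipliers below are hypothetical and are not the authors' groupings or estimates.

```python
import math

# Hypothetical groups: fixed relative rates for the types within each group.
GROUPS = {
    "transitions": {"A>G": 1.0, "C>T": 1.2},
    "transversions": {"A>C": 0.4, "A>T": 0.3},
}


def branch_rates(groups, multipliers):
    """Absolute rate of each substitution type on one branch.

    multipliers maps each group to that branch's clock multiplier,
    so all types in a group are scaled by the same factor.
    """
    return {
        sub_type: relative * multipliers[group]
        for group, types in groups.items()
        for sub_type, relative in types.items()
    }


# Hypothetical per-branch multipliers drawn from each group's relaxed clock.
branch_1 = branch_rates(GROUPS, {"transitions": 1.5, "transversions": 0.8})
branch_2 = branch_rates(GROUPS, {"transitions": 0.6, "transversions": 1.1})

within_1 = branch_1["C>T"] / branch_1["A>G"]
within_2 = branch_2["C>T"] / branch_2["A>G"]
between_1 = branch_1["A>G"] / branch_1["A>C"]
between_2 = branch_2["A>G"] / branch_2["A>C"]
print("within-group ratio :", within_1, within_2)    # equal across branches
print("between-group ratio:", between_1, between_2)  # free to differ
```

The within-group ratio is the same on both branches, while the between-group ratio changes, which is the constraint the grouped-clock models impose.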
Affiliation(s)
- Hui-Jie Lee
- Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Hirohisa Kishino
- Laboratory of Biometrics and Bioinformatics, University of Tokyo, Tokyo, Japan
- Nicolas Rodrigue
- Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, Canada
- Jeffrey L Thorne
- Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
6
Abstract
Events in primate evolution are often dated by assuming a constant rate of substitution per unit time, but the validity of this assumption remains unclear. Among mammals, it is well known that there exists substantial variation in yearly substitution rates. Such variation is to be expected from differences in life history traits, suggesting it should also be found among primates. Motivated by these considerations, we analyze whole genomes from 10 primate species, including Old World Monkeys (OWMs), New World Monkeys (NWMs), and apes, focusing on putatively neutral autosomal sites and controlling for possible effects of biased gene conversion and methylation at CpG sites. We find that substitution rates are up to 64% higher in lineages leading from the hominoid-NWM ancestor to NWMs than to apes. Within apes, rates are ∼2% higher in chimpanzees and ∼7% higher in the gorilla than in humans. Substitution types subject to biased gene conversion show no more variation among species than those not subject to it. Not all mutation types behave similarly, however; in particular, transitions at CpG sites exhibit a more clocklike behavior than do other types, presumably because of their nonreplicative origin. Thus, not only the total rate, but also the mutational spectrum, varies among primates. This finding suggests that events in primate evolution are most reliably dated using CpG transitions. Taking this approach, we estimate the human and chimpanzee divergence time is 12.1 million years, and the human and gorilla divergence time is 15.1 million years.