1
Barba-Montoya J, Tao Q, Kumar S. Molecular and morphological clocks for estimating evolutionary divergence times. BMC Ecol Evol 2021; 21:83. [PMID: 33980146 PMCID: PMC8117668 DOI: 10.1186/s12862-021-01798-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/12/2020] [Accepted: 04/20/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND Matrices of morphological characters are frequently used for dating species divergence times in systematics. In some studies, morphological and molecular character data from living taxa are combined, whereas others also use morphological characters from extinct taxa. We investigated whether morphological data produce time estimates that are concordant with those from molecular data. If they do, this would justify the use of morphological characters alongside molecular data in divergence time inference.
RESULTS We systematically analyzed three empirical datasets from different species groups as test cases, comparing species divergence dates inferred from molecular data with those inferred from discrete morphological data from extant taxa. We found a high correlation between the two sets of divergence time estimates, despite a poor linear relationship between branch lengths for morphological and molecular data mapped onto the same phylogeny. This was because node-to-tip distances showed a much higher correlation than branch lengths, owing to an averaging effect over multiple branches. We found that nodes with a large number of taxa often benefit from such averaging. However, considerable discordance between time estimates from molecules and morphology may still occur, as some intermediate nodes may show large time differences between the two types of data.
CONCLUSIONS Our findings suggest that node- and tip-calibration approaches may be better suited for nodes with many taxa. Nevertheless, we highlight the importance of evaluating the concordance of intrinsic time structure in morphological and molecular data before any dating analysis using combined datasets.
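The averaging effect mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the tree encoding, the function names (`tip_distances`, `mean_node_to_tip`) and all branch lengths are hypothetical, chosen only to show how a node-to-tip distance pools several branches.

```python
from statistics import mean

# Hypothetical toy tree: internal node -> list of (child, branch length).
# Tips never appear as keys. All branch lengths are invented for illustration.
TOY_TREE = {
    "root": [("A", 1.0), ("n1", 0.5)],
    "n1": [("B", 0.4), ("C", 0.6)],
}

def tip_distances(tree, node):
    """Return the path lengths from `node` down to each descendant tip."""
    children = tree.get(node, [])
    if not children:  # a tip is at distance zero from itself
        return [0.0]
    dists = []
    for child, length in children:
        # Each path through this child is extended by the child's branch length.
        dists.extend(length + d for d in tip_distances(tree, child))
    return dists

def mean_node_to_tip(tree):
    """Average node-to-tip distance for every internal node."""
    return {node: mean(tip_distances(tree, node)) for node in tree}

heights = mean_node_to_tip(TOY_TREE)
```

In this toy tree the value for `root` averages three paths built from four branches, whereas the value for `n1` rests on two branches only. An error on any single branch is therefore diluted more at nodes that subtend many taxa, which is the intuition behind the abstract's remark that such nodes benefit from averaging.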
Affiliation(s)
- Jose Barba-Montoya
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Qiqing Tao
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Sudhir Kumar
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA, 19122, USA
  - Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
2
Nie Y, Foster CSP, Zhu T, Yao R, Duchêne DA, Ho SYW, Zhong B. Accounting for Uncertainty in the Evolutionary Timescale of Green Plants Through Clock-Partitioning and Fossil Calibration Strategies. Syst Biol 2020; 69:1-16. [PMID: 31058981 DOI: 10.1093/sysbio/syz032] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 07/08/2018] [Revised: 04/30/2019] [Accepted: 05/02/2019] [Indexed: 11/13/2022]
Abstract
Establishing an accurate evolutionary timescale for green plants (Viridiplantae) is essential to understanding their interaction and coevolution with the Earth's climate and the many organisms that rely on green plants. Despite being the focus of numerous studies, the timing of the origin of green plants and the divergence of major clades within this group remain highly controversial. Here, we infer the evolutionary timescale of green plants by analyzing 81 protein-coding genes from 99 chloroplast genomes, using a core set of 21 fossil calibrations. We test the sensitivity of our divergence-time estimates to various components of Bayesian molecular dating, including the tree topology, clock models, clock-partitioning schemes, rate priors, and fossil calibrations. We find that the choice of clock model affects date estimation and that the independent-rates model provides a better fit to the data than the autocorrelated-rates model. Varying the rate prior and tree topology had little impact on age estimates, with far greater differences observed among calibration choices and clock-partitioning schemes. Our analyses yield date estimates ranging from the Paleoproterozoic to Mesoproterozoic for crown-group green plants, and from the Ediacaran to Middle Ordovician for crown-group land plants. We present divergence-time estimates of the major groups of green plants that take into account various sources of uncertainty. Our proposed timeline lays the foundation for further investigations into how green plants shaped the global climate and ecosystems, and how embryophytes became dominant in terrestrial environments.
Affiliation(s)
- Yuan Nie
  - College of Life Sciences, Nanjing Normal University, Nanjing 210046, China
- Charles S P Foster
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
- Tianqi Zhu
  - National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100000, China
- Ru Yao
  - College of Life Sciences, Nanjing Normal University, Nanjing 210046, China
- David A Duchêne
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
- Bojian Zhong
  - College of Life Sciences, Nanjing Normal University, Nanjing 210046, China
3
The phylogeography and incidence of multi-drug resistant typhoid fever in sub-Saharan Africa. Nat Commun 2018; 9:5094. [PMID: 30504848 PMCID: PMC6269545 DOI: 10.1038/s41467-018-07370-z] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 04/30/2018] [Accepted: 10/18/2018] [Indexed: 11/18/2022]
Abstract
There is a paucity of data regarding the geographical distribution, incidence, and phylogenetics of multi-drug resistant (MDR) Salmonella Typhi in sub-Saharan Africa. Here we present a phylogenetic reconstruction of 249 whole-genome-sequenced, contemporaneous S. Typhi isolates collected between 2008 and 2015 in 11 sub-Saharan African countries, placed in the context of the 2,057-genome global S. Typhi framework. Despite the broad genetic diversity, the majority of organisms (225/249; 90%) belong to only three genotypes: 4.3.1 (H58) (99/249; 40%), 3.1.1 (97/249; 39%), and 2.3.2 (29/249; 12%). Genotypes 4.3.1 and 3.1.1 are confined to East and West Africa, respectively. The MDR phenotype is found in over 50% of organisms and is restricted to these dominant genotypes. High incidences of MDR S. Typhi are calculated in locations with a high burden of typhoid, specifically in children aged <15 years. Antimicrobial stewardship, MDR surveillance, and the introduction of typhoid conjugate vaccines will be critical for the control of MDR typhoid in Africa.
Typhoid fever is caused by the bacterium Salmonella Typhi. Here, Park et al. analyse the genomes of 249 S. Typhi isolates from 11 sub-Saharan African countries, identifying genes and plasmids associated with antibiotic resistance and showing that multi-drug resistance is highly pervasive in sub-Saharan Africa.
4
Duchene S, Duchene DA, Geoghegan JL, Dyson ZA, Hawkey J, Holt KE. Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods. BMC Evol Biol 2018; 18:95. [PMID: 29914372 PMCID: PMC6006949 DOI: 10.1186/s12862-018-1210-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/19/2017] [Accepted: 06/05/2018] [Indexed: 12/04/2022]
Abstract
Background Recent developments in sequencing technologies make it possible to obtain genome sequences from a large number of isolates in a very short time. Bayesian phylogenetic approaches can take advantage of these data by simultaneously inferring the phylogenetic tree, evolutionary timescale, and demographic parameters (such as population growth rates), while naturally integrating uncertainty in all parameters. Despite their desirable properties, Bayesian approaches can be computationally intensive, hindering their use for outbreak investigations involving genome data for a large number of pathogen isolates. An alternative to full Bayesian inference is a hybrid approach, in which the phylogenetic tree and evolutionary timescale are estimated first using maximum likelihood. Under this hybrid approach, demographic parameters are inferred from the estimated trees instead of the sequence data, using maximum likelihood, Bayesian inference, or approximate Bayesian computation. This can vastly reduce the computational burden, but has the disadvantage of ignoring the uncertainty in the phylogenetic tree and evolutionary timescale.
Results We compared the performance of a fully Bayesian and a hybrid method by analysing six whole-genome SNP data sets from a range of bacteria and simulations. The estimates from the two methods were very similar, suggesting that the hybrid method is a valid alternative for very large datasets. However, we also found that congruence between these methods is contingent on the presence of strong temporal structure in the data (i.e. clocklike behaviour), which is typically verified using a date-randomisation test in a Bayesian framework. To reduce the computational burden of this Bayesian test, we implemented a date-randomisation test using a rapid maximum likelihood method, which has similar performance to its Bayesian counterpart.
Conclusions Hybrid approaches can produce reliable inferences of evolutionary timescales and phylodynamic parameters in a fraction of the time required for fully Bayesian analyses. As such, they are a valuable alternative in outbreak studies involving a large number of isolates.
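The idea behind a date-randomisation test can be sketched in a few lines. This is a simplified illustration, not the method implemented in the paper. It assumes that the rate is estimated by an ordinary least-squares regression of root-to-tip distance on sampling date, which stands in here for the maximum likelihood and Bayesian estimators the abstract refers to. The function names, the sampling dates and the distances are all invented.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def date_randomisation(dates, distances, replicates=20, seed=1):
    """Estimate the rate with the true dates, then with shuffled dates.

    Assumption: the regression slope is used as a cheap stand-in for a
    full rate estimate. Returns (true_rate, list_of_randomised_rates).
    """
    rng = random.Random(seed)
    true_rate = slope(dates, distances)
    random_rates = []
    for _ in range(replicates):
        shuffled = list(dates)
        rng.shuffle(shuffled)  # break the link between tips and their dates
        random_rates.append(slope(shuffled, distances))
    return true_rate, random_rates

# Invented, perfectly clock-like data: 0.002 substitutions/site/year.
dates = [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007]
distances = [0.010 + 0.002 * (d - 2000) for d in dates]

true_rate, random_rates = date_randomisation(dates, distances)
# One common criterion: the true estimate lies outside the randomised range.
has_temporal_signal = all(r < true_rate for r in random_rates)
```

With data that carry temporal structure, the rate estimated from the true sampling dates should fall outside the range obtained from shuffled dates. With real data the comparison is usually made against the uncertainty intervals of the randomised estimates rather than against point estimates alone.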
Affiliation(s)
- Sebastian Duchene
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3020, Australia
- David A Duchene
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, 2006, Australia
- Jemma L Geoghegan
  - Department of Biological Sciences, Macquarie University, Sydney, NSW, 2109, Australia
- Zoe A Dyson
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3020, Australia
- Jane Hawkey
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3020, Australia
- Kathryn E Holt
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3020, Australia
5
Foster CSP, Ho SYW. Strategies for Partitioning Clock Models in Phylogenomic Dating: Application to the Angiosperm Evolutionary Timescale. Genome Biol Evol 2018; 9:2752-2763. [PMID: 29036288 PMCID: PMC5647803 DOI: 10.1093/gbe/evx198] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Accepted: 09/25/2017] [Indexed: 12/14/2022]
Abstract
Evolutionary timescales can be inferred from molecular sequence data using a Bayesian phylogenetic approach. In these methods, the molecular clock is often calibrated using fossil data. The uncertainty in these fossil calibrations is important because it determines the limiting posterior distribution for divergence-time estimates as the sequence length tends to infinity. Here, we investigate how the accuracy and precision of Bayesian divergence-time estimates improve with the increased clock-partitioning of genome-scale data into clock-subsets. We focus on a data set comprising plastome-scale sequences of 52 angiosperm taxa. There was little difference among the Bayesian date estimates whether we chose clock-subsets based on patterns of among-lineage rate heterogeneity or relative rates across genes, or by random assignment. Increasing the degree of clock-partitioning usually led to an improvement in the precision of divergence-time estimates, but this increase was asymptotic to a limit presumably imposed by fossil calibrations. Our clock-partitioning approaches yielded highly precise age estimates for several key nodes in the angiosperm phylogeny. For example, when partitioning the data into 20 clock-subsets based on patterns of among-lineage rate heterogeneity, we inferred crown angiosperms to have arisen 198–178 Ma. This demonstrates that judicious clock-partitioning can improve the precision of molecular dating based on phylogenomic data, but the meaning of this increased precision should be considered critically.
Affiliation(s)
- Charles S P Foster
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
6
Bromham L, Duchêne S, Hua X, Ritchie AM, Duchêne DA, Ho SYW. Bayesian molecular dating: opening up the black box. Biol Rev Camb Philos Soc 2017; 93:1165-1191. [DOI: 10.1111/brv.12390] [Citation(s) in RCA: 104] [Impact Index Per Article: 13.0] [Received: 01/09/2017] [Revised: 11/13/2017] [Accepted: 11/17/2017] [Indexed: 12/27/2022]
Affiliation(s)
- Lindell Bromham
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
- Sebastián Duchêne
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute; The University of Melbourne; Melbourne VIC 3010 Australia
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- Xia Hua
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
- Andrew M. Ritchie
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- David A. Duchêne
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- Simon Y. W. Ho
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
7
Tong KJ, Duchêne S, Lo N, Ho SYW. The impacts of drift and selection on genomic evolution in insects. PeerJ 2017; 5:e3241. [PMID: 28462044 PMCID: PMC5410144 DOI: 10.7717/peerj.3241] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/05/2016] [Accepted: 03/28/2017] [Indexed: 11/20/2022]
Abstract
Genomes evolve through a combination of mutation, drift, and selection, all of which act heterogeneously across genes and lineages. This leads to differences in branch-length patterns among gene trees. Genes that yield trees with the same branch-length patterns can be grouped together into clusters. Here, we propose a novel phylogenetic approach to explain the factors that influence the number and distribution of these gene-tree clusters. We apply our method to a genomic dataset from insects, an ancient and diverse group of organisms. We find some evidence that when drift is the dominant evolutionary process, each cluster tends to contain a large number of fast-evolving genes. In contrast, strong negative selection leads to many distinct clusters, each of which contains only a few slow-evolving genes. Our work, although preliminary in nature, illustrates the use of phylogenetic methods to shed light on the factors driving rate variation in genomic evolution.
Affiliation(s)
- K Jun Tong
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Sebastián Duchêne
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
  - Centre for Systems Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Nathan Lo
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia