1
Khurana MP, Scheidwasser-Clow N, Penn MJ, Bhatt S, Duchêne DA. The Limits of the Constant-rate Birth-Death Prior for Phylogenetic Tree Topology Inference. Syst Biol 2024; 73:235-246. [PMID: 38153910 PMCID: PMC11129600 DOI: 10.1093/sysbio/syad075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 12/20/2023] [Accepted: 12/27/2023] [Indexed: 12/30/2023] Open
Abstract
Birth-death models are stochastic processes describing speciation and extinction through time and across taxa and are widely used in biology for inference of evolutionary timescales. Previous research has highlighted how the expected trees under the constant-rate birth-death (crBD) model tend to differ from empirical trees, for example, with respect to the amount of phylogenetic imbalance. However, our understanding of how trees differ between the crBD model and the signal in empirical data remains incomplete. In this Point of View, we aim to expose the degree to which the crBD model differs from empirically inferred phylogenies and test the limits of the model in practice. Using a wide range of topology indices to compare crBD expectations against a comprehensive dataset of 1189 empirically estimated trees, we confirm that crBD model trees frequently differ topologically compared with empirical trees. To place this in the context of standard practice in the field, we conducted a meta-analysis for a subset of the empirical studies. When comparing studies that used Bayesian methods and crBD priors with those that used other non-crBD priors and non-Bayesian methods (i.e., maximum likelihood methods), we do not find any significant differences in tree topology inferences. To scrutinize this finding for the case of highly imbalanced trees, we selected the 100 trees with the greatest imbalance from our dataset, simulated sequence data for these tree topologies under various evolutionary rates, and re-inferred the trees under maximum likelihood and using the crBD model in a Bayesian setting. We find that when the substitution rate is low, the crBD prior results in overly balanced trees, but the tendency is negligible when substitution rates are sufficiently high. 
Overall, our findings demonstrate the general robustness of crBD priors across a broad range of phylogenetic inference scenarios but also highlight that empirically observed phylogenetic imbalance is highly improbable under the crBD model, leading to systematic bias in data sets with limited information content.
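The notion of phylogenetic imbalance used in this abstract can be illustrated with one common topology index, the Colless index (the sum over internal nodes of the absolute difference between left and right leaf counts). The sketch below is not the authors' pipeline. It assumes trees are written as nested tuples, and it relies on the standard result that the root split of an n-leaf Yule tree is uniform on 1..n-1, together with the commonly cited equivalence between Yule and crBD topology distributions. The function names are hypothetical.

```python
import random

def colless(tree):
    """Return (leaf count, Colless index) for a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return 1, 0  # a single leaf contributes no imbalance
    (n_left, c_left), (n_right, c_right) = colless(tree[0]), colless(tree[1])
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)

def sample_yule_colless(n, rng):
    """Sample the Colless index of an n-leaf Yule tree.

    Assumption: the left subtree size at every node is uniform on
    1..n-1, which is the standard property of the Yule model.
    """
    if n <= 1:
        return 0
    k = rng.randint(1, n - 1)  # leaves in the left subtree
    return abs(n - 2 * k) + sample_yule_colless(k, rng) + sample_yule_colless(n - k, rng)

if __name__ == "__main__":
    rng = random.Random(1)
    n = 50
    draws = [sample_yule_colless(n, rng) for _ in range(2000)]
    print("mean Colless under Yule:", sum(draws) / len(draws))
    print("maximum possible (caterpillar):", (n - 1) * (n - 2) // 2)
```

Comparing an empirical tree's index against draws like these gives a rough sense of how improbable a strongly imbalanced topology is under the model.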
Affiliation(s)
- Mark P Khurana
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
- Neil Scheidwasser-Clow
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
- Matthew J Penn
  - Department of Statistics, University of Oxford, OX1 3LB, Oxford, UK
- Samir Bhatt
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
  - MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, SW7 2AZ, London, UK
- David A Duchêne
  - Centre for Evolutionary Hologenomics, University of Copenhagen, 1352 Copenhagen, Denmark
2
Di Nunzio A, Disanto F. Clade size distribution under neutral evolutionary models. Theor Popul Biol 2024; 156:93-102. [PMID: 38367870 DOI: 10.1016/j.tpb.2024.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 02/07/2024] [Accepted: 02/12/2024] [Indexed: 02/19/2024]
Abstract
Given a labeled tree topology t, consider a population P of k leaves chosen among those of t. The clade of P is the minimal subtree of t containing P and its size is given by the number of leaves in the clade. When t is selected under the Yule or uniform distribution among the labeled topologies of size n, we study the "clade size" random variable determining closed formulas for its probability mass function, its mean, and its variance. Our calculations show that for large n the clade size tends to be smaller under the uniform model than under the Yule model, with a larger variability in the first scenario for values of k≥5. We apply our probability formulas to investigate set-theoretic relationships between the clades of two populations in a random tree, determining how likely one clade is contained in or it is equal to the other. Our study relates to earlier calculations for the probability that under the Yule model the clade size of P equals the size of P - that is, the population P forms a monophyletic group - and extends known results for the probability that the minimal (non-trivial) clade containing a random taxon has a given size.
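The "clade size" random variable defined in this abstract can be explored by simulation. The following is a minimal Monte Carlo sketch for the Yule case only, not the closed formulas derived in the paper. It assumes the standard Yule property that the root split is uniform on 1..n-1 and that the k sampled leaves are a uniform random subset. The function name is hypothetical.

```python
import random

def sample_clade_size(n, k, rng):
    """Sample the clade size of k random leaves in an n-leaf Yule tree."""
    while True:
        if k == 1:
            return 1  # the clade of a single leaf is the leaf itself
        if k == n:
            return n  # the sample fills the whole current subtree
        left = rng.randint(1, n - 1)  # leaves in the left subtree
        # Leaves are exchangeable, so treat indices 0..left-1 as the left side.
        in_left = sum(1 for i in rng.sample(range(n), k) if i < left)
        if in_left == k:
            n = left        # all sampled leaves descend into the left subtree
        elif in_left == 0:
            n = n - left    # all sampled leaves descend into the right subtree
        else:
            return n        # the sample straddles this node, so it is the clade root

if __name__ == "__main__":
    rng = random.Random(7)
    draws = [sample_clade_size(100, 5, rng) for _ in range(5000)]
    print("mean clade size:", sum(draws) / len(draws))
    print("fraction monophyletic:", sum(d == 5 for d in draws) / len(draws))
```

A draw equal to k corresponds to the monophyly event mentioned at the end of the abstract.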
3
Mulder WH. Probability Distribution of Tree Age for the Simple Birth-Death Process, with Applications to Distributions of Number of Ancestral Lineages and Divergence Times for Pairs of Taxa in a Yule Tree. Bull Math Biol 2023; 85:94. [PMID: 37658245 DOI: 10.1007/s11538-023-01196-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 08/11/2023] [Indexed: 09/03/2023]
Abstract
In this contribution, a general expression is derived for the probability density of the time to the most recent common ancestor (TMRCA) of a simple birth-death tree, a widely used stochastic null-model of biological speciation and extinction, conditioned on the constant birth and death rates and number of extant lineages. This density is contrasted with a previous result which was obtained using a uniform prior for the time of origin. The new distribution is applied to two problems of phylogenetic interest. First, that of the probability of the number of taxa existing at any time in the past in a tree of a known number of extant species, and given birth and death rates, and second, that of determining the TMRCA of two randomly selected taxa in an unobserved tree that is produced by a simple birth-only, or Yule, process. In the latter case, it is assumed that only the rate of bifurcation (speciation) and the size, or number of tips, are known. This is shown to lead to a closed-form analytical expression for the probability distribution of this parameter, which is arrived at based on the known mathematical form of the age distribution of Yule trees of a given size and branching rate, which is derived here de novo, and a similar distribution which additionally is conditioned on tree age. The new distribution is the exact Yule prior for divergence times of pairs of taxa under the stated conditions and is potentially useful in statistical (Bayesian) inference studies of phylogenies.
Affiliation(s)
- Willem H Mulder
  - Department of Chemistry, The University of the West Indies, Mona Campus, Kingston 7, Jamaica
4
Yu P, Lian Y, Zuleger CL, Albertini RJ, Albertini MR, Newton MA. Surrogate selection oversamples expanded T cell clonotypes. bioRxiv 2023:2023.07.13.548950. [PMID: 37503118 PMCID: PMC10369934 DOI: 10.1101/2023.07.13.548950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Inference from immunological data on cells in the adaptive immune system may benefit from modeling specifications that describe variation in the sizes of various clonal sub-populations. We develop one such specification in order to quantify the effects of surrogate selection assays, which we confirm may lead to an enrichment for amplified, potentially disease-relevant T cell clones. Our specification couples within-clonotype birth-death processes with an exchangeable model across clonotypes. Beyond enrichment questions about the surrogate selection design, our framework enables a study of sampling properties of elementary sample diversity statistics; it also points to new statistics that may usefully measure the burden of somatic genomic alterations associated with clonal expansion. We examine statistical properties of immunological samples governed by the coupled model specification, and we illustrate calculations in surrogate selection studies of melanoma and in single-cell genomic studies of T cell repertoires.
Affiliation(s)
- Peng Yu
  - Department of Statistics, University of Wisconsin, Madison
- Yumin Lian
  - Department of Chemistry, Laboratory of Genetics, University of Wisconsin, Madison
- Cindy L. Zuleger
  - Department of Medicine, School of Medicine and Public Health, University of Wisconsin, Madison
  - Carbone Cancer Center, University of Wisconsin, Madison
- Mark R. Albertini
  - Department of Medicine, School of Medicine and Public Health, University of Wisconsin, Madison
  - Carbone Cancer Center, University of Wisconsin, Madison
  - Medical Service, William S. Middleton Memorial Veterans Hospital, Madison
- Michael A. Newton
  - Department of Statistics, University of Wisconsin, Madison
  - Carbone Cancer Center, University of Wisconsin, Madison
  - Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison
5
Mathur S, Rosenberg NA. All galls are divided into three or more parts: recursive enumeration of labeled histories for galled trees. Algorithms Mol Biol 2023; 18:1. [PMID: 36782318 PMCID: PMC9926779 DOI: 10.1186/s13015-023-00224-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 01/27/2023] [Indexed: 02/15/2023] Open
Abstract
OBJECTIVE In mathematical phylogenetics, a labeled rooted binary tree topology can possess any of a number of labeled histories, each of which represents a possible temporal ordering of its coalescences. Labeled histories appear frequently in calculations that describe the combinatorics of phylogenetic trees. Here, we generalize the concept of labeled histories from rooted phylogenetic trees to rooted phylogenetic networks, specifically for the class of rooted phylogenetic networks known as rooted galled trees. RESULTS Extending a recursive algorithm for enumerating the labeled histories of a labeled tree topology, we present a method to enumerate the labeled histories associated with a labeled rooted galled tree. The method relies on a recursive decomposition by which each gall in a galled tree possesses three or more descendant subtrees. We exhaustively provide the numbers of labeled histories for all small galled trees, finding that each gall reduces the number of labeled histories relative to a specified galled tree that does not contain it. CONCLUSION The results expand the set of structures for which labeled histories can be enumerated, extending a well-known calculation for phylogenetic trees to a class of phylogenetic networks.
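For ordinary rooted binary trees (not galled trees), the recursive count of labeled histories that this abstract builds on can be sketched as follows. This is the standard tree recursion, not the paper's galled-tree method: the internal nodes of the two subtrees below a node can be interleaved in time in a binomial number of ways. The nested-tuple tree encoding and the function name are assumptions made for illustration.

```python
from math import comb

def labeled_histories(tree):
    """Return (leaf count, number of labeled histories) for a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return 1, 1  # a single leaf has exactly one (empty) history
    n_left, h_left = labeled_histories(tree[0])
    n_right, h_right = labeled_histories(tree[1])
    # A binary subtree with m leaves has m - 1 internal nodes (coalescences).
    # Count the ways to interleave the two subtrees' coalescence orderings.
    interleavings = comb((n_left - 1) + (n_right - 1), n_left - 1)
    return n_left + n_right, interleavings * h_left * h_right

if __name__ == "__main__":
    caterpillar = ((("a", "b"), "c"), "d")
    balanced = (("a", "b"), ("c", "d"))
    print(labeled_histories(caterpillar)[1])
    print(labeled_histories(balanced)[1])
```

The caterpillar tree admits a single ordering of its coalescences, whereas the balanced four-leaf tree admits two, because its two cherries can coalesce in either order.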
Affiliation(s)
- Shaili Mathur
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
- Noah A. Rosenberg
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
6
Vukičević D, Matijević D. The Connection of the Generalized Robinson-Foulds Metric with Partial Wiener Indices. Acta Biotheor 2023; 71:5. [PMID: 36695929 DOI: 10.1007/s10441-023-09457-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2021] [Accepted: 01/11/2023] [Indexed: 01/26/2023]
Abstract
In this work we propose the partial Wiener index as one possible measure of branching in phylogenetic evolutionary trees. We establish the connection between the generalized Robinson-Foulds (RF) metric for measuring the similarity of phylogenetic trees and partial Wiener indices by expressing the number of conflicting pairs of edges in the generalized RF metric in terms of partial Wiener indices. To do so we compute the minimum and maximum value of the partial Wiener index [Formula: see text], where [Formula: see text] is a binary rooted tree with root [Formula: see text] and [Formula: see text] leaves. Moreover, under the Yule probabilistic model, we show how to compute the expected value of [Formula: see text]. As a direct consequence, we give exact formulas for the upper bound and the expected number of conflicting pairs. By doing so we provide a better theoretical understanding of the computational complexity of the generalized RF metric.
Affiliation(s)
- Damir Vukičević
  - Department of Mathematics, Faculty of Science, University of Split, Ruđera Boškovića 33, 21000, Split, Croatia
- Domagoj Matijević
  - Department of Mathematics, University of Osijek, Trg Lj. Gaja 6, 31000, Osijek, Croatia
7
Carmelet‐Rescan D, Morgan‐Richards M, Pattabiraman N, Trewick SA. Time-calibrated phylogeny and ecological niche models indicate Pliocene aridification drove intraspecific diversification of brushtail possums in Australia. Ecol Evol 2022; 12:e9633. [PMID: 36540081 PMCID: PMC9755819 DOI: 10.1002/ece3.9633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Major aridification events in Australia during the Pliocene may have had significant impact on the distribution and structure of widespread species. To explore the potential impact of Pliocene and Pleistocene climate oscillations, we estimated the timing of population fragmentation and past connectivity of the currently isolated but morphologically similar subspecies of the widespread brushtail possum (Trichosurus vulpecula). We use ecological niche modeling (ENM) with the current fragmented distribution of brushtail possums to estimate the environmental envelope of this marsupial. We projected the ENM on models of past climatic conditions in Australia to infer the potential distribution of brushtail possums over 6 million years. D-loop haplotypes were used to describe population structure. From shotgun sequencing, we assembled whole mitochondrial DNA genomes and estimated the timing of intraspecific divergence. Our projections of ENMs suggest current possum populations were unlikely to have been in contact during the Pleistocene. Although lowered sea level during glacial periods enabled connection with habitat in Tasmania, climate fluctuation during this time would not have facilitated gene flow over much of Australia. The most recent common ancestor of sampled intraspecific diversity dates to the early Pliocene when continental aridification caused significant changes to Australian ecology and Trichosurus vulpecula distribution was likely fragmented. Phylogenetic analysis revealed that the subspecies T. v. hypoleucus (koomal; southwest), T. v. arnhemensis (langkurr; north), and T. v. vulpecula (bilda; southeast) correspond to distinct mitochondrial lineages. Despite little phenotypic differentiation, Trichosurus vulpecula populations probably experienced little gene flow with one another since the Pliocene, supporting the recognition of several subspecies and explaining their adaptations to the regional plant assemblages on which they feed.
Affiliation(s)
- David Carmelet‐Rescan
  - Wildlife and Ecology, School of Natural Sciences, Massey University, Palmerston North, New Zealand
- Mary Morgan‐Richards
  - Wildlife and Ecology, School of Natural Sciences, Massey University, Palmerston North, New Zealand
- Nimeshika Pattabiraman
  - Wildlife and Ecology, School of Natural Sciences, Massey University, Palmerston North, New Zealand
- Steven A. Trewick
  - Wildlife and Ecology, School of Natural Sciences, Massey University, Palmerston North, New Zealand
8
Disanto F, Fuchs M, Paningbatan AR, Rosenberg NA. The distributions under two species-tree models of the number of root ancestral configurations for matching gene trees and species trees. Ann Appl Probab 2022. [DOI: 10.1214/22-aap1791] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/23/2022]
Affiliation(s)
- Michael Fuchs
  - Department of Mathematical Sciences, National Chengchi University
9
OUP accepted manuscript. Syst Biol 2022; 71:1378-1390. [DOI: 10.1093/sysbio/syac008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/11/2020] [Revised: 02/05/2022] [Accepted: 02/08/2022] [Indexed: 11/12/2022] Open
10
Yao H, Zhang Y, Wang Z, Liu G, Ran Q, Zhang Z, Guo K, Yang A, Wang N, Wang P. Inter-glacial isolation caused divergence of cold-adapted species: the case of the snow partridge. Curr Zool 2021; 68:489-498. [PMID: 36090147 PMCID: PMC9450178 DOI: 10.1093/cz/zoab075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/27/2021] [Accepted: 09/01/2021] [Indexed: 01/03/2023] Open
Abstract
Deciphering the role of climatic oscillations in species divergence helps us understand the mechanisms that shape global biodiversity. Cold-adapted species may have expanded their distributions with the development of glaciers during glacial periods. With the retreat of glaciers, these species became discontinuously distributed in high-altitude mountains and isolated by geographical barriers. However, studies focusing on the speciation process of cold-adapted species are scant. To fill this gap, we combined population genetic data and ecological niche models (ENMs) to explore the divergence process of the snow partridge (Lerwa lerwa). Lerwa lerwa is a cold-adapted bird that is distributed from 4,000 to 5,500 m. We found 2 genetic populations within L. lerwa, which diverged from each other about 0.40–0.44 million years ago (the inter-glacial period after the Zhongliangan glaciation). The ENMs suggested that L. lerwa expanded to the low elevations of the Himalayas and Hengduan mountains during the glacial period, whereas it contracted to the high elevations, the southern Himalayas, and the Hengduan mountains during inter-glacial periods. The effective population size trajectory also suggested that L. lerwa expanded its population size during the glacial period. Consistent with our expectation, the results support the view that inter-glacial isolation contributed to the divergence of cold-adapted L. lerwa on the Qinghai-Tibetan Plateau. This study deepens our understanding of how climatic oscillations have driven the divergence of cold-adapted Phasianidae species distributed on mountains.
Affiliation(s)
- Hongyan Yao
  - School of Ecology and Nature Conservation, Beijing Forestry University, Beijing, 100083, China
- Yanan Zhang
  - School of Ecology and Nature Conservation, Beijing Forestry University, Beijing, 100083, China
- Zhen Wang
  - Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
  - Hangzhou Xi’ao Environmental Science Technique Company Limited, Zhejiang 310011, China
- Gaoming Liu
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Quan Ran
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Yancheng Wetland and World Natural Heritage Conservation and Management Center, Jiangsu 224000, China
- Zhengwang Zhang
  - Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Keji Guo
  - Central South Inventory and Planning Institute of National Forestry and Grassland Administration, Changsha 410014, China
- Ailin Yang
  - Chinese Institute for Brain Research, Beijing 102206, China
- Nan Wang
  - School of Ecology and Nature Conservation, Beijing Forestry University, Beijing, 100083, China
- Pengcheng Wang
  - Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
11
Wulff JP, Segura DF, Devescovi F, Muntaabski I, Milla FH, Scannapieco AC, Cladera JL, Lanzavecchia SB. Identification and characterization of soluble binding proteins associated with host foraging in the parasitoid wasp Diachasmimorpha longicaudata. PLoS One 2021; 16:e0252765. [PMID: 34138896 PMCID: PMC8211293 DOI: 10.1371/journal.pone.0252765] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/09/2020] [Accepted: 05/22/2021] [Indexed: 11/19/2022] Open
Abstract
The communication and reproduction of insects are driven by chemical sensing. During this process, chemical compounds are transported across the sensillum lymph to the sensory neurons assisted by different types of soluble binding proteins: odorant-binding proteins (OBPs); chemosensory proteins (CSPs); some members of ML-family proteins (MD-2 (myeloid differentiation factor-2)-related Lipid-recognition), also known as NPC2-like proteins. Potential transcripts involved in chemosensing were identified by an in silico analysis of whole-body female and male transcriptomes of the parasitic wasp Diachasmimorpha longicaudata. This analysis facilitated the characterization of fourteen OBPs (all belonging to the Classic type), seven CSPs (and two possible isoforms), and four NPC2-like proteins. A differential expression analysis by qPCR showed that eleven of these proteins (CSPs 2 and 8, OBPs 2, 3, 4, 5, 6, 9, 10, and 11, and NPC2b) were over-expressed in female antenna and two (CSP 1 and OBP 12) in the body without antennae. Foraging behavior trials (linked to RNA interference) suggest that OBPs 9, 10, and 11 are potentially involved in the female orientation to chemical cues associated with the host. OBP 12 seems to be related to physiological processes of female longevity regulation. In addition, transcriptional silencing of CSP 3 showed that this protein is potentially associated with the regulation of foraging behavior. This study supports the hypothesis that soluble binding proteins are potentially linked to fundamental physiological processes and behaviors in D. longicaudata. The results obtained here contribute useful information to increase the parasitoid performance as a biological control agent of fruit fly pest species.
Affiliation(s)
- Juan P. Wulff
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Diego F. Segura
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Francisco Devescovi
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Irina Muntaabski
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Fabian H. Milla
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Alejandra C. Scannapieco
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Jorge L. Cladera
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
- Silvia B. Lanzavecchia
  - Laboratorio de Insectos de Importancia Agronómica, Instituto de Genética Ewald A. Favret (INTA) gv IABIMO (CONICET), Buenos Aires, Argentina
12
Martinez-Seidel F, Beine-Golovchuk O, Hsieh YC, Eshraky KE, Gorka M, Cheong BE, Jimenez-Posada EV, Walther D, Skirycz A, Roessner U, Kopka J, Pereira Firmino AA. Spatially Enriched Paralog Rearrangements Argue Functionally Diverse Ribosomes Arise during Cold Acclimation in Arabidopsis. Int J Mol Sci 2021; 22:6160. [PMID: 34200446 PMCID: PMC8201131 DOI: 10.3390/ijms22116160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/30/2021] [Revised: 05/23/2021] [Accepted: 06/01/2021] [Indexed: 12/15/2022] Open
Abstract
Ribosome biogenesis is essential for plants to successfully acclimate to low temperature. Without dedicated steps supervising the 60S large subunits (LSUs) maturation in the cytosol, e.g., Rei-like (REIL) factors, plants fail to accumulate dry weight and fail to grow at suboptimal low temperatures. Around REIL, the final 60S cytosolic maturation steps include proofreading and assembly of functional ribosomal centers such as the polypeptide exit tunnel and the P-Stalk, respectively. In consequence, these ribosomal substructures and their assembly, especially during low temperatures, might be changed and provoke the need for dedicated quality controls. To test this, we blocked ribosome maturation during cold acclimation using two independent reil double mutant genotypes and tested changes in their ribosomal proteomes. Additionally, we normalized our mutant datasets using as a blank the cold responsiveness of a wild-type Arabidopsis genotype. This allowed us to neglect any reil-specific effects that may happen due to the presence or absence of the factor during LSU cytosolic maturation, thus allowing us to test for cold-induced changes that happen in the early nucleolar biogenesis. As a result, we report that cold acclimation triggers a reprogramming in the structural ribosomal proteome. The reprogramming alters the abundance of specific RP families and/or paralogs in non-translational LSU and translational polysome fractions, a phenomenon known as substoichiometry. Next, we tested whether the cold-substoichiometry was spatially confined to specific regions of the complex. In terms of RP proteoforms, we report that remodeling of ribosomes after a cold stimulus is significantly constrained to the polypeptide exit tunnel (PET), i.e., REIL factor binding and functional site. In terms of RP transcripts, cold acclimation induces changes in RP families or paralogs that are significantly constrained to the P-Stalk and the ribosomal head. 
The three modulated substructures represent possible targets of mechanisms that may constrain translation by controlled ribosome heterogeneity. We propose that non-random ribosome heterogeneity controlled by specialized biogenesis mechanisms may contribute to a preferential or ultimately even rigorous selection of transcripts needed for rapid proteome shifts and successful acclimation.
Affiliation(s)
- Federico Martinez-Seidel
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
  - School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia
- Olga Beine-Golovchuk
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
  - Heidelberg University, Biochemie-Zentrum, Nuclear Pore Complex and Ribosome Assembly, 69120 Heidelberg, Germany
- Yin-Chen Hsieh
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
  - Institute for Arctic and Marine Biology, UiT Arctic University of Norway, 9037 Tromsø, Norway
- Kheloud El Eshraky
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Michal Gorka
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Bo-Eng Cheong
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
  - School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia
  - Biotechnology Research Institute, Universiti Malaysia Sabah, Jalan UMS, 88400 Kota Kinabalu, Malaysia
- Erika V. Jimenez-Posada
  - Grupo de Biotecnología-Productos Naturales, Universidad Tecnológica de Pereira, Pereira 660003, Colombia
  - Emerging Infectious Diseases and Tropical Medicine Research Group—Sci-Help, Pereira 660009, Colombia
- Dirk Walther
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Aleksandra Skirycz
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Ute Roessner
  - School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia
- Joachim Kopka
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Alexandre Augusto Pereira Firmino
  - Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
13
Deformity Index: A Semi-Reference Clade-Based Quality Metric of Phylogenetic Trees. J Mol Evol 2021; 89:302-312. [PMID: 33811501 DOI: 10.1007/s00239-021-10006-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2020] [Accepted: 03/20/2021] [Indexed: 10/21/2022]
Abstract
Measuring the dissimilarity of a phylogenetic tree with respect to a reference tree or the hypotheses is a fundamental task in the phylogenetic study. A large number of methods have been proposed to compute the distance between the reference tree and the target tree. Due to the presence of unresolved relationships among the species, it is challenging to obtain a precise and an accurate reference tree for a selected dataset. As a result, the existing tree comparison methods may behave unexpectedly in various scenarios. In this paper, we introduce a novel scoring function, called the deformity index, to quantify the dissimilarity of a tree based on the list of clades of a reference tree. The strength of our proposed method is that it depends on the list of clades that can be acquired either from the reference tree or from the hypotheses. We investigate the distributions of different modules of the deformity index and perform different goodness-of-fit tests to understand the cumulative distribution. Then, we examine, in detail, the robustness as well as the scalability of our measure by performing different statistical tests under various models. Finally, we experiment on different biological datasets and show that our proposed scoring function overcomes the limitations of the conventional methods.
|
14
|
Neretina AN, Karabanov DP, Sacherova V, Kotov AA. Unexpected mitochondrial lineage diversity within the genus Alonella Sars, 1862 (Crustacea: Cladocera) across the Northern Hemisphere. PeerJ 2021; 9:e10804. [PMID: 33585083 PMCID: PMC7860113 DOI: 10.7717/peerj.10804] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/11/2020] [Accepted: 12/30/2020] [Indexed: 02/05/2023] Open
Abstract
Representatives of the genus Alonella Sars (Crustacea: Cladocera: Chydorinae) belong to the smallest known water fleas. Although species of Alonella are widely distributed and often abundant in acidic and mountain water bodies, their diversity is poorly studied. Morphological and genetic approaches have been complicated by the minute size of these microcrustaceans. As a result, taxonomists have avoided revising these species. Here, we present genetic data on Alonella species diversity across the Northern Hemisphere with particular attention to the A. excisa species complex. We analyzed 82 16S rRNA sequences (all newly obtained), and 78 COI sequences (39 were newly obtained). The results revealed at least twelve divergent phylogenetic lineages, possible cryptic species, of Alonella, with different distribution patterns. As expected, the potential species diversity of this genus is significantly higher than traditionally accepted. The A. excisa complex is represented by nine divergent clades in the Northern Hemisphere, some of them have relatively broad distribution ranges and others are more locally distributed. Our results provide a genetic background for subsequent morphological analyses, formal descriptions of Alonella species and detailed phylogeographical studies.
Affiliation(s)
- Anna N. Neretina
- A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russia
- Dmitry P. Karabanov
- A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russia
- I.D. Papanin Institute for Biology of Inland Waters, Borok, Yaroslavl State, Russia
- Alexey A. Kotov
- A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russia
|
15
|
Truszkowski J, Scornavacca C, Pardi F. Computing the probability of gene trees concordant with the species tree in the multispecies coalescent. Theor Popul Biol 2020; 137:22-31. [PMID: 33333117 DOI: 10.1016/j.tpb.2020.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2020] [Revised: 12/04/2020] [Accepted: 12/08/2020] [Indexed: 10/22/2022]
Abstract
The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between a gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several aspects of the species phylogeny, such as its topology and ancestral population sizes. A fundamental open problem in this context is how to efficiently compute the probability of a gene tree topology, given the species phylogeny. Although a number of algorithms for this task have been proposed, they either produce approximate results, or, when they are exact, they do not scale to large data sets. In this paper, we present some progress towards exact and efficient computation of the probability of a gene tree topology. We provide a new algorithm that, given a species tree and the number of genes sampled for each species, calculates the probability that the gene tree topology will be concordant with the species tree. Moreover, we provide an algorithm that computes the probability of any specific gene tree topology concordant with the species tree. Both algorithms run in polynomial time and have been implemented in Python. Experiments show that they are able to analyze data sets where thousands of genes are sampled in a matter of minutes to hours.
Affiliation(s)
- Celine Scornavacca
- ISEM, CNRS, Université Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France
- Fabio Pardi
- LIRMM, CNRS, Université Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France
|
16
|
Disanto F, Wiehe T. Measuring the external branches of a Kingman tree: A discrete approach. Theor Popul Biol 2020; 134:92-105. [PMID: 32485202 DOI: 10.1016/j.tpb.2020.05.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2019] [Revised: 04/02/2020] [Accepted: 05/21/2020] [Indexed: 10/24/2022]
Abstract
The Kingman coalescent process is a classical model of gene genealogies in population genetics. It generates Yule-distributed, binary ranked tree topologies - also called histories - with a finite number of n leaves, together with n-1 exponentially distributed time lengths: one for each layer of the history. Using a discrete approach, we study the lengths of the external branches of Yule distributed histories, where the length of an external branch is defined as the rank of its parent node. We study the multiplicity of external branches of given length in a random history of n leaves. A correspondence between the external branches of the ordered histories of size n and the non-peak entries of the permutations of size n-1 provides easy access to the length distributions of the first and second longest external branches in a random Yule history and coalescent tree of size n. The length of the longest external branch is also studied in dependence of root balance of a random tree. As a practical application, we compare the observed and expected number of mutations on the longest external branches in samples from natural populations.
Affiliation(s)
- Thomas Wiehe
- Institut für Genetik, Universität zu Köln, Germany
|
17
|
Hayati M, Chindelevitch L. Computing the distribution of the Robinson-Foulds distance. Comput Biol Chem 2020; 87:107284. [PMID: 32599459 DOI: 10.1016/j.compbiolchem.2020.107284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2020] [Accepted: 05/09/2020] [Indexed: 11/22/2022]
Abstract
With the exponential growth of genome databases, the importance of phylogenetics has increased dramatically over the past years. Studying phylogenetic trees enables us not only to understand how genes, genomes, and species evolve, but also helps us predict how they might change in the future. One of the crucial aspects of phylogenetics is the comparison of two or more phylogenetic trees. There are different metrics for computing the dissimilarity between a pair of trees. The Robinson-Foulds (RF) distance is one of the widely used metrics on the space of labeled trees. The distribution of the RF distance from a given tree has been studied before, but the fastest known algorithm for computing this distribution is a slow, albeit polynomial-time, O(l^5) algorithm. In this paper, we modify the dynamic programming algorithm for computing the distribution of this distance for a given tree by leveraging the number-theoretic transform (NTT), and improve the running time from O(l^5) to O(l^3 log l), where l is the number of tips of the tree. In addition to its practical usefulness, our method represents a theoretical novelty, as it is, to our knowledge, one of the rare applications of the number-theoretic transform for solving a computational biology problem.
Affiliation(s)
- Maryam Hayati
- Simon Fraser University, Department of Computing Science, 8888 University Avenue, Burnaby, BC V5A 1S6, Canada
- Leonid Chindelevitch
- Simon Fraser University, Department of Computing Science, 8888 University Avenue, Burnaby, BC V5A 1S6, Canada
|
18
|
Affiliation(s)
- K. Bartoszek
- Department of Computer and Information Science, Linköping University, Linköping, Sweden
|
19
|
Scale-invariant topology and bursty branching of evolutionary trees emerge from niche construction. Proc Natl Acad Sci U S A 2020; 117:7879-7887. [PMID: 32209672 DOI: 10.1073/pnas.1915088117] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/03/2023] Open
Abstract
Phylogenetic trees describe both the evolutionary process and community diversity. Recent work has established that they exhibit scale-invariant topology, which quantifies the fact that their branching lies in between the two extreme cases of balanced binary trees and maximally unbalanced ones. In addition, the backbones of phylogenetic trees exhibit bursts of diversification on all timescales. Here, we present a simple, coarse-grained statistical model of niche construction coupled to speciation. Finite-size scaling analysis of the dynamics shows that the resultant phylogenetic tree topology is scale-invariant due to a singularity arising from large niche construction fluctuations that follow extinction events. The same model recapitulates the bursty pattern of diversification in time. These results show how dynamical scaling laws of phylogenetic trees on long timescales can reflect the indelible imprint of the interplay between ecological and evolutionary processes.
|
20
|
Ding L, Liao J, Liu N. The uplift of the Qinghai-Tibet Plateau and glacial oscillations triggered the diversification of Tetraogallus (Galliformes, Phasianidae). Ecol Evol 2020; 10:1722-1736. [PMID: 32076546 PMCID: PMC7029067 DOI: 10.1002/ece3.6008] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/29/2019] [Revised: 12/12/2019] [Accepted: 12/23/2019] [Indexed: 11/11/2022] Open
Abstract
The Qinghai-Tibet Plateau (QTP) plays an important role in avian diversification. To reveal the relationship between the QTP uplift and avian diversification since the Late Cenozoic, here, we analyzed the phylogenetic relationship and biogeographical pattern of the genus Tetraogallus (Galliformes, Phasianidae) and the probable factors of speciation in the period of the QTP uplift inferred from concatenated data of four nuclear and five mitochondrial genes using the method of the Bayesian inference. Phylogenetic analysis indicated that T. himalayensis had a close relationship with T. altaicus and conflicted with the previous taxonomy of dark-bellied and white-bellied groups. The molecular clock showed that the speciation of Tetraogallus was profoundly affected by the uplift of the QTP and glacial oscillations. Biogeographic analysis suggested that the extant snowcocks originated from the QTP, and the QTP uplift and glacial oscillations triggered the diversification of Tetraogallus ancestor. Specifically, the uplift of the mountain provided a prerequisite for the colonization of snowcocks Tetraogallus as a result of the collision between the Indian and the Arab plates and the Eurasian plate, in which ecological isolation (the glacial and interglacial periods alternate) and geographical barrier had accelerated the Tetraogallus diversification process. Interestingly, we discovered hybrids between T. tibetanus and T. himalayensis for the first time and suggested that T. tibetanus and T. himalayensis hybridized after a second contact during the glacial period. Here, we proposed that the hybrid offspring was the ancestor of the T. altaicus. In conclusion, the uplift of QTP and glacial oscillations triggered the snowcocks colonization, and then, isolation and introgression hybridization promoted diversification.
Affiliation(s)
- Li Ding
- School of Life Sciences, Lanzhou University, Lanzhou, China
- Jicheng Liao
- School of Life Sciences, Lanzhou University, Lanzhou, China
- Naifa Liu
- School of Life Sciences, Lanzhou University, Lanzhou, China
|
21
|
Hayati M, Shadgar B, Chindelevitch L. A new resolution function to evaluate tree shape statistics. PLoS One 2019; 14:e0224197. [PMID: 31751352 PMCID: PMC6874070 DOI: 10.1371/journal.pone.0224197] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/26/2019] [Accepted: 10/07/2019] [Indexed: 01/03/2023] Open
Abstract
Phylogenetic trees are frequently used in biology to study the relationships between a number of species or organisms. The shape of a phylogenetic tree contains useful information about patterns of speciation and extinction, so powerful tools are needed to investigate the shape of a phylogenetic tree. Tree shape statistics are a common approach to quantifying the shape of a phylogenetic tree by encoding it with a single number. In this article, we propose a new resolution function to evaluate the power of different tree shape statistics to distinguish between dissimilar trees. We show that the new resolution function requires less time and space in comparison with the previously proposed resolution function for tree shape statistics. We also introduce a new class of tree shape statistics, which are linear combinations of two existing statistics that are optimal with respect to a resolution function, and show evidence that the statistics in this class converge to a limiting linear combination as the size of the tree increases. Our implementation is freely available at https://github.com/WGS-TB/TreeShapeStats.
Affiliation(s)
- Maryam Hayati
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Bita Shadgar
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
|
22
|
Banguela-Castillo A, Ramos-González PL, Peña-Marey M, Godoy CV, Harakava R. An updated phylogenetic classification of Corynespora cassiicola isolates and a practical approach to their identification based on the nucleotide polymorphisms at the ga4 and caa5 loci. Mycologia 2019; 112:24-38. [PMID: 31750788 DOI: 10.1080/00275514.2019.1670018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
Abstract
Corynespora cassiicola (Burk. & M.A. Curtis) C.T. Wei is an anamorphic fungus that affects more than 530 plant species, including economically important crops. Several lineages of this pathogen have been recognized, but the classification of isolates into clades is time-consuming and still sometimes leads to unclear results. In this work, eight major phylogenetic clades (PhL1-PhL8) including 245 isolates of C. cassiicola from 44 plant species were established based on a Bayesian inference analysis of four combined C. cassiicola genomic loci retrieved from GenBank, i.e., rDNA internal transcribed spacer (ITS), actin-1, ga4, and caa5. The existence of PhL1-PhL5 and PhL7 as clonal lineages was further confirmed through the analysis of full-genome single-nucleotide polymorphisms of 39 isolates. Haplotypes of the caa5 locus were PhL specific and encode isoforms of the LDB19 domain of a putative α-arrestin N-terminal-like protein. Evolution of the Caa5 arrestin is in correspondence with the PhLs. ga4 and caa5 PhL consensus sequences and a cleaved amplified polymorphic sequence (CAPS) procedure were generated based on the conserved nucleotide sequences and enzyme restriction patterns observed among isolates from the same lineage, respectively. The CAPS method was validated in silico, and its practical use allowed us to differentiate between tomato and papaya isolates, as well as to reveal the prevalence of PhL1 among isolates infecting soybean in Brazil. This novel approach could be useful in the efforts to control the diseases associated with C. cassiicola.
Affiliation(s)
- Alexander Banguela-Castillo
- Phytopathology and Plant Biochemistry Laboratory, Instituto Biológico de São Paulo, Avenida Conselheiro Rodrigues Alves, 1252 Vila Mariana, CEP 04014-900, São Paulo, São Paulo, Brazil
- Pedro L Ramos-González
- Phytopathology and Plant Biochemistry Laboratory, Instituto Biológico de São Paulo, Avenida Conselheiro Rodrigues Alves, 1252 Vila Mariana, CEP 04014-900, São Paulo, São Paulo, Brazil
- Mabel Peña-Marey
- Microbiology and Bacteriology Laboratory, St. Joseph's Hospital, 3001 W Martin Luther King Jr. Boulevard, Tampa, Florida 33607; Instituto de Investigaciones en Fruticultura Tropical, Avenida 7ma 3005, Playa, La Habana 10500, Cuba
- Claudia V Godoy
- Embrapa Soja, Rodovia Carlos João Strass, s/nº Acesso Orlando Amaral, Distrito de Warta, Caixa Postal: 231, CEP: 86001-970, Londrina, Paraná, Brazil
- Ricardo Harakava
- Phytopathology and Plant Biochemistry Laboratory, Instituto Biológico de São Paulo, Avenida Conselheiro Rodrigues Alves, 1252 Vila Mariana, CEP 04014-900, São Paulo, São Paulo, Brazil
|
23
|
The Evolving Moran Genealogy. Theor Popul Biol 2019; 130:94-105. [PMID: 31330138 DOI: 10.1016/j.tpb.2019.07.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/04/2018] [Revised: 06/24/2019] [Accepted: 07/05/2019] [Indexed: 11/21/2022]
Abstract
We study the evolution of the population genealogy in the classic neutral Moran Model of finite size n ∈ N and in discrete time. The stochastic transformations that shape a Moran population can be realized directly on its genealogy and give rise to a process on a state space consisting of n-sized binary increasing trees. We derive a number of properties of this process, and show that they are in agreement with existing results on the infinite-population limit of the Moran Model. Most importantly, this process admits time reversal, which makes it possible to simplify the mechanisms determining state changes, and allows for a thorough investigation of the Most Recent Common Ancestor process.
|
24
|
Hamilton MJ, Walker RS. Nonlinear diversification rates of linguistic phylogenies over the Holocene. PLoS One 2019; 14:e0213126. [PMID: 31314806 PMCID: PMC6636708 DOI: 10.1371/journal.pone.0213126] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/12/2019] [Accepted: 06/20/2019] [Indexed: 11/25/2022] Open
Abstract
The expansion of the human species out of Africa in the Pleistocene, and the subsequent development of agriculture in the Holocene, resulted in waves of linguistic diversification and replacement across the planet. Analogous to the growth of populations or the speciation of biological organisms, languages diversify over time to form phylogenies of language families. However, the dynamics of this diversification process are unclear. Bayesian methods applied to lexical and phonetic data have created dated linguistic phylogenies for 18 language families encompassing ~3,000 of the world's ~7,000 extant languages. In this paper we use these phylogenies to quantify how fast languages expand and diversify through time both within and across language families. The overall diversification rate of languages in our sample is ~0.001 yr^-1 (or a doubling time of ~700 yr) over the last 6,000 years with evidence for nonlinear dynamics in language diversification rates over time, where both within and across language families, diversity initially increases rapidly and then slows. The expansion, evolution, and diversification of languages as they spread around the planet was a non-constant process.
Affiliation(s)
- Marcus J. Hamilton
- Department of Anthropology, University of Texas at San Antonio, San Antonio, TX, United States of America
- Santa Fe Institute, Santa Fe, NM, United States of America
- Robert S. Walker
- Department of Anthropology, University of Missouri, Columbia, MO, United States of America
|
25
|
Sevillya G, Snir S. Synteny footprints provide clearer phylogenetic signal than sequence data for prokaryotic classification. Mol Phylogenet Evol 2019; 136:128-137. [DOI: 10.1016/j.ympev.2019.03.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/14/2018] [Revised: 03/07/2019] [Accepted: 03/17/2019] [Indexed: 01/22/2023]
|
26
|
Ding L, Liao J. Phylogeography of the Tibetan hamster Cricetulus kamensis in response to uplift and environmental change in the Qinghai-Tibet Plateau. Ecol Evol 2019; 9:7291-7306. [PMID: 31380051 PMCID: PMC6662396 DOI: 10.1002/ece3.5301] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/02/2019] [Revised: 04/28/2019] [Accepted: 05/08/2019] [Indexed: 11/16/2022] Open
Abstract
AIM: The evolutionary process of an organism provides valuable data toward an understanding of the Earth evolution history. To investigate the relationship between the uplift of the Qinghai-Tibet Plateau (QTP) and mammalian evolution since the late Cenozoic, the geographic distribution of genetic variations in the Tibetan hamster Cricetulus kamensis was investigated using phylogeographical methods. In particular, population divergence, demographic history, genetic variation, and the prediction of species distribution area were investigated. LOCATION: The Qinghai-Tibet Plateau. METHODS: A total of 53 specimens, representing 13 geographic populations, were collected from the QTP. The phylogeographical pattern and demographic history of C. kamensis were analyzed, and the probable factors in the QTP uplift and the Quaternary glacial periods were inferred from one nuclear and four mitochondrial genes. Furthermore, the species distribution model (SDM) was used to predict changes in potentially suitable habitats since the last Interglacial. RESULTS: Phylogenetic analysis demonstrated that two major genetic differentiations of the C. kamensis population occurred during the Early Pleistocene that were influenced by the Qing-Zang tectonic movement from the Middle Pliocene to the Early Pleistocene. Genetic distance between two major clades indicated low genetic divergence. Demographic history analysis showed that the C. kamensis population was affected by the Quaternary glacial period. SDM analysis indicated that C. kamensis was endemic to the QTP and the suitable habitat was affected by climate change, especially during the Last Glacial Maximum (LGM). MAIN CONCLUSION: Our results indicated that the QTP uplift led to the population divergence of C. kamensis, and vicariance well accounted for the geographic distribution of genetic variation in C. kamensis as a result of genetic divergence and lack of gene flow. The genetic distance shows that C. alticola may be a subspecies of C. kamensis. Demographic history analysis suggests that the QTP was affected by the last glacial period. SDM analysis supports that almost the entire QTP is covered by a huge ice sheet during the LGM.
Affiliation(s)
- Li Ding
- School of Life Sciences, Lanzhou University, Lanzhou, China
- Jicheng Liao
- School of Life Sciences, Lanzhou University, Lanzhou, China
|
27
|
An ancestral process with selection in an ecological community. J Theor Biol 2019; 466:128-144. [PMID: 30586554 DOI: 10.1016/j.jtbi.2018.12.032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/08/2018] [Revised: 12/17/2018] [Accepted: 12/21/2018] [Indexed: 11/20/2022]
Abstract
An ecological community is a geographical area composed of two or more species. The ancestral histories of individuals from the same and different species in an ecological community may be interconnected due to direct and indirect interactions. Here, we present a model of the ancestral history of an ecological community that is built upon the framework of coalescent and ancestral graph theory. The model includes selection, whereby the fitness of an ancestral lineage is a function of both its abiotic environment and interactions with individuals from its biotic environment. The model also allows for metacommunity structure. We first define a forward-time percolation process characterizing the evolution of an ecological community and then present its corresponding backward-time graphical model in the limit of large population sizes. Next, we present expectations of properties of phenotypes in the graph. These expectations give insight into the structure of phenotypic variation and trait-environment covariances across local communities, including the effects of drift, intra and inter-species genealogical structure and the sampling effects of selection. In addition, we derive expectations for multivariate phenotypic diversity in a community assuming neutrality and compare this to expectations with stabilizing selection.
|
28
|
Sarver BAJ, Pennell MW, Brown JW, Keeble S, Hardwick KM, Sullivan J, Harmon LJ. The choice of tree prior and molecular clock does not substantially affect phylogenetic inferences of diversification rates. PeerJ 2019; 7:e6334. [PMID: 30886768 PMCID: PMC6421065 DOI: 10.7717/peerj.6334] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/30/2018] [Accepted: 12/23/2018] [Indexed: 11/20/2022] Open
Abstract
Comparative methods allow researchers to make inferences about evolutionary processes and patterns from phylogenetic trees. In Bayesian phylogenetics, estimating a phylogeny requires specifying priors on parameters characterizing the branching process and rates of substitution among lineages, in addition to others. Accordingly, characterizing the effect of prior selection on phylogenies is an active area of research. The choice of priors may systematically bias phylogenetic reconstruction and, subsequently, affect conclusions drawn from the resulting phylogeny. Here, we focus on the impact of priors in Bayesian phylogenetic inference and evaluate how they affect the estimation of parameters in macroevolutionary models of lineage diversification. Specifically, we simulate trees under combinations of tree priors and molecular clocks, simulate sequence data, estimate trees, and estimate diversification parameters (e.g., speciation and extinction rates) from these trees. When substitution rate heterogeneity is large, diversification rate estimates deviate substantially from those estimated under the simulation conditions when not captured by an appropriate choice of relaxed molecular clock. However, in general, we find that the choice of tree prior and molecular clock has relatively little impact on the estimation of diversification rates insofar as the sequence data are sufficiently informative and substitution rate heterogeneity among lineages is low-to-moderate.
Affiliation(s)
- Brice A J Sarver
- Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID, USA
- Matthew W Pennell
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Joseph W Brown
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
- Sara Keeble
- Department of Molecular and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Kayla M Hardwick
- Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID, USA
- Jack Sullivan
- Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID, USA
- Luke J Harmon
- Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID, USA
|
29
|
Molecular phylogeny of Caudofoveata (Mollusca) challenges traditional views. Mol Phylogenet Evol 2018; 132:138-150. [PMID: 30423439 DOI: 10.1016/j.ympev.2018.10.037] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/08/2018] [Revised: 07/23/2018] [Accepted: 10/30/2018] [Indexed: 11/22/2022]
Abstract
The shell-less, worm-shaped Caudofoveata (=Chaetodermomorpha) is one of the least known groups of molluscs. The taxon consists of 141 recognized species found from intertidal environments to the deep-sea where they live burrowing in sediment. Evolutionary relationships of the group have been debated, but few studies based on morphological or molecular data have investigated the phylogeny of the group. Here we use molecular phylogenetics to resolve relationships among and within families of Caudofoveata. Phylogenetic analyses were performed using selected mitochondrial and nuclear genes from species from all recognized families of Caudofoveata. In resulting trees and contrary to traditional views, Prochaetodermatidae forms the sister clade to a clade containing the other two currently recognized families, Chaetodermatidae and Limifossoridae. The monophyly of Prochaetodermatidae is highly supported, but Limifossoridae and Chaetodermatidae are not recovered as monophyletic. Most of the caudofoveate genera are also not recovered as monophyletic in our analyses. Thus results from our molecular data suggest that the current classification of Caudofoveata is in need of revision, and indicate evolutionary scenarios that differ from previously proposed hypotheses based on morphology.
|
30
|
Moshiri N, Mirarab S. A Two-State Model of Tree Evolution and Its Applications to Alu Retrotransposition. Syst Biol 2018; 67:475-489. [PMID: 29165679 DOI: 10.1093/sysbio/syx088] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/27/2017] [Accepted: 11/15/2017] [Indexed: 11/14/2022] Open
Abstract
Models of tree evolution have mostly focused on capturing the cladogenesis processes behind speciation. Processes that derive the evolution of genomic elements, such as repeats, are not necessarily captured by these existing models. In this article, we design a model of tree evolution that we call the dual-birth model, and we show how it can be useful in studying the evolution of short Alu repeats found in the human genome in abundance. The dual-birth model extends the traditional birth-only model to have two rates of propagation, one for active nodes that propagate often, and another for inactive nodes, that with a lower rate, activate and start propagating. Adjusting the ratio of the rates controls the expected tree balance. We present several theoretical results under the dual-birth model, introduce parameter estimation techniques, and study the properties of the model in simulations. We then use the dual-birth model to estimate the number of active Alu elements and their rates of propagation and activation in the human genome based on a large phylogenetic tree that we build from close to one million Alu sequences.
Affiliation(s)
- Niema Moshiri
- Bioinformatics and Systems Biology Graduate Program, UC San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
| | - Siavash Mirarab
- Department of Electrical and Computer Engineering, UC San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
|
31
|
Holicová T, Sedláček F, Mácová A, Vlček J, Robovský J. New record of Microtus mystacinus in eastern Kazakhstan: phylogeographical considerations. Zookeys 2018:67-80. [PMID: 30271235 PMCID: PMC6160783 DOI: 10.3897/zookeys.781.25359] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2018] [Accepted: 06/18/2018] [Indexed: 11/22/2022] Open
Abstract
The Eastern European vole (Microtus mystacinus) is an arvicoline rodent distributed across northern and eastern Europe, the Balkans, Turkey, Armenia, NW and N Iran, Russia as far east as the Tobol River in W Siberia, and W and N Kazakhstan. We present novel records from eastern Kazakhstan (the village of Dzhambul – 49°14'21.3"N, 86°18'29.9"E and the village of Sekisovka – 50°21'9.18"N, 82°35'46.5"E) based on mtDNA and we discuss the implications of these findings for the biogeography of eastern Kazakhstan populations. Marine Isotope Stage 11 is considered an important period for the diversification of the arvalis species group. In the context of our study, it is important to analyse genetically discontinuous Siberian populations, and the current distribution of Microtus mystacinus in new localities in eastern Kazakhstan.
Affiliation(s)
- Tereza Holicová
- Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic University of South Bohemia České Budějovice Czech Republic
| | - František Sedláček
- Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic University of South Bohemia České Budějovice Czech Republic
| | - Anna Mácová
- Department of Parasitology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic University of South Bohemia České Budějovice Czech Republic
| | - Jakub Vlček
- Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic University of South Bohemia České Budějovice Czech Republic
| | - Jan Robovský
- Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic University of South Bohemia České Budějovice Czech Republic
|
32
|
Magner A, Turowski K, Szpankowski W. Lossless Compression of Binary Trees with Correlated Vertex Names. IEEE TRANSACTIONS ON INFORMATION THEORY 2018; 64:6070-6080. [PMID: 31537945 PMCID: PMC6752213 DOI: 10.1109/tit.2018.2851224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Compression schemes for advanced data structures have become a central modern challenge. Information theory has traditionally dealt with conventional data such as text, images, or video. In contrast, most data available today is multitype and context-dependent. To meet this challenge, we have recently initiated a systematic study of advanced data structures such as unlabeled graphs [8]. In this paper, we continue this program by considering trees with statistically correlated vertex names. Trees come in many forms, but here we deal with binary plane trees (where order of subtrees matters) and their non-plane version (where order of subtrees doesn't matter). Furthermore, we assume that each name is generated by a known memoryless source (horizontal independence), but a symbol of a vertex name depends in a Markovian sense on the corresponding symbol of the parent vertex name (vertical Markovian dependency). Such a model is closely connected to models of phylogenetic trees. While in general the problem of multimodal compression and associated analysis can be extremely complicated, we find that in this natural setting, both the entropy analysis and optimal compression are analytically tractable. We evaluate the entropy for both types of trees. For the plane case, with or without vertex names, we find that a simple two-stage compression scheme is both efficient and optimal. We then present efficient and optimal compression algorithms for the more complicated non-plane case.
Affiliation(s)
- Abram Magner
- NSF Center for the Science of Information, Purdue University, West Lafayette, IN 47907
| | - Krzysztof Turowski
- NSF Center for the Science of Information, Purdue University, West Lafayette, IN 47907
- Department of Computer Science, Purdue University, IN 47907, USA
- Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Poland
| | - Wojciech Szpankowski
- NSF Center for the Science of Information, Purdue University, West Lafayette, IN 47907
|
33
|
Exact and approximate limit behaviour of the Yule tree’s cophenetic index. Math Biosci 2018; 303:26-45. [DOI: 10.1016/j.mbs.2018.05.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2017] [Revised: 04/30/2018] [Accepted: 05/04/2018] [Indexed: 11/21/2022]
|
34
|
Cherlin S, Heaps SE, Nye TMW, Boys RJ, Williams TA, Embley TM. The Effect of Nonreversibility on Inferring Rooted Phylogenies. Mol Biol Evol 2018; 35:984-1002. [PMID: 29149300 PMCID: PMC5889004 DOI: 10.1093/molbev/msx294] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
Most phylogenetic models assume that the evolutionary process is stationary and reversible. In addition to being biologically improbable, these assumptions also impair inference by generating models under which the likelihood does not depend on the position of the root. Consequently, the root of the tree cannot be inferred as part of the analysis. Yet identifying the root position is a key component of phylogenetic inference because it provides a point of reference for polarizing ancestor-descendant relationships and therefore interpreting the tree. In this paper, we investigate the effect of relaxing the unrealistic reversibility assumption and allowing the position of the root to be another unknown. We propose two hierarchical models that are centered on a reversible model but perturbed to allow nonreversibility. The models differ in the degree of structure imposed on the perturbations. The analysis is performed in the Bayesian framework using Markov chain Monte Carlo methods for which software is provided. We illustrate the performance of the two nonreversible models in analyses of simulated data using two types of topological priors. We then apply the models to a real biological data set, the radiation of polyploid yeasts, for which there is robust biological opinion about the root position. Finally, we apply the models to a second biological alignment for which the rooted tree is controversial: the ribosomal tree of life. We compare the two nonreversible models and conclude that both are useful in inferring the position of the root from real biological data.
Affiliation(s)
- Svetlana Cherlin
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Sarah E Heaps
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Tom M W Nye
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Richard J Boys
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Tom A Williams
- School of Biological Sciences, University of Bristol, Bristol, United Kingdom
| | - T Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
|
35
|
Chase EE, Robicheau BM, Veinot S, Breton S, Stewart DT. The complete mitochondrial genome of the hermaphroditic freshwater mussel Anodonta cygnea (Bivalvia: Unionidae): in silico analyses of sex-specific ORFs across order Unionoida. BMC Genomics 2018; 19:221. [PMID: 29587633 PMCID: PMC5870820 DOI: 10.1186/s12864-018-4583-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2017] [Accepted: 03/07/2018] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Doubly uniparental inheritance (DUI) of mitochondrial DNA in bivalves is a fascinating exception to strictly maternal inheritance as practiced by all other animals. Recent work on DUI suggests that there may be unique regions of the mitochondrial genomes that play a role in sex determination and/or sexual development in freshwater mussels (order Unionoida). In this study, one complete mitochondrial genome of the hermaphroditic swan mussel, Anodonta cygnea, is sequenced and compared to the complete mitochondrial genome of the gonochoric duck mussel, Anodonta anatina. An in silico assessment of novel proteins found within freshwater bivalve species (known as F-, H-, and M-open reading frames or ORFs) is conducted, with special attention to putative transmembrane domains (TMs), signal peptides (SPs), signal cleavage sites (SCS), subcellular localization, and potential control regions. Characteristics of TMs are also examined across freshwater mussel lineages. RESULTS In silico analyses suggest the presence of SPs and SCSs and provide some insight into possible function(s) of these novel ORFs. The assessed confidence in these structures and functions was highly variable, possibly due to the novelty of these proteins. The number and topology of putative TMs appear to be maintained among both F- and H-ORFs; however, this is not the case for M-ORFs. There does not appear to be a typical control region in H-type mitochondrial DNA, especially given the loss of tandem repeats in unassigned regions when compared to F-type mtDNA. CONCLUSION In silico analysis provides a useful tool to discover patterns in DUI and to navigate further in situ analyses related to DUI in freshwater mussels. In situ analysis will be necessary to further explore the intracellular localizations and possible role of these open reading frames in the process of sex determination in freshwater mussels.
Affiliation(s)
- E. E. Chase
- Department of Biology, Acadia University, Wolfville, NS Canada
| | - B. M. Robicheau
- Department of Biology, Dalhousie University, Halifax, NS Canada
| | - S. Veinot
- Department of Biology, Dalhousie University, Halifax, NS Canada
| | - S. Breton
- Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada
| | - D. T. Stewart
- Department of Biology, Acadia University, Wolfville, NS Canada
|
36
|
Hagen O, Stadler T, Price S. TreeSimGM: Simulating phylogenetic trees under general Bellman-Harris models with lineage-specific shifts of speciation and extinction in R. Methods Ecol Evol 2018; 9:754-760. [PMID: 29938014 PMCID: PMC5993341 DOI: 10.1111/2041-210x.12917] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2017] [Accepted: 10/04/2017] [Indexed: 11/28/2022]
Abstract
Understanding macroevolutionary processes using phylogenetic trees is a challenging and complex process that draws on mathematics, computer science and biology. Given the development of complex mathematical models and the growing computational processing power, simulation tools are becoming increasingly popular. In order to simulate phylogenetic trees, most evolutionary biologists are forced to build their own algorithms or use existing tools built on different platforms and/or as standalone programmes. The absence of a simulation tool accommodating user-chosen model specifications limits, amongst others, model testing and pipelining with approximate Bayesian computation methods or other subsequent statistical analysis. We introduce "TreeSimGM," an R package simulation tool for phylogenetic trees under a general Bellman and Harris model. This package allows the user to specify any desired probability distribution for the waiting times until speciation and extinction (e.g. age-dependent speciation/extinction). Upon speciation, the user can specify whether one descendant species corresponds to the ancestor species inheriting its age or whether both descendant species are new species of age 0. Moreover, it is possible to scale the waiting time to speciation/extinction for newly formed species. Thus, "TreeSimGM" not only allows the user to simulate stochastic phylogenetic trees assuming several popular existing models, such as the Yule model, the constant-rate birth-death model, and proportional to distinguishable arrangement models, but it also allows the user to formulate new models for exploration. A short explanation of the supported models and a few examples of how to use our package are presented here. As an R package, "TreeSimGM" allows flexible and powerful stochastic phylogenetic tree simulations. Moreover, it facilitates the pipelining of outputs or inputs with other functions in R. "TreeSimGM" contributes to the tools available to the R community in the fields of ecology and evolution, is freely available under the GPL-2 licence and can be downloaded at https://cran.r-project.org/web/packages/TreeSimGM.
Affiliation(s)
- Oskar Hagen
- Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
- Landscape Ecology, Institute of Terrestrial Ecosystems, ETH Zurich, Zurich, Switzerland
| | - Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
|
37
|
Cardona G, Mir A, Rosselló F, Rotger L. The expected value of the squared cophenetic metric under the Yule and the uniform models. Math Biosci 2017; 295:73-85. [PMID: 29155134 DOI: 10.1016/j.mbs.2017.11.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2017] [Revised: 11/09/2017] [Accepted: 11/14/2017] [Indexed: 11/26/2022]
Abstract
The cophenetic metrics $d_{\varphi,p}$, for p ∈ {0} ∪ [1, ∞), are a recent addition to the kit of available distances for the comparison of phylogenetic trees. Based on a fifty-year-old idea of Sokal and Rohlf, these metrics compare phylogenetic trees on the same set of taxa by encoding them by means of their vectors of cophenetic values of pairs of taxa and depths of single taxa, and then computing the $L^p$ norm of the difference of the corresponding vectors. In this paper we compute the expected value of the square of $d_{\varphi,2}$ on the space of fully resolved rooted phylogenetic trees with n leaves, under the Yule and the uniform probability distributions.
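The encoding described in this abstract can be made concrete with a short example. The following is a sketch under stated assumptions: trees are unweighted nested tuples of leaf labels, depth is counted in edges from the root, the cophenetic value of a pair of taxa is the depth of their most recent common ancestor, and p = 0 is taken to mean the number of differing vector entries. The function names `cophenetic_vector` and `cophenetic_distance` are hypothetical.

```python
def cophenetic_vector(tree):
    """Map each unordered pair of taxa (and each single taxon) to a depth.

    (a, b) with a != b -> depth of the most recent common ancestor,
    (a, a)             -> depth of leaf a.
    Depth is the number of edges from the root (an assumption of this sketch).
    """
    values = {}

    def visit(node, depth):
        if not isinstance(node, tuple):
            values[(node, node)] = depth
            return [node]
        groups = [visit(child, depth + 1) for child in node]
        # Leaves from different child subtrees have this node as their MRCA.
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                for a in groups[i]:
                    for b in groups[j]:
                        values[tuple(sorted((a, b)))] = depth
        return [leaf for group in groups for leaf in group]

    visit(tree, 0)
    return values


def cophenetic_distance(tree1, tree2, p=2):
    """L^p norm of the difference of the two cophenetic vectors."""
    v1, v2 = cophenetic_vector(tree1), cophenetic_vector(tree2)
    if v1.keys() != v2.keys():
        raise ValueError("trees must be on the same set of taxa")
    diffs = [abs(v1[key] - v2[key]) for key in v1]
    if p == 0:
        return sum(1 for d in diffs if d != 0)
    return sum(d ** p for d in diffs) ** (1.0 / p)


balanced = (("a", "b"), ("c", "d"))
ladder = ((("a", "b"), "c"), "d")
print(cophenetic_distance(balanced, ladder, p=2) ** 2)  # squared distance for p = 2
```

For these two four-taxon trees, seven of the ten vector entries differ by exactly one, so the squared distance for p = 2 is 7 (up to floating-point rounding).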
Affiliation(s)
- Gabriel Cardona
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma de Mallorca, E-07122 Spain.
| | - Arnau Mir
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma de Mallorca, E-07122 Spain.
| | - Francesc Rosselló
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma de Mallorca, E-07122 Spain.
| | - Lucía Rotger
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma de Mallorca, E-07122 Spain.
|
38
|
Disanto F, Rosenberg NA. Enumeration of Ancestral Configurations for Matching Gene Trees and Species Trees. J Comput Biol 2017; 24:831-850. [PMID: 28437136 PMCID: PMC5610458 DOI: 10.1089/cmb.2016.0159] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Given a gene tree and a species tree, ancestral configurations represent the combinatorially distinct sets of gene lineages that can reach a given node of the species tree. They have been introduced as a data structure for use in the recursive computation of the conditional probability under the multispecies coalescent model of a gene tree topology given a species tree, the cost of this computation being affected by the number of ancestral configurations of the gene tree in the species tree. For matching gene trees and species trees, we obtain enumerative results on ancestral configurations. We study ancestral configurations in balanced and unbalanced families of trees determined by a given seed tree, showing that for seed trees with more than one taxon, the number of ancestral configurations increases for both families exponentially in the number of taxa n. For fixed n, the maximal number of ancestral configurations tabulated at the species tree root node and the largest number of labeled histories possible for a labeled topology occur for trees with precisely the same unlabeled shape. For ancestral configurations at the root, the maximum increases with [Formula: see text], where [Formula: see text] is a quadratic recurrence constant. Under a uniform distribution over the set of labeled trees of given size, the mean number of root ancestral configurations grows with [Formula: see text] and the variance with ∼[Formula: see text]. The results provide a contribution to the combinatorial study of gene trees and species trees.
Affiliation(s)
- Filippo Disanto
- Department of Biology, Stanford University, Stanford, California
| | - Noah A Rosenberg
- Department of Biology, Stanford University, Stanford, California
|
39
|
Requeno JI, Colom JM. Evaluation of properties over phylogenetic trees using stochastic logics. BMC Bioinformatics 2016; 17:235. [PMID: 27301397 PMCID: PMC4908722 DOI: 10.1186/s12859-016-1077-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2015] [Accepted: 05/07/2016] [Indexed: 12/02/2022] Open
Abstract
Background Model checking has been recently introduced as an integrated framework for extracting information from phylogenetic trees using temporal logics as a querying language, an extension of modal logics that imposes restrictions of a boolean formula along a path of events. The phylogenetic tree is considered a transition system modeling the evolution as a sequence of genomic mutations (we understand mutation as different ways that DNA can be changed), while this kind of logic is suitable for traversing it in a strict and exhaustive way. Given a biological property that we desire to inspect over the phylogeny, the verifier returns true if the specification is satisfied or a counterexample that falsifies it. However, this approach has been only considered over qualitative aspects of the phylogeny. Results In this paper, we repair the limitations of the previous framework for including and handling quantitative information such as explicit time or probability. To this end, we apply current probabilistic continuous-time extensions of model checking to phylogenetics. We reinterpret a catalog of qualitative properties in a numerical way, and we also present new properties that couldn’t be analyzed before. For instance, we obtain the likelihood of a tree topology according to a mutation model. As a case study, we analyze several phylogenies in order to obtain the maximum likelihood with the model checking tool PRISM. In addition, we have adapted the software for optimizing the computation of maximum likelihoods. Conclusions We have shown that probabilistic model checking is a competitive framework for describing and analyzing quantitative properties over phylogenetic trees. This formalism adds soundness and readability to the definition of models and specifications. Besides, the existence of model checking tools hides the underlying technology, sparing biologists the extension, upgrade, debugging and maintenance of a software tool. A set of benchmarks justifies the feasibility of our approach.
Affiliation(s)
- José Ignacio Requeno
- Department of Computer Science and Systems Engineering (DIIS), Universidad de Zaragoza, C/ María de Luna 1, Zaragoza, 50018, Spain.
| | - José Manuel Colom
- Department of Computer Science and Systems Engineering (DIIS), Universidad de Zaragoza, C/ María de Luna 1, Zaragoza, 50018, Spain
|
40
|
Towards sub-quadratic time and space complexity solutions for the dated tree reconciliation problem. Algorithms Mol Biol 2016; 11:15. [PMID: 27213010 PMCID: PMC4875752 DOI: 10.1186/s13015-016-0077-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2015] [Accepted: 05/03/2016] [Indexed: 11/10/2022] Open
Abstract
Background Recent coevolutionary analysis has considered tree topology as a means to reduce the asymptotic complexity associated with inferring the complex coevolutionary interrelationships that arise between phylogenetic trees. Targeted algorithmic design for specific tree topologies has to date been highly successful, with one recent formulation providing a logarithmic space complexity reduction for the dated tree reconciliation problem. Methods In this work we build on this prior analysis providing a further asymptotic space reduction, by providing a new formulation for the dynamic programming table used by a number of popular coevolutionary analysis techniques. This model gives rise to a sub quadratic running time solution for the dated tree reconciliation problem for selected tree topologies, and is shown to be, in practice, the fastest method for solving the dated tree reconciliation problem for expected evolutionary trees. This result is achieved through the analysis of not only the topology of the trees considered for coevolutionary analysis, but also the underlying structure of the dynamic programming algorithms that are traditionally applied to such analysis. Conclusion The newly inferred theoretical complexity bounds introduced herein are then validated using a combination of synthetic and biological data sets, where the proposed model is shown to provide an $O(\sqrt{n})$ space saving, while it is observed to run in half the time compared to the fastest known algorithm for solving the dated tree reconciliation problem. What is even more significant is that the algorithm derived herein is able to guarantee the optimality of its inferred solution, something that algorithms of comparable speed have to date been unable to achieve.
|
41
|
Abstract
We consider a stochastic evolutionary model for a phenotype developing amongst n related species with unknown phylogeny. The unknown tree is modelled by a Yule process conditioned on n contemporary nodes. The trait value is assumed to evolve along lineages as an Ornstein-Uhlenbeck process. As a result, the trait values of the n species form a sample with dependent observations. We establish three limit theorems for the sample mean corresponding to three domains for the adaptation rate. In the case of fast adaptation, we show that for large n the normalized sample mean is approximately normally distributed. Using these limit theorems, we develop novel confidence interval formulae for the optimal trait value.
|
43
|
Plazzotta G, Kwan C, Boyd M, Colijn C. Effects of memory on the shapes of simple outbreak trees. Sci Rep 2016; 6:21159. [PMID: 26888437 PMCID: PMC4758066 DOI: 10.1038/srep21159] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2015] [Accepted: 01/07/2016] [Indexed: 12/15/2022] Open
Abstract
Genomic tools, including phylogenetic trees derived from sequence data, are increasingly used to understand outbreaks of infectious diseases. One challenge is to link phylogenetic trees to patterns of transmission. Particularly in bacteria that cause chronic infections, this inference is affected by variable infectious periods and infectivity over time. It is known that non-exponential infectious periods can have substantial effects on pathogens’ transmission dynamics. Here we ask how this non-Markovian nature of an outbreak process affects the branching trees describing that process, with particular focus on tree shapes. We simulate Crump-Mode-Jagers branching processes and compare different patterns of infectivity over time. We find that memory (non-Markovian-ness) in the process can have a pronounced effect on the shapes of the outbreak’s branching pattern. However, memory also has a pronounced effect on the sizes of the trees, even when the duration of the simulation is fixed. When the sizes of the trees are constrained to a constant value, memory in our processes has little direct effect on tree shapes, but can bias inference of the birth rate from trees. We compare simulated branching trees to phylogenetic trees from an outbreak of tuberculosis in Canada, and discuss the relevance of memory to this dataset.
Affiliation(s)
| | - Christopher Kwan
- Department of Electrical and Electronic Engineering, Imperial College London, London, UK
| | - Michael Boyd
- Department of Mathematics, University of Cambridge, Cambridge, UK
| | - Caroline Colijn
- Department of Mathematics, Imperial College London, London, UK
|
44
|
Comparative phylogeography of Meriones meridianus, Dipus sagitta, and Allactaga sibirica: Potential indicators of the impact of the Qinghai-Tibetan Plateau uplift. Mamm Biol 2016. [DOI: 10.1016/j.mambio.2015.05.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
45
|
Ashkani J, Rees DJG. A Comprehensive Study of Molecular Evolution at the Self-Incompatibility Locus of Rosaceae. J Mol Evol 2015; 82:128-45. [PMID: 26714486 DOI: 10.1007/s00239-015-9726-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2015] [Accepted: 12/16/2015] [Indexed: 10/22/2022]
Abstract
The family Rosaceae includes a range of important fruit trees, most of which have the S-RNase-based self-incompatibility (SI). Several models have been developed to explain how pollen (SLF) and pistil (S-RNase) components of the S-locus interact. It was discovered in 2010 that additional SLF proteins are involved in pollen specificity, and a Collaborative Non-Self Recognition model has been proposed for SI in Solanaceae; however, the validity of such a model remains to be elucidated for other species. The results of this study support the divergent evolution of the S-locus genes from two Rosaceae subfamilies, Prunoideae/Amygdaloideae and Maloideae. The difference identified in the selective pressures between the two lineages provides evidence for positive selection at specific sites in both the S-RNase and the SLF proteins. The evolutionary findings of this study support the role of multiple SLF proteins leading to a Collaborative Non-Self Recognition model for SI in the Maloideae. Furthermore, the identification of the sites responsible for SI specificity determination and the mapping of these sites onto the modelled tertiary structure of ancestor proteins provide useful information for rational functional redesign and protein engineering for the future engineering of new functional alleles providing increased diversity in the SI system in the Maloideae.
Affiliation(s)
- Jahanshah Ashkani
- Biotechnology Department, University of the Western Cape, Private Bag X17, Bellville, 7535, South Africa
- Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa
| | - D J G Rees
- Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa
|
46
|
Abstract
Computational phylogenetics is in the process of revolutionizing historical linguistics. Recent applications have shed new light on controversial issues, such as the location and time depth of language families and the dynamics of their spread. So far, these approaches have been limited to single-language families because they rely on a large body of expert cognacy judgments or grammatical classifications, which is currently unavailable for most language families. The present study pursues a different approach. Starting from raw phonetic transcription of core vocabulary items from very diverse languages, it applies weighted string alignment to track both phonetic and lexical change. Applied to a collection of ∼1,000 Eurasian languages and dialects, this method, combined with phylogenetic inference, leads to a classification in excellent agreement with established findings of historical linguistics. Furthermore, it provides strong statistical support for several putative macrofamilies contested in current historical linguistics. In particular, there is a solid signal for the Nostratic/Eurasiatic macrofamily.
|
47
|
Bartoszek K, Sagitov S. A consistent estimator of the evolutionary rate. J Theor Biol 2015; 371:69-78. [DOI: 10.1016/j.jtbi.2015.01.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2014] [Revised: 01/14/2015] [Accepted: 01/18/2015] [Indexed: 11/25/2022]
|
48
|
Sheinman M, Massip F, Arndt PF. Statistical properties of pairwise distances between leaves on a random Yule tree. PLoS One 2015; 10:e0120206. [PMID: 25826216 PMCID: PMC4380457 DOI: 10.1371/journal.pone.0120206] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2014] [Accepted: 01/20/2015] [Indexed: 11/24/2022] Open
Abstract
A Yule tree is the result of a branching process with constant birth and death rates. Such a process serves as an instructive null model of many empirical systems, for instance, the evolution of species leading to a phylogenetic tree. However, often in phylogeny the only available information is the pairwise distances between a small fraction of extant species representing the leaves of the tree. In this article we study statistical properties of the pairwise distances in a Yule tree. Using a method based on a recursion, we derive an exact, analytic and compact formula for the expected number of pairs separated by a certain time distance. This number turns out to follow an increasing exponential function. This property of a Yule tree can serve as a simple test for empirical data to be well described by a Yule process. We further use this recursive method to calculate the expected number of the n-most closely related pairs of leaves and the number of cherries separated by a certain time distance. To make our results more useful for realistic scenarios, we explicitly take into account that the leaves of a tree may be incompletely sampled and derive a criterion for poorly sampled phylogenies. We show that our result can account for empirical data, using two families of bird species.
Affiliation(s)
- Michael Sheinman: Max Planck Institute for Molecular Genetics, Berlin, Germany
- Florian Massip: Max Planck Institute for Molecular Genetics, Berlin, Germany; INRA, UR1077 Unite Mathematique Informatique et Genome, Jouy-en-Josas, France
- Peter F. Arndt: Max Planck Institute for Molecular Genetics, Berlin, Germany
|
49
|
Hagen O, Hartmann K, Steel M, Stadler T. Age-dependent speciation can explain the shape of empirical phylogenies. Syst Biol 2015; 64:432-40. [PMID: 25575504 PMCID: PMC4395845 DOI: 10.1093/sysbio/syv001] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Received: 07/08/2014] [Accepted: 01/02/2015] [Indexed: 11/12/2022] Open
Abstract
Tens of thousands of phylogenetic trees, describing the evolutionary relationships between hundreds of thousands of taxa, are readily obtainable from various databases. From such trees, inferences can be made about the underlying macroevolutionary processes, yet remarkably these processes are still poorly understood. Simple and widely used evolutionary null models are problematic: Empirical trees show very different imbalance between the sizes of the daughter clades of ancestral taxa compared to what models predict. Obtaining a simple evolutionary model that is both biologically plausible and produces the imbalance seen in empirical trees is a challenging problem, to which none of the existing models provide a satisfying answer. Here we propose a simple, biologically plausible macroevolutionary model in which the rate of speciation decreases with species age, whereas extinction rates can vary quite generally. We show that this model provides a remarkable fit to the thousands of trees stored in the online database TreeBase. The biological motivation for the identified age-dependent speciation process may be that recently evolved taxa often colonize new regions or niches and may initially experience little competition. These new taxa are thus more likely to give rise to further new taxa than a taxon that has remained largely unchanged and is, therefore, well adapted to its niche. We show that age-dependent speciation may also be the result of different within-species populations following the same laws of lineage splitting to produce new species. As the fit of our model to the tree database shows, this simple biological motivation provides an explanation for a long standing problem in macroevolution.
Affiliation(s)
- Oskar Hagen, Klaas Hartmann, Mike Steel, Tanja Stadler: Institute of Integrative Biology, ETH Zürich, Universitätsstr. 16, 8092 Zürich, Switzerland; Institute for Marine and Antarctic Studies, University of Tasmania, Private Bag 49, Hobart, Tasmania 7001, Australia; Allan Wilson Centre for Molecular Ecology and Evolution, Biomathematics Research Centre, University of Canterbury, Christchurch 8140, New Zealand; and Department of Biosystems Science and Engineering, ETH Zürich, Mattenstrasse 26, 4058 Basel, Switzerland
|
50
|
Drinkwater B, Charleston MA. Introducing TreeCollapse: a novel greedy algorithm to solve the cophylogeny reconstruction problem. BMC Bioinformatics 2014; 15 Suppl 16:S14. [PMID: 25521705 PMCID: PMC4290644 DOI: 10.1186/1471-2105-15-s16-s14] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Cophylogeny mapping is used to uncover deep coevolutionary associations between two or more phylogenetic histories at a macro coevolutionary scale. As cophylogeny mapping is NP-Hard, this technique relies heavily on heuristics to solve all but the most trivial cases. One notable approach utilises a metaheuristic to search only a subset of the exponential number of fixed node orderings possible for the phylogenetic histories in question. This is of particular interest as it is the only known heuristic that guarantees biologically feasible solutions. This has enabled research to focus on larger coevolutionary systems, such as coevolutionary associations between figs and their pollinator wasps, including over 200 taxa. Although able to converge on solutions for problem instances of this size, a reduction from the current cubic running time is required to handle larger systems, such as Wolbachia and their insect hosts. RESULTS: Rather than solving this underlying problem optimally, this work presents a greedy algorithm called TreeCollapse, which uses common topological patterns to recover an approximation of the coevolutionary history where the internal node ordering is fixed. This approach offers a significant speed-up compared to previous methods, running in linear time. This algorithm has been applied to over 100 well-known coevolutionary systems, converging on Pareto optimal solutions in over 68% of test cases, even where in some cases the Pareto optimal solution has not previously been recoverable. Further, while TreeCollapse applies a local search technique, it can guarantee solutions are biologically feasible, making this the fastest method that can provide such a guarantee. CONCLUSION: As a result, we argue that the newly proposed algorithm is a valuable addition to the field of coevolutionary research. Not only does it offer a significantly faster method to estimate the cost of cophylogeny mappings, but by using this approach in conjunction with existing heuristics, it can assist in recovering a larger subset of the Pareto front than has previously been possible.
Affiliation(s)
- Benjamin Drinkwater: School of Information Technologies, 1 Cleveland St, 2006 University of Sydney, Australia
- Michael A Charleston: School of Information Technologies, 1 Cleveland St, 2006 University of Sydney, Australia
|