1
|
Warnow T, Tabatabaee Y, Evans SN. Advances in Estimating Level-1 Phylogenetic Networks from Unrooted SNPs. J Comput Biol 2025; 32:3-27. [PMID: 39582425 DOI: 10.1089/cmb.2024.0710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2024] Open
Abstract
We address the problem of how to estimate a phylogenetic network when given single-nucleotide polymorphisms (i.e., SNPs, or bi-allelic markers that have evolved under the infinite sites assumption). We focus on level-1 phylogenetic networks (i.e., networks where the cycles are node-disjoint), since more complex networks are unidentifiable. We provide a polynomial time quartet-based method that we prove correct for reconstructing the semi-directed level-1 phylogenetic network N, if we are given a set of SNPs that covers all the bipartitions of N, even if the ancestral state is not known, provided that the cycles are of length at least 5; we also prove that an algorithm developed by Dan Gusfield in the Journal of Computer and System Sciences in 2005 correctly recovers semi-directed level-1 phylogenetic networks in polynomial time in this case. We present a stochastic model for DNA evolution, and we prove that the two methods (our quartet-based method and Gusfield's method) are statistically consistent estimators of the semi-directed level-1 phylogenetic network. For the case of multi-state homoplasy-free characters, we prove that our quartet-based method correctly constructs semi-directed level-1 networks under the required conditions (all cycles of length at least five), while Gusfield's algorithm cannot be used in that case. These results assume that we have access to an oracle for indicating which sites in the DNA alignment are homoplasy-free, and we show that the methods are robust, under some conditions, to oracle errors.
Collapse
Affiliation(s)
- Tandy Warnow
- Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
| | - Yasamin Tabatabaee
- Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
| | - Steven N Evans
- Department of Statistics, University of California at Berkeley, Berkeley, California, USA
| |
Collapse
|
2
|
Suzuki T, Guo H, Hayamizu M. Bridging Between Deviation Indices for Non-Tree-Based Phylogenetic Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:2226-2234. [PMID: 39250364 DOI: 10.1109/tcbb.2024.3456575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/11/2024]
Abstract
Phylogenetic networks are a useful model that can represent reticulate evolution and complex biological data. In recent years, mathematical and computational aspects of tree-based networks have been well studied. However, not all phylogenetic networks are tree-based, so it is meaningful to consider how close a given network is to being tree-based; Francis-Steel-Semple (2018) proposed several different indices to measure the degree of deviation of a phylogenetic network from being tree-based. One is the minimum number of leaves that need to be added to convert a given network to tree-based, and another is the number of vertices that are not included in the largest subtree covering its leaf-set. Both values are zero if and only if the network is tree-based. Both deviation indices can be computed efficiently, but the relationship between the above two is unknown, as each has been studied using different approaches. In this study, we derive a tight inequality for the values of the two measures and also give a characterisation of phylogenetic networks such that they coincide. This characterisation yields a new efficient algorithm for the Maximum Covering Subtree Problem based on the maximal zig-zag trail decomposition.
Collapse
|
3
|
Gross E, Krone R, Martin S. Dimensions of Level-1 Group-Based Phylogenetic Networks. Bull Math Biol 2024; 86:90. [PMID: 38886260 PMCID: PMC11182832 DOI: 10.1007/s11538-024-01314-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Accepted: 05/15/2024] [Indexed: 06/20/2024]
Abstract
Phylogenetic networks represent evolutionary histories of sets of taxa where horizontal evolution or hybridization has occurred. Placing a Markov model of evolution on a phylogenetic network gives a model that is particularly amenable to algebraic study by representing it as an algebraic variety. In this paper, we give a formula for the dimension of the variety corresponding to a triangle-free level-1 phylogenetic network under a group-based evolutionary model. On our way to this, we give a dimension formula for codimension zero toric fiber products. We conclude by illustrating applications to identifiability.
Collapse
Affiliation(s)
- Elizabeth Gross
- Department of Mathematics, University of Hawai'i at Mānoa, Mānoa, HI, USA
| | - Robert Krone
- Department of Mathematics, UC Davis, Davis, CA, USA
| | | |
Collapse
|
4
|
Francis A, Marchei D, Steel M. Phylogenetic network classes through the lens of expanding covers. J Math Biol 2024; 88:58. [PMID: 38584237 PMCID: PMC10999392 DOI: 10.1007/s00285-024-02075-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2023] [Revised: 11/18/2023] [Accepted: 03/05/2024] [Indexed: 04/09/2024]
Abstract
It was recently shown that a large class of phylogenetic networks, the 'labellable' networks, is in bijection with the set of 'expanding' covers of finite sets. In this paper, we show how several prominent classes of phylogenetic networks can be characterised purely in terms of properties of their associated covers. These classes include the tree-based, tree-child, orchard, tree-sibling, and normal networks. In the opposite direction, we give an example of how a restriction on the set of expanding covers can define a new class of networks, which we call 'spinal' phylogenetic networks.
Collapse
Affiliation(s)
- Andrew Francis
- Centre for Research in Mathematics and Data Science, Western Sydney University, Sydney, Australia.
- School of Mathematics and Statistics, University of New South Wales, Sydney, Australia.
| | - Daniele Marchei
- Centre for Research in Mathematics and Data Science, Western Sydney University, Sydney, Australia
- Computer Science, University of Camerino, Camerino, Italy
| | - Mike Steel
- Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand
| |
Collapse
|
5
|
López Sánchez A, Lafond M. Predicting horizontal gene transfers with perfect transfer networks. Algorithms Mol Biol 2024; 19:6. [PMID: 38321476 PMCID: PMC10848447 DOI: 10.1186/s13015-023-00242-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Accepted: 10/25/2023] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND Horizontal gene transfer inference approaches are usually based on gene sequences: parametric methods search for patterns that deviate from a particular genomic signature, while phylogenetic methods use sequences to reconstruct the gene and species trees. However, it is well-known that sequences have difficulty identifying ancient transfers since mutations have enough time to erase all evidence of such events. In this work, we ask whether character-based methods can predict gene transfers. Their advantage over sequences is that homologous genes can have low DNA similarity, but still have retained enough important common motifs that allow them to have common character traits, for instance the same functional or expression profile. A phylogeny that has two separate clades that acquired the same character independently might indicate the presence of a transfer even in the absence of sequence similarity. OUR CONTRIBUTIONS We introduce perfect transfer networks, which are phylogenetic networks that can explain the character diversity of a set of taxa under the assumption that characters have unique births, and that once a character is gained it is rarely lost. Examples of such traits include transposable elements, biochemical markers and emergence of organelles, just to name a few. We study the differences between our model and two similar models: perfect phylogenetic networks and ancestral recombination networks. Our goals are to initiate a study on the structural and algorithmic properties of perfect transfer networks. We then show that in polynomial time, one can decide whether a given network is a valid explanation for a set of taxa, and show how, for a given tree, one can add transfer edges to it so that it explains a set of taxa. We finally provide lower and upper bounds on the number of transfers required to explain a set of taxa, in the worst case.
Collapse
Affiliation(s)
| | - Manuel Lafond
- Department of Computer Science, Université de Sherbrooke, Sherbrooke, Canada
| |
Collapse
|
6
|
Hellmuth M, Schaller D, Stadler PF. Clustering systems of phylogenetic networks. Theory Biosci 2023; 142:301-358. [PMID: 37573261 PMCID: PMC10564800 DOI: 10.1007/s12064-023-00398-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Accepted: 06/25/2023] [Indexed: 08/14/2023]
Abstract
Rooted acyclic graphs appear naturally when the phylogenetic relationship of a set X of taxa involves not only speciations but also recombination, horizontal transfer, or hybridization that cannot be captured by trees. A variety of classes of such networks have been discussed in the literature, including phylogenetic, level-1, tree-child, tree-based, galled tree, regular, or normal networks as models of different types of evolutionary processes. Clusters arise in models of phylogeny as the sets [Formula: see text] of descendant taxa of a vertex v. The clustering system [Formula: see text] comprising the clusters of a network N conveys key information on N itself. In the special case of rooted phylogenetic trees, T is uniquely determined by its clustering system [Formula: see text]. Although this is no longer true for networks in general, it is of interest to relate properties of N and [Formula: see text]. Here, we systematically investigate the relationships of several well-studied classes of networks and their clustering systems. The main results are correspondences of classes of networks and clustering systems of the following form: If N is a network of type [Formula: see text], then [Formula: see text] satisfies [Formula: see text], and conversely if [Formula: see text] is a clustering system satisfying [Formula: see text] then there is network N of type [Formula: see text] such that [Formula: see text].This, in turn, allows us to investigate the mutual dependencies between the distinct types of networks in much detail.
Collapse
Affiliation(s)
- Marc Hellmuth
- Department of Mathematics, Faculty of Science, Stockholm University, Albanovägen 28, 10691 Stockholm, Sweden
| | - David Schaller
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany
| | - Peter F. Stadler
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany
- Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Vienna, Austria
- Facultad de Ciencias, Universidad National de Colombia, Bogotá, Colombia
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501 USA
| |
Collapse
|
7
|
Abstract
Phylogenetic networks are mathematical representations of evolutionary history that are able to capture both tree-like evolutionary processes (speciations) and non-tree-like 'reticulate' processes such as hybridization or horizontal gene transfer. The additional complexity that comes with this capacity, however, makes networks harder to infer from data, and more complicated to work with as mathematical objects. In this paper, we define a new, large class of phylogenetic networks, that we call labellable, and show that they are in bijection with the set of 'expanding covers' of finite sets. This correspondence is a generalisation of the encoding of phylogenetic forests by partitions of finite sets. Labellable networks can be characterised by a simple combinatorial condition, and we describe the relationship between this large class and other commonly studied classes. Furthermore, we show that all phylogenetic networks have a quotient network that is labellable.
Collapse
Affiliation(s)
- Andrew Francis
- Centre for Research in Mathematics and Data Science, Western Sydney University, Penrith, Australia.
| | - Mike Steel
- Biomathematics Research Centre, Canterbury University, Christchurch, New Zealand
| |
Collapse
|
8
|
Huber KT, Maher LJ. Autopolyploidy, Allopolyploidy, and Phylogenetic Networks with Horizontal Arcs. Bull Math Biol 2023; 85:40. [PMID: 37022524 PMCID: PMC10079759 DOI: 10.1007/s11538-023-01140-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Accepted: 02/28/2023] [Indexed: 04/07/2023]
Abstract
Polyploidization is an evolutionary process by which a species acquires multiple copies of its complete set of chromosomes. The reticulate nature of the signal left behind by it means that phylogenetic networks offer themselves as a framework to reconstruct the evolutionary past of species affected by it. The main strategy for doing this is to first construct a so-called multiple-labelled tree and to then somehow derive such a network from it. The following question therefore arises: How much can be said about that past if such a tree is not readily available? By viewing a polyploid dataset as a certain vector which we call a ploidy (level) profile, we show that among other results, there always exists a phylogenetic network in the form of a beaded phylogenetic tree with additional arcs that realizes a given ploidy profile. Intriguingly, the two end vertices of almost all of these additional arcs can be interpreted as having co-existed in time thereby adding biological realism to our network, a feature that is, in general, not enjoyed by phylogenetic networks. In addition, we show that our network may be viewed as a generator of ploidy profile space, a novel concept similar to phylogenetic tree space that we introduce to be able to compare phylogenetic networks that realize one and the same ploidy profile. We illustrate our findings in terms of a publicly available Viola dataset.
Collapse
|
9
|
Forest-Based Networks. Bull Math Biol 2022; 84:119. [PMID: 36107279 PMCID: PMC9477957 DOI: 10.1007/s11538-022-01081-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Accepted: 09/05/2022] [Indexed: 11/04/2022]
Abstract
In evolutionary studies, it is common to use phylogenetic trees to represent the evolutionary history of a set of species. However, in case the transfer of genes or other genetic information between the species or their ancestors has occurred in the past, a tree may not provide a complete picture of their history. In such cases, tree-based phylogenetic networks can provide a useful, more refined representation of the species’ evolution. Such a network is essentially a phylogenetic tree with some arcs added between the tree’s edges so as to represent reticulate events such as gene transfer, hybridization and recombination. Even so, this model does not permit the direct representation of evolutionary scenarios where reticulate events have taken place between different subfamilies or lineages of species. To represent such scenarios, in this paper we introduce the notion of a forest-based network, that is, a collection of leaf-disjoint phylogenetic trees on a set of species with arcs added between the edges of distinct trees within the collection. Forest-based networks include the recently introduced class of overlaid species forests which can be used to model introgression. As we shall see, even though the definition of forest-based networks is closely related to that of tree-based networks, they lead to new mathematical theory which complements that of tree-based networks. As well as studying the relationship of forest-based networks with other classes of phylogenetic networks, such as tree-child networks and universal tree-based networks, we present some characterizations of some special classes of forest-based networks. We expect that our results will be useful for developing new models and algorithms to understand reticulate evolution, such as introgression and gene transfer between species.
Collapse
|
10
|
van Iersel L, Janssen R, Jones M, Murakami Y. Orchard Networks are Trees with Additional Horizontal Arcs. Bull Math Biol 2022; 84:76. [PMID: 35727410 PMCID: PMC9213324 DOI: 10.1007/s11538-022-01037-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2021] [Accepted: 05/30/2022] [Indexed: 11/30/2022]
Abstract
Phylogenetic networks are used in biology to represent evolutionary histories. The class of orchard phylogenetic networks was recently introduced for their computational benefits, without any biological justification. Here, we show that orchard networks can be interpreted as trees with additional horizontal arcs. Therefore, they are closely related to tree-based networks, where the difference is that in tree-based networks the additional arcs do not need to be horizontal. Then, we use this new characterization to show that the space of orchard networks on n leaves with k reticulations is connected under the rNNI rearrangement move with diameter \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$O(kn+n\log (n))$$\end{document}O(kn+nlog(n)).
Collapse
Affiliation(s)
- Leo van Iersel
- Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, South Holland, The Netherlands
| | - Remie Janssen
- Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, South Holland, The Netherlands
| | - Mark Jones
- Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, South Holland, The Netherlands
| | - Yukihiro Murakami
- Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, South Holland, The Netherlands.
| |
Collapse
|
11
|
Wawerka M, Dąbkowski D, Rutecka N, Mykowiecka A, Górecki P. Embedding gene trees into phylogenetic networks by conflict resolution algorithms. Algorithms Mol Biol 2022; 17:11. [PMID: 35590416 PMCID: PMC9119282 DOI: 10.1186/s13015-022-00218-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2021] [Accepted: 03/22/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Phylogenetic networks are mathematical models of evolutionary processes involving reticulate events such as hybridization, recombination, or horizontal gene transfer. One of the crucial notions in phylogenetic network modelling is displayed tree, which is obtained from a network by removing a set of reticulation edges. Displayed trees may represent an evolutionary history of a gene family if the evolution is shaped by reticulation events. RESULTS We address the problem of inferring an optimal tree displayed by a network, given a gene tree G and a tree-child network N, under the deep coalescence and duplication costs. We propose an O(mn)-time dynamic programming algorithm (DP) to compute a lower bound of the optimal displayed tree cost, where m and n are the sizes of G and N, respectively. In addition, our algorithm can verify whether the solution is exact. Moreover, it provides a set of reticulation edges corresponding to the obtained cost. If the cost is exact, the set induces an optimal displayed tree. Otherwise, the set contains pairs of conflicting edges, i.e., edges sharing a reticulation node. Next, we show a conflict resolution algorithm that requires [Formula: see text] invocations of DP in the worst case, where r is the number of reticulations. We propose a similar [Formula: see text]-time algorithm for level-k tree-child networks and a branch and bound solution to compute lower and upper bounds of optimal costs. We also extend the algorithms to a broader class of phylogenetic networks. Based on simulated data, the average runtime is [Formula: see text] under the deep-coalescence cost and [Formula: see text] under the duplication cost. CONCLUSIONS Despite exponential complexity in the worst case, our algorithms perform significantly well on empirical and simulated datasets, due to the strategy of resolving internal dissimilarities between gene trees and networks. Therefore, the algorithms are efficient alternatives to enumeration strategies commonly proposed in the literature and enable analyses of complex networks with dozens of reticulations.
Collapse
|
12
|
Kong S, Pons JC, Kubatko L, Wicke K. Classes of explicit phylogenetic networks and their biological and mathematical significance. J Math Biol 2022; 84:47. [PMID: 35503141 DOI: 10.1007/s00285-022-01746-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2021] [Revised: 01/18/2022] [Accepted: 03/31/2022] [Indexed: 11/24/2022]
Abstract
The evolutionary relationships among organisms have traditionally been represented using rooted phylogenetic trees. However, due to reticulate processes such as hybridization or lateral gene transfer, evolution cannot always be adequately represented by a phylogenetic tree, and rooted phylogenetic networks that describe such complex processes have been introduced as a generalization of rooted phylogenetic trees. In fact, estimating rooted phylogenetic networks from genomic sequence data and analyzing their structural properties is one of the most important tasks in contemporary phylogenetics. Over the last two decades, several subclasses of rooted phylogenetic networks (characterized by certain structural constraints) have been introduced in the literature, either to model specific biological phenomena or to enable tractable mathematical and computational analyses. In the present manuscript, we provide a thorough review of these network classes, as well as provide a biological interpretation of the structural constraints underlying these networks where possible. In addition, we discuss how imposing structural constraints on the network topology can be used to address the scalability and identifiability challenges faced in the estimation of phylogenetic networks from empirical data.
Collapse
Affiliation(s)
- Sungsik Kong
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
| | - Joan Carles Pons
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma, 07122, Spain
| | - Laura Kubatko
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA.,Department of Statistics, The Ohio State University, Columbus, OH, USA
| | - Kristina Wicke
- Department of Mathematics, The Ohio State University, Columbus, OH, USA.
| |
Collapse
|
13
|
Abstract
Phylogenetic networks represent evolutionary history of species and can record natural reticulate evolutionary processes such as horizontal gene transfer and gene recombination. This makes phylogenetic networks a more comprehensive representation of evolutionary history compared to phylogenetic trees. Stochastic processes for generating random trees or networks are important tools in evolutionary analysis, especially in phylogeny reconstruction where they can be utilized for validation or serve as priors for Bayesian methods. However, as more network generators are developed, there is a lack of discussion or comparison for different generators. To bridge this gap, we compare a set of phylogenetic network generators by profiling topological summary statistics of the generated networks over the number of reticulations and comparing the topological profiles.
Collapse
Affiliation(s)
- Remie Janssen
- Delft University of Technology, Delft Institute of Applied Mathematics, Mekelweg 4, 2628 CD, Delft, The Netherlands
| | - Pengyu Liu
- Simon Fraser University, Department of Mathematics, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
| |
Collapse
|
14
|
Davidov N, Hernandez A, Jian J, McKenna P, Medlin KA, Mojumder R, Owen M, Quijano A, Rodriguez A, St John K, Thai K, Uraga M. Maximum Covering Subtrees for Phylogenetic Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2823-2827. [PMID: 33242309 DOI: 10.1109/tcbb.2020.3040910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Tree-based phylogenetic networks, which may be roughly defined as leaf-labeled networks built by adding arcs only between the original tree edges, have elegant properties for modeling evolutionary histories. We answer an open question of Francis, Semple, and Steel about the complexity of determining how far a phylogenetic network is from being tree-based, including non-binary phylogenetic networks. We show that finding a phylogenetic tree covering the maximum number of nodes in a phylogenetic network can be computed in polynomial time via an encoding into a minimum-cost flow problem.
Collapse
|
15
|
Molloy EK, Durvasula A, Sankararaman S. Advancing admixture graph estimation via maximum likelihood network orientation. Bioinformatics 2021; 37:i142-i150. [PMID: 34252951 PMCID: PMC8336447 DOI: 10.1093/bioinformatics/btab267] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/21/2021] [Indexed: 11/18/2022] Open
Abstract
Motivation Admixture, the interbreeding between previously distinct populations, is a pervasive force in evolution. The evolutionary history of populations in the presence of admixture can be modeled by augmenting phylogenetic trees with additional nodes that represent admixture events. While enabling a more faithful representation of evolutionary history, admixture graphs present formidable inferential challenges, and there is an increasing need for methods that are accurate, fully automated and computationally efficient. One key challenge arises from the size of the space of admixture graphs. Given that exhaustively evaluating all admixture graphs can be prohibitively expensive, heuristics have been developed to enable efficient search over this space. One heuristic, implemented in the popular method TreeMix, consists of adding edges to a starting tree while optimizing a suitable objective function. Results Here, we present a demographic model (with one admixed population incident to a leaf) where TreeMix and any other starting-tree-based maximum likelihood heuristic using its likelihood function is guaranteed to get stuck in a local optimum and return an incorrect network topology. To address this issue, we propose a new search strategy that we term maximum likelihood network orientation (MLNO). We augment TreeMix with an exhaustive search for an MLNO, referring to this approach as OrientAGraph. In evaluations including previously published admixture graphs, OrientAGraph outperformed TreeMix on 4/8 models (there are no differences in the other cases). Overall, OrientAGraph found graphs with higher likelihood scores and topological accuracy while remaining computationally efficient. Lastly, our study reveals several directions for improving maximum likelihood admixture graph estimation. Availability and implementation OrientAGraph is available on Github (https://github.com/sriramlab/OrientAGraph) under the GNU General Public License v3.0. Supplementary information Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Erin K Molloy
- Department of Computer Science, University of California, Los Angeles, LA 90095, USA.,Institute for Advanced Computer Studies, University of Maryland, College Park, College Park, MD 20740, USA
| | - Arun Durvasula
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, LA 90095, USA
| | - Sriram Sankararaman
- Department of Computer Science, University of California, Los Angeles, LA 90095, USA.,Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, LA 90095, USA.,Bioinformatics Interdepartmental Program, University of California, Los Angeles, LA 90095, USA.,Department of Computational Medicine, University of California, Los Angeles, LA 90095, USA
| |
Collapse
|
16
|
Francis A, Huson DH, Steel M. Normalising phylogenetic networks. Mol Phylogenet Evol 2021; 163:107215. [PMID: 34089842 DOI: 10.1016/j.ympev.2021.107215] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2020] [Revised: 04/23/2021] [Accepted: 05/27/2021] [Indexed: 11/24/2022]
Abstract
Rooted phylogenetic networks provide a way to describe species' relationships when evolution departs from the simple model of a tree. However, networks inferred from genomic data can be highly tangled, making it difficult to discern the main reticulation signals present. In this paper, we describe a natural way to transform any rooted phylogenetic network into a simpler canonical network, which has desirable mathematical and computational properties, and is based only on the 'visible' vertices in the original network. The method has been implemented and we demonstrate its application to some examples.
Collapse
Affiliation(s)
- Andrew Francis
- Centre for Research in Mathematics and Data Science, Western Sydney University, Australia.
| | - Daniel H Huson
- Algorithmen der Bioinformatik, Universität Tübingen, Germany.
| | - Mike Steel
- Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand.
| |
Collapse
|
17
|
Caterpillars on three and four leaves are sufficient to reconstruct binary normal networks. J Math Biol 2020; 81:961-980. [DOI: 10.1007/s00285-020-01533-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2020] [Revised: 07/15/2020] [Indexed: 10/23/2022]
|
18
|
Fischer M, Francis A. The Space of Tree-Based Phylogenetic Networks. Bull Math Biol 2020; 82:70. [PMID: 32500263 DOI: 10.1007/s11538-020-00744-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2019] [Accepted: 05/05/2020] [Indexed: 10/24/2022]
Abstract
Phylogenetic networks are generalizations of phylogenetic trees that allow the representation of reticulation events such as horizontal gene transfer or hybridization, and can also represent uncertainty in inference. A subclass of these, tree-based phylogenetic networks, have been introduced to capture the extent to which reticulate evolution nevertheless broadly follows tree-like patterns. Several important operations that change a general phylogenetic network have been developed in recent years and are important for allowing algorithms to move around spaces of networks; a vital ingredient in finding an optimal network given some biological data. A key such operation is the nearest neighbour interchange, or NNI. While it is already known that the space of unrooted phylogenetic networks is connected under NNI, it has been unclear whether this also holds for the subspace of tree-based networks. In this paper, we show that the space of unrooted tree-based phylogenetic networks is indeed connected under the NNI operation. We do so by explicitly showing how to get from one such network to another one without losing tree-basedness along the way. Moreover, we introduce some new concepts, for instance "shoat networks", and derive some interesting aspects concerning tree-basedness. Last, we use our results to derive an upper bound on the size of the space of tree-based networks.
Collapse
Affiliation(s)
- Mareike Fischer
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Andrew Francis
- Centre for Research in Mathematics and Data Science, Western Sydney University, Sydney, Australia.
| |
Collapse
|
19
|
Abstract
Recently, so-called tree-based phylogenetic networks have attracted considerable attention. These networks can be constructed from a phylogenetic tree, called the base tree, by adding additional edges. The primary aim of this study is to provide sufficient criteria for tree-basedness by reducing phylogenetic networks to related graph structures. Even though it is generally known that determining whether a network is tree-based is an NP-complete problem, one of these criteria, namely edge-basedness, can be verified in linear time. Surprisingly, the class of edge-based networks is closely related to a well-known family of graphs, namely, the class of generalized series-parallel graphs, and we explore this relationship in full detail. Additionally, we introduce further classes of tree-based networks and analyze their relationships.
Collapse
|
20
|
Pons JC, Scornavacca C, Cardona G. Generation of Level- k LGT Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:158-164. [PMID: 30703035 DOI: 10.1109/tcbb.2019.2895344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Phylogenetic networks provide a mathematical model to represent the evolution of a set of species where, apart from speciation, reticulate evolutionary events have to be taken into account. Among these events, lateral gene transfers need special consideration due to the asymmetry in the roles of the species involved in such an event. To take into account this asymmetry, LGT networks were introduced. Contrarily to the case of phylogenetic trees, the combinatorial structure of phylogenetic networks is much less known and difficult to describe. One of the approaches in the literature is to classify them according to their level and find generators of the given level that can be used to recursively generate all networks. In this paper, we adapt the concept of generators to the case of LGT networks. We show how these generators, classified by their level, give rise to simple LGT networks of the specified level, and how any LGT network can be obtained from these simple networks, that act as building blocks of the generic structure. The stochastic models of evolution of phylogenetic networks are also much less studied than those for phylogenetic trees. In this setting, we introduce a novel two-parameter model that generates LGT networks. Finally, we present some computer simulations using this model in order to investigate the complexity of the generated networks, depending on the parameters of the model.
Collapse
|
21
|
Abstract
Background Galled trees are studied as a recombination model in theoretical population genetics. This class of phylogenetic networks has been generalized to tree-child networks and other network classes by relaxing a structural condition imposed on galled trees. Although these networks are simple, their topological structures have yet to be fully understood. Results It is well-known that all phylogenetic trees on n taxa can be generated by the insertion of the n-th taxa to each edge of all the phylogenetic trees on n−1 taxa. We prove that all tree-child (resp. normal) networks with k reticulate nodes on n taxa can be uniquely generated via three operations from all the tree-child (resp. normal) networks with k−1 or k reticulate nodes on n−1 taxa. Applying this result to counting rooted phylogenetic networks, we show that there are exactly \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$\frac {(2n)!}{2^{n} (n-1)!}-2^{n-1} n!$\end{document}(2n)!2n(n−1)!−2n−1n! binary phylogenetic networks with one reticulate node on n taxa. Conclusions The work makes two contributions to understand normal networks. One is a generalization of an enumeration procedure for phylogenetic trees into one for normal networks. Another is simple formulas for counting normal networks and phylogenetic networks that have only one reticulate node.
Collapse
Affiliation(s)
- Louxin Zhang
- Department of Mathematics, National University of Singapore, 10 Lower Kent Ridge Road, Singapore, 119076, Singapore.
| |
Collapse
|
22
|
Hendriksen M, Francis A. Tree-metrizable HGT networks. Math Biosci 2019; 318:108283. [PMID: 31711966 DOI: 10.1016/j.mbs.2019.108283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2019] [Revised: 11/03/2019] [Accepted: 11/03/2019] [Indexed: 11/17/2022]
Abstract
Phylogenetic trees are often constructed by using a metric on the set of taxa that label the leaves of the tree. While there are a number of methods for constructing a tree using a given metric, such trees will only display the metric if it satisfies the so-called "four point condition", established by Buneman in 1971. While this condition guarantees that a unique tree will display the metric, meaning that the distance between any two leaves can be found by adding the distances on arcs in the path between the leaves, it doesn't exclude the possibility that a phylogenetic network might also display the metric. This possibility was recently pointed out and "tree-metrized" networks - that display a tree metric - with a single reticulation were characterized. In this paper, we show that in the case of HGT (horizontal gene transfer) networks, in fact there are tree-metrized networks containing many reticulations.
Collapse
Affiliation(s)
- Michael Hendriksen
- Centre for Research in Mathematics, Western Sydney University, Australia.
| | - Andrew Francis
- Centre for Research in Mathematics, Western Sydney University, Australia.
| |
Collapse
|
23
|
Cardona G, Pons JC, Scornavacca C. Generation of Binary Tree-Child phylogenetic networks. PLoS Comput Biol 2019; 15:e1007347. [PMID: 31509525 PMCID: PMC6756559 DOI: 10.1371/journal.pcbi.1007347] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2019] [Revised: 09/23/2019] [Accepted: 08/20/2019] [Indexed: 11/18/2022] Open
Abstract
Phylogenetic networks generalize phylogenetic trees by allowing the modelization of events of reticulate evolution. Among the different kinds of phylogenetic networks that have been proposed in the literature, the subclass of binary tree-child networks is one of the most studied ones. However, very little is known about the combinatorial structure of these networks. In this paper we address the problem of generating all possible binary tree-child (BTC) networks with a given number of leaves in an efficient way via reduction/augmentation operations that extend and generalize analogous operations for phylogenetic trees, and are biologically relevant. Since our solution is recursive, this also provides us with a recurrence relation giving an upper bound on the number of such networks. We also show how the operations introduced in this paper can be employed to extend the evolutive history of a set of sequences, represented by a BTC network, to include a new sequence. An implementation in python of the algorithms described in this paper, along with some computational experiments, can be downloaded from https://github.com/bielcardona/TCGenerators.
Collapse
Affiliation(s)
- Gabriel Cardona
- Department of Mathematics and Computer Science, University of the Balearic Islands, Ctra. de Valldemossa Ctra. de Valldemossa km. 7.5. 07122 - Palma, Spain
| | - Joan Carles Pons
- Department of Mathematics and Computer Science, University of the Balearic Islands, Ctra. de Valldemossa Ctra. de Valldemossa km. 7.5. 07122 - Palma, Spain
| | - Celine Scornavacca
- Institut des Sciences de l’Evolution (ISE-M), Université de Montpellier, CNRS, IRD, EPHE, 34095 Montpellier Cedex 5, France
| |
Collapse
|
24
|
A class of phylogenetic networks reconstructable from ancestral profiles. Math Biosci 2019; 313:33-40. [DOI: 10.1016/j.mbs.2019.04.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2019] [Revised: 04/30/2019] [Accepted: 04/30/2019] [Indexed: 11/18/2022]
|
25
|
Bouckaert R, Vaughan TG, Barido-Sottani J, Duchêne S, Fourment M, Gavryushkina A, Heled J, Jones G, Kühnert D, De Maio N, Matschiner M, Mendes FK, Müller NF, Ogilvie HA, du Plessis L, Popinga A, Rambaut A, Rasmussen D, Siveroni I, Suchard MA, Wu CH, Xie D, Zhang C, Stadler T, Drummond AJ. BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis. PLoS Comput Biol 2019; 15:e1006650. [PMID: 30958812 PMCID: PMC6472827 DOI: 10.1371/journal.pcbi.1006650] [Citation(s) in RCA: 1997] [Impact Index Per Article: 332.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2018] [Revised: 04/18/2019] [Accepted: 02/04/2019] [Indexed: 11/18/2022] Open
Abstract
Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.
Collapse
Affiliation(s)
- Remco Bouckaert
- Centre of Computational Evolution, University of Auckland, Auckland, New Zealand
- Max Planck Institute for the Science of Human History, Jena, Germany
| | - Timothy G. Vaughan
- ETH Zürich, Department of Biosystems Science and Engineering, 4058 Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Joëlle Barido-Sottani
- ETH Zürich, Department of Biosystems Science and Engineering, 4058 Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Sebastián Duchêne
- Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Victoria, Australia
| | - Mathieu Fourment
- ithree institute, University of Technology Sydney, Sydney, Australia
| | | | | | - Graham Jones
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE 405 30 Göteborg, Sweden
| | - Denise Kühnert
- Max Planck Institute for the Science of Human History, Jena, Germany
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridgeshire, UK
| | - Michael Matschiner
- Department of Environmental Sciences, University of Basel, 4051 Basel, Switzerland
| | - Fábio K. Mendes
- Centre of Computational Evolution, University of Auckland, Auckland, New Zealand
| | - Nicola F. Müller
- ETH Zürich, Department of Biosystems Science and Engineering, 4058 Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Huw A. Ogilvie
- Department of Computer Science, Rice University, Houston, TX 77005-1892, USA
| | - Louis du Plessis
- Department of Zoology, University of Oxford, Oxford, OX1 3PS, UK
| | - Alex Popinga
- Centre of Computational Evolution, University of Auckland, Auckland, New Zealand
| | - Andrew Rambaut
- Institute of Evolutionary Biology, University of Edinburgh, Ashworth Laboratories, Edinburgh, EH9 3FL UK
| | - David Rasmussen
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27695, USA
| | - Igor Siveroni
- Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place, W2 1PG, UK
| | - Marc A. Suchard
- Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
| | - Chieh-Hsi Wu
- Department of Statistics, University of Oxford, OX1 3LB, UK
| | - Dong Xie
- Centre of Computational Evolution, University of Auckland, Auckland, New Zealand
| | - Chi Zhang
- Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China
| | - Tanja Stadler
- ETH Zürich, Department of Biosystems Science and Engineering, 4058 Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Alexei J. Drummond
- Centre of Computational Evolution, University of Auckland, Auckland, New Zealand
| |
Collapse
|
26
|
Gunawan ADM, Yan H, Zhang L. Compression of Phylogenetic Networks and Algorithm for the Tree Containment Problem. J Comput Biol 2019; 26:285-294. [PMID: 30624954 DOI: 10.1089/cmb.2018.0220] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Rooted phylogenetic networks are rooted acyclic digraphs. They are used to model complex evolution where hybridization, recombination, and other reticulation events play a role. A rigorous definition of network compression is introduced on the basis of recent studies of relationships between cluster, tree, and rooted phylogenetic networks. The concept reveals new connections between well-studied network classes, including tree-child networks and reticulation-visible networks. It also enables us to define a new class of networks for which the cluster containment problem is solvable in linear time.
Collapse
Affiliation(s)
- Andreas D M Gunawan
- Department of Mathematics, National University of Singapore, Singapore 119076, Singapore
| | - Hongwei Yan
- Department of Mathematics, National University of Singapore, Singapore 119076, Singapore
| | - Louxin Zhang
- Department of Mathematics, National University of Singapore, Singapore 119076, Singapore
| |
Collapse
|
27
|
Advances in Computational Methods for Phylogenetic Networks in the Presence of Hybridization. BIOINFORMATICS AND PHYLOGENETICS 2019. [DOI: 10.1007/978-3-030-10837-3_13] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
28
|
Abstract
Most phylogenies are typically represented as purely bifurcating. However, as genomic data have become more common in phylogenetic studies, it is not unusual to find reticulation among terminal lineages or among internal nodes (deep time reticulation; DTR). In these situations, gene flow must have happened in the same or adjacent geographic areas for these DTRs to have occurred and therefore biogeographic reconstruction should provide similar area estimates for parental nodes, provided extinction or dispersal has not eroded these patterns. We examine the phylogeny of the widely distributed New World kingsnakes (Lampropeltis), determine if DTR is present in this group, and estimate the ancestral area for reticulation. Importantly, we develop a new method that uses coalescent simulations in a machine learning framework to show conclusively that this phylogeny is best represented as reticulating at deeper time. Using joint probabilities of ancestral area reconstructions on the bifurcating parental lineages from the reticulating node, we show that this reticulation likely occurred in northwestern Mexico/southwestern US, and subsequently, led to the diversification of the Mexican kingsnakes. This region has been previously identified as an area important for understanding speciation and secondary contact with gene flow in snakes and other squamates. This research shows that phylogenetic reticulation is common, even in well-studied groups, and that the geographic scope of ancient hybridization is recoverable.
Collapse
Affiliation(s)
- Frank T Burbrink
- Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA
| | - Marcelo Gehara
- Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA
| |
Collapse
|
29
|
Tree-based networks: characterisations, metrics, and support trees. J Math Biol 2018; 78:899-918. [PMID: 30283985 DOI: 10.1007/s00285-018-1296-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2017] [Revised: 08/25/2018] [Indexed: 10/28/2022]
Abstract
Phylogenetic networks generalise phylogenetic trees and allow for the accurate representation of the evolutionary history of a set of present-day species whose past includes reticulate events such as hybridisation and lateral gene transfer. One way to obtain such a network is by starting with a (rooted) phylogenetic tree T, called a base tree, and adding arcs between arcs of T. The class of phylogenetic networks that can be obtained in this way is called tree-based networks and includes the prominent classes of tree-child and reticulation-visible networks. Initially defined for binary phylogenetic networks, tree-based networks naturally extend to arbitrary phylogenetic networks. In this paper, we generalise recent tree-based characterisations and associated proximity measures for binary phylogenetic networks to arbitrary phylogenetic networks. These characterisations are in terms of matchings in bipartite graphs, path partitions, and antichains. Some of the generalisations are straightforward to establish using the original approach, while others require a very different approach. Furthermore, for an arbitrary tree-based network N, we characterise the support trees of N, that is, the tree-based embeddings of N. We use this characterisation to give an explicit formula for the number of support trees of N when N is binary. This formula is written in terms of the components of a bipartite graph.
Collapse
|
30
|
When is a Phylogenetic Network Simply an Amalgamation of Two Trees? Bull Math Biol 2018; 80:2338-2348. [DOI: 10.1007/s11538-018-0463-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2017] [Accepted: 06/29/2018] [Indexed: 10/28/2022]
|
31
|
Hendriksen M. Tree-based unrooted nonbinary phylogenetic networks. Math Biosci 2018; 302:131-138. [PMID: 29932953 DOI: 10.1016/j.mbs.2018.06.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2018] [Revised: 06/18/2018] [Accepted: 06/18/2018] [Indexed: 11/16/2022]
Abstract
Phylogenetic networks are a generalisation of phylogenetic trees that allow for more complex evolutionary histories that include hybridisation-like processes. It is of considerable interest whether a network can be considered 'tree-like' or not, which leads to the introduction of tree-based networks in the rooted, binary context. Tree-based networks are those networks which can be constructed by adding additional edges into a phylogenetic tree, called the base tree. Previous extensions have considered extending to the binary, unrooted case and the nonbinary, rooted case. In this paper, we extend tree-based networks to the context of unrooted, nonbinary networks in three ways, depending on the types of additional edges that are permitted. Further, we study fully tree-based networks which are phylogenetic networks in which every embedded tree is a base tree. We also extend this concept to unrooted, nonbinary, phylogenetic networks and classify the resulting networks. Finally, we derive some results on the colourability of tree-based networks, which can be useful to determine whether a network is tree-based.
Collapse
Affiliation(s)
- Michael Hendriksen
- Centre for Research in Mathematics, Western Sydney University, NSW, Australia.
| |
Collapse
|
32
|
Carvalho DS, Schnable JC, Almeida AMR. Integrating Phylogenetic and Network Approaches to Study Gene Family Evolution: The Case of the AGAMOUS Family of Floral Genes. Evol Bioinform Online 2018; 14:1176934318764683. [PMID: 29899658 PMCID: PMC5993073 DOI: 10.1177/1176934318764683] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2017] [Accepted: 02/09/2018] [Indexed: 11/17/2022] Open
Abstract
The study of gene family evolution has benefited from the use of phylogenetic tools, which can greatly inform studies of both relationships within gene families and functional divergence. Here, we propose the use of a network-based approach that in combination with phylogenetic methods can provide additional support for models of gene family evolution. We dissect the contributions of each method to the improved understanding of relationships and functions within the well-characterized family of AGAMOUS floral development genes. The results obtained with the two methods largely agreed with one another. In particular, we show how network approaches can provide improved interpretations of branches with low support in a conventional gene tree. The network approach used here may also better reflect known and suspected patterns of functional divergence relative to phylogenetic methods. Overall, we believe that the combined use of phylogenetic and network tools provide a more robust assessment of gene family evolution.
Collapse
Affiliation(s)
- Daniel S Carvalho
- Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA.,Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
| | - James C Schnable
- Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA.,Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
| | - Ana Maria R Almeida
- Department of Biological Sciences, California State University East Bay, Hayward, CA, USA
| |
Collapse
|
33
|
Francis A, Huber KT, Moulton V. Tree-Based Unrooted Phylogenetic Networks. Bull Math Biol 2018; 80:404-416. [PMID: 29238909 PMCID: PMC5790869 DOI: 10.1007/s11538-017-0381-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2017] [Accepted: 12/05/2017] [Indexed: 11/30/2022]
Abstract
Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization which are important factors in the evolution of many organisms.
Collapse
Affiliation(s)
- A Francis
- Centre for Research in Mathematics, Western Sydney University, Sydney, Australia
| | - K T Huber
- School of Computing Sciences, University of East Anglia, Norwich, UK.
| | - V Moulton
- School of Computing Sciences, University of East Anglia, Norwich, UK
| |
Collapse
|
34
|
Jetten L, van Iersel L. Nonbinary Tree-Based Phylogenetic Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:205-217. [PMID: 27723601 DOI: 10.1109/tcbb.2016.2615918] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can, for example, represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and strictly-tree-based nonbinary phylogenetic networks. We give simple graph-theoretic characterizations of tree-based and strictly-tree-based nonbinary phylogenetic networks. Moreover, we show for each of these two classes that it can be decided in polynomial time whether a given network is contained in the class. Our approach also provides a new view on tree-based binary phylogenetic networks. Finally, we discuss two examples of nonbinary phylogenetic networks in biology and show how our results can be applied to them.
Collapse
|
35
|
Cardona G, Pons JC. Reconstruction of LGT networks from tri-LGT-nets. J Math Biol 2017; 75:1669-1692. [PMID: 28451760 DOI: 10.1007/s00285-017-1131-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2016] [Revised: 02/16/2017] [Indexed: 01/06/2023]
Abstract
Phylogenetic networks have gained attention from the scientific community due to the evidence of the existence of evolutionary events that cannot be represented using trees. A variant of phylogenetic networks, called LGT networks, models specifically lateral gene transfer events, which cannot be properly represented with generic phylogenetic networks. In this paper we treat the problem of the reconstruction of LGT networks from substructures induced by three leaves, which we call tri-LGT-nets. We first restrict ourselves to a class of LGT networks that are both mathematically treatable and biologically significant, called BAN-LGT networks. Then, we study the decomposition of such networks in subnetworks with three leaves and ask whether or not this decomposition determines the network. The answer to this question is negative, but if we further impose time-consistency (species involved in a later gene transfer must coexist) the answer is affirmative, up to some redundancy that can never be recovered but is fully characterized.
Collapse
Affiliation(s)
- Gabriel Cardona
- Department of Mathematics and Computer Science, University of the Balearic Islands, 07122, Palma, Spain.
| | - Joan Carles Pons
- Department of Mathematics and Computer Science, University of the Balearic Islands, 07122, Palma, Spain
| |
Collapse
|
36
|
Bordewich M, Linz S, Semple C. Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks. J Theor Biol 2017; 423:1-12. [PMID: 28414085 DOI: 10.1016/j.jtbi.2017.03.032] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2016] [Revised: 03/18/2017] [Accepted: 03/20/2017] [Indexed: 10/19/2022]
Abstract
Over the last fifteen years, phylogenetic networks have become a popular tool to analyse relationships between species whose past includes reticulation events such as hybridisation or horizontal gene transfer. However, the space of phylogenetic networks is significantly larger than that of phylogenetic trees, and how to analyse and search this enlarged space remains a poorly understood problem. Inspired by the widely-used rooted subtree prune and regraft (rSPR) operation on rooted phylogenetic trees, we propose a new operation-called subnet prune and regraft (SNPR)-that induces a metric on the space of all rooted phylogenetic networks on a fixed set of leaves. We show that the spaces of several popular classes of rooted phylogenetic networks (e.g. tree child, reticulation visible, and tree based) are connected under SNPR and that connectedness remains for the subclasses of these networks with a fixed number of reticulations. Lastly, we bound the distance between two rooted phylogenetic networks under the SNPR operation, show that it is computationally hard to compute this distance exactly, and analyse how the SNPR-distance between two such networks relates to the rSPR-distance between rooted phylogenetic trees that are embedded in these networks.
Collapse
Affiliation(s)
- Magnus Bordewich
- School of Engineering and Computing Sciences, Durham University, Durham DH1 3LE, United Kingdom.
| | - Simone Linz
- Department of Computer Science, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand.
| | - Charles Semple
- School of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand.
| |
Collapse
|
37
|
Scornavacca C, Mayol JCP, Cardona G. Fast algorithm for the reconciliation of gene trees and LGT networks. J Theor Biol 2017; 418:129-137. [PMID: 28111320 DOI: 10.1016/j.jtbi.2017.01.024] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2016] [Revised: 09/27/2016] [Accepted: 01/16/2017] [Indexed: 10/20/2022]
Abstract
In phylogenomics, reconciliations aim at explaining the discrepancies between the evolutionary histories of genes and species. Several reconciliation models are available when the evolution of the species of interest is modelled via phylogenetic trees; the most commonly used are the DL model, accounting for duplications and losses in gene evolution and yielding polynomially-solvable problems, and the DTL model, which also accounts for gene transfers and implies NP-hard problems. However, when dealing with non-tree-like evolutionary events such as hybridisations, phylogenetic networks - and not phylogenetic trees - should be used to model species evolution. Reconciliation models involving phylogenetic networks are still at their early days. In this paper, we propose a new reconciliation model in which the evolution of species is modelled by a special kind of phylogenetic networks - the LGT networks. Our model considers duplications, losses and transfers of genes, but restricts transfers to happen through some specific arcs of the network, called secondary arcs. Moreover, we provide a polynomial algorithm to compute the most parsimonious reconciliation between a gene tree and an LGT network under this model. Our method, when combined with quartet decomposition methods to detect putative "highways" of transfers, permits to refine their analyses by allowing to examine the two possible directions of a highway and even consider combinations of highways.
Collapse
Affiliation(s)
- Celine Scornavacca
- Institut des Sciences de l'Evolution, Université de Montpellier, CNRS, IRD, EPHE 34095 Montpellier Cedex 5, France.
| | - Joan Carles Pons Mayol
- Department of Mathematics and Computer Science, University of the Balearic Islands, E-07122 Palma, Spain.
| | - Gabriel Cardona
- Department of Mathematics and Computer Science, University of the Balearic Islands, E-07122 Palma, Spain.
| |
Collapse
|
38
|
Bryant C, Fischer M, Linz S, Semple C. On the quirks of maximum parsimony and likelihood on phylogenetic networks. J Theor Biol 2017; 417:100-108. [PMID: 28087420 DOI: 10.1016/j.jtbi.2017.01.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2016] [Revised: 10/28/2016] [Accepted: 01/06/2017] [Indexed: 11/16/2022]
Abstract
Maximum parsimony is one of the most frequently-discussed tree reconstruction methods in phylogenetic estimation. However, in recent years it has become more and more apparent that phylogenetic trees are often not sufficient to describe evolution accurately. For instance, processes like hybridization or lateral gene transfer that are commonplace in many groups of organisms and result in mosaic patterns of relationships cannot be represented by a single phylogenetic tree. This is why phylogenetic networks, which can display such events, are becoming of more and more interest in phylogenetic research. It is therefore necessary to extend concepts like maximum parsimony from phylogenetic trees to networks. Several suggestions for possible extensions can be found in recent literature, for instance the softwired and the hardwired parsimony concepts. In this paper, we analyze the so-called big parsimony problem under these two concepts, i.e. we investigate maximum parsimonious networks and analyze their properties. In particular, we show that finding a softwired maximum parsimony network is possible in polynomial time. We also show that the set of maximum parsimony networks for the hardwired definition always contains at least one phylogenetic tree. Lastly, we investigate some parallels of parsimony to different likelihood concepts on phylogenetic networks.
Collapse
Affiliation(s)
| | - Mareike Fischer
- Department for Mathematics and Computer Science, Ernst Moritz Arndt University, Greifswald, Germany.
| | - Simone Linz
- Department of Computer Science, University of Auckland, New Zealand.
| | - Charles Semple
- School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand.
| |
Collapse
|
39
|
Abstract
BACKGROUND Phylogenetic networks model reticulate evolutionary histories. The last two decades have seen an increased interest in establishing mathematical results and developing computational methods for inferring and analyzing these networks. A salient concept underlying a great majority of these developments has been the notion that a network displays a set of trees and those trees can be used to infer, analyze, and study the network. RESULTS In this paper, we show that in the presence of coalescence effects, the set of displayed trees is not sufficient to capture the network. We formally define the set of parental trees of a network and make three contributions based on this definition. First, we extend the notion of anomaly zone to phylogenetic networks and report on anomaly results for different networks. Second, we demonstrate how coalescence events could negatively affect the ability to infer a species tree that could be augmented into the correct network. Third, we demonstrate how a phylogenetic network can be viewed as a mixture model that lends itself to a novel inference approach via gene tree clustering. CONCLUSIONS Our results demonstrate the limitations of focusing on the set of trees displayed by a network when analyzing and inferring the network. Our findings can form the basis for achieving higher accuracy when inferring phylogenetic networks and open up new venues for research in this area, including new problem formulations based on the notion of a network's parental trees.
Collapse
Affiliation(s)
- Jiafan Zhu
- Department of Computer Science, Rice University, Houston, 77005 Texas USA
| | - Yun Yu
- Department of Computer Science, Rice University, Houston, 77005 Texas USA
| | - Luay Nakhleh
- Department of Computer Science, Rice University, Houston, 77005 Texas USA
- Department of BioSciences, Rice University, Houston, 77005 Texas USA
| |
Collapse
|
40
|
Affiliation(s)
- Louxin Zhang
- Department of Mathematics, National University of Singapore, Singapore
| |
Collapse
|
41
|
Hayamizu M. On the existence of infinitely many universal tree-based networks. J Theor Biol 2016; 396:204-6. [DOI: 10.1016/j.jtbi.2016.02.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2015] [Revised: 02/12/2016] [Accepted: 02/15/2016] [Indexed: 10/22/2022]
|
42
|
Anaya M, Anipchenko-Ulaj O, Ashfaq A, Chiu J, Kaiser M, Ohsawa MS, Owen M, Pavlechko E, St John K, Suleria S, Thompson K, Yap C. On Determining if Tree-based Networks Contain Fixed Trees. Bull Math Biol 2016; 78:961-9. [PMID: 27125655 DOI: 10.1007/s11538-016-0169-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2016] [Accepted: 04/15/2016] [Indexed: 11/29/2022]
Abstract
We address an open question of Francis and Steel about phylogenetic networks and trees. They give a polynomial time algorithm to decide if a phylogenetic network, N, is tree-based and pose the problem: given a fixed tree T and network N, is N based on T? We show that it is [Formula: see text]-hard to decide, by reduction from 3-Dimensional Matching (3DM) and further that the problem is fixed-parameter tractable.
Collapse
Affiliation(s)
- Maria Anaya
- Queensborough Community College, City University of New York (CUNY), Bayside, NY, USA
| | | | - Aisha Ashfaq
- Queensborough Community College, City University of New York (CUNY), Bayside, NY, USA
| | - Joyce Chiu
- Brooklyn College, City University of New York (CUNY), Brooklyn, NY, USA
| | - Mahedi Kaiser
- Department of Mathematics and Computer Science, Lehman College, City University of New York (CUNY), Bronx, NY, 10468, USA
| | - Max Shoji Ohsawa
- Brooklyn College, City University of New York (CUNY), Brooklyn, NY, USA
| | - Megan Owen
- Department of Mathematics and Computer Science, Lehman College, City University of New York (CUNY), Bronx, NY, 10468, USA.
| | | | - Katherine St John
- Department of Mathematics and Computer Science, Lehman College, City University of New York (CUNY), Bronx, NY, 10468, USA.,Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, 10024, USA
| | - Shivam Suleria
- Brooklyn College, City University of New York (CUNY), Brooklyn, NY, USA
| | - Keith Thompson
- College of Staten Island, City University of New York (CUNY), Staten Island, NY, USA
| | | |
Collapse
|
43
|
Solís-Lemus C, Ané C. Inferring Phylogenetic Networks with Maximum Pseudolikelihood under Incomplete Lineage Sorting. PLoS Genet 2016; 12:e1005896. [PMID: 26950302 PMCID: PMC4780787 DOI: 10.1371/journal.pgen.1005896] [Citation(s) in RCA: 277] [Impact Index Per Article: 30.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2015] [Accepted: 02/03/2016] [Indexed: 11/23/2022] Open
Abstract
Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations. Phylogenetic networks display the evolutionary history of groups of individuals (species or populations) including reticulation events such as hybridization, horizontal gene transfer or migration. Here, we present a likelihood method to learn networks from molecular sequences at multiple genes. Our model accounts for several biological processes: mutations, incomplete lineage sorting of alleles in ancestral populations, and reticulations in the network. The likelihood is decomposed into 4-taxon subsets to make the analyses scale to many species and many genes. Our work makes it possible to learn large phylogenetic networks from large data sets, with a statistical approach and a biologically relevant model.
Collapse
Affiliation(s)
- Claudia Solís-Lemus
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- * E-mail:
| | - Cécile Ané
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Department of Botany, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| |
Collapse
|
44
|
Morrison DA. Genealogies: Pedigrees and Phylogenies are Reticulating Networks Not Just Divergent Trees. Evol Biol 2016. [DOI: 10.1007/s11692-016-9376-5] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
|
45
|
Semple C. Phylogenetic Networks with Every Embedded Phylogenetic Tree a Base Tree. Bull Math Biol 2015; 78:132-7. [DOI: 10.1007/s11538-015-0132-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2015] [Accepted: 11/24/2015] [Indexed: 11/24/2022]
|
46
|
Cardona G, Pons JC, Rosselló F. A reconstruction problem for a class of phylogenetic networks with lateral gene transfers. Algorithms Mol Biol 2015; 10:28. [PMID: 26691555 PMCID: PMC4683721 DOI: 10.1186/s13015-015-0059-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2015] [Accepted: 11/15/2015] [Indexed: 11/18/2022] Open
Abstract
Background Lateral, or Horizontal, Gene Transfers are a type of
asymmetric evolutionary events where genetic material is transferred from one species to another. In this paper we consider LGT networks, a general model of phylogenetic networks with lateral gene transfers which consist, roughly, of a principal rooted tree with its leaves labelled on a set of taxa, and a set of extra secondary arcs between nodes in this tree representing lateral gene transfers. An LGT network gives rise in a natural way to a principal phylogenetic subtree and a set of secondary phylogenetic subtrees, which, roughly, represent, respectively, the main line of evolution of most genes and the secondary lines of evolution through lateral gene transfers. Results We introduce a set of simple conditions on an LGT network that guarantee that its principal and secondary phylogenetic subtrees are pairwise different and that these subtrees determine, up to isomorphism, the LGT network. We then give an algorithm that, given a set of pairwise different phylogenetic trees \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$T_0,T_1,\ldots ,T_k$$\end{document}T0,T1,…,Tk on the same set of taxa, outputs, when it exists, the LGT network that satisfies these conditions and such that its principal phylogenetic tree is \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$T_0$$\end{document}T0 and its secondary phylogenetic trees are \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$T_1,\ldots ,T_k$$\end{document}T1,…,Tk. Electronic supplementary material The online version of this article (doi:10.1186/s13015-015-0059-z) contains supplementary material, which is available to authorized users.
Collapse
|