201
|
Whidden C, Zeh N, Beiko RG. Supertrees Based on the Subtree Prune-and-Regraft Distance. Syst Biol 2014; 63:566-81. [PMID: 24695589 PMCID: PMC4055872 DOI: 10.1093/sysbio/syu023] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 05/10/2013] [Accepted: 03/18/2014] [Indexed: 11/14/2022] Open
Abstract
Supertree methods reconcile a set of phylogenetic trees into a single structure that is often interpreted as a branching history of species. A key challenge is combining conflicting evolutionary histories that are due to artifacts of phylogenetic reconstruction and phenomena such as lateral gene transfer (LGT). Many supertree approaches use optimality criteria that do not reflect underlying processes, have known biases, and may be unduly influenced by LGT. We present the first method to construct supertrees by using the subtree prune-and-regraft (SPR) distance as an optimality criterion. Although calculating the rooted SPR distance between a pair of trees is NP-hard, our new maximum agreement forest-based methods can reconcile trees with hundreds of taxa and >50 transfers in fractions of a second, which enables repeated calculations during the course of an iterative search. Our approach can accommodate trees in which uncertain relationships have been collapsed to multifurcating nodes. Using a series of benchmark datasets simulated under plausible rates of LGT, we show that SPR supertrees are more similar to correct species histories than supertrees based on parsimony or Robinson-Foulds distance criteria. We successfully constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees that covered 244 genomes representing several major bacterial phyla. Our SPR-based approach also allowed direct inference of highways of gene transfer between bacterial classes and genera. A small number of these highways connect genera in different phyla and can highlight specific genes implicated in long-distance LGT. [Lateral gene transfer; matrix representation with parsimony; phylogenomics; prokaryotic phylogeny; Robinson-Foulds; subtree prune-and-regraft; supertrees.]
Affiliation(s)
- Christopher Whidden
- Faculty of Computer Science, Dalhousie University, 6050 University Avenue, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2
| | - Norbert Zeh
- Faculty of Computer Science, Dalhousie University, 6050 University Avenue, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2
| | - Robert G Beiko
- Faculty of Computer Science, Dalhousie University, 6050 University Avenue, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2
| |
|
202
|
Choi J, Détry N, Kim KT, Asiegbu FO, Valkonen JPT, Lee YH. fPoxDB: fungal peroxidase database for comparative genomics. BMC Microbiol 2014; 14:117. [PMID: 24885079 PMCID: PMC4029949 DOI: 10.1186/1471-2180-14-117] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 09/09/2013] [Accepted: 04/24/2014] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Peroxidases are a group of oxidoreductases which mediate electron transfer from hydrogen peroxide (H2O2) and organic peroxide to various electron acceptors. They possess a broad spectrum of impact on industry and fungal biology. There are numerous industrial applications using peroxidases, such as to catalyse highly reactive pollutants and to break down lignin for recycling of carbon sources. Moreover, genes encoding peroxidases play important roles in fungal pathogenicity in both humans and plants. For better understanding of fungal peroxidases at the genome-level, a novel genomics platform is required. To this end, Fungal Peroxidase Database (fPoxDB; http://peroxidase.riceblast.snu.ac.kr/) has been developed to provide such a genomics platform for this important gene family. DESCRIPTION In order to identify and classify fungal peroxidases, 24 sequence profiles were built and applied on 331 genomes including 216 from fungi and Oomycetes. In addition, NoxR, which is known to regulate NADPH oxidases (NoxA and NoxB) in fungi, was also added to the pipeline. Collectively, 6,113 genes were predicted to encode 25 gene families, presenting well-separated distribution along the taxonomy. For instance, the genes encoding lignin peroxidase, manganese peroxidase, and versatile peroxidase were concentrated in the rot-causing basidiomycetes, reflecting their ligninolytic capability. As a genomics platform, fPoxDB provides diverse analysis resources, such as gene family predictions based on fungal sequence profiles, pre-computed results of eight bioinformatics programs, similarity search tools, a multiple sequence alignment tool, domain analysis functions, and taxonomic distribution summary, some of which are not available in the previously developed peroxidase resource. In addition, fPoxDB is interconnected with other family web systems, providing extended analysis opportunities. CONCLUSIONS fPoxDB is a fungi-oriented genomics platform for peroxidases. The sequence-based prediction and diverse analysis toolkits with easy-to-follow web interface offer a useful workbench to study comparative and evolutionary genomics of peroxidases in fungi.
Affiliation(s)
- Jaeyoung Choi
- Fungal Bioinformatics Laboratory and Department of Agricultural Biotechnology, Seoul National University, Seoul 151-921, Korea
- Center for Fungal Pathogenesis, Seoul National University, Seoul 151-921, Korea
| | - Nicolas Détry
- Department of Forest Sciences, University of Helsinki, 00014 Helsinki, Finland
| | - Ki-Tae Kim
- Fungal Bioinformatics Laboratory and Department of Agricultural Biotechnology, Seoul National University, Seoul 151-921, Korea
| | - Fred O Asiegbu
- Department of Forest Sciences, University of Helsinki, 00014 Helsinki, Finland
| | - Jari PT Valkonen
- Department of Agricultural Sciences, University of Helsinki, 00014 Helsinki, Finland
| | - Yong-Hwan Lee
- Fungal Bioinformatics Laboratory and Department of Agricultural Biotechnology, Seoul National University, Seoul 151-921, Korea
- Center for Fungal Pathogenesis, Seoul National University, Seoul 151-921, Korea
- Department of Forest Sciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Agricultural Sciences, University of Helsinki, 00014 Helsinki, Finland
- Center for Fungal Genetic Resources, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Korea
| |
|
203
|
Zheng Y, Zhang L. Effect of Incomplete Lineage Sorting On Tree-Reconciliation-Based Inference of Gene Duplication. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:477-485. [PMID: 26356016 DOI: 10.1109/tcbb.2013.2297913] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
In the tree reconciliation approach to infer the duplication history of a gene family, the gene (family) tree is compared to the corresponding species tree. Incomplete lineage sorting (ILS) gives rise to stochastic variation in the topology of a gene tree and hence likely introduces false duplication events when a tree reconciliation method is used. We quantify the effect of ILS on gene duplication inference in a species tree in terms of the expected number of false duplication events inferred from reconciling a random gene tree, which occurs with a probability predicted in coalescent theory, and the species tree. We computationally examine the relationship between the effect of ILS on duplication inference in a species tree and its topological parameters. Our findings suggest that ILS may cause non-negligible bias on duplication inference, particularly on an asymmetric species tree. Hence, when gene duplication is inferred via tree reconciliation or any other approach that takes gene tree topology into account, the ILS-induced bias should be examined cautiously.
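The reconciliation step this abstract refers to can be illustrated with a minimal sketch of the standard LCA mapping, in which a gene-tree node is called a duplication when it maps to the same species-tree node as one of its children. This is not the authors' code: the nested-tuple tree encoding, the function names, and the `species_of` naming rule are all assumptions made for illustration. The second call shows how a gene tree that merely disagrees with the species tree, as can happen under ILS, produces an inferred duplication.

```python
def leaf_set(tree):
    """Leaf labels below a node; a tree is a leaf string or a tuple of subtrees."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clusters(tree):
    """Leaf sets of every node in the tree (each node is identified by its cluster)."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    found = [leaf_set(tree)]
    for child in tree:
        found.extend(clusters(child))
    return found


def lca(species_clusters, species):
    """Smallest species-tree cluster containing the given set of species."""
    return min((c for c in species_clusters if species <= c), key=len)


def count_duplications(gene_tree, species_tree, species_of):
    """Count gene-tree nodes that the LCA mapping labels as duplications."""
    species_clusters = clusters(species_tree)
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            return lca(species_clusters, frozenset([species_of(node)]))
        child_maps = [visit(child) for child in node]
        here = lca(species_clusters, frozenset().union(*child_maps))
        # A node mapped to the same species node as a child is a duplication.
        if here in child_maps:
            duplications += 1
        return here

    visit(gene_tree)
    return duplications


# Hypothetical naming rule: gene "a1" belongs to species "A".
species_of = lambda gene: gene[0].upper()
species_tree = (("A", "B"), "C")

concordant = count_duplications((("a1", "b1"), "c1"), species_tree, species_of)
discordant = count_duplications((("a1", "c1"), "b1"), species_tree, species_of)
```

In this toy case the discordant topology contains no gene copies beyond one per species, yet the mapping still reports one duplication, which is the kind of false event the abstract quantifies.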
|
204
|
Rusin LY, Lyubetskaya EV, Gorbunov KY, Lyubetsky VA. Reconciliation of gene and species trees. BioMed Research International 2014; 2014:642089. [PMID: 24800245 PMCID: PMC3985182 DOI: 10.1155/2014/642089] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 08/11/2013] [Accepted: 11/27/2013] [Indexed: 11/18/2022]
Abstract
The first part of the paper briefly overviews the problem of gene and species trees reconciliation with the focus on defining and algorithmic construction of the evolutionary scenario. Basic ideas are discussed for the aspects of mapping definitions, costs of the mapping and evolutionary scenario, imposing time scales on a scenario, incorporating horizontal gene transfers, binarization and reconciliation of polytomous trees, and construction of species trees and scenarios. The review does not intend to cover the vast diversity of literature published on these subjects. Instead, the authors strived to overview the problem of the evolutionary scenario as a central concept in many areas of evolutionary research. The second part provides detailed mathematical proofs for the solutions of two problems: (i) inferring a gene evolution along a species tree accounting for various types of evolutionary events and (ii) trees reconciliation into a single species tree when only gene duplications and losses are allowed. All proposed algorithms have a cubic time complexity and are mathematically proved to find exact solutions. Solving algorithms for problem (ii) can be naturally extended to incorporate horizontal transfers, other evolutionary events, and time scales on the species tree.
Affiliation(s)
- L. Y. Rusin
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoy Karetny Pereulok 19, Moscow 127994, Russia
- Faculty of Biology, Moscow State University, Leninskie Gory 1-12, Moscow 119234, Russia
| | - E. V. Lyubetskaya
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoy Karetny Pereulok 19, Moscow 127994, Russia
| | - K. Y. Gorbunov
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoy Karetny Pereulok 19, Moscow 127994, Russia
| | - V. A. Lyubetsky
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoy Karetny Pereulok 19, Moscow 127994, Russia
| |
|
205
|
Morrison DA. Is the Tree of Life the Best Metaphor, Model, or Heuristic for Phylogenetics? Syst Biol 2014; 63:628-38. [DOI: 10.1093/sysbio/syu026] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Indexed: 11/14/2022] Open
Affiliation(s)
- David A. Morrison
- Section for Parasitology, Swedish University of Agricultural Sciences, 751 89 Uppsala, Sweden
| |
|
206
|
Puggioni V, Dondi A, Folli C, Shin I, Rhee S, Percudani R. Gene Context Analysis Reveals Functional Divergence between Hypothetically Equivalent Enzymes of the Purine–Ureide Pathway. Biochemistry 2014; 53:735-45. [DOI: 10.1021/bi4010107] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Affiliation(s)
- Vincenzo Puggioni
- Laboratory of Biochemistry, Molecular Biology, and Bioinformatics, Department of Life Sciences, University of Parma, Italy
| | - Ambra Dondi
- Laboratory of Biochemistry, Molecular Biology, and Bioinformatics, Department of Life Sciences, University of Parma, Italy
| | - Claudia Folli
- Department of Food Science, University of Parma, Italy
| | - Inchul Shin
- Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
| | - Sangkee Rhee
- Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
| | - Riccardo Percudani
- Laboratory of Biochemistry, Molecular Biology, and Bioinformatics, Department of Life Sciences, University of Parma, Italy
| |
|
207
|
Zheng Y, Zhang L. Reconciliation with Non-binary Gene Trees Revisited. Lecture Notes in Computer Science 2014. [DOI: 10.1007/978-3-319-05269-4_33] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/11/2023]
|
208
|
Nakhleh L. Computational approaches to species phylogeny inference and gene tree reconciliation. Trends Ecol Evol 2013; 28:719-28. [PMID: 24094331 PMCID: PMC3855310 DOI: 10.1016/j.tree.2013.09.004] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.9] [Received: 05/19/2013] [Revised: 09/02/2013] [Accepted: 09/03/2013] [Indexed: 01/20/2023]
Abstract
An intricate relation exists between gene trees and species phylogenies, due to evolutionary processes that act on the genes within and across the branches of the species phylogeny. From an analytical perspective, gene trees serve as character states for inferring accurate species phylogenies, and species phylogenies serve as a backdrop against which gene trees are contrasted for elucidating evolutionary processes and parameters. In a 1997 paper, Maddison discussed this relation, reviewed the signatures left by three major evolutionary processes on the gene trees, and surveyed parsimony and likelihood criteria for utilizing these signatures to elucidate computationally this relation. Here, I review progress that has been made in developing computational methods for analyses under these two criteria, and survey remaining challenges.
Affiliation(s)
- Luay Nakhleh
- Department of Computer Science, Rice University, Houston, TX 77005, USA; Department of Ecology and Evolutionary Biology, Rice University, Houston, TX 77005, USA.
| |
|
209
|
Choi J, Kim KT, Jeon J, Lee YH. Fungal plant cell wall-degrading enzyme database: a platform for comparative and evolutionary genomics in fungi and Oomycetes. BMC Genomics 2013; 14 Suppl 5:S7. [PMID: 24564786 PMCID: PMC3852112 DOI: 10.1186/1471-2164-14-s5-s7] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Indexed: 12/17/2022] Open
Abstract
Background Plant cell wall-degrading enzymes (PCWDEs) play significant roles throughout the fungal life including acquisition of nutrients and decomposition of plant cell walls. In addition, many PCWDEs are also utilized by biofuel and pulp industries. In order to develop a comparative genomics platform focused on fungal PCWDEs and provide a resource for evolutionary studies, Fungal PCWDE Database (FPDB) is constructed (http://pcwde.riceblast.snu.ac.kr/). Results In order to archive fungal PCWDEs, 22 sequence profiles were constructed and searched on 328 genomes of fungi, Oomycetes, plants and animals. A total of 6,682 putative genes encoding PCWDEs were predicted, showing differential distribution by their life styles, host ranges and taxonomy. Genes known to be involved in fungal pathogenicity, including polygalacturonase (PG) and pectin lyase, were enriched in plant pathogens. Furthermore, crop pathogens had more PCWDEs than those of rot fungi, implying that the PCWDEs analysed in this study are more needed for invading plant hosts than wood-decaying processes. Evolutionary analysis of PGs in 34 selected genomes revealed that gene duplication and loss events were mainly driven by taxonomic divergence and partly contributed by those events in species-level, especially in plant pathogens. Conclusions The FPDB would provide a fungi-specialized genomics platform, a resource for evolutionary studies of PCWDE gene families and extended analysis option by implementing Favorite, which is a data exchange and analysis hub built in Comparative Fungal Genomics Platform (CFGP 2.0; http://cfgp.snu.ac.kr/).
|
210
|
Nguyen TH, Ranwez V, Berry V, Scornavacca C. Support measures to estimate the reliability of evolutionary events predicted by reconciliation methods. PLoS One 2013; 8:e73667. [PMID: 24124449 PMCID: PMC3790797 DOI: 10.1371/journal.pone.0073667] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 04/24/2013] [Accepted: 07/23/2013] [Indexed: 11/19/2022] Open
Abstract
The genome content of extant species is derived from that of ancestral genomes, distorted by evolutionary events such as gene duplications, transfers and losses. Reconciliation methods aim at recovering such events and at localizing them in the species history, by comparing gene family trees to species trees. These methods play an important role in studying genome evolution as well as in inferring orthology relationships. A major issue with reconciliation methods is that the reliability of predicted evolutionary events may be questioned for various reasons: Firstly, there may be multiple equally optimal reconciliations for a given species tree–gene tree pair. Secondly, reconciliation methods can be misled by inaccurate gene or species trees. Thirdly, predicted events may fluctuate with method parameters such as the cost or rate of elementary events. For all of these reasons, confidence values for predicted evolutionary events are sorely needed. It was recently suggested that the frequency of each event in the set of all optimal reconciliations could be used as a support measure. We put this proposition to the test here and also consider a variant where the support measure is obtained by additionally accounting for suboptimal reconciliations. Experiments on simulated data show the relevance of event supports computed by both methods, while resorting to suboptimal sampling was shown to be more effective. Unfortunately, we also show that, unlike the majority-rule consensus tree for phylogenies, there is no guarantee that a single reconciliation can contain all events having above 50% support. In this paper, we detail how to rely on the reconciliation graph to efficiently identify the median reconciliation. Such median reconciliation can be found in polynomial time within the potentially exponential set of most parsimonious reconciliations.
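The frequency-based support measure described in this abstract can be sketched as follows, assuming each reconciliation is given simply as a set of event labels. The event tuples and the three-reconciliation input are hypothetical toy data, not output from any reconciliation tool, and the function name is an assumption. The example also shows the caveat the abstract raises: every event can have more than 50% support without any single reconciliation containing all of them.

```python
from collections import Counter


def event_support(reconciliations):
    """Fraction of the given reconciliations in which each event occurs."""
    total = len(reconciliations)
    counts = Counter(event for rec in reconciliations for event in set(rec))
    return {event: count / total for event, count in counts.items()}


# Hypothetical toy input: three equally optimal reconciliations,
# each written as a set of (event type, gene node) labels.
recs = [
    {("dup", "g1"), ("transfer", "g2")},
    {("transfer", "g2"), ("loss", "g3")},
    {("dup", "g1"), ("loss", "g3")},
]

support = event_support(recs)
majority = {event for event, s in support.items() if s > 0.5}

# True only if some single reconciliation contains every majority event.
contained = any(majority <= rec for rec in recs)
```

Here each event appears in two of the three sets, so all three events pass the 50% threshold while `contained` is false.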
Affiliation(s)
- Thi-Hau Nguyen
- Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, University Montpellier 2 - Centre national de la recherche scientifique, Montpellier, France
- Montpellier SupAgro (Unité Mixte de Recherche AGAP), Montpellier, France
- Institut de Biologie Computationnelle, Montpellier, France
| | - Vincent Ranwez
- Montpellier SupAgro (Unité Mixte de Recherche AGAP), Montpellier, France
- Institut de Biologie Computationnelle, Montpellier, France
| | - Vincent Berry
- Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, University Montpellier 2 - Centre national de la recherche scientifique, Montpellier, France
- Institut de Biologie Computationnelle, Montpellier, France
| | - Celine Scornavacca
- Institut des Sciences de l'Evolution de Montpellier, Unité Mixte de Recherche 5554, University Montpellier 2, Montpellier, France
- Institut de Biologie Computationnelle, Montpellier, France
| |
|
211
|
Bansal MS, Alm EJ, Kellis M. Reconciliation revisited: handling multiple optima when reconciling with duplication, transfer, and loss. J Comput Biol 2013; 20:738-54. [PMID: 24033262 PMCID: PMC3791060 DOI: 10.1089/cmb.2013.0073] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 12/16/2022] Open
Abstract
Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While duplication-loss (DL) reconciliation leads to a unique maximum-parsimony solution, duplication-transfer-loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. This problem is further exacerbated by the fact that different event cost assignments yield different sets of optimal reconciliations. Here, we present an effective, efficient, and scalable method for dealing with these fundamental problems in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We show that even gene trees with only a few dozen genes often have millions of optimal reconciliations and present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn^2) time per sample, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mappings and event assignments and to investigate the impact of varying event costs. We apply our method to a biological dataset of approximately 4700 gene trees from 100 taxa and observe that 93% of event assignments and 73% of mappings remain consistent across different multiple optima. Our analysis represents the first systematic investigation of the space of optimal DTL reconciliations and has many important implications for the study of gene family evolution.
Affiliation(s)
- Mukul S. Bansal
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts
| | - Eric J. Alm
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| |
|
212
|
Capra JA, Stolzer M, Durand D, Pollard KS. How old is my gene? Trends Genet 2013; 29:659-68. [PMID: 23915718 DOI: 10.1016/j.tig.2013.07.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 04/18/2013] [Revised: 06/13/2013] [Accepted: 07/03/2013] [Indexed: 11/26/2022]
Abstract
Gene functions, interactions, disease associations, and ecological distributions are all correlated with gene age. However, it is challenging to estimate the intricate series of evolutionary events leading to a modern-day gene and then to reduce this history to a single age estimate. Focusing on eukaryotic gene families, we introduce a framework that can be used to compare current strategies for quantifying gene age, discuss key differences between these methods, and highlight several common problems. We argue that genes with complex evolutionary histories do not have a single well-defined age. As a result, care must be taken to articulate the goals and assumptions of any analysis that uses gene age estimates. Recent algorithmic advances offer the promise of gene age estimates that are fast, accurate, and consistent across gene families. This will enable a shift to integrated genome-wide analyses of all events in gene evolutionary histories in the near future.
Affiliation(s)
- John A Capra
- Center for Human Genetics Research and Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA
| |
|
213
|
Deepak A, Fernández-Baca D, McMahon MM. Extracting conflict-free information from multi-labeled trees. Algorithms Mol Biol 2013; 8:18. [PMID: 23837994 PMCID: PMC3716922 DOI: 10.1186/1748-7188-8-18] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/11/2012] [Accepted: 06/29/2013] [Indexed: 11/12/2022] Open
Abstract
Background A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. Results We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximal reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation among MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved. Conclusions Our measure of conflict-free information content based on quartets is simple and topologically appealing. In the experiments, the maximally reduced form is often much smaller than the original tree, yet retains most of the taxa. The reduction algorithm is quadratic in the number of leaves and its complexity is unaffected by the multiplicity of leaf labels or the degree of the nodes.
|
214
|
Nguyen TH, Ranwez V, Pointet S, Chifolleau AMA, Doyon JP, Berry V. Reconciliation and local gene tree rearrangement can be of mutual profit. Algorithms Mol Biol 2013; 8:12. [PMID: 23566548 PMCID: PMC3871789 DOI: 10.1186/1748-7188-8-12] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 12/17/2012] [Accepted: 02/05/2013] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND Reconciliation methods compare gene trees and species trees to recover evolutionary events such as duplications, transfers and losses explaining the history and composition of genomes. It is well-known that gene trees inferred from molecular sequences can be partly erroneous due to incorrect sequence alignments as well as phylogenetic reconstruction artifacts such as long branch attraction. In practice, this leads reconciliation methods to overestimate the number of evolutionary events. Several methods have been proposed to circumvent this problem, by collapsing the unsupported edges and then resolving the obtained multifurcating nodes, or by directly rearranging the binary gene trees. Yet these methods have been defined for models of evolution accounting only for duplications and losses, i.e., they cannot be applied to handle prokaryotic gene families. RESULTS We propose a reconciliation method accounting for gene duplications, losses and horizontal transfers, that specifically takes into account the uncertainties in gene trees by rearranging their weakly supported edges. Rearrangements are performed on edges having a low confidence value, and are accepted whenever they improve the reconciliation cost. We prove useful properties on the dynamic programming matrix used to compute reconciliations, which allows speeding up the tree space exploration when rearrangements are generated by Nearest Neighbor Interchanges (NNI) edit operations. Experiments on synthetic data show that gene trees modified by such NNI rearrangements are closer to the correct simulated trees and lead to better event predictions on average. Experiments on real data demonstrate that the proposed method leads to a decrease in the reconciliation cost and the number of inferred events. Finally, on a dataset of 30 k gene families, this reconciliation method shows a ranking of prokaryotic phyla by transfer rates identical to that proposed by a different approach dedicated to transfer detection [BMCBIOINF 11:324, 2010, PNAS 109(13):4962-4967, 2012]. CONCLUSIONS Prokaryotic gene trees can now be reconciled with their species phylogeny while accounting for the uncertainty of the gene tree. More accurate and more precise reconciliations are obtained with respect to previous parsimony algorithms not accounting for such uncertainties [LNCS 6398:93-108, 2010, BIOINF 28(12): i283-i291, 2012]. Software implementing the method is freely available at http://www.atgc-montpellier.fr/Mowgli/.
Affiliation(s)
- Thi Hau Nguyen
- LIRMM, UMR 5506 CNRS - Université Montpellier 2, Montpellier Cédex 5, France
- Montpellier SupAgro (UMR AGAP), Montpellier, France
- Institut de Biologie Computationnelle, 95 rue de la Galéra, 34095 Montpellier cédex, France
| | - Vincent Ranwez
- Montpellier SupAgro (UMR AGAP), Montpellier, France
- Institut de Biologie Computationnelle, 95 rue de la Galéra, 34095 Montpellier cédex, France
| | - Stéphanie Pointet
- LIRMM, UMR 5506 CNRS - Université Montpellier 2, Montpellier Cédex 5, France
- Institut de Biologie Computationnelle, 95 rue de la Galéra, 34095 Montpellier cédex, France
| | - Anne-Muriel Arigon Chifolleau
- LIRMM, UMR 5506 CNRS - Université Montpellier 2, Montpellier Cédex 5, France
- Institut de Biologie Computationnelle, 95 rue de la Galéra, 34095 Montpellier cédex, France
| | - Jean-Philippe Doyon
- LIRMM, UMR 5506 CNRS - Université Montpellier 2, Montpellier Cédex 5, France
- Institut de Biologie Computationnelle, 95 rue de la Galéra, 34095 Montpellier cédex, France
| | - Vincent Berry
- LIRMM, UMR 5506 CNRS - Université Montpellier 2, Montpellier Cédex 5, France
- Institut de Biologie Computationnelle, 95 rue de la Galéra, 34095 Montpellier cédex, France
| |
|
215
|
Chauve C, El-Mabrouk N, Guéguen L, Semeria M, Tannier E. Duplication, Rearrangement and Reconciliation: A Follow-Up 13 Years Later. Models and Algorithms for Genome Evolution 2013. [DOI: 10.1007/978-1-4471-5298-9_4] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 02/04/2023]
|
216
|
Zheng Y, Zhang L. Effect of Incomplete Lineage Sorting on Tree-Reconciliation-Based Inference of Gene Duplication. ACTA ACUST UNITED AC 2013. [DOI: 10.1007/978-3-642-38036-5_26] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/18/2023]
|
217
|
Reconciliation Revisited: Handling Multiple Optima When Reconciling with Duplication, Transfer, and Loss. Lecture Notes in Computer Science 2013. [DOI: 10.1007/978-3-642-37195-0_1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/12/2023]
|
218
|
Sakamoto T, Deguchi M, Brustolini OJB, Santos AA, Silva FF, Fontes EPB. The tomato RLK superfamily: phylogeny and functional predictions about the role of the LRRII-RLK subfamily in antiviral defense. BMC Plant Biology 2012; 12:229. [PMID: 23198823 PMCID: PMC3552996 DOI: 10.1186/1471-2229-12-229] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Received: 07/30/2012] [Accepted: 11/18/2012] [Indexed: 05/19/2023]
Abstract
BACKGROUND Receptor-like kinases (RLKs) play key roles during development and in responses to the environment. Despite the relevance of the RLK family and the completion of the tomato genome sequencing, the tomato RLK family has not yet been characterized, and a framework for functional predictions of the members of the family is lacking.
RESULTS To generate a complete list of all the members of the tomato RLK family, we performed a phylogenetic analysis using the Arabidopsis family as a template. A total of 647 RLKs were identified in the tomato genome, which were organized into the same subfamily clades as Arabidopsis RLKs. Only eight of 58 RLK subfamilies exhibited specific expansion/reduction compared to their Arabidopsis counterparts. We also characterized the LRRII-RLK family by phylogeny, genomic analysis, expression profile and interaction with the virulence factor from begomoviruses, the nuclear shuttle protein (NSP). The LRRII subfamily members from tomato and Arabidopsis were highly conserved in both sequence and structure. Nevertheless, the majority of the orthologous pairs did not display similar conservation in the gene expression profile, indicating that these orthologs may have diverged in function after speciation. Based on the fact that members of the Arabidopsis LRRII subfamily (AtNIK1, AtNIK2 and AtNIK3) interact with the begomovirus nuclear shuttle protein (NSP), we examined whether the tomato orthologs of NIK, BAK1 and NsAK genes interact with NSP of Tomato Yellow Spot Virus (ToYSV). The tomato orthologs of NSP interactors, SlNIKs and SlNsAK, interacted specifically with NSP in yeast and displayed an expression pattern consistent with the pattern of geminivirus infection. In addition to suggesting a functional analogy between these phylogenetically classified orthologs, these results expand our previous observation that NSP-NIK interactions are neither virus-specific nor host-specific.
CONCLUSIONS The tomato RLK superfamily is made up of 647 proteins that form a monophyletic tree with the Arabidopsis RLKs and is divided into 58 subfamilies. Few subfamilies have undergone expansion/reduction, and only six proteins were lineage-specific. Therefore, the tomato RLK family shares functional and structural conservation with Arabidopsis. For the LRRII-RLK members SlNIK1 and SlNIK3, we observed functions analogous to those of their Arabidopsis counterparts with respect to protein-protein interactions and similar expression profiles, which predominated in tissues that support high efficiency of begomovirus infection. Therefore, NIK-mediated antiviral signaling is also likely to operate in tomato, suggesting that tomato NIKs may be good targets for engineering resistance against tomato-infecting begomoviruses.
Affiliation(s)
- Tetsu Sakamoto
  - National Institute of Science and Technology in Plant-Pest Interactions, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
- Michihito Deguchi
  - National Institute of Science and Technology in Plant-Pest Interactions, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
  - Departamento de Bioquímica e Biologia Molecular/BIOAGRO, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
- Otávio JB Brustolini
  - National Institute of Science and Technology in Plant-Pest Interactions, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
  - Departamento de Bioquímica e Biologia Molecular/BIOAGRO, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
- Anésia A Santos
  - National Institute of Science and Technology in Plant-Pest Interactions, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
  - Departamento de Bioquímica e Biologia Molecular/BIOAGRO, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
- Fabyano F Silva
  - Departamento de Estatística, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
- Elizabeth PB Fontes
  - National Institute of Science and Technology in Plant-Pest Interactions, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
  - Departamento de Bioquímica e Biologia Molecular/BIOAGRO, Universidade Federal de Viçosa, 36570-000, Viçosa, MG, Brazil
|
219
|
Budd A, Devos DP. Evaluating the Evolutionary Origins of Unexpected Character Distributions within the Bacterial Planctomycetes-Verrucomicrobia-Chlamydiae Superphylum. Front Microbiol 2012; 3:401. [PMID: 23189077 PMCID: PMC3505017 DOI: 10.3389/fmicb.2012.00401] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/15/2012] [Accepted: 10/31/2012] [Indexed: 12/26/2022]
Abstract
Recently, several characters that are absent from most bacteria, but which are found in many eukaryotes or archaea, have been identified within the bacterial Planctomycetes-Verrucomicrobia-Chlamydiae (PVC) superphylum. Hypotheses of the evolutionary history of such characters are commonly based on the inference of phylogenies of gene or protein families associated with the traits, estimated from multiple sequence alignments (MSAs). So far, studies of this kind have focused on the distribution of (i) two genes involved in the synthesis of sterol, (ii) tubulin genes, and (iii) c1 transfer genes. In many cases, these analyses have concluded that horizontal gene transfer (HGT) is likely to have played a role in shaping the taxonomic distribution of these gene families. In this article, we describe several issues with the inference of HGT from such analyses, in particular concerning the considerable uncertainty associated with our estimation of both gene family phylogenies (especially those containing ancient lineage divergences) and the Tree of Life (ToL), and the need for wider use and further development of explicit probabilistic models to compare hypotheses of vertical and horizontal genetic transmission. We suggest that data which is often taken as evidence for the occurrence of ancient HGT events may not be as convincing as is commonly described, and consideration of alternative theories is recommended. While focusing on analyses including PVCs, this discussion is also relevant for inferences of HGT involving other groups of organisms.
Affiliation(s)
- A. Budd
  - European Molecular Biology Laboratory, Heidelberg, Germany
- D. P. Devos
  - European Molecular Biology Laboratory, Heidelberg, Germany
|