1
Chesters D, Bossert S, Orr MC. [genus]_[species]; Presenting phylogenies to facilitate synthesis. Cladistics 2025; 41:177-192. [PMID: 39673226 DOI: 10.1111/cla.12601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2024] [Revised: 11/23/2024] [Accepted: 11/25/2024] [Indexed: 12/16/2024]
Abstract
Each published phylogeny is a potential contribution to the synthesis of the Tree of Life and countless downstream projects. Steps are needed for fully synthesizable science, but only a minority of studies achieve these. We here review the range of phylogenetic presentation and note aspects that hinder further analysis. We provide simple suggestions on publication that would greatly enhance utilizability, and propose a formal grammar for phylogeny terminal format. We suggest that each published phylogeny should be accompanied by at minimum the single preferred result in machine readable tree (e.g. Newick) form in the supplement, a simple task fulfilled by fewer than half of studies. Further, the tree should be clear from the file name and extension; the orientation (rooted or unrooted) should match the figures; terminal labels should include genus and species IDs; underscores should separate strings within-field (instead of white spaces); and if other informational fields are added these should be separated by a unique delimiting character (we suggest multiple underscores or the vertical pipe character, |) and ordered consistently. These requirements are largely independent of phylogenetic study aims, while we note other requirements for synthesis (e.g. removal of species repeats and uninformative terminals) that are not necessarily the responsibility of authors. Machine readable trees show greater variation in terminal formatting than typical phylogeny images (owing presumably to greater scrutiny of the latter), and thus are complex and laborious to parse. Since the majority of existing studies have provided only images, we additionally review typical variation in plotting style, information that will be necessary for developing the automated phylogeny transcription tools needed for their eventual inclusion in the Tree of Life.
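The terminal-label conventions suggested in this abstract (genus and species first, single underscores within a field, a pipe or multiple underscores between fields) can be illustrated with a small parser. This is a minimal sketch and not the formal grammar proposed in the paper: the regular expressions, the `Terminal` type, the `parse_terminal` function, and the example labels (including the accession-like string) are all assumptions made for illustration.

```python
import re
from typing import NamedTuple, Tuple

# Assumed field delimiter: a vertical pipe, or a run of two or more underscores.
FIELD_DELIMITER = re.compile(r"\||_{2,}")
# Assumed shape of the first field: Genus_species, one underscore, no spaces.
BINOMIAL = re.compile(r"^([A-Z][a-z]+)_([a-z][a-z-]*)$")


class Terminal(NamedTuple):
    genus: str
    species: str
    extra_fields: Tuple[str, ...]


def parse_terminal(label: str) -> Terminal:
    """Split one terminal label into genus, species, and any further fields."""
    fields = FIELD_DELIMITER.split(label.strip())
    match = BINOMIAL.match(fields[0])
    if match is None:
        raise ValueError(f"first field is not Genus_species: {fields[0]!r}")
    return Terminal(match.group(1), match.group(2), tuple(fields[1:]))


# Hypothetical labels, made up for this example.
print(parse_terminal("Apis_mellifera|MK123456|China"))
print(parse_terminal("Bombus_terrestris__voucher_42"))
```

The sketch is deliberately strict: a label containing white space or lacking a species epithet raises `ValueError`, and subspecies or open-nomenclature names would need a looser pattern.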
Affiliation(s)
- Douglas Chesters
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - International College, University of Chinese Academy of Sciences, Shijingshan District, Beijing, 100049, China
- Silas Bossert
  - Department of Entomology, Washington State University, 1945 Ferdinand's Ln, Pullman, WA, 99163, USA
- Michael C Orr
  - Entomologie, Staatliches Museum für Naturkunde Stuttgart, Rosenstein 1, Stuttgart, 70191, Germany
2
Morel B, Williams TA, Stamatakis A, Szöllősi GJ. AleRax: a tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss. Bioinformatics 2024; 40:btae162. [PMID: 38514421 PMCID: PMC10990685 DOI: 10.1093/bioinformatics/btae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 01/30/2024] [Accepted: 03/19/2024] [Indexed: 03/23/2024]
Abstract
MOTIVATION: Genomes are a rich source of information on the pattern and process of evolution across biological scales. How best to make use of that information is an active area of research in phylogenetics. Ideally, phylogenetic methods should not only model substitutions along gene trees, which explain differences between homologous gene sequences, but also the processes that generate the gene trees themselves along a shared species tree. To conduct accurate inferences, one needs to account for uncertainty at both levels, that is, in gene trees estimated from inherently short sequences and in their diverse evolutionary histories along a shared species tree.
RESULTS: We present AleRax, a software that can infer reconciled gene trees together with a shared species tree using a simple, yet powerful, probabilistic model of gene duplication, transfer, and loss. A key feature of AleRax is its ability to account for uncertainty in the gene tree and its reconciliation by using an efficient approximation to calculate the joint phylogenetic-reconciliation likelihood and sample reconciled gene trees accordingly. Simulations and analyses of empirical data show that AleRax is one order of magnitude faster than competing gene tree inference tools while attaining the same accuracy. It is consistently more robust than species tree inference methods such as SpeciesRax and ASTRAL-Pro 2 under gene tree uncertainty. Finally, AleRax can process multiple gene families in parallel, thereby allowing users to compare competing phylogenetic hypotheses and estimate model parameters, such as duplication, transfer, and loss probabilities, for genome-scale datasets with hundreds of taxa.
AVAILABILITY AND IMPLEMENTATION: GNU GPL at https://github.com/BenoitMorel/AleRax and data are made available at https://cme.h-its.org/exelixis/material/alerax_data.tar.gz.
Affiliation(s)
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe 76131, Germany
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe 76131, Germany
  - Institute of Computer Science, Biodiversity Computing Group, Heraklion GR-70013, Greece
- Gergely J Szöllősi
  - ELTE-MTA “Lendület”, Evolutionary Genomics Research Group, Budapest H-1117, Hungary
  - Institute of Evolution, HUN-REN Centre for Ecological Research, Budapest H-1121, Hungary
  - Model-Based Evolutionary Genomics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa 904-0495, Japan
3
Marlétaz F, Timoshevskaya N, Timoshevskiy VA, Parey E, Simakov O, Gavriouchkina D, Suzuki M, Kubokawa K, Brenner S, Smith JJ, Rokhsar DS. The hagfish genome and the evolution of vertebrates. Nature 2024; 627:811-820. [PMID: 38262590 PMCID: PMC10972751 DOI: 10.1038/s41586-024-07070-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/17/2023] [Accepted: 01/15/2024] [Indexed: 01/25/2024]
Abstract
As the only surviving lineages of jawless fishes, hagfishes and lampreys provide a crucial window into early vertebrate evolution [1-3]. Here we investigate the complex history, timing and functional role of genome-wide duplications [4-7] and programmed DNA elimination [8,9] in vertebrates in the light of a chromosome-scale genome sequence for the brown hagfish Eptatretus atami. Combining evidence from syntenic and phylogenetic analyses, we establish a comprehensive picture of vertebrate genome evolution, including an auto-tetraploidization (1RV) that predates the early Cambrian cyclostome-gnathostome split, followed by a mid-late Cambrian allo-tetraploidization (2RJV) in gnathostomes and a prolonged Cambrian-Ordovician hexaploidization (2RCY) in cyclostomes. Subsequently, hagfishes underwent extensive genomic changes, with chromosomal fusions accompanied by the loss of genes that are essential for organ systems (for example, genes involved in the development of eyes and in the proliferation of osteoclasts); these changes account, in part, for the simplification of the hagfish body plan [1,2]. Finally, we characterize programmed DNA elimination in hagfish, identifying protein-coding genes and repetitive elements that are deleted from somatic cell lineages during early development. The elimination of these germline-specific genes provides a mechanism for resolving genetic conflict between soma and germline by repressing germline and pluripotency functions, paralleling findings in lampreys [10,11]. Reconstruction of the early genomic history of vertebrates provides a framework for further investigations of the evolution of cyclostomes and jawed vertebrates.
Affiliation(s)
- Ferdinand Marlétaz
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Elise Parey
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Oleg Simakov
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Department for Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Daria Gavriouchkina
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - UK Dementia Research Institute, University College London, London, UK
- Masakazu Suzuki
  - Department of Science, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Kaoru Kubokawa
  - Ocean Research Institute, The University of Tokyo, Tokyo, Japan
- Sydney Brenner
  - Comparative and Medical Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore, Singapore
- Jeramiah J Smith
  - Department of Biology, University of Kentucky, Lexington, KY, USA
- Daniel S Rokhsar
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
4
Cribbie EP, Doerr D, Chauve C. AGO, a Framework for the Reconstruction of Ancestral Syntenies and Gene Orders. Methods Mol Biol 2024; 2802:247-265. [PMID: 38819563 DOI: 10.1007/978-1-0716-3838-5_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Reconstructing ancestral gene orders from the genome data of extant species is an important problem in comparative and evolutionary genomics. In a phylogenomics setting that accounts for gene family evolution through gene duplication and gene loss, the reconstruction of ancestral gene orders involves several steps, including multiple sequence alignment, the inference of reconciled gene trees, and the inference of ancestral syntenies and gene adjacencies. For each step of such a process, several methods can be used and implemented using a growing corpus of tools, often parameterized; in practice, interfacing such tools into an ancestral gene order reconstruction pipeline is far from trivial. This chapter introduces AGO, a Python-based framework for building ancestral gene order reconstruction pipelines that interface and parameterize different bioinformatics tools. The authors illustrate the features of AGO by reconstructing ancestral gene orders for the X chromosome of three ancestral Anopheles species using three different pipelines. AGO is freely available at https://github.com/cchauve/AGO-pipeline.
Affiliation(s)
- Evan P Cribbie
  - Department of Mathematics, Simon Fraser University, Burnaby, BC, Canada
- Daniel Doerr
  - Department for Endocrinology and Diabetology, Medical Faculty and University Hospital Düsseldorf, German Diabetes Center (DDZ), Leibniz Institute for Diabetes Research, and Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Cedric Chauve
  - Department of Mathematics, Simon Fraser University, Burnaby, BC, Canada
5
Mateos K, Chappell G, Klos A, Le B, Boden J, Stüeken E, Anderson R. The evolution and spread of sulfur cycling enzymes reflect the redox state of the early Earth. Science Advances 2023; 9:eade4847. [PMID: 37418533 PMCID: PMC10328410 DOI: 10.1126/sciadv.ade4847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Revised: 02/06/2023] [Accepted: 06/05/2023] [Indexed: 07/09/2023]
Abstract
The biogeochemical sulfur cycle plays a central role in fueling microbial metabolisms, regulating the Earth's redox state, and affecting climate. However, geochemical reconstructions of the ancient sulfur cycle are confounded by ambiguous isotopic signals. We use phylogenetic reconciliation to ascertain the timing of ancient sulfur cycling gene events across the tree of life. Our results suggest that metabolisms using sulfide oxidation emerged in the Archean, but those involving thiosulfate emerged only after the Great Oxidation Event. Our data reveal that observed geochemical signatures resulted not from the expansion of a single type of organism but were instead associated with genomic innovation across the biosphere. Moreover, our results provide the first indication of organic sulfur cycling from the Mid-Proterozoic onwards, with implications for climate regulation and atmospheric biosignatures. Overall, our results provide insights into how the biological sulfur cycle evolved in tandem with the redox state of the early Earth.
Affiliation(s)
- Katherine Mateos
  - Carleton College, Northfield, MN, USA
  - Ocean Sciences Department, University of California Santa Cruz, Santa Cruz, CA, USA
- Garrett Chappell
  - Carleton College, Northfield, MN, USA
  - Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Aya Klos
  - Carleton College, Northfield, MN, USA
- Bryan Le
  - Carleton College, Northfield, MN, USA
- Joanne Boden
  - University of St. Andrews, School of Earth and Environmental Sciences, Bute Building, Queen’s Terrace, St Andrews, Fife KY16 9TS, UK
- Eva Stüeken
  - University of St. Andrews, School of Earth and Environmental Sciences, Bute Building, Queen’s Terrace, St Andrews, Fife KY16 9TS, UK
- Rika Anderson
  - Carleton College, Northfield, MN, USA
  - NASA NExSS Virtual Planetary Laboratory, University of Washington, Seattle, WA, USA
6
Goulty M, Botton-Amiot G, Rosato E, Sprecher SG, Feuda R. The monoaminergic system is a bilaterian innovation. Nat Commun 2023; 14:3284. [PMID: 37280201 DOI: 10.1038/s41467-023-39030-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/15/2022] [Accepted: 05/25/2023] [Indexed: 06/08/2023]
Abstract
Monoamines like serotonin, dopamine, and adrenaline/noradrenaline (epinephrine/norepinephrine) act as neuromodulators in the nervous system. They play a role in complex behaviours, cognitive functions such as learning and memory formation, as well as fundamental homeostatic processes such as sleep and feeding. However, the evolutionary origin of the genes required for monoaminergic modulation is uncertain. Using a phylogenomic approach, in this study, we show that most of the genes involved in monoamine production, modulation, and reception originated in the bilaterian stem group. This suggests that the monoaminergic system is a bilaterian novelty and that its evolution may have contributed to the Cambrian diversification.
Affiliation(s)
- Matthew Goulty
  - Department of Genetics and Genome Biology, University of Leicester, Leicestershire, UK
- Gaelle Botton-Amiot
  - Department of Biology, Institute of Zoology, University of Fribourg, CH-1700, Fribourg, Switzerland
- Ezio Rosato
  - Department of Genetics and Genome Biology, University of Leicester, Leicestershire, UK
- Simon G Sprecher
  - Department of Biology, Institute of Zoology, University of Fribourg, CH-1700, Fribourg, Switzerland
- Roberto Feuda
  - Department of Genetics and Genome Biology, University of Leicester, Leicestershire, UK
7
Marlétaz F, Timoshevskaya N, Timoshevskiy V, Simakov O, Parey E, Gavriouchkina D, Suzuki M, Kubokawa K, Brenner S, Smith J, Rokhsar DS. The hagfish genome and the evolution of vertebrates. bioRxiv: the preprint server for biology 2023:2023.04.17.537254. [PMID: 37131617 PMCID: PMC10153176 DOI: 10.1101/2023.04.17.537254] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
As the only surviving lineages of jawless fishes, hagfishes and lampreys provide a critical window into early vertebrate evolution. Here, we investigate the complex history, timing, and functional role of genome-wide duplications in vertebrates in the light of a chromosome-scale genome of the brown hagfish Eptatretus atami. Using robust chromosome-scale (paralogon-based) phylogenetic methods, we confirm the monophyly of cyclostomes, document an auto-tetraploidization (1RV) that predated the origin of crown group vertebrates ~517 Mya, and establish the timing of subsequent independent duplications in the gnathostome and cyclostome lineages. Some 1RV gene duplications can be linked to key vertebrate innovations, suggesting that this early genome-wide event contributed to the emergence of pan-vertebrate features such as neural crest. The hagfish karyotype is derived by numerous fusions relative to the ancestral cyclostome arrangement preserved by lampreys. These genomic changes were accompanied by the loss of genes essential for organ systems (eyes, osteoclasts) that are absent in hagfish, accounting in part for the simplification of the hagfish body plan; other gene family expansions account for hagfishes' capacity to produce slime. Finally, we characterise programmed DNA elimination in somatic cells of hagfish, identifying protein-coding and repetitive elements that are deleted during development. As in lampreys, the elimination of these genes provides a mechanism for resolving genetic conflict between soma and germline by repressing germline/pluripotency functions. Reconstruction of the early genomic history of vertebrates provides a framework for further exploration of vertebrate novelties.
Affiliation(s)
- Ferdinand Marlétaz
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Oleg Simakov
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Department of Molecular Evolution and Development, University of Vienna, Vienna, Austria
- Elise Parey
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Daria Gavriouchkina
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Present address: UK Dementia Research Institute, University College London, London, UK
- Masakazu Suzuki
  - Department of Science, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Kaoru Kubokawa
  - Ocean Research Institute, The University of Tokyo, Tokyo, Japan
- Sydney Brenner
  - Comparative and Medical Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore 138673, Singapore
  - Deceased
- Jeramiah Smith
  - Department of Biology, University of Kentucky, Lexington, KY, USA
- Daniel S Rokhsar
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
8
Affiliation(s)
- Hugo Menet
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
- Vincent Daubin
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
  - * E-mail: (VD); (ET)
- Eric Tannier
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
  - Inria, centre de recherche de Lyon, Villeurbanne, France
  - * E-mail: (VD); (ET)
9
Uzun M, Koziaeva V, Dziuba M, Leão P, Krutkina M, Grouzdev D. Detection of interphylum transfers of the magnetosome gene cluster in magnetotactic bacteria. Front Microbiol 2022; 13:945734. [PMID: 35979495 PMCID: PMC9376291 DOI: 10.3389/fmicb.2022.945734] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/16/2022] [Accepted: 06/27/2022] [Indexed: 01/01/2023]
Abstract
Magnetosome synthesis in magnetotactic bacteria (MTB) is regarded as a very ancient evolutionary process that dates back to deep-branching phyla. Magnetotactic bacteria belonging to one such phylum, Nitrospirota, contain the classical genes for magnetosome synthesis (e.g., mam, mms) and man genes, which were considered to be specific to this group. However, the recent discovery of man genes in MTB from the Thermodesulfobacteriota phylum has raised several questions about the inheritance of these genes in MTB. In this work, three new MTB genomes containing man genes, affiliated with Nitrospirota and Thermodesulfobacteriota, were obtained. By applying reconciliation to these and the previously published MTB genomes, we demonstrate that the last common ancestor of all Nitrospirota was most likely not magnetotactic, as assumed previously. Instead, our findings suggest that the genes for magnetosome synthesis were transmitted to the phylum Nitrospirota by horizontal gene transfer (HGT), which is the first case of interphylum transfer of magnetosome genes detected to date. Furthermore, we provide evidence for the HGT of magnetosome genes from the Magnetobacteriaceae to the Dissulfurispiraceae family within Nitrospirota. Thus, our results imply a more significant role of HGT in MTB evolution than previously thought and challenge the hypothesis of the ancient origin of magnetosome synthesis.
Affiliation(s)
- Maria Uzun
  - Skryabin Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
  - Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Veronika Koziaeva
  - Skryabin Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- Marina Dziuba
  - Skryabin Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
  - Department of Microbiology, University of Bayreuth, Bayreuth, Germany
- Pedro Leão
  - Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
  - Department of Marine Science, The University of Texas at Austin, Austin, TX, United States
- Denis Grouzdev
  - SciBear OU, Tallinn, Estonia
  - *Correspondence: Denis Grouzdev,
10
Penel S, Menet H, Tricou T, Daubin V, Tannier E. Thirdkind: displaying phylogenetic encounters beyond 2-level reconciliation. Bioinformatics 2022; 38:2350-2352. [PMID: 35139153 DOI: 10.1093/bioinformatics/btac062] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/30/2021] [Revised: 01/26/2022] [Accepted: 02/03/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION: Reconciliation between host and symbiont phylogenies, or between species and gene phylogenies, is a prevalent approach in evolution; however, no simple generic tool (i.e. virtually usable by all reconciliation software, from host/symbiont to species/gene comparisons) is available to visualize reconciliation results. Moreover, there is no tool to visualize 3-level reconciliations, i.e. two nested reconciliations, as for example in a host/symbiont/gene complex.
RESULTS: Thirdkind is a light and easy-to-install command-line software producing SVG files displaying reconciliations, including 3-level reconciliations. It takes the standard recPhyloXML format as input, and is thus usable with most reconciliation software.
AVAILABILITY AND IMPLEMENTATION: https://github.com/simonpenel/thirdkind/wiki.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Simon Penel
  - Laboratoire de Biométrie et Biologie Evolutive/UMR5558, CNRS/UCBL, Villeurbanne 69622, France
- Hugo Menet
  - Laboratoire de Biométrie et Biologie Evolutive/UMR5558, CNRS/UCBL, Villeurbanne 69622, France
- Théo Tricou
  - Laboratoire de Biométrie et Biologie Evolutive/UMR5558, CNRS/UCBL, Villeurbanne 69622, France
- Vincent Daubin
  - Laboratoire de Biométrie et Biologie Evolutive/UMR5558, CNRS/UCBL, Villeurbanne 69622, France
- Eric Tannier
  - Laboratoire de Biométrie et Biologie Evolutive/UMR5558, CNRS/UCBL, Villeurbanne 69622, France
  - Centre de Recherche Inria Lyon, Villeurbanne 69622, France
11
Bansal MS. Deciphering Microbial Gene Family Evolution Using Duplication-Transfer-Loss Reconciliation and RANGER-DTL. Methods Mol Biol 2022; 2569:233-252. [PMID: 36083451 DOI: 10.1007/978-1-0716-2691-7_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Phylogenetic reconciliation has emerged as a principled, highly effective technique for investigating the origin, spread, and evolutionary history of microbial gene families. Proper application of phylogenetic reconciliation requires a clear understanding of potential pitfalls and sources of error, and knowledge of the most effective reconciliation-based tools and protocols to use to maximize accuracy. In this book chapter, we provide a brief overview of Duplication-Transfer-Loss (DTL) reconciliation, the standard reconciliation model used to study microbial gene families, and provide a step-by-step computational protocol to maximize the accuracy of DTL reconciliation and minimize false-positive evolutionary inferences.
Affiliation(s)
- Mukul S Bansal
  - Department of Computer Science & Engineering, University of Connecticut, Storrs, CT, USA
12
Kuitche E, Qi Y, Tahiri N, Parmer J, Ouangraoua A. DoubleRecViz: a web-based tool for visualizing transcript-gene-species tree reconciliation. Bioinformatics 2021; 37:1920-1922. [PMID: 33051656 DOI: 10.1093/bioinformatics/btaa882] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/31/2020] [Revised: 08/14/2020] [Accepted: 09/29/2020] [Indexed: 11/13/2022]
Abstract
MOTIVATION: A phylogenetic tree reconciliation is a mapping of one phylogenetic tree onto another which represents the co-evolution of two sets of taxa (e.g. parasite-host co-evolution, gene-species co-evolution). The reconciliation framework was extended to allow modeling the co-evolution of three sets of taxa, such as transcript-gene-species co-evolution. Several web-based tools have been developed for the display and manipulation of phylogenetic trees and co-phylogenetic trees involving two trees, but there currently exists no tool for visualizing the joint reconciliation between three phylogenetic trees.
RESULTS: Here, we present DoubleRecViz, a web-based tool for visualizing double reconciliations between phylogenetic trees at three levels: transcript, gene and species. DoubleRecViz extends the RecPhyloXML model (developed for gene-species tree reconciliation) to represent joint transcript-gene and gene-species tree reconciliations. It is implemented using the Dash library, which is a toolbox that provides dynamic visualization functionalities for web data visualization in Python.
AVAILABILITY AND IMPLEMENTATION: DoubleRecViz is available through a web server at https://doublerecviz.cobius.usherbrooke.ca. The source code and information about installation procedures are also available at https://github.com/UdeS-CoBIUS/DoubleRecViz.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
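The idea of reconciliation as a mapping of one tree onto another can be made concrete with the classic last-common-ancestor (LCA) mapping for a gene tree and a species tree. This is a minimal sketch under stated assumptions, not the method used by DoubleRecViz or any other tool listed here: trees are written as nested Python tuples, only duplications and speciations are labelled (no transfers, no loss counting), and the function names, tree shapes, and gene names are hypothetical.

```python
def species_clades(tree):
    """List the leaf set of every node in post-order; the root clade is last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    clades, leaves = [], frozenset()
    for child in tree:
        child_clades = species_clades(child)
        clades.extend(child_clades)
        leaves |= child_clades[-1]  # last entry is the child's own clade
    clades.append(leaves)
    return clades


def lca(clades, species):
    """Smallest species-tree clade that contains every species in the set."""
    return min((c for c in clades if species <= c), key=len)


def reconcile(gene_tree, gene_to_species, clades, events=None):
    """Map each gene-tree node to a species clade and label internal nodes."""
    if events is None:
        events = []
    if isinstance(gene_tree, str):
        return lca(clades, frozenset([gene_to_species[gene_tree]])), events
    child_maps = [reconcile(child, gene_to_species, clades, events)[0]
                  for child in gene_tree]
    mapped = lca(clades, frozenset().union(*child_maps))
    # A node mapped to the same clade as one of its children is a duplication.
    kind = "duplication" if mapped in child_maps else "speciation"
    events.append((sorted(mapped), kind))
    return mapped, events


# Hypothetical example: species A and B each carry two gene copies.
species_tree = (("A", "B"), "C")
gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
gene_to_species = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}

root_clade, events = reconcile(gene_tree, gene_to_species,
                               species_clades(species_tree))
for clade, kind in events:
    print(clade, kind)
```

Representing each species-tree node by its leaf set keeps the sketch short at the cost of efficiency; a three-level (transcript-gene-species) reconciliation would apply the same kind of mapping twice, once per pair of adjacent levels.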
Affiliation(s)
- Esaie Kuitche
  - Department of Computer Science, University of Sherbrooke, Sherbrooke, QC J1K2R1, Canada
- Yanchun Qi
  - Department of Computer Science, University of Sherbrooke, Sherbrooke, QC J1K2R1, Canada
- Aïda Ouangraoua
  - Department of Computer Science, University of Sherbrooke, Sherbrooke, QC J1K2R1, Canada
13
Morel B, Kozlov AM, Stamatakis A, Szöllősi GJ. GeneRax: A Tool for Species-Tree-Aware Maximum Likelihood-Based Gene Family Tree Inference under Gene Duplication, Transfer, and Loss. Mol Biol Evol 2021; 37:2763-2774. [PMID: 32502238 PMCID: PMC8312565 DOI: 10.1093/molbev/msaa141] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Indexed: 12/15/2022]
Abstract
Inferring phylogenetic trees for individual homologous gene families is difficult because alignments are often too short, and thus contain insufficient signal, while substitution models inevitably fail to capture the complexity of the evolutionary processes. To overcome these challenges, species-tree-aware methods also leverage information from a putative species tree. However, only a few methods are available that implement a full likelihood framework or account for horizontal gene transfers. Furthermore, these methods often require expensive data preprocessing (e.g., computing bootstrap trees) and rely on approximations and heuristics that limit the degree of tree space exploration. Here, we present GeneRax, the first maximum likelihood species-tree-aware phylogenetic inference software. It simultaneously accounts for substitutions at the sequence level as well as gene level events, such as duplication, transfer, and loss, relying on established maximum likelihood optimization algorithms. GeneRax can infer rooted phylogenetic trees for multiple gene families, directly from the per-gene sequence alignments and a rooted, yet undated, species tree. We show that compared with competing tools, on simulated data GeneRax infers trees that are the closest to the true tree in 90% of the simulations in terms of relative Robinson–Foulds distance. On empirical data sets, GeneRax is the fastest among all tested methods when starting from aligned sequences, and it infers trees with the highest likelihood score, based on our model. GeneRax completed tree inferences and reconciliations for 1,099 Cyanobacteria families in 8 min on 512 CPU cores. Thus, its parallelization scheme enables large-scale analyses. GeneRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax (last accessed June 17, 2020).
Affiliation(s)
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexey M Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Gergely J Szöllősi
  - ELTE-MTA "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary; Department of Biological Physics, Eötvös University, Budapest, Hungary; Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Tihany, Hungary
14
Comte N, Morel B, Hasić D, Guéguen L, Boussau B, Daubin V, Penel S, Scornavacca C, Gouy M, Stamatakis A, Tannier E, Parsons DP. Treerecs: an integrated phylogenetic tool, from sequences to reconciliations. Bioinformatics 2020; 36:4822-4824. [PMID: 33085745 DOI: 10.1093/bioinformatics/btaa615] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/25/2019] [Revised: 06/22/2020] [Accepted: 07/09/2020] [Indexed: 11/15/2022]
Abstract
MOTIVATION: Gene and species tree reconciliation methods are used to interpret gene trees, root them and correct uncertainties that are due to scarcity of signal in multiple sequence alignments. So far, reconciliation tools have not been integrated in standard phylogenetic software and they either lack performance on certain functions, or usability for biologists.
RESULTS: We present Treerecs, a phylogenetic software based on duplication-loss reconciliation. Treerecs is simple to install and to use. It is fast and versatile, has a graphic output, and can be used along with methods for phylogenetic inference on multiple alignments like PLL and Seaview.
AVAILABILITY AND IMPLEMENTATION: Treerecs is open-source. Its source code (C++, AGPLv3) and manuals are available from https://project.inria.fr/treerecs/.
Affiliation(s)
- Nicolas Comte
  - Inria Grenoble Rhône-Alpes, 38334 Montbonnot, France
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Damir Hasić
  - Department of Mathematics, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Laurent Guéguen
  - Université de Lyon, Laboratoire de Biométrie et Biologie Évolutive, CNRS UMR5558, F-69622 Villeurbanne, France
- Bastien Boussau
  - Université de Lyon, Laboratoire de Biométrie et Biologie Évolutive, CNRS UMR5558, F-69622 Villeurbanne, France
- Vincent Daubin
  - Université de Lyon, Laboratoire de Biométrie et Biologie Évolutive, CNRS UMR5558, F-69622 Villeurbanne, France
- Simon Penel
  - Université de Lyon, Laboratoire de Biométrie et Biologie Évolutive, CNRS UMR5558, F-69622 Villeurbanne, France
- Celine Scornavacca
  - ISEM, CNRS, Université de Montpellier, IRD, EPHE, Montpellier 34000, France
- Manolo Gouy
  - Université de Lyon, Laboratoire de Biométrie et Biologie Évolutive, CNRS UMR5558, F-69622 Villeurbanne, France
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Eric Tannier
  - Inria Grenoble Rhône-Alpes, 38334 Montbonnot, France; Université de Lyon, Laboratoire de Biométrie et Biologie Évolutive, CNRS UMR5558, F-69622 Villeurbanne, France
15
Davín AA, Tricou T, Tannier E, de Vienne DM, Szöllősi GJ. Zombi: a phylogenetic simulator of trees, genomes and sequences that accounts for dead linages. Bioinformatics 2020; 36:1286-1288. [PMID: 31566657 PMCID: PMC7031779 DOI: 10.1093/bioinformatics/btz710] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/11/2019] [Revised: 09/09/2019] [Accepted: 09/26/2019] [Indexed: 11/14/2022]
Abstract
Summary: Here we present Zombi, a tool to simulate the evolution of species, genomes and sequences in silico, that considers for the first time the evolution of genomes in extinct lineages. It also incorporates various features that have not to date been combined in a single simulator, such as the possibility of generating species trees with a pre-defined variation of speciation and extinction rates through time, simulating explicitly intergenic sequences of variable length and outputting gene tree-species tree reconciliations.
Availability and implementation: Source code and manual are freely available in https://github.com/AADavin/ZOMBI/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Adrián A Davín
  - MTA-ELTE Lendület Evolutionary Genomics Research Group, Budapest, Hungary; Department of Biological Physics, Eötvös Loránd, Budapest, Hungary
- Théo Tricou
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne F-69622, France
- Eric Tannier
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne F-69622, France; INRIA Grenoble Rhône-Alpes, Montbonnot-Saint-Martin F-38334, France
- Damien M de Vienne
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne F-69622, France
- Gergely J Szöllősi
  - MTA-ELTE Lendület Evolutionary Genomics Research Group, Budapest, Hungary; Department of Biological Physics, Eötvös Loránd, Budapest, Hungary; Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Tihany H-8237, Hungary