1
Rebhi S, Basharat Z, Wei CR, Lebbal S, Najjaa H, Sadfi-Zouaoui N, Messaoudi A. Core proteome mediated subtractive approach for the identification of potential therapeutic drug target against the honeybee pathogen Paenibacillus larvae. PeerJ 2024; 12:e17292. [PMID: 38818453 PMCID: PMC11138523 DOI: 10.7717/peerj.17292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 04/02/2024] [Indexed: 06/01/2024] Open
Abstract
Background & Objectives: American foulbrood (AFB), caused by the highly virulent, spore-forming bacterium Paenibacillus larvae, poses a significant threat to honey bee brood. The widespread use of antibiotics not only fails to effectively combat the disease but also raises concerns regarding honey safety. This computational study aimed to identify a novel therapeutic drug target against P. larvae, the causative agent of American foulbrood disease in honey bees.
Methods: We investigated effective novel drug targets through a comprehensive in silico pan-proteome and hierarchical subtractive sequence analysis. In total, the genomes of 14 P. larvae strains were used to identify core genes. Subsequently, the core proteome was systematically narrowed down to a single protein predicted as the potential drug target. AlphaFold was then employed to predict the 3D structure of the potential drug target. Structural docking was carried out between a library of phytochemicals derived from traditional Chinese flora (n > 36,000) and the potential receptor using AutoDockTools 1.5.6. Finally, a molecular dynamics (MD) simulation was conducted using GROMACS to assess the stability of the best-docked ligand.
Results: Proteome mining led to the identification of ketoacyl-ACP synthase III as a highly promising therapeutic target, making it a prime candidate for inhibitor screening. The subsequent virtual screening and MD simulation analyses further supported the selection of ZINC95910054, the ligand with the lowest binding energy, as a potent inhibitor. This finding presents significant promise in the battle against P. larvae.
Conclusions: Computer-aided drug design provides a novel approach for managing American foulbrood in honey bee populations, potentially mitigating its detrimental effects on both bee colonies and the honey industry.
Affiliation(s)
- Sawsen Rebhi
  - Université de Tunis-El Manar, Laboratoire de Mycologie, Pathologies et Biomarqueurs, Département de Biologie, Tunis, Tunisia
- Calvin R. Wei
  - Department of Research and Development, Shing Huei Group, Taipei, Taiwan
- Salim Lebbal
  - University of Khenchela, Department of Agricultural Sciences, Faculty of Nature and Life Sciences, Khenchela, Algeria
- Hanen Najjaa
  - University of Gabes, Laboratory of Pastoral Ecosystem and Valorization of Spontaneous Plants and Associated Microorganisms, Institute of Arid Lands of Medenine, Medenine, Tunisia
- Najla Sadfi-Zouaoui
  - Université de Tunis-El Manar, Laboratoire de Mycologie, Pathologies et Biomarqueurs, Département de Biologie, Tunis, Tunisia
- Abdelmonaem Messaoudi
  - Université de Tunis-El Manar, Laboratoire de Mycologie, Pathologies et Biomarqueurs, Département de Biologie, Tunis, Tunisia
  - Jendouba University, Higher Institute of Biotechnology of Beja, Beja, Tunisia
2
Tabatabaee Y, Roch S, Warnow T. QR-STAR: A Polynomial-Time Statistically Consistent Method for Rooting Species Trees Under the Coalescent. J Comput Biol 2023; 30:1146-1181. [PMID: 37902986 DOI: 10.1089/cmb.2023.0185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023] Open
Abstract
We address the problem of rooting an unrooted species tree given a set of unrooted gene trees, under the assumption that gene trees evolve within the model species tree under the multispecies coalescent (MSC) model. Quintet Rooting (QR) is a polynomial time algorithm that was recently proposed for this problem, which is based on the theory developed by Allman, Degnan, and Rhodes that proves the identifiability of rooted 5-taxon trees from unrooted gene trees under the MSC. However, although QR had good accuracy in simulations, its statistical consistency was left as an open problem. We present QR-STAR, a variant of QR with an additional step and a different cost function, and prove that it is statistically consistent under the MSC. Moreover, we derive sample complexity bounds for QR-STAR and show that a particular variant of it based on "short quintets" has polynomial sample complexity. Finally, our simulation study under a variety of model conditions shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open-source form on github.
Affiliation(s)
- Yasamin Tabatabaee
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Sebastien Roch
  - Department of Mathematics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
3
Roadmap to the study of gene and protein phylogeny and evolution-A practical guide. PLoS One 2023; 18:e0279597. [PMID: 36827278 PMCID: PMC9955684 DOI: 10.1371/journal.pone.0279597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Accepted: 12/12/2022] [Indexed: 02/25/2023] Open
Abstract
Developments in sequencing technologies and the sequencing of an ever-increasing number of genomes have revolutionised studies of biodiversity and organismal evolution. This accumulation of data has been paralleled by the creation of numerous public biological databases through which the scientific community can mine the sequences and annotations of genomes, transcriptomes, and proteomes of multiple species. However, to find the appropriate databases and bioinformatic tools for respective inquiries and aims can be challenging. Here, we present a compilation of DNA and protein databases, as well as bioinformatic tools for phylogenetic reconstruction and a wide range of studies on molecular evolution. We provide a protocol for information extraction from biological databases and simple phylogenetic reconstruction using probabilistic and distance methods, facilitating the study of biodiversity and evolution at the molecular level for the broad scientific community.
4
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
  - Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
  - Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain
5
Gatesy J, Springer MS. Phylogenomic Coalescent Analyses of Avian Retroelements Infer Zero-Length Branches at the Base of Neoaves, Emergent Support for Controversial Clades, and Ancient Introgressive Hybridization in Afroaves. Genes (Basel) 2022; 13:genes13071167. [PMID: 35885951 PMCID: PMC9324441 DOI: 10.3390/genes13071167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Revised: 06/20/2022] [Accepted: 06/21/2022] [Indexed: 01/25/2023] Open
Abstract
Retroelement insertions (RIs) are low-homoplasy characters that are ideal data for addressing deep evolutionary radiations, where gene tree reconstruction errors can severely hinder phylogenetic inference with DNA and protein sequence data. Phylogenomic studies of Neoaves, a large clade of birds (>9000 species) that first diversified near the Cretaceous−Paleogene boundary, have yielded an array of robustly supported, contradictory relationships among deep lineages. Here, we reanalyzed a large RI matrix for birds using recently proposed quartet-based coalescent methods that enable inference of large species trees including branch lengths in coalescent units, clade-support, statistical tests for gene flow, and combined analysis with DNA-sequence-based gene trees. Genome-scale coalescent analyses revealed extremely short branches at the base of Neoaves, meager branch support, and limited congruence with previous work at the most challenging nodes. Despite widespread topological conflicts with DNA-sequence-based trees, combined analyses of RIs with thousands of gene trees show emergent support for multiple higher-level clades (Columbea, Passerea, Columbimorphae, Otidimorphae, Phaethoquornithes). RIs express asymmetrical support for deep relationships within the subclade Afroaves that hints at ancient gene flow involving the owl lineage (Strigiformes). Because DNA-sequence data are challenged by gene tree-reconstruction error, analysis of RIs represents one approach for improving gene tree-based methods when divergences are deep, internodes are short, terminal branches are long, and introgressive hybridization further confounds species−tree inference.
Affiliation(s)
- John Gatesy
  - Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Mark S. Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA 92521, USA
6
Tabatabaee Y, Sarker K, Warnow T. Quintet Rooting: rooting species trees under the multi-species coalescent model. Bioinformatics 2022; 38:i109-i117. [PMID: 35758805 PMCID: PMC9236578 DOI: 10.1093/bioinformatics/btac224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022] Open
Abstract
Motivation: Rooted species trees are a basic model with multiple applications throughout biology, including understanding adaptation, biodiversity, phylogeography and co-evolution. Because most species tree estimation methods produce unrooted trees, methods for rooting these trees have been developed. However, most rooting methods either rely on prior biological knowledge or assume that evolution is close to clock-like, which is not usually the case. Furthermore, most prior rooting methods do not account for biological processes that create discordance between gene trees and species trees.
Results: We present Quintet Rooting (QR), a method for rooting species trees based on a proof of identifiability of the rooted species tree under the multi-species coalescent model established by Allman, Degnan and Rhodes (J. Math. Biol., 2011). We show that QR is generally more accurate than other rooting methods, except under extreme levels of gene tree estimation error.
Availability and implementation: Quintet Rooting is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting. The simulated datasets used in this study are from a prior study and are available at https://www.ideals.illinois.edu/handle/2142/55319. The biological dataset used in this study is also from a prior study and is available at http://gigadb.org/dataset/101041.
Contact: warnow@illinois.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yasamin Tabatabaee
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Kowshika Sarker
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
7
Lamarca AP, Mello B, Schrago CG. The performance of outgroup-free rooting under evolutionary radiations. Mol Phylogenet Evol 2022; 169:107434. [PMID: 35143961 DOI: 10.1016/j.ympev.2022.107434] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/22/2021] [Revised: 01/07/2022] [Accepted: 01/25/2022] [Indexed: 11/18/2022]
Abstract
Tree rooting implies a temporal dimension to phylogenies. Only after the position of the root node is defined can the ancestor-descendant relationships between branches be fully deduced. Rooting has usually been carried out by employing evolutionarily close outgroup lineages, which is a drawback when these lineages are unavailable or unknown. Alternatively, outgroup-free rooting methods have been proposed, which rely on the constancy of evolutionary rates to varying degrees. In this work we analyzed the performance of two of these methods, midpoint rooting (MPR) and minimal ancestor deviation (MAD), in rooting topologies evolved under challenging scenarios of fast evolutionary radiations derived from empirical data, characterized by short internal branches near the crown node. Considering all branch length combinations investigated, both methods exhibited average success rates below 50%, although MAD slightly outperformed MPR. Moreover, tree balance significantly impacted the relative performance of the methods. We found that, in four-taxa unrooted trees, whether both methodologies will correctly root the tree can be roughly predicted by two simple dimensionless metrics: the coefficient of variation of the external branch lengths, and the ratio of the internal branch length to the total sum of branch lengths. These metrics were employed to devise a general linear model that allowed calculating the probability of correctly placing the root node for any four-taxa tree. We predicted that the performance of both outgroup-free rooting methods on loci representing the placental mammal radiation ranged between 50% and 75%.
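The two dimensionless metrics named in this abstract can be illustrated with a short sketch. This is not the authors' code: the function name `four_taxon_metrics` is hypothetical, the use of the population standard deviation for the coefficient of variation is an assumption (the abstract does not say which estimator was used), and the general linear model itself is omitted because its coefficients are not given here.

```python
from statistics import mean, pstdev


def four_taxon_metrics(external, internal):
    """Return (cv_external, internal_ratio) for a four-taxon unrooted tree.

    external: the four terminal (external) branch lengths.
    internal: the single internal branch length.
    Assumption: the CV uses the population standard deviation.
    """
    if len(external) != 4:
        raise ValueError("a four-taxon unrooted tree has four external branches")
    if internal < 0 or any(b < 0 for b in external):
        raise ValueError("branch lengths must be non-negative")

    total = sum(external) + internal
    if total == 0:
        raise ValueError("total branch length must be positive")

    m = mean(external)
    # Coefficient of variation of the external branch lengths.
    cv_external = pstdev(external) / m if m > 0 else 0.0
    # Ratio of the internal branch length to the total tree length.
    internal_ratio = internal / total
    return cv_external, internal_ratio


if __name__ == "__main__":
    cv, ratio = four_taxon_metrics([1.0, 1.0, 3.0, 3.0], 2.0)
    print(f"CV of external branches: {cv:.3f}")
    print(f"internal / total length: {ratio:.3f}")
```

Both values are unitless, so they can be compared across trees measured on different branch-length scales.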
Affiliation(s)
- Beatriz Mello
  - Department of Genetics, Federal University of Rio de Janeiro, RJ, Brazil
- Carlos G Schrago
  - Department of Genetics, Federal University of Rio de Janeiro, RJ, Brazil
8
Morel B, Schade P, Lutteropp S, Williams TA, Szöllősi GJ, Stamatakis A. SpeciesRax: A tool for maximum likelihood species tree inference from gene family trees under duplication, transfer, and loss. Mol Biol Evol 2022; 39:6503503. [PMID: 35021210 PMCID: PMC8826479 DOI: 10.1093/molbev/msab365] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/17/2022] Open
Abstract
Species tree inference from gene family trees is becoming increasingly popular because it can account for discordance between the species tree and the corresponding gene family trees. In particular, methods that can account for multiple-copy gene families exhibit potential to leverage paralogy as informative signal. At present, there does not exist any widely adopted inference method for this purpose. Here, we present SpeciesRax, the first maximum likelihood method that can infer a rooted species tree from a set of gene family trees and can account for gene duplication, loss, and transfer events. By explicitly modeling events by which gene trees can depart from the species tree, SpeciesRax leverages the phylogenetic rooting signal in gene trees. SpeciesRax infers species tree branch lengths in units of expected substitutions per site and branch support values via paralogy-aware quartets extracted from the gene family trees. Using both empirical and simulated data sets we show that SpeciesRax is at least as accurate as the best competing methods while being one order of magnitude faster on large data sets at the same time. We used SpeciesRax to infer a biologically plausible rooted phylogeny of the vertebrates comprising 188 species from 31,612 gene families in 1 h using 40 cores. SpeciesRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax and on BioConda.
Affiliation(s)
- Benoit Morel
  - Computational Molecular Evolution group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Paul Schade
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Sarah Lutteropp
  - Computational Molecular Evolution group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Gergely J Szöllősi
  - ELTE-MTA "Lendület" Evolutionary Genomics Research Group, Pázmány P. stny. 1A., H-1117 Budapest, Hungary
  - Dept. Biological Physics, Eötvös University, Pázmány P. stny. 1A., H-1117 Budapest, Hungary
  - Institute of Evolution, Centre for Ecological Research, Konkoly-Thege M. út 29-33. H-1121 Budapest, Hungary
- Alexandros Stamatakis
  - Computational Molecular Evolution group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
9
Garcia AK, Fer E, Sephus C, Kacar B. An Integrated Method to Reconstruct Ancient Proteins. Methods Mol Biol 2022; 2569:267-281. [PMID: 36083453 DOI: 10.1007/978-1-0716-2691-7_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Proteins have played a fundamental role throughout life's history on Earth. Despite their biological importance, ancient origin, early function, and evolution of proteins are seldom able to be directly studied because few of these attributes are preserved across geologic timescales. Ancestral sequence reconstruction (ASR) provides a method to infer ancestral amino acid sequences and determine the evolutionary predecessors of modern-day proteins using phylogenetic tools. Laboratory application of ASR allows ancient sequences to be deduced from genetic information available in extant organisms and then experimentally resurrected to elucidate ancestral characteristics. In this article, we provide a generalized, stepwise protocol that considers the major elements of a well-designed ASR study and details potential sources of reconstruction bias that can reduce the relevance of historical inferences. We underscore key stages in our approach so that it may be broadly utilized to reconstruct the evolutionary histories of proteins.
Affiliation(s)
- Amanda K Garcia
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Evrim Fer
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
  - Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, USA
- Cathryn Sephus
  - Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA, USA
- Betul Kacar
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
10
Simmons MP, Springer MS, Gatesy J. Gene-tree misrooting drives conflicts in phylogenomic coalescent analyses of palaeognath birds. Mol Phylogenet Evol 2021; 167:107344. [PMID: 34748873 DOI: 10.1016/j.ympev.2021.107344] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/13/2021] [Revised: 10/08/2021] [Accepted: 11/02/2021] [Indexed: 10/19/2022]
Abstract
Phylogenomic analyses of ancient rapid radiations can produce conflicting results that are driven by differential sampling of taxa and characters as well as the limitations of alternative analytical methods. We re-examine basal relationships of palaeognath birds (ratites and tinamous) using recently published datasets of nucleotide characters from 20,850 loci as well as 4301 retroelement insertions. The original studies attributed conflicting resolutions of rheas in their inferred coalescent and concatenation trees to concatenation failing in the anomaly zone. By contrast, we find that the coalescent-based resolution of rheas is premised upon extensive gene-tree estimation errors. Furthermore, retroelement insertions contain much more conflict than originally reported and multiple insertion loci support the basal position of rheas found in concatenation trees, while none were reported in the original publication. We demonstrate how even remarkable congruence in phylogenomic studies may be driven by long-branch misplacement of a divergent outgroup, highly incongruent gene trees, differential taxon sampling that can result in gene-tree misrooting errors that bias species-tree inference, and gross homology errors. What was previously interpreted as broad, robustly supported corroboration for a single resolution in coalescent analyses may instead indicate a common bias that taints phylogenomic results across multiple genome-scale datasets. The updated retroelement dataset now supports a species tree with branch lengths that suggest an ancient anomaly zone, and both concatenation and coalescent analyses of the huge nucleotide datasets fail to yield coherent, reliable results in this challenging phylogenetic context.
Affiliation(s)
- Mark P Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
- Mark S Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA 92521, USA
- John Gatesy
  - Division of Vertebrate Zoology and Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
11
Zhu X, Liu M, Wu X, Ma W, Zhao X. Phylogenetic analysis of classical swine fever virus isolates from China. Arch Virol 2021; 166:2255-2261. [PMID: 34003359 DOI: 10.1007/s00705-021-05084-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/28/2020] [Accepted: 03/18/2021] [Indexed: 11/26/2022]
Abstract
Classical swine fever (CSF), caused by classical swine fever virus (CSFV), is a severe disease that causes huge economic losses in the swine industry worldwide. In China, CSF has been under control due to extensive vaccination since 1954. However, there are still sporadic CSF outbreaks in China. Here, we isolated 27 CSFV strains from three Chinese provinces (Shaanxi, Gansu, and Ningxia) from 2011 to 2018. Phylogenetic analysis based on the full-length envelope glycoprotein E2 coding region revealed that 25 out of 27 CSFV isolates clustered within subgroups 2.1 and 2.2, while two strains from Gansu belonged to subgroup 1.1. The sequence identity among these 27 isolates varied from 79.3% to 99.8% (nucleotides) and from 83.1% to 99.7% (amino acids). Further analysis based on the E2 amino acid sequences showed that these new isolates have consistent amino acid substitutions, including R31K and N34S.
Affiliation(s)
- Xiaofu Zhu
  - Key Laboratory of Animal Epidemic Disease Diagnostic Laboratory of Molecular Biology in Xianyang City, Xianyang Vocational Technical College, Xianyang, 712000, Shaanxi, China
- Mingjie Liu
  - College of Veterinary Medicine, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Xujin Wu
  - Key Laboratory of Animal Epidemic Disease Diagnostic Laboratory of Molecular Biology in Xianyang City, Xianyang Vocational Technical College, Xianyang, 712000, Shaanxi, China
- Wentao Ma
  - College of Veterinary Medicine, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Xuanduo Zhao
  - Yangling Bodeyue Biotechnology Co., Ltd., Yangling, 712100, Shaanxi, China
12
Bettisworth B, Stamatakis A. Root Digger: a root placement program for phylogenetic trees. BMC Bioinformatics 2021; 22:225. [PMID: 33932975 PMCID: PMC8088003 DOI: 10.1186/s12859-021-03956-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/05/2020] [Accepted: 01/01/2021] [Indexed: 01/30/2023] Open
Abstract
Background: In phylogenetic analysis, it is common to infer unrooted trees. However, knowing the root location is desirable for downstream analyses and interpretation. There exist several methods to recover a root, such as molecular clock analysis (including midpoint rooting) or rooting the tree using an outgroup. Non-reversible Markov models can also be used to compute the likelihood of a potential root position.
Results: We present a software tool called RootDigger, which uses a non-reversible Markov model to compute the most likely root location on a given tree and to infer a confidence value for each possible root placement. We find that RootDigger is successful at finding roots when compared to similar tools such as IQ-TREE and MAD, and will occasionally outperform them. Additionally, we find that the exhaustive mode of RootDigger is useful in quantifying and explaining uncertainty in rooting positions.
Conclusions: RootDigger can be used on an existing phylogeny to find a root, or to assess the uncertainty of the root placement. RootDigger is available under the MIT licence at https://www.github.com/computations/root_digger.
Affiliation(s)
- Ben Bettisworth
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institut für Theoretische Informatik, Karlsruhe Institute of Technology, Karlsruhe, Germany
13
Williams TA, Schrempf D, Szöllősi GJ, Cox CJ, Foster PG, Embley TM. Inferring the deep past from molecular data. Genome Biol Evol 2021; 13:6192802. [PMID: 33772552 PMCID: PMC8175050 DOI: 10.1093/gbe/evab067] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Accepted: 03/22/2021] [Indexed: 12/17/2022] Open
Abstract
There is an expectation that analyses of molecular sequences might be able to distinguish between alternative hypotheses for ancient relationships, but the phylogenetic methods used and types of data analyzed are of critical importance in any attempt to recover historical signal. Here, we discuss some common issues that can influence the topology of trees obtained when using overly simple models to analyze molecular data that often display complicated patterns of sequence heterogeneity. To illustrate our discussion, we have used three examples of inferred relationships which have changed radically as models and methods of analysis have improved. In two of these examples, the sister-group relationship between thermophilic Thermus and mesophilic Deinococcus, and the position of long-branch Microsporidia among eukaryotes, we show that recovering what is now generally considered to be the correct tree is critically dependent on the fit between model and data. In the third example, the position of eukaryotes in the tree of life, the hypothesis that is currently supported by the best available methods is fundamentally different from the classical view of relationships between major cellular domains. Since heterogeneity appears to be pervasive and varied among all molecular sequence data, and even the best available models can still struggle to deal with some problems, the issues we discuss are generally relevant to phylogenetic analyses. It remains essential to maintain a critical attitude to all trees as hypotheses of relationship that may change with more data and better methods.
Affiliation(s)
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom
- Dominik Schrempf
  - Dept. of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Gergely J Szöllősi
  - Dept. of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary
  - MTA-ELTE "Lendület" Evolutionary Genomics Research Group, 1117 Budapest, Hungary
  - Institute of Evolution, Centre for Ecological Research, 1121 Budapest, Hungary
- Cymon J Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Gambelas, 8005-319 Faro, Portugal
- Peter G Foster
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, United Kingdom
- T Martin Embley
  - Biosciences Institute, Centre for Bacterial Cell Biology, Newcastle University, Newcastle upon Tyne NE2 4AX, United Kingdom
14
Mariadassou M, Suez M, Sathyakumar S, Vignal A, Arca M, Nicolas P, Faraut T, Esquerré D, Nishibori M, Vieaud A, Chen CF, Manh Pham H, Roman Y, Hospital F, Zerjal T, Rognon X, Tixier-Boichard M. Unraveling the history of the genus Gallus through whole genome sequencing. Mol Phylogenet Evol 2020; 158:107044. [PMID: 33346111 DOI: 10.1016/j.ympev.2020.107044] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/19/2020] [Revised: 10/23/2020] [Accepted: 12/14/2020] [Indexed: 12/16/2022]
Abstract
The genus Gallus is distributed across a large part of Southeast Asia and has received special interest because the domestic chicken, Gallus gallus domesticus, has spread all over the world and is a major protein source for humans. There are four species: the red junglefowl (G. gallus), the green junglefowl (G. varius), Lafayette's junglefowl (G. lafayettii) and the grey junglefowl (G. sonneratii). The aim of this study is to reconstruct the history of these species by a whole genome sequencing approach and resolve inconsistencies between well supported topologies inferred using different data and methods. Using deep sequencing, we identified over 35 million SNPs and reconstructed the phylogeny of the Gallus genus using both distance (BioNJ) and maximum likelihood (ML) methods. We observed discrepancies according to reconstruction methods and genomic components. The two most supported topologies were previously reported and were discriminated by using phylogenetic and gene flow analyses, based on ABBA statistics. These analyses supported a scenario with G. gallus as the earliest branching lineage of the Gallus genus, instead of G. varius. We discuss the probable causes for the discrepancy. A likely one is that G. sonneratii samples from parks or private collections are all recent hybrids, with roughly 10% of their autosomal genome originating from G. gallus. The removal of those regions is needed to provide reliable data, which was not done in previous studies. We took care of this and additionally included two wild G. sonneratii samples from India, showing no trace of introgression. This reinforces the importance of carefully selecting and validating samples and genomic components in phylogenomics.
Affiliation(s)
- Marie Suez
- Université Paris Saclay, INRAE, MaIAGE, 78350 Jouy-en-Josas, France
- Alain Vignal
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326 Castanet Tolosan, France
- Mariangela Arca
- Université Paris Saclay, INRAE, MaIAGE, 78350 Jouy-en-Josas, France
- Pierre Nicolas
- Université Paris Saclay, INRAE, MaIAGE, 78350 Jouy-en-Josas, France
- Thomas Faraut
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326 Castanet Tolosan, France
- Diane Esquerré
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326 Castanet Tolosan, France; Get-PlaGe, INRAE, 31326 Castanet Tolosan, France
- Masahide Nishibori
- Lab. of Animal Genetics, Department of Animal Life Science, Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima 739-8528, Japan
- Agathe Vieaud
- Université Paris Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
- Chih-Feng Chen
- Department of Animal Science, iEGG and Animal Biotechnology Center, National Chung-Hsing University, Taichung 40227, Taiwan
- Hung Manh Pham
- Faculty of Animal Science, Vietnam National University of Agriculture, Trau Quy Town, Gia Lam District, Ha Noi City, Viet Nam
- Frédéric Hospital
- Université Paris Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
- Tatiana Zerjal
- Université Paris Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
- Xavier Rognon
- Université Paris Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
15
Górecki P, Markin A, Eulenstein O. Exact median-tree inference for unrooted reconciliation costs. BMC Evol Biol 2020; 20:136. [PMID: 33115401 PMCID: PMC7593691 DOI: 10.1186/s12862-020-01700-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
Background Solving median tree problems under tree reconciliation costs is a classic and well-studied approach for inferring species trees from collections of discordant gene trees. These problems are NP-hard, and therefore are, in practice, typically addressed by local search heuristics. So far, however, such heuristics lack any provable correctness or precision. Further, even for small phylogenetic studies, it has been demonstrated that local search heuristics may only provide sub-optimal solutions. Obviating such heuristic uncertainties are exact dynamic programming solutions that allow solving tree reconciliation problems for smaller phylogenetic studies. Despite these promises, such exact solutions are only suitable for credibly rooted input gene trees, which constitute only a tiny fraction of the readily available gene trees. Standard gene tree inference approaches provide only unrooted gene trees and accurately rooting such trees is often difficult, if not impossible. Results Here, we describe complex dynamic programming solutions that represent the first nonnaïve exact solutions for solving the tree reconciliation problems for unrooted input gene trees. Further, we show that the asymptotic runtime of the proposed solutions does not increase when compared to the most time-efficient dynamic programming solutions for rooted input trees. Conclusions In an experimental evaluation, we demonstrate that the described solutions for unrooted gene trees are, like the solutions for rooted input gene trees, suitable for smaller phylogenetic studies. Finally, for the first time, we study the accuracy of classic local search heuristics for unrooted tree reconciliation problems.
Affiliation(s)
- Paweł Górecki
- University of Warsaw, Faculty of Mathematics, Informatics and Mechanics, Banacha 2, Warsaw, 02-097, Poland
- Alexey Markin
- Department of Computer Science, Iowa State University, Atanasoff Hall 212, Ames, 50011, USA
- Oliver Eulenstein
- Department of Computer Science, Iowa State University, Atanasoff Hall 212, Ames, 50011, USA
16
Spasojevic T, Broad GR, Sääksjärvi IE, Schwarz M, Ito M, Korenko S, Klopfstein S. Mind the Outgroup and Bare Branches in Total-Evidence Dating: a Case Study of Pimpliform Darwin Wasps (Hymenoptera, Ichneumonidae). Syst Biol 2020; 70:322-339. [PMID: 33057674 PMCID: PMC7875445 DOI: 10.1093/sysbio/syaa079] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/28/2019] [Revised: 10/02/2020] [Accepted: 10/02/2020] [Indexed: 01/16/2023] Open
Abstract
Taxon sampling is a central aspect of phylogenetic study design, but it has received limited attention in the context of total-evidence dating, a widely used dating approach that directly integrates molecular and morphological information from extant and fossil taxa. We here assess the impact of commonly employed outgroup sampling schemes and missing morphological data in extant taxa on age estimates in a total-evidence dating analysis under the uniform tree prior. Our study group is Pimpliformes, a highly diverse, rapidly radiating group of parasitoid wasps of the family Ichneumonidae. We analyze a data set comprising 201 extant and 79 fossil taxa, including the oldest fossils of the family from the Early Cretaceous and the first unequivocal representatives of extant subfamilies from the mid-Paleogene. Based on newly compiled molecular data from ten nuclear genes and a morphological matrix that includes 222 characters, we show that age estimates become both older and less precise with the inclusion of more distant and more poorly sampled outgroups. These outgroups not only lack morphological and temporal information but also sit on long terminal branches and considerably increase the evolutionary rate heterogeneity. In addition, we discover an artifact that might be detrimental for total-evidence dating: “bare-branch attraction,” namely high attachment probabilities of certain fossils to terminal branches for which morphological data are missing. Using computer simulations, we confirm the generality of this phenomenon and show that a large phylogenetic distance to any of the extant taxa, rather than just older age, increases the risk of a fossil being misplaced due to bare-branch attraction. After restricting outgroup sampling and adding morphological data for the previously attracting, bare branches, we recover a Jurassic origin for Pimpliformes and Ichneumonidae. 
This first age estimate for the group not only suggests an older origin than previously thought but also that diversification of the crown group happened well before the Cretaceous-Paleogene boundary. Our case study demonstrates that in order to obtain robust age estimates, total-evidence dating studies need to be based on a thorough and balanced sampling of both extant and fossil taxa, with the aim of minimizing evolutionary rate heterogeneity and missing morphological information. [Bare-branch attraction; ichneumonids; fossils; morphological matrix; phylogeny; RoguePlots.]
Affiliation(s)
- Tamara Spasojevic
- Abteilung Wirbellose Tiere Invertebrates, Naturhistorisches Museum der Burgergemeinde Bern, Bernastrasse 15, 3005 Bern, Switzerland; Institute of Ecology and Evolution, Department of Biology, University of Bern, 3012 Bern, Switzerland; Department of Entomology, National Museum of Natural History, Washington, DC 20560, USA
- Gavin R Broad
- Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
- Masato Ito
- Graduate School of Agricultural Science, Department of Agrobioscience, Kobe University, 657-8501 Japan
- Stanislav Korenko
- Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, 165 21 Prague 6, Suchdol, Czech Republic
- Seraina Klopfstein
- Abteilung Wirbellose Tiere Invertebrates, Naturhistorisches Museum der Burgergemeinde Bern, Bernastrasse 15, 3005 Bern, Switzerland; Institute of Ecology and Evolution, Department of Biology, University of Bern, 3012 Bern, Switzerland; Abteilung für Biowissenschaften, Naturhistorisches Museum Basel, 4051 Basel, Switzerland
17
Stadler PF, Geiß M, Schaller D, López Sánchez A, González Laffitte M, Valdivia DI, Hellmuth M, Hernández Rosales M. From pairs of most similar sequences to phylogenetic best matches. Algorithms Mol Biol 2020; 15:5. [PMID: 32308731 PMCID: PMC7147060 DOI: 10.1186/s13015-020-00165-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/16/2019] [Accepted: 03/26/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Many of the commonly used methods for orthology detection start from mutually most similar pairs of genes (reciprocal best hits) as an approximation for evolutionary most closely related pairs of genes (reciprocal best matches). This approximation of best matches by best hits becomes exact for ultrametric dissimilarities, i.e., under the Molecular Clock Hypothesis. It fails, however, whenever there are large lineage specific rate variations among paralogous genes. In practice, this introduces a high level of noise into the input data for best-hit-based orthology detection methods. RESULTS If additive distances between genes are known, then evolutionary most closely related pairs can be identified by considering certain quartets of genes provided that in each quartet the outgroup relative to the remaining three genes is known. A priori knowledge of underlying species phylogeny greatly facilitates the identification of the required outgroup. Although the workflow remains a heuristic since the correct outgroup cannot be determined reliably in all cases, simulations with lineage specific biases and rate asymmetries show that nearly perfect results can be achieved. In a realistic setting, where distances data have to be estimated from sequence data and hence are noisy, it is still possible to obtain highly accurate sets of best matches. CONCLUSION Improvements of tree-free orthology assessment methods can be expected from a combination of the accurate inference of best matches reported here and recent mathematical advances in the understanding of (reciprocal) best match graphs and orthology relations. AVAILABILITY Accompanying software is available at https://github.com/david-schaller/AsymmeTree.
Affiliation(s)
- Peter F. Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
- Competence Center for Scalable Data Services and Solutions Dresden/Leipzig, Interdisciplinary Center for Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv), and Leipzig Research Center for Civilization Diseases, Universität Leipzig, Augustusplatz 12, 04107 Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany
- Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria
- Facultad de Ciencias, Universidad National de Colombia, Sede Bogotá, Ciudad Universitaria, 111321 Bogotá, D.C. Colombia
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Manuela Geiß
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
- Software Competence Center Hagenberg GmbH, Softwarepark 21, 4232 Hagenberg, Austria
- David Schaller
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
- Alitzel López Sánchez
- CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
- Marcos González Laffitte
- CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
- Dulce I. Valdivia
- Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (CINVESTAV), Km. 9.6 Libramiento Norte Carretera Irapuato-León, 36821 Irapuato, GTO México
- Marc Hellmuth
- School of Computing, University of Leeds, E C Stoner Building, Leeds, LS2 9JT UK
- Maribel Hernández Rosales
- CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
18
Lamarca AP, Schrago CG. Fast speciations and slow genes: uncovering the root of living canids. Biol J Linn Soc Lond 2019. [DOI: 10.1093/biolinnean/blz181] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
Despite ongoing efforts relying on computationally intensive tree-building methods and large datasets, the deeper phylogenetic relationships between living canid genera remain controversial. We demonstrate that this issue arises fundamentally from the uncertainty of root placement as a consequence of the short length of the branch connecting the major canid clades, which probably resulted from a fast radiation during the early diversification of extant Canidae. Using both nuclear and mitochondrial genes, we investigate the position of the canid root and its consistency by using three rooting methods. We find that mitochondrial genomes consistently retrieve a root node separating the tribe Canini from the remaining canids, whereas nuclear data mostly recover a root that places the Urocyon foxes as the sister lineage of living canids. We demonstrate that, to resolve the canid root, the nuclear segments sequenced so far are significantly less informative than mitochondrial genomes. We also propose that short intervals between speciations obscure the place of the true root, because methods are susceptible to stochastic error in the presence of short internal branches near the root.
Affiliation(s)
- Alessandra P Lamarca
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Carlos G Schrago
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
19
Garcia AK, Kaçar B. How to resurrect ancestral proteins as proxies for ancient biogeochemistry. Free Radic Biol Med 2019; 140:260-269. [PMID: 30951835 DOI: 10.1016/j.freeradbiomed.2019.03.033] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/17/2018] [Revised: 02/11/2019] [Accepted: 03/26/2019] [Indexed: 10/27/2022]
Abstract
Throughout the history of life, enzymes have served as the primary molecular mediators of biogeochemical cycles by catalyzing the metabolic pathways that interact with geochemical substrates. The byproducts of enzymatic activities have been preserved as chemical and isotopic signatures in the geologic record. However, interpretations of these signatures are limited by the assumption that such enzymes have remained functionally conserved over billions of years of molecular evolution. By reconstructing ancient genetic sequences in conjunction with laboratory enzyme resurrection, preserved biogeochemical signatures can instead be related to experimentally constrained, ancestral enzymatic properties. We may thereby investigate instances within molecular evolutionary trajectories potentially tied to significant biogeochemical transitions evidenced in the geologic record. Here, we survey recent enzyme resurrection studies to provide a reasoned assessment of areas of success and common pitfalls relevant to ancient biogeochemical applications. We conclude by considering the Great Oxidation Event, which provides a constructive example of a significant biogeochemical transition that warrants investigation with ancestral enzyme resurrection. This event also serves to highlight the pitfalls of facile interpretation of paleophenotype models and data, as applied to two examples of enzymes that likely both influenced and were influenced by the rise of atmospheric oxygen - RuBisCO and nitrogenase.
Affiliation(s)
- Amanda K Garcia
- Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA
- Betül Kaçar
- Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA; Department of Astronomy and Steward Observatory, University of Arizona, Tucson, AZ, 85721, USA
20
Grant T. Outgroup sampling in phylogenetics: Severity of test and successive outgroup expansion. J Zool Syst Evol Res 2019. [DOI: 10.1111/jzs.12317] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Affiliation(s)
- Taran Grant
- Department of Zoology, Institute of Biosciences, University of São Paulo, São Paulo, Brazil
21
Alda F, Tagliacollo VA, Bernt MJ, Waltz BT, Ludt WB, Faircloth BC, Alfaro ME, Albert JS, Chakrabarty P. Resolving Deep Nodes in an Ancient Radiation of Neotropical Fishes in the Presence of Conflicting Signals from Incomplete Lineage Sorting. Syst Biol 2018; 68:573-593. [DOI: 10.1093/sysbio/syy085] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 06/12/2018] [Revised: 11/30/2018] [Accepted: 12/03/2018] [Indexed: 12/13/2022] Open
Affiliation(s)
- Fernando Alda
- Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Department of Biology, Geology and Environmental Science, University of Tennessee at Chattanooga, Chattanooga, TN 37403, USA
- Victor A Tagliacollo
- Museu de Zoologia da Universidade de São Paulo (MZUSP), Ipirianga, 04263-000, São Paulo, São Paulo, Brazil
- Maxwell J Bernt
- Department of Biology, University of Louisiana at Lafayette, Lafayette, LA 70503, USA
- Brandon T Waltz
- Department of Biology, University of Louisiana at Lafayette, Lafayette, LA 70503, USA
- William B Ludt
- Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Brant C Faircloth
- Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Michael E Alfaro
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- James S Albert
- Department of Biology, University of Louisiana at Lafayette, Lafayette, LA 70503, USA
- Prosanta Chakrabarty
- Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
22
Degnan JH. Modeling Hybridization Under the Network Multispecies Coalescent. Syst Biol 2018; 67:786-799. [PMID: 29846734 PMCID: PMC6101600 DOI: 10.1093/sysbio/syy040] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 09/16/2017] [Revised: 05/13/2018] [Accepted: 05/16/2018] [Indexed: 11/13/2022] Open
Abstract
Simultaneously modeling hybridization and the multispecies coalescent is becoming increasingly common, and inference of species networks in this context is now implemented in several software packages. This article addresses some of the conceptual issues and decisions to be made in this modeling, including whether or not to use branch lengths and issues with model identifiability. This article is based on a talk given at a Spotlight Session at Evolution 2017 meeting in Portland, Oregon. This session included several talks about modeling hybridization and gene flow in the presence of incomplete lineage sorting. Other talks given at this meeting are also included in this special issue of Systematic Biology.
Affiliation(s)
- James H Degnan
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87131, USA
23
Rios L, Núñez JI, Díaz de Arce H, Ganges L, Pérez LJ. Revisiting the genetic diversity of classical swine fever virus: A proposal for new genotyping and subgenotyping schemes of classification. Transbound Emerg Dis 2018; 65:963-971. [PMID: 29799671 DOI: 10.1111/tbed.12909] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/19/2018] [Revised: 04/09/2018] [Accepted: 04/25/2018] [Indexed: 12/30/2022]
Abstract
Classical swine fever (CSF) is a highly contagious febrile viral disease caused by CSF virus (CSFV), and it is considered one of the most important infectious diseases that affect domestic pigs and wild boar. Previous molecular epidemiology studies have revealed that the diversity of CSFV comprises three main genotypes and different subgenotypes defined using a reliable cut-off to accurately classify CSFV at genotype and subgenotype levels. However, a growing number of CSFV both complete genome and full E2 gene sequences have been submitted to GenBank (more than 500 sequences are currently available, revised on December 1, 2017). Therefore, the aim of this study was to revisit the taxonomy of CSFV at genotype and subgenotype levels, to unify nomenclature and to provide an update to the classification of CSFV. We propose here a new genotyping scheme with five well-defined CSFV genotypes (CSFV Genotypes 1-5) and 14 subgenotypes (seven for each of the CSFV Genotype 1 and CSFV Genotype 2). The findings showed in this study are relevant for molecular epidemiology approaches and will help to better understand the genetic diversity and spreading of CSFV at a global scale. The update in the classification of CSFV will allow the scientific community to establish more accurately the links among different outbreaks of the disease.
Affiliation(s)
- Liliam Rios
- University of New Brunswick, Saint John, NB, Canada
- José I Núñez
- IRTA-CReSA, Centre de Recerca en Sanitat Animal, Barcelona, Spain
- Heidy Díaz de Arce
- Hospital Italiano de Buenos Aires, Juan D. Perón 4190, Buenos Aires, Argentina
- Llilianne Ganges
- OIE Reference Laboratory for Classical Swine Fever, IRTA-CReSA, Barcelona, Spain
- Lester J Pérez
- Dalhousie University, Dalhousie Medicine New Brunswick, Saint John, NB, Canada
24
Urbini L, Sinaimeri B, Matias C, Sagot MF. Exploring the Robustness of the Parsimonious Reconciliation Method in Host-Symbiont Cophylogeny. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 16:738-748. [PMID: 29993554 DOI: 10.1109/tcbb.2018.2838667] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/08/2023]
Abstract
The aim of this paper is to explore the robustness of the parsimonious host-symbiont tree reconciliation method under editing or small perturbations of the input. The editing involves making different choices of unique symbiont mapping to a host in the case where multiple associations exist. This is made necessary by the fact that the tree reconciliation model is currently unable to handle such associations. The analysis performed could however also address the problem of errors. The perturbations are re-rootings of the symbiont tree to deal with a possibly wrong placement of the root specially in the case of fast-evolving species. In order to do this robustness analysis, we introduce a simulation scheme specifically designed for the host-symbiont cophylogeny context, as well as a measure to compare sets of tree reconciliations, both of which are of interest by themselves.
25
Cherlin S, Heaps SE, Nye TMW, Boys RJ, Williams TA, Embley TM. The Effect of Nonreversibility on Inferring Rooted Phylogenies. Mol Biol Evol 2018; 35:984-1002. [PMID: 29149300 PMCID: PMC5889004 DOI: 10.1093/molbev/msx294] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/15/2022] Open
Abstract
Most phylogenetic models assume that the evolutionary process is stationary and reversible. In addition to being biologically improbable, these assumptions also impair inference by generating models under which the likelihood does not depend on the position of the root. Consequently, the root of the tree cannot be inferred as part of the analysis. Yet identifying the root position is a key component of phylogenetic inference because it provides a point of reference for polarizing ancestor-descendant relationships and therefore interpreting the tree. In this paper, we investigate the effect of relaxing the unrealistic reversibility assumption and allowing the position of the root to be another unknown. We propose two hierarchical models that are centered on a reversible model but perturbed to allow nonreversibility. The models differ in the degree of structure imposed on the perturbations. The analysis is performed in the Bayesian framework using Markov chain Monte Carlo methods for which software is provided. We illustrate the performance of the two nonreversible models in analyses of simulated data using two types of topological priors. We then apply the models to a real biological data set, the radiation of polyploid yeasts, for which there is robust biological opinion about the root position. Finally, we apply the models to a second biological alignment for which the rooted tree is controversial: the ribosomal tree of life. We compare the two nonreversible models and conclude that both are useful in inferring the position of the root from real biological data.
Affiliation(s)
- Svetlana Cherlin
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Sarah E Heaps
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tom M W Nye
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Richard J Boys
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tom A Williams
- School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- T Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
26
Mykowiecka A, Gorecki P. Credibility of Evolutionary Events in Gene Trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 16:713-726. [PMID: 29990287 DOI: 10.1109/tcbb.2017.2788888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Based on the classical non-parametric bootstrapping for phylogenetic trees, we propose a novel bootstrap method to define support for gene duplication and speciation events. By comparing bootstrap gene trees to the original gene tree, we calculate support for evolutionary events. While this approach can be used to annotate orthology and paralogy, we show how it can be used to verify the reliability of tree reconciliation. We propose a linear time algorithm for the computation of bootstrap values, and we show the correspondence of our method with the classical non-parametric bootstrapping. Finally, we present two computational experiments. In the first one, based on simulated data and nine yeast genomes, we show a comparative study of several tree rooting methods and evaluation of their performance by using our bootstrapping method. In the second experiment, using data from the TreeFam database, we tested how the reliability of the gene trees influence the inferred supertree. We found out that species trees inferred from gene trees having highly supported events are more biologically consistent.
27
Tian Y, Kubatko L. Rooting phylogenetic trees under the coalescent model using site pattern probabilities. BMC Evol Biol 2017; 17:263. [PMID: 29258427 PMCID: PMC5738147 DOI: 10.1186/s12862-017-1108-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/14/2016] [Accepted: 12/01/2017] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND Phylogenetic tree inference is a fundamental tool to estimate ancestor-descendant relationships among different species. In phylogenetic studies, identification of the root - the most recent common ancestor of all sampled organisms - is essential for complete understanding of the evolutionary relationships. Rooted trees benefit most downstream application of phylogenies such as species classification or study of adaptation. Often, trees can be rooted by using outgroups, which are species that are known to be more distantly related to the sampled organisms than any other species in the phylogeny. However, outgroups are not always available in evolutionary research. METHODS In this study, we develop a new method for rooting species tree under the coalescent model, by developing a series of hypothesis tests for rooting quartet phylogenies using site pattern probabilities. The power of this method is examined by simulation studies and by application to an empirical North American rattlesnake data set. RESULTS The method shows high accuracy across the simulation conditions considered, and performs well for the rattlesnake data. Thus, it provides a computationally efficient way to accurately root species-level phylogenies that incorporates the coalescent process. The method is robust to variation in substitution model, but is sensitive to the assumption of a molecular clock. CONCLUSIONS Our study establishes a computationally practical method for rooting species trees that is more efficient than traditional methods. The method will benefit numerous evolutionary studies that require rooting a phylogenetic tree without having to specify outgroups.
Affiliation(s)
- Yuan Tian
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 318 W. 12th Avenue, Columbus, 43210 OH USA
- Laura Kubatko
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 318 W. 12th Avenue, Columbus, 43210 OH USA
- Department of Statistics, The Ohio State University, 404 Cockins Hall, 1958 Neil Avenue, Columbus, 43210 OH USA
28
Mai U, Sayyari E, Mirarab S. Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction. PLoS One 2017; 12:e0182238. [PMID: 28800608 PMCID: PMC5553649 DOI: 10.1371/journal.pone.0182238] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 04/14/2017] [Accepted: 06/25/2017] [Indexed: 12/29/2022] Open
Abstract
Phylogenetic trees inferred using commonly-used models of sequence evolution are unrooted, but the root position matters both for interpretation and downstream applications. This issue has been long recognized; however, whether the potential for discordance between the species tree and gene trees impacts methods of rooting a phylogenetic tree has not been extensively studied. In this paper, we introduce a new method of rooting a tree based on its branch length distribution; our method, which minimizes the variance of root to tip distances, is inspired by the traditional midpoint rerooting and is justified when deviations from the strict molecular clock are random. Like midpoint rerooting, the method can be implemented in a linear time algorithm. In extensive simulations that consider discordance between gene trees and the species tree, we show that the new method is more accurate than midpoint rerooting, but its relative accuracy compared to using outgroups to root gene trees depends on the size of the dataset and levels of deviations from the strict clock. We show high levels of error for all methods of rooting estimated gene trees due to factors that include effects of gene tree discordance, deviations from the clock, and gene tree estimation error. Our simulations, however, did not reveal significant differences between two equivalent methods for species tree estimation that use rooted and unrooted input, namely, STAR and NJst. Nevertheless, our results point to limitations of existing scalable rooting methods.
Affiliation(s)
- Uyen Mai
- Dept of Computer Science and Engineering, University of California at San Diego, San Diego, CA, United States of America
- Erfan Sayyari
- Dept of Electrical and Computer Engineering, University of California at San Diego, San Diego, CA, United States of America
- Siavash Mirarab
- Dept of Electrical and Computer Engineering, University of California at San Diego, San Diego, CA, United States of America
|
29
|
Moon J, Eulenstein O. Synthesizing large-scale species trees using the strict consensus approach. J Bioinform Comput Biol 2017; 15:1740002. [PMID: 28513253 DOI: 10.1142/s0219720017400029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Supertree problems are a standard tool for synthesizing large-scale species trees from a given collection of gene trees under some problem-specific objective. Unfortunately, these problems are typically NP-hard, and often remain so when their instances are restricted to rooted gene trees sampled from the same species. While a class of restricted supertree problems has been effectively addressed by the parameterized strict consensus approach, in practice, most gene trees are unrooted and sampled from different species. Here, we overcome this stringent limitation by describing efficient algorithms that are adopting the strict consensus approach to also handle unrestricted supertree problems. Finally, we demonstrate the performance of our algorithms in a comparative study with classic supertree heuristics using simulated and empirical data sets.
Affiliation(s)
- Jucheol Moon
- Department of Computer Science, Iowa State University, Ames, Iowa 50010, USA
- Oliver Eulenstein
- Department of Computer Science, Iowa State University, Ames, Iowa 50010, USA
|
30
|
Song N, Zhang H, Li H, Cai W. All 37 Mitochondrial Genes of Aphid Aphis craccivora Obtained from Transcriptome Sequencing: Implications for the Evolution of Aphids. PLoS One 2016; 11:e0157857. [PMID: 27314587 PMCID: PMC4912114 DOI: 10.1371/journal.pone.0157857] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/19/2015] [Accepted: 06/06/2016] [Indexed: 11/19/2022] Open
Abstract
The availability of mitochondrial genome data for Aphididae, one of the economically important insect pest families, in public databases is limited. The advent of next generation sequencing technology provides the potential to generate mitochondrial genome data for many species in a timely and cost-effective manner. In this report, we used transcriptome sequencing technology to determine all 37 mitochondrial genes of the cowpea aphid, Aphis craccivora. This method avoids the need to find suitable primers for long PCRs or primer-walking amplicons, and proved effective in obtaining the whole set of mitochondrial gene data for insects whose mitochondrial genomes are difficult to sequence by PCR-based strategies. Phylogenetic analyses of aphid mitochondrial genome data show clustering at the tribe level, and strongly support the monophyly of the family Aphididae. Within the monophyletic Aphidini, three samples from Aphis grouped together. In another major clade of Aphididae, Pterocomma pilosum was recovered as a potential sister-group of Cavariella salicicola, as part of Macrosiphini.
Affiliation(s)
- Nan Song
- College of Plant Protection, Henan Agricultural University, Zhengzhou, China
- Hao Zhang
- Henan Vocational and Technological College of Communication, Zhengzhou, China
- Hu Li
- Department of Entomology, China Agricultural University, Beijing, China
- Wanzhi Cai
- Department of Entomology, China Agricultural University, Beijing, China
|
31
|
Phylogeography of the Vermilion Flycatcher species complex: Multiple speciation events, shifts in migratory behavior, and an apparent extinction of a Galápagos-endemic bird species. Mol Phylogenet Evol 2016; 102:152-73. [PMID: 27233443 DOI: 10.1016/j.ympev.2016.05.029] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 09/10/2015] [Revised: 05/02/2016] [Accepted: 05/21/2016] [Indexed: 11/23/2022]
Abstract
The Vermilion Flycatcher (Pyrocephalus rubinus) is a widespread species found in North and South America and the Galápagos. Its 12 recognized subspecies vary in degree of geographic isolation, phenotypic distinctness, and migratory status. Some authors suggest that Galápagos subspecies nanus and dubius constitute one or more separate species. Observational reports of distinct differences in song also suggest separate species status for the austral migrant subspecies rubinus. To evaluate geographical patterns of diversification and taxonomic limits within this species complex, we carried out a molecular phylogenetic analysis encompassing 10 subspecies and three outgroup taxa using mitochondrial (ND2, Cyt b) and nuclear loci (ODC introns 6 through 7, FGB intron 5). We used samples of preserved tissues from museum collections as well as toe pad samples from museum skins. Galápagos and continental clades were recovered as sister groups, with initial divergence at ∼1mya. Within the continental clade, North and South American populations were sister groups. Three geographically distinct clades were recovered within South America. We detected no genetic differences between two broadly intergrading North American subspecies, mexicanus and flammeus, suggesting they should not be recognized as separate taxa. Four western South American subspecies were also indistinguishable on the basis of loci that we sampled, but occur in a region with patchy habitat, and may represent recently isolated populations. The austral migrant subspecies, rubinus, comprised a monophyletic mitochondrial clade and had many unique nuclear DNA alleles. In combination with its distinct song, exclusive song recognition behavior, different phenology, and an isolated breeding range, our data suggests that this taxon represents a separate species from other continental populations. 
Mitochondrial and nuclear genetic data, morphology, and behavior suggest that Galápagos forms should be elevated to two full species corresponding to the two currently recognized subspecies, nanus and dubius. The population of dubius is presumed to be extinct, and thus would represent the first documented extinction of a Galápagos-endemic bird species. Two strongly supported mitochondrial clades divide Galápagos subspecies nanus in a geographic pattern that conflicts with previous hypotheses that were based on plumage color. Several populations of nanus have recently become extinct or are in serious decline. Urgent conservation measures should seek to preserve the deep mitochondrial DNA diversity within nanus, and further work should explore whether additional forms should be recognized within nanus. Ancestral states analysis based on our phylogeny revealed that the most recent common ancestor of extant Vermilion Flycatcher populations was migratory, and that migratory behavior was lost more often than gained within Pyrocephalus and close relatives, as has been shown to be the case within Tyrannidae as a whole.
|
32
|
Abstract
BACKGROUND Discovering the location of gene duplications and multiple gene duplication episodes is a fundamental issue in evolutionary molecular biology. The problem introduced by Guigó et al. in 1996 is to map gene duplication events from a collection of rooted, binary gene family trees onto their corresponding rooted binary species tree in such a way that the total number of multiple gene duplication episodes is minimized. There are several models in the literature that specify how gene duplications from gene families can be interpreted as one duplication episode. However, in all duplication episode problems gene trees are rooted. This restriction limits the applicability, since unrooted gene family trees are frequently inferred by phylogenetic methods. RESULTS In this article we show the first solution to the open problem of episode clustering where the input gene family trees are unrooted. In particular, by using theoretical properties of unrooted reconciliation, we show an efficient algorithm that reduces this problem to the episode clustering problems defined for rooted trees. We present theoretical properties of the reduction algorithm and an evaluation on empirical datasets. CONCLUSIONS We provided algorithms and tools that were successfully applied to several empirical datasets. In particular, our comparative study shows that we can improve known results on genomic duplication inference from real datasets.
Affiliation(s)
- Jarosław Paszek
- University of Warsaw, Institute of Informatics, Banacha 2, Warsaw, 02-097, Poland.
- Paweł Górecki
- University of Warsaw, Institute of Informatics, Banacha 2, Warsaw, 02-097, Poland.
|
33
|
Abstract
Identifying the root of a phylogenetic tree is important because incorrectly rooted phylogenetic trees may mislead evolutionary and taxonomic inferences. Many techniques for inferring the root have been proposed, but each has shortcomings that may make it inappropriate for any particular dataset. Here we outline the various ways to root phylogenetic trees, which include: outgroup, midpoint rooting, molecular clock rooting, and Bayesian molecular clock rooting. In addition, we discuss the pros and cons and also list software availability for each of the rooting methods.
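Of the methods listed in this abstract, midpoint rooting is simple enough to sketch: find the two tips that are farthest apart and place the root halfway along the path between them. The following is a minimal illustration under assumed conventions (an adjacency map with branch lengths built by a hypothetical `build_adj` helper, and made-up function names); it is not taken from any of the software packages the article surveys.

```python
def build_adj(edges):
    """Build a symmetric adjacency map {node: {neighbor: branch_length}}."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = float(w)
        adj.setdefault(v, {})[u] = float(w)
    return adj


def farthest(adj, start):
    """Return (node, distance, parent_map) for the node farthest from `start`."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt, w in adj[node].items():
            if nxt not in dist:
                dist[nxt] = dist[node] + w
                parent[nxt] = node
                stack.append(nxt)
    far = max(dist, key=dist.get)
    return far, dist[far], parent


def midpoint_root(adj):
    """Return (node, toward, offset): the root lies `offset` from `node` on edge (node, toward)."""
    # Two sweeps locate the most distant pair of tips in a tree.
    tip_a, _, _ = farthest(adj, next(iter(adj)))
    tip_b, diameter, parent = farthest(adj, tip_a)
    # Walk from tip_b back toward tip_a until half of the path length is used up.
    remaining = diameter / 2.0
    node = tip_b
    while parent[node] is not None:
        up = parent[node]
        w = adj[node][up]
        if remaining <= w:
            return node, up, remaining
        remaining -= w
        node = up
    return node, None, 0.0  # only reached for a single-node tree


# Hypothetical tree with one long branch leading to tip B.
example = build_adj([("A", "X", 1), ("B", "X", 4), ("X", "Y", 1),
                     ("C", "Y", 1), ("D", "Y", 2)])
print(midpoint_root(example))
```

Midpoint rooting depends only on the single longest tip-to-tip path, which is one reason a single unusually long branch can mislead it.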
|
34
|
Simmons MP, Gatesy J. Coalescence vs. concatenation: Sophisticated analyses vs. first principles applied to rooting the angiosperms. Mol Phylogenet Evol 2015; 91:98-122. [DOI: 10.1016/j.ympev.2015.05.011] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 02/12/2015] [Revised: 05/01/2015] [Accepted: 05/14/2015] [Indexed: 11/24/2022]
|
35
|
Williams TA, Heaps SE, Cherlin S, Nye TMW, Boys RJ, Embley TM. New substitution models for rooting phylogenetic trees. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140336. [PMID: 26323766 PMCID: PMC4571574 DOI: 10.1098/rstb.2014.0336] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Accepted: 06/04/2015] [Indexed: 12/23/2022] Open
Abstract
The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made.
Affiliation(s)
- Tom A Williams
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Sarah E Heaps
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Svetlana Cherlin
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Tom M W Nye
- School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Richard J Boys
- School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- T Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
|
36
|
Borner J, Pick C, Thiede J, Kolawole OM, Kingsley MT, Schulze J, Cottontail VM, Wellinghausen N, Schmidt-Chanasit J, Bruchhaus I, Burmester T. Phylogeny of haemosporidian blood parasites revealed by a multi-gene approach. Mol Phylogenet Evol 2015; 94:221-31. [PMID: 26364971 DOI: 10.1016/j.ympev.2015.09.003] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 02/28/2015] [Revised: 08/31/2015] [Accepted: 09/03/2015] [Indexed: 11/13/2022]
Abstract
The apicomplexan order Haemosporida is a clade of unicellular blood parasites that infect a variety of reptilian, avian and mammalian hosts. Among them are the agents of human malaria, parasites of the genus Plasmodium, which pose a major threat to human health. Illuminating the evolutionary history of Haemosporida may help us in understanding their enormous biological diversity, as well as tracing the multiple host switches and associated acquisitions of novel life-history traits. However, the deep-level phylogenetic relationships among major haemosporidian clades have remained enigmatic because the datasets employed in phylogenetic analyses were severely limited in either gene coverage or taxon sampling. Using a PCR-based approach that employs a novel set of primers, we sequenced fragments of 21 nuclear genes from seven haemosporidian parasites of the genera Leucocytozoon, Haemoproteus, Parahaemoproteus, Polychromophilus and Plasmodium. After addition of genomic data from 25 apicomplexan species, the unreduced alignment comprised 20,580 bp from 32 species. Phylogenetic analyses were performed based on nucleotide, codon and amino acid data employing Bayesian inference, maximum likelihood and maximum parsimony. All analyses resulted in highly congruent topologies. We found consistent support for a basal position of Leucocytozoon within Haemosporida. In contrast to all previous studies, we recovered a sister group relationship between the genera Polychromophilus and Plasmodium. Within Plasmodium, the sauropsid and mammal-infecting lineages were recovered as sister clades. Support for these relationships was high in nearly all trees, revealing a novel phylogeny of Haemosporida, which is robust to the choice of the outgroup and the method of tree inference.
Affiliation(s)
- Janus Borner
- Institute of Zoology and Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, D-20146 Hamburg, Germany
- Christian Pick
- Institute of Zoology and Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, D-20146 Hamburg, Germany
- Jenny Thiede
- Institute of Zoology and Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, D-20146 Hamburg, Germany
- Olatunji Matthew Kolawole
- Department of Microbiology, Faculty of Life Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State, Nigeria
- Manchang Tanyi Kingsley
- Institute of Agricultural Research for Development, Veterinary Research Laboratory, Wakwa Regional Center, PO Box 65, Ngaoundere, Cameroon
- Jana Schulze
- Institute of Zoology and Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, D-20146 Hamburg, Germany
- Veronika M Cottontail
- Institute of Experimental Ecology, University of Ulm, Albert-Einstein Allee 11, D-89069 Ulm, Germany
- Nele Wellinghausen
- Gaertner & Colleagues Laboratory, Elisabethenstr. 11, D-88212 Ravensburg, Germany
- Jonas Schmidt-Chanasit
- Bernhard Nocht Institute for Tropical Medicine, Bernhard-Nocht-Str. 74, D-20359 Hamburg, Germany
- Iris Bruchhaus
- Bernhard Nocht Institute for Tropical Medicine, Bernhard-Nocht-Str. 74, D-20359 Hamburg, Germany
- Thorsten Burmester
- Institute of Zoology and Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, D-20146 Hamburg, Germany.
|
37
|
Sumner JG, Jarvis PD, Holland BR. A tensorial approach to the inversion of group-based phylogenetic models. BMC Evol Biol 2014; 14:236. [PMID: 25472897 PMCID: PMC4268818 DOI: 10.1186/s12862-014-0236-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/15/2014] [Accepted: 11/06/2014] [Indexed: 11/16/2022] Open
Abstract
Background Hadamard conjugation is part of the standard mathematical armoury in the analysis of molecular phylogenetic methods. For group-based models, the approach provides a one-to-one correspondence between the so-called “edge length” and “sequence” spectrum on a phylogenetic tree. The Hadamard conjugation has been used in diverse phylogenetic applications not only for inference but also as an important conceptual tool for thinking about molecular data leading to generalizations beyond strictly tree-like evolutionary modelling. Results For general group-based models of phylogenetic branching processes, we reformulate the problem of constructing a one-to-one correspondence between pattern probabilities and edge parameters. This takes a classic result previously shown through use of Fourier analysis and presents it in the language of tensors and group representation theory. This derivation makes it clear why the inversion is possible, because, under their usual definition, group-based models are defined for abelian groups only. Conclusion We provide an inversion of group-based phylogenetic models that can be implemented using matrix multiplication between rectangular matrices indexed by ordered-partitions of varying sizes. Our approach provides additional context for the construction of phylogenetic probability distributions on network structures, and highlights the potential limitations of restricting to group-based models in this setting.
Affiliation(s)
- Jeremy G Sumner
- School of Physical Sciences, University of Tasmania, Hobart TAS 7001, Australia.
|
38
|
Affiliation(s)
- Philip S. Ward
- Department of Entomology & Nematology, and Center for Population Biology, University of California, Davis, California 95616
|
39
|
Eocene diversification of crown group rails (Aves: Gruiformes: Rallidae). PLoS One 2014; 9:e109635. [PMID: 25291147 PMCID: PMC4188725 DOI: 10.1371/journal.pone.0109635] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 06/13/2014] [Accepted: 09/05/2014] [Indexed: 12/29/2022] Open
Abstract
Central to our understanding of the timing of bird evolution is debate about an apparent conflict between fossil and molecular data. A deep age for higher level taxa within Neoaves is evident from molecular analyses but much remains to be learned about the age of diversification in modern bird families and their evolutionary ecology. In order to better understand the timing and pattern of diversification within the family Rallidae we used a relaxed molecular clock, fossil calibrations, and complete mitochondrial genomes from a range of rallid species analysed in a Bayesian framework. The estimated time of origin of Rallidae is Eocene, about 40.5 Mya, with evidence of intrafamiliar diversification from the Late Eocene to the Miocene. This timing is older than previously suggested for crown group Rallidae, but fossil calibrations, extent of taxon sampling and substantial sequence data give it credence. We note that fossils of Eocene age tentatively assigned to Rallidae are consistent with our findings. Compared to available studies of other bird lineages, the rail clade is old and supports an inference of deep ancestry of ground-dwelling habits among Neoaves.
|
40
|
Li T, Hua J, Wright AM, Cui Y, Xie Q, Bu W, Hillis DM. Long-branch attraction and the phylogeny of true water bugs (Hemiptera: Nepomorpha) as estimated from mitochondrial genomes. BMC Evol Biol 2014; 14:99. [PMID: 24884699 PMCID: PMC4101842 DOI: 10.1186/1471-2148-14-99] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 01/10/2014] [Accepted: 04/29/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Most previous studies of morphological and molecular data have consistently supported the monophyly of the true water bugs (Hemiptera: Nepomorpha). An exception is a recent study by Hua et al. (BMC Evol Biol 9: 134, 2009) based on nine nepomorphan mitochondrial genomes. In the analysis of Hua et al. (BMC Evol Biol 9: 134, 2009), the water bugs in the group Pleoidea formed the sister group to a clade that consisted of Nepomorpha (the remaining true water bugs) + Leptopodomorpha (shore bugs) + Cimicomorpha (assassin bugs and relatives) + Pentatomomorpha (stink bugs and relatives), thereby suggesting that fully aquatic hemipterans evolved independently at least twice. Based on these results, Hua et al. (BMC Evol Biol 9: 134, 2009) elevated the Pleoidea to a new infraorder, the Plemorpha. RESULTS Our reanalysis suggests that the lack of support for the monophyly of the true water bugs (including Pleoidea) by Hua et al. (BMC Evol Biol 9: 134, 2009) likely resulted from inadequate taxon sampling. In particular, long-branch attraction (LBA) between the distant outgroup taxa and Pleoidea, as well as LBA among taxa in the ingroup, made Nepomorpha appear to be polyphyletic. We used three complementary strategies to test and alleviate the effects of LBA: (1) the removal of distant outgroups from the analysis; (2) the addition of closely related outgroups; and (3) the addition of a mitochondrial genome from a second family of Pleoidea. We also performed likelihood-ratio tests to examine the support for monophyly of Nepomorpha with different combinations of taxa included in the analysis. Furthermore, we found that specimens of Helotrephes sp. were misidentified as Paraplea frontalis (Fieber, 1844) by Hua et al. (BMC Evol Biol 9: 134, 2009). CONCLUSIONS All analyses that included the addition of more taxa significantly and consistently supported the placement of Pleoidea within the Nepomorpha (i.e., supported the monophyly of the traditional true water bugs). 
Our analyses further support a close relationship between Notonectoidea and Pleoidea within Nepomorpha, and the superfamilies Nepoidea, Ochteroidea, Naucoroidea, and Pleoidea are resolved as monophyletic in all trees with strong support. Our results also confirmed that monophyly of Nepomorpha clearly is not refuted by the mitochondrial genome data.
Affiliation(s)
- Teng Li
- Institute of Entomology, College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin 300071, China
- Jimeng Hua
- Institute of Entomology, College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin 300071, China
- April M Wright
- Department of Integrative Biology, University of Texas at Austin, Austin TX 78712, USA
- Ying Cui
- Institute of Entomology, College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin 300071, China
- Qiang Xie
- Institute of Entomology, College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin 300071, China
- Wenjun Bu
- Institute of Entomology, College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin 300071, China
- David M Hillis
- Department of Integrative Biology, University of Texas at Austin, Austin TX 78712, USA
|
41
|
Khan FAA, Phillips CD, Baker RJ. Timeframes of speciation, reticulation, and hybridization in the bulldog bat explained through phylogenetic analyses of all genetic transmission elements. Syst Biol 2013; 63:96-110. [PMID: 24149076 DOI: 10.1093/sysbio/syt062] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022] Open
Abstract
Phylogenetic comparison of the different mammalian genetic transmission elements (mtDNA, X-, Y-, and autosomal DNA) is a powerful approach for understanding the process of speciation in nature. Through such comparisons the unique inheritance pathways of each genetic element and gender-biased processes can link genomic structure to the evolutionary process, especially among lineages which have recently diversified, in which genetic isolation may be incomplete. Bulldog bats of the genus Noctilio are an exemplar lineage, being a young clade, widely distributed, and exhibiting unique feeding ecologies. In addition, currently recognized species are paraphyletic with respect to the mtDNA gene tree and contain morphologically identifiable clades that exhibit mtDNA divergences as great as among many species. To test taxonomic hypotheses and understand the contribution of hybridization to the extant distribution of genetic diversity in Noctilio, we used phylogenetic, coalescent stochastic modeling, and divergence time estimates using sequence data from cytochrome-b, cytochrome c oxidase-I, zinc finger Y, and zinc finger X, as well as evolutionary reconstructions based on amplified fragment length polymorphism (AFLP) data. No evidence of ongoing hybridization between the two currently recognized species was identified. However, signatures of an ancient mtDNA capture were recovered in which an mtDNA lineage of one species was captured early in the noctilionid radiation. Among subspecific mtDNA clades, which were generally coincident with morphology and statistically definable as species, signatures of ongoing hybridization were observed in sex chromosome sequences and AFLP. Divergence dating of genetic elements corroborates the diversification of extant Noctilio beginning about 3 Ma, with ongoing hybridization between mitochondrial lineages separated by 2.5 myr.
The timeframe of species' divergence within Noctilio supports the hypothesis that shifts in the dietary strategies of gleaning insects (N. albiventris) or fish (N. leporinus) are among the most rapid instances of dietary evolution observed in mammals. This study illustrates the complex evolutionary dynamics shaping gene pools in nature, how comparisons of genetic elements can serve for understanding species boundaries, and the complex considerations for accurate taxonomic assignment.
Affiliation(s)
- Faisal Ali Anwarali Khan
- Department of Biological Sciences and the Museum, Texas Tech University, Lubbock, TX 79409, USA and Department of Zoology, Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak 94300, Malaysia
|
42
|
Deep metazoan phylogeny: When different genes tell different stories. Mol Phylogenet Evol 2013; 67:223-33. [DOI: 10.1016/j.ympev.2013.01.010] [Citation(s) in RCA: 200] [Impact Index Per Article: 18.2] [Received: 10/22/2012] [Revised: 01/08/2013] [Accepted: 01/12/2013] [Indexed: 11/30/2022]
|
43
|
Steel M, Linz S, Huson DH, Sanderson MJ. Identifying a species tree subject to random lateral gene transfer. J Theor Biol 2013; 322:81-93. [DOI: 10.1016/j.jtbi.2013.01.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 12/09/2012] [Revised: 01/09/2013] [Accepted: 01/10/2013] [Indexed: 11/26/2022]
|
44
|
Górecki P, Eulenstein O, Tiuryn J. Unrooted tree reconciliation: a unified approach. IEEE/ACM Trans Comput Biol Bioinform 2013; 10:522-536. [PMID: 23929875 DOI: 10.1109/tcbb.2013.22] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 06/02/2023]
Abstract
Tree comparison functions are widely used in phylogenetics for comparing evolutionary trees. Unrooted trees can be compared with rooted trees by identifying all rootings of the unrooted tree that minimize some provided comparison function between two rooted trees. The plateau property is satisfied by the provided function, if all optimal rootings form a subtree, or plateau, in the unrooted tree, from which the rootings along every path toward a leaf have monotonically increasing costs. This property is sufficient for the linear-time identification of all optimal rootings and rooting costs. However, the plateau property has only been proven for a few rooted comparison functions, requiring individual proofs for each function without benefitting from inherent structural features of such functions. Here, we introduce the consistency condition that is sufficient for a general function to satisfy the plateau property. For consistent functions, we introduce general linear-time solutions that identify optimal rootings and all rooting costs. Further, we identify novel relationships between consistent functions in terms of plateaus, especially the plateau of the well-studied duplication-loss function is part of a plateau of every other consistent function. We introduce a novel approach for identifying consistent cost functions by defining a formal language of Boolean costs. Formulas in this language can be interpreted as cost functions. Finally, we demonstrate the performance of our general linear-time solutions in practice using empirical and simulation studies.
Affiliation(s)
- Pawel Górecki
- Department of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Mazowieckie 02-097, Poland
|
45
|
Cardoso D, Paganucci de Queiroz L, Cavalcante de Lima H, Suganuma E, van den Berg C, Lavin M. A molecular phylogeny of the vataireoid legumes underscores floral evolvability that is general to many early-branching papilionoid lineages. Am J Bot 2013; 100:403-421. [PMID: 23378491 DOI: 10.3732/ajb.1200276] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 06/01/2023]
Abstract
PREMISE OF STUDY Flowering traits can sometimes be overemphasized in taxonomic classifications. The fused and completely differentiated papilionate floral organs in the neotropical legume trees Vatairea and Vataireopsis were traditionally used in part to ascribe these genera to the tribe Dalbergieae. In contrast, the free and mostly undifferentiated floral parts of Luetzelburgia and Sweetia fit the circumscription of the "primitive" Sophoreae. Such divergent floral morphologies thought to divide deep phylogenetic lineages indeed may be prone to episodic transformation among close papilionoid relatives. METHODS We sampled 26 of 27 known species of Luetzelburgia, Sweetia, Vatairea, and Vataireopsis in parsimony and Bayesian phylogenetic analyses of nuclear ribosomal ITS/5.8S and six plastid (matK, 3'-trnK, psbA-trnH, trnL intron, rps16 intron, and trnD-T) DNA sequence loci. KEY RESULTS The analyses of individual and combined data sets strongly resolved the monophyly of each of Luetzelburgia, Sweetia, Vatairea, and Vataireopsis. Vataireopsis was resolved as sister to the rest and the morphologically divergent Luetzelburgia and Vatairea were strongly resolved as sister clades. Floral morphology was generally not a good predictor of phylogenetic relatedness. CONCLUSIONS Luetzelburgia, Sweetia, Vatairea, and Vataireopsis are unequivocally resolved as the "vataireoid" clade. Fruit and vegetative traits are found to be more phylogenetically conserved than many floral traits. This explains why the identity of the vataireoids has been overlooked or confused. The evolvability of floral traits may also be a general condition among many of the early-branching papilionoid lineages.
Affiliation(s)
- Domingos Cardoso
- Departamento de Ciências Biológicas, Universidade Estadual de Feira de Santana, Av. Transnordestina, s/n, Novo Horizonte, 44036-900, Feira de Santana, Bahia, Brazil.
|
46
|
Chaudhary R, Burleigh JG, Fernández-Baca D. Fast local search for unrooted Robinson-Foulds supertrees. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012; 9:1004-1013. [PMID: 22431553 DOI: 10.1109/tcbb.2012.47] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 05/31/2023]
Abstract
A Robinson-Foulds (RF) supertree for a collection of input trees is a tree containing all the species in the input trees that is at minimum total RF distance to the input trees. Thus, an RF supertree is consistent with the maximum number of splits in the input trees. Constructing RF supertrees for rooted and unrooted data is NP-hard. Nevertheless, effective local search heuristics have been developed for the restricted case where the input trees and the supertree are rooted. We describe new heuristics, based on the Edge Contract and Refine (ECR) operation, that remove this restriction, thereby expanding the utility of RF supertrees. Our experimental results on simulated and empirical data sets show that our unrooted local search algorithms yield better supertrees than those obtained from MRP and rooted RF heuristics in terms of total RF distance to the input trees and, for simulated data, in terms of RF distance to the true tree.
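The objective described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' ECR-based heuristic; it only scores a candidate supertree by its total RF distance to the input trees. The nested-tuple tree representation and the function names (`leaves`, `splits`, `rf_distance`, `total_rf`) are assumptions made for illustration.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def splits(tree, keep=None):
    """Non-trivial bipartitions of an unrooted tree, restricted to `keep`.

    The tree is written as nested tuples; the position of the outermost
    tuple is arbitrary because only bipartitions are recorded.
    """
    all_leaves = leaves(tree)
    taxa = all_leaves if keep is None else all_leaves & frozenset(keep)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node) & taxa
        other = taxa - side
        # A split is informative only if both sides hold at least two taxa.
        if len(side) >= 2 and len(other) >= 2:
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def rf_distance(t1, t2):
    """RF distance between two trees on the same leaf set."""
    return len(splits(t1) ^ splits(t2))


def total_rf(supertree, input_trees):
    """Sum of RF distances from the supertree, restricted to the leaves
    of each input tree, to that input tree."""
    return sum(
        len(splits(supertree, leaves(t)) ^ splits(t)) for t in input_trees
    )
```

For example, `rf_distance((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D")))` evaluates to 2, since each tree contains one split that the other lacks. A local search of the kind the abstract describes would repeatedly modify the candidate supertree and keep changes that lower `total_rf`.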
Affiliation(s)
- Ruchi Chaudhary
- Department of Computer Science, Iowa State University, Atanasoff Hall, Ames, IA 50011-1041, USA.
|
47
|
Mariadassou M, Bar-Hen A, Kishino H. Taxon influence index: assessing taxon-induced incongruities in phylogenetic inference. Syst Biol 2012; 61:337-45. [PMID: 22228800 DOI: 10.1093/sysbio/syr129] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
Understanding the evolutionary history of species is at the core of molecular evolution and is done using several inference methods. The critical issue is to quantify the uncertainty of the inference. The posterior probabilities in Bayesian phylogenetic inference and the bootstrap values in frequentist approaches measure the variability of the estimates due to the sampling of sites from genes and the sampling of genes from genomes. However, they do not measure the uncertainty due to taxon sampling. Taxa that experienced molecular homoplasy, recent selection, a spur of evolution, and so forth may disrupt the inference and cause incongruences in the estimated phylogeny. We define a taxon influence index to assess the influence of each taxon on the phylogeny. We found that although most taxa have a weak influence on the phylogeny, a small fraction of influential taxa strongly alter it even in clades only loosely related to them. We conclude that highly influential taxa should be given special attention and sampling them more thoroughly can lead to more dependable phylogenies.
Affiliation(s)
- Mahendra Mariadassou
- Department of Mathematics and Informatics, MAP5, Université Paris Descartes, 45 rue des Saints Pères, 75270 Paris Cedex 06, France.
|
48
|
Rooting phylogenies using gene duplications: An empirical example from the bees (Apoidea). Mol Phylogenet Evol 2011; 60:295-304. [DOI: 10.1016/j.ympev.2011.05.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 07/12/2010] [Revised: 04/26/2011] [Accepted: 05/03/2011] [Indexed: 12/23/2022]
|
49
|
Lim CH, Hamazaki T, Braun EL, Wade J, Terada N. Evolutionary genomics implies a specific function of Ant4 in mammalian and anole lizard male germ cells. PLoS One 2011; 6:e23122. [PMID: 21858006 PMCID: PMC3155547 DOI: 10.1371/journal.pone.0023122] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 05/18/2011] [Accepted: 07/11/2011] [Indexed: 11/18/2022] Open
Abstract
Most vertebrates have three paralogous genes with identical intron-exon structures and a high degree of sequence identity that encode mitochondrial adenine nucleotide translocase (Ant) proteins, Ant1 (Slc25a4), Ant2 (Slc25a5) and Ant3 (Slc25a6). Recently, we and others identified a fourth mammalian Ant paralog, Ant4 (Slc25a31), with a distinct intron-exon structure and a lower degree of sequence identity. Ant4 was expressed selectively in testis and sperm in adult mammals and was indeed essential for mouse spermatogenesis, but it was absent in birds, fish and frogs. Since Ant2 is X-linked in mammalian genomes, we hypothesized that the autosomal Ant4 gene may compensate for the loss of Ant2 gene expression during male meiosis in mammals. Here we report that the Ant4 ortholog is conserved in the green anole lizard (Anolis carolinensis) and demonstrate that it is expressed in the anole testis. Further, a degenerate DNA fragment of a putative Ant4 gene was identified in syntenic regions of avian genomes, indicating that Ant4 was present in the common amniote ancestor. Phylogenetic analyses suggest an even more ancient origin of the Ant4 gene. Although anole lizards are presumed male (XY) heterogametic, like mammals, copy numbers of Ant2 and of its neighboring gene were similar between male and female anole genomes, indicating that the anole Ant2 gene is either autosomal or located in the pseudoautosomal region of the sex chromosomes, in contrast to the case in mammals. These results imply that the conservation of Ant4 is not likely driven simply by the sex chromosomal localization of the Ant2 gene and its subsequent inactivation during male meiosis. Taken together with the fact that the Ant4 protein has a uniquely conserved structure compared with the somatic Ant1, Ant2 and Ant3, there may be a specific advantage for mammals and lizards to express Ant4 in their male germ cells.
Affiliation(s)
- Chae Ho Lim
- Department of Pathology, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- Takashi Hamazaki
- Department of Pathology, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- Edward L. Braun
- Department of Biology, College of Liberal Arts and Sciences, University of Florida, Gainesville, Florida, United States of America
- Juli Wade
- Neuroscience Program, Department of Psychology, Department of Zoology, Michigan State University, East Lansing, Michigan, United States of America
- Naohiro Terada
- Department of Pathology, College of Medicine, University of Florida, Gainesville, Florida, United States of America
|
50
|
Chang WC, Burleigh GJ, Fernández-Baca DF, Eulenstein O. An ILP solution for the gene duplication problem. BMC Bioinformatics 2011; 12 Suppl 1:S14. [PMID: 21342543 PMCID: PMC3044268 DOI: 10.1186/1471-2105-12-s1-s14] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 05/26/2023] Open
Abstract
Background The gene duplication (GD) problem seeks a species tree that implies the fewest gene duplication events across a given collection of gene trees. Solving this problem makes it possible to use large gene families with complex histories of duplication and loss to infer phylogenetic trees. However, the GD problem is NP-hard, and therefore, most analyses use heuristics that lack any performance guarantee. Results We describe the first integer linear programming (ILP) formulation to solve instances of the gene duplication problem exactly. With simulations, we demonstrate that the ILP solution can solve problem instances with up to 14 taxa. Furthermore, we apply the new ILP solution to solve the gene duplication problem for the seed plant phylogeny using a 12-taxon, 6,084-gene data set. The unique, optimal solution, which places Gnetales sister to the conifers, represents a new, large-scale genomic perspective on one of the most puzzling questions in plant systematics. Conclusions Although the GD problem is NP-hard, our novel ILP solution for it can solve instances with data sets consisting of as many as 14 taxa and 1,000 genes in a few hours. These are the largest instances that have been solved to optimality to date. Thus, this work can provide large-scale genomic perspectives on phylogenetic questions that previously could only be addressed by heuristic estimates.
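The quantity that the GD problem minimizes can be sketched in a few lines of Python. This is not the ILP formulation from the paper; it only counts the duplications that one fixed species tree implies for one gene tree, using the standard least-common-ancestor (LCA) mapping. The nested-tuple representation, the convention that gene-tree leaves carry species names, and the function names are assumptions made for illustration.

```python
def leaf_set(tree):
    """Set of leaf labels below a node of a rooted nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """All clades of a rooted tree, each as a frozenset of leaf labels."""
    if isinstance(tree, str):
        return [leaf_set(tree)]
    return [leaf_set(tree)] + [c for child in tree for c in clades(child)]


def lca(species_clades, species):
    """Smallest clade of the species tree that contains `species`."""
    return min((c for c in species_clades if species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Number of gene-tree nodes inferred as duplications under LCA mapping.

    Gene-tree leaves are labeled with the species they were sampled from,
    so the same label may appear more than once.
    """
    species_clades = clades(species_tree)
    dups = 0

    def visit(node):
        nonlocal dups
        here = lca(species_clades, leaf_set(node))
        if isinstance(node, str):
            return here
        child_maps = [visit(child) for child in node]
        # A node is a duplication when it maps to the same species-tree
        # clade as at least one of its children.
        if here in child_maps:
            dups += 1
        return here

    visit(gene_tree)
    return dups
```

With the species tree `(("A", "B"), "C")`, the gene tree `((("A", "B"), "C"), ("A", "C"))` implies one duplication (at its root), while the concordant gene tree `(("A", "B"), "C")` implies none. An exact or heuristic GD solver would search over species trees for the one minimizing the sum of this count across all gene trees.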
Affiliation(s)
- Wen-Chieh Chang
- Department of Computer Science, Iowa State University, Ames 50011, USA.
|