1
Mo S, Zhu Y, Braga MP, Lohman DJ, Nylin S, Moumou A, Wheat CW, Wahlberg N, Wang M, Ma F, Zhang P, Wang H. Rapid Evolution of Host Repertoire and Geographic Range in a Young and Diverse Genus of Montane Butterflies. Syst Biol 2025; 74:141-157. [PMID: 39484941 PMCID: PMC11809587 DOI: 10.1093/sysbio/syae061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 10/14/2024] [Accepted: 11/05/2024] [Indexed: 11/03/2024] Open
Abstract
Evolutionary changes in geographic distribution and larval host plants may promote the rapid diversification of montane insects, but this scenario has been rarely investigated. We studied the rapid radiation of the butterfly genus Colias, which has diversified in mountain ecosystems in Eurasia, Africa, and the Americas. Based on a data set of 150 nuclear protein-coding genetic loci and mitochondrial genomes, we constructed a time-calibrated phylogenetic tree of Colias species with broad taxon sampling. We then inferred their ancestral geographic ranges, historical diversification rates, and the evolution of host use. We found that the most recent common ancestor of Colias was likely geographically widespread and originated ~3.5 Ma. The group subsequently diversified in different regions across the world, often in tandem with geographic expansion events. No aspect of elevation was found to have a direct effect on diversification. The genus underwent a burst of diversification soon after the divergence of the Neotropical lineage, followed by an exponential decline in diversification rate toward the present. The ancestral host repertoire included the legume genera Astragalus and Trifolium but later expanded to include a wide range of Fabaceae genera and plants in more distantly related families, punctuated with periods of host range expansion and contraction. We suggest that the widespread distribution of the ancestor of all extant Colias lineages set the stage for diversification by isolation of populations that locally adapted to the various different environments they encountered, including different host plants. In this scenario, elevation is not the main driver but might have accelerated diversification by isolating populations.
Affiliation(s)
- Shifang Mo
  - Department of Entomology, College of Plant Protection, South China Agricultural University, 483 Wushan Road, Tianhe District, Guangzhou, 510000, China
- Yaowei Zhu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 135 Xingangxi Road, Haizhu District, Guangzhou, 510275, China
- Mariana P Braga
  - Department of Ecology, Swedish University of Agricultural Sciences, Ulls väg 16, Uppsala, 75649, Sweden
- David J Lohman
  - Department of Biology, City College of New York, City University of New York, 160 Convent Ave., New York, NY 10031, USA
  - PhD Program in Biology, Graduate Center, City University of New York, 365 5th Ave., New York, NY 10016, USA
  - Entomology Section, National Museum of Natural History, Rizal Park, T.W. Kalaw St., Manila, 1000, Philippines
- Sören Nylin
  - Department of Zoology, Svante Arrhenius väg 18B, Stockholm University, Stockholm, SE-10691, Sweden
- Ashraf Moumou
  - Department of Biology, City College of New York, City University of New York, 160 Convent Ave., New York, NY 10031, USA
- Christopher W Wheat
  - Department of Zoology, Svante Arrhenius väg 18B, Stockholm University, Stockholm, SE-10691, Sweden
- Niklas Wahlberg
  - Department of Biology, Kontaktvägen 10, Lund University, Lund, SWE-22362, Sweden
- Min Wang
  - Department of Entomology, College of Plant Protection, South China Agricultural University, 483 Wushan Road, Tianhe District, Guangzhou, 510000, China
- Fangzhou Ma
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment of China, 8 Jiangwangmiao Road, Xuanwu District, Nanjing, 210000, China
- Peng Zhang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 135 Xingangxi Road, Haizhu District, Guangzhou, 510275, China
- Houshuai Wang
  - Department of Entomology, College of Plant Protection, South China Agricultural University, 483 Wushan Road, Tianhe District, Guangzhou, 510000, China
2
Kong S, Swofford DL, Kubatko LS. Inference of Phylogenetic Networks From Sequence Data Using Composite Likelihood. Syst Biol 2025; 74:53-69. [PMID: 39387633 DOI: 10.1093/sysbio/syae054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 09/13/2024] [Accepted: 10/08/2024] [Indexed: 10/12/2024] Open
Abstract
While phylogenies have been essential in understanding how species evolve, they do not adequately describe some evolutionary processes. For instance, hybridization, a common phenomenon where interbreeding between 2 species leads to formation of a new species, must be depicted by a phylogenetic network, a structure that modifies a phylogenetic tree by allowing 2 branches to merge into 1, resulting in reticulation. However, existing methods for estimating networks become computationally expensive as the dataset size and/or topological complexity increase. The lack of methods for scalable inference hampers phylogenetic networks from being widely used in practice, despite accumulating evidence that hybridization occurs frequently in nature. Here, we propose a novel method, PhyNEST (Phylogenetic Network Estimation using SiTe patterns), that estimates binary, level-1 phylogenetic networks with a fixed, user-specified number of reticulations directly from sequence data. By using the composite likelihood as the basis for inference, PhyNEST is able to use the full genomic data in a computationally tractable manner, eliminating the need to summarize the data as a set of gene trees prior to network estimation. To search network space, PhyNEST implements both hill climbing and simulated annealing algorithms. PhyNEST assumes that the data are composed of coalescent independent sites that evolve according to the Jukes-Cantor substitution model and that the network has a constant effective population size. Simulation studies demonstrate that PhyNEST is often more accurate than 2 existing composite likelihood summary methods (SNaQ and PhyloNet) and that it is robust to at least one form of model misspecification (assuming a less complex nucleotide substitution model than the true generating model).
We applied PhyNEST to reconstruct the evolutionary relationships among Heliconius butterflies and Papionini primates, characterized by hybrid speciation and widespread introgression, respectively. PhyNEST is implemented in an open-source Julia package and is publicly available at https://github.com/sungsik-kong/PhyNEST.jl.
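For readers unfamiliar with the Jukes-Cantor substitution model named in the abstract above, its transition probabilities have a simple closed form. The sketch below is only an illustration of that textbook formula; the function name and the choice to measure branch length in expected substitutions per site are assumptions made here, and nothing in it reflects how PhyNEST itself is implemented.

```python
import math


def jc69_transition_probability(same_base: bool, branch_length: float) -> float:
    """Jukes-Cantor probability of ending at the same base (same_base=True)
    or at one specific different base (same_base=False) after
    `branch_length` expected substitutions per site.

    Hypothetical helper for illustration; not taken from PhyNEST.
    """
    if branch_length < 0:
        raise ValueError("branch_length must be non-negative")
    decay = math.exp(-4.0 * branch_length / 3.0)
    if same_base:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


# One "same" entry plus three "different" entries make up a row of the
# 4x4 transition matrix, so they must sum to one.
row_sum = (jc69_transition_probability(True, 0.1)
           + 3 * jc69_transition_probability(False, 0.1))
```

As the branch length grows, both probabilities approach 0.25, which is the model's uniform stationary distribution.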
Affiliation(s)
- Sungsik Kong
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- David L Swofford
  - Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Laura S Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
3
Patané JSL, Martins J, Setubal JC. A Guide to Phylogenomic Inference. Methods Mol Biol 2024; 2802:267-345. [PMID: 38819564 DOI: 10.1007/978-1-0716-3838-5_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Phylogenomics aims at reconstructing the evolutionary histories of organisms taking into account whole genomes or large fractions of genomes. Phylogenomics has significant applications in fields such as evolutionary biology, systematics, comparative genomics, and conservation genetics, providing valuable insights into the origins and relationships of species and contributing to our understanding of biological diversity and evolution. This chapter surveys phylogenetic concepts and methods aimed at both gene tree and species tree reconstruction while also addressing common pitfalls, providing references to relevant computer programs. A practical phylogenomic analysis example including bacterial genomes is presented at the end of the chapter.
Affiliation(s)
- José S L Patané
  - Laboratório de Genética e Cardiologia Molecular, Instituto do Coração/Heart Institute Hospital das Clínicas - Faculdade de Medicina da Universidade de São Paulo São Paulo, São Paulo, SP, Brazil
- Joaquim Martins
  - Integrative Omics group, Biorenewables National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas, SP, Brazil
- João Carlos Setubal
  - Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil
4
Skopalíková J, Leong-Škorničková J, Šída O, Newman M, Chumová Z, Zeisek V, Jarolímová V, Poulsen AD, Dantas-Queiroz MV, Fér T, Záveská E. Ancient hybridization in Curcuma (Zingiberaceae)-Accelerator or brake in lineage diversifications? Plant J 2023; 116:773-785. [PMID: 37537754 DOI: 10.1111/tpj.16408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Revised: 07/13/2023] [Accepted: 07/19/2023] [Indexed: 08/05/2023]
Abstract
Hybridization is a widespread phenomenon in the evolution of plants and exploring its role is crucial to understanding diversification processes of many taxonomic groups. Recently, more attention is focused on the role of ancient hybridization that has repeatedly been shown as triggers of evolutionary radiation, although in some cases, it can prevent further diversification. The causes, frequency, and consequences of ancient hybridization remain to be explored. Here, we present an account of several events of ancient hybridization in turmeric, the economically important plant genus Curcuma (Zingiberaceae), which harbors about 130 known species. We analyzed 1094 targeted low-copy genes and plastomes obtained by next-generation sequencing of 37 species of Curcuma, representing the known genetic diversity and spanning the geographical distribution of the genus. Using phylogenetic network analysis, we show that the entire genus Curcuma as well as its most speciose lineage arose via introgression from the genus Pyrgophyllum and one of the extinct lineages, respectively. We also document a single event of ancient hybridization, with C. vamana as a product, that represents an evolutionary dead end. We further discuss distinct circumstances of those hybridization events that deal mainly with (in)congruence in chromosome counts of the parental lineages.
Affiliation(s)
- Jana Skopalíková
  - Department of Botany, Charles University, Prague, Czech Republic
  - Czech Academy of Sciences, Institute of Botany, Průhonice, Czech Republic
- Jana Leong-Škorničková
  - The Herbarium, Singapore Botanic Gardens, Singapore
  - Department of Biological Sciences, National University of Singapore, Singapore
- Otakar Šída
  - Department of Botany, National Museum in Prague, Prague, Czech Republic
- Mark Newman
  - Royal Botanic Garden Edinburgh, Edinburgh, Scotland, UK
- Zuzana Chumová
  - Czech Academy of Sciences, Institute of Botany, Průhonice, Czech Republic
- Vojtěch Zeisek
  - Department of Botany, Charles University, Prague, Czech Republic
  - Czech Academy of Sciences, Institute of Botany, Průhonice, Czech Republic
- Vlasta Jarolímová
  - Czech Academy of Sciences, Institute of Botany, Průhonice, Czech Republic
- Tomáš Fér
  - Department of Botany, Charles University, Prague, Czech Republic
- Eliška Záveská
  - Czech Academy of Sciences, Institute of Botany, Průhonice, Czech Republic
5
Yu J, Niu Y, You Y, Cox CJ, Barrett RL, Trias-Blasi A, Guo J, Wen J, Lu L, Chen Z. Integrated phylogenomic analyses unveil reticulate evolution in Parthenocissus (Vitaceae), highlighting speciation dynamics in the Himalayan-Hengduan Mountains. New Phytol 2023; 238:888-903. [PMID: 36305244 DOI: 10.1111/nph.18580] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/09/2022] [Accepted: 10/19/2022] [Indexed: 06/16/2023]
Abstract
Hybridization caused by frequent environmental changes can lead both to species diversification (speciation) and to speciation reversal (despeciation), but the latter has rarely been demonstrated. Parthenocissus, a genus with its trifoliolate lineage in the Himalayan-Hengduan Mountains (HHM) region showing perplexing phylogenetic relationships, provides an opportunity for investigating speciation dynamics based on integrated evidence. We investigated phylogenetic discordance and reticulate evolution in Parthenocissus based on rigorous analyses of plastome and transcriptome data. We focused on reticulations in the trifoliolate lineage in the HHM region using a population-level genome resequencing dataset, incorporating evidence from morphology, distribution, and elevation. Comprehensive analyses confirmed multiple introgressions within Parthenocissus in a robust temporal-spatial framework. Around the HHM region, at least three hybridization hot spots were identified, one of which showed evidence of ongoing speciation reversal. We present a solid case study using an integrative methodological approach to investigate reticulate evolutionary history and its underlying mechanisms in plants. It demonstrates an example of speciation reversal through frequent hybridizations in the HHM region, which provides new perspectives on speciation dynamics in mountainous areas with strong topographic and environmental heterogeneity.
Affiliation(s)
- Jinren Yu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Yanting Niu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - China National Botanical Garden, Beijing, 100093, China
- Yichen You
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Cymon J Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Gambelas, Faro, 8005-319, Portugal
- Russell L Barrett
  - National Herbarium of New South Wales, Australian Botanic Garden, Locked Bag 6002, Mount Annan, 2567, NSW, Australia
- Jing Guo
  - State Key Laboratory of Genetic Engineering and Collaborative Innovation Center of Genetics and Development, Ministry of Education Key Laboratory of Biodiversity and Ecological Engineering, Institute of Plant Biology, Center of Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, 200433, China
- Jun Wen
  - Department of Botany, National Museum of Natural History, MRC-166, Smithsonian Institution, Washington, DC, 20013-7012, USA
- Limin Lu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Zhiduan Chen
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
6
Kong S. Digest: Frequent hybridization in Darevskia rarely leads to the evolution of asexuality. Evolution 2022; 76:2216-2217. [PMID: 35909234 PMCID: PMC9546131 DOI: 10.1111/evo.14587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 07/22/2022] [Indexed: 01/22/2023]
Abstract
Asexuality in vertebrates is often generated via hybridization, but is it a rare product of pervasive hybridization or a common product of rare hybridization? Freitas et al. show that hybridization is frequent among the sexual species of Darevskia, although the crossings between parents of the asexual hybrids are undetected. This study illustrates that hybridization is not extraordinary in nature, and thus scalable phylogenetic network inference methods, rather than phylogenetic trees, are needed to accurately represent the true evolutionary history.
Affiliation(s)
- Sungsik Kong
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, USA
7
Lutteropp S, Scornavacca C, Kozlov AM, Morel B, Stamatakis A. NetRAX: accurate and fast maximum likelihood phylogenetic network inference. Bioinformatics 2022; 38:3725-3733. [PMID: 35713506 DOI: 10.1101/2021.08.30.458194] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/16/2021] [Revised: 05/11/2022] [Accepted: 06/14/2022] [Indexed: 05/26/2023]
Abstract
MOTIVATION Phylogenetic networks can represent non-treelike evolutionary scenarios. Current, actively developed approaches for phylogenetic network inference jointly account for non-treelike evolution and incomplete lineage sorting (ILS). Unfortunately, this induces a very high computational complexity and current tools can only analyze small datasets. RESULTS We present NetRAX, a tool for maximum likelihood (ML) inference of phylogenetic networks in the absence of ILS. Our tool leverages state-of-the-art methods for efficiently computing the phylogenetic likelihood function on trees, and extends them to phylogenetic networks via the notion of 'displayed trees'. NetRAX can infer ML phylogenetic networks from partitioned multiple sequence alignments and returns the inferred networks in Extended Newick format. On simulated data, our results show a very low relative difference in Bayesian Information Criterion (BIC) score and a near-zero unrooted softwired cluster distance to the true, simulated networks. With NetRAX, a network inference on a partitioned alignment with 8000 sites, 30 taxa and 3 reticulations completes within a few minutes on a standard laptop. AVAILABILITY AND IMPLEMENTATION Our implementation is available under the GNU General Public License v3.0 at https://github.com/lutteropp/NetRAX. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sarah Lutteropp
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
- Céline Scornavacca
  - Institut des Sciences de l'Évolution Université de Montpellier, CNRS, IRD, EPHE Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- Alexey M Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe 76128, Germany
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe 76128, Germany
8
Lutteropp S, Scornavacca C, Kozlov AM, Morel B, Stamatakis A. NetRAX: Accurate and Fast Maximum Likelihood Phylogenetic Network Inference. Bioinformatics 2022; 38:3725-3733. [PMID: 35713506 PMCID: PMC9344847 DOI: 10.1093/bioinformatics/btac396] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/16/2021] [Revised: 05/11/2022] [Accepted: 06/14/2022] [Indexed: 12/03/2022] Open
Abstract
Motivation Phylogenetic networks can represent non-treelike evolutionary scenarios. Current, actively developed approaches for phylogenetic network inference jointly account for non-treelike evolution and incomplete lineage sorting (ILS). Unfortunately, this induces a very high computational complexity and current tools can only analyze small datasets. Results We present NetRAX, a tool for maximum likelihood (ML) inference of phylogenetic networks in the absence of ILS. Our tool leverages state-of-the-art methods for efficiently computing the phylogenetic likelihood function on trees, and extends them to phylogenetic networks via the notion of ‘displayed trees’. NetRAX can infer ML phylogenetic networks from partitioned multiple sequence alignments and returns the inferred networks in Extended Newick format. On simulated data, our results show a very low relative difference in Bayesian Information Criterion (BIC) score and a near-zero unrooted softwired cluster distance to the true, simulated networks. With NetRAX, a network inference on a partitioned alignment with 8000 sites, 30 taxa and 3 reticulations completes within a few minutes on a standard laptop. Availability and implementation Our implementation is available under the GNU General Public License v3.0 at https://github.com/lutteropp/NetRAX. Supplementary information Supplementary data are available at Bioinformatics online.
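The Bayesian Information Criterion mentioned in the abstract above trades model fit against the number of free parameters. The sketch below shows the standard BIC formula only. The helper names, and in particular the definition of "relative difference" used here, are assumptions for illustration and are not taken from NetRAX.

```python
import math


def bic(log_likelihood: float, num_parameters: int, num_sites: int) -> float:
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower values are preferred."""
    if num_sites <= 0:
        raise ValueError("num_sites must be positive")
    return num_parameters * math.log(num_sites) - 2.0 * log_likelihood


def relative_bic_difference(bic_inferred: float, bic_reference: float) -> float:
    """Hypothetical definition: signed difference scaled by the reference BIC."""
    return (bic_inferred - bic_reference) / abs(bic_reference)


# Illustrative numbers only; they are not results from the paper.
score = bic(log_likelihood=-1000.0, num_parameters=10, num_sites=8000)
```

Adding a reticulation adds parameters, so under this criterion it is only favoured when the gain in log-likelihood outweighs the extra `ln(n)` penalty per parameter.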
Affiliation(s)
- Sarah Lutteropp
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, 69118, Germany
- Céline Scornavacca
  - Institut des Sciences de l'Évolution Université de Montpellier, CNRS, IRD, EPHE Place Eugène Bataillon 34095, Montpellier Cedex 05, France
- Alexey M Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, 69118, Germany
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, 76128, Germany
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, 76128, Germany
9
Kong S, Pons JC, Kubatko L, Wicke K. Classes of explicit phylogenetic networks and their biological and mathematical significance. J Math Biol 2022; 84:47. [PMID: 35503141 DOI: 10.1007/s00285-022-01746-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 09/21/2021] [Revised: 01/18/2022] [Accepted: 03/31/2022] [Indexed: 11/24/2022]
Abstract
The evolutionary relationships among organisms have traditionally been represented using rooted phylogenetic trees. However, due to reticulate processes such as hybridization or lateral gene transfer, evolution cannot always be adequately represented by a phylogenetic tree, and rooted phylogenetic networks that describe such complex processes have been introduced as a generalization of rooted phylogenetic trees. In fact, estimating rooted phylogenetic networks from genomic sequence data and analyzing their structural properties is one of the most important tasks in contemporary phylogenetics. Over the last two decades, several subclasses of rooted phylogenetic networks (characterized by certain structural constraints) have been introduced in the literature, either to model specific biological phenomena or to enable tractable mathematical and computational analyses. In the present manuscript, we provide a thorough review of these network classes, as well as provide a biological interpretation of the structural constraints underlying these networks where possible. In addition, we discuss how imposing structural constraints on the network topology can be used to address the scalability and identifiability challenges faced in the estimation of phylogenetic networks from empirical data.
Affiliation(s)
- Sungsik Kong
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Joan Carles Pons
  - Department of Mathematics and Computer Science, University of the Balearic Islands, Palma, 07122, Spain
- Laura Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
  - Department of Statistics, The Ohio State University, Columbus, OH, USA
- Kristina Wicke
  - Department of Mathematics, The Ohio State University, Columbus, OH, USA
10
Markin A, Wagle S, Anderson TK, Eulenstein O. RF-Net 2: fast inference of virus reassortment and hybridization networks. Bioinformatics 2022; 38:2144-2152. [PMID: 35150239 PMCID: PMC9004648 DOI: 10.1093/bioinformatics/btac075] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/05/2021] [Revised: 01/26/2022] [Accepted: 02/07/2022] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION A phylogenetic network is a powerful model to represent entangled evolutionary histories with both divergent (speciation) and convergent (e.g. hybridization, reassortment, recombination) evolution. The standard approach to inference of hybridization networks is to (i) reconstruct rooted gene trees and (ii) leverage gene tree discordance for network inference. Recently, we introduced a method called RF-Net for accurate inference of virus reassortment and hybridization networks from input gene trees in the presence of errors commonly found in phylogenetic trees. While RF-Net demonstrated the ability to accurately infer networks with up to four reticulations from erroneous input gene trees, its application was limited by the number of reticulations it could handle in a reasonable amount of time. This limitation is particularly restrictive in the inference of the evolutionary history of segmented RNA viruses such as influenza A virus (IAV), where reassortment is one of the major mechanisms shaping the evolution of these pathogens. RESULTS Here, we expand the functionality of RF-Net that makes it significantly more applicable in practice. Crucially, we introduce a fast extension to RF-Net, called Fast-RF-Net, that can handle large numbers of reticulations without sacrificing accuracy. In addition, we develop automatic stopping criteria to select the appropriate number of reticulations heuristically and implement a feature for RF-Net to output error-corrected input gene trees. We then conduct a comprehensive study of the original method and its novel extensions and confirm their efficacy in practice using extensive simulation and empirical IAV evolutionary analyses. AVAILABILITY AND IMPLEMENTATION RF-Net 2 is available at https://github.com/flu-crew/rf-net-2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alexey Markin
  - Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA 50010, USA
- Sanket Wagle
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
- Tavis K Anderson
  - Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA 50010, USA
- Oliver Eulenstein
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
11
Zhou BF, Yuan S, Crowl AA, Liang YY, Shi Y, Chen XY, An QQ, Kang M, Manos PS, Wang B. Phylogenomic analyses highlight innovation and introgression in the continental radiations of Fagaceae across the Northern Hemisphere. Nat Commun 2022; 13:1320. [PMID: 35288565 PMCID: PMC8921187 DOI: 10.1038/s41467-022-28917-1] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 10/18/2021] [Accepted: 02/16/2022] [Indexed: 12/12/2022] Open
Abstract
Northern Hemisphere forests changed drastically in the early Eocene with the diversification of the oak family (Fagaceae). Cooling climates over the next 20 million years fostered the spread of temperate biomes that became increasingly dominated by oaks and their chestnut relatives. Here we use phylogenomic analyses of nuclear and plastid genomes to investigate the timing and pattern of major macroevolutionary events and ancient genome-wide signatures of hybridization across Fagaceae. Innovation related to seed dispersal is implicated in triggering waves of continental radiations beginning with the rapid diversification of major lineages and resulting in unparalleled transformation of forest dynamics within 15 million years following the K-Pg extinction. We detect introgression at multiple time scales, including ancient events predating the origination of genus-level diversity. As oak lineages moved into newly available temperate habitats in the early Miocene, secondary contact between previously isolated species occurred. This resulted in adaptive introgression, which may have further amplified the diversification of white oaks across Eurasia.
Affiliation(s)
- Biao-Feng Zhou
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
  - University of the Chinese Academy of Sciences, 100049, Beijing, China
- Shuai Yuan
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
- Andrew A Crowl
  - Department of Biology, Duke University, Durham, NC, 27708, USA
- Yi-Ye Liang
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
- Yong Shi
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
- Xue-Yan Chen
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
- Qing-Qing An
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
- Ming Kang
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, 510650, Guangzhou, China
- Paul S Manos
  - Department of Biology, Duke University, Durham, NC, 27708, USA
- Baosheng Wang
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, 510650, Guangzhou, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, 510650, Guangzhou, China
12
|
Hibbins MS, Hahn MW. Phylogenomic approaches to detecting and characterizing introgression. Genetics 2022; 220:iyab173. [PMID: 34788444 PMCID: PMC9208645 DOI: 10.1093/genetics/iyab173] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 07/16/2021] [Accepted: 10/02/2021] [Indexed: 12/26/2022]
Abstract
Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
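As a minimal illustration of the simplest test named in this abstract, the D-statistic (ABBA-BABA test) compares the counts of two discordant site patterns in a four-taxon alignment (P1, P2, P3, outgroup). Under incomplete lineage sorting alone the two counts are expected to be equal, so D should be close to zero. The sketch below is not taken from the paper: the function name `d_statistic` and the toy sequences are hypothetical, and a real analysis would also need a significance test (for example a block jackknife), which is omitted here.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns and return (abba, baba, D).

    Each argument is an aligned DNA string. The outgroup allele is treated
    as ancestral ("A") and any differing allele as derived ("B").
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip gaps, missing data, and ambiguity codes.
        if any(x not in "ACGT" for x in (a, b, c, o)):
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele

    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return abba, baba, d


# Hypothetical toy alignment: three ABBA sites and one BABA site.
abba, baba, d = d_statistic("AATAGA", "GCAAGT", "GCTAAT", "AAAAAA")
print(abba, baba, d)
```

A positive D in this toy case reflects an excess of derived alleles shared between P2 and P3; on its own this does not establish introgression, since the abstract stresses that each method has pitfalls.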
Affiliation(s)
- Mark S Hibbins
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
13
|
Abstract
Over the past three decades, computational capabilities have grown at such a rapid rate that they have given rise to many computationally heavy science fields such as phylogenomics. As increasingly more genomes are sequenced in the three domains of life, larger and more species-complete phylogenetic tree reconstructions are leading to a better understanding of the tree of life and the evolutionary histories in deep times. However, these large datasets pose unique challenges from a modeling and computational perspective: accurately describing the evolutionary process of thousands of species is still beyond the capability of current models, while the computational burden limits our ability to test multiple hypotheses. Thus, it is common practice to reduce the size of a dataset by selecting species to represent a clade (taxon sampling). Unfortunately, this process is subjective, and comparisons of large tree of life studies show that choice and number of species used in a dataset can alter the topology obtained. Thus, taxon sampling is, in itself, a process that needs to be fully investigated to determine its effect on phylogenetic stability. Here, we present the theory and practical application of an automated pipeline that can be easily implemented to explore the effect of taxon sampling on phylogenetic reconstructions. The application of this approach was recently discussed in a study of Terrabacteria and shows its power in investigating the accuracy of deep nodes of a phylogeny.
Affiliation(s)
- Fabia Ursula Battistuzzi
  - Department of Biological Sciences, Oakland University, Rochester, MI, USA
  - Center for Data Science and Big Data Analytics, Oakland University, Rochester, MI, USA
|
14
|
Morales-Briones DF, Gehrke B, Huang CH, Liston A, Ma H, Marx HE, Tank DC, Yang Y. Analysis of Paralogs in Target Enrichment Data Pinpoints Multiple Ancient Polyploidy Events in Alchemilla s.l. (Rosaceae). Syst Biol 2021; 71:190-207. [PMID: 33978764 PMCID: PMC8677558 DOI: 10.1093/sysbio/syab032] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 09/01/2020] [Revised: 04/28/2021] [Accepted: 05/03/2021] [Indexed: 12/16/2022]
Abstract
Target enrichment is becoming increasingly popular for phylogenomic studies. Although baits for enrichment are typically designed to target single-copy genes, paralogs are often recovered with increased sequencing depth, sometimes from a significant proportion of loci, especially in groups experiencing whole-genome duplication (WGD) events. Common approaches for processing paralogs in target enrichment data sets include random selection, manual pruning, and mainly, the removal of entire genes that show any evidence of paralogy. These approaches are prone to errors in orthology inference or removing large numbers of genes. By removing entire genes, valuable information that could be used to detect and place WGD events is discarded. Here, we used an automated approach for orthology inference in a target enrichment data set of 68 species of Alchemilla s.l. (Rosaceae), a widely distributed clade of plants primarily from temperate climate regions. Previous molecular phylogenetic studies and chromosome numbers both suggested ancient WGDs in the group. However, both the phylogenetic location and putative parental lineages of these WGD events remain unknown. By taking paralogs into consideration and inferring orthologs from target enrichment data, we identified four nodes in the backbone of Alchemilla s.l. with an elevated proportion of gene duplication. Furthermore, using a gene-tree reconciliation approach, we established the autopolyploid origin of the entire Alchemilla s.l. and the nested allopolyploid origin of four major clades within the group. Here, we showed the utility of automated tree-based orthology inference methods, previously designed for genomic or transcriptomic data sets, to study complex scenarios of polyploidy and reticulate evolution from target enrichment data sets.
[Alchemilla; allopolyploidy; autopolyploidy; gene tree discordance; orthology inference; paralogs; Rosaceae; target enrichment; whole genome duplication.]
Affiliation(s)
- Diego F Morales-Briones
  - Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
  - Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID 83844, USA
- Berit Gehrke
  - University Gardens, University Museum, University of Bergen, Mildeveien 240, 5259 Hjellestad, Norway
- Chien-Hsun Huang
  - State Key Laboratory of Genetic Engineering and Collaborative Innovation Center of Genetics and Development, Ministry of Education Key Laboratory of Biodiversity and Ecological Engineering, Institute of Plant Biology, Center of Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
- Aaron Liston
  - Department of Botany and Plant Pathology, Oregon State University, 2082 Cordley Hall, Corvallis, OR 97331, USA
- Hong Ma
  - Department of Biology, the Huck Institute of the Life Sciences, the Pennsylvania State University, 510D Mueller Laboratory, University Park, PA 16802 USA
- Hannah E Marx
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109-1048, USA
  - Museum of Southwestern Biology and Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- David C Tank
  - Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID 83844, USA
- Ya Yang
  - Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
|
15
|
Rose JP, Kriebel R, Kahan L, DiNicola A, González-Gallegos JG, Celep F, Lemmon EM, Lemmon AR, Sytsma KJ, Drew BT. Sage Insights Into the Phylogeny of Salvia: Dealing With Sources of Discordance Within and Across Genomes. Frontiers in Plant Science 2021; 12:767478. [PMID: 34899789 PMCID: PMC8652245 DOI: 10.3389/fpls.2021.767478] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/30/2021] [Accepted: 10/22/2021] [Indexed: 05/13/2023]
Abstract
Next-generation sequencing technologies have facilitated new phylogenomic approaches to help clarify previously intractable relationships while simultaneously highlighting the pervasive nature of incongruence within and among genomes that can complicate definitive taxonomic conclusions. Salvia L., with ∼1,000 species, makes up nearly 15% of the species diversity in the mint family and has attracted great interest from biologists across subdisciplines. Despite the great progress that has been achieved in discerning the placement of Salvia within Lamiaceae and in clarifying its infrageneric relationships through plastid, nuclear ribosomal, and nuclear single-copy genes, the incomplete resolution has left open major questions regarding the phylogenetic relationships among and within the subgenera, as well as to what extent the infrageneric relationships differ across genomes. We expanded a previously published anchored hybrid enrichment dataset of 35 exemplars of Salvia to 179 terminals. We also reconstructed nearly complete plastomes for these samples from off-target reads. We used these data to examine the concordance and discordance among the nuclear loci and between the nuclear and plastid genomes in detail, elucidating both broad-scale and species-level relationships within Salvia. We found that despite the widespread gene tree discordance, nuclear phylogenies reconstructed using concatenated, coalescent, and network-based approaches recover a common backbone topology. Moreover, all subgenera, except for Audibertia, are strongly supported as monophyletic in all analyses. The plastome genealogy is largely resolved and is congruent with the nuclear backbone. However, multiple analyses suggest that incomplete lineage sorting does not fully explain the gene tree discordance. Instead, horizontal gene flow has been important in both the deep and more recent history of Salvia. Our results provide a robust species tree of Salvia across phylogenetic scales and genomes. 
Future comparative analyses in the genus will need to account for the impacts of hybridization/introgression and incomplete lineage sorting in topology and divergence time estimation.
Affiliation(s)
- Jeffrey P. Rose
  - Department of Biology, University of Nebraska at Kearney, Kearney, NE, United States
  - Department of Botany, University of Wisconsin–Madison, Madison, WI, United States
- Ricardo Kriebel
  - Department of Botany, University of Wisconsin–Madison, Madison, WI, United States
- Larissa Kahan
  - Department of Botany, University of Wisconsin–Madison, Madison, WI, United States
- Alexa DiNicola
  - Department of Botany, University of Wisconsin–Madison, Madison, WI, United States
- Ferhat Celep
  - Department of Biology, Faculty of Arts and Sciences, Kırıkkale University, Yahşihan, Turkey
- Emily M. Lemmon
  - Department of Biological Science, Florida State University, Tallahassee, FL, United States
- Alan R. Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL, United States
- Kenneth J. Sytsma
  - Department of Botany, University of Wisconsin–Madison, Madison, WI, United States
- Bryan T. Drew
  - Department of Biology, University of Nebraska at Kearney, Kearney, NE, United States
|
16
|
Blair C, Ané C. Phylogenetic Trees and Networks Can Serve as Powerful and Complementary Approaches for Analysis of Genomic Data. Syst Biol 2020; 69:593-601. [PMID: 31432090 DOI: 10.1093/sysbio/syz056] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 03/03/2019] [Accepted: 08/15/2019] [Indexed: 11/14/2022]
Abstract
Genomic data have had a profound impact on nearly every biological discipline. In systematics and phylogenetics, the thousands of loci that are now being sequenced can be analyzed under the multispecies coalescent model (MSC) to explicitly account for gene tree discordance due to incomplete lineage sorting (ILS). However, the MSC assumes no gene flow post divergence, calling for additional methods that can accommodate this limitation. Explicit phylogenetic network methods have emerged, which can simultaneously account for ILS and gene flow by representing evolutionary history as a directed acyclic graph. In this point of view, we highlight some of the strengths and limitations of phylogenetic networks and argue that tree-based inference should not be blindly abandoned in favor of networks simply because they represent more parameter rich models. Attention should be given to model selection of reticulation complexity, and the most robust conclusions regarding evolutionary history are likely obtained when combining tree- and network-based inference.
Affiliation(s)
- Christopher Blair
  - Department of Biological Sciences, New York City College of Technology, The City University of New York, 285 Jay Street, Brooklyn, NY 11201, USA
  - Biology PhD Program, CUNY Graduate Center, 365 5th Ave., New York, NY 10016, USA
- Cécile Ané
  - Department of Botany, University of Wisconsin - Madison, 1300 University Ave, Madison, WI 53706, USA
  - Department of Statistics, University of Wisconsin - Madison, 1300 University Ave, Madison, WI 53706, USA
|
17
|
Morales-Briones DF, Kadereit G, Tefarikis DT, Moore MJ, Smith SA, Brockington SF, Timoneda A, Yim WC, Cushman JC, Yang Y. Disentangling Sources of Gene Tree Discordance in Phylogenomic Data Sets: Testing Ancient Hybridizations in Amaranthaceae s.l. Syst Biol 2020; 70:219-235. [PMID: 32785686 PMCID: PMC7875436 DOI: 10.1093/sysbio/syaa066] [Citation(s) in RCA: 106] [Impact Index Per Article: 21.2] [Received: 10/04/2019] [Revised: 03/01/2020] [Accepted: 09/03/2020] [Indexed: 12/26/2022]
Abstract
Gene tree discordance in large genomic data sets can be caused by evolutionary processes such as incomplete lineage sorting and hybridization, as well as model violation, and errors in data processing, orthology inference, and gene tree estimation. Species tree methods that identify and accommodate all sources of conflict are not available, but a combination of multiple approaches can help tease apart alternative sources of conflict. Here, using a phylotranscriptomic analysis in combination with reference genomes, we test a hypothesis of ancient hybridization events within the plant family Amaranthaceae s.l. that was previously supported by morphological, ecological, and Sanger-based molecular data. The data set included seven genomes and 88 transcriptomes, 17 generated for this study. We examined gene-tree discordance using coalescent-based species trees and network inference, gene tree discordance analyses, site pattern tests of introgression, topology tests, synteny analyses, and simulations. We found that a combination of processes might have generated the high levels of gene tree discordance in the backbone of Amaranthaceae s.l. Furthermore, we found evidence that three consecutive short internal branches produce anomalous trees contributing to the discordance. Overall, our results suggest that Amaranthaceae s.l. might be a product of an ancient and rapid lineage diversification, and remains, and probably will remain, unresolved. This work highlights the potential problems of identifiability associated with the sources of gene tree discordance including, in particular, phylogenetic network methods. Our results also demonstrate the importance of thoroughly testing for multiple sources of conflict in phylogenomic analyses, especially in the context of ancient, rapid radiations. We provide several recommendations for exploring conflicting signals in such situations. 
[Amaranthaceae; gene tree discordance; hybridization; incomplete lineage sorting; phylogenomics; species network; species tree; transcriptomics.]
Affiliation(s)
- Diego F Morales-Briones
  - Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
- Gudrun Kadereit
  - Institut für Molekulare Physiologie, Johannes Gutenberg-Universität Mainz, D-55099 Mainz, Germany
- Delphine T Tefarikis
  - Institut für Molekulare Physiologie, Johannes Gutenberg-Universität Mainz, D-55099 Mainz, Germany
- Michael J Moore
  - Department of Biology, Oberlin College, Science Center K111, 119 Woodland Street, Oberlin, OH 44074-1097, USA
- Stephen A Smith
  - Department of Ecology & Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109-1048, USA
- Samuel F Brockington
  - Department of Plant Sciences, University of Cambridge, Tennis Court Road, Cambridge CB2 3EA, UK
- Alfonso Timoneda
  - Department of Plant Sciences, University of Cambridge, Tennis Court Road, Cambridge CB2 3EA, UK
- Won C Yim
  - Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89577, USA
- John C Cushman
  - Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89577, USA
- Ya Yang
  - Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
|
18
|
Bernhardt N, Brassac J, Dong X, Willing EM, Poskar CH, Kilian B, Blattner FR. Genome-wide sequence information reveals recurrent hybridization among diploid wheat wild relatives. The Plant Journal: for Cell and Molecular Biology 2020; 102:493-506. [PMID: 31821649 DOI: 10.1111/tpj.14641] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/01/2019] [Revised: 11/13/2019] [Accepted: 11/28/2019] [Indexed: 05/07/2023]
Abstract
Many conflicting hypotheses regarding the relationships among crops and wild species closely related to wheat (the genera Aegilops, Amblyopyrum, and Triticum) have been postulated. The contribution of hybridization to the evolution of these taxa is intensely discussed. To determine possible causes for this, and provide a phylogeny of the diploid taxa based on genome-wide sequence information, independent data were obtained from genotyping-by-sequencing and a target-enrichment experiment that returned 244 low-copy nuclear loci. The data were analyzed using Bayesian, likelihood and coalescent-based methods. D statistics were used to test if incomplete lineage sorting alone or together with hybridization is the source for incongruent gene trees. Here we present the phylogeny of all diploid species of the wheat wild relatives. We hypothesize that most of the wheat-group species were shaped by a primordial homoploid hybrid speciation event involving the ancestral Triticum and Am. muticum lineages to form all other species except Ae. speltoides. This hybridization event was followed by multiple introgressions affecting all taxa except Triticum. Mostly progenitors of the extant species were involved in these processes, while recent interspecific gene flow seems insignificant. The composite nature of many genomes of wheat-group taxa results in complicated patterns of diploid contributions when these lineages are involved in polyploid formation, which is, for example, the case for tetraploid and hexaploid wheats. Our analysis provides phylogenetic relationships and a testable hypothesis for the genome compositions in the basic evolutionary units within the wheat group of Triticeae.
Affiliation(s)
- Nadine Bernhardt
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Jonathan Brassac
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Xue Dong
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
  - Plant Germplasm and Genomics Centre, Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, 650201, Kunming, Yunnan, China
- Eva-Maria Willing
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
- C Hart Poskar
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Benjamin Kilian
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
  - Global Crop Diversity Trust, 53113, Bonn, Germany
- Frank R Blattner
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103, Leipzig, Germany
|
19
|
Salazar AN, Gorter de Vries AR, van den Broek M, Brouwers N, de la Torre Cortès P, Kuijpers NGA, Daran JMG, Abeel T. Chromosome level assembly and comparative genome analysis confirm lager-brewing yeasts originated from a single hybridization. BMC Genomics 2019; 20:916. [PMID: 31791228 PMCID: PMC6889557 DOI: 10.1186/s12864-019-6263-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 06/06/2019] [Accepted: 11/05/2019] [Indexed: 12/23/2022]
Abstract
BACKGROUND: The lager brewing yeast, S. pastorianus, is a hybrid between S. cerevisiae and S. eubayanus with extensive chromosome aneuploidy. S. pastorianus is subdivided into Group 1 and Group 2 strains, where Group 2 strains have higher copy number and a larger degree of heterozygosity for S. cerevisiae chromosomes. As a result, Group 2 strains were hypothesized to have emerged from a hybridization event distinct from Group 1 strains. Current genome assemblies of S. pastorianus strains are incomplete and highly fragmented, limiting our ability to investigate their evolutionary history.
RESULTS: To fill this gap, we generated a chromosome-level genome assembly of the S. pastorianus strain CBS 1483 from Oxford Nanopore MinION DNA sequencing data and analysed the newly assembled subtelomeric regions and chromosome heterozygosity. To analyse the evolutionary history of S. pastorianus strains, we developed Alpaca: a method to compute sequence similarity between genomes without assuming linear evolution. Alpaca revealed high similarities between the S. cerevisiae subgenomes of Group 1 and 2 strains, and marked differences from sequenced S. cerevisiae strains.
CONCLUSIONS: Our findings suggest that Group 1 and Group 2 strains originated from a single hybridization involving a heterozygous S. cerevisiae strain, followed by different evolutionary trajectories. The clear differences between both groups may originate from a severe population bottleneck caused by the isolation of the first pure cultures. Alpaca provides a computationally inexpensive method to analyse evolutionary relationships while considering non-linear evolution such as horizontal gene transfer and sexual reproduction, providing a complementary viewpoint beyond traditional phylogenetic approaches.
Affiliation(s)
- Alex N Salazar
  - Delft Bioinformatics Lab, Delft University of Technology, 2628, CD, Delft, The Netherlands
- Arthur R Gorter de Vries
  - Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629, HZ, Delft, The Netherlands
- Marcel van den Broek
  - Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629, HZ, Delft, The Netherlands
- Nick Brouwers
  - Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629, HZ, Delft, The Netherlands
- Pilar de la Torre Cortès
  - Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629, HZ, Delft, The Netherlands
- Niels G A Kuijpers
  - HEINEKEN Supply Chain B.V., Global Innovation and Research, Zoeterwoude, Netherlands
- Jean-Marc G Daran
  - Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629, HZ, Delft, The Netherlands
- Thomas Abeel
  - Delft Bioinformatics Lab, Delft University of Technology, 2628, CD, Delft, The Netherlands
  - Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
|
20
|
Advances in Computational Methods for Phylogenetic Networks in the Presence of Hybridization. Bioinformatics and Phylogenetics 2019. [DOI: 10.1007/978-3-030-10837-3_13] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Indexed: 12/12/2022]
|
21
|
Pouchon C, Fernández A, Nassar JM, Boyer F, Aubert S, Lavergne S, Mavárez J. Phylogenomic Analysis of the Explosive Adaptive Radiation of the Espeletia Complex (Asteraceae) in the Tropical Andes. Syst Biol 2018; 67:1041-1060. [PMID: 30339252 DOI: 10.1093/sysbio/syy022] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 03/14/2017] [Accepted: 03/15/2018] [Indexed: 01/17/2023]
Abstract
The subtribe Espeletiinae (Asteraceae), endemic to high elevations in the Northern Andes, exhibits an exceptional diversity of species, growth-forms, and reproductive strategies. This complex of 140 species includes large trees, dichotomous trees, shrubs and the extraordinary giant caulescent rosettes, considered as a classic example of adaptation in tropical high-elevation ecosystems. The subtribe has also long been recognized as a prominent case of adaptive radiation, but the understanding of its evolution has been hampered by a lack of phylogenetic resolution. Herein, we produce the first fully resolved phylogeny of all morphological groups of Espeletiinae, using whole plastomes and about a million nuclear nucleotides obtained with an original de novo assembly procedure without a reference genome, and analyzed with traditional and coalescent-based approaches that consider the possible impact of incomplete lineage sorting and hybridization on phylogenetic inference. We show that the diversification of Espeletiinae started from a rosette ancestor about 2.3 Ma, after the final uplift of the Northern Andes. This was followed by two independent radiations in the Colombian and Venezuelan Andes, with a few trans-cordilleran dispersal events among low-elevation tree lineages but none among high-elevation rosettes. We demonstrate complex scenarios of morphological change in Espeletiinae, usually implying the convergent evolution of growth-forms with frequent losses/gains of various traits. For instance, caulescent rosettes evolved independently in both countries, likely as convergent adaptations to life in tropical high-elevation habitats. Tree growth-forms evolved independently three times from the repeated colonization of lower elevations by high-elevation rosette ancestors. The rate of morphological diversification increased during the early phase of the radiation, after which it decreased steadily towards the present.
On the other hand, the rate of species diversification in the best-sampled Venezuelan radiation was on average very high (3.1 spp/My), with significant rate variation among growth-forms (much higher in polycarpic caulescent rosettes). Our results point out a scenario where both adaptive morphological evolution and geographical isolation due to Pleistocene climatic oscillations triggered an exceptionally rapid radiation for a continental plant group.
Affiliation(s)
- Charles Pouchon
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
- Angel Fernández
  - Herbario IVIC, Centro de Biofísica y Bioquímica, Instituto Venezolano de Investigaciones Científicas, Apartado 20632, Caracas 1020-A, Venezuela
- Jafet M Nassar
  - Laboratorio de Biología de Organismos, Centro de Ecología, Instituto Venezolano de Investigaciones Científicas, Apartado 20632, Caracas 1020-A, Venezuela
- Frédéric Boyer
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
- Serge Aubert
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
  - Station alpine Joseph-Fourier, UMS 3370, Université Grenoble Alpes-CNRS, Grenoble, France
- Sébastien Lavergne
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
- Jesús Mavárez
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
|
22
|
Bastide P, Solís-Lemus C, Kriebel R, William Sparks K, Ané C. Phylogenetic Comparative Methods on Phylogenetic Networks with Reticulations. Syst Biol 2018; 67:800-820. [PMID: 29701821 DOI: 10.1093/sysbio/syy033] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 09/19/2017] [Accepted: 04/20/2018] [Indexed: 12/29/2022]
Abstract
The goal of phylogenetic comparative methods (PCMs) is to study the distribution of quantitative traits among related species. The observed traits are often seen as the result of a Brownian Motion (BM) along the branches of a phylogenetic tree. Reticulation events such as hybridization, gene flow or horizontal gene transfer, can substantially affect a species' traits, but are not modeled by a tree. Phylogenetic networks have been designed to represent reticulate evolution. As they become available for downstream analyses, new models of trait evolution are needed, applicable to networks. We develop here an efficient recursive algorithm to compute the phylogenetic variance matrix of a trait on a network, in only one preorder traversal of the network. We then extend the standard PCM tools to this new framework, including phylogenetic regression with covariates (or phylogenetic ANOVA), ancestral trait reconstruction, and Pagel's $\lambda$ test of phylogenetic signal. The trait of a hybrid is sometimes outside of the range of its two parents, for instance because of hybrid vigor or hybrid depression. These two phenomena are rather commonly observed in present-day hybrids. Transgressive evolution can be modeled as a shift in the trait value following a reticulation point. We develop a general framework to handle such shifts and take advantage of the phylogenetic regression view of the problem to design statistical tests for ancestral transgressive evolution in the evolutionary history of a group of species. We study the power of these tests in several scenarios and show that recent events have indeed the strongest impact on the trait distribution of present-day taxa. We apply those methods to a data set of Xiphophorus fishes, to confirm and complete previous analysis in this group. All the methods developed here are available in the Julia package PhyloNetworks.
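The one-pass computation of the variance matrix described in this abstract can be sketched as follows, assuming a Brownian motion with unit variance rate and assuming that a hybrid node's trait is the inheritance-weighted average of its parents' contributions. This is an illustrative reimplementation under those assumptions, not the PhyloNetworks code: the function name `network_variance`, the input layout, and the toy network are all hypothetical. Nodes must be listed with every parent before its children, and the inheritance weights of each node are assumed to sum to 1 (this is not checked).

```python
def network_variance(nodes):
    """Return (index, V): node name -> row index, and the covariance matrix.

    `nodes` is a list of (name, parents) pairs in preorder (parents first).
    `parents` is a list of (parent_name, branch_length, gamma) triples:
    one triple for a tree node (gamma = 1.0), two or more for a hybrid node.
    The root has an empty parent list and variance 0.
    """
    index = {}
    V = []  # V[i][j] = covariance between nodes i and j (unit BM rate)
    for name, parents in nodes:
        i = len(V)
        # Covariance with every earlier node: weighted mix of the parents' rows.
        row = [sum(g * V[index[p]][j] for p, _, g in parents) for j in range(i)]
        # Variance: each parent contributes its own variance plus the branch.
        var = float(sum(g * g * (V[index[p]][index[p]] + length)
                        for p, length, g in parents))
        # Hybrid nodes also pick up the covariance between their parents.
        for a in range(len(parents)):
            for b in range(a + 1, len(parents)):
                pa, _, ga = parents[a]
                pb, _, gb = parents[b]
                var += 2 * ga * gb * V[index[pa]][index[pb]]
        # Extend the symmetric matrix by one row and one column.
        for j in range(i):
            V[j].append(row[j])
        V.append(row + [var])
        index[name] = i
    return index, V


# Hypothetical toy network: A and B descend from root R; H is a 50/50 hybrid.
TOY_NETWORK = [
    ("R", []),
    ("A", [("R", 1.0, 1.0)]),
    ("B", [("R", 1.0, 1.0)]),
    ("H", [("A", 0.5, 0.5), ("B", 0.5, 0.5)]),
]
index, V = network_variance(TOY_NETWORK)
print(V[index["H"]][index["H"]], V[index["H"]][index["A"]])
```

In this toy case the hybrid H ends up with variance 0.75 and covariance 0.5 with each parent, lower than the variance 1.5 a tree node at the same depth would have, which shows how averaging over two parental lineages changes the expected trait distribution.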
Collapse
Affiliation(s)
- Paul Bastide
- Unité Mixte de Recherche Mathématiques et Informatique Appliquées (MIA - Paris), AgroParisTech, Institut National de la Recherche Agronomique (INRA), Université Paris-Saclay, 16 rue Claude Bernard, 75005 Paris, France; Unité de Recherche Mathématiques et Informatique Appliquées du Génome à l'Environnement (MaIAGE), Institut National de la Recherche Agronomique (INRA), Université Paris-Saclay, Domaine de Vilvert, 78352 Jouy-en-Josas, France
- Claudia Solís-Lemus
- Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, WI 53706, USA
- Ricardo Kriebel
- Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI 53706, USA
- K William Sparks
- Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI 53706, USA
- Cécile Ané
- Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, WI 53706, USA; Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI 53706, USA
|
23
|
Morales-Briones DF, Liston A, Tank DC. Phylogenomic analyses reveal a deep history of hybridization and polyploidy in the Neotropical genus Lachemilla (Rosaceae). New Phytol 2018; 218:1668-1684. [PMID: 29604235 DOI: 10.1111/nph.15099] [Citation(s) in RCA: 93] [Impact Index Per Article: 13.3] [Received: 11/30/2017] [Accepted: 02/09/2018] [Indexed: 05/10/2023]
Abstract
Hybridization, incomplete lineage sorting, and phylogenetic error produce similar incongruence patterns, representing a great challenge for phylogenetic reconstruction. Here, we use sequence capture data and multiple species tree and species network approaches to resolve the backbone phylogeny of the Neotropical genus Lachemilla, while distinguishing among sources of incongruence. We used 396 nuclear loci and nearly complete plastome sequences from 27 species to clarify the relationships among the major groups of Lachemilla, and explored multiple sources of conflict between gene trees and species trees inferred with a plurality of approaches. All phylogenetic methods recovered the four major groups previously proposed for Lachemilla, but species tree methods recovered different topologies for relationships between these four clades. Species network analyses revealed that one major clade, Orbiculate, is likely of ancient hybrid origin, representing one of the main sources of incongruence among the species trees. Additionally, we found evidence for a potential whole genome duplication event shared by Lachemilla and allied genera. Lachemilla shows clear evidence of ancient and recent hybridization throughout the evolutionary history of the group. Also, we show the necessity to use phylogenetic network approaches that can simultaneously accommodate incomplete lineage sorting and gene flow when studying groups that show patterns of reticulation.
Affiliation(s)
- Diego F Morales-Briones
- Department of Biological Sciences, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID, 83844-3051, USA
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID, 83844-3051, USA
- Stillinger Herbarium, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID, 83844-3051, USA
- Aaron Liston
- Department of Botany and Plant Pathology, Oregon State University, 2082 Cordley Hall, Corvallis, OR, 97331, USA
- David C Tank
- Department of Biological Sciences, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID, 83844-3051, USA
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID, 83844-3051, USA
- Stillinger Herbarium, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID, 83844-3051, USA
|
24
|
Abstract
Phylogenomics aims at reconstructing the evolutionary histories of organisms taking into account whole genomes or large fractions of genomes. The abundance of genomic data for an enormous variety of organisms has enabled phylogenomic inference of many groups, and this has motivated the development of many computer programs implementing the associated methods. This chapter surveys phylogenetic concepts and methods aimed at both gene tree and species tree reconstruction while also addressing common pitfalls, providing references to relevant computer programs. A practical phylogenomic analysis example including bacterial genomes is presented at the end of the chapter.
Affiliation(s)
- José S L Patané
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, Av. Prof. Lineu Prestes 748, São Paulo, SP, 05508-000, Brazil
- Joaquim Martins
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, Av. Prof. Lineu Prestes 748, São Paulo, SP, 05508-000, Brazil
- João C Setubal
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, Av. Prof. Lineu Prestes 748, São Paulo, SP, 05508-000, Brazil.
|
25
|
Kamneva OK, Rosenberg NA. Simulation-Based Evaluation of Hybridization Network Reconstruction Methods in the Presence of Incomplete Lineage Sorting. Evol Bioinform Online 2017; 13:1176934317691935. [PMID: 28469378 PMCID: PMC5395256 DOI: 10.1177/1176934317691935] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 09/11/2016] [Accepted: 01/11/2017] [Indexed: 11/22/2022] Open
Abstract
Hybridization events generate reticulate species relationships, giving rise to species networks rather than species trees. We report a comparative study of consensus, maximum parsimony, and maximum likelihood methods of species network reconstruction using gene trees simulated assuming a known species history. We evaluate the role of the divergence time between species involved in a hybridization event, the relative contributions of the hybridizing species, and the error in gene tree estimation. When gene tree discordance is mostly due to hybridization and not due to incomplete lineage sorting (ILS), most of the methods can detect even highly skewed hybridization events between highly divergent species. For recent divergences between hybridizing species, when the influence of ILS is sufficiently high, likelihood methods outperform parsimony and consensus methods, which erroneously identify extra hybridizations. The more sophisticated likelihood methods, however, are affected by gene tree errors to a greater extent than are consensus and parsimony.
Affiliation(s)
- Olga K Kamneva
- Department of Biology, Stanford University, Stanford, CA, USA
|