1
Gupta A, Mirarab S, Turakhia Y. Accurate, scalable, and fully automated inference of species trees from raw genome assemblies using ROADIES. Proc Natl Acad Sci U S A 2025; 122:e2500553122. [PMID: 40314967 PMCID: PMC12088440 DOI: 10.1073/pnas.2500553122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2025] [Accepted: 03/31/2025] [Indexed: 05/03/2025] Open
Abstract
Current genome sequencing initiatives across a wide range of life forms offer significant potential to enhance our understanding of evolutionary relationships and support transformative biological and medical applications. Species trees play a central role in many of these applications; however, despite the widespread availability of genome assemblies, accurate inference of species trees remains challenging due to the limited automation, substantial domain expertise, and computational resources required by conventional methods. To address this limitation, we present ROADIES, a fully automated pipeline to infer species trees starting from raw genome assemblies. In contrast to the prominent approach, ROADIES incorporates a unique strategy of randomly sampling segments of the input genomes to generate gene trees. This eliminates the need for predefining a set of loci, limiting the analyses to a fixed number of genes, and performing the cumbersome gene annotation and/or whole genome alignment steps. ROADIES also eliminates the need to infer orthology by leveraging existing discordance-aware methods that allow multicopy genes. Using the genomic datasets from large-scale sequencing efforts across four diverse life forms (placental mammals, pomace flies, birds, and budding yeasts), we show that ROADIES infers species trees that are comparable in quality to the state-of-the-art studies but in a fraction of the time and effort, including on challenging datasets with rampant gene tree discordance and complex polyploidy. With its speed, accuracy, and automation, ROADIES has the potential to vastly simplify species tree inference, making it accessible to a broader range of scientists and applications.
Affiliation(s)
- Anshu Gupta
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093
- Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California, San Diego, CA 92093
- Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California, San Diego, CA 92093

2
Xiong Y, Li D, Liu T, Xiong Y, Yu Q, Lei X, Zhao J, Yan L, Ma X. Extensive transcriptome data providing great efficacy in genetic research and adaptive gene discovery: a case study of Elymus sibiricus L. (Poaceae, Triticeae). FRONTIERS IN PLANT SCIENCE 2024; 15:1457980. [PMID: 39363927 PMCID: PMC11447521 DOI: 10.3389/fpls.2024.1457980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Accepted: 09/02/2024] [Indexed: 10/05/2024]
Abstract
Genetic markers play a central role in understanding genetic diversity, speciation, evolutionary processes, and how species respond to environmental stresses. However, conventional molecular markers are less effective when studying polyploid species with large genomes. In this study, we compared gene expression levels in 101 accessions of Elymus sibiricus, a widely distributed allotetraploid forage species across the Eurasian continent. A total of 20,273 high quality transcriptomic SNPs were identified. In addition, 72,344 evolutionary information loci of these accessions of E. sibiricus were identified using genome skimming data in conjunction with the assembled composite genome. The population structure results suggest that transcriptome SNPs were more effective than SNPs derived from genome skimming data in revealing the population structure of E. sibiricus from different locations, and also outperformed gene expression levels. Compared with transcriptome SNPs, the investigation of population-specifically-expressed genes (PSEGs) using expression levels revealed a larger number of locally adapted genes mainly involved in the ion response process in the Sichuan, Inner Mongolia, and Xizang geographical groups. Furthermore, we performed the weighted gene co-expression network analysis (WGCNA) and successfully identified potential regulators of PSEGs. Therefore, for species lacking genomic information, the use of transcriptome SNPs is an efficient approach to perform population structure analysis. In addition, analyzing genes under selection through nucleotide diversity and genetic differentiation index analysis based on transcriptome SNPs, and exploring PSEG through expression levels is an effective method for analyzing locally adaptive genes.
Affiliation(s)
- Yanli Xiong
- College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
- Daxu Li
- Sichuan Academy of Grassland Sciences, Chengdu, Sichuan, China
- Tianqi Liu
- College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
- Yi Xiong
- College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
- Qingqing Yu
- Sichuan Academy of Grassland Sciences, Chengdu, Sichuan, China
- Xiong Lei
- Sichuan Academy of Grassland Sciences, Chengdu, Sichuan, China
- Junming Zhao
- College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
- Lijun Yan
- Sichuan Academy of Grassland Sciences, Chengdu, Sichuan, China
- Xiao Ma
- College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China

3
Smith B, Walling A, Schwartz R. Phylogenomic investigation of lampreys (Petromyzontiformes). Mol Phylogenet Evol 2023; 189:107942. [PMID: 37804959 DOI: 10.1016/j.ympev.2023.107942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 10/02/2023] [Accepted: 10/04/2023] [Indexed: 10/09/2023]
Abstract
The history of lamprey evolution has been contentious due to limited morphological differentiation and limited genetic data. Available data has produced inconsistent results, including in the relationship among northern and southern species and the monophyly of putative clades. Here we use whole genome sequence data sourced from a public database to identify orthologs for 11 lamprey species from across the globe and build phylogenies. The phylogeny showed a clear separation between northern and southern lamprey species, which contrasts with some prior work. We also find that the phylogenetic relationships of our samples of two genera, Lethenteron and Eudontomyzon, deviate from the taxonomic classification of these species, suggesting that they require reclassification.
Affiliation(s)
- Brianna Smith
- Department of Biological Sciences, College of the Environment and Life Sciences, University of Rhode Island, 120 Flagg Road, Kingston, RI 02881, United States
- Alexandra Walling
- Department of Biological Sciences, College of the Environment and Life Sciences, University of Rhode Island, 120 Flagg Road, Kingston, RI 02881, United States
- Rachel Schwartz
- Department of Biological Sciences, College of the Environment and Life Sciences, University of Rhode Island, 120 Flagg Road, Kingston, RI 02881, United States

4
Literman R, Windsor AM, Bart HL, Hunter ES, Deeds JR, Handy SM. Using low-coverage whole genome sequencing (genome skimming) to delineate three introgressed species of buffalofish (Ictiobus). Mol Phylogenet Evol 2023; 182:107715. [PMID: 36707011 DOI: 10.1016/j.ympev.2023.107715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 11/03/2022] [Accepted: 01/21/2023] [Indexed: 01/26/2023]
Abstract
Consumption of buffalofish has been sporadically associated with Haff disease-like illnesses involving sudden onset muscle pain and weakness due to skeletal muscle rhabdomyolysis, but determination of precisely which species are associated with these illnesses has been impeded by a lack of species-specific DNA-based markers. Here, three closely related species of buffalofish native to the Mississippi River Basin (Ictiobus bubalus, Ictiobus cyprinellus and Ictiobus niger) that have previously proven genetically indistinguishable using both mitochondrial and nuclear single-locus sequencing were reliably discriminated using low-coverage whole genome sequencing ('genome skimming'). Using 44 specimens representing the three species collected from the mid/upper (Missouri) and lower (Louisiana) regions of the species' native ranges, the SISRS (Site Identification from Short Read Sequences) bioinformatics pipeline was adapted to (1) identify over 620Mbp of putatively homologous nuclear sequence data and (2) isolate over 140,000 single-nucleotide polymorphisms (SNPs) that supported accurate species delimitation, all without the use of a reference genome or annotation data. These sites were used to classify Ictiobus spp. samples with genome-skim data, along with a larger set (n = 67) where ultraconserved elements (UCEs) were sequenced. Analyses of whole mitochondrial data revealed more limited signal. Nearly all samples matched their purported species based on morphologic identification, but two Missouri samples morphologically identified as I. niger grouped with samples of I. bubalus, albeit with significant enrichment of I. niger SNPs. To our knowledge this is the first report of a DNA-based tool to reliably discriminate these three morphologically distinct species.
Affiliation(s)
- Robert Literman
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
- Amanda M Windsor
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
- Henry L Bart
- Department of Ecology and Evolutionary Biology, Tulane University, New Orleans, LA, USA
- Elizabeth Sage Hunter
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
- Jonathan R Deeds
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
- Sara M Handy
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA

5
Akiyama T, Uchiyama H, Yajima S, Arikawa K, Terai Y. Parallel evolution of opsin visual pigments in hawkmoths by tuning of spectral sensitivities during transition from a nocturnal to a diurnal ecology. J Exp Biol 2022; 225:285920. [PMID: 36408938 PMCID: PMC10112871 DOI: 10.1242/jeb.244541] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/11/2022] [Accepted: 11/15/2022] [Indexed: 11/23/2022]
Abstract
Light environments differ dramatically between day and night. The transition between diurnal and nocturnal visual ecology has happened repeatedly throughout evolution in many species. However, the molecular mechanism underlying the evolution of vision in recent diurnal-nocturnal transition is poorly understood. Here, we focus on hawkmoths (Lepidoptera: Sphingidae) to address this question by investigating five nocturnal and five diurnal species. We performed RNA-sequencing analysis and identified opsin genes corresponding to the ultraviolet (UV), short-wavelength (SW) and long-wavelength (LW)-absorbing visual pigments. We found no significant differences in the expression patterns of opsin genes between the nocturnal and diurnal species. We then constructed the phylogenetic trees of hawkmoth species and opsins. The diurnal lineages had emerged at least three times from the nocturnal ancestors. The evolutionary rates of amino acid substitutions in the three opsins differed between the nocturnal and diurnal species. We found an excess number of parallel amino acid substitutions in the opsins in three independent diurnal lineages. The numbers were significantly more than those inferred from neutral evolution, suggesting that positive selection acted on these parallel substitutions. Moreover, we predicted the visual pigment absorption spectra based on electrophysiologically determined spectral sensitivity in two nocturnal and two diurnal species belonging to different clades. In the diurnal species, the LW pigments shift 10 nm towards shorter wavelengths, and the SW pigments shift 10 nm in the opposite direction. Taken together, our results suggest that parallel evolution of opsins may have enhanced the colour discrimination properties of diurnal hawkmoths in ambient light.
Affiliation(s)
- Tokiho Akiyama
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Shonan Village, Hayama, Kanagawa 240-0193, Japan
- Hironobu Uchiyama
- NODAI Genome Research Center, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan
- Shunsuke Yajima
- NODAI Genome Research Center, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan; Department of Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan
- Kentaro Arikawa
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Shonan Village, Hayama, Kanagawa 240-0193, Japan
- Yohey Terai
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Shonan Village, Hayama, Kanagawa 240-0193, Japan

6
Kirk R, Rosario ME, Oblie N, Jouaneh TMM, Carro MA, Wu C, Kim AM, Leibovitz E, Hunter ES, Literman R, Handy SM, Rowley DC, Bertin MJ. Screening the PRISM Library against Staphylococcus aureus Reveals a Sesquiterpene Lactone from Liriodendron tulipifera with Inhibitory Activity. ACS OMEGA 2022; 7:35677-35685. [PMID: 36249352 PMCID: PMC9558601 DOI: 10.1021/acsomega.2c03539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 09/19/2022] [Indexed: 06/16/2023]
Abstract
Infections caused by the bacterium Staphylococcus aureus continue to pose threats to human health and put a financial burden on the healthcare system. The overuse of antibiotics has contributed to mutations leading to the emergence of methicillin-resistant S. aureus, and there is a critical need for the discovery and development of new antibiotics to evade drug-resistant bacteria. Medicinal plants have shown promise as sources of new small-molecule therapeutics with potential uses against pathogenic infections. The principal Rhode Island secondary metabolite (PRISM) library is a botanical extract library generated from specimens in the URI Youngken Medicinal Garden by upper-division undergraduate students. PRISM extracts were screened for activity against strains of methicillin-susceptible S. aureus (MSSA). An extract generated from the tulip tree (Liriodendron tulipifera) demonstrated growth inhibition against MSSA, and a bioassay-guided approach identified a sesquiterpene lactone, laurenobiolide, as the active constituent. Intriguingly, its isomers, tulipinolide and epi-tulipinolide, lacked potent activity against MSSA. Laurenobiolide also proved to be more potent against MSSA than the structurally similar sesquiterpene lactones, costunolide and dehydrocostus lactone. Laurenobiolide was the most abundant in the twig bark of the tulip tree, supporting the twig bark's historical and cultural usage in poultices and teas.
Affiliation(s)
- Riley D. Kirk
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Margaret E. Rosario
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Nana Oblie
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Terra Marie M. Jouaneh
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Marina A. Carro
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Christine Wu
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Andrew M. Kim
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Elizabeth Leibovitz
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Elizabeth Sage Hunter
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, United States Food and Drug Administration, College Park, Maryland 20740, United States
- Robert Literman
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, United States Food and Drug Administration, College Park, Maryland 20740, United States
- Sara M. Handy
- Center for Food Safety and Applied Nutrition, Office of Regulatory Science, United States Food and Drug Administration, College Park, Maryland 20740, United States
- David C. Rowley
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
- Matthew J. Bertin
- Department of Biomedical and Pharmaceutical Sciences, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States

7
Deeds JR, Literman RA, Handy SM, Klontz KC, Swajian KA, Benner RA, Bart HL. Haff disease associated with consumption of buffalofish (Ictiobus spp.) in the United States, 2010-2020, with confirmation of the causative species. Clin Toxicol (Phila) 2022; 60:1087-1093. [PMID: 36200989 DOI: 10.1080/15563650.2022.2123815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
BACKGROUND: In the United States, buffalofish (Ictiobus spp.) are sporadically associated with sudden onset muscle pain and weakness due to rhabdomyolysis within 24 h of fish consumption (Haff disease). Previous genetic analyses of case-associated samples were unable to distinguish the three species of buffalofish that occur in the US, Ictiobus cyprinellus (bigmouth buffalo), Ictiobus bubalus (smallmouth buffalo), and Ictiobus niger (black buffalo). METHODS: Ten events were investigated between 2010 and 2020 and demographic and clinical information was collected for 24 individuals. Meal remnants were collected from 5 of 10 events with additional associated samples (n = 24) collected from another five of 10 events. Low-coverage whole-genome sequencing (genome skimming) was used to identify meal remnants. RESULTS: Patients (26-75 years of age) ranged from 1-4 per event, with 90% involving ≥2 individuals. Reported symptoms included muscle tenderness and weakness, nausea/vomiting, and brown/tea-colored urine. Median incubation period was 8 h. Ninety-six percent of cases were hospitalized with a median duration of four days. The most commonly reported laboratory finding was elevated creatine phosphokinase and liver transaminases. Treatment was supportive including intravenous fluids to prevent renal failure. Events occurred in California (1), Illinois (2), Louisiana (1), New York (1), Mississippi (1), Missouri (2), New Jersey (1), and Texas (1) with location of harvest, when known, being Illinois, Louisiana, Mississippi, Missouri, Texas, and Wisconsin. Meal remnants were identified as I. bubalus (n = 4) and I. niger (n = 1). Associated samples were identified as I. bubalus (n = 16), I. cyprinellus (n = 5), and I. niger (n = 3). DISCUSSION: Time course, presentation of illness, and clinical findings were all consistent with previous domestic cases of buffalofish-associated Haff disease. In contrast to previous reports that I. cyprinellus is the causative species in US cases, data indicate that all three buffalofish species are harvested but I. bubalus is most often associated with illness.
Affiliation(s)
- Jonathan R Deeds
- Division of Analytical Chemistry, Center for Food Safety and Applied Nutrition, Office of Regulatory Science, US Food and Drug Administration, College Park, MD, USA
- Robert A Literman
- Division of Analytical Chemistry, Center for Food Safety and Applied Nutrition, Office of Regulatory Science, US Food and Drug Administration, College Park, MD, USA
- Sara M Handy
- Division of Analytical Chemistry, Center for Food Safety and Applied Nutrition, Office of Regulatory Science, US Food and Drug Administration, College Park, MD, USA
- Karl C Klontz
- Division of Public Health Information and Analytics, Center for Food Safety and Applied Nutrition, Office of Analytics and Outreach, US Food and Drug Administration, College Park, MD, USA
- Karen A Swajian
- Division of Seafood Safety, Center for Food Safety and Applied Nutrition, Office of Food Safety, US Food and Drug Administration, College Park, MD, USA
- Ronald A Benner
- Division of Seafood Science and Technology, Center for Food Safety and Applied Nutrition, Office of Food Safety, US Food and Drug Administration, Dauphin Island, AL, USA
- Henry L Bart
- Department of Ecology and Evolutionary Biology, Tulane University, New Orleans, LA, USA

8
Gable SM, Byars MI, Literman R, Tollis M. A Genomic Perspective on the Evolutionary Diversification of Turtles. Syst Biol 2022; 71:1331-1347. [DOI: 10.1093/sysbio/syac019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/19/2021] [Revised: 02/28/2022] [Accepted: 03/01/2022] [Indexed: 11/12/2022] Open
Abstract
To examine phylogenetic heterogeneity in turtle evolution, we collected thousands of high-confidence single-copy orthologs from 19 genome assemblies representative of extant turtle diversity and estimated a phylogeny with multispecies coalescent and concatenated partitioned methods. We also collected next-generation sequences from 26 turtle species and assembled millions of biallelic markers to reconstruct phylogenies based on annotated regions from the western painted turtle (Chrysemys picta bellii) genome (coding regions, introns, untranslated regions, intergenic, and others). We then measured gene tree-species tree discordance, as well as gene and site heterogeneity at each node in the inferred trees, and tested for temporal patterns in phylogenomic conflict across turtle evolution. We found strong and consistent support for all bifurcations in the inferred turtle species phylogenies. However, a number of genes, sites, and genomic features supported alternate relationships between turtle taxa. Our results suggest that gene tree-species tree discordance in these datasets is likely driven by population-level processes such as incomplete lineage sorting. We found very little effect of substitutional saturation on species tree topologies, and no clear phylogenetic patterns in codon usage bias and compositional heterogeneity. There was no correlation between gene and site concordance, node age, and DNA substitution rate across most annotated genomic regions. Our study demonstrates that heterogeneity is to be expected even in well resolved clades such as turtles, and that future phylogenomic studies should aim to sample as much of the genome as possible in order to obtain accurate phylogenies for assessing conservation priorities in turtles.
Affiliation(s)
- Simone M Gable
- School of Informatics, Computing, and Cyber Systems, Northern Arizona University, PO Box 5693, Flagstaff, AZ 8601, USA
- Michael I Byars
- School of Informatics, Computing, and Cyber Systems, Northern Arizona University, PO Box 5693, Flagstaff, AZ 8601, USA
- Robert Literman
- Department of Biological Sciences, University of Rhode Island, 120 Flagg Road, Kingstown, RI, 0288, USA
- Marc Tollis
- School of Informatics, Computing, and Cyber Systems, Northern Arizona University, PO Box 5693, Flagstaff, AZ 8601, USA

9
Literman RA, Ott BM, Wen J, Grauke LJ, Schwartz RS, Handy SM. Reference-free discovery of nuclear SNPs permits accurate, sensitive identification of Carya (hickory) species and hybrids. APPLICATIONS IN PLANT SCIENCES 2022; 10:e11455. [PMID: 35228913 PMCID: PMC8861591 DOI: 10.1002/aps3.11455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Accepted: 12/10/2021] [Indexed: 06/14/2023]
Abstract
PREMISE: DNA-based species identification is critical when morphological identification is restricted, but DNA-based identification pipelines typically rely on the ability to compare homologous sequence data across species. Because many clades lack robust genomic resources, we present here a bioinformatics pipeline capable of generating genome-wide single-nucleotide polymorphism (SNP) data while circumventing the need for any reference genome or annotation data. METHODS: Using the SISRS bioinformatics pipeline, we generated de novo ortholog data for the genus Carya, isolating sites where genetic variation was restricted to a single Carya species (i.e., species-informative SNPs). We leveraged these SNPs to identify both full-species and hybrid Carya specimens, even at very low sequencing depths. RESULTS: We identified between 46,000 and 476,000 species-identifying SNPs for each of eight diploid Carya species, and all species identifications were concordant with the species of record. For all putative F1 hybrid specimens, both parental species were correctly identified in all cases, and more punctate patterns of introgression were detectable in more cryptic crosses. DISCUSSION: Bioinformatics pipelines that use only short-read sequencing data provide vital new tools enabling rapid expansion of DNA identification assays for model and non-model clades alike.
Affiliation(s)
- Robert A. Literman
- Office of Regulatory Science, Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, Maryland, USA
- Brittany M. Ott
- Office of Food Additive Safety, Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, Maryland, USA
- Jun Wen
- Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, D.C., USA
- L. J. Grauke
- United States Department of Agriculture (USDA)–Agricultural Research Service Pecan Breeding and Genetics, Somerville, Texas, USA
- Rachel S. Schwartz
- Department of Biological Sciences, University of Rhode Island, Kingston, Rhode Island, USA
- Sara M. Handy
- Office of Regulatory Science, Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, Maryland, USA

10
Literman R, Schwartz R. Genome-Scale Profiling Reveals Noncoding Loci Carry Higher Proportions of Concordant Data. Mol Biol Evol 2021; 38:2306-2318. [PMID: 33528497 PMCID: PMC8136493 DOI: 10.1093/molbev/msab026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022] Open
Abstract
Many evolutionary relationships remain controversial despite whole-genome sequencing data. These controversies arise, in part, due to challenges associated with accurately modeling the complex phylogenetic signal coming from genomic regions experiencing distinct evolutionary forces. Here, we examine how different regions of the genome support or contradict well-established relationships among three mammal groups using millions of orthologous parsimony-informative biallelic sites (PIBS) distributed across primate, rodent, and Pecora genomes. We compared PIBS concordance percentages among locus types (e.g. coding sequences (CDS), introns, intergenic regions), and contrasted PIBS utility over evolutionary timescales. Sites derived from noncoding sequences provided more data and proportionally more concordant sites compared with those from CDS in all clades. CDS PIBS were also predominant drivers of tree incongruence in two cases of topological conflict. PIBS derived from most locus types provided surprisingly consistent support for splitting events spread across the timescales we examined, although we find evidence that CDS and intronic PIBS may, respectively and to a limited degree, inform disproportionately about older and younger splits. In this era of accessible whole-genome sequence data, these results: 1) suggest benefits to more intentionally focusing on noncoding loci as robust data for tree inference and 2) reinforce the importance of accurate modeling, especially when using CDS data.
Affiliation(s)
- Robert Literman
- Department of Biological Sciences, University of Rhode Island, South Kingstown, RI, USA; Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
- Rachel Schwartz
- Department of Biological Sciences, University of Rhode Island, South Kingstown, RI, USA

11
Hunter ES, Literman R, Handy SM. Utilizing Big Data to Identify Tiny Toxic Components: Digitalis. Foods 2021; 10:1794. [PMID: 34441571 PMCID: PMC8391216 DOI: 10.3390/foods10081794] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2021] [Revised: 07/22/2021] [Accepted: 07/27/2021] [Indexed: 12/23/2022] Open
Abstract
The botanical genus Digitalis is equal parts colorful, toxic, and medicinal, and its bioactive compounds have a long history of therapeutic use. However, with an extremely narrow therapeutic range, even trace amounts of Digitalis can cause adverse effects. Using chemical methods, the United States Food and Drug Administration traced a 1997 case of Digitalis toxicity to a shipment of Plantago (a common ingredient in dietary supplements marketed to improve digestion) contaminated with Digitalis lanata. With increased accessibility to next generation sequencing technology, here we ask whether this case could have been cracked rapidly using shallow genome sequencing strategies (e.g., genome skims). Using a modified implementation of the Site Identification from Short Read Sequences (SISRS) bioinformatics pipeline with whole-genome sequence data, we generated over 2 M genus-level single nucleotide polymorphisms in addition to species-informative single nucleotide polymorphisms. We simulated dietary supplement contamination by spiking low quantities (0-10%) of Digitalis whole-genome sequence data into a background of commonly used ingredients in products marketed for "digestive cleansing" and reliably detected Digitalis at the genus level while also discriminating between Digitalis species. This work serves as a roadmap for the development of novel DNA-based assays to quickly and reliably detect the presence of toxic species such as Digitalis in food products or dietary supplements using genomic methods and highlights the power of harnessing the entire genome to identify botanical species.
Affiliation(s)
- Sara M. Handy
  - Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD 20740, USA; (E.S.H.); (R.L.)
12
Libkind D, Čadež N, Opulente DA, Langdon QK, Rosa CA, Sampaio JP, Gonçalves P, Hittinger CT, Lachance MA. Towards yeast taxogenomics: lessons from novel species descriptions based on complete genome sequences. FEMS Yeast Res 2020; 20:5876348. [DOI: 10.1093/femsyr/foaa042] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/10/2020] [Accepted: 07/23/2020] [Indexed: 01/23/2023] Open
Abstract
In recent years, ‘multi-omic’ sciences have affected all aspects of fundamental and applied biological research. Yeast taxonomists, though somewhat timidly, have begun to incorporate complete genomic sequences into the description of novel taxa, taking advantage of these powerful data to calculate more reliable genetic distances, construct more robust phylogenies, correlate genotype with phenotype and even reveal cryptic sexual behaviors. However, the use of genomic data in formal yeast species descriptions is far from widespread. The present review examines published examples of genome-based species descriptions of yeasts, highlights relevant bioinformatic approaches, provides recommendations for new users and discusses some of the challenges facing the genome-based systematics of yeasts.
Affiliation(s)
- D Libkind
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC) – CONICET / Universidad Nacional del Comahue, Bariloche, Argentina
- N Čadež
  - Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, 1000 Ljubljana, Slovenia
- D A Opulente
  - Laboratory of Genetics, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, USA
- Q K Langdon
  - Laboratory of Genetics, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI, USA
- C A Rosa
  - Departamento de Microbiologia, ICB, C.P. 486, Universidade Federal de Minas Gerais, Belo Horizonte, MG, 31270–901, Brazil
- J P Sampaio
  - UCIBIO, Departamento de Ciências da Vida, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
- P Gonçalves
  - UCIBIO, Departamento de Ciências da Vida, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
- C T Hittinger
  - Laboratory of Genetics, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, USA
- M A Lachance
  - Department of Biology, University of Western Ontario, London N6A 5B7, Ontario, Canada
13
Oruongo J, Ronk K, Alagoz O, Jaffery J, Smith M. Skilled Nursing Facility Differences in Readmission Rates by the Diagnosis-Related Group Category of the Initial Hospitalization. J Am Med Dir Assoc 2020; 21:1175-1177. [PMID: 32217070 DOI: 10.1016/j.jamda.2020.02.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2020] [Revised: 01/21/2020] [Accepted: 02/10/2020] [Indexed: 11/28/2022]
Affiliation(s)
- John Oruongo
  - Department of Industrial and Systems Engineering, College of Engineering, University of Wisconsin, Madison, WI
- Katie Ronk
  - Population Health Sciences, School of Medicine and Public Health, University of Wisconsin, Madison, WI; Health Innovation Program, School of Medicine and Public Health, University of Wisconsin, Madison, WI
- Oguzhan Alagoz
  - Department of Industrial and Systems Engineering, College of Engineering, University of Wisconsin, Madison, WI
- Jonathan Jaffery
  - Office of Population Health, UW Health, Madison, WI; Department of Medicine, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI
- Maureen Smith
  - Department of Industrial and Systems Engineering, College of Engineering, University of Wisconsin, Madison, WI; Population Health Sciences, School of Medicine and Public Health, University of Wisconsin, Madison, WI; Health Innovation Program, School of Medicine and Public Health, University of Wisconsin, Madison, WI
14
Libkind D, Peris D, Cubillos FA, Steenwyk JL, Opulente DA, Langdon QK, Rokas A, Hittinger CT. Into the wild: new yeast genomes from natural environments and new tools for their analysis. FEMS Yeast Res 2020; 20:foaa008. [PMID: 32009143 PMCID: PMC7067299 DOI: 10.1093/femsyr/foaa008] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 12/07/2019] [Accepted: 01/31/2020] [Indexed: 12/16/2022] Open
Abstract
Genomic studies of yeasts from the wild have increased considerably in the past few years. This revolution has been fueled by advances in high-throughput sequencing technologies and a better understanding of yeast ecology and phylogeography, especially for biotechnologically important species. The present review aims to first introduce new bioinformatic tools available for the generation and analysis of yeast genomes. We also assess the accumulated genomic data of wild isolates of industrially relevant species, such as Saccharomyces spp., which provide unique opportunities to further investigate the domestication processes associated with the fermentation industry and opportunistic pathogenesis. The availability of genome sequences of other less conventional yeasts obtained from the wild has also increased substantially, including representatives of the phyla Ascomycota (e.g. Hanseniaspora) and Basidiomycota (e.g. Phaffia). Here, we review salient examples of both fundamental and applied research that demonstrate the importance of continuing to sequence and analyze genomes of wild yeasts.
Affiliation(s)
- D Libkind
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC) – CONICET/Universidad Nacional del Comahue, Quintral 1250 (8400), Bariloche, Argentina
- D Peris
  - Department of Food Biotechnology, Institute of Agrochemistry and Food Technology-CSIC, Calle Catedrático Dr. D. Agustin Escardino Benlloch n°7, 46980 Paterna, Valencia, Spain
- F A Cubillos
  - Millennium Institute for Integrative Biology (iBio). General del Canto 51 (7500574), Santiago
  - Universidad de Santiago de Chile, Facultad de Química y Biología, Departamento de Biología. Alameda 3363 (9170002). Estación Central. Santiago, Chile
- J L Steenwyk
  - Department of Biological Sciences, VU Station B#35-1634, Vanderbilt University, Nashville, TN 37235, USA
- D A Opulente
  - Laboratory of Genetics, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, University of Wisconsin-Madison, 1552 University Avenue, Madison, WI 53726-4084, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, 1552 University Avenue, Madison, WI 53726-4084, USA
- Q K Langdon
  - Laboratory of Genetics, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, University of Wisconsin-Madison, 1552 University Avenue, Madison, WI 53726-4084, USA
- A Rokas
  - Department of Biological Sciences, VU Station B#35-1634, Vanderbilt University, Nashville, TN 37235, USA
- C T Hittinger
  - Laboratory of Genetics, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, University of Wisconsin-Madison, 1552 University Avenue, Madison, WI 53726-4084, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, 1552 University Avenue, Madison, WI 53726-4084, USA
15
Langdon QK, Peris D, Kyle B, Hittinger CT. sppIDer: A Species Identification Tool to Investigate Hybrid Genomes with High-Throughput Sequencing. Mol Biol Evol 2019; 35:2835-2849. [PMID: 30184140 PMCID: PMC6231485 DOI: 10.1093/molbev/msy166] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 12/25/2022] Open
Abstract
The genomics era has expanded our knowledge about the diversity of the living world, yet harnessing high-throughput sequencing data to investigate alternative evolutionary trajectories, such as hybridization, is still challenging. Here we present sppIDer, a pipeline for the characterization of interspecies hybrids and pure species, that illuminates the complete composition of genomes. sppIDer maps short-read sequencing data to a combination genome built from reference genomes of several species of interest and assesses the genomic contribution and relative ploidy of each parental species, producing a series of colorful graphical outputs ready for publication. As a proof-of-concept, we use the genus Saccharomyces to detect and visualize both interspecies hybrids and pure strains, even with missing parental reference genomes. Through simulation, we show that sppIDer is robust to variable reference genome qualities and performs well with low-coverage data. We further demonstrate the power of this approach in plants, animals, and other fungi. sppIDer is robust to many different inputs and provides visually intuitive insight into genome composition that enables the rapid identification of species and their interspecies hybrids. sppIDer exists as a Docker image, which is a reusable, reproducible, transparent, and simple-to-run package that automates the pipeline and installation of the required dependencies (https://github.com/GLBRC/sppIDer; last accessed September 6, 2018).
Affiliation(s)
- Quinn K Langdon
  - Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI; Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI
- David Peris
  - Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI; Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI; Department of Food Biotechnology, Institute of Agrochemistry and Food Technology (IATA), CSIC, Valencia, Spain
- Brian Kyle
  - Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI
- Chris Todd Hittinger
  - Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI; Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
16
Allio R, Scornavacca C, Nabholz B, Clamens AL, Sperling FAH, Condamine FL. Whole Genome Shotgun Phylogenomics Resolves the Pattern and Timing of Swallowtail Butterfly Evolution. Syst Biol 2019; 69:38-60. [DOI: 10.1093/sysbio/syz030] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 12/07/2018] [Revised: 04/26/2019] [Accepted: 04/28/2019] [Indexed: 01/20/2023] Open
Abstract
Evolutionary relationships have remained unresolved in many well-studied groups, even though advances in next-generation sequencing and analysis, using approaches such as transcriptomics, anchored hybrid enrichment, or ultraconserved elements, have brought systematics to the brink of whole genome phylogenomics. Recently, it has become possible to sequence the entire genomes of numerous nonbiological models in parallel at reasonable cost, particularly with shotgun sequencing. Here, we identify orthologous coding sequences from whole-genome shotgun sequences, which we then use to investigate the relevance and power of phylogenomic relationship inference and time-calibrated tree estimation. We study an iconic group of butterflies—swallowtails of the family Papilionidae—that has remained phylogenetically unresolved, with continued debate about the timing of their diversification. Low-coverage whole genomes were obtained using Illumina shotgun sequencing for all genera. Genome assembly coupled to BLAST-based orthology searches allowed extraction of 6621 orthologous protein-coding genes for 45 Papilionidae species and 16 outgroup species (with 32% missing data after cleaning phases). Supermatrix phylogenomic analyses were performed with both maximum-likelihood (IQ-TREE) and Bayesian mixture models (PhyloBayes) for amino acid sequences, which produced a fully resolved phylogeny providing new insights into controversial relationships. Species tree reconstruction from gene trees was performed with ASTRAL and SuperTriplets and recovered the same phylogeny. We estimated gene site concordant factors to complement traditional node-support measures, which strengthens the robustness of inferred phylogenies. 
Bayesian estimates of divergence times based on a reduced data set (760 orthologs and 12% missing data) indicate a mid-Cretaceous origin of Papilionoidea around 99.2 Ma (95% credibility interval: 68.6–142.7 Ma) and Papilionidae around 71.4 Ma (49.8–103.6 Ma), with subsequent diversification of modern lineages well after the Cretaceous-Paleogene event. These results show that shotgun sequencing of whole genomes, even when highly fragmented, represents a powerful approach to phylogenomics and molecular dating in a group that has previously been refractory to resolution.
Affiliation(s)
- Rémi Allio
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
- Céline Scornavacca
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
  - Institut de Biologie Computationnelle (IBC), Montpellier, France
- Benoit Nabholz
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
- Anne-Laure Clamens
  - INRA, UMR 1062 Centre de Biologie pour la Gestion des Populations (INRA, IRD, CIRAD, Montpellier SupAgro), 755 Avenue du Campus Agropolis, 34988 Montferrier-sur-Lez, France
  - Department of Biological Sciences, University of Alberta, Edmonton T6G 2E9, AB, Canada
- Felix AH Sperling
  - Department of Biological Sciences, University of Alberta, Edmonton T6G 2E9, AB, Canada
- Fabien L Condamine
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
  - Department of Biological Sciences, University of Alberta, Edmonton T6G 2E9, AB, Canada
17
Pouchon C, Fernández A, Nassar JM, Boyer F, Aubert S, Lavergne S, Mavárez J. Phylogenomic Analysis of the Explosive Adaptive Radiation of the Espeletia Complex (Asteraceae) in the Tropical Andes. Syst Biol 2018; 67:1041-1060. [PMID: 30339252 DOI: 10.1093/sysbio/syy022] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 03/14/2017] [Accepted: 03/15/2018] [Indexed: 01/17/2023] Open
Abstract
The subtribe Espeletiinae (Asteraceae), endemic to the high-elevations in the Northern Andes, exhibits an exceptional diversity of species, growth-forms, and reproductive strategies. This complex of 140 species includes large trees, dichotomous trees, shrubs and the extraordinary giant caulescent rosettes, considered as a classic example of adaptation in tropical high-elevation ecosystems. The subtribe has also long been recognized as a prominent case of adaptive radiation, but the understanding of its evolution has been hampered by a lack of phylogenetic resolution. Herein, we produce the first fully resolved phylogeny of all morphological groups of Espeletiinae, using whole plastomes and about a million nuclear nucleotides obtained with an original de novo assembly procedure without reference genome, and analyzed with traditional and coalescent-based approaches that consider the possible impact of incomplete lineage sorting and hybridization on phylogenetic inference. We show that the diversification of Espeletiinae started from a rosette ancestor about 2.3 Ma, after the final uplift of the Northern Andes. This was followed by two independent radiations in the Colombian and Venezuelan Andes, with a few trans-cordilleran dispersal events among low-elevation tree lineages but none among high-elevation rosettes. We demonstrate complex scenarios of morphological change in Espeletiinae, usually implying the convergent evolution of growth-forms with frequent loss/gains of various traits. For instance, caulescent rosettes evolved independently in both countries, likely as convergent adaptations to life in tropical high-elevation habitats. Tree growth-forms evolved independently three times from the repeated colonization of lower elevations by high-elevation rosette ancestors. The rate of morphological diversification increased during the early phase of the radiation, after which it decreased steadily towards the present. 
On the other hand, the rate of species diversification in the best-sampled Venezuelan radiation was on average very high (3.1 spp/My), with significant rate variation among growth-forms (much higher in polycarpic caulescent rosettes). Our results point out a scenario where both adaptive morphological evolution and geographical isolation due to Pleistocene climatic oscillations triggered an exceptionally rapid radiation for a continental plant group.
Affiliation(s)
- Charles Pouchon
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
- Angel Fernández
  - Herbario IVIC, Centro de Biofísica y Bioquímica, Instituto Venezolano de Investigaciones Científicas, Apartado 20632, Caracas 1020-A, Venezuela
- Jafet M Nassar
  - Laboratorio de Biología de Organismos, Centro de Ecología, Instituto Venezolano de Investigaciones Científicas, Apartado 20632, Caracas 1020-A, Venezuela
- Frédéric Boyer
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
- Serge Aubert
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France; Station alpine Joseph-Fourier, UMS 3370, Université Grenoble Alpes-CNRS, Grenoble, France
- Sébastien Lavergne
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
- Jesús Mavárez
  - Laboratoire d'Ecologie Alpine, UMR 5553, Université Grenoble Alpes-CNRS, Grenoble, France
18
Mayland-Quellhorst E, Meudt HM, Albach DC. Transcriptomic resources and marker validation for diploid and polyploid Veronica (Plantaginaceae) from New Zealand and Europe. Applications in Plant Sciences 2016; 4:apps1600091. [PMID: 27785388 PMCID: PMC5077287 DOI: 10.3732/apps.1600091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/28/2016] [Accepted: 09/02/2016] [Indexed: 05/24/2023]
Abstract
PREMISE OF THE STUDY: Polyploidy may generate novel variation, leading to adaptation and species diversification. An excellent natural system to study polyploid evolution in a comparative framework is Veronica (Plantaginaceae), which comprises several parallel, recently evolved polyploid series. METHODS: Over 105 million Illumina paired-end sequence reads were generated from cDNA libraries of leaf tissue from eight individuals representing three European and four New Zealand species. Forty-eight simple sequence repeat (SSR) and 48 low-copy nuclear (LCN) markers were developed and validated with Fluidigm microfluidic PCR and Illumina MiSeq amplicon sequencing on 48 different individuals each. RESULTS: Individual Trinity assemblies were similar regarding annotated transcripts (13,009-14,271), mean contig length (635-742 bp), N50 value (916-1133 bp), E90N50 value (1099-1308 bp), contigs with positive BLAST hits (42-63%), and gene ontology terms. Analyses of 29,738 single-nucleotide polymorphisms (8746 phylogenetically informative) mined from these transcriptomes plus two outgroups (Picrorhiza kurrooa and Plantago ovata) showed moderate to high bootstrap support for all branches and reticulation among sampled European Veronica. DISCUSSION: The transcriptome sequences themselves, as well as the validated SSR (40/48) and LCN (11/48) markers derived from them, show inter- and intraspecific genetic variation. These resources will be invaluable for future population genetic, phylogenetic, and functional genetic investigations in polyploid Veronica.
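The N50 values reported in this abstract summarize assembly contiguity: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. A minimal Python sketch of that calculation follows; the function name `n50` and the example lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Illustrative helper, not part of any published pipeline.
    """
    # Walk the contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths (hypothetical): total is 32, so half is 16.
example = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example))
```

Because N50 depends only on the length distribution, two assemblies with similar N50 values can still differ in completeness, which is presumably why the abstract reports it alongside transcript counts and BLAST hit rates.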
Affiliation(s)
- Eike Mayland-Quellhorst
  - Carl-von-Ossietzky Universität Oldenburg, Carl-von-Ossietzky Straße 9–11, Oldenburg 26111, Germany
- Heidi M. Meudt
  - Carl-von-Ossietzky Universität Oldenburg, Carl-von-Ossietzky Straße 9–11, Oldenburg 26111, Germany
  - Museum of New Zealand Te Papa Tongarewa, Cable Street, P.O. Box 467, Wellington 6140, New Zealand
- Dirk C. Albach
  - Carl-von-Ossietzky Universität Oldenburg, Carl-von-Ossietzky Straße 9–11, Oldenburg 26111, Germany
19
Abstract
The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts to develop a workflow that will meet the objectives and goals of HTS projects.
20
Maddock ST, Briscoe AG, Wilkinson M, Waeschenbach A, San Mauro D, Day JJ, Littlewood DTJ, Foster PG, Nussbaum RA, Gower DJ. Next-Generation Mitogenomics: A Comparison of Approaches Applied to Caecilian Amphibian Phylogeny. PLoS One 2016; 11:e0156757. [PMID: 27280454 PMCID: PMC4900593 DOI: 10.1371/journal.pone.0156757] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/10/2015] [Accepted: 05/19/2016] [Indexed: 01/06/2023] Open
Abstract
Mitochondrial genome (mitogenome) sequences are being generated with increasing speed due to the advances of next-generation sequencing (NGS) technology and associated analytical tools. However, detailed comparisons to explore the utility of alternative NGS approaches applied to the same taxa have not been undertaken. We compared a 'traditional' Sanger sequencing method with two NGS approaches (shotgun sequencing and non-indexed, multiplex amplicon sequencing) on four different sequencing platforms (Illumina's HiSeq and MiSeq, Roche's 454 GS FLX, and Life Technologies' Ion Torrent) to produce seven (near-) complete mitogenomes from six species that form a small radiation of caecilian amphibians from the Seychelles. The fastest, most accurate method of obtaining mitogenome sequences that we tested was direct sequencing of genomic DNA (shotgun sequencing) using the MiSeq platform. Bayesian inference and maximum likelihood analyses using seven different partitioning strategies were unable to resolve compellingly all phylogenetic relationships among the Seychelles caecilian species, indicating the need for additional data in this case.
Affiliation(s)
- Simon T. Maddock
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, United Kingdom
  - Department of Animal Management, Reaseheath College, Nantwich, CW5 6DF, United Kingdom
- Andrew G. Briscoe
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
- Mark Wilkinson
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
- Andrea Waeschenbach
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
- Diego San Mauro
  - Department of Zoology and Physical Anthropology, Complutense University of Madrid, 28040, Madrid, Spain
- Julia J. Day
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, United Kingdom
- D. Tim J. Littlewood
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
- Peter G. Foster
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
- Ronald A. Nussbaum
  - Museum of Zoology, University of Michigan, Ann Arbor, MI, 48109–1079, United States of America
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109–1079, United States of America
- David J. Gower
  - Department of Life Sciences, Natural History Museum, London, SW7 5BD, United Kingdom
21
Phylogenomic reconstruction supports supercontinent origins for Leishmania. Infection, Genetics and Evolution 2015; 38:101-109. [PMID: 26708057 DOI: 10.1016/j.meegid.2015.11.030] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 10/01/2015] [Revised: 11/25/2015] [Accepted: 11/26/2015] [Indexed: 11/23/2022]
Abstract
Leishmania, a genus of parasites transmitted to human hosts and mammalian/reptilian reservoirs by an insect vector, is the causative agent of the human disease complex leishmaniasis. The evolutionary relationships within the genus Leishmania and its origins are the source of ongoing debate, reflected in conflicting phylogenetic and biogeographic reconstructions. This study employs a recently described bioinformatics method, SISRS, to identify over 200,000 informative sites across the genome from newly sequenced and publicly available Leishmania data. This dataset is used to reconstruct the evolutionary relationships of this genus. Additionally, we constructed a large multi-gene dataset, using it to reconstruct the phylogeny and estimate divergence dates for species. We conclude that the genus Leishmania evolved at least 90-100 million years ago, supporting a modified version of the Multiple Origins hypothesis that we call the Supercontinent hypothesis. According to this scenario, separate Leishmania clades emerged prior to, and during, the breakup of Gondwana. Additionally, we confirm that reptile-infecting Leishmania are derived from mammalian forms and that the species that infect porcupines and sloths form a clade long separated from other species. Finally, we firmly place the guinea-pig infecting species, Leishmania enriettii, the globally dispersed Leishmania siamensis, and the newly identified Australian species from a kangaroo, as sibling species whose distribution arises from the ancient connection between Australia, Antarctica, and South America.
22
Pettengill JB, Luo Y, Davis S, Chen Y, Gonzalez-Escalona N, Ottesen A, Rand H, Allard MW, Strain E. An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella. PeerJ 2014; 2:e620. [PMID: 25332847 PMCID: PMC4201946 DOI: 10.7717/peerj.620] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 04/16/2014] [Accepted: 09/23/2014] [Indexed: 11/20/2022] Open
Abstract
Comparative genomics based on whole genome sequencing (WGS) is increasingly being applied to investigate questions within evolutionary and molecular biology, as well as questions concerning public health (e.g., pathogen outbreaks). Given the impact that conclusions derived from such analyses may have, we have evaluated the robustness of clustering individuals based on WGS data to three key factors: (1) next-generation sequencing (NGS) platform (HiSeq, MiSeq, IonTorrent, 454, and SOLiD), (2) algorithms used to construct a SNP (single nucleotide polymorphism) matrix (reference-based and reference-free), and (3) phylogenetic inference method (FastTreeMP, GARLI, and RAxML). We carried out these analyses on 194 whole genome sequences representing 107 unique Salmonella enterica subsp. enterica ser. Montevideo strains. Reference-based approaches for identifying SNPs produced trees that were significantly more similar to one another than those produced under the reference-free approach. Topologies inferred using a core matrix (i.e., no missing data) were significantly more discordant than those inferred using a non-core matrix that allows for some missing data. However, allowing for too much missing data likely results in a high false discovery rate of SNPs. When analyzing the same SNP matrix, we observed that the more thorough inference methods implemented in GARLI and RAxML produced more similar topologies than FastTreeMP. Our results also confirm that reproducibility varies among NGS platforms where the MiSeq had the lowest number of pairwise differences among replicate runs. Our investigation into the robustness of clustering patterns illustrates the importance of carefully considering how data from different platforms are combined and analyzed. We found clear differences in the topologies inferred, and certain methods performed significantly better than others for discriminating between the highly clonal organisms investigated here. 
The methods supported by our results represent a preliminary set of guidelines and a step towards developing validated standards for clustering based on whole genome sequence data.
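To make the distinction between a core and a non-core SNP matrix concrete, here is a minimal Python sketch. It assumes the matrix is held as one string of base calls per sample, with `N`, `-`, or `?` marking missing data. The function name `filter_snp_sites`, the `max_missing` parameter, and the toy samples are hypothetical and do not reproduce the study's pipeline.

```python
MISSING = frozenset("N-?")  # assumed symbols for a missing base call


def filter_snp_sites(matrix, max_missing=0.0):
    """Keep sites whose fraction of missing calls is <= max_missing.

    matrix maps sample name -> string of base calls (equal lengths).
    max_missing=0.0 yields a core matrix (no missing data allowed);
    larger values yield a non-core matrix.
    """
    names = list(matrix)
    n_sites = len(matrix[names[0]])
    keep = []
    for i in range(n_sites):
        # Count how many samples lack a call at site i.
        n_missing = sum(matrix[name][i] in MISSING for name in names)
        if n_missing / len(names) <= max_missing:
            keep.append(i)
    return {name: "".join(matrix[name][i] for i in keep) for name in names}


# Toy matrix (hypothetical): site 1 is missing in one of three samples,
# site 2 is missing in two of three.
calls = {"s1": "ACGT", "s2": "ANNT", "s3": "ACNT"}
core = filter_snp_sites(calls)                       # sites 0 and 3 survive
non_core = filter_snp_sites(calls, max_missing=0.5)  # sites 0, 1 and 3 survive
print(core, non_core)
```

Raising `max_missing` retains more sites at the cost of more uncertain calls, which mirrors the trade-off the abstract describes between core and non-core matrices.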
Affiliation(s)
- James B Pettengill
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Yan Luo
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Steven Davis
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Yi Chen
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Narjol Gonzalez-Escalona
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Andrea Ottesen
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Hugh Rand
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Marc W Allard
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA
- Errol Strain
  - Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, College Park, MD, USA