1
Abalde S, Jondelius U. A Phylogenomic Backbone for Acoelomorpha Inferred From Transcriptomic Data. Syst Biol 2025; 74:70-85. [PMID: 39451056 PMCID: PMC11809588 DOI: 10.1093/sysbio/syae057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 10/03/2024] [Accepted: 11/28/2024] [Indexed: 10/26/2024] Open
Abstract
Xenacoelomorpha are mostly microscopic, morphologically simple worms, lacking many structures typical of other bilaterians. Xenacoelomorphs, which include three main groups (Acoela, Nemertodermatida, and Xenoturbella), have been proposed to be an early diverging Bilateria, sister to protostomes and deuterostomes, but other phylogenomic analyses have recovered this clade nested within the deuterostomes, as sister to Ambulacraria. The position of Xenacoelomorpha within the metazoan tree has understandably attracted a lot of attention, overshadowing the study of phylogenetic relationships within this group. Given that Xenoturbella includes only six species whose relationships are well understood, we decided to focus on the most speciose Acoelomorpha (Acoela + Nemertodermatida). Here, we have sequenced 29 transcriptomes, doubling the number of sequenced species, to infer a backbone tree for Acoelomorpha based on genomic data. The recovered topology is mostly congruent with previous studies. The most important difference is the recovery of Paratomella as the first off-shoot within Acoela, dramatically changing the reconstruction of the ancestral acoel. In addition, we have detected incongruence between the gene trees and the species tree, likely linked to incomplete lineage sorting, and some signal of introgression between the families Dakuidae and Mecynostomidae, which hampers inferring the correct placement of this family and, particularly, of the genus Notocelis. We have also used this dataset to infer for the first time diversification times within Acoelomorpha, which coincide with known bilaterian diversification and extinction events. Given the importance of morphological data in acoelomorph phylogenetics, we tested several partitions and models. Although morphological data failed to recover a robust phylogeny, phylogenetic placement has proven to be a suitable alternative when a reference phylogeny is available.
Affiliation(s)
- Samuel Abalde
  - Department of Zoology, Swedish Museum of Natural History, Stockholm, Sweden
- Ulf Jondelius
  - Department of Zoology, Swedish Museum of Natural History, Stockholm, Sweden

2
Torruella G, Galindo LJ, Moreira D, López-García P. Phylogenomics of neglected flagellated protists supports a revised eukaryotic tree of life. Curr Biol 2025; 35:198-207.e4. [PMID: 39642877 DOI: 10.1016/j.cub.2024.10.075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 08/28/2024] [Accepted: 10/29/2024] [Indexed: 12/09/2024]
Abstract
Eukaryotes evolved from prokaryotic predecessors in the early Proterozoic [1,2] and radiated from their already complex last common ancestor [3], diversifying into several supergroups with unresolved deep evolutionary connections [4]. They evolved extremely diverse lifestyles, playing crucial roles in the carbon cycle [5,6]. Heterotrophic flagellates are arguably the most diverse eukaryotes [4,7,8,9] and often occupy basal positions in phylogenetic trees. However, many of them remain undersampled [4,10] and/or incertae sedis [4,11,12,13,14,15,16,17,18]. Progressive improvement of phylogenomic methods and a wider protist sampling have reshaped and consolidated major clades in the eukaryotic tree [13,14,15,16,17,18,19]. This is illustrated by the Opimoda [14], one of the largest eukaryotic supergroups (Amoebozoa, Ancyromonadida, Apusomonadida, Breviatea, CRuMs [Collodictyon-Rigifila-Mantamonas], Malawimonadida, and Opisthokonta, the latter including animals and fungi) [4,14,19,20,21,22]. However, their deepest evolutionary relationships still remain uncertain. Here, we sequenced transcriptomes of poorly studied flagellates [23,24] (14 apusomonads [25,26], 7 ancyromonads [27], and 1 cultured Mediterranean strain of Meteora sporadica [17]) and conducted comprehensive phylogenomics analyses with an expanded taxon sampling of early-branching protists. Our findings support the monophyly of Opimoda, with CRuMs being sister to the Amorphea (amoebozoans, breviates, apusomonads, and opisthokonts) and ancyromonads and malawimonads forming a moderately supported clade. By mapping key complex phenotypic traits onto this phylogenetic framework, we infer an opimodan biflagellate ancestor with an excavate-like feeding groove, which ancyromonads subsequently lost. Although breviates and apusomonads retained the ancestral biflagellate state, some early-diverging Amorphea lost one or both flagella, facilitating the evolution of amoeboid morphologies, novel feeding modes, and palintomic cell division resulting in multinucleated cells. These innovations likely facilitated the subsequent evolution of fungal and metazoan multicellularity.
Affiliation(s)
- Guifré Torruella
  - Ecologie Systématique Evolution, CNRS, Université Paris-Saclay, AgroParisTech, 91190 Gif-sur-Yvette, France; Institut de Biologia Evolutiva, UPF-CSIC, Barcelona, Catalonia 08003, Spain
- Luis Javier Galindo
  - Ecologie Systématique Evolution, CNRS, Université Paris-Saclay, AgroParisTech, 91190 Gif-sur-Yvette, France; Institute of Water Research, University of Granada, 18071 Granada, Spain; Department of Ecology, University of Granada, Campus Fuentenueva, 18071 Granada, Spain
- David Moreira
  - Ecologie Systématique Evolution, CNRS, Université Paris-Saclay, AgroParisTech, 91190 Gif-sur-Yvette, France
- Purificación López-García
  - Ecologie Systématique Evolution, CNRS, Université Paris-Saclay, AgroParisTech, 91190 Gif-sur-Yvette, France

3
Roberts NG, Gilmore MJ, Struck TH, Kocot KM. Multiple Displacement Amplification Facilitates SMRT Sequencing of Microscopic Animals and the Genome of the Gastrotrich Lepidodermella squamata (Dujardin 1841). Genome Biol Evol 2024; 16:evae254. [PMID: 39590608 DOI: 10.1093/gbe/evae254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 11/11/2024] [Accepted: 11/14/2024] [Indexed: 11/28/2024] Open
Abstract
Obtaining adequate DNA for long-read genome sequencing remains a roadblock to producing contiguous genomes from small-bodied organisms, hindering understanding of phylogenetic relationships and genome evolution. Multiple displacement amplification leverages Phi29 DNA polymerase to produce micrograms of DNA from picograms of input. However, multiple displacement amplification's inherent biases in amplification related to guanine and cytosine (GC) content, repeat content and chimera production are a problem for long-read genome assembly, which has been little investigated. We explored the utility of multiple displacement amplification for generating template DNA for High Fidelity (HiFi) sequencing directly from living cells of Caenorhabditis elegans (Nematoda) and Lepidodermella squamata (Gastrotricha) containing one order of magnitude less DNA than required for the PacBio Ultra-Low DNA Input Workflow. High Fidelity sequencing of libraries prepared from multiple displacement amplification products resulted in highly contiguous and complete genomes for both C. elegans (102 Mbp assembly; 336 contigs; N50 = 868 kbp; L50 = 39; BUSCO_nematoda_nucleotide: S:96.1%, D:2.8%) and L. squamata (122 Mbp assembly; 157 contigs; N50 = 3.9 Mbp; L50 = 13; BUSCO_metazoa_nucleotide: S:80.8%, D:2.8%). Coverage uniformity for reads from multiple displacement amplification DNA (Gini Index: 0.14, normalized mean across all 100 kbp blocks: 0.49) and reads from pooled nematode DNA (Gini Index: 0.16, normalized mean across all 100 kbp blocks: 0.49) proved similar. Using this approach, we sequenced the genome of the microscopic invertebrate L. squamata (Gastrotricha), the first of its phylum. Using the newly sequenced genome, we infer Gastrotricha's long-debated phylogenetic position as the sister taxon of Platyhelminthes and conduct a comparative analysis of the Hox cluster.
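The contiguity and coverage-uniformity figures quoted in this abstract (N50, L50, Gini index) follow standard definitions that can be sketched in a few lines. The Python below is a minimal illustration, not the authors' pipeline: the function names `n50_l50` and `gini_index` and the toy inputs are assumptions, and the abstract does not state how per-block coverage was computed.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches half of the
    assembly size. L50 is the number of contigs needed to get there.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


def gini_index(values):
    """Gini index of non-negative values (0 means perfectly uniform).

    Here `values` stands for mean coverage per genomic block; this
    input choice is an assumption made for illustration.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero value")
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    print(n50_l50([10, 8, 6, 4, 2]))   # toy contig lengths
    print(gini_index([5, 5, 5, 5]))    # perfectly even toy coverage
    print(gini_index([0, 0, 0, 10]))   # highly uneven toy coverage
```

A lower Gini index indicates more even coverage, which is why the similar values reported for amplified and pooled DNA are read as comparable uniformity.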
Affiliation(s)
- Nickellaus G Roberts
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, USA
- Michael J Gilmore
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, USA
- Kevin M Kocot
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, USA
  - Alabama Museum of Natural History, The University of Alabama, Tuscaloosa, Alabama, USA

4
Zeng Y, He K, Chen X, Bai W, Lin H, Chen J, Nedyalkov N, Yamaguchi N, Vijayan K, Suganthasakthivel R, Kumar B, Han Y, Chen Z, Wang W, Liu Y. Museum specimens shedding light on the evolutionary history and cryptic diversity of the hedgehog family Erinaceidae. Integr Zool 2024. [PMID: 39370584 DOI: 10.1111/1749-4877.12909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2024]
Abstract
The family Erinaceidae encompasses 27 extant species in two subfamilies: Erinaceinae, which includes spiny hedgehogs, and Galericinae, which comprises silky-furred gymnures and moonrats. Although they are commonly recognized by the general public, their phylogenetic history remains incompletely understood, and several species have never been included in any molecular analyses. Additionally, previous research suggested that the species diversity of Erinaceidae might be underestimated. In this study, we sequenced the mitochondrial genomes of 29 individuals representing 18 erinaceid species using 18 freshly collected tissue samples and 11 historical museum specimens. We also integrated previously published data for a concatenated analysis. We aimed to elucidate the evolutionary relationships within Erinaceidae, estimate divergence times, and uncover potentially underestimated species diversity. Our data finely resolved intergeneric and interspecific relationships and presented the first molecular evidence for the phylogenetic position of Mesechinus wangi, Paraechinus micropus, and P. nudiventris. Our results revealed a sister relationship between Neotetracus and Neohylomys gymnures, as well as a sister relationship between Hemiechinus and Mesechinus, supporting previous hypotheses. Additionally, our findings provided a novel phylogenetic position for Paraechinus aethiopicus, placing it in a basal position within the genus. Furthermore, our study uncovered cryptic species diversity within Hylomys suillus as well as in Neotetracus sinensis, Atelerix albiventris, P. aethiopicus, and Hemiechinus auratus, most of which have been previously overlooked.
Affiliation(s)
- Ying Zeng
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen, China
- Kai He
  - Key Laboratory of Conservation and Application in Biodiversity of South China, School of Life Sciences, Guangzhou University, Guangzhou, China
- Xing Chen
  - School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Weipeng Bai
  - Institute of Nihewan Archaeology, College of History and Culture, Hebei Normal University, Shijiazhuang, China
- Hongzhou Lin
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen, China
- Jianhai Chen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Nedko Nedyalkov
  - National Museum of Natural History, Bulgarian Academy of Sciences, Sofia, Bulgaria
- Nobuyuki Yamaguchi
  - Department of Biological and Environmental Sciences, Faculty of Arts and Sciences, Qatar University, Doha, Qatar
  - Institute of Tropical Biodiversity and Sustainable Development, University Malaysia Terengganu, Kuala Nerus, Malaysia
- Keerthy Vijayan
  - Centre for Plant Biotechnology and Molecular Biology, Kerala Agricultural University, Thrissur, Kerala, India
- Brawin Kumar
  - Indian Institute of Science Education and Research, Tirupati, Andhra Pradesh, India
  - Hedgehog Conservation Alliance (HCA), Kanyakumari, Tamil Nadu, India
- Yuqing Han
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen, China
- Zhongzheng Chen
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co-founded by Anhui Province and Ministry of Education, School of Ecology and Environment, Anhui Normal University, Wuhu, China
  - Wildlife Forensic Science Service, Kunming, China
- Wenzhi Wang
  - Wildlife Forensic Science Service, Kunming, China
  - Guizhou Jiandee Laboratories Co., Ltd., Guiyang, China
- Yang Liu
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen, China

5
Salabi F, Jafari H. Dataset of PLA2 family identified from transcriptomic high-throughput sequencing of Androctonus crassicauda (Scorpionida: Buthidae) venom gland. Data Brief 2024; 55:110629. [PMID: 39022691 PMCID: PMC11253220 DOI: 10.1016/j.dib.2024.110629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 05/17/2024] [Accepted: 06/05/2024] [Indexed: 07/20/2024] Open
Abstract
RNA sequencing has recently been widely applied to characterize the molecular diversity of venom compounds in various venomous animals, including scorpions. Among the venomous scorpions of the family Buthidae, Androctonus crassicauda is responsible for many documented cases of stings and severe envenoming. We present here a high-throughput RNA sequencing dataset of the venom glands from five A. crassicauda individuals, including male and female scorpions. The assembled data corresponding to annotated PLA2 transcripts are also presented. The dataset in this report is related to our research article entitled: "Whole transcriptome sequencing reveals the activity of the PLA2 family members in Androctonus crassicauda (Scorpionida: Buthidae) venom gland" [1]. Here, the venom gland transcriptome of A. crassicauda was analyzed. Analysis of the concatenated clustered transcriptome assembly using TrinityStats.pl showed that de novo assembly of 517,799,704 clean read pairs generated 744,804 Trinity transcripts representing 563,526 Trinity genes. The BUSCO assessment of the concatenated clustered transcriptome assembly against Arachnida orthologs showed 96.7% complete, 1.6% fragmented, and 1.7% missing genes out of 2934 genes. Subsequently, sequences with PLA2 annotations were extracted from the transcriptome dataset using BLAST searches against a local PLA2 database. We found several cDNA sequences with PLA2 annotations, which, based on sequence similarity to previously described PLA2s, we named platelet-activating factor acetylhydrolases, calcium-dependent PLA2s, calcium-independent PLA2s, and secreted PLA2s. The PLA2 data significantly enrich KEGG pathways related to lipid metabolism. This manuscript complements the primary research article by providing additional data on the abundance estimation of PLA2s.
Affiliation(s)
- Fatemeh Salabi
  - Razi Vaccine and Serum Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Ahvaz, Iran
- Hedieh Jafari
  - Razi Vaccine and Serum Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Ahvaz, Iran

6
Yarbrough E, Chandler C. Patterns of molecular evolution in a parthenogenic terrestrial isopod (Trichoniscus pusillus). PeerJ 2024; 12:e17780. [PMID: 39071119 PMCID: PMC11276757 DOI: 10.7717/peerj.17780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Accepted: 06/30/2024] [Indexed: 07/30/2024] Open
Abstract
The "paradox of sex" refers to the question of why sexual reproduction is maintained in the wild, despite how costly it is compared to asexual reproduction. Because of these costs, one might expect nature to select for asexual reproduction, yet sex seems to be continually selected for. Multiple hypotheses have been proposed to explain this incongruence, including the niche differentiation hypothesis, the Red Queen hypothesis, and accumulation of harmful mutations in asexual species due to inefficient purifying selection. This study focuses on the accumulation of mutations in two terrestrial isopods, Trichoniscus pusillus, which has sexual diploid and parthenogenic triploid forms, and Hyloniscus riparius, an obligately sexual relative. We surveyed sex ratios of both species in an upstate New York population and obtained RNA-seq data from wild-caught individuals of both species to examine within- and between-species patterns of molecular evolution in protein-coding genes. The sex ratio and RNA-seq data together provide strong evidence that this T. pusillus population is entirely asexual and triploid, while the H. riparius population is sexual and diploid. Although all the wild T. pusillus individuals used for sequencing shared identical genotypes at nearly all SNPs, supporting a clonal origin, heterozygosity and SNP density were much higher in T. pusillus than in the sexually reproducing H. riparius. This observation suggests this parthenogenic lineage may have arisen via mating between two divergent diploid lineages. Between-species sequence comparisons showed no evidence of ineffective purifying selection in the asexual T. pusillus lineage, as measured by the ratio of nonsynonymous to synonymous substitutions (dN/dS ratios). Likewise, there was no difference between T. pusillus and H. riparius in the ratios of nonsynonymous to synonymous SNPs overall (pN/pS). However, pN/pS ratios in T. pusillus were significantly higher when considering only SNPs that may have arisen via recent mutation after the transition to parthenogenesis. Thus, these recent SNPs are consistent with the hypothesis that purifying selection is less effective against new mutations in asexual lineages, but only over long time scales. This system provides a useful model for future studies on the evolutionary tradeoffs between sexual and asexual reproduction in nature.
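The pN/pS comparison in this abstract rests on a simple ratio, which a small sketch can make concrete. The version below normalizes SNP counts by the number of available sites of each class, which is one common convention; the abstract does not say how sites were counted, so the function name `pn_ps`, its parameters, and the toy numbers are all assumptions for illustration.

```python
def pn_ps(nonsyn_snps, syn_snps, nonsyn_sites, syn_sites):
    """Ratio of nonsynonymous to synonymous polymorphism (pN/pS).

    pN = nonsynonymous SNPs per nonsynonymous site
    pS = synonymous SNPs per synonymous site
    Site counts are assumed to come from an upstream annotation step
    that is not shown here.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    pn = nonsyn_snps / nonsyn_sites
    ps = syn_snps / syn_sites
    if ps == 0:
        raise ValueError("pS is zero; the ratio is undefined")
    return pn / ps


if __name__ == "__main__":
    # Toy counts only; these are not values from the study.
    all_snps = pn_ps(nonsyn_snps=10, syn_snps=20, nonsyn_sites=1000, syn_sites=500)
    recent_snps = pn_ps(nonsyn_snps=15, syn_snps=10, nonsyn_sites=1000, syn_sites=500)
    print(all_snps, recent_snps)
```

Under purifying selection the ratio is expected to stay well below 1, so a higher value for a subset of SNPs is read as weaker removal of nonsynonymous variants in that subset.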
Affiliation(s)
- Emily Yarbrough
  - Department of Biological Sciences, State University of New York at Oswego, Oswego, NY, United States of America
  - Department of Biological Sciences, State University of New York at Binghamton, Binghamton, NY, United States of America
- Christopher Chandler
  - Department of Biological Sciences, State University of New York at Oswego, Oswego, NY, United States of America

7
Huang YH, Sun YF, Li H, Li HS, Pang H. PhyloAln: A Convenient Reference-Based Tool to Align Sequences and High-Throughput Reads for Phylogeny and Evolution in the Omic Era. Mol Biol Evol 2024; 41:msae150. [PMID: 39041199 PMCID: PMC11287380 DOI: 10.1093/molbev/msae150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 05/15/2024] [Accepted: 07/16/2024] [Indexed: 07/24/2024] Open
Abstract
The current trend in phylogenetic and evolutionary analyses predominantly relies on omic data. However, prior to core analyses, traditional methods typically involve intricate and time-consuming procedures, including assembly from high-throughput reads, decontamination, gene prediction, homology search, orthology assignment, multiple sequence alignment, and matrix trimming. Such processes significantly impede the efficiency of research when dealing with extensive data sets. In this study, we develop PhyloAln, a convenient reference-based tool capable of directly aligning high-throughput reads or complete sequences with existing alignments as a reference for phylogenetic and evolutionary analyses. Through testing with simulated data sets of species spanning the tree of life, PhyloAln demonstrates consistently robust performance compared with other reference-based tools across different data types, sequencing technologies, coverages, and species, with percent completeness and identity at least 50 percentage points higher in the alignments. Additionally, we validate the efficacy of PhyloAln in removing a minimum of 90% foreign and 70% cross-contamination issues, which are prevalent in sequencing data but often overlooked by other tools. Moreover, we showcase the broad applicability of PhyloAln by generating alignments (completeness mostly larger than 80%, identity larger than 90%) and reconstructing robust phylogenies using real data sets of transcriptomes of ladybird beetles, plastid genes of peppers, or ultraconserved elements of turtles. With these advantages, PhyloAln is expected to facilitate phylogenetic and evolutionary analyses in the omic era. The tool is accessible at https://github.com/huangyh45/PhyloAln.
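The "percent completeness" and "identity" figures used to score alignments in this abstract can be illustrated with a toy calculation. The sketch below is not PhyloAln code and does not use its interface: the function name `completeness_and_identity` and the exact definitions (completeness as the share of reference positions covered by the query, identity as the share of covered positions that match) are assumptions, and the paper may define the metrics differently.

```python
def completeness_and_identity(reference, query, gap="-"):
    """Score one aligned query sequence against its aligned reference.

    Both strings must come from the same alignment, so they have equal
    length. Columns where the reference itself is a gap are ignored.
    Returns (completeness %, identity %).
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    ref_columns = covered = matches = 0
    for ref_char, query_char in zip(reference.upper(), query.upper()):
        if ref_char == gap:
            continue  # no reference residue at this column
        ref_columns += 1
        if query_char == gap:
            continue  # query does not cover this reference position
        covered += 1
        if query_char == ref_char:
            matches += 1
    completeness = 100 * covered / ref_columns if ref_columns else 0.0
    identity = 100 * matches / covered if covered else 0.0
    return completeness, identity


if __name__ == "__main__":
    # Toy aligned pair: 6 of 8 reference positions covered, 5 of 6 identical.
    print(completeness_and_identity("ACGTACGT", "ACGA--GT"))
```

Reporting the two numbers separately keeps a short but accurate alignment distinguishable from a long but error-prone one.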
Affiliation(s)
- Yu-Hao Huang
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Yi-Fei Sun
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Hao Li
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Hao-Sen Li
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Hong Pang
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China

8
Chen H, Wang B, Cai L, Yang X, Hu Y, Zhang Y, Leng X, Liu W, Fan D, Niu B, Zhou Q. A comprehensive performance evaluation, comparison, and integration of computational methods for detecting and estimating cross-contamination of human samples in cancer next-generation sequencing analysis. J Biomed Inform 2024; 152:104625. [PMID: 38479675 DOI: 10.1016/j.jbi.2024.104625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 02/25/2024] [Accepted: 03/10/2024] [Indexed: 03/17/2024]
Abstract
Cross-sample contamination is one of the major issues in next-generation sequencing (NGS)-based molecular assays. This type of contamination, even at very low levels, can significantly impact the results of an analysis, especially in the detection of somatic alterations in tumor samples. Several contamination identification tools have been developed and implemented as a crucial quality-control step in the routine NGS bioinformatic pipeline. However, no study has been published to comprehensively and systematically investigate, evaluate, and compare these computational methods in the cancer NGS analysis. In this study, we comprehensively investigated nine state-of-the-art computational methods for detecting cross-sample contamination. To explore their application in cancer NGS analysis, we further compared the performance of five representative tools by qualitative and quantitative analyses using in silico and simulated experimental NGS data. The results showed that Conpair achieved the best performance for identifying contamination and predicting the level of contamination in solid tumors NGS analysis. Moreover, based on Conpair, we developed a Python script, Contamination Source Predictor (ConSPr), to identify the source of contamination. We anticipate that this comprehensive survey and the proposed tool for predicting the source of contamination will assist researchers in selecting appropriate cross-contamination detection tools in cancer NGS analysis and inspire the development of computational methods for detecting sample cross-contamination and identifying its source in the future.
Affiliation(s)
- Huijuan Chen
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; WillingMed Technology Beijing Co. Ltd., Beijing 100176, China
- Bing Wang
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Lili Cai
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Xiaotian Yang
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Yali Hu
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Yiran Zhang
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Xue Leng
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Wen Liu
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China
- Dongjie Fan
  - National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Disease, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 102206, China
- Beifang Niu
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; ChosenMed Technology (Zhejiang) Co. Ltd., Zhejiang 311103, China
- Qiming Zhou
  - Beijing ChosenMed Clinical Laboratory Co. Ltd., Beijing 100176, China; ChosenMed Technology (Zhejiang) Co. Ltd., Zhejiang 311103, China

9
Bálint B, Merényi Z, Hegedüs B, Grigoriev IV, Hou Z, Földi C, Nagy LG. ContScout: sensitive detection and removal of contamination from annotated genomes. Nat Commun 2024; 15:936. [PMID: 38296951 PMCID: PMC10831095 DOI: 10.1038/s41467-024-45024-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/06/2023] [Accepted: 01/08/2024] [Indexed: 02/02/2024] Open
Abstract
Contamination of genomes is an increasingly recognized problem affecting several downstream applications, from comparative evolutionary genomics to metagenomics. Here we introduce ContScout, a precise tool for eliminating foreign sequences from annotated genomes. It achieves high specificity and sensitivity on synthetic benchmark data even when the contaminant is a closely related species, outperforms competing tools, and can distinguish horizontal gene transfer from contamination. A screen of 844 eukaryotic genomes for contamination identified bacteria as the most common source, followed by fungi and plants. Furthermore, we show that contaminants in ancestral genome reconstructions lead to erroneous early origins of genes and inflate gene loss rates, leading to a false notion of complex ancestral genomes. Taken together, we offer here a tool for sensitive removal of foreign proteins, identify and remove contaminants from diverse eukaryotic genomes and evaluate their impact on phylogenomic analyses.
Affiliation(s)
- Balázs Bálint
  - Synthetic and Systems Biology Unit, HUN-REN Biological Research Centre, Szeged, Szeged, 6726, Hungary
- Zsolt Merényi
  - Synthetic and Systems Biology Unit, HUN-REN Biological Research Centre, Szeged, Szeged, 6726, Hungary
- Botond Hegedüs
  - Synthetic and Systems Biology Unit, HUN-REN Biological Research Centre, Szeged, Szeged, 6726, Hungary
- Igor V Grigoriev
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, 94720, USA
- Zhihao Hou
  - Synthetic and Systems Biology Unit, HUN-REN Biological Research Centre, Szeged, Szeged, 6726, Hungary
  - Doctoral School of Biology, Faculty of Science and Informatics, University of Szeged, Szeged, 6720, Hungary
- Csenge Földi
  - Synthetic and Systems Biology Unit, HUN-REN Biological Research Centre, Szeged, Szeged, 6726, Hungary
  - Doctoral School of Biology, Faculty of Science and Informatics, University of Szeged, Szeged, 6720, Hungary
- László G Nagy
  - Synthetic and Systems Biology Unit, HUN-REN Biological Research Centre, Szeged, Szeged, 6726, Hungary

10
Alvarez RV, Landsman D. GTax: improving de novo transcriptome assembly by removing foreign RNA contamination. Genome Biol 2024; 25:12. [PMID: 38191464 PMCID: PMC10773103 DOI: 10.1186/s13059-023-03141-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 12/08/2023] [Indexed: 01/10/2024] Open
Abstract
The cost and complexity of generating a complete reference genome means that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This technology is cost-effective but is susceptible to off-target RNA contamination. In this manuscript, we present GTax, a taxonomy-structured database of genomic sequences that can be used with BLAST to detect and remove foreign contamination in RNA sequencing samples before assembly. In addition, we use a de novo transcriptome assembly of Solanum lycopersicum (tomato) to demonstrate that removing foreign contamination in sequencing samples reduces the number of assembled chimeric transcripts.
Affiliation(s)
- Roberto Vera Alvarez
  - Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD, USA
- David Landsman
  - Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD, USA

11
Rollin J, Rong W, Massart S. Cont-ID: detection of sample cross-contamination in viral metagenomic data. BMC Biol 2023; 21:217. [PMID: 37833740 PMCID: PMC10576407 DOI: 10.1186/s12915-023-01708-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
BACKGROUND: High-throughput sequencing (HTS) technologies, combined with bioinformatic analysis of the generated data, are becoming an important detection technique for virus diagnostics. They have the potential to replace or complement the current PCR-based methods thanks to their improved inclusivity and analytical sensitivity, as well as their overall good repeatability and reproducibility. Cross-contamination is a well-known phenomenon in molecular diagnostics and corresponds to the exchange of genetic material between samples. Cross-contamination management was a key drawback during the development of PCR-based detection and is now adequately monitored in routine diagnostics. HTS technologies are facing similar difficulties due to their very high analytical sensitivity. As a single viral read could be detected in millions of sequencing reads, it is mandatory to set a detection threshold that is informed by the estimated cross-contamination. Cross-contamination monitoring should therefore be a priority when detecting viruses by HTS technologies.
RESULTS: We present Cont-ID, a bioinformatic tool designed to check for cross-contamination by analysing the relative abundance of virus sequencing reads identified in sequence metagenomic datasets and their duplication between samples. It can be applied when the samples in a sequencing batch have been processed in parallel in the laboratory and with at least one specific external control called Alien control. Using 273 real datasets, including 68 virus species from different hosts (fruit tree, plant, human) and several library preparation protocols (Ribodepleted total RNA, small RNA and double-stranded RNA), we demonstrated that Cont-ID classifies with high accuracy (91%) viral species detection into (true) infection or (cross) contamination. This classification raises confidence in the detection and facilitates the downstream interpretation and confirmation of the results by prioritising the virus detections that should be confirmed.
CONCLUSIONS: Cross-contamination between samples when detecting viruses using HTS (Illumina technology) can be monitored and highlighted by Cont-ID (provided an alien control is present). Cont-ID is based on a flexible methodology relying on the output of bioinformatics analyses of the sequencing reads and considering the contamination pattern specific to each batch of samples. The Cont-ID method is adaptable so that each laboratory can optimise it before its validation and routine use.
Affiliation(s)
- Johan Rollin
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
  - DNAVision, 6041, Gosselies, Belgium
- Wei Rong
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
- Sébastien Massart
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
|
12
|
Ellis EA, Goodheart JA, Hensley NM, González VL, Reda NJ, Rivers TJ, Morin JG, Torres E, Gerrish GA, Oakley TH. Sexual Signals Persist over Deep Time: Ancient Co-option of Bioluminescence for Courtship Displays in Cypridinid Ostracods. Syst Biol 2023; 72:264-274. [PMID: 35984328 PMCID: PMC10448971 DOI: 10.1093/sysbio/syac057] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/19/2021] [Revised: 07/22/2022] [Accepted: 08/08/2022] [Indexed: 11/14/2022]
Abstract
Although the diversity, beauty, and intricacy of sexually selected courtship displays command the attention of evolutionists, the longevity of these traits in deep time is poorly understood. Population-based theory suggests sexual selection could either lower or raise extinction risk, resulting in high or low persistence of lineages with sexually selected traits. Furthermore, empirical studies that directly estimate the longevity of sexually selected traits are uncommon. Sexually selected signals, including bioluminescent courtship, originated multiple times during evolution, allowing the empirical study of their longevity after careful phylogenetic and divergence time analyses. Here, we estimate the first transcriptome-based molecular phylogeny and divergence times of Cypridinidae. We report extreme longevity of bioluminescent courtship, a trait important in mate choice and probably under sexual selection. Our relaxed-clock estimates of divergence times coupled with stochastic character mapping show luminous courtship evolved only once in Cypridinidae (in a Sub-Tribe we name Luxorina) at least 151 million years ago from cypridinid ancestors that used bioluminescence only in antipredator displays, defining a Tribe we name Luminini. This time-calibrated molecular phylogeny of cypridinids will serve as a foundation for integrative and comparative studies on the biochemistry, molecular evolution, courtship, diversification, and ecology of cypridinid bioluminescence. The persistence of luminous courtship for hundreds of millions of years suggests that sexual selection did not cause a rapid loss of associated traits, and that rates of speciation within the group exceeded extinction risk, which may contribute to the persistence of a diverse clade of signaling species. [Ancestral state reconstruction; Biodiversity; co-option; divergence time estimates; macroevolution; Ostracoda; phylogenomics; sexual selection.]
Affiliation(s)
- Emily A Ellis
  - Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- Jessica A Goodheart
  - Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA 92037, USA
- Nicholai M Hensley
  - Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
  - Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14850, USA
- Vanessa L González
  - Department of Invertebrate Zoology, Smithsonian Institution, National Museum of Natural History, 10th and Constitution NW, Washington, DC 20560-0105, USA
- Nicholas J Reda
  - Biology Department, University of Wisconsin–La Crosse, La Crosse, WI 54601, USA
- Trevor J Rivers
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
- James G Morin
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14850, USA
- Elizabeth Torres
  - Department of Biological Sciences, California State University Los Angeles, Los Angeles, CA 90032, USA
- Gretchen A Gerrish
  - Biology Department, University of Wisconsin–La Crosse, La Crosse, WI 54601, USA
  - Trout Lake Station, Center for Limnology, University of Wisconsin–Madison, Boulder Junction, WI 54512, USA
- Todd H Oakley
  - Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
|
13
|
Fleming JF, Valero-Gracia A, Struck TH. Identifying and addressing methodological incongruence in phylogenomics: A review. Evol Appl 2023; 16:1087-1104. [PMID: 37360032 PMCID: PMC10286231 DOI: 10.1111/eva.13565] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/05/2022] [Revised: 04/07/2023] [Accepted: 05/17/2023] [Indexed: 06/28/2023]
Abstract
The availability of phylogenetic data has greatly expanded in recent years. As a result, a new era in phylogenetic analysis is dawning, one in which the methods we use to analyse and assess our data are the bottleneck to producing valuable phylogenetic hypotheses, rather than the need to acquire more data. This makes the ability to accurately appraise and evaluate new methods of phylogenetic analysis and phylogenetic artefact identification more important than ever. Incongruence in phylogenetic reconstructions based on different datasets may be due to two major sources: biological and methodological. Biological sources comprise processes like horizontal gene transfer, hybridization and incomplete lineage sorting, while methodological ones contain falsely assigned data or violations of the assumptions of the underlying model. While the former provides interesting insights into the evolutionary history of the investigated groups, the latter should be avoided or minimized as best as possible. However, errors introduced by methodology must first be excluded or minimized to be able to conclude that biological sources are the cause. Fortunately, a variety of useful tools exist to help detect such misassignments and model violations and to apply ameliorating measures. Still, the number of methods and their theoretical underpinning can be overwhelming and opaque. Here, we present a practical and comprehensive review of recent developments in techniques to detect artefacts arising from model violations and poorly assigned data. The advantages and disadvantages of the different methods to detect such misleading signals in phylogenetic reconstructions are also discussed. As there is no one-size-fits-all solution, this review can serve as a guide in choosing the most appropriate detection methods depending on both the actual dataset and the computational power available to the researcher.
Ultimately, this informed selection will have a positive impact on the broader field, allowing us to better understand the evolutionary history of the group of interest.
|
14
|
Ragionieri L, Zúñiga-Reinoso Á, Bläser M, Predel R. Phylogenomics of darkling beetles (Coleoptera: Tenebrionidae) from the Atacama Desert. PeerJ 2023; 11:e14848. [PMID: 36855434 PMCID: PMC9968461 DOI: 10.7717/peerj.14848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 01/12/2023] [Indexed: 02/25/2023]
Abstract
Background: Tenebrionidae (Insecta: Coleoptera) are a conspicuous component of desert fauna worldwide. In these ecosystems, they are significantly responsible for nutrient cycling and show remarkable morphological and physiological adaptations. Nevertheless, Tenebrionidae colonizing individual deserts have repeatedly emerged from different lineages. The goal of our study was to gain insights into the phylogenetic relationships of the tenebrionid genera from the Atacama Desert and how these taxa are related to the globally distributed Tenebrionidae.
Methods: We used newly generated transcriptome data (47 tribes, 7 of 11 subfamilies) that allowed for a comprehensive phylogenomic analysis of the tenebrionid fauna of this hyperarid desert and fills a gap in our knowledge of the highly diversified Tenebrionidae. We examined two independent data sets known to be suitable for phylogenomic reconstructions. One is based on 35 neuropeptide precursors, the other on 1,742 orthologous genes shared among Coleoptera.
Results: The majority of Atacama genera are placed into three groups, two of which belong to typical South American lineages within the Pimeliinae. While the data support the monophyly of the Physogasterini, Nycteliini and Scotobiini, this does not hold for the Atacama genera of Edrotini, Epitragini, Evaniosomini, Praociini, Stenosini, Thinobatini, and Trilobocarini. A suggested very close relationship of Psammetichus with the Mediterranean Leptoderis also could not be confirmed. We also provide hints regarding the phylogenetic relationships of the Caenocrypticini, which occur both in South America and southern Africa. Apart from the focus on the Tenebrionidae from the Atacama Desert, we found a striking synapomorphy grouping Alleculinae, Blaptinae, Diaperinae, Stenochinae, and several taxa of Tenebrioninae, but not Tenebrio and Tribolium. This character, an insertion in the myosuppressin gene, defines a higher-level monophyletic group within the Tenebrionidae.
Conclusion: Transcriptome data allow a comprehensive phylogenomic analysis of the tenebrionid fauna of the Atacama Desert, which represents one of the seven major endemic tribal areas in the world for Tenebrionidae. Most Atacama genera could be placed in three lineages typical of South America; monophyly is not supported for several tribes based on molecular data, suggesting that a detailed systematic revision of several groups is necessary.
Affiliation(s)
- Lapo Ragionieri
  - University of Cologne, Institute of Zoology, Cologne, Germany
- Marcel Bläser
  - University of Cologne, Institute of Zoology, Cologne, Germany
- Reinhard Predel
  - University of Cologne, Institute of Zoology, Cologne, Germany
|
15
|
Fleming JF. The wealth of shared resources: Improving molecular taxonomy using eDNA and public databases. Zool Scr 2023. [DOI: 10.1111/zsc.12591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2023]
|
16
|
Background Filtering of Clinical Metagenomic Sequencing with a Library Concentration-Normalized Model. Microbiol Spectr 2022; 10:e0177922. [PMID: 36135379 PMCID: PMC9603461 DOI: 10.1128/spectrum.01779-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022]
Abstract
Metagenomic next-generation sequencing (mNGS) can accurately detect pathogens in clinical samples. However, wet-lab contamination constrains mNGS analysis and may result in erroneous interpretation of results. Many existing methods rely on large-scale observational microbiome studies and may not be applicable to clinical mNGS tests. By generation of a pretrained profile of common laboratory contaminants, we developed an mNGS noise-filtering model based on the inverse linear relationship between microbial sequencing reads and sample library concentration, named the background elimination and correction by library concentration-normalized (BECLEAN) model. Its efficacy was evaluated with bacteria- and yeast-spiked samples and 28 cerebrospinal fluid (CSF) specimens. The diagnostic accuracy, precision, sensitivity, and specificity of BECLEAN with reference to conventional methods and diagnosis were 92.9%, 86.7%, 100%, and 86.7%, respectively. BECLEAN led to a dramatic reduction of background noise without affecting the true-positive rate and thus can provide a time-saving and convenient tool in various clinical settings. IMPORTANCE Most of the existing methods to remove wet-lab contamination rely on large-scale observational microbiome studies and may not be applicable to clinical mNGS testing in individual cases. In clinical settings, only a handful of samples might be sequenced in a run. The lab-specific microbiome can complicate existing statistical approaches for removing contamination from small-scale clinical metagenomic sequencing data sets; thus, use of a preliminary lab-specific training set is necessary. Our study provides a rapid and accurate background-filtering tool for clinical metagenomic sequencing by generation of a pretrained profile of common laboratory contaminants. 
Notably, our work demonstrates that the inverse linear relationship between microbial sequencing reads and library concentration can serve to identify true contaminants and evaluate the relative abundance of a taxon in samples by comparing the observed microbial reads to the model-predicted value. Our findings extend the previously published research and demonstrate confirmatory results in clinical settings.
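The inverse relationship described in this abstract can be illustrated with a small Python sketch. This is not the BECLEAN implementation: the functional form reads = a + b / concentration, the function names, the 3-fold cutoff, and the training numbers are all assumptions chosen for illustration. The sketch fits the line by ordinary least squares on a hypothetical set of contaminant-only runs, then compares an observed read count with the fitted background.

```python
from statistics import mean


def fit_inverse_linear(concentrations, reads):
    """Least-squares fit of reads = a + b * (1 / concentration) (assumed form)."""
    x = [1.0 / c for c in concentrations]
    x_bar, y_bar = mean(x), mean(reads)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, reads))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predicted_background(a, b, concentration):
    """Background reads expected for a contaminant at this library concentration."""
    return a + b / concentration


def exceeds_background(observed, a, b, concentration, fold=3.0):
    """True when observed reads exceed the fitted background by `fold` (hypothetical cutoff)."""
    return observed > fold * predicted_background(a, b, concentration)


# Hypothetical training runs: contaminant reads fall as library concentration rises.
train_conc = [1.0, 2.0, 4.0, 8.0]
train_reads = [100.0, 50.0, 25.0, 12.5]
a, b = fit_inverse_linear(train_conc, train_reads)

# A taxon with 200 reads at concentration 5.0 stands well above the fitted
# background; one with 30 reads does not.
print(exceeds_background(200.0, a, b, 5.0), exceeds_background(30.0, a, b, 5.0))
```

In practice the cutoff and the training profile would have to be calibrated per laboratory, as the abstract notes that the background microbiome is lab-specific.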
|
17
|
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022]
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
  - Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
  - Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig marítim de la Barcelona 37-49, 08003 Barcelona, Spain
|
18
|
Owen CL, Marshall DC, Wade EJ, Meister R, Goemans G, Kunte K, Moulds M, Hill K, Villet M, Pham TH, Kortyna M, Lemmon EM, Lemmon AR, Simon C. Detecting and removing sample contamination in phylogenomic data: an example and its implications for Cicadidae phylogeny (Insecta: Hemiptera). Syst Biol 2022; 71:1504-1523. [PMID: 35708660 DOI: 10.1093/sysbio/syac043] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/08/2021] [Revised: 05/23/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022]
Abstract
Contamination of a genetic sample with DNA from one or more non-target species is a continuing concern of molecular phylogenetic studies, both Sanger sequencing studies and Next-Generation Sequencing (NGS) studies. We developed an automated pipeline for identifying and excluding likely cross-contaminated loci based on detection of bimodal distributions of patristic distances across gene trees. When the contamination occurs between samples within a dataset, comparisons between a contaminated sample and its contaminant taxon will yield bimodal distributions with one peak close to zero patristic distance. This new method does not rely on a priori knowledge of taxon relatedness nor does it determine the causes(s) of the contamination. Exclusion of putatively contaminated loci from a dataset generated for the insect family Cicadidae showed that these sequences were affecting some topological patterns and branch supports, although the effects were sometimes subtle, with some contamination-influenced relationships exhibiting strong bootstrap support. Long tip branches and outlier values for one anchored phylogenomic pipeline statistic (AvgNHomologs) were correlated with the presence of contamination. While the AHE markers used here, which target hemipteroid taxa, proved effective in resolving deep and shallow level Cicadidae relationships in aggregate, individual markers contained inadequate phylogenetic signal, in part probably due to short length. The cleaned dataset, consisting of 429 loci, from 90 genera representing 44 of 56 current Cicadidae tribes, supported three of the four sampled Cicadidae subfamilies in concatenated-matrix maximum likelihood (ML) and multispecies coalescent-based species tree analyses, with the fourth subfamily weakly supported in the ML trees. No well-supported patterns from previous family-level Sanger sequencing studies of Cicadidae phylogeny were contradicted. 
One taxon (Aragualna plenalinea) did not fall with its current subfamily in the genetic tree, and this genus and its tribe Aragualnini is reclassified to Tibicininae following morphological re-examination. Only subtle differences were observed in trees after removal of loci for which divergent base frequencies were detected. Greater success may be achieved by increased taxon sampling and developing a probe set targeting a more recent common ancestor and longer loci. Searches for contamination are an essential step in phylogenomic analyses of all kinds and our pipeline is an effective solution.
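The core signal described in this abstract, a second peak of patristic distances close to zero for a contaminated sample pair, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: a simple widest-gap heuristic stands in for their bimodality detection, and the function name, the thresholds `near_zero` and `min_gap`, and the example distances are hypothetical.

```python
def flag_contaminated_loci(distances, near_zero=0.01, min_gap=0.05):
    """Flag loci whose patristic distance for one sample pair sits in a near-zero cluster.

    distances: dict mapping locus name -> patristic distance between the two
    samples in that locus's gene tree. Thresholds are illustrative assumptions.
    """
    ordered = sorted(distances.items(), key=lambda kv: kv[1])
    if len(ordered) < 2:
        return []
    # Locate the widest gap between consecutive sorted distances.
    gaps = [(ordered[i + 1][1] - ordered[i][1], i) for i in range(len(ordered) - 1)]
    widest, idx = max(gaps)
    low_cluster = ordered[: idx + 1]
    # Treat the data as two modes only if the gap is wide and the lower
    # cluster lies entirely near zero.
    if widest >= min_gap and low_cluster[-1][1] <= near_zero:
        return sorted(locus for locus, _ in low_cluster)
    return []


# Hypothetical pair of samples: two loci are nearly identical, the rest diverge.
pair = {"L1": 0.30, "L2": 0.002, "L3": 0.28, "L4": 0.0, "L5": 0.33}
flagged = flag_contaminated_loci(pair)  # loci L2 and L4 form the near-zero cluster

# A pair with a single mode of distances yields no flagged loci.
clean = flag_contaminated_loci({"A": 0.30, "B": 0.28, "C": 0.33})
```

A heuristic like this needs no prior knowledge of how the two taxa are related, which matches the property the abstract emphasises, but closely related species with genuinely short distances would require a stricter test than a fixed gap.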
Affiliation(s)
- Christopher L Owen
  - Systematic Entomology Laboratory, USDA-ARS, c/o National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- David C Marshall
  - Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- Elizabeth J Wade
  - Dept. of Natural Science and Mathematics, Curry College, Milton, MA 02186, USA
- Russ Meister
  - Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- Geert Goemans
  - Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- Krushnamegh Kunte
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560 065, India
- Max Moulds
  - Australian Museum Research Institute, 1 William Street, Sydney, N.S.W. 2010, Australia
- Kathy Hill
  - Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- M Villet
  - Dept. of Biology, Rhodes University, Grahamstown 6140, South Africa
- Thai-Hong Pham
  - Mientrung Institute for Scientific Research, Vietnam Academy of Science and Technology, Hue, Vietnam
  - Vietnam National Museum of Nature and Graduate School of Science and Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Michelle Kortyna
  - Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, USA
- Emily Moriarty Lemmon
  - Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, FL 32306, USA
- Alan R Lemmon
  - Department of Scientific Computing, Florida State University, 400 Dirac Science Library, Tallahassee, FL 32306, USA
- Chris Simon
  - Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
|
19
|
A modified protocol with less clean-up steps increased efficiency and product yield of sequencing library preparation. 3 Biotech 2022; 12:111. [PMID: 35462954 PMCID: PMC8995211 DOI: 10.1007/s13205-022-03168-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2021] [Accepted: 03/19/2022] [Indexed: 11/01/2022]
Abstract
Library preparation is an essential step for next-generation sequencing, such as whole-genome sequencing, reduced-representation genome sequencing, exome sequencing and transcriptome sequencing. Library preparation often involves many steps, including DNA fragmentation, end repair, ligation and amplification. Each step involves different enzymes and buffer systems, so many washing steps are implemented in between to clean up the enzymes and solutes from the previous step. Those extra washing steps are not only tedious and costly but, more importantly, may introduce cross-contamination and reduce the final library yield. Here, we modified the common protocol of Illumina library prep to reduce the washing steps by deactivating the enzymes with high temperature. The modified protocol has two fewer washing steps than the original one, which can save more than 40 min of hands-on time and reduce the potential risk of cross-contamination. We compared our protocol with the original one by constructing libraries using 200 ng DNA of Tetraodon nigroviridis. The results showed that libraries prepared with the modified protocol had higher yields than those prepared with the original protocol (53.4 ± 16.8 ng/ml vs. 8 ± 0.7 ng/ml), whereas the coverage and PCR duplication rate were similar. Furthermore, we eliminated the very first washing step after DNA shearing to preserve short DNA fragments, which increased the proportion of DNA fragments shorter than 100 bp from 0.82 to 2.99%. In conclusion, using the modified protocols can not only save time and money but also generate higher yield and keep more short DNA fragments. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-022-03168-5.
|
20
|
Krug PJ, Caplins SA, Algoso K, Thomas K, Valdés ÁA, Wade R, Wong NLWS, Eernisse DJ, Kocot KM. Phylogenomic resolution of the root of Panpulmonata, a hyperdiverse radiation of gastropods: new insight into the evolution of air breathing. Proc Biol Sci 2022; 289:20211855. [PMID: 35382597 PMCID: PMC8984808 DOI: 10.1098/rspb.2021.1855] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/24/2021] [Accepted: 02/21/2022] [Indexed: 11/12/2022]
Abstract
Transitions to terrestriality have been associated with major animal radiations including land snails and slugs in Stylommatophora (>20 000 described species), the most successful lineage of 'pulmonates' (a non-monophyletic assemblage of air-breathing gastropods). However, phylogenomic studies have failed to robustly resolve relationships among traditional pulmonates and affiliated marine lineages that comprise clade Panpulmonata (Mollusca, Gastropoda), especially two key taxa: Sacoglossa, a group including photosynthetic sea slugs, and Siphonarioidea, intertidal limpet-like snails with a non-contractile pneumostome (narrow opening to a vascularized pallial cavity). To clarify the evolutionary history of the panpulmonate radiation, we performed phylogenomic analyses on datasets of up to 1160 nuclear protein-coding genes for 110 gastropods, including 40 new transcriptomes for Sacoglossa and Siphonarioidea. All 18 analyses recovered Sacoglossa as the sister group to a clade we named Pneumopulmonata, within which Siphonarioidea was sister to the remaining lineages in most analyses. Comparative modelling indicated shifts to marginal habitat (estuarine, mangrove and intertidal zones) preceded and accelerated the evolution of a pneumostome, present in the pneumopulmonate ancestor along with a one-sided plicate gill. These findings highlight key intermediate stages in the evolution of air-breathing snails, supporting the hypothesis that adaptation to marginal zones played an important role in major sea-to-land transitions.
Affiliation(s)
- Patrick J. Krug
  - Department of Biological Sciences, California State University, Los Angeles, CA 90032-8201, USA
- Krisha Algoso
  - Department of Biological Sciences, California State University, Los Angeles, CA 90032-8201, USA
- Kanique Thomas
  - Department of Biological Sciences, California State University, Los Angeles, CA 90032-8201, USA
- Ángel A. Valdés
  - Department of Biological Sciences, California State Polytechnic University, Pomona, CA 91768, USA
- Rachael Wade
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada V6T 1Z4
- Nur Leena W. S. Wong
  - International Institute of Aquaculture and Aquatic Sciences, Universiti Putra Malaysia, Selangor, Malaysia
- Douglas J. Eernisse
  - Department of Biological Science, California State University, Fullerton, CA 92834, USA
- Kevin M. Kocot
  - Department of Biological Sciences and Alabama Museum of Natural History, The University of Alabama, Tuscaloosa, AL 35487, USA
|
21
|
Mongiardino Koch N, Thompson JR, Hiley AS, McCowin MF, Armstrong AF, Coppard SE, Aguilera F, Bronstein O, Kroh A, Mooi R, Rouse GW. Phylogenomic analyses of echinoid diversification prompt a re-evaluation of their fossil record. eLife 2022; 11:72460. [PMID: 35315317 PMCID: PMC8940180 DOI: 10.7554/elife.72460] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/24/2021] [Accepted: 03/03/2022] [Indexed: 12/25/2022]
Abstract
Echinoids are key components of modern marine ecosystems. Despite a remarkable fossil record, the emergence of their crown group is documented by few specimens of unclear affinities, rendering their early history uncertain. The origin of sand dollars, one of its most distinctive clades, is also unclear due to an unstable phylogenetic context. We employ 18 novel genomes and transcriptomes to build a phylogenomic dataset with a near-complete sampling of major lineages. With it, we revise the phylogeny and divergence times of echinoids, and place their history within the broader context of echinoderm evolution. We also introduce the concept of a chronospace - a multidimensional representation of node ages - and use it to explore methodological decisions involved in time calibrating phylogenies. We find the choice of clock model to have the strongest impact on divergence times, while the use of site-heterogeneous models and alternative node prior distributions show minimal effects. The choice of loci has an intermediate impact, affecting mostly deep Paleozoic nodes, for which clock-like genes recover dates more congruent with fossil evidence. Our results reveal that crown group echinoids originated in the Permian and diversified rapidly in the Triassic, despite the relative lack of fossil evidence for this early diversification. We also clarify the relationships between sand dollars and their close relatives and confidently date their origins to the Cretaceous, implying ghost ranges spanning approximately 50 million years, a remarkable discrepancy with their rich fossil record.
Affiliation(s)
- Nicolás Mongiardino Koch
  - Department of Earth & Planetary Sciences, Yale University, New Haven, United States
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
- Jeffrey R Thompson
  - Department of Earth Sciences, Natural History Museum, London, United Kingdom
  - University College London Center for Life's Origins and Evolution, London, United Kingdom
- Avery S Hiley
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
- Marina F McCowin
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
- A Frances Armstrong
  - Department of Invertebrate Zoology and Geology, California Academy of Sciences, San Francisco, United States
- Simon E Coppard
  - Bader International Study Centre, Queen's University, Herstmonceux Castle, East Sussex, United Kingdom
- Felipe Aguilera
  - Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
- Omri Bronstein
  - School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
  - Steinhardt Museum of Natural History, Tel-Aviv, Israel
- Andreas Kroh
  - Department of Geology and Palaeontology, Natural History Museum Vienna, Vienna, Austria
- Rich Mooi
  - Department of Invertebrate Zoology and Geology, California Academy of Sciences, San Francisco, United States
- Greg W Rouse
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
|
22
|
Lizano AM, Smolina I, Choquet M, Kopp M, Hoarau G. Insights into the species evolution of Calanus copepods in the northern seas revealed by de novo transcriptome sequencing. Ecol Evol 2022; 12:e8606. [PMID: 35228861 PMCID: PMC8861592 DOI: 10.1002/ece3.8606] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/27/2021] [Revised: 01/14/2022] [Accepted: 01/19/2022] [Indexed: 01/07/2023]
Abstract
Copepods of the zooplankton genus Calanus play a key role in marine ecosystems in the northern seas. Although being among the most studied organisms on Earth, due to their ecological importance, genomic resources for Calanus spp. remain scarce, mostly due to their large genome size (from 6 to 12 Gbps). As an alternative to whole-genome sequencing in Calanus spp., we sequenced and de novo assembled transcriptomes of five Calanus species: Calanus glacialis, C. hyperboreus, C. marshallae, C. pacificus, and C. helgolandicus. Functional assignment of protein families based on clusters of orthologous genes (COG) and gene ontology (GO) annotations showed analogous patterns of protein functions across species. Phylogenetic analyses using maximum likelihood (ML) of 191 protein-coding genes mined from RNA-seq data fully resolved evolutionary relationships among seven Calanus species investigated (five species sequenced for this study and two species with published datasets), with gene and site concordance factors showing that 109 out of 191 protein-coding genes support a separation between three groups: the C. finmarchicus group (including C. finmarchicus, C. glacialis, and C. marshallae), the C. helgolandicus group (including C. helgolandicus, C. sinicus, and C. pacificus) and the monophyletic C. hyperboreus group. The tree topology obtained in ML analyses was similar to a previously proposed phylogeny based on morphological criteria and cleared certain ambiguities from past studies on evolutionary relationships among Calanus species.
Affiliation(s)
- Irina Smolina
  - Faculty of Biosciences and Aquaculture, Nord University, Bodø, Norway
- Marvin Choquet
  - Faculty of Biosciences and Aquaculture, Nord University, Bodø, Norway
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Martina Kopp
  - Faculty of Biosciences and Aquaculture, Nord University, Bodø, Norway
- Galice Hoarau
  - Faculty of Biosciences and Aquaculture, Nord University, Bodø, Norway
|
23
|
Ahmed M, Roberts NG, Adediran F, Smythe AB, Kocot KM, Holovachov O. Phylogenomic Analysis of the Phylum Nematoda: Conflicts and Congruences With Morphology, 18S rRNA, and Mitogenomes. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2021.769565] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/23/2022] Open
Abstract
Phylogenetic relationships within many lineages of the phylum Nematoda remain unresolved, despite numerous morphology-based and molecular analyses. We performed several phylogenomic analyses using 286 published genomes and transcriptomes and 19 new transcriptomes by focusing on Trichinellida, Spirurina, Rhabditina, and Tylenchina separately, and by analyzing a selection of species from the whole phylum Nematoda. The phylogeny of Trichinellida supported the division of Trichinella into encapsulated and non-encapsulated species and placed them as sister to Trichuris. The Spirurina subtree supported the clades formed by species from Ascaridomorpha and Spiruromorpha respectively, but did not support Dracunculoidea. The analysis of Tylenchina supported a clade that included all sampled species from Tylenchomorpha and placed it as sister to clades that included sampled species from Cephalobomorpha and Panagrolaimomorpha, supporting the hypothesis that postulates the single origin of the stomatostylet. The Rhabditina subtree placed a clade composed of all sampled species from Diplogastridae as sister to a lineage consisting of paraphyletic Rhabditidae, a single representative of Heterorhabditidae and a clade composed of sampled species belonging to Strongylida. It also strongly supported all suborders within Strongylida. In the phylum-wide analysis, a clade composed of all sampled species belonging to Enoplia was consistently placed as sister to Dorylaimia + Chromadoria. The topology of the Nematoda backbone was consistent with previous studies, including polyphyletic placement of sampled representatives of Monhysterida and Araeolaimida.
|
24
|
Su L, Guo S, Guo W, Ji X, Liu Y, Zhang H, Huang Q, Zhou K, Guo X, Gu X, Xing J. mitoDataclean: A machine learning approach for the accurate identification of cross-contamination-derived tumor mitochondrial DNA mutations. Int J Cancer 2022; 150:1677-1689. [PMID: 35001369 DOI: 10.1002/ijc.33927] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/25/2021] [Revised: 12/03/2021] [Accepted: 12/22/2021] [Indexed: 11/06/2022]
Abstract
Next-generation sequencing (NGS) of mitochondrial DNA (mtDNA) has widespread applications in aging and cancer studies. However, cross-contamination of mtDNA constitutes a major concern. Previous methods for the detection of mtDNA contamination mainly focus on haplogroup-level phylogeny, but neglect haplotype-level differences, leading to limited sensitivity and accuracy. In this study, we present mitoDataclean, a random-forest-based machine learning package for accurate identification of cross-contamination, evaluation of contamination levels and detection of contamination-derived variants in mtDNA NGS data. Comprehensive optimization of mitoDataclean revealed that training simulation with mixtures of small haplogroup distance and low polymorphic difference was critical for optimal modeling. Compared with existing methods, mitoDataclean exhibited significantly improved sensitivity and accuracy for the detection of sample contamination in simulated data. In addition, mitoDataclean achieved area under the curve values of 0.91 and 0.97 for discerning genuine and contamination-derived mtDNA variants in a simulated Western dataset and private sequencing contamination data, respectively, suggesting that this tool may be applicable for different populations and samples with different sources of contamination. Finally, mitoDataclean was further evaluated in several private and public datasets and showed a robust ability for contamination detection. Altogether, our study demonstrates that mitoDataclean may be used for accurate detection of contaminated samples and contamination-derived variants in mtDNA NGS data.
Affiliation(s)
- Liping Su
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Shanshan Guo
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Wenjie Guo
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Xiaoying Ji
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Yang Liu
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Huanqin Zhang
  - College of Medical Technology, Shaanxi University of Chinese Medicine, Xianyang, China
- Qichao Huang
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Kaixiang Zhou
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Xu Guo
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
- Xiwen Gu
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, China
- Jinliang Xing
  - State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, China
|
25
|
Jurasz H, Pawłowski T, Perlejewski K. Contamination Issue in Viral Metagenomics: Problems, Solutions, and Clinical Perspectives. Front Microbiol 2021; 12:745076. [PMID: 34745046 PMCID: PMC8564396 DOI: 10.3389/fmicb.2021.745076] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 07/22/2021] [Accepted: 09/17/2021] [Indexed: 12/16/2022] Open
Abstract
We describe the most common internal and external sources and types of contamination encountered in viral metagenomic studies and discuss their negative impact on sequencing results, particularly for low-biomass samples and clinical applications. We also propose some basic recommendations for reducing the background noise in viral shotgun metagenomic (SM) studies, which would limit the bias introduced by various classes of contaminants. Regardless of the specific viral SM protocol, contamination cannot be totally avoided; in particular, the issue of reagent contamination should always be addressed with high priority. There is an urgent need for the development and validation of standards for viral metagenomic studies especially if viral SM protocols will be more widely applied in diagnostics.
Affiliation(s)
- Henryk Jurasz
  - Department of Immunopathology of Infectious and Parasitic Diseases, Medical University of Warsaw, Warsaw, Poland
- Tomasz Pawłowski
  - Division of Psychotherapy and Psychosomatic Medicine, Department of Psychiatry, Wrocław Medical University, Wrocław, Poland
- Karol Perlejewski
  - Department of Immunopathology of Infectious and Parasitic Diseases, Medical University of Warsaw, Warsaw, Poland
|
26
|
Wang Y, Yuan H, Huang J, Li C. Inline index helped in cleaning up data contamination generated during library preparation and the subsequent steps. Mol Biol Rep 2021; 49:385-392. [PMID: 34716505 DOI: 10.1007/s11033-021-06884-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 09/23/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND: High-throughput sequencing involves library preparation and amplification steps, which may induce contamination across samples or between samples and the environment. METHODS: We tested the effect of applying an inline-index strategy, in which DNA indices of 6 bp were added to both ends of the inserts at the ligation step of library prep for resolving the data contamination problem. RESULTS: Our results showed that the contamination ranged from 0.29 to 1.25% in one experiment and from 0.83 to 27.01% in the other. We also found that contamination could be environmental or from reagents besides cross-contamination between samples. CONCLUSIONS: Inline-index method is a useful experimental design to clean up the data and address the contamination problem which has been plaguing high-throughput sequencing data in many applications.
Affiliation(s)
- Ying Wang
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China
  - Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
- Hao Yuan
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China
  - Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
- Junman Huang
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China
  - Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
- Chenhong Li
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China
  - Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
|
27
|
Simion P, Narayan J, Houtain A, Derzelle A, Baudry L, Nicolas E, Arora R, Cariou M, Cruaud C, Gaudray FR, Gilbert C, Guiglielmoni N, Hespeels B, Kozlowski DKL, Labadie K, Limasset A, Llirós M, Marbouty M, Terwagne M, Virgo J, Cordaux R, Danchin EGJ, Hallet B, Koszul R, Lenormand T, Flot JF, Van Doninck K. Chromosome-level genome assembly reveals homologous chromosomes and recombination in asexual rotifer Adineta vaga. Science Advances 2021; 7:eabg4216. [PMID: 34613768 PMCID: PMC8494291 DOI: 10.1126/sciadv.abg4216] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Grants] [Indexed: 05/08/2023]
Abstract
Bdelloid rotifers are notorious as a speciose ancient clade comprising only asexual lineages. Thanks to their ability to repair highly fragmented DNA, most bdelloid species also withstand complete desiccation and ionizing radiation. Producing a well-assembled reference genome is a critical step to developing an understanding of the effects of long-term asexuality and DNA breakage on genome evolution. To this end, we present the first high-quality chromosome-level genome assemblies for the bdelloid Adineta vaga, composed of six pairs of homologous (diploid) chromosomes with a footprint of paleotetraploidy. The observed large-scale losses of heterozygosity are signatures of recombination between homologous chromosomes, either during mitotic DNA double-strand break repair or when resolving programmed DNA breaks during a modified meiosis. Dynamic subtelomeric regions harbor more structural diversity (e.g., chromosome rearrangements, transposable elements, and haplotypic divergence). Our results trigger the reappraisal of potential meiotic processes in bdelloid rotifers and help unravel the factors underlying their long-term asexual evolutionary success.
Affiliation(s)
- Paul Simion
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
  - Corresponding author. (K.V.D.); (J.-F.F.); (P.S.)
- Jitendra Narayan
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Antoine Houtain
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Alessandro Derzelle
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Lyam Baudry
  - Institut Pasteur, Unité Régulation Spatiale des Génomes, UMR 3525, CNRS, Paris F-75015, France
  - Collège Doctoral, Sorbonne Université, F-75005 Paris, France
- Emilien Nicolas
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
  - Molecular Biology and Evolution, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
- Rohan Arora
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
  - Molecular Biology and Evolution, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
- Marie Cariou
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
  - CIRI, Centre International de Recherche en Infectiologie, Univ Lyon, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, ENS de Lyon, F-69007 Lyon, France
- Corinne Cruaud
  - Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
- Clément Gilbert
  - Évolution, Génomes, Comportement et Écologie, Université Paris-Saclay, CNRS, IRD, UMR, 91198 Gif-sur-Yvette, France
- Nadège Guiglielmoni
  - Evolutionary Biology and Ecology, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
- Boris Hespeels
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Djampa K. L. Kozlowski
  - INRAE, Université Côte-d’Azur, CNRS, Institut Sophia Agrobiotech, Sophia Antipolis 06903, France
- Karine Labadie
  - Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
- Antoine Limasset
  - Université de Lille, CNRS, UMR 9189 - CRIStAL, 59655 Villeneuve-d’Ascq, France
- Marc Llirós
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
  - Institut d’Investigació Biomédica de Girona, Malalties Digestives i Microbiota, 17190 Salt, Spain
- Martial Marbouty
  - Institut Pasteur, Unité Régulation Spatiale des Génomes, UMR 3525, CNRS, Paris F-75015, France
- Matthieu Terwagne
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Julie Virgo
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Richard Cordaux
  - Ecologie et Biologie des interactions, Université de Poitiers, UMR CNRS 7267, 5 rue Albert Turpain, 86073 Poitiers, France
- Etienne G. J. Danchin
  - INRAE, Université Côte-d’Azur, CNRS, Institut Sophia Agrobiotech, Sophia Antipolis 06903, France
- Bernard Hallet
  - LIBST, Université Catholique de Louvain (UCLouvain), Croix du Sud 4/5, Louvain-la-Neuve 1348, Belgium
- Romain Koszul
  - Institut Pasteur, Unité Régulation Spatiale des Génomes, UMR 3525, CNRS, Paris F-75015, France
- Thomas Lenormand
  - CEFE, Univ Montpellier, CNRS, Univ Paul Valéry Montpellier 3, EPHE, IRD, Montpellier, France
- Jean-Francois Flot
  - Evolutionary Biology and Ecology, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels - (IB), Brussels 1050, Belgium
  - Corresponding author. (K.V.D.); (J.-F.F.); (P.S.)
- Karine Van Doninck
  - Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
  - Molecular Biology and Evolution, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
  - Corresponding author. (K.V.D.); (J.-F.F.); (P.S.)
|
28
|
Neuropeptide repertoire and 3D anatomy of the ctenophore nervous system. Curr Biol 2021; 31:5274-5285.e6. [PMID: 34587474 DOI: 10.1016/j.cub.2021.09.005] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 05/09/2021] [Revised: 07/20/2021] [Accepted: 09/02/2021] [Indexed: 11/24/2022]
Abstract
Ctenophores are gelatinous marine animals famous for locomotion by ciliary combs. Due to the uncertainties of the phylogenetic placement of ctenophores and the absence of some key bilaterian neuronal genes, it has been hypothesized that their neurons evolved independently. Additionally, recent whole-body, single-cell RNA sequencing (scRNA-seq) analysis failed to identify ctenophore neurons using any of the known neuronal molecular markers. To reveal the molecular machinery of ctenophore neurons, we have characterized the neuropeptide repertoire of the ctenophore Mnemiopsis leidyi. Using the machine learning NeuroPID tool, we predicted 129 new putative neuropeptide precursors. Sixteen of them were localized to the subepithelial nerve net (SNN), sensory aboral organ (AO), and epithelial sensory cells (ESCs), providing evidence that they are neuropeptide precursors. Four of these putative neuropeptides had a behavioral effect and increased the animals' swimming speed. Intriguingly, these putative neuropeptides finally allowed us to identify neuronal cell types in single-cell transcriptomic data and reveal the molecular identity of ctenophore neurons. High-resolution electron microscopy and 3D reconstructions of the nerve net underlying the comb plates confirmed a more than 100-year-old hypothesis of anastomoses between neurites of the same cell in ctenophores and revealed that they occur through a continuous membrane. Our work demonstrates the unique ultrastructure of the peptidergic nerve net and a rich neuropeptide repertoire of ctenophores, supporting the hypothesis that the first nervous system(s) evolved as nets of peptidergic cells.
|
29
|
Emelianova K, Martínez Martínez A, Campos-Dominguez L, Kidner C. Multi-tissue transcriptome analysis of two Begonia species reveals dynamic patterns of evolution in the chalcone synthase gene family. Sci Rep 2021; 11:17773. [PMID: 34493743 PMCID: PMC8423730 DOI: 10.1038/s41598-021-96854-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/08/2021] [Accepted: 08/17/2021] [Indexed: 02/07/2023] Open
Abstract
Begonia is an important horticultural plant group, as well as one of the most speciose Angiosperm genera, with over 2000 described species. Genus wide studies of genome size have shown that Begonia has a highly variable genome size, and analysis of paralog pairs has previously suggested that Begonia underwent a whole genome duplication. We address the contribution of gene duplication to the generation of diversity in Begonia using a multi-tissue RNA-seq approach. We chose to focus on chalcone synthase (CHS), a gene family having been shown to be involved in biotic and abiotic stress responses in other plant species, in particular its importance in maximising the use of variable light levels in tropical plants. We used RNA-seq to sample six tissues across two closely related but ecologically and morphologically divergent species, Begonia conchifolia and B. plebeja, yielding 17,012 and 19,969 annotated unigenes respectively. We identified the chalcone synthase gene family members in our Begonia study species, as well as in Hillebrandia sandwicensis, the monotypic sister genus to Begonia, Cucumis sativus, Arabidopsis thaliana, and Zea mays. Phylogenetic analysis suggested the CHS gene family has high duplicate turnover, all members of CHS identified in Begonia arising recently, after the divergence of Begonia and Cucumis. Expression profiles were similar within orthologous pairs, but we saw high inter-ortholog expression variation. Sequence analysis showed relaxed selective constraints on some ortholog pairs, with substitutions at conserved sites. Evidence of pseudogenisation and species specific duplication indicate that lineage specific differences are already beginning to accumulate since the divergence of our study species. We conclude that there is evidence for a role of gene duplication in generating diversity through sequence and expression divergence in Begonia.
Affiliation(s)
- Katie Emelianova
  - Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK (GRID: grid.426106.7; ISNI: 0000 0004 0598 2103)
  - Dementia Research Institute at the University of Edinburgh, Edinburgh, UK (GRID: grid.4305.2; ISNI: 0000 0004 1936 7988)
- Andrea Martínez Martínez
  - Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK (GRID: grid.426106.7; ISNI: 0000 0004 0598 2103)
  - School of Biological Sciences, University of Edinburgh, King’s Buildings, Mayfield Rd, Edinburgh, EH9 3JU, UK (GRID: grid.4305.2; ISNI: 0000 0004 1936 7988)
- Lucia Campos-Dominguez
  - Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK (GRID: grid.426106.7; ISNI: 0000 0004 0598 2103)
  - School of Biological Sciences, University of Edinburgh, King’s Buildings, Mayfield Rd, Edinburgh, EH9 3JU, UK (GRID: grid.4305.2; ISNI: 0000 0004 1936 7988)
- Catherine Kidner
  - Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK (GRID: grid.426106.7; ISNI: 0000 0004 0598 2103)
  - School of Biological Sciences, University of Edinburgh, King’s Buildings, Mayfield Rd, Edinburgh, EH9 3JU, UK (GRID: grid.4305.2; ISNI: 0000 0004 1936 7988)
|
30
|
Zhang C, Zhao Y, Braun EL, Mirarab S. TAPER: Pinpointing errors in multiple sequence alignments despite varying rates of evolution. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13696] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Affiliation(s)
- Chao Zhang
  - Bioinformatics and Systems Biology Program, University of California San Diego, CA, USA
- Yiming Zhao
  - Electrical and Computer Engineering Department, University of California San Diego, CA, USA
- Edward L. Braun
  - Department of Biology and Genetics Institute, University of Florida, Gainesville, FL, USA
- Siavash Mirarab
  - Electrical and Computer Engineering Department, University of California San Diego, CA, USA
|
31
|
Rachtman E, Bafna V, Mirarab S. CONSULT: accurate contamination removal using locality-sensitive hashing. NAR Genom Bioinform 2021; 3:lqab071. [PMID: 34377979 PMCID: PMC8340999 DOI: 10.1093/nargab/lqab071] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/10/2021] [Revised: 06/30/2021] [Accepted: 07/19/2021] [Indexed: 12/27/2022] Open
Abstract
A fundamental question appears in many bioinformatics applications: Does a sequencing read belong to a large dataset of genomes from some broad taxonomic group, even when the closest match in the set is evolutionarily divergent from the query? For example, low-coverage genome sequencing (skimming) projects either assemble the organelle genome or compute genomic distances directly from unassembled reads. Using unassembled reads needs contamination detection because samples often include reads from unintended groups of species. Similarly, assembling the organelle genome needs distinguishing organelle and nuclear reads. While k-mer-based methods have shown promise in read-matching, prior studies have shown that existing methods are insufficiently sensitive for contamination detection. Here, we introduce a new read-matching tool called CONSULT that tests whether k-mers from a query fall within a user-specified distance of the reference dataset using locality-sensitive hashing. Taking advantage of large memory machines available nowadays, CONSULT libraries accommodate tens of thousands of microbial species. Our results show that CONSULT has higher true-positive and lower false-positive rates of contamination detection than leading methods such as Kraken-II and improves distance calculation from genome skims. We also demonstrate that CONSULT can distinguish organelle reads from nuclear reads, leading to dramatic improvements in skim-based mitochondrial assemblies.
Affiliation(s)
- Eleonora Rachtman
  - Bioinformatics and Systems Biology Graduate Program, UC San Diego, CA 92093, USA
- Vineet Bafna
  - Department of Computer Science and Engineering, UC San Diego, CA 92093, USA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, UC San Diego, CA 92093, USA
|
32
|
Van Vlierberghe M, Di Franco A, Philippe H, Baurain D. Decontamination, pooling and dereplication of the 678 samples of the Marine Microbial Eukaryote Transcriptome Sequencing Project. BMC Res Notes 2021; 14:306. [PMID: 34372933 PMCID: PMC8353744 DOI: 10.1186/s13104-021-05717-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/23/2021] [Accepted: 07/27/2021] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVES: Complex algae are photosynthetic organisms resulting from eukaryote-to-eukaryote endosymbiotic-like interactions. Yet the specific lineages and mechanisms are still under debate. That is why large scale phylogenomic studies are needed. Whereas available proteomes provide a limited diversity of complex algae, MMETSP (Marine Microbial Eukaryote Transcriptome Sequencing Project) transcriptomes represent a valuable resource for phylogenomic analyses, owing to their broad and rich taxonomic sampling, especially of photosynthetic species. Unfortunately, this sampling is unbalanced and sometimes highly redundant. Moreover, we observed contaminated sequences in some samples. In such a context, tree inference and readability are impaired. Consequently, the aim of the data processing reported here is to release a unique set of clean and non-redundant transcriptomes produced through an original protocol featuring decontamination, pooling and dereplication steps. DATA DESCRIPTION: We submitted 678 MMETSP re-assembly samples to our parallel consolidation pipeline. Hence, we combined 423 samples into 110 consolidated transcriptomes, after the systematic removal of the most contaminated samples (186). This approach resulted in a total of 224 high-quality transcriptomes, easy to use and suitable to compute less contaminated, less redundant and more balanced phylogenies.
Affiliation(s)
- Mick Van Vlierberghe
  - InBioS – PhytoSYSTEMS, Eukaryotic Phylogenomics, University of Liège, Liège, Belgium
- Arnaud Di Franco
  - Station d’Ecologie Théorique et Expérimentale de Moulis, UMR CNRS 5321, Moulis, France
- Hervé Philippe
  - Station d’Ecologie Théorique et Expérimentale de Moulis, UMR CNRS 5321, Moulis, France
- Denis Baurain
  - InBioS – PhytoSYSTEMS, Eukaryotic Phylogenomics, University of Liège, Liège, Belgium
|
33
|
Tomescu MS, Sooklal SA, Ntsowe T, Naicker P, Darnhofer B, Archer R, Stoychev S, Swanevelder D, Birner-Grünberger R, Rumbold K. Transcriptome and proteome of the corm, leaf and flower of Hypoxis hemerocallidea (African potato). PLoS One 2021; 16:e0253741. [PMID: 34283859 PMCID: PMC8291589 DOI: 10.1371/journal.pone.0253741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2020] [Accepted: 06/11/2021] [Indexed: 11/19/2022] Open
Abstract
The corm of Hypoxis hemerocallidea, commonly known as the African potato, is used in traditional medicine to treat several medical conditions such as urinary infections, benign prostate hyperplasia, inflammatory conditions and testicular tumours. The metabolites contributing to the medicinal properties of H. hemerocallidea have been identified in several studies and, more recently, the active terpenoids of the plant were profiled. However, the biosynthetic pathways and the enzymes involved in the production of the terpene metabolites in H. hemerocallidea have not been characterised at a transcriptomic or proteomic level. In this study, total RNA extracted from the corm, leaf and flower tissues of H. hemerocallidea was sequenced on the Illumina HiSeq 2500 platform. A total of 143,549 transcripts were assembled de novo using Trinity and 107,131 transcripts were functionally annotated using the nr, GO, COG, KEGG and SWISS-PROT databases. Additionally, the proteome of the three tissues were sequenced using LC-MS/MS, revealing aspects of secondary metabolism and serving as data validation for the transcriptome. Functional annotation led to the identification of numerous terpene synthases such as nerolidol synthase, germacrene D synthase, and cycloartenol synthase amongst others. Annotations also revealed a transcript encoding the terpene synthase phytoalexin momilactone A synthase. Differential expression analysis using edgeR identified 946 transcripts differentially expressed between the three tissues and revealed that the leaf upregulates linalool synthase compared to the corm and the flower tissues. The transcriptome as well as the proteome of Hypoxis hemerocallidea presented here provide a foundation for future research.
Affiliation(s)
- Mihai-Silviu Tomescu
  - School of Molecular and Cell Biology, University of the Witwatersrand, Johannesburg, South Africa
- Selisha Ann Sooklal
  - Department of Life and Consumer Sciences, College of Agriculture and Environmental Sciences, UNISA, Johannesburg, South Africa
- Thuto Ntsowe
  - Biotechnology Platform, Agricultural Research Council, Onderstepoort, South Africa
- Previn Naicker
  - Council for Scientific and Industrial Research, Pretoria, South Africa
- Barbara Darnhofer
  - ACIB GmbH, Graz, Austria
  - Institute for Pathology, Medical University of Graz, Graz, Austria
  - Omics Center Graz, BioTechMed, Graz, Austria
- Robert Archer
  - National Herbarium, South African National Biodiversity Institute, Pretoria, South Africa
- Stoyan Stoychev
  - Council for Scientific and Industrial Research, Pretoria, South Africa
- Dirk Swanevelder
  - Biotechnology Platform, Agricultural Research Council, Onderstepoort, South Africa
- Ruth Birner-Grünberger
  - ACIB GmbH, Graz, Austria
  - Institute for Pathology, Medical University of Graz, Graz, Austria
  - Omics Center Graz, BioTechMed, Graz, Austria
- Karl Rumbold
  - School of Molecular and Cell Biology, University of the Witwatersrand, Johannesburg, South Africa
|
34
|
Ellis EA, Storer CG, Kawahara AY. De novo genome assemblies of butterflies. Gigascience 2021; 10:giab041. [PMID: 34076242 PMCID: PMC8170690 DOI: 10.1093/gigascience/giab041] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 02/18/2020] [Revised: 07/22/2020] [Accepted: 05/05/2021] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: The availability of thousands of genomes has enabled new advancements in biology. However, many genomes have not been investigated for their quality. Here we examine quality trends in a taxonomically diverse and well-known group, butterflies (Papilionoidea), and provide draft, de novo assemblies for all available butterfly genomes. Owing to massive genome sequencing investment and taxonomic curation, this is an excellent group to explore genome quality.
FINDINGS: We provide de novo assemblies for all 822 available butterfly genomes and interpret their quality in terms of completeness and continuity. We identify the 50 highest quality genomes across butterflies and conclude that the ringlet, Aphantopus hyperantus, has the highest quality genome. Our post-processing of draft genome assemblies identified 118 butterfly genomes that should not be reused owing to contamination or extremely low quality. However, many draft genomes are of high utility, especially because permissibility of low-quality genomes is dependent on the objective of the study. Our assemblies will serve as a key resource for papilionid genomics, especially for researchers without computational resources.
CONCLUSIONS: Quality metrics and assemblies are typically presented with annotated genome accessions but rarely with de novo genomes. We recommend that studies presenting genome sequences provide the assembly and some metrics of quality because quality will significantly affect downstream results. Transparency in quality metrics is needed to improve the field of genome science and encourage data reuse.
Affiliation(s)
- Emily A Ellis
  - McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, 3215 Hull Road, Gainesville, FL 32611–2710, USA
- Caroline G Storer
  - McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, 3215 Hull Road, Gainesville, FL 32611–2710, USA
- Akito Y Kawahara
  - McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, 3215 Hull Road, Gainesville, FL 32611–2710, USA
35
Cordoba J, Perez E, Van Vlierberghe M, Bertrand AR, Lupo V, Cardol P, Baurain D. De Novo Transcriptome Meta-Assembly of the Mixotrophic Freshwater Microalga Euglena gracilis. Genes (Basel) 2021; 12:842. [PMID: 34072576 PMCID: PMC8227486 DOI: 10.3390/genes12060842] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/08/2021] [Revised: 05/24/2021] [Accepted: 05/27/2021] [Indexed: 01/01/2023]
Abstract
Euglena gracilis is a well-known photosynthetic microeukaryote considered to be the product of a secondary endosymbiosis between a green alga and a phagotrophic unicellular organism belonging to the same eukaryotic phylum as the parasitic trypanosomatids. As its nuclear genome has proven difficult to sequence, reliable transcriptomes are important for functional studies. In this work, we assembled a new consensus transcriptome by combining sequencing reads from five independent studies. Based on a detailed comparison with two previously released transcriptomes, our consensus transcriptome appears to be the most complete so far. Remapping the reads on it allowed us to compare the expression of the transcripts across multiple culture conditions at once and to infer a functionally annotated network of co-expressed genes. Although the emergence of meaningful gene clusters indicates that some biological signal lies in gene expression levels, our analyses confirm that gene regulation in euglenozoans is not primarily controlled at the transcriptional level. Regarding the origin of E. gracilis, we observe a heavily mixed gene ancestry, as previously reported, and rule out sequence contamination as a possible explanation for these observations. Instead, they indicate that this complex alga has evolved through a convoluted process involving much more than two partners.
Affiliation(s)
- Javier Cordoba
  - InBioS – PhytoSYSTEMS, Laboratoire de Génétique et Physiologie des Microalgues, ULiège, B-4000 Liège, Belgium
- Emilie Perez
  - InBioS – PhytoSYSTEMS, Laboratoire de Génétique et Physiologie des Microalgues, ULiège, B-4000 Liège, Belgium
  - InBioS – PhytoSYSTEMS, Unit of Eukaryotic Phylogenomics, ULiège, B-4000 Liège, Belgium
- Mick Van Vlierberghe
  - InBioS – PhytoSYSTEMS, Unit of Eukaryotic Phylogenomics, ULiège, B-4000 Liège, Belgium
- Amandine R. Bertrand
  - InBioS – PhytoSYSTEMS, Unit of Eukaryotic Phylogenomics, ULiège, B-4000 Liège, Belgium
- Valérian Lupo
  - InBioS – PhytoSYSTEMS, Unit of Eukaryotic Phylogenomics, ULiège, B-4000 Liège, Belgium
- Pierre Cardol
  - InBioS – PhytoSYSTEMS, Laboratoire de Génétique et Physiologie des Microalgues, ULiège, B-4000 Liège, Belgium
- Denis Baurain
  - InBioS – PhytoSYSTEMS, Unit of Eukaryotic Phylogenomics, ULiège, B-4000 Liège, Belgium
36
Ahmed M, Adedidran F, Holovachov O. A draft transcriptome of a parasite Neocamacolaimus parasiticus (Camacolaimidae, Plectida). J Nematol 2021; 53:e2021-40. [PMID: 33860270 PMCID: PMC8040144 DOI: 10.21307/jofnem-2021-040] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/30/2020] [Indexed: 12/02/2022]
Abstract
Camacolaimidae is a clade of nematodes that includes both free-living epistrate-feeding forms and parasites of marine protozoans and invertebrates. Neocamacolaimus parasiticus is a parasite of marine polychaete worms. Given its phylogenetic affinities to free-living species, Neocamacolaimus can serve as a reference for research on the origin of parasitism in an aquatic environment. Here, we present a draft transcriptome obtained from a single post-parasitic juvenile individual of this species. The final assembly consists of 19,180 protein-coding sequences (including isoforms) with the following BUSCO scores for Nematoda: 65.38% complete, 9.06% partial, and 25.56% missing, and for Metazoa: 79.45% complete, 3.17% partial, and 17.38% missing.
Affiliation(s)
- Mohammed Ahmed
  - Department of Zoology, Swedish Museum of Natural History, SE-104 05, Stockholm, Sweden
- Oleksandr Holovachov
  - Department of Zoology, Swedish Museum of Natural History, SE-104 05, Stockholm, Sweden
37
Kyslík J, Kosakyan A, Nenarokov S, Holzer AS, Fiala I. The myxozoan minicollagen gene repertoire was not simplified by the parasitic lifestyle: computational identification of a novel myxozoan minicollagen gene. BMC Genomics 2021; 22:198. [PMID: 33743585 PMCID: PMC7981951 DOI: 10.1186/s12864-021-07515-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/06/2020] [Accepted: 03/08/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Lineage-specific gene expansions represent one of the driving forces in the evolutionary dynamics of unique phylum traits. Myxozoa, a cnidarian subphylum of obligate parasites, are evolutionarily altered and highly reduced organisms with a simple body plan including cnidarian-specific organelles and polar capsules (a type of nematocyst). Minicollagens, a group of structural proteins, are prominent constituents of nematocysts linking Myxozoa and Cnidaria. Despite recent advances in the identification of minicollagens in Myxozoa, the evolutionary history and diversity of minicollagens in Myxozoa and Cnidaria remain elusive.
RESULTS: We generated new transcriptomes of two myxozoan species using a novel pipeline for filtering of closely related contaminant species in RNA-seq data. Mining of our transcriptomes and published omics data confirmed the existence of myxozoan Ncol-4, reported only once previously, and revealed a novel noncanonical minicollagen, Ncol-5, which is exclusive to Myxozoa. Phylogenetic analyses support a close relationship between myxozoan Ncol-1-3 with minicollagens of Polypodium hydriforme, but suggest independent evolution in the case of the myxozoan minicollagens Ncol-4 and Ncol-5. Additional genome- and transcriptome-wide searches of cnidarian minicollagens expanded the dataset to better clarify the evolutionary trajectories of minicollagen.
CONCLUSIONS: The development of a new approach for the handling of next-generation data contaminated by closely related species represents a useful tool for future applications beyond the field of myxozoan research. This data processing pipeline allowed us to expand the dataset and study the evolution and diversity of minicollagen genes in Myxozoa and Cnidaria. We identified a novel type of minicollagen in Myxozoa (Ncol-5). We suggest that the large number of minicollagen paralogs in some cnidarians is a result of several recent large gene multiplication events. We revealed close juxtaposition of minicollagens Ncol-1 and Ncol-4 in myxozoan genomes, suggesting their common evolutionary history. The unique gene structure of myxozoan Ncol-5 suggests a specific function in the myxozoan polar capsule or tubule. Despite the fact that myxozoans possess only one type of nematocyst, their gene repertoire is similar to those of other cnidarians.
Affiliation(s)
- Jiří Kyslík
  - Institute of Parasitology, Biology Centre, Academy of Sciences of the Czech Republic, Ceske Budejovice, Czech Republic
  - Faculty of Science, University of South Bohemia, Ceske Budejovice, Czech Republic
- Anush Kosakyan
  - Institute of Parasitology, Biology Centre, Academy of Sciences of the Czech Republic, Ceske Budejovice, Czech Republic
- Serafim Nenarokov
  - Institute of Parasitology, Biology Centre, Academy of Sciences of the Czech Republic, Ceske Budejovice, Czech Republic
- Astrid S Holzer
  - Institute of Parasitology, Biology Centre, Academy of Sciences of the Czech Republic, Ceske Budejovice, Czech Republic
- Ivan Fiala
  - Institute of Parasitology, Biology Centre, Academy of Sciences of the Czech Republic, Ceske Budejovice, Czech Republic
  - Faculty of Science, University of South Bohemia, Ceske Budejovice, Czech Republic
38
Nachtigall PG, Grazziotin FG, Junqueira-de-Azevedo ILM. MITGARD: an automated pipeline for mitochondrial genome assembly in eukaryotic species using RNA-seq data. Brief Bioinform 2021; 22:6123950. [PMID: 33515000 DOI: 10.1093/bib/bbaa429] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/12/2020] [Revised: 11/27/2020] [Accepted: 12/22/2020] [Indexed: 12/19/2022]
Abstract
MOTIVATION: Over the past decade, the field of next-generation sequencing (NGS) has seen dramatic advances in methods and a decrease in costs. Consequently, a large volume of data has been generated by NGS, most of which has originated from RNA-sequencing (RNA-seq) experiments. Because mitochondrial genes are expressed in most eukaryotic cells, mitochondrial mRNA sequences are usually co-sequenced within the target transcriptome, generating data that are commonly underused or discarded. Here, we present MITGARD, an automated pipeline that reliably recovers the mitochondrial genome from RNA-seq data from various sources. The pipeline identifies mitochondrial sequence reads based on a phylogenetically related reference, assembles them into contigs, and extracts a complete mtDNA for the target species.
RESULTS: We demonstrate that MITGARD can reconstruct the mitochondrial genomes of several species throughout the tree of life. We noticed that MITGARD can recover the mitogenomes in different sequencing schemes and even in a scenario of low sequencing depth. Moreover, we showed that the use of references from congeneric species diverging up to 30 million years ago (MYA) from the target species is sufficient to recover the entire mitogenome, whereas the use of species diverging between 30 and 60 MYA allows the recovery of most mitochondrial genes. Additionally, we provide a case study with original data in which we estimate a phylogenetic tree of snakes from the genus Bothrops, further demonstrating that MITGARD is suitable for use in biodiversity projects. MITGARD is thus a valuable tool for obtaining high-quality information for studies focusing on the phylogenetic and evolutionary aspects of eukaryotes; it also provides data for easily identifying a sample using barcoding and for checking for cross-contamination using third-party tools.
Affiliation(s)
- Pedro G Nachtigall
  - Laboratório Especial de Toxinologia Aplicada, CeTICS, Instituto Butantan, São Paulo, SP, 05503-900, Brazil
- Felipe G Grazziotin
  - Laboratório de Coleções Zoológicas, Instituto Butantan, São Paulo, SP, 05503-900, Brazil
39
Li HS, Tang XF, Huang YH, Xu ZY, Chen ML, Du XY, Qiu BY, Chen PT, Zhang W, Ślipiński A, Escalona HE, Waterhouse RM, Zwick A, Pang H. Horizontally acquired antibacterial genes associated with adaptive radiation of ladybird beetles. BMC Biol 2021; 19:7. [PMID: 33446206 PMCID: PMC7807722 DOI: 10.1186/s12915-020-00945-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/19/2020] [Accepted: 12/22/2020] [Indexed: 01/12/2023]
Abstract
BACKGROUND: Horizontal gene transfer (HGT) has been documented in many herbivorous insects, conferring the ability to digest plant material and promoting their remarkable ecological diversification. Previous reports suggest HGT of antibacterial enzymes may have contributed to the insect immune response and limited bacterial growth. Carnivorous insects also display many evolutionarily successful lineages, but in contrast to the plant feeders, the potential role of HGTs has been less well-studied.
RESULTS: Using genomic and transcriptomic data from 38 species of ladybird beetles, we identified a set of bacterial cell wall hydrolase (cwh) genes acquired by this group of beetles. Infection with Bacillus subtilis led to upregulated expression of these ladybird cwh genes, and their recombinantly produced proteins limited bacterial proliferation. Moreover, RNAi-mediated cwh knockdown led to downregulation of other antibacterial genes, indicating a role in antibacterial immune defense. cwh genes are rare in eukaryotes, but have been maintained in all tested Coccinellinae species, suggesting that this putative immune-related HGT event played a role in the evolution of this speciose subfamily of predominantly predatory ladybirds.
CONCLUSION: Our work demonstrates that, in a manner analogous to HGT-facilitated plant feeding, enhanced immunity through HGT might have played a key role in the prey adaptation and niche expansion that promoted the diversification of carnivorous beetle lineages. We believe that this represents the first example of immune-related HGT in carnivorous insects associated with a subsequent successful species radiation.
Affiliation(s)
- Hao-Sen Li
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Xue-Fei Tang
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Yu-Hao Huang
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Ze-Yu Xu
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Mei-Lan Chen
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
  - School of Environment and Life Science, Nanning Normal University, Nanning, 530001, China
- Xue-Yong Du
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Bo-Yuan Qiu
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Pei-Tao Chen
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Wei Zhang
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
- Adam Ślipiński
  - Australian National Insect Collection, CSIRO, GPO Box 1700, Canberra, ACT, 2601, Australia
- Hermes E Escalona
  - Australian National Insect Collection, CSIRO, GPO Box 1700, Canberra, ACT, 2601, Australia
- Robert M Waterhouse
  - Department of Ecology and Evolution, University of Lausanne and Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Andreas Zwick
  - Australian National Insect Collection, CSIRO, GPO Box 1700, Canberra, ACT, 2601, Australia
- Hong Pang
  - State Key Laboratory of Biocontrol, School of Life Sciences / School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
40
Allio R, Nabholz B, Wanke S, Chomicki G, Pérez-Escobar OA, Cotton AM, Clamens AL, Kergoat GJ, Sperling FAH, Condamine FL. Genome-wide macroevolutionary signatures of key innovations in butterflies colonizing new host plants. Nat Commun 2021; 12:354. [PMID: 33441560 PMCID: PMC7806994 DOI: 10.1038/s41467-020-20507-3] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 06/23/2020] [Accepted: 12/03/2020] [Indexed: 01/29/2023]
Abstract
The mega-diversity of herbivorous insects is attributed to their co-evolutionary associations with plants. Despite abundant studies on insect-plant interactions, we do not know whether host-plant shifts have impacted both genomic adaptation and species diversification over geological times. We show that the antagonistic insect-plant interaction between swallowtail butterflies and the highly toxic birthworts began 55 million years ago in Beringia, followed by several major ancient host-plant shifts. This evolutionary framework provides a valuable opportunity for repeated tests of genomic signatures of macroevolutionary changes and estimation of diversification rates across their phylogeny. We find that host-plant shifts in butterflies are associated with both genome-wide adaptive molecular evolution (more genes under positive selection) and repeated bursts of speciation rates, contributing to an increase in global diversification through time. Our study links ecological changes, genome-wide adaptations and macroevolutionary consequences, lending support to the importance of ecological interactions as evolutionary drivers over long time periods.
Affiliation(s)
- Rémi Allio
  - CNRS, IRD, EPHE, Institut des Sciences de l'Evolution de Montpellier, Université de Montpellier, Place Eugène Bataillon, 34095, Montpellier, France
- Benoit Nabholz
  - CNRS, IRD, EPHE, Institut des Sciences de l'Evolution de Montpellier, Université de Montpellier, Place Eugène Bataillon, 34095, Montpellier, France
- Stefan Wanke
  - Institut für Botanik, Technische Universität Dresden, Zellescher Weg 20b, 01062, Dresden, Germany
- Guillaume Chomicki
  - Department of Bioscience, Durham University, Stockton Road, Durham, DH1 3LE, UK
- Adam M Cotton
  - 86/2 Moo 5, Tambon Nong Kwai, Hang Dong, Chiang Mai, Thailand
- Anne-Laure Clamens
  - CBGP, INRAE, CIRAD, IRD, Montpellier SupAgro, Univ. Montpellier, Montpellier, France
- Gaël J Kergoat
  - CBGP, INRAE, CIRAD, IRD, Montpellier SupAgro, Univ. Montpellier, Montpellier, France
- Felix A H Sperling
  - Department of Biological Sciences, University of Alberta, Edmonton, T6G 2E9, AB, Canada
- Fabien L Condamine
  - CNRS, IRD, EPHE, Institut des Sciences de l'Evolution de Montpellier, Université de Montpellier, Place Eugène Bataillon, 34095, Montpellier, France
  - Department of Biological Sciences, University of Alberta, Edmonton, T6G 2E9, AB, Canada
41
Genome Survey Sequencing of In Vivo Mother Plant and In Vitro Plantlets of Mikania cordata. PLANTS 2020; 9:plants9121665. [PMID: 33261119 PMCID: PMC7759884 DOI: 10.3390/plants9121665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2020] [Revised: 11/21/2020] [Accepted: 11/25/2020] [Indexed: 11/16/2022]
Abstract
Mikania cordata, the only native congener of the invasive weed Mikania micrantha in China, is an ideal species for comparative study to reveal the invasion mechanism. However, its genome resources are lagging far behind its congener, which limits the comparative genomic analysis. Our goal is to characterize the genome of M. cordata by next-generation sequencing and propose a scheme for long-read genome sequencing. Previous studies have shown that the genomic resources of the host plant would be affected by the endophytic microbial DNA. An aseptic sample of M. cordata will ensure the proper genome in downstream analysis. Because endophytes are ubiquitous in the greenhouse-grown M. cordata, the in vitro culture with cefotaxime or timentin treatment was undertaken to obtain the aseptic plantlets. The in vivo mother plant and in vitro plantlets were used to survey the genome. The microbial contamination in M. cordata was recognized by blast search and eliminated from the raw reads. The decontaminated sequencing reads were used to predict the genome size, heterozygosity, and repetitive rate. The in vivo plant was so contaminated that microbes occupied substantial sequencing resources and misled the scaffold assembly. Compared with cefotaxime, treatment with timentin performed better in cultivating robust in vitro plantlets. The survey result from the in vitro plantlets was more accurate due to low levels of contamination. The genome size was estimated to be 1.80 Gb with 0.50% heterozygosity and 78.35% repetitive rate. Additionally, 289,831 SSRs were identified in the genome. The genome is heavily contaminated and repetitive; therefore, the in vitro culture technique and long-read sequencing technology are recommended to generate a high-quality and highly contiguous genome.
42
Miocene Diversification and High-Altitude Adaptation of Parnassius Butterflies (Lepidoptera: Papilionidae) in Qinghai-Tibet Plateau Revealed by Large-Scale Transcriptomic Data. INSECTS 2020; 11:insects11110754. [PMID: 33153157 PMCID: PMC7693471 DOI: 10.3390/insects11110754] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/21/2020] [Revised: 10/27/2020] [Accepted: 10/30/2020] [Indexed: 12/01/2022]
Abstract
Simple Summary: Parnassius butterflies have contributed to fundamental studies in biogeography, insect–plant interactions, and other fields of conservation biology and ecology. However, the early evolutionary pattern and the molecular mechanism by which this alpine butterfly group adapted to high altitudes on the Qinghai–Tibet Plateau remain poorly understood. In this study, we report, for the first time, a relatively large-scale transcriptomic dataset of eight Parnassius species and two closely related papilionid species, a dated phylogeny based on hundreds of gene sequences, and potential genetic mechanisms underlying high-altitude adaptation, identified by investigating changes in evolutionary rates and positively selected genes. Overall, our findings indicate that the transcriptome datasets reported here can provide new insights into the spatiotemporal evolutionary pattern and high-altitude adaptation of Parnassius butterflies from both extrinsic and intrinsic perspectives, and will support further expression and functional studies that help interested researchers address evolution, biodiversity and conservation questions concerning Parnassius and other butterfly species.
Abstract: The early evolutionary pattern and the molecular mechanism by which alpine Parnassius butterflies adapted to high altitudes on the Qinghai–Tibet Plateau remain poorly understood, owing to difficulties in sampling, limited sequence data, and time calibration issues. Here, we present large-scale transcriptomic datasets of eight representative Parnassius species to reveal the phylogenetic timescale and the potential genetic basis of high-altitude adaptation, applying multiple analytic strategies to 476 orthologous genes. Our phylogenetic results strongly supported the subgenus Parnassius as a well-resolved basal clade, and the subgenera Tadumia and Kailasius were closely related in the phylogenetic trees. In addition, molecular dating analyses showed that Parnassius began to diverge about 13.0 to 14.3 million years ago (middle Miocene), which correlates with the spatiotemporal distributions of their host plants as well as with geological and palaeoenvironmental changes of the Qinghai–Tibet Plateau. Moreover, we detected an accelerated evolutionary rate, candidate positively selected genes, and their potential functional changes, which probably contributed to the high-altitude adaptation of Parnassius species. Overall, our study provides new insights into the spatiotemporal evolutionary pattern and high-altitude adaptation of Parnassius butterflies from both extrinsic and intrinsic perspectives, which will help to address evolution, biodiversity, and conservation questions concerning Parnassius and other butterfly species.
43
Peijnenburg KTCA, Janssen AW, Wall-Palmer D, Goetze E, Maas AE, Todd JA, Marlétaz F. The origin and diversification of pteropods precede past perturbations in the Earth's carbon cycle. Proc Natl Acad Sci U S A 2020; 117:25609-25617. [PMID: 32973093 PMCID: PMC7568333 DOI: 10.1073/pnas.1920918117] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022]
Abstract
Pteropods are a group of planktonic gastropods that are widely regarded as biological indicators for assessing the impacts of ocean acidification. Their aragonitic shells are highly sensitive to acute changes in ocean chemistry. However, to gain insight into their potential to adapt to current climate change, we need to accurately reconstruct their evolutionary history and assess their responses to past changes in the Earth's carbon cycle. Here, we resolve the phylogeny and timing of pteropod evolution with a phylogenomic dataset (2,654 genes) incorporating new data for 21 pteropod species and revised fossil evidence. In agreement with traditional taxonomy, we recovered molecular support for a division between "sea butterflies" (Thecosomata; mucus-web feeders) and "sea angels" (Gymnosomata; active predators). Molecular dating demonstrated that these two lineages diverged in the early Cretaceous, and that all main pteropod clades, including shelled, partially-shelled, and unshelled groups, diverged in the mid- to late Cretaceous. Hence, these clades originated prior to and subsequently survived major global change events, including the Paleocene-Eocene Thermal Maximum (PETM), the closest analog to modern-day ocean acidification and warming. Our findings indicate that planktonic aragonitic calcifiers have shown resilience to perturbations in the Earth's carbon cycle over evolutionary timescales.
Affiliation(s)
- Katja T C A Peijnenburg
  - Plankton Diversity and Evolution, Naturalis Biodiversity Center, 2300 RA Leiden, The Netherlands
  - Department Freshwater and Marine Ecology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1090 GE Amsterdam, The Netherlands
- Arie W Janssen
  - Plankton Diversity and Evolution, Naturalis Biodiversity Center, 2300 RA Leiden, The Netherlands
- Deborah Wall-Palmer
  - Plankton Diversity and Evolution, Naturalis Biodiversity Center, 2300 RA Leiden, The Netherlands
- Erica Goetze
  - Department of Oceanography, University of Hawai'i at Mānoa, Honolulu, HI 96822
- Amy E Maas
  - Bermuda Institute of Ocean Sciences, St. Georges GE01, Bermuda
- Jonathan A Todd
  - Department of Earth Sciences, Natural History Museum, London SW7 5BD, United Kingdom
- Ferdinand Marlétaz
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, United Kingdom
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology, Onna-son 904-0495, Japan
44
Mongiardino Koch N, Thompson JR. A Total-Evidence Dated Phylogeny of Echinoidea Combining Phylogenomic and Paleontological Data. Syst Biol 2020; 70:421-439. [PMID: 32882040 DOI: 10.1093/sysbio/syaa069] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/12/2020] [Revised: 08/14/2020] [Accepted: 08/23/2020] [Indexed: 12/13/2022]
Abstract
Phylogenomic and paleontological data constitute complementary resources for unraveling the phylogenetic relationships and divergence times of lineages, yet few studies have attempted to fully integrate them. Several unique properties of echinoids (sea urchins) make them especially useful for such synthesizing approaches, including a remarkable fossil record that can be incorporated into explicit phylogenetic hypotheses. We revisit the phylogeny of crown group Echinoidea using a total-evidence dating approach that combines the largest phylogenomic data set for the clade, a large-scale morphological matrix with a dense fossil sampling, and a novel compendium of tip and node age constraints. To this end, we develop a novel method for subsampling phylogenomic data sets that selects loci with high phylogenetic signal, low systematic biases, and enhanced clock-like behavior. Our results demonstrate that combining different data sources increases topological accuracy and helps resolve conflicts between molecular and morphological data. Notably, we present a new hypothesis for the origin of sand dollars, and restructure the relationships between stem and crown echinoids in a way that implies a long stretch of undiscovered evolutionary history of the crown group in the late Paleozoic. Our efforts help bridge the gap between phylogenomics and phylogenetic paleontology, providing a model example of the benefits of combining the two. [Echinoidea; fossils; paleontology; phylogenomics; time calibration; total evidence.].
Affiliation(s)
- Jeffrey R Thompson
  - Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
45
Brand JN, Wiberg RAW, Pjeta R, Bertemes P, Beisel C, Ladurner P, Schärer L. RNA-Seq of three free-living flatworm species suggests rapid evolution of reproduction-related genes. BMC Genomics 2020; 21:462. [PMID: 32631219 PMCID: PMC7336406 DOI: 10.1186/s12864-020-06862-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/23/2019] [Accepted: 06/22/2020] [Indexed: 01/03/2023]
Abstract
BACKGROUND The genus Macrostomum consists of small free-living flatworms and contains Macrostomum lignano, which has been used in investigations of ageing, stem cell biology, bioadhesion, karyology, and sexual selection in hermaphrodites. Two types of mating behaviour occur within this genus. Some species, including M. lignano, mate via reciprocal copulation, where, in a single mating, both partners insert their male copulatory organ into the female storage organ and simultaneously donate and receive sperm. Other species mate via hypodermic insemination, where worms use a needle-like copulatory organ to inject sperm into the tissue of the partner. These contrasting mating behaviours are associated with striking differences in sperm and copulatory organ morphology. Here we expand the genomic resources within the genus to representatives of both behaviour types and investigate whether genes vary in their rate of evolution depending on their putative function. RESULTS We present de novo assembled transcriptomes of three Macrostomum species, namely M. hystrix, a close relative of M. lignano that mates via hypodermic insemination, M. spirale, a more distantly related species that mates via reciprocal copulation, and finally M. pusillum, which represents a clade that is only distantly related to the other three species and also mates via hypodermic insemination. We infer 23,764 sets of homologous genes and annotate them using experimental evidence from M. lignano. Across the genus, we identify 521 gene families with conserved patterns of differential expression between juvenile vs. adult worms and 185 gene families with a putative expression in the testes that are restricted to the two reciprocally mating species. Further, we show that homologs of putative reproduction-related genes have a higher protein divergence across the four species than genes lacking such annotations and that they are more difficult to identify across the four species, indicating that these genes evolve more rapidly, while genes involved in neoblast function are more conserved. CONCLUSIONS This study improves the genus Macrostomum as a model system, by providing resources for the targeted investigation of gene function in a broad range of species. And we, for the first time, show that reproduction-related genes evolve at an accelerated rate in flatworms.
Affiliation(s)
- Jeremias N Brand
- Department of Environmental Sciences, Zoological Institute, University of Basel, Vesalgasse 1, 4051, Basel, Switzerland
- R Axel W Wiberg
- Department of Environmental Sciences, Zoological Institute, University of Basel, Vesalgasse 1, 4051, Basel, Switzerland
- Robert Pjeta
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria
- Philip Bertemes
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria
- Christian Beisel
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Peter Ladurner
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria
- Lukas Schärer
- Department of Environmental Sciences, Zoological Institute, University of Basel, Vesalgasse 1, 4051, Basel, Switzerland
46
Abstract
Knowing phylogenetic relationships among species is fundamental for many studies in biology. An accurate phylogenetic tree underpins our understanding of the major transitions in evolution, such as the emergence of new body plans or metabolism, and is key to inferring the origin of new genes, detecting molecular adaptation, understanding morphological character evolution and reconstructing demographic changes in recently diverged species. Although data are ever more plentiful and powerful analysis methods are available, there remain many challenges to reliable tree building. Here, we discuss the major steps of phylogenetic analysis, including identification of orthologous genes or proteins, multiple sequence alignment, and choice of substitution models and inference methodologies. Understanding the different sources of errors and the strategies to mitigate them is essential for assembling an accurate tree of life.
47
Rousselle M, Simion P, Tilak MK, Figuet E, Nabholz B, Galtier N. Is adaptation limited by mutation? A timescale-dependent effect of genetic diversity on the adaptive substitution rate in animals. PLoS Genet 2020; 16:e1008668. [PMID: 32251427 PMCID: PMC7162527 DOI: 10.1371/journal.pgen.1008668] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/18/2019] [Revised: 04/16/2020] [Accepted: 02/14/2020] [Indexed: 12/16/2022] Open
Abstract
Whether adaptation is limited by the beneficial mutation supply is a long-standing question of evolutionary genetics, which is more generally related to the determination of the adaptive substitution rate and its relationship with species effective population size (Ne) and genetic diversity. Empirical evidence reported so far is equivocal, with some but not all studies supporting a higher adaptive substitution rate in large-Ne than in small-Ne species. We gathered coding sequence polymorphism data and estimated the adaptive amino-acid substitution rate ωa, in 50 species from ten distant groups of animals with markedly different population mutation rate θ. We reveal the existence of a complex, timescale dependent relationship between species adaptive substitution rate and genetic diversity. We find a positive relationship between ωa and θ among closely related species, indicating that adaptation is indeed limited by the mutation supply, but this was only true in relatively low-θ taxa. In contrast, we uncover no significant correlation between ωa and θ at a larger taxonomic scale, suggesting that the proportion of beneficial mutations scales negatively with species' long-term Ne.
Affiliation(s)
- Paul Simion
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- LEGE, Department of Biology, University of Namur, Namur, Belgium
- Marie-Ka Tilak
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Emeric Figuet
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Benoit Nabholz
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Nicolas Galtier
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
48
Prevalence and Implications of Contamination in Public Genomic Resources: A Case Study of 43 Reference Arthropod Assemblies. G3-GENES GENOMES GENETICS 2020; 10:721-730. [PMID: 31862787 PMCID: PMC7003083 DOI: 10.1534/g3.119.400758] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 02/02/2023]
Abstract
Thanks to huge advances in sequencing technologies, genomic resources are increasingly being generated and shared by the scientific community. The quality of such public resources is therefore of critical importance. Errors due to contamination are particularly worrying; they are widespread, propagate across databases, and can compromise downstream analyses, especially the detection of horizontally-transferred sequences. However, we still lack consistent and comprehensive assessments of contamination prevalence in public genomic data. Here we applied a standardized procedure for foreign sequence annotation to 43 published arthropod genomes from the widely used Ensembl Metazoa database. This method combines information on sequence similarity and synteny to identify contaminant and putative horizontally-transferred sequences in any genome assembly, provided that an adequate reference database is available. We uncovered considerable heterogeneity in quality among arthropod assemblies, some being devoid of contaminant sequences, whereas others included hundreds of contaminant genes. Contaminants far outnumbered horizontally-transferred genes and were a major confounder of their detection, quantification and analysis. We strongly recommend that automated standardized decontamination procedures be systematically embedded into the submission process to genomic databases.
49
Zhao Q, Zhang R, Xiao Y, Niu Y, Shao F, Li Y, Peng Z. Comparative Transcriptome Profiling of the Loaches Triplophysa bleekeri and Triplophysa rosa Reveals Potential Mechanisms of Eye Degeneration. Front Genet 2020; 10:1334. [PMID: 32010191 PMCID: PMC6977438 DOI: 10.3389/fgene.2019.01334] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/14/2019] [Accepted: 12/06/2019] [Indexed: 12/30/2022] Open
Abstract
Eye degeneration is one of the most obvious characteristics of organisms restricted to subterranean habitats. In cavefish, eye degeneration has evolved independently numerous times and each process is associated with different genetic mechanisms. To gain a better understanding of these mechanisms, we compared the eyes of adult individuals of the cave loach Triplophysa rosa and surface loach Triplophysa bleekeri. Compared with the normal eyes of the surface loach, those of the cave loach were found to possess a small abnormal lens and a defective retina containing photoreceptor cells that lack outer segments. Sequencing of the transcriptomes of both species to identify differentially expressed genes (DEGs) and genes under positive selection revealed 4,802 DEGs and 50 genes under positive selection (dN/dS > 1, FDR < 0.1). For cave loaches, we identified one Gene Ontology category related to vision that was significantly enriched in downregulated genes. Specifically, we found that many of the downregulated genes, including pitx3, lim2, crx, gnat2, rx1, rho, prph2, and β|γ-crystallin are associated with lens/retinal development and maintenance. However, compared with those in the surface loach, the lower dS rates but higher dN rates of the protein-coding sequences in T. rosa indicate that changes in amino acid sequences might be involved in the adaptation and visual degeneration of cave loaches. We also found that genes associated with light perception and light-stimulated vision have evolved at higher rates (some genes dN/dS > 1 but FDR > 0.1). Collectively, the findings of this study indicate that the degradation of cavefish vision is probably associated with both gene expression and amino acid changes and provide new insights into the mechanisms underlying the degeneration of cavefish eyes.
Affiliation(s)
- Qingyuan Zhao
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing, China
- Renyi Zhang
- School of Life Sciences, Guizhou Normal University, Guiyang, China
- Yingqi Xiao
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing, China
- Yabing Niu
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing, China
- Feng Shao
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing, China
- Yanping Li
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing, China
- Zuogang Peng
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing, China
50
De Simone G, Pasquadibisceglie A, Proietto R, Polticelli F, Aime S, Op den Camp HJM, Ascenzi P. Contaminations in (meta)genome data: An open issue for the scientific community. IUBMB Life 2019; 72:698-705. [PMID: 31869003 DOI: 10.1002/iub.2216] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/26/2019] [Accepted: 11/30/2019] [Indexed: 12/13/2022]
Abstract
In recent years, the high throughput and the low cost of next-generation sequencing (NGS) technologies have led to an increase in the amount of (meta)genomic data, revolutionizing genomic research studies. However, the quality of sequencing data could be affected by experimental errors derived from defective methods and protocols. This represents a serious problem for the scientific community with a negative impact on the correctness of studies that involve genomic sequence analysis. As a countermeasure, several alignment and taxonomic classification tools have been developed to uncover and correct errors. In this critical review, some of these integrated software tools and pipelines used to detect contaminations in reference genome databases and sequenced samples are reported. In particular, case studies of bacterial contaminations, contaminations of human origin, mitochondrial contaminations of ancient DNA, and cross contaminations are examined.
Affiliation(s)
- Silvio Aime
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
- Huub J M Op den Camp
- Department of Microbiology, IWWR, Radboud University, Heyendaalseweg 135, Nijmegen, AJ, The Netherlands
- Paolo Ascenzi
- Interdepartmental Laboratory for Electron Microscopy, Roma Tre University, Roma, Italy