1
Quintana-Escobar AO, Loyola-Vargas VM. Transcriptomic Analysis During the Induction of Somatic Embryogenesis in Coffea canephora. Methods Mol Biol 2024; 2827:363-376. [PMID: 38985282 DOI: 10.1007/978-1-0716-3954-2_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2024]
Abstract
Omics tools have changed how research is done in experimental biology, and the study of somatic embryogenesis (SE) has benefited as well. Transcriptomic tools have been used to compare the genes expressed during the induction of SE with those expressed in zygotic embryogenesis, or to compare the developmental stages that embryos go through. They have also been used to compare gene expression during the development of the calli from which SE is induced, among many other applications. The protocol described here is used in our laboratory to extract RNA and generate several transcriptomes for the study of SE in Coffea canephora.
Affiliation(s)
- Ana O Quintana-Escobar
- Unidad de Biología Integrativa, Centro de Investigación Científica de Yucatán, Chuburna, Merida, CP, Mexico
- Víctor M Loyola-Vargas
- Unidad de Biología Integrativa, Centro de Investigación Científica de Yucatán, Chuburna, Merida, CP, Mexico.
2
Irizarry KJL, Zhong W, Sun Y, Kronmiller BA, Darmani NA. RNA sequencing least shrew (Cryptotis parva) brainstem and gut transcripts following administration of a selective substance P neurokinin NK1 receptor agonist and antagonist expands genomics resources for emesis research. Front Genet 2023; 14:975087. [PMID: 36865388 PMCID: PMC9972295 DOI: 10.3389/fgene.2023.975087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 01/18/2023] [Indexed: 02/16/2023] Open
Abstract
The least shrew is among the subset of animals that are capable of vomiting and therefore serves as a valuable research model for investigating the biochemistry, molecular biology, pharmacology, and genomics of emesis. Both nausea and vomiting are associated with a variety of illnesses (bacterial/viral infections, bulimia, exposure to toxins, gall bladder disease), conditions (pregnancy, motion sickness, emotional stress, overeating) and reactions to drugs (chemotherapeutics, opiates). The severe discomfort and intense fear associated with the stressful symptoms of nausea and emesis are the major reason for patient non-compliance when being treated with cancer chemotherapeutics. Increased understanding of the physiology, pharmacology and pathophysiology underlying vomiting and nausea can accelerate progress in developing new antiemetics. Because the least shrew is a major animal model for emesis, expanding genomic knowledge associated with emesis in this species will further enhance its laboratory utility. A key question is which genes mediate emesis, and whether they are expressed in response to emetics/antiemetics. To elucidate the mediators of emesis, in particular emetic receptors, their downstream signaling pathways, as well as the shared emetic signals, we carried out an RNA sequencing study focused on the central and peripheral emetic loci, the brainstem and gut. Thus, we sequenced RNA extracted from brainstem and gut tissues from different groups of least shrews treated with either a neurokinin NK1 receptor selective emetic agonist, GR73632 (5 mg/kg, i.p.), its corresponding selective antagonist netupitant (5 mg/kg, i.p.), or a combination of these two agents, versus their corresponding vehicle-pretreated controls and drug-naïve animals. The resulting sequences were processed using de novo transcriptome assembly, and the assembly was used to identify orthologs within human, dog, mouse, and ferret gene sets.
We compared the least shrew to human and a veterinary species (dog) that may be treated with vomit-inducing chemotherapeutics, and the ferret, another well-established model organism for emesis research. The mouse was included because it does not vomit. In total, we identified a final set of 16,720 least shrew orthologs. We employed comparative genomics analyses as well as gene ontology enrichment, KEGG pathway enrichment and phenotype enrichment to better understand the molecular biology of genes implicated in vomiting.
Affiliation(s)
- Weixia Zhong
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA, United States
- Yina Sun
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA, United States
- Brent A. Kronmiller
- Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR, United States
- Nissar A. Darmani
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA, United States
3
Zhang S, Yazaki E, Sakamoto H, Yamamoto H, Mizushima N. Evolutionary diversification of the autophagy-related ubiquitin-like conjugation systems. Autophagy 2022; 18:2969-2984. [PMID: 35427200 PMCID: PMC9673942 DOI: 10.1080/15548627.2022.2059168] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/15/2022] Open
Abstract
Two autophagy-related (ATG) ubiquitin-like conjugation systems, the ATG12 and ATG8 systems, play important roles in macroautophagy. While multiple duplications and losses of the ATG conjugation system proteins are found in different lineages, the extent to which the underlying systems diversified across eukaryotes is not fully understood. Here, in order to understand the evolution of the ATG conjugation systems, we constructed a transcriptome database consisting of 94 eukaryotic species covering major eukaryotic clades and systematically identified ATG conjugation system components. Both ATG10 and the C-terminal glycine of ATG12 are essential for the canonical ubiquitin-like conjugation of ATG12 and ATG5. However, loss of ATG10 or the C-terminal glycine of ATG12 occurred at least 16 times in a wide range of lineages, suggesting that possible covalent-to-non-covalent transition is not limited to the species that we previously reported such as Alveolata and some yeast species. Some species have only the ATG8 system (with conjugation enzymes) or only ATG8 (without conjugation enzymes). More than 10 species have ATG8 homologs without the conserved C-terminal glycine, and Tetrahymena has an ATG8 homolog with a predicted transmembrane domain, which may be able to anchor to the membrane independent of the ATG conjugation systems. We discuss the possibility that the ancestor of the ATG12 and ATG8 systems is more similar to ATG8. 
Overall, our study offers a whole picture of the evolution and diversity of the ATG conjugation systems among eukaryotes, and provides evidence that functional diversifications of the systems are more common than previously thought. Abbreviations: APEAR: ATG8-PE association region; ATG: autophagy-related; LIR: LC3-interacting region; NEDD8: neural precursor cell expressed, developmentally down-regulated gene 8; PE: phosphatidylethanolamine; SAMP: small archaeal modifier protein; SAR: Stramenopiles, Alveolata, and Rhizaria; SMC: structural maintenance of chromosomes; SUMO: small ubiquitin like modifier; TACK: Thaumarchaeota, Aigarchaeota, Crenarchaeota, and Korarchaeota; UBA: ubiquitin like modifier activating enzyme; UFM: ubiquitin fold modifier; URM: ubiquitin related modifier.
Affiliation(s)
- Sidi Zhang
- Department of Biochemistry and Molecular Biology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Euki Yazaki
- Department of Biochemistry and Molecular Biology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan, Interdisciplinary Theoretical and Mathematical Sciences (iTHEMS), RIKEN, Saitama, Japan
- Hirokazu Sakamoto
- Department of Biochemistry and Molecular Biology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan, Department of Biomedical Chemistry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan, Department of Infection and Host Defense, Graduate School of Medicine, Chiba University, Chiba, Japan
- Hayashi Yamamoto
- Department of Biochemistry and Molecular Biology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Noboru Mizushima
- Department of Biochemistry and Molecular Biology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan, CONTACT Noboru Mizushima Department of Biochemistry and Molecular Biology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
4
Bulbul Ahmed M, Humayan Kabir A. Understanding of the various aspects of gene regulatory networks related to crop improvement. Gene 2022; 833:146556. [PMID: 35609798 DOI: 10.1016/j.gene.2022.146556] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/30/2021] [Revised: 03/14/2022] [Accepted: 05/06/2022] [Indexed: 12/30/2022]
Abstract
The hierarchical relationship between transcription factors, associated proteins, and their target genes is defined by a gene regulatory network (GRN). GRNs allow us to understand how the genotype and environment of a plant are incorporated to control downstream physiological responses. During plant growth or environmental acclimatization, GRNs are diverse and can be regulated differently across tissue types and organs. Here we give an overview of recent advances in GRN development that speed up basic and applied plant research. We also discuss genome and transcriptome resources used in GRN research, along with recent advances and applications, and describe different approaches to GRN prediction. In this review, we also describe the role of GRNs in crop improvement, crop plant manipulation, stress responses, speed breeding, and the identification of genetic variations/loci. Finally, the challenges and prospects of GRNs in plant biology are discussed.
Affiliation(s)
- Md Bulbul Ahmed
- Plant Science Department, McGill University, 21111 Lakeshore Road, Ste. Anne de Bellevue H9X3V9, Quebec, Canada; Institut de Recherche en Biologie Végétale (IRBV), University of Montreal, Montréal, Québec H1X 2B2, Canada.
5
Stanley TR, Guisbert KSK, Perez SM, Oneka M, Kernin I, Higgins NR, Lobo A, Subasi MM, Carroll DJ, Turingan RG, Guisbert E. Stress response gene family expansions correlate with invasive potential in teleost fish. J Exp Biol 2022; 225:274389. [PMID: 35258619 PMCID: PMC8987736 DOI: 10.1242/jeb.243263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Accepted: 01/24/2022] [Indexed: 11/20/2022]
Abstract
The bluegill sunfish Lepomis macrochirus and the closely related redear sunfish Lepomis microlophus have important ecological and recreational value and are widely used for research and aquaculture. While both species have been introduced outside of their native ranges, only the bluegill is considered invasive. Here, we report de novo transcriptome assemblies for these fish as a resource for sunfish biology. Comparative analyses of the transcriptomes revealed an unexpected, bluegill-specific expansion in the HSP70 and HSP90 molecular chaperone gene families. These expansions were not unique to the bluegill as expansions in HSP70s and HSP90s were identified in the genomes of other teleost fish using the NCBI RefSeq database. To determine whether gene family expansions are specific for thermal stress responses, GST and SOD gene families that are associated with oxidative stress responses were also analyzed. Species-specific expansions were also observed for these gene families in distinct fish species. Validating our approach, previously described expansions in the MHC gene family were also identified. Intriguingly, the number of HSP70 paralogs was positively correlated with thermotolerance range for each species, suggesting that these expansions can impact organismal physiology. Furthermore, fish that are considered invasive contained a higher average number of HSP70 paralogs than non-invasive fish. Invasive fish also had higher average numbers of HSP90, MHC and GST paralogs, but not SOD paralogs. Taken together, we propose that expansions in key cellular stress response gene families represent novel genetic signatures that correlate with invasive potential.
Affiliation(s)
- Taylor R Stanley
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Karen S Kim Guisbert
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Sabrina M Perez
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Morgan Oneka
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Isabela Kernin
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Nicole R Higgins
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Alexandra Lobo
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Munevver M Subasi
- Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- David J Carroll
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Ralph G Turingan
- Department of Ocean Engineering and Marine Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
- Eric Guisbert
- Department of Biomedical and Chemical Engineering and Sciences, Florida Institute of Technology, Melbourne, FL 32937, USA
6
Dimos B, Emery M, Beavers K, MacKnight N, Brandt M, Demuth J, Mydlarz L. Adaptive Variation in Homolog Number Within Transcript Families Promotes Expression Divergence in Reef-Building Coral. Mol Ecol 2022; 31:2594-2610. [PMID: 35229964 DOI: 10.1111/mec.16414] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/05/2021] [Revised: 02/10/2022] [Accepted: 02/22/2022] [Indexed: 11/30/2022]
Abstract
Gene expression, especially in multi-species experiments, is used to gain insight into the genetic basis of how organisms adapt and respond to changing environments. However, evolutionary processes that can influence gene expression patterns between species, such as the presence of paralogs arising from gene duplication events, are rarely accounted for. Paralogous transcripts can alter the transcriptional output of a gene, and thus excluding these transcripts can obscure important biological differences between species. To address this issue, we investigated how differences in transcript family size are associated with divergent gene expression patterns in five species of Caribbean reef-building corals. We demonstrate that transcript families that are rapidly evolving in terms of size have increased levels of expression divergence. Additionally, these rapidly evolving transcript families are enriched for multiple biological processes, with genes involved in the coral innate immune system demonstrating pronounced variation in homolog number between species. Overall, this investigation demonstrates the importance of incorporating paralogous transcripts when comparing gene expression across species, because they influence both transcriptional output and the number of transcripts within biological processes. As this investigation was based on transcriptome assemblies, additional insights into the relationship between gene duplications and expression patterns will likely emerge once more genome assemblies are available for study.
Affiliation(s)
- Bradford Dimos
- Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Madison Emery
- Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Kelsey Beavers
- Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Nicholas MacKnight
- Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Marilyn Brandt
- Center for Marine and Environmental Studies, University of the Virgin Islands, St. Thomas, US Virgin Islands, 00802, USA
- Jeffery Demuth
- Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Laura Mydlarz
- Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
7
Burks DJ, Azad RK. RNA-Seq Data Analysis Pipeline for Plants: Transcriptome Assembly, Alignment, and Differential Expression Analysis. Methods Mol Biol 2022; 2396:47-60. [PMID: 34786675 DOI: 10.1007/978-1-0716-1822-6_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
In this chapter, we describe methods for analyzing RNA-Seq data, presented as a flow along a pipeline beginning with raw data from a sequencer and ending with an output of differentially expressed genes and their functional characterization. The first section covers de novo transcriptome assembly for organisms lacking reference genomes or for those interested in probing against the background of organism-specific transcriptomes assembled from RNA-Seq data. Section 2 covers both gene- and transcript-level quantifications, leading to the third and final section on differential expression analysis between two or more conditions. The pipeline starts with raw sequence reads, followed by quality assessment and preprocessing of the input data to ensure a robust estimate of the transcripts and their differential regulation. The preprocessed data can be fed into the de novo transcriptome flow to assemble transcripts, which are functionally annotated using tools such as InterProScan or Blast2GO and then forwarded to the differential expression analysis flow, or fed directly into the differential expression analysis flow if a reference genome is available. An online repository containing sample data has also been made available, as well as custom Python scripts to modify the output of the programs within the pipeline for various downstream analyses.
Affiliation(s)
- David J Burks
- Department of Biological Sciences and BioDiscovery Institute, University of North Texas, Denton, TX, USA
- Rajeev K Azad
- Department of Biological Sciences and BioDiscovery Institute, University of North Texas, Denton, TX, USA.
- Department of Mathematics, University of North Texas, Denton, TX, USA.
8
Nachtigall PG, Rautsaw RM, Ellsworth SA, Mason AJ, Rokyta DR, Parkinson CL, Junqueira-de-Azevedo ILM. ToxCodAn: a new toxin annotator and guide to venom gland transcriptomics. Brief Bioinform 2021; 22:6235957. [PMID: 33866357 DOI: 10.1093/bib/bbab095] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/05/2021] [Revised: 02/15/2021] [Accepted: 03/03/2021] [Indexed: 01/23/2023] Open
Abstract
MOTIVATION: Next-generation sequencing has become exceedingly common and has transformed our ability to explore nonmodel systems. In particular, transcriptomics has facilitated the study of venom and evolution of toxins in venomous lineages; however, many challenges remain. Primarily, annotation of toxins in the transcriptome is a laborious and time-consuming task. Current annotation software often fails to predict the correct coding sequence and overestimates the number of toxins present in the transcriptome. Here, we present ToxCodAn, a Python script designed to perform precise annotation of snake venom gland transcriptomes. We test ToxCodAn with a set of previously curated transcriptomes and compare the results to other annotators. In addition, we provide a guide for venom gland transcriptomics to facilitate future research and use Bothrops alternatus as a case study for ToxCodAn and our guide. RESULTS: Our analysis reveals that ToxCodAn provides precise annotation of toxins present in the transcriptome of venom glands of snakes. Comparison with other annotators demonstrates that ToxCodAn has better performance with regard to run time (>20× faster), coding sequence prediction (>3× more accurate) and the number of toxins predicted (generating >4× fewer false positives). In this sense, ToxCodAn is a valuable resource for toxin annotation. The ToxCodAn framework can be expanded in the future to work with other venomous lineages and detect novel toxins.
Affiliation(s)
- Pedro G Nachtigall
- Laboratório de Toxinologia Aplicada, CeTICS, Instituto Butantan, São Paulo, SP 05503-900, Brazil
- Rhett M Rautsaw
- Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
- Schyler A Ellsworth
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- Andrew J Mason
- Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
- Darin R Rokyta
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- Christopher L Parkinson
- Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
- Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC 29634, USA
9
Modern Approaches for Transcriptome Analyses in Plants. Advances in Experimental Medicine and Biology 2021; 1346:11-50. [DOI: 10.1007/978-3-030-80352-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
10
Marisaldi L, Basili D, Gioacchini G, Canapa A, Carnevali O. De novo transcriptome assembly, functional annotation and characterization of the Atlantic bluefin tuna (Thunnus thynnus) larval stage. Mar Genomics 2020; 58:100834. [PMID: 33371994 DOI: 10.1016/j.margen.2020.100834] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/29/2020] [Revised: 12/10/2020] [Accepted: 12/11/2020] [Indexed: 10/22/2022]
Abstract
In the present work, we assembled and characterized a de novo larval transcriptome of the Atlantic bluefin tuna Thunnus thynnus by taking advantage of publicly available databases, with the goal of better understanding its larval development. The assembled transcriptome comprised 37,117 protein-coding transcripts, of which 13,633 were full-length (>80% coverage), with an Ex90N50 of 3061 bp and 76% of core vertebrate gene orthologues recovered as complete and single-copy. Of these transcripts, 34,980 had a hit against the EggNOG database and 14,983 against the KEGG database. Codon usage bias was identified in processes such as translation and muscle development. By comparing our data with a set of representative fish species, 87.1% of tuna transcripts were included in orthogroups with other species and 5.1% in assembly-specific orthogroups, which were enriched in terms related to muscle and bone development, visual system and ion transport. Following this comparative approach, protein families related to myosin, extracellular matrix and immune system were found to be significantly expanded in the Atlantic bluefin tuna. Altogether, these results provide a glimpse of how the Atlantic bluefin tuna might have achieved early physical advantages over competing species in the pelagic environment. The information generated lays the foundation for more detailed exploration of physiological responses at the molecular level in different larval stages and paves the way for evolutionary studies on the Atlantic bluefin tuna.
Affiliation(s)
- Luca Marisaldi
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
- Danilo Basili
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy; Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
- Giorgia Gioacchini
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
- Adriana Canapa
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
- Oliana Carnevali
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy.
11
Miralles A, Bruy T, Wolcott K, Scherz MD, Begerow D, Beszteri B, Bonkowski M, Felden J, Gemeinholzer B, Glaw F, Glöckner FO, Hawlitschek O, Kostadinov I, Nattkemper TW, Printzen C, Renz J, Rybalka N, Stadler M, Weibulat T, Wilke T, Renner SS, Vences M. Repositories for Taxonomic Data: Where We Are and What is Missing. Syst Biol 2020; 69:1231-1253. [PMID: 32298457 PMCID: PMC7584136 DOI: 10.1093/sysbio/syaa026] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 11/13/2019] [Revised: 02/20/2020] [Accepted: 03/24/2020] [Indexed: 12/05/2022] Open
Abstract
Natural history collections are leading successful large-scale projects of specimen digitization (images, metadata, DNA barcodes), thereby transforming taxonomy into a big data science. Yet, little effort has been directed towards safeguarding and subsequently mobilizing the considerable amount of original data generated during the process of naming 15,000-20,000 species every year. From the perspective of alpha-taxonomists, we provide a review of the properties and diversity of taxonomic data, assess their volume and use, and establish criteria for optimizing data repositories. We surveyed 4113 alpha-taxonomic studies in representative journals for 2002, 2010, and 2018, and found an increasing yet comparatively limited use of molecular data in species diagnosis and description. In 2018, of the 2661 papers published in specialized taxonomic journals, molecular data were widely used in mycology (94%), regularly in vertebrates (53%), but rarely in botany (15%) and entomology (10%). Images play an important role in taxonomic research on all taxa, with photographs used in >80% and drawings in 58% of the surveyed papers. The use of omics (high-throughput) approaches or 3D documentation is still rare. Improved archiving strategies for metabarcoding consensus reads, genome and transcriptome assemblies, and chemical and metabolomic data could help to mobilize the wealth of high-throughput data for alpha-taxonomy. Because long-term (ideally perpetual) data storage is of particular importance for taxonomy, energy footprint reduction via less storage-demanding formats is a priority if their information content suffices for the purpose of taxonomic studies. Whereas taxonomic assignments are quasifacts for most biological disciplines, they remain hypotheses pertaining to evolutionary relatedness of individuals for alpha-taxonomy.
For this reason, an improved reuse of taxonomic data, including machine-learning-based species identification and delimitation pipelines, requires a cyberspecimen approach: linking data via unique specimen identifiers, and thereby making them findable, accessible, interoperable, and reusable for taxonomic research. This poses both qualitative challenges to adapt the existing infrastructure of data centers to a specimen-centered concept and quantitative challenges to host and connect an estimated ≤2 million images produced per year by alpha-taxonomic studies, plus many millions of images from digitization campaigns. Of the 30,000-40,000 taxonomists globally, many are thought to be nonprofessionals, and capturing the data for online storage and reuse therefore requires low-complexity submission workflows and cost-free repository use. Expert taxonomists are the main stakeholders able to identify and formalize the needs of the discipline; their expertise is needed to implement the envisioned virtual collections of cyberspecimens. [Big data; cyberspecimen; new species; omics; repositories; specimen identifier; taxonomy; taxonomic data.]
Collapse
Affiliation(s)
- Aurélien Miralles
- Departement Origins and Evolution, Institut Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, 57 rue Cuvier, CP50, 75005 Paris, France
- Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
| | - Teddy Bruy
- Departement Origins and Evolution, Institut Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, 57 rue Cuvier, CP50, 75005 Paris, France
- Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
| | - Katherine Wolcott
- Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
- National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - Mark D Scherz
- Department of Herpetology, Zoologische Staatssammlung München (ZSM-SNSB), Münchhausenstraße 21, 81247 München, Germany
- Department of Biology, Universität Konstanz, Universitätstraße 10, 78464 Konstanz, Germany
| | - Dominik Begerow
- Department of Geobotany, Ruhr-University Bochum, Universitätsstraße 150, 44780 Bochum, Germany
| | - Bank Beszteri
- Department of Phycology, Faculty of Biology, University of Duisburg-Essen, Universitätsstraße 2, 45141 Essen, Germany
| | - Michael Bonkowski
- Department of Terrestrial Ecology, Center of Excellence in Plant Sciences (CEPLAS), Terrestrial Ecology, Institute of Zoology, University of Cologne, 50674 Köln, Germany
| | - Janine Felden
- MARUM - Center for Marine Environmental Sciences, University of Bremen, Leobenerstraße 8, 28359 Bremen, Germany
- Alfred Wegener Institute - Helmholtz Center for Polar- and Marine Research, Am Handelshafen 12, 27570 Bremerhaven, Germany
| | - Birgit Gemeinholzer
- Department of Systematic Botany, Justus Liebig University Gießen, Heinrich-Buff Ring 38, 35392 Giessen, Germany
| | - Frank Glaw
- Department of Herpetology, Zoologische Staatssammlung München (ZSM-SNSB), Münchhausenstraße 21, 81247 München, Germany
| | - Frank Oliver Glöckner
- Alfred Wegener Institute - Helmholtz Center for Polar- and Marine Research, Am Handelshafen 12, 27570 Bremerhaven, Germany
| | - Oliver Hawlitschek
- Department of Herpetology, Zoologische Staatssammlung München (ZSM-SNSB), Münchhausenstraße 21, 81247 München, Germany
- Department of Scientific Infrastructure, Centrum für Naturkunde (CeNak), Universität Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany
| | - Ivaylo Kostadinov
- GFBio - Gesellschaft für Biologische Daten e.V., c/o Research II, Campus Ring 1, 28759 Bremen, Germany
| | - Tim W Nattkemper
- Biodata Mining Group, Center of Biotechnology (CeBiTec), Bielefeld University, PO Box 100131, 33501 Bielefeld, Germany
| | - Christian Printzen
- Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Senckenberganlage 25, 60325 Frankfurt/Main, Germany
| | - Jasmin Renz
- Zooplankton Research Group, DZMB – Senckenberg am Meer, Martin-Luther-King Platz 3, 20146 Hamburg, Germany
| | - Nataliya Rybalka
- Department of Experimental Phycology and Culture Collection of Algae, University Göttingen, Nikolausberger-Weg 18, 37073 Göttingen, Germany
| | - Marc Stadler
- Department Microbial Drugs, Helmholtz Centre for Infection Research (HZI), and German Centre for Infection Research (DZIF), Partner Site Hannover-Braunschweig, Inhoffenstrasse 7, 38124 Braunschweig, Germany
| | - Tanja Weibulat
- GFBio - Gesellschaft für Biologische Daten e.V., c/o Research II, Campus Ring 1, 28759 Bremen, Germany
| | - Thomas Wilke
- Department of Animal Ecology and Systematics, Justus Liebig University Gießen, Heinrich-Buff Ring 26, 35392 Giessen, Germany
| | - Susanne S Renner
- Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
| | - Miguel Vences
- Department of Evolutionary Biology, Zoological Institute, Technische Universität Braunschweig, Mendelssohnstraße 4, 38106 Braunschweig, Germany
|
12
|
Liu X, Hérault F, Diot C, Corre E. Development of a relevant strategy using de novo transcriptome assembly method for transcriptome comparisons between Muscovy and common duck species and their reciprocal inter-specific mule and hinny hybrids fed ad libitum and overfed. BMC Genomics 2020; 21:687. [PMID: 33008290 PMCID: PMC7531116 DOI: 10.1186/s12864-020-07099-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/20/2020] [Accepted: 09/23/2020] [Indexed: 12/01/2022] Open
Abstract
Background: Common Pekin and Muscovy ducks and their intergeneric hinny and mule hybrids have different abilities for fatty liver production. RNA-Seq analyses from the liver of these different genetic types fed ad libitum or overfed would help to identify genes with different responses to overfeeding between them. However, RNA-seq analysis and comparison across different species is challenging. The goal of this study was to develop a relevant strategy for transcriptome analysis and comparison between different species. Results: Transcriptomes were first assembled with a reference-based approach. Important mapping biases were observed when heterologous mapping was conducted on the common duck reference genome, suggesting that this reference-based strategy was not suited to compare the four different genetic types. De novo transcriptome assemblies were then performed using Trinity and Oases. Assemblies of transcriptomes were not relevant when more than a single genetic type was considered. Finally, single genetic type transcriptomes were assembled with DRAP in a mega-transcriptome. No bias was observed when reads from the different genetic types were mapped on this mega-transcriptome, and differences in gene expression between the four genetic types could be identified. Conclusions: Analyses using both reference-based and de novo transcriptome assemblies point to a good performance of the de novo approach for the analysis of gene expression in different species. It also allowed the identification of differences in responses to overfeeding between Pekin and Muscovy ducks and hinny and mule hybrids.
Affiliation(s)
- Xi Liu
- ABiMS Bioinformatics Facility, CNRS, Sorbonne Université, FR2424, Station Biologique, 29680, Roscoff, France
| | - Frédéric Hérault
- UMR PEGASE, INRAE, Institut Agro, 16 Le Clos, 35590, Saint-Gilles, France
| | - Christian Diot
- UMR PEGASE, INRAE, Institut Agro, 16 Le Clos, 35590, Saint-Gilles, France.
| | - Erwan Corre
- ABiMS Bioinformatics Facility, CNRS, Sorbonne Université, FR2424, Station Biologique, 29680, Roscoff, France.
|
13
|
Nielsen ES, Henriques R, Beger M, Toonen RJ, von der Heyden S. Multi-model seascape genomics identifies distinct environmental drivers of selection among sympatric marine species. BMC Evol Biol 2020; 20:121. [PMID: 32938400 PMCID: PMC7493327 DOI: 10.1186/s12862-020-01679-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/10/2020] [Accepted: 08/24/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND As global change and anthropogenic pressures continue to increase, conservation and management increasingly need to consider species' potential to adapt to novel environmental conditions. Therefore, it is imperative to characterise the main selective forces acting on ecosystems, and how these may influence the evolutionary potential of populations and species. Using a multi-model seascape genomics approach, we compare putative environmental drivers of selection in three sympatric southern African marine invertebrates with contrasting ecology and life histories: Cape urchin (Parechinus angulosus), Common shore crab (Cyclograpsus punctatus), and Granular limpet (Scutellastra granularis). RESULTS Using pooled (Pool-seq), restriction-site associated DNA sequencing (RAD-seq), and seven outlier detection methods, we characterise genomic variation between populations along a strong biogeographical gradient. Of the three species, only S. granularis showed significant isolation-by-distance, and isolation-by-environment driven by sea surface temperatures (SST). In contrast, sea surface salinity (SSS) and range in air temperature correlated more strongly with genomic variation in C. punctatus and P. angulosus. Differences were also found in genomic structuring between the three species, with outlier loci contributing to two clusters in the East and West Coasts for S. granularis and P. angulosus, but not for C. punctatus. CONCLUSION The findings illustrate distinct evolutionary potential across species, suggesting that species-specific habitat requirements and responses to environmental stresses may be better predictors of evolutionary patterns than the strong environmental gradients within the region. We also found large discrepancies between outlier detection methodologies, and thus offer a novel multi-model approach to identifying the principal environmental selection forces acting on species. Overall, this work highlights how adding a comparative approach to seascape genomics (both with multiple models and species) can elucidate the intricate evolutionary responses of ecosystems to global change.
Affiliation(s)
- Erica S Nielsen
- Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Private Bag X1, Matieland, 7602, South Africa
| | - Romina Henriques
- Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Private Bag X1, Matieland, 7602, South Africa; Technical University of Denmark, National Institute of Aquatic Resources, Section for Marine Living Resources, Velsøvej 39, 8600, Silkeborg, Denmark
| | - Maria Beger
- School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
| | - Robert J Toonen
- Hawai'i Institute of Marine Biology, University of Hawai'i at Mānoa, Kāne'ohe, HI, 96744, USA
| | - Sophie von der Heyden
- Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Private Bag X1, Matieland, 7602, South Africa.
|
14
|
Zhang L, Yang J, Li H, You J, Chatterjee N, Zhang X. Development of the transcriptome for a sediment ecotoxicological model species, Chironomus dilutus. CHEMOSPHERE 2020; 244:125541. [PMID: 32050339 DOI: 10.1016/j.chemosphere.2019.125541] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/22/2019] [Revised: 11/04/2019] [Accepted: 12/03/2019] [Indexed: 06/10/2023]
Abstract
Chironomus dilutus is a prominent model species in conventional sediment toxicity testing and sediment contamination diagnosis. However, a lack of genomic data has significantly limited its application in identifying the toxicological mode of action (MOA) and molecular biomarkers of toxicants. Here, the transcriptome of C. dilutus across the full life span and both sexes (1st, 2nd, 3rd and 4th instar larvae, pupae, and adults) was developed, and temporal gene expression across adjacent life stages was investigated to understand the regulation of development. Furthermore, the transcriptional responses of midges (4th instar larvae) exposed to chemicals with different MOAs (CdCl2, nonylphenol and triclosan) were profiled based on the reference transcriptome. Consequently, a complete transcriptome of 31132 unigenes with an N50 of 3117 bp, covering 98.8% of the arthropod single-copy orthologs, was assembled. While 364 genes were differentially expressed among adjacent larval stages, 7142 and 2127 transcripts were significantly changed in the larvae-pupae and pupae-adults transitions, respectively. Finally, chemical-specific gene expression profiles were identified in the midges, showing their potential for classifying distinct contaminants. Overall, the comprehensive transcriptome of C. dilutus developed here could not only facilitate the mechanistic understanding of environmental toxicants during critical life stages of aquatic insects, but also provide molecular diagnostic tools in sediment ecotoxicology.
Affiliation(s)
- Lijuan Zhang
- State Key Laboratory of Pollution Control & Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, 210023, China
| | - Jianghua Yang
- State Key Laboratory of Pollution Control & Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, 210023, China
| | - Huizhen Li
- School of Environment and Guangdong Key Laboratory of Environmental Pollution and Health, Jinan University, Guangzhou, Guangdong, 510632, China.
| | - Jing You
- School of Environment and Guangdong Key Laboratory of Environmental Pollution and Health, Jinan University, Guangzhou, Guangdong, 510632, China
| | - Nivedita Chatterjee
- State Key Laboratory of Pollution Control & Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, 210023, China
| | - Xiaowei Zhang
- State Key Laboratory of Pollution Control & Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, 210023, China.
|
15
|
Carvajal-Lopez P, Von Borstel FD, Torres A, Rustici G, Gutierrez J, Romero-Vivas E. Microarray-Based Quality Assessment as a Supporting Criterion for de novo Transcriptome Assembly Selection. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:198-206. [PMID: 30059314 DOI: 10.1109/tcbb.2018.2860997] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
RNA-Sequencing and de novo assembly have enabled the analysis of species with non-available reference transcriptomes, although intrinsic features (biological and technical) induce errors in the reconstruction. A strategy to resolve these errors consists of varying assembling process parameters to generate multiple reconstructions. However, the best assembly selection remains a challenge. Quantitative metrics for quality assessment have been inconsistent when compared with pertinent references. In this paper, a criterion for supporting assembly selection based on mapping DNA microarray hybridized probes to assembly sets is proposed. Mouse and fruit fly RNA-Seq datasets were assembled with standard de novo procedures. Quality assessment was estimated using quantitative metrics and the proposed criterion. The assembly that best mapped to the available reference transcriptomes of these model species provided the highest quality assembly. The hybridized probes identified the best assemblies, whereas quantitative metrics remained inconsistent. For example, subtle probe mapping difference of 0.25 percent, but statistically significant (ANOVA, p < 0.05), enabled the assembly selection that led to identify 3,719 more contigs and led to 1,049 further mapped contigs to the mouse reference transcriptome. The microarray data availability for non-model species makes the proposed criterion suitable for quality assessment of multiple de novo assembly strategies.
|
16
|
Hart AJ, Ginzburg S, Xu MS, Fisher CR, Rahmatpour N, Mitton JB, Paul R, Wegrzyn JL. EnTAP: Bringing faster and smarter functional annotation to non-model eukaryotic transcriptomes. Mol Ecol Resour 2019; 20:591-604. [PMID: 31628884 DOI: 10.1111/1755-0998.13106] [Citation(s) in RCA: 114] [Impact Index Per Article: 19.0] [Received: 06/10/2018] [Revised: 09/18/2019] [Accepted: 09/24/2019] [Indexed: 11/28/2022]
Abstract
EnTAP (Eukaryotic Non-Model Transcriptome Annotation Pipeline) was designed to improve the accuracy, speed, and flexibility of functional gene annotation for de novo assembled transcriptomes in non-model eukaryotes. This software package addresses the fragmentation and related assembly issues that result in inflated transcript estimates and poor annotation rates of protein-coding transcripts. Following filters applied through assessment of true expression and frame selection, open-source tools are leveraged to functionally annotate the reduced set of translated proteins. Downstream features include fast similarity search across five repositories, protein domain assignment, orthologous gene family assessment, and Gene Ontology (GO) term assignment. The final annotation integrates across multiple databases and selects an optimal assignment from a combination of weighted metrics describing similarity search score, taxonomic relationship, and informativeness. Researchers have the option to include additional filters to identify and remove contaminants, identify associated pathways, and prepare the transcripts for enrichment analysis. This fully featured pipeline is easy to install, configure, and runs significantly faster than comparable annotation packages. EnTAP is optimized to generate extensive functional information for the gene space of organisms with limited or poorly characterized genomic resources.
Affiliation(s)
- Alexander J Hart
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Samuel Ginzburg
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Muyang Sam Xu
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Cera R Fisher
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Nasim Rahmatpour
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jeffry B Mitton
- Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA
| | - Robin Paul
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
|
17
|
Comparative Analysis of Strategies for De Novo Transcriptome Assembly in Prokaryotes: Streptomyces clavuligerus as a Case Study. High Throughput 2019; 8:ht8040020. [PMID: 31801255 PMCID: PMC6970227 DOI: 10.3390/ht8040020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/31/2019] [Revised: 11/20/2019] [Accepted: 11/23/2019] [Indexed: 12/15/2022] Open
Abstract
The performance of software tools for de novo transcriptome assembly greatly depends on the selection of software parameters. Up to now, the development of de novo transcriptome assembly for prokaryotes has not been as remarkable as that for eukaryotes. In this contribution, Rockhopper2 was used to perform a comparative transcriptome analysis of Streptomyces clavuligerus exposed to diverse environmental conditions. The study focused on assessing the incidence of software parameters on software performance for the identification of differentially expressed genes as a final goal. For this, a statistical optimization was performed using the Transrate Assembly Score (TAS). TAS was also used for evaluating the software performance and for comparing it with related tools, e.g., Trinity. Transcriptome redundancy and completeness were also considered for this analysis. Rockhopper2 and Trinity reached a TAS value of 0.55092 and 0.58337, respectively. Trinity assembles transcriptomes with high redundancy, with 55.6% of transcripts having some duplicates. Additionally, we observed that the total number of differentially expressed genes (DEG) and their annotation greatly depends on the method used for removing redundancy and the tools used for transcript quantification. To our knowledge, this is the first work aimed at assessing de novo assembly software for prokaryotic organisms.
|
18
|
dos Santos ÍGD, de Oliveira Mendes TA, Silva GAB, Reis AMS, Monteiro-Vitorello CB, Schaker PDC, Herai RH, Fabotti ABC, Coutinho LL, Jorge EC. Didelphis albiventris: an overview of unprecedented transcriptome sequencing of the white-eared opossum. BMC Genomics 2019; 20:866. [PMID: 31730444 PMCID: PMC6858782 DOI: 10.1186/s12864-019-6240-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2019] [Accepted: 10/29/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The white-eared opossum (Didelphis albiventris) is widely distributed throughout Brazil and South America. It has been used as an animal model for studying different scientific questions ranging from the restoration of degraded green areas to medical aspects of Chagas disease, leishmaniasis and resistance against snake venom. As a marsupial, D. albiventris can also contribute to the understanding of the molecular mechanisms that govern the different stages of organogenesis. Opossum joeys are born after only 13 days, and the final stages of organogenesis occur when the neonates are inside the pouch, depending on lactation. As neither the genome of this opossum species nor its transcriptome has been completely sequenced, the use of D. albiventris as an animal model is limited. In this work, we sequenced the D. albiventris transcriptome by RNA-seq to obtain the first catalogue of differentially expressed (DE) genes and gene ontology (GO) annotations during the neonatal stages of marsupial development. RESULTS The D. albiventris transcriptome was obtained from whole neonates harvested at birth (P0), at 5 days of age (P5) and at 10 days of age (P10). The de novo assembly of these transcripts generated 85,338 transcripts. Approximately 30% of these transcripts could be mapped against the amino acid sequences of M. domestica, the evolutionarily closest relative of D. albiventris to be sequenced thus far. Among the expressed transcripts, 2077 were found to be DE between P0 and P5, 13,780 between P0 and P10, and 1453 between P5 and P10. The enriched GO terms were mainly related to the immune system, blood tissue development and differentiation, vision, hearing, digestion, the CNS and limb development. CONCLUSIONS The elucidation of opossum transcriptomes provides an out-group for better understanding the distinct characteristics associated with the evolution of mammalian species. This study provides the first transcriptome sequences and catalogue of genes for a marsupial species at different neonatal stages, allowing the study of the mechanisms involved in organogenesis.
Affiliation(s)
- Íria Gabriela Dias dos Santos
- Departamento de Morfologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais Brazil
| | | | - Gerluza Aparecida Borges Silva
- Departamento de Morfologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais Brazil
| | - Amanda Maria Sena Reis
- Departamento de Morfologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais Brazil
| | | | - Patricia Dayane Carvalho Schaker
- Departamento de Genética, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, São Paulo Brazil
| | - Roberto Hirochi Herai
- Graduate Program in Health Sciences, School of Medicine, Pontifícia Universidade Católica do Paraná (PUCPR), Curitiba, Paraná, Brazil
| | | | - Luiz Lehmann Coutinho
- Departamento de Zootecnia, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, São Paulo Brazil
| | - Erika Cristina Jorge
- Departamento de Morfologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais Brazil
|
19
|
Durai DA, Schulz MH. In silico read normalization using set multi-cover optimization. Bioinformatics 2019; 34:3273-3280. [PMID: 29912280 PMCID: PMC6157080 DOI: 10.1093/bioinformatics/bty307] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/19/2017] [Accepted: 04/18/2018] [Indexed: 11/24/2022] Open
Abstract
Motivation De Bruijn graphs are a common assembly data structure for sequencing datasets. But with the advances in sequencing technologies, assembling high coverage datasets has become a computational challenge. Read normalization, which removes redundancy in datasets, is widely applied to reduce resource requirements. Current normalization algorithms, though efficient, provide no guarantee to preserve important k-mers that form connections between regions in the graph. Results Here, normalization is phrased as a set multi-cover problem on reads and a heuristic algorithm, Optimized Read Normalization Algorithm (ORNA), is proposed. ORNA normalizes to the minimum number of reads required to retain all k-mers and their relative k-mer abundances from the original dataset. Hence, all connections from the original graph are preserved. ORNA was tested on various RNA-seq datasets with different coverage values. It was compared to the current normalization algorithms and was found to be performing better. Normalizing error corrected data allows for more accurate assemblies compared to the normalized uncorrected dataset. Further, an application is proposed in which multiple datasets are combined and normalized to predict novel transcripts that would have been missed otherwise. Finally, ORNA is a general purpose normalization algorithm that is fast and significantly reduces datasets with loss of assembly quality in between [1, 30]% depending on reduction stringency. Availability and implementation ORNA is available at https://github.com/SchulzLab/ORNA. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dilip A Durai
- Cluster of Excellence on Multimodal Computing and Interaction, Saarland University, Saarbrücken, Germany; Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany; Saarbrücken Graduate School of Computer Science, Saarland University, Saarbrücken, Germany
| | - Marcel H Schulz
- Cluster of Excellence on Multimodal Computing and Interaction, Saarland University, Saarbrücken, Germany; Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany
|
20
|
Comprehensive Stress-Based De Novo Transcriptome Assembly and Annotation of Guar (Cyamopsis tetragonoloba (L.) Taub.): An Important Industrial and Forage Crop. Int J Genomics 2019; 2019:7295859. [PMID: 31687376 PMCID: PMC6800914 DOI: 10.1155/2019/7295859] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/21/2019] [Revised: 08/23/2019] [Accepted: 09/05/2019] [Indexed: 11/17/2022] Open
Abstract
The forage crop Guar (Cyamopsis tetragonoloba (L.) Taub.) has the ability to endure heat, drought, and mild salinity. A complete picture of its genic architecture will improve our understanding of gene expression networks and different tolerance mechanisms at the molecular level. Therefore, a whole-mRNA sequencing approach was applied to the Guar plant to provide a snapshot of the mRNA information in the cell under salinity, heat, and drought stresses, to be integrated with previous transcriptomic studies. RNA-Seq technology was employed to perform 2 × 100 paired-end sequencing using an Illumina HiSeq 2500 platform for the transcriptome of leaves of C. tetragonoloba under normal, heat, drought, and salinity conditions. Trinity was used to produce a de novo assembly, followed by gene annotation, functional classification, metabolic pathway analysis, and identification of SSR markers. A total of 218.2 million paired-end raw reads (~44 Gbp) were generated. Of those, 193.5M paired-end reads of high quality were used to reconstruct a total of 161,058 transcripts (~266 Mbp) with N50 of 2552 bp and 61,508 putative genes. There were 6463 proteins having >90% full-length coverage against the Swiss-Prot database and 94% complete orthologs against Embryophyta. Approximately 62.87% of transcripts were blasted, 50.46% mapped, and 43.50% annotated. A total of 4715 InterProScan families, 3441 domains, 74 repeats, and 490 sites were detected. Biological processes, molecular functions, and cellular components comprised 64.12%, 25.42%, and 10.4%, respectively. The transcriptome was associated with 985 enzymes and 156 KEGG pathways. A total of 27,066 SSRs were identified, with an average frequency of one SSR/9.825 kb in the assembled transcripts. The resulting data will be helpful for advanced analysis of multi-stress tolerance in Guar.
|
21
|
Marine Fungi: Biotechnological Perspectives from Deep-Hypersaline Anoxic Basins. DIVERSITY 2019. [DOI: 10.3390/d11070113] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/26/2022]
Abstract
Deep-sea hypersaline anoxic basins (DHABs) are one of the most hostile environments on Earth. Even though DHABs have hypersaline conditions, anoxia and high hydrostatic pressure, they host incredible microbial biodiversity. Among eukaryotes inhabiting these systems, recent studies demonstrated that fungi are a quantitatively relevant component. Here, fungi can benefit from the accumulation of large amounts of organic material. Marine fungi are also known to produce bioactive molecules. In particular, halophilic and halotolerant fungi are a reservoir of enzymes and secondary metabolites with valuable applications in industrial, pharmaceutical, and environmental biotechnology. Here we report that among the fungal taxa identified from the Mediterranean and Red Sea DHABs, halotolerant halophilic species belonging to the genera Aspergillus and Penicillium can be used or screened for enzymes and bioactive molecules. Fungi living in DHABs can extend our knowledge about the limits of life, and the discovery of new species and molecules from these environments can have high biotechnological potential.
|
22
|
González-González A, Rubio-Meléndez ME, Ballesteros GI, Ramírez CC, Palma-Millanao R. Sex- and tissue-specific expression of odorant-binding proteins and chemosensory proteins in adults of the scarab beetle Hylamorpha elegans (Burmeister) (Coleoptera: Scarabaeidae). PeerJ 2019; 7:e7054. [PMID: 31223529 PMCID: PMC6571001 DOI: 10.7717/peerj.7054] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/21/2018] [Accepted: 05/02/2019] [Indexed: 12/04/2022] Open
Abstract
In this study, we addressed the sex- and tissue-specific expression patterns of odorant-binding proteins (OBPs) and chemosensory proteins (CSPs) in Hylamorpha elegans (Burmeister), an important native scarab beetle pest species from Chile. Similar to other members of its family, this scarab beetle exhibits habits that make the pest difficult to control by conventional methods. Hence, alternative ways to manage the pest populations based on chemical communication and signaling (such as disrupting mating or the host-finding process) are highly desirable. However, developing pest-control methods based on chemical communication requires an understanding of the molecular basis for pheromone recognition/chemical perception in this species. Thus, with the aim of discovering olfaction-related genes, we obtained the first reference transcriptome assembly of H. elegans. We used different tissues of adult beetles from males and females: antennae and maxillary palps, which are well known for embedded sensory organs. Then, the expression of predicted OBPs and CSPs was analyzed by qRT-PCR. In total, 165 transcripts related to chemoperception were predicted. Of these, 16 OBPs, including one pheromone-binding protein (PBP), and four CSPs were successfully amplified by qRT-PCR. All of these genes were differentially expressed in the sensory tissues with respect to the tibial tissue that was used as a control. The single predicted PBP found was highly expressed in the antennal tissues, particularly in males, while several OBPs and one CSP showed male-biased expression patterns, suggesting that these proteins may participate in the sexual recognition process. In addition, a single CSP was expressed at higher levels in female palps than in any other studied condition, suggesting that this CSP may participate in the oviposition process. Finally, all four CSPs exhibited palp-biased expression while mixed results were obtained for the expression of the OBPs, which were more abundant in the palps than in the antennae. These results suggest that these chemoperception proteins would be interesting novel targets for control of H. elegans, thus providing a theoretical basis for further studies involving new pest control methods.
Affiliation(s)
- Angélica González-González
- Centre in Molecular and Functional Ecology, Universidad de Talca, Talca, Chile; Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile
| | - María E Rubio-Meléndez
- Centro de Bioinformática y Simulación Molecular (CBSM), Facultad de Ingeniería, Universidad de Talca, Talca, Maule, Chile
| | - Gabriel I Ballesteros
- Centre in Molecular and Functional Ecology, Universidad de Talca, Talca, Chile; Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile
| | - Claudio C Ramírez
- Centre in Molecular and Functional Ecology, Universidad de Talca, Talca, Chile.,Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile
| | - Rubén Palma-Millanao
- Centre in Molecular and Functional Ecology, Universidad de Talca, Talca, Chile.,Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile
| |
Collapse
|
23
|
Tarrant AM, Nilsson B, Hansen BW. Molecular physiology of copepods - from biomarkers to transcriptomes and back again. Comparative Biochemistry and Physiology D-Genomics & Proteomics 2019; 30:230-247. [DOI: 10.1016/j.cbd.2019.03.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/20/2018] [Revised: 03/14/2019] [Accepted: 03/16/2019] [Indexed: 12/31/2022]
24
Torres-Sánchez M, Creevey CJ, Kornobis E, Gower DJ, Wilkinson M, San Mauro D. Multi-tissue transcriptomes of caecilian amphibians highlight incomplete knowledge of vertebrate gene families. DNA Res 2019; 26:13-20. [PMID: 30351380 PMCID: PMC6379020 DOI: 10.1093/dnares/dsy034] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 02/13/2018] [Accepted: 09/13/2018] [Indexed: 12/29/2022] Open
Abstract
RNA sequencing (RNA-seq) has become one of the most powerful tools to unravel the genomic basis of biological adaptation and diversity. Although challenging, RNA-seq is particularly promising for research on non-model, secretive species that cannot be observed in nature easily and therefore remain comparatively understudied. Among such animals, the caecilians (order Gymnophiona) likely constitute the least known group of vertebrates, despite being an old and remarkably distinct lineage of amphibians. Here, we characterize multi-tissue transcriptomes for five species of caecilians that represent a broad level of diversity across the order. We identified vertebrate homologous elements of caecilian functional genes of varying tissue specificity that reveal a great number of unclassified gene families, especially for the skin. We annotated several protein domains for those unknown candidate gene families to investigate their function. We also conducted supertree analyses of a phylogenomic dataset of 1,955 candidate orthologous genes among five caecilian species and other major lineages of vertebrates, with the inferred tree being in agreement with current views of vertebrate evolution and systematics. Our study provides insights into the evolution of vertebrate protein-coding genes, and a basis for future research on the molecular elements underlying the particular biology and adaptations of caecilian amphibians.
Affiliation(s)
- María Torres-Sánchez
- Department of Biodiversity, Ecology and Evolution, Complutense University of Madrid, Madrid, Spain
- Christopher J Creevey
- Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast, UK
- Etienne Kornobis
- Institut Pasteur, Bioinformatics and Biostatistics Hub, C3BI, USR 3756 IP CNRS, Paris, France
- David J Gower
- Department of Life Sciences, The Natural History Museum, London, UK
- Mark Wilkinson
- Department of Life Sciences, The Natural History Museum, London, UK
- Diego San Mauro
- Department of Biodiversity, Ecology and Evolution, Complutense University of Madrid, Madrid, Spain
25
Abstract
Specialized de novo assemblers for diverse datatypes have been developed and are in widespread use for the analyses of single-cell genomics, metagenomics and RNA-seq data. However, assembly of large sequencing datasets produced by modern technologies is challenging and computationally intensive. In-silico read normalization has been suggested as a computational strategy to reduce redundancy in read datasets, which leads to significant speedups and memory savings of assembly pipelines. Previously, we presented a set multi-cover optimization based approach, ORNA, where reads are reduced without losing important k-mer connectivity information, as used in assembly graphs. Here we propose extensions to ORNA, named ORNA-Q and ORNA-K, which consider a weighted set multi-cover optimization formulation for the in-silico read normalization problem. These novel formulations make use of the base quality scores obtained from sequencers (ORNA-Q) or k-mer abundances of reads (ORNA-K) to improve normalization further. We devise efficient heuristic algorithms for solving both formulations. In applications to human RNA-seq data, ORNA-Q and ORNA-K are shown to assemble more or equally many full length transcripts compared to other normalization methods at similar or higher read reduction values. The algorithm is implemented under the latest version of ORNA (v2.0, https://github.com/SchulzLab/ORNA).
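The idea of read normalization as a (weighted) set multi-cover problem can be illustrated with a small greedy sketch. This is not ORNA's actual algorithm: the fixed coverage target, the function names `kmers` and `normalize_reads`, and the optional per-read `weights` (standing in for quality scores or k-mer abundances) are all assumptions made for illustration. A read is kept only if at least one of its k-mers is still under-covered by the reads kept so far.

```python
from collections import Counter


def kmers(seq, k):
    """Return all k-mers of a sequence, in order (duplicates included)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_reads(reads, k=5, target=2, weights=None):
    """Greedy multi-cover style read reduction (illustrative only).

    Every k-mer must end up covered min(target, its total abundance) times
    by the kept reads. If weights are given (hypothetical per-read scores),
    higher-weighted reads are considered first.
    Returns the sorted indices of the reads that are kept.
    """
    total = Counter(km for read in reads for km in kmers(read, k))
    order = list(range(len(reads)))
    if weights is not None:
        # Stable sort: ties keep their original input order.
        order.sort(key=lambda i: -weights[i])

    covered = Counter()
    kept = []
    for i in order:
        read_kmers = kmers(reads[i], k)
        # Keep the read if any of its k-mers still needs coverage.
        if any(covered[km] < min(target, total[km]) for km in read_kmers):
            kept.append(i)
            covered.update(read_kmers)
    return sorted(kept)


reads = ["ACGTACGTAC"] * 5 + ["TTTTTGGGGG"]
print(normalize_reads(reads))                                # unweighted
print(normalize_reads(reads, weights=[1, 1, 1, 9, 9, 1]))    # weighted
```

In this toy input the five identical reads are reduced to two, while the single read carrying unique k-mers is always retained; the weights only change which of the redundant copies survive.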
26
Minio A, Massonnet M, Figueroa-Balderas R, Vondras AM, Blanco-Ulate B, Cantu D. Iso-Seq Allows Genome-Independent Transcriptome Profiling of Grape Berry Development. G3 (Bethesda, Md.) 2019; 9:755-767. [PMID: 30642874 PMCID: PMC6404599 DOI: 10.1534/g3.118.201008] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 10/31/2018] [Accepted: 01/09/2019] [Indexed: 01/13/2023]
Abstract
Transcriptomics has been widely applied to study grape berry development. With few exceptions, transcriptomic studies in grape are performed using the available genome sequence, PN40024, as reference. However, differences in gene content among grape accessions, which contribute to phenotypic differences among cultivars, suggest that a single reference genome does not represent the species' entire gene space. Though whole genome assembly and annotation can reveal the relatively unique or "private" gene space of any particular cultivar, transcriptome reconstruction is a more rapid, less costly, and less computationally intensive strategy to accomplish the same goal. In this study, we used single-molecule real-time (SMRT) sequencing to sequence full-length cDNA (Iso-Seq) and reconstruct the transcriptome of Cabernet Sauvignon berries during berry ripening. In addition, short reads from ripening berries were used to error-correct low-expression isoforms and to profile isoform expression. By comparing the annotated gene space of Cabernet Sauvignon to other grape cultivars, we demonstrate that the transcriptome reference built with Iso-Seq data represents most of the expressed genes in the grape berries and includes 1,501 cultivar-specific genes. Iso-Seq produced transcriptome profiles similar to those obtained after mapping on a complete genome reference. Together, these results justify the application of Iso-Seq to identify cultivar-specific genes and build a comprehensive reference for transcriptional profiling that circumvents the necessity of a genome reference with its associated costs and computational weight.
Affiliation(s)
- Andrea Minio
- Department of Viticulture and Enology, University of California Davis, Davis, CA
- Mélanie Massonnet
- Department of Viticulture and Enology, University of California Davis, Davis, CA
- Amanda M Vondras
- Department of Viticulture and Enology, University of California Davis, Davis, CA
- Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA
27
Sarwar MB, Ahmad Z, Rashid B, Hassan S, Gregersen PL, Leyva MDLO, Nagy I, Asp T, Husnain T. De novo assembly of Agave sisalana transcriptome in response to drought stress provides insight into the tolerance mechanisms. Sci Rep 2019; 9:396. [PMID: 30674899 PMCID: PMC6344536 DOI: 10.1038/s41598-018-35891-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/24/2018] [Accepted: 10/29/2018] [Indexed: 11/30/2022] Open
Abstract
Agave species, monocotyledonous succulent plants, are endemic to arid regions of North America and exhibit exceptional tolerance to their xeric environments. They employ various strategies to overcome environmental constraints, such as crassulacean acid metabolism, wax depositions, and protective leaf morphology. Genomic resources of Agave species have received little attention despite their cultural, economic and ecological importance, which has so far prevented the understanding of the molecular bases underlying their adaptations to the arid environment. In this study, we aimed to elucidate molecular mechanism(s) using transcriptome sequencing of A. sisalana. A de novo approach was applied to assemble paired-end reads. The expression study unveiled 3,095 differentially expressed unigenes between well-irrigated and drought-stressed leaf samples. Gene ontology and KEGG analysis specified a significant number of abiotic stress responsive genes and pathways involved in processes like hormonal responses, antioxidant activity, response to stress stimuli, wax biosynthesis, and ROS metabolism. We also identified transcripts belonging to several families harboring important drought-responsive genes. Our study provides the first insight into the genomic structure of A. sisalana underlying adaptations to drought stress, thus providing diverse genetic resources for drought tolerance breeding research.
Affiliation(s)
- Muhammad Bilal Sarwar
- Plant Genomics Lab, Center of Excellence in Molecular Biology, University of the Punjab, 87-West Canal Bank Road Thokar Niaz Baig, Lahore, 53700, Pakistan
- Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, Slagelse, Denmark
- Zarnab Ahmad
- Plant Genomics Lab, Center of Excellence in Molecular Biology, University of the Punjab, 87-West Canal Bank Road Thokar Niaz Baig, Lahore, 53700, Pakistan
- Bushra Rashid
- Plant Genomics Lab, Center of Excellence in Molecular Biology, University of the Punjab, 87-West Canal Bank Road Thokar Niaz Baig, Lahore, 53700, Pakistan.
- Sameera Hassan
- Plant Genomics Lab, Center of Excellence in Molecular Biology, University of the Punjab, 87-West Canal Bank Road Thokar Niaz Baig, Lahore, 53700, Pakistan
- Per L Gregersen
- Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, Slagelse, Denmark
- Maria De la O Leyva
- Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, Slagelse, Denmark
- Istvan Nagy
- Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, Slagelse, Denmark
- Torben Asp
- Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, Slagelse, Denmark
- Tayyab Husnain
- Plant Genomics Lab, Center of Excellence in Molecular Biology, University of the Punjab, 87-West Canal Bank Road Thokar Niaz Baig, Lahore, 53700, Pakistan
28
Kerr SC, Gaiti F, Tanurdzic M. De Novo Plant Transcriptome Assembly and Annotation Using Illumina RNA-Seq Reads. Methods Mol Biol 2019; 1933:265-275. [PMID: 30945191 DOI: 10.1007/978-1-4939-9045-0_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 06/09/2023]
Abstract
The ability to identify and quantify transcribed sequences from a multitude of organisms using high-throughput RNA sequencing has revolutionized our understanding of genetics and plant biology. However, a number of computational tools used in these analyses still require a reference genome sequence, something that is seldom available for non-model organisms. Computational tools employing de Bruijn graphs to reconstruct full-length transcripts from short sequence reads allow for de novo transcriptome assembly. Here we provide detailed methods for generating and annotating a de novo transcriptome assembly from plant RNA-seq data.
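The de Bruijn graph idea mentioned in this abstract can be sketched in a few lines. The example below is a toy illustration, not the procedure of any particular assembler: the function names `build_de_bruijn` and `walk_unambiguous` are hypothetical, reads are assumed error-free, and the walk only checks out-degree (real assemblers also handle branching, in-degree, coverage, and sequencing errors). Nodes are (k-1)-mers and each k-mer in a read contributes one edge.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while exactly one successor exists."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on a cycle rather than looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Three overlapping short reads drawn from one hypothetical transcript.
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
graph = build_de_bruijn(reads, k=4)
print(walk_unambiguous(graph, "ATG"))
```

Because the overlaps between the reads are captured as shared nodes, the walk reconstructs the full underlying sequence even though no single read spans it.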
Affiliation(s)
- Stephanie C Kerr
- School of Biological Sciences, The University of Queensland, St Lucia, QLD, Australia
- Federico Gaiti
- New York Genome Center and Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Milos Tanurdzic
- School of Biological Sciences, The University of Queensland, St Lucia, QLD, Australia.
29
Saban JM, Chapman MA, Taylor G. FACE facts hold for multiple generations; Evidence from natural CO2 springs. Global Change Biology 2019; 25:1-11. [PMID: 30422366 PMCID: PMC7379517 DOI: 10.1111/gcb.14437] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/04/2018] [Revised: 07/25/2018] [Accepted: 08/13/2018] [Indexed: 05/05/2023]
Abstract
Rising atmospheric CO2 concentration is a key driver of enhanced global greening, thought to account for up to 70% of increased global vegetation in recent decades. CO2 fertilization effects have further profound implications for ecosystems, food security and biosphere-atmosphere feedbacks. However, it is also possible that current trends will not continue, due to ecosystem level constraints and as plants acclimate to future CO2 concentrations. Future predictions of plant response to rising [CO2] are often validated using single-generation short-term FACE (Free Air CO2 Enrichment) experiments but whether this accurately represents vegetation response over decades is unclear. The role of transgenerational plasticity and adaptation in the multigenerational response has yet to be elucidated. Here, we propose that naturally occurring high CO2 springs provide a proxy to quantify the multigenerational and long-term impacts of rising [CO2] in herbaceous and woody species respectively, such that plasticity, transgenerational effects and genetic adaptation can be quantified together in these systems. In this first meta-analysis of responses to elevated [CO2] at natural CO2 springs, we show that the magnitude and direction of change in eight of nine functional plant traits are consistent between spring and FACE experiments. We found increased photosynthesis (49.8% in spring experiments, comparable to 32.1% in FACE experiments) and leaf starch (58.6% spring, 84.3% FACE), decreased stomatal conductance (gs, 27.2% spring, 21.1% FACE), leaf nitrogen content (6.3% spring, 13.3% FACE) and Specific Leaf Area (SLA, 9.7% spring, 6.0% FACE). These findings not only validate the use of these sites for studying multigenerational plant response to elevated [CO2], but additionally suggest that long-term positive photosynthetic responses to rising [CO2] are likely to continue as predicted by single-generation exposure FACE experiments.
Affiliation(s)
- Jasmine M. Saban
- Biological Sciences, University of Southampton, Life Sciences, Southampton, UK
- Mark A. Chapman
- Biological Sciences, University of Southampton, Life Sciences, Southampton, UK
- Gail Taylor
- Biological Sciences, University of Southampton, Life Sciences, Southampton, UK
- Department of Plant Sciences, University of California, Davis, California
30
MacManes MD. The Oyster River Protocol: a multi-assembler and kmer approach for de novo transcriptome assembly. PeerJ 2018; 6:e5428. [PMID: 30083482 PMCID: PMC6078068 DOI: 10.7717/peerj.5428] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 08/22/2017] [Accepted: 07/21/2018] [Indexed: 11/24/2022] Open
Abstract
Characterizing transcriptomes in non-model organisms has resulted in a massive increase in our understanding of biological phenomena. This boon, largely made possible via high-throughput sequencing, means that studies of functional, evolutionary, and population genomics are now being done by hundreds or even thousands of labs around the world. For many, these studies begin with a de novo transcriptome assembly, which is a technically complicated process involving several discrete steps. The Oyster River Protocol (ORP), described here, implements a standardized and benchmarked set of bioinformatic processes, resulting in an assembly with enhanced qualities over other standard assembly methods. Specifically, ORP-produced assemblies have higher Detonate and TransRate scores and mapping rates, largely because the protocol leverages a multi-assembler and kmer assembly process, thereby bypassing the shortcomings of any one approach. These improvements are important, as previously unassembled transcripts are included in ORP assemblies, resulting in a significant enhancement of the power of downstream analysis. Further, as part of this study, I show that assembly quality is unrelated to the number of reads generated above 30 million reads. Code Availability: The version controlled open-source code is available at https://github.com/macmanes-lab/Oyster_River_Protocol. Instructions for software installation and use, and other details are available at http://oyster-river-protocol.rtfd.org/.
Affiliation(s)
- Matthew D MacManes
- Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
31
Insect-specific viruses: from discovery to potential translational applications. Curr Opin Virol 2018; 33:33-41. [PMID: 30048906 DOI: 10.1016/j.coviro.2018.07.006] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 06/12/2018] [Revised: 06/29/2018] [Accepted: 07/04/2018] [Indexed: 12/17/2022]
Abstract
Over the past decade the scientific community has experienced a new age of virus discovery in arthropods in general, and in insects in particular. Next generation sequencing and advanced bioinformatics tools have provided new insights about insect viromes and viral evolution. In this review, we discuss some high-throughput sequencing technologies used to discover viruses in insects and the challenges raised in data interpretation. Additionally, the discovery of these novel viruses, which are considered insect-specific viruses (ISVs), has drawn increasing attention to their potential use as biological agents. As an example, we show how the ISV Nhumirim virus was used to reduce West Nile virus transmission when co-infecting the mosquito vector. We also discuss new translational opportunities of using ISVs to limit insect vector competence by using them to interfere with pathogen acquisition, to directly target the insect vector or to confer pathogen resistance by the insect vector.
32
Assessment of an Organ-Specific de Novo Transcriptome of the Nematode Trap-Crop, Solanum sisymbriifolium. G3-Genes Genomes Genetics 2018; 8:2135-2143. [PMID: 29769290 PMCID: PMC6027862 DOI: 10.1534/g3.118.200327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/05/2022]
Abstract
Solanum sisymbriifolium, also known as “Litchi Tomato” or “Sticky Nightshade,” is an undomesticated and poorly researched plant related to potato and tomato. Unlike the latter species, S. sisymbriifolium induces eggs of the cyst nematode, Globodera pallida, to hatch and migrate into its roots, but then arrests further nematode maturation. In order to provide researchers with a partial blueprint of its genetic make-up so that the mechanism of this response might be identified, we used single molecule real time (SMRT) sequencing to compile a high quality de novo transcriptome of 41,189 unigenes drawn from individually sequenced bud, root, stem, and leaf RNA populations. Functional annotation and BUSCO analysis showed that this transcriptome was surprisingly complete, even though it represented genes expressed at a single time point. By sequencing the 4 organ libraries separately, we found we could get a reliable snapshot of transcript distributions in each organ. A divergent site analysis of the merged transcriptome indicated that this species might have undergone a recent genome duplication and re-diploidization. Further analysis indicated that the plant then retained a disproportionate number of genes associated with photosynthesis and amino acid metabolism in comparison to genes with characteristics of R-proteins or involved in secondary metabolism. The former processes may have given S. sisymbriifolium a bigger competitive advantage than the latter did.
33
Pokorn T, Radišek S, Javornik B, Štajner N, Jakše J. Development of hop transcriptome to support research into host-viroid interactions. PLoS One 2017; 12:e0184528. [PMID: 28886174 PMCID: PMC5590963 DOI: 10.1371/journal.pone.0184528] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 05/21/2017] [Accepted: 08/25/2017] [Indexed: 01/08/2023] Open
Abstract
Viroids, the smallest known pathogens, unable to encode any proteins, can cause severe diseases in their host plants. One of the proposed mechanisms of their pathogenicity includes silencing the host's genes via viroid-derived small RNAs, which are products of the host's immune response to the viroid's double stranded RNA. Humulus lupulus (hop) plants are hosts to several viroids; two of them, HLVd and CBCVd, are interesting models for studying host-viroid interactions, due to the symptomless infection of the former and severe stunting disease caused by the latter. To study these interactions, we constructed a deep hop NGS transcriptome based on 35 Gb paired-end sequencing data assembled into over 74 Mb of contigs. These transcripts were used for in-silico prediction of target transcripts of vd-sRNA of the two aforementioned viroids, using two different software tools. Prediction models revealed that 1062 and 1387 hop transcripts share nucleotide similarities with HLVd- and CBCVd-derived small RNAs, respectively, so they could be silenced in an RNA interference process. Furthermore, we selected 17 transcripts from 4 groups of targets involved in the metabolism of plant hormones, small RNA biogenesis, transcripts with high complementarity with viroid-derived small RNAs and transcripts targeted by CBCVd-derived small RNAs with high cellular concentrations. Their expression was monitored by reverse transcription quantitative PCR performed using leaf, flower and cone samples. Additionally, the expression of 5 pathogenesis related genes was monitored. Expression analysis confirmed high expression levels of four pathogenesis related genes in leaves of HLVd and CBCVd infected hop plants. Expression fluctuations were observed for the majority of targets, with possible evidence of downregulation of GATA transcription factor by CBCVd- and of linoleate 13S-lipoxygenase by HLVd-derived small RNAs. 
These results provide a deep transcriptome of hop and the first insights into complex viroid-hop plant interactions.
Affiliation(s)
- Tine Pokorn
- Agronomy Department, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Sebastjan Radišek
- Department of Plant Protection, Slovenian Institute of Hop Research and Brewing, Žalec, Slovenia
- Branka Javornik
- Agronomy Department, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Nataša Štajner
- Agronomy Department, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Jernej Jakše
- Agronomy Department, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
34
Berghoff BA, Karlsson T, Källman T, Wagner EGH, Grabherr MG. RNA-sequence data normalization through in silico prediction of reference genes: the bacterial response to DNA damage as case study. BioData Min 2017; 10:30. [PMID: 28878825 PMCID: PMC5584328 DOI: 10.1186/s13040-017-0150-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/28/2017] [Accepted: 08/22/2017] [Indexed: 11/17/2022] Open
Abstract
Background: Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples is required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or assume that the effects of up- and down-regulated genes cancel each other out during the normalization.
Results: Here, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/.
Conclusions: The proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data are handled adequately to minimize variations between replicates.
Electronic supplementary material: The online version of this article (10.1186/s13040-017-0150-8) contains supplementary material, which is available to authorized users.
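A quadratic normalization against a set of invariant genes can be sketched as follows. This is only an illustration of the general idea, not the moose2 implementation: the invariant gene indices are assumed to be given (moose2 predicts them with its DP scheme, which is not reproduced here), expression values are assumed to be on a log scale, and the function names `fit_quadratic` and `normalize_sample` are hypothetical. The sketch fits y = a·x² + b·x + c by least squares on the invariant genes and applies the fitted curve to every gene in the sample.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations.

    Needs at least three distinct x values; otherwise the system is singular.
    """
    s = [sum(x ** p for x in xs) for p in range(5)]                  # sums of x^0..x^4
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]  # sums of y*x^p
    # Augmented system M @ [c, b, a] = t, with M[i][j] = s[i + j].
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def normalize_sample(sample, reference, invariant_idx):
    """Map a sample onto the reference scale using only the invariant genes."""
    xs = [sample[i] for i in invariant_idx]
    ys = [reference[i] for i in invariant_idx]
    a, b, c = fit_quadratic(xs, ys)
    return [a * x * x + b * x + c for x in sample]


# Toy data: the reference is an exact quadratic distortion of the sample.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
reference = [0.05 * x * x + 0.8 * x + 0.3 for x in sample]
normalized = normalize_sample(sample, reference, invariant_idx=[0, 2, 3, 5])
print([round(v, 3) for v in normalized])
```

Fitting only on the invariant subset means that genuinely regulated genes do not pull the curve, which is the motivation the abstract gives for predicting invariant genes first.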
Affiliation(s)
- Bork A Berghoff
- Institut für Mikrobiologie und Molekularbiologie, Justus-Liebig-Universität, Giessen, Germany
- Torgny Karlsson
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Thomas Källman
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden; Bioinformatics Infrastructure for Life Sciences (BILS), Science for Life Laboratories, Uppsala University, Uppsala, Sweden
- E Gerhart H Wagner
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Manfred G Grabherr
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden; Bioinformatics Infrastructure for Life Sciences (BILS), Science for Life Laboratories, Uppsala University, Uppsala, Sweden
35
Ballesteros GI, Gadau J, Legeai F, Gonzalez-Gonzalez A, Lavandero B, Simon JC, Figueroa CC. Expression differences in Aphidius ervi (Hymenoptera: Braconidae) females reared on different aphid host species. PeerJ 2017; 5:e3640. [PMID: 28852588 PMCID: PMC5572533 DOI: 10.7717/peerj.3640] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 04/07/2017] [Accepted: 07/12/2017] [Indexed: 01/25/2023] Open
Abstract
The molecular mechanisms that allow generalist parasitoids to exploit many, often very distinct hosts are practically unknown. The wasp Aphidius ervi, a generalist koinobiont parasitoid of aphids, was introduced from Europe into Chile in the late 1970s to control agriculturally important aphid species. A recent study showed significant differences in host preference and host acceptance (infectivity) depending on the host A. ervi were reared on. In contrast, no genetic differentiation between A. ervi populations parasitizing different aphid species and aphids of the same species reared on different host plants was found in Chile. Additionally, the same study did not find any fitness effects in A. ervi if offspring were reared on a different host than their mothers. Here, we determined the effect of aphid host species (Sitobion avenae versus Acyrthosiphon pisum reared on two different host plants, alfalfa and pea) on the transcriptome of adult A. ervi females. We found a large number of differentially expressed genes (between host species: head: 2,765; body: 1,216; within the same aphid host species reared on different host plants: alfalfa versus pea: head 593; body 222). As expected, the transcriptomes from parasitoids reared on the same host species (pea aphid) but originating from different host plants (pea versus alfalfa) were more similar to each other than the transcriptomes of parasitoids reared on a different aphid host and host plant (head: 648 and 1,524 transcripts; body: 566 and 428 transcripts). We found several differentially expressed odorant binding proteins and olfactory receptor proteins, in particular when we compared parasitoids from different host species. Additionally, we found differentially expressed genes involved in neuronal growth and development as well as signaling pathways. These results point towards a significant rewiring of the transcriptome of A. ervi depending on the aphid-plant complex where parasitoids develop, even if different biotypes of a certain aphid host species (A. pisum) are reared on the same host plant. This difference seems to persist even after the different wasp populations were reared on the same aphid host in the laboratory for more than 50 generations. This indicates that either the imprinting process is very persistent or there is enough genetic/allelic variation between A. ervi populations. The role of distinct molecular mechanisms is discussed in terms of the formation of host fidelity.
Collapse
Affiliation(s)
- Gabriel I Ballesteros
- Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile; Millennium Nucleus Centre in Molecular Ecology and Evolutionary Applications in the Agroecosystems, Universidad de Talca, Talca, Chile
- Jürgen Gadau
- School of Life Sciences, Arizona State University, Tempe, AZ, United States of America; Institute for Evolution and Biodiversity, Westfälische Wilhelms-Universität Münster, Münster, Germany
- Fabrice Legeai
- GenScale, INRIA Centre Rennes, Rennes, France; Institute of Genetics, Environment and Plant Protection, INRA, Le Rheu, France
- Angelica Gonzalez-Gonzalez
- Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile; Millennium Nucleus Centre in Molecular Ecology and Evolutionary Applications in the Agroecosystems, Universidad de Talca, Talca, Chile
- Blas Lavandero
- Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile
- Christian C Figueroa
- Instituto de Ciencias Biológicas, Universidad de Talca, Talca, Chile; Millennium Nucleus Centre in Molecular Ecology and Evolutionary Applications in the Agroecosystems, Universidad de Talca, Talca, Chile
|
36
|
Stavrianakou M, Perez R, Wu C, Sachs MS, Aramayo R, Harlow M. Draft de novo transcriptome assembly and proteome characterization of the electric lobe of Tetronarce californica: a molecular tool for the study of cholinergic neurotransmission in the electric organ. BMC Genomics 2017; 18:611. [PMID: 28806931 PMCID: PMC5557070 DOI: 10.1186/s12864-017-3890-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/24/2016] [Accepted: 06/21/2017] [Indexed: 11/10/2022] Open
Abstract
Background: The electric organ of Tetronarce californica (an electric ray formerly known as Torpedo californica) is a classic preparation for biochemical studies of cholinergic neurotransmission. To broaden the usefulness of this preparation, we have performed a transcriptome assembly of the presynaptic component of the electric organ (the electric lobe). We combined our assembled transcriptome with a previous transcriptome of the postsynaptic electric organ to define a MetaProteome containing pre- and post-synaptic components of the electric organ. Results: Sequencing yielded 102 million paired-end 100 bp reads. De novo Trinity assembly was performed at Kmer 25 (default) and Kmers 27, 29, and 31. Trinity generated around 103,000 transcripts and 78,000 genes per assembly. Assemblies were evaluated based on the number of bases/transcripts assembled, RSEM-EVAL scores, and informational content and completeness. We found that different assemblies scored differently according to the evaluation criteria used, and that while each individual assembly contained unique information, much of the assembly information was shared by all assemblies. To generate the presynaptic transcriptome (electric lobe) while capturing all information, assemblies were first clustered and then combined with postsynaptic transcripts (electric organ) downloaded from NCBI. The completeness of the resulting clustered predicted MetaProteome was rigorously evaluated by comparing its information against the predicted proteomes from Homo sapiens and Callorhinchus milii, and against the Transporter Classification Database (TCDB). Conclusions: In summary, we obtained a MetaProteome containing 92%, 88.5%, and 66% of the ultra-conserved sequences (i.e., BUSCOs) expected to be found for Eukaryotes, Metazoa, and Vertebrata, respectively. We cross-annotated the conserved set of proteins shared between the T. californica MetaProteome and the proteomes of H. sapiens and C. milii, using the H. sapiens genome as a reference. This information was used to predict the position in human pathways of the conserved members of the T. californica MetaProteome. We found proteins not detected before in T. californica, corresponding to processes involved in synaptic vesicle biology. Finally, we identified 42 transporter proteins in TCDB that were detected by the T. californica MetaProteome (electric fish) and not selected by a control proteome consisting of the combined proteomes of 12 widely diverse non-electric fishes by Reverse-Blast-Hit Blast. Combined, the information provided here is not only a unique tool for the study of cholinergic neurotransmission, but it is also a starting point for understanding the evolution of early vertebrates. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3890-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Maria Stavrianakou
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Ricardo Perez
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Cheng Wu
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Matthew S Sachs
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Rodolfo Aramayo
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Mark Harlow
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
|
37
|
Liu H, Smith TPL, Nonneman DJ, Dekkers JCM, Tuggle CK. A high-quality annotated transcriptome of swine peripheral blood. BMC Genomics 2017. [PMID: 28646867 PMCID: PMC5483264 DOI: 10.1186/s12864-017-3863-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022] Open
Abstract
Background: High throughput gene expression profiling assays of peripheral blood are widely used in biomedicine, as well as in animal genetics and physiology research. Accurate, comprehensive, and precise interpretation of such high throughput assays relies on well-characterized reference genomes and/or transcriptomes. However, neither the reference genome nor the peripheral blood transcriptome of the pig has been sufficiently assembled and annotated to support such profiling assays in this emerging biomedical model organism. We aimed to assemble published and novel RNA-seq data to provide a comprehensive, well-annotated blood transcriptome for pigs by integrating a de novo assembly with a genome-guided assembly. Results: A de novo and a genome-guided transcriptome of porcine whole peripheral blood were assembled with ~162 million pairs of paired-end and ~183 million single-end, trimmed and normalized Illumina RNA-seq reads (~6 billion initial reads from 146 RNA-seq libraries) from five independent studies by using the Trinity and Cufflinks software, respectively. We then removed putative transcripts (PTs) of low confidence from both assemblies and merged the remaining PTs into an integrated transcriptome consisting of 132,928 PTs, with 126,225 (~95%) PTs from the de novo assembly and more than 91% of PTs spliced. In the integrated transcriptome, ~90% and 63% of PTs had significant sequence similarity to sequences in the NCBI NT and NR databases, respectively; 68,754 (~52%) PTs were annotated with 15,965 unique gene ontology (GO) terms; and 7618 PTs annotated with Enzyme Commission codes were assigned to 134 pathways curated by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Full exon-intron junctions of 17,528 PTs were validated by PacBio IsoSeq full-length cDNA reads from 3 other porcine tissues, NCBI pig RefSeq mRNAs and transcripts from the Ensembl Sscrofa10.2 annotation. Completeness of the 5’ termini of 37,569 PTs was validated by public cap analysis of gene expression (CAGE) data. By comparison to the Ensembl transcripts, we found that (1) the deduced precursors of 54,402 PTs shared at least one intron or exon with those of 18,437 Ensembl transcripts; (2) 12,262 PTs had both longer 5’ and 3’ termini than their maximally overlapping Ensembl transcripts; and (3) 41,838 spliced PTs were totally missing from the Sscrofa10.2 annotation. Similar results were obtained when the PTs were compared to the pig NCBI RefSeq mRNA collection. Conclusions: We built, validated and annotated a comprehensive porcine blood transcriptome with significant improvement over the annotation of Ensembl Sscrofa10.2 and the pig NCBI RefSeq mRNAs, and laid a foundation for blood-based high throughput transcriptomic assays in pigs and for advancing annotation of the pig genome. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3863-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Haibo Liu
- Bioinformatics and Computational Biology Program, Department of Animal Science, Iowa State University, 2258 Kildee Hall, Ames, IA, 50011, USA
- Timothy P L Smith
- USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
- Dan J Nonneman
- USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
- Jack C M Dekkers
- Department of Animal Science, Iowa State University, 239 Kildee Hall, Ames, IA, 50011, USA
- Christopher K Tuggle
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA, 50011, USA
|
38
|
Gonzalez-Ibeas D, Martinez-Garcia PJ, Famula RA, Delfino-Mix A, Stevens KA, Loopstra CA, Langley CH, Neale DB, Wegrzyn JL. Assessing the Gene Content of the Megagenome: Sugar Pine (Pinus lambertiana). G3 (Bethesda, Md.) 2016; 6:3787-3802. [PMID: 27799338 PMCID: PMC5144951 DOI: 10.1534/g3.116.032805] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 06/24/2016] [Accepted: 07/13/2016] [Indexed: 02/06/2023]
Abstract
Sugar pine (Pinus lambertiana Douglas) belongs to the subgenus Strobus and has an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges their megagenomes present for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third-generation long reads generated through PacBio Iso-Seq have been included, for the first time in conifers, to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of second- and third-generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data were also used to address questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers.
Affiliation(s)
- Daniel Gonzalez-Ibeas
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269
- Randi A Famula
- Department of Plant Sciences, University of California, Davis, California 95616
- Annette Delfino-Mix
- United States Department of Agriculture Forest Service, Institute of Forest Genetics, Placerville, California 95667
- Kristian A Stevens
- Department of Evolution and Ecology, University of California, Davis, California 95616
- Carol A Loopstra
- Department of Ecosystem Science and Management, Texas A&M University, College Station, Texas 77843
- Charles H Langley
- Department of Evolution and Ecology, University of California, Davis, California 95616
- David B Neale
- Department of Plant Sciences, University of California, Davis, California 95616
- Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269
|
39
|
Wang JY, Liang YL, Hai MR, Chen JW, Gao ZJ, Hu QQ, Zhang GH, Yang SC. Genome-Wide Transcriptional Excavation of Dipsacus asperoides Unmasked both Cryptic Asperosaponin Biosynthetic Genes and SSR Markers. Frontiers in Plant Science 2016; 7:339. [PMID: 27066018 PMCID: PMC4809893 DOI: 10.3389/fpls.2016.00339] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/08/2015] [Accepted: 03/04/2016] [Indexed: 06/01/2023]
Abstract
BACKGROUND: Dipsacus asperoides is a traditional Chinese medicinal crop. The root is generally used as a medicine and is frequently prescribed by Chinese doctors for the treatment of back pain, limb paralysis, flutter trauma, tendon injuries, and fractures. With the rapid development of bioinformatics, research on this species has focused on the gene or molecular level. To flesh out genomic information about D. asperoides, we conducted a transcriptome analysis of this species. PRINCIPAL FINDINGS: To date, many genes encoding enzymes involved in the biosynthesis of triterpenoid saponins in D. asperoides have not been elucidated. Illumina paired-end sequencing was employed to probe the various D. asperoides enzymes associated with the relevant mesostate. A total of 30,832,805 clean reads and 43,243 de novo assembled unigenes were obtained. Of all unigenes, only 8.27% (3578) were successfully annotated in all seven public databases (Nr, Nt, Swiss-Prot, GO, KOG, KEGG, and Pfam), which might be attributed to the limited prior study of D. asperoides. The candidate genes encoding enzymes involved in triterpenoid saponin biosynthesis were identified and experimentally verified by reverse transcription qPCR, encompassing nine cytochrome P450s and 17 UDP-glucosyltransferases. Specifically, putative genes involved in the glycosylation of hederagenin were unearthed. In addition, 4490 SSRs were identified from the 43,243 examined sequences via bioinformatics analysis. CONCLUSION: This study represents the first report on the use of the Illumina sequencing platform on this crop at the transcriptome level. Our findings of candidate genes encoding enzymes involved in Dipsacus saponin VI biosynthesis provide novel information in efforts to further understand the triterpenoid metabolic pathway in this species. The initial genetic resources in this study will contribute significantly to the genetic breeding program of D. asperoides, and are beneficial for clinical diagnosis and treatment.
Affiliation(s)
- Guang-hui Zhang
- Yunnan Research Center on Good Agricultural Practice for Dominant Chinese Medicinal Materials, Yunnan Agricultural University, Yunnan, China
- Sheng-chao Yang
- Yunnan Research Center on Good Agricultural Practice for Dominant Chinese Medicinal Materials, Yunnan Agricultural University, Yunnan, China
|