51
Bliznina A, Masunaga A, Mansfield MJ, Tan Y, Liu AW, West C, Rustagi T, Chien HC, Kumar S, Pichon J, Plessy C, Luscombe NM. Telomere-to-telomere assembly of the genome of an individual Oikopleura dioica from Okinawa using Nanopore-based sequencing. BMC Genomics 2021; 22:222. [PMID: 33781200 PMCID: PMC8008620 DOI: 10.1186/s12864-021-07512-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/21/2020] [Accepted: 03/05/2021] [Indexed: 11/10/2022]
Abstract
Background: The larvacean Oikopleura dioica is an abundant tunicate plankton with the smallest (65–70 Mbp) non-parasitic, non-extremophile animal genome identified to date. Currently, there are two genomes available for the Bergen (OdB3) and Osaka (OSKA2016) O. dioica laboratory strains. Both assemblies have full genome coverage and high sequence accuracy. However, a chromosome-scale assembly has not yet been achieved.
Results: Here, we present a chromosome-scale genome assembly (OKI2018_I69) of the Okinawan O. dioica produced using long-read Nanopore and short-read Illumina sequencing data from a single male, combined with Hi-C chromosomal conformation capture data for scaffolding. The OKI2018_I69 assembly has a total length of 64.3 Mbp distributed among 19 scaffolds. 99% of the assembly is contained within five megabase-scale scaffolds. We found telomeres on both ends of the two largest scaffolds, which represent assemblies of two fully contiguous autosomal chromosomes. Each of the other three large scaffolds has telomeres at one end only, and we propose that they correspond to sex chromosomes split into a pseudo-autosomal region and X-specific or Y-specific regions. Indeed, these five scaffolds mostly correspond to equivalent linkage groups in OdB3, suggesting overall agreement in chromosomal organization between the two populations. At a more detailed level, the OKI2018_I69 assembly possesses genomic features in gene content and repetitive elements similar to those reported for OdB3. The Hi-C map suggests few reciprocal interactions between chromosome arms. At the sequence level, multiple genomic features such as GC content and repetitive elements are distributed differently along the short and long arms of the same chromosome.
Conclusions: We show that a hybrid approach of integrating multiple sequencing technologies with chromosome conformation information results in an accurate de novo chromosome-scale assembly of O. dioica’s highly polymorphic genome. This genome assembly opens up the possibility of cross-genome comparison between O. dioica populations, as well as of studies of chromosomal evolution in this lineage.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07512-6.
Affiliation(s)
- Aleksandra Bliznina
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Aki Masunaga
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Michael J Mansfield
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Yongkai Tan
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Andrew W Liu
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Charlotte West
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Francis Crick Institute, London, UK
- Tanmay Rustagi
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Hsiao-Chiao Chien
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Saurabh Kumar
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Julien Pichon
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Charles Plessy
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Nicholas M Luscombe
  - Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - Francis Crick Institute, London, UK
  - Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London, UK
52
Behera S, Voshall A, Moriyama EN. Plant Transcriptome Assembly: Review and Benchmarking. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
53
Galise TR, Esposito S, D'Agostino N. Guidelines for Setting Up a mRNA Sequencing Experiment and Best Practices for Bioinformatic Data Analysis. Methods Mol Biol 2021; 2264:137-162. [PMID: 33263908 DOI: 10.1007/978-1-0716-1201-9_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
RNA-sequencing, commonly referred to as RNA-seq, is the most recently developed method for the analysis of transcriptomes. It uses high-throughput next-generation sequencing technologies and has revolutionized our understanding of the complexity and dynamics of whole transcriptomes. In this chapter, we recall the key developments in transcriptome analysis and dissect the steps of the general workflow that users can follow to design and perform an mRNA-seq experiment and to process mRNA-seq data obtained with Illumina technology. The chapter proposes guidelines for completing an mRNA-seq study properly and offers recommendations for best practices based on recent literature and on the latest developments in technology and algorithms. We also note the large number of choices available (especially for bioinformatic data analysis), which can leave the scientist at a loss. In the last part of the chapter, we discuss the new frontiers of single-cell RNA-seq and isoform sequencing by long-read technology.
Affiliation(s)
- Teresa Rosa Galise
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Salvatore Esposito
  - CREA Research Centre for Vegetable and Ornamental Crops, Pontecagnano Faiano, Italy
- Nunzio D'Agostino
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
54
Sharma P, Sharma BS, Verma RJ. A Guide to RNAseq Data Analysis Using Bioinformatics Approaches. Adv Bioinformatics 2021. [DOI: 10.1007/978-981-33-6191-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
55
Mora-Márquez F, Vázquez-Poletti JL, Chano V, Collada C, Soto Á, de Heredia UL. Hardware Performance Evaluation of De novo Transcriptome Assembly Software in Amazon Elastic Compute Cloud. Curr Bioinform 2020. [DOI: 10.2174/1574893615666191219095817] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
Background:
Bioinformatics software for RNA-seq analysis has high computational requirements in terms of the number of CPUs, RAM size, and processor characteristics. Specifically, de novo transcriptome assembly demands large computational infrastructure due to the massive data size and the complexity of the algorithms employed. Comparative studies on the quality of the transcriptomes yielded by de novo assemblers have been published previously; they lack, however, a hardware efficiency-oriented approach to help select the assembly hardware platform in a cost-efficient way.
Objective:
We tested the performance of two popular de novo transcriptome assemblers, Trinity and SOAPdenovo-Trans (SDNT), in terms of cost-efficiency and quality to assess limitations, and provided troubleshooting and guidelines to run transcriptome assemblies efficiently.
Methods:
We built virtual machines with different hardware characteristics (CPU number, RAM size) in the Amazon Elastic Compute Cloud of Amazon Web Services. Using simulated and real data sets, we measured the elapsed time, cost, CPU percentage and output size of small and large data set assemblies.
Results:
For small data sets, SDNT outperformed Trinity by an order of magnitude, significantly reducing the duration and cost of the assembly. For large data sets, Trinity performed better than SDNT. Both assemblers provide good-quality transcriptomes.
Conclusion:
The selection of the optimal transcriptome assembler and the provision of computational resources depend on the combined effect of the size and complexity of RNA-seq experiments.
Affiliation(s)
- Fernando Mora-Márquez
  - GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- José Luis Vázquez-Poletti
  - GI Arquitectura de Sistemas Distribuidos, Dpto. Arquitectura de Computadores y Automatica, Facultad de Informatica, Universidad Complutense de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Víctor Chano
  - GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Carmen Collada
  - GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Álvaro Soto
  - GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Unai López de Heredia
  - GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
56
Alejo-Jacuinde G, González-Morales SI, Oropeza-Aburto A, Simpson J, Herrera-Estrella L. Comparative transcriptome analysis suggests convergent evolution of desiccation tolerance in Selaginella species. BMC Plant Biology 2020; 20:468. [PMID: 33046015 PMCID: PMC7549206 DOI: 10.1186/s12870-020-02638-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/03/2020] [Accepted: 09/04/2020] [Indexed: 05/04/2023]
Abstract
BACKGROUND Desiccation tolerant Selaginella species evolved to survive extreme environmental conditions. Studies to determine the mechanisms involved in the acquisition of desiccation tolerance (DT) have focused on only a few Selaginella species. Due to the large diversity in morphology and the wide range of responses to desiccation within the genus, the understanding of the molecular basis of DT in Selaginella species is still limited. RESULTS Here we present a reference transcriptome for the desiccation tolerant species S. sellowii and the desiccation sensitive species S. denticulata. The analysis also included transcriptome data for the well-studied S. lepidophylla (desiccation tolerant), in order to identify DT mechanisms that are independent of morphological adaptations. We used a comparative approach to discriminate between DT responses and the common water loss response in Selaginella species. Predicted proteomes show strong homology, but most of the desiccation responsive genes differ between species. Despite such differences, functional analysis revealed that tolerant species with different morphologies employ similar mechanisms to survive desiccation. Significant functions involved in DT and shared by both tolerant species included induction of antioxidant systems, amino acid and secondary metabolism, whereas species-specific responses included cell wall modification and carbohydrate metabolism. CONCLUSIONS Reference transcriptomes generated in this work represent a valuable resource to study Selaginella biology and plant evolution in relation to DT. Our results provide evidence of convergent evolution of S. sellowii and S. lepidophylla due to the different gene sets that underwent selection to acquire DT.
Affiliation(s)
- Gerardo Alejo-Jacuinde
  - National Laboratory of Genomics for Biodiversity (Langebio), Unit of Advanced Genomics, CINVESTAV, 36824 Irapuato, Guanajuato, Mexico
  - Department of Genetic Engineering, CINVESTAV, 36824 Irapuato, Guanajuato, Mexico
- Araceli Oropeza-Aburto
  - National Laboratory of Genomics for Biodiversity (Langebio), Unit of Advanced Genomics, CINVESTAV, 36824 Irapuato, Guanajuato, Mexico
- June Simpson
  - Department of Genetic Engineering, CINVESTAV, 36824 Irapuato, Guanajuato, Mexico
- Luis Herrera-Estrella
  - National Laboratory of Genomics for Biodiversity (Langebio), Unit of Advanced Genomics, CINVESTAV, 36824 Irapuato, Guanajuato, Mexico
  - Institute of Genomics for Crop Abiotic Stress Tolerance, Texas Tech University, Lubbock, TX 79409, USA
57
Puglia GD, Prjibelski AD, Vitale D, Bushmanova E, Schmid KJ, Raccuia SA. Hybrid transcriptome sequencing approach improved assembly and gene annotation in Cynara cardunculus (L.). BMC Genomics 2020; 21:317. [PMID: 32819282 PMCID: PMC7441626 DOI: 10.1186/s12864-020-6670-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/31/2019] [Accepted: 03/13/2020] [Indexed: 12/11/2022]
Abstract
Background: The investigation of transcriptome profiles using short reads in non-model organisms, which lack well-annotated genomes, is limited by partial gene reconstruction and isoform detection. In contrast, long-read sequencing techniques have shown their potential to generate complete transcript assemblies even when a reference genome is lacking. Cynara cardunculus var. altilis (DC) (cultivated cardoon) is a perennial hardy crop adapted to dry environments with many industrial and nutraceutical applications due to the richness of secondary metabolites mostly produced in flower heads. The investigation of this species benefited from the recent release of a draft genome, but the transcriptome profile during capitulum formation remains unexplored. In the present study, we present a transcriptome analysis of vegetative and inflorescence organs of cultivated cardoon through a novel hybrid RNA-seq assembly approach utilizing both long and short RNA-seq reads.
Results: The inclusion of a single Nanopore flow-cell output in a hybrid sequencing approach yielded an increase of 15% in completely assembled genes and 18% in transcript isoforms with respect to short reads alone. Among 25,463 assembled unigenes, we identified 578 new genes and updated 13,039 gene models, 11,169 of which were alternatively spliced isoforms. During capitulum development, 3424 genes were differentially expressed and approximately two-thirds were identified as transcription factors including bHLH, MYB, NAC, C2H2 and MADS-box, which were highly expressed especially after capitulum opening. We also show the expression dynamics of key genes involved in the production of valuable secondary metabolites in which the capitulum is rich, such as phenylpropanoids, flavonoids and sesquiterpene lactones. Most of their biosynthetic genes were strongly transcribed in the flower heads, with alternative isoforms exhibiting differential expression levels across the tissues.
Conclusions: This novel hybrid sequencing approach made it possible to improve the transcriptome assembly, to update more than half of the annotated genes and to identify many novel genes and alternatively spliced isoforms. This study provides new insights into the flowering cycle in an Asteraceae plant, a valuable resource for plant biology and breeding in Cynara and an effective method for improving gene annotation.
Affiliation(s)
- Giuseppe D Puglia
  - Institute for Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Fruwirthstrasse 21, 70599, Stuttgart, Germany
  - Consiglio Nazionale delle Ricerche, Istituto per i Sistemi Agricoli e Forestali del Mediterraneo (CNR-ISAFOM) U.O.S. Catania, Via Empedocle, 58, 95128, Catania, Italy
- Andrey D Prjibelski
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Domenico Vitale
  - Consiglio Nazionale delle Ricerche, Istituto per i Sistemi Agricoli e Forestali del Mediterraneo (CNR-ISAFOM) U.O.S. Catania, Via Empedocle, 58, 95128, Catania, Italy
- Elena Bushmanova
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Karl J Schmid
  - Institute for Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Fruwirthstrasse 21, 70599, Stuttgart, Germany
- Salvatore A Raccuia
  - Consiglio Nazionale delle Ricerche, Istituto per i Sistemi Agricoli e Forestali del Mediterraneo (CNR-ISAFOM) U.O.S. Catania, Via Empedocle, 58, 95128, Catania, Italy
58
Nip KM, Chiu R, Yang C, Chu J, Mohamadi H, Warren RL, Birol I. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. Genome Res 2020; 30:1191-1200. [PMID: 32817073 PMCID: PMC7462077 DOI: 10.1101/gr.260174.119] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 12/11/2019] [Accepted: 07/23/2020] [Indexed: 12/27/2022]
Abstract
Despite the rapid advance in single-cell RNA sequencing (scRNA-seq) technologies within the last decade, single-cell transcriptome analysis workflows have primarily used gene expression data while isoform sequence analysis at the single-cell level still remains fairly limited. Detection and discovery of isoforms in single cells is difficult because of the inherent technical shortcomings of scRNA-seq data, and existing transcriptome assembly methods are mainly designed for bulk RNA samples. To address this challenge, we developed RNA-Bloom, an assembly algorithm that leverages the rich information content aggregated from multiple single-cell transcriptomes to reconstruct cell-specific isoforms. Assembly with RNA-Bloom can be either reference-guided or reference-free, thus enabling unbiased discovery of novel isoforms or foreign transcripts. We compared both assembly strategies of RNA-Bloom against five state-of-the-art reference-free and reference-based transcriptome assembly methods. In our benchmarks on a simulated 384-cell data set, reference-free RNA-Bloom reconstructed 37.9%–38.3% more isoforms than the best reference-free assembler, whereas reference-guided RNA-Bloom reconstructed 4.1%–11.6% more isoforms than reference-based assemblers. When applied to a real 3840-cell data set consisting of more than 4 billion reads, RNA-Bloom reconstructed 9.7%–25.0% more isoforms than the best competing reference-based and reference-free approaches evaluated. We expect RNA-Bloom to boost the utility of scRNA-seq data beyond gene expression analysis, expanding what is informatically accessible now.
Affiliation(s)
- Ka Ming Nip
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
- Readman Chiu
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
- Chen Yang
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
- Justin Chu
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
- Hamid Mohamadi
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
- René L Warren
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
- Inanc Birol
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada V6H 3N1
59
Miller CH, Campbell P, Sheehan MJ. Distinct evolutionary trajectories of V1R clades across mouse species. BMC Evol Biol 2020; 20:99. [PMID: 32770934 PMCID: PMC7414754 DOI: 10.1186/s12862-020-01662-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/22/2020] [Accepted: 07/21/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Many animals rely heavily on olfaction to navigate their environment. Among rodents, olfaction is crucial for a wide range of social behaviors. The vomeronasal olfactory system in particular plays an important role in mediating social communication, including the detection of pheromones and recognition signals. In this study we examine patterns of vomeronasal type-1 receptor (V1R) evolution in the house mouse and related species within the genus Mus. We report the extent of gene repertoire turnover and conservation among species and clades, as well as the prevalence of positive selection on gene sequences across the V1R tree. By exploring the evolution of these receptors, we provide insight into the functional roles of receptor subtypes as well as the dynamics of gene family evolution.
RESULTS We generated transcriptomes from the vomeronasal organs of 5 Mus species, and produced high-quality V1R repertoires for each species. We find that V1R clades in the house mouse and relatives exhibit distinct evolutionary trajectories. We identify putative species-specific gene expansions, including a large clade D expansion in the house mouse. While gene gains are abundant, we detect very few gene losses. We describe a novel V1R clade and highlight candidate receptors for future study. We find evidence for distinct evolutionary processes across different clades, from large-scale turnover to highly conserved repertoires. Patterns of positive selection are similarly variable, as some clades exhibit abundant positive selection while others display high gene sequence conservation. Based on clade-level evolutionary patterns, we identify receptor families that are strong candidates for detecting social signals and predator cues. Our results reveal that clades with receptors detecting female reproductive status are among the most conserved across species, suggesting an important role in V1R chemosensation.
CONCLUSION Analysis of clade-level evolution is critical for understanding species' chemosensory adaptations. This study provides clear evidence that V1R clades are characterized by distinct evolutionary trajectories. As receptor evolution is shaped by ligand identity, these results provide a framework for examining the functional roles of receptors.
Affiliation(s)
- Polly Campbell
  - Evolution, Ecology and Organismal Biology, University of California-Riverside, Riverside, USA
60
Prjibelski AD, Puglia GD, Antipov D, Bushmanova E, Giordano D, Mikheenko A, Vitale D, Lapidus A. Extending rnaSPAdes functionality for hybrid transcriptome assembly. BMC Bioinformatics 2020; 21:302. [PMID: 32703149 PMCID: PMC7379828 DOI: 10.1186/s12859-020-03614-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 06/14/2020] [Accepted: 06/18/2020] [Indexed: 11/29/2022]
Abstract
BACKGROUND De novo RNA-Seq assembly is a powerful method for analysing transcriptomes when the reference genome is not available or poorly annotated. However, due to the short length of Illumina reads, it is usually impossible to reconstruct complete sequences of complex genes and alternative isoforms. The recently emerged possibility of generating long RNA reads, with technologies such as PacBio and Oxford Nanopore, may dramatically improve the assembly quality, and thus the subsequent analysis. While reference-based tools for analysing long RNA reads were recently developed, there is no established pipeline for de novo assembly of such data.
RESULTS In this work we present a novel method that makes it possible to perform high-quality de novo transcriptome assemblies by combining the accuracy and reliability of short reads with exon structure information extracted from long error-prone reads. The algorithm is designed by incorporating the existing hybridSPAdes approach into the rnaSPAdes pipeline and adapting it for transcriptomic data.
CONCLUSION To evaluate the benefit of using long RNA reads we selected several datasets containing both Illumina and Iso-seq or Oxford Nanopore Technologies (ONT) reads. Using existing quality assessment software, we show that hybrid assemblies performed with rnaSPAdes contain more full-length genes and alternative isoforms compared to the case when only short-read data is used.
Affiliation(s)
- Andrey D Prjibelski
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Giuseppe D Puglia
  - Consiglio Nazionale delle Ricerche, Istituto per i Sistemi Agricoli e Forestali del Mediterraneo, Catania, Italy
- Dmitry Antipov
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Elena Bushmanova
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Daniela Giordano
  - Department of Electrical, Electronics and Computer Engineering, University of Catania, Catania, Italy
- Alla Mikheenko
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Domenico Vitale
  - Consiglio Nazionale delle Ricerche, Istituto per i Sistemi Agricoli e Forestali del Mediterraneo, Catania, Italy
- Alla Lapidus
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
61
Mikheenko A, Bzikadze AV, Gurevich A, Miga KH, Pevzner PA. TandemTools: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats. Bioinformatics 2020; 36:i75-i83. [PMID: 32657355 PMCID: PMC7355294 DOI: 10.1093/bioinformatics/btaa440] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 12/31/2022]
Abstract
MOTIVATION Extra-long tandem repeats (ETRs) are widespread in eukaryotic genomes and play an important role in fundamental cellular processes, such as chromosome segregation. Although emerging long-read technologies have enabled ETR assemblies, the accuracy of such assemblies is difficult to evaluate since there are no tools for their quality assessment. Moreover, since the mapping of error-prone reads to ETRs remains an open problem, it is not clear how to polish draft ETR assemblies. RESULTS To address these problems, we developed the TandemTools software that includes the TandemMapper tool for mapping reads to ETRs and the TandemQUAST tool for polishing ETR assemblies and their quality assessment. We demonstrate that TandemTools not only reveals errors in ETR assemblies but also improves the recently generated assemblies of human centromeres. AVAILABILITY AND IMPLEMENTATION https://github.com/ablab/TandemTools. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alla Mikheenko
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
- Andrey V Bzikadze
  - Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA 92093, USA
- Alexey Gurevich
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
- Karen H Miga
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Pavel A Pevzner
  - Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA
62
Chen LY, Morales-Briones DF, Passow CN, Yang Y. Performance of gene expression analyses using de novo assembled transcripts in polyploid species. Bioinformatics 2020; 35:4314-4320. [PMID: 31400193 DOI: 10.1093/bioinformatics/btz620] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/29/2018] [Revised: 07/12/2019] [Accepted: 08/09/2019] [Indexed: 12/24/2022]
Abstract
MOTIVATION Quality of gene expression analyses using de novo assembled transcripts in species that experienced recent polyploidization remains unexplored. RESULTS Differential gene expression (DGE) analyses using putative genes inferred by Trinity, Corset and Grouper performed slightly differently across five plant species that experienced various polyploidy histories. In species that lack recent polyploidy events that occurred in the past several millions of years, DGE analyses using de novo assembled transcriptomes identified 54-82% of the differentially expressed genes recovered by mapping reads to the reference genes. However, in species that experienced more recent polyploidy events, the percentage decreased to 21-65%. Gene co-expression network analyses using de novo assemblies versus mapping to the reference genes recovered the same module that significantly correlated with treatment in one species that lacks recent polyploidization. AVAILABILITY AND IMPLEMENTATION Commands and scripts used in this study are available at https://bitbucket.org/lychen83/chen_et_al_2018_benchmark_dge/; Analysis files are available at Dryad doi: 10.5061/dryad.4p6n481. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Ling-Yun Chen
- Department of Plant and Microbial Biology, University of Minnesota, Twin Cities, Saint Paul, MN, USA
| | - Diego F Morales-Briones
- Department of Plant and Microbial Biology, University of Minnesota, Twin Cities, Saint Paul, MN, USA
| | - Courtney N Passow
- Department of Ecology Evolution and Behavior, University of Minnesota, Twin Cities, Saint Paul, MN, USA.,University of Minnesota Genomics Center, University of Minnesota, Twin Cities, Saint Paul, MN, USA
| | - Ya Yang
- Department of Plant and Microbial Biology, University of Minnesota, Twin Cities, Saint Paul, MN, USA
|
63
|
Malovichko YV, Shtark OY, Vasileva EN, Nizhnikov AA, Antonets KS. Transcriptomic Insights into Mechanisms of Early Seed Maturation in the Garden Pea (Pisum sativum L.). Cells 2020; 9:E779. [PMID: 32210065 PMCID: PMC7140803 DOI: 10.3390/cells9030779] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 12/31/2019] [Revised: 03/20/2020] [Accepted: 03/21/2020] [Indexed: 02/07/2023] Open
Abstract
The garden pea (Pisum sativum L.) is a legume crop of immense economic value. Extensive breeding has led to the emergence of numerous pea varieties, some of which are distinguished by accelerated development at various stages of ontogenesis. One such trait is rapid seed maturation, which, despite novel insights into the genetic control of seed development in legumes, remains poorly studied. This article presents an attempt to dissect the mechanisms of early maturation in the pea line Sprint-2 by means of whole transcriptome RNA sequencing at two developmental stages. Using a de novo assembly approach, we obtained a reference transcriptome of 25,756 non-redundant entries expressed in pea seeds at either 10 or 20 days after pollination. Differential expression in Sprint-2 seeds affected 13,056 transcripts. A comparison with the two pea lines with a common maturation rate demonstrates that at 10 days after pollination Sprint-2 seeds show developmental retardation linked to intensive photosynthesis, morphogenesis, and cell division, whereas at 20 days they show a rapid onset of desiccation marked by the cessation of translation and cell anabolism and the accumulation of dehydration-protective and storage moieties. Further inspection of certain transcript functional categories, including chromatin constituents, transcription regulation, protein turnover, and hormonal regulation, revealed transcriptomic trends unique to specific stages and cultivars. Among other remarkable features, Sprint-2 demonstrated enhanced expression of transposable element-associated open reading frames and altered expression of major maturation regulators and DNA methyltransferase genes. To the best of our knowledge, this is the first comparative transcriptomic study in which the issue of the seed maturation rate is addressed.
Affiliation(s)
- Yury V. Malovichko
- Laboratory for Proteomics of Supra-Organismal Systems, All-Russia Research Institute for Agricultural Microbiology (ARRIAM), Podbelskogo sh., 3, Pushkin, 196608 St. Petersburg, Russia;
- Faculty of Biology, St. Petersburg State University, 199034 St. Petersburg, Russia;
| | - Oksana Y. Shtark
- Department of Biotechnology, All-Russia Research Institute for Agricultural Microbiology (ARRIAM), Podbelskogo sh., 3, Pushkin, 196608 St. Petersburg, Russia;
| | - Ekaterina N. Vasileva
- Faculty of Biology, St. Petersburg State University, 199034 St. Petersburg, Russia;
- Department of Biotechnology, All-Russia Research Institute for Agricultural Microbiology (ARRIAM), Podbelskogo sh., 3, Pushkin, 196608 St. Petersburg, Russia;
| | - Anton A. Nizhnikov
- Laboratory for Proteomics of Supra-Organismal Systems, All-Russia Research Institute for Agricultural Microbiology (ARRIAM), Podbelskogo sh., 3, Pushkin, 196608 St. Petersburg, Russia;
- Faculty of Biology, St. Petersburg State University, 199034 St. Petersburg, Russia;
| | - Kirill S. Antonets
- Laboratory for Proteomics of Supra-Organismal Systems, All-Russia Research Institute for Agricultural Microbiology (ARRIAM), Podbelskogo sh., 3, Pushkin, 196608 St. Petersburg, Russia;
- Faculty of Biology, St. Petersburg State University, 199034 St. Petersburg, Russia;
|
64
|
Bushmanova E, Antipov D, Lapidus A, Prjibelski AD. rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data. Gigascience 2020; 8:5559527. [PMID: 31494669 PMCID: PMC6736328 DOI: 10.1093/gigascience/giz100] [Citation(s) in RCA: 407] [Impact Index Per Article: 81.4] [Received: 11/16/2018] [Revised: 04/20/2019] [Accepted: 08/01/2019] [Indexed: 12/18/2022] Open
Abstract
Background: The possibility of generating large RNA-sequencing datasets has led to the development of various reference-based and de novo transcriptome assemblers, each with its own strengths and limitations. While reference-based tools are widely used in transcriptomic studies, their application is limited to organisms with finished and well-annotated genomes. De novo transcriptome reconstruction from short reads remains an open and challenging problem, which is complicated by varying expression levels across genes, alternative splicing, and paralogous genes. Results: Herein we describe the novel transcriptome assembler rnaSPAdes, which has been developed on top of the SPAdes genome assembler and explores computational parallels between the assembly of transcriptomes and single-cell genomes. We also present quality assessment reports for rnaSPAdes assemblies, compare it with modern transcriptome assembly tools using several evaluation approaches on various RNA-sequencing datasets, and briefly highlight strong and weak points of different assemblers. Conclusions: Based on the comparison between different assembly methods, we infer that it is not possible to identify an absolute leader according to all quality metrics and all datasets used. However, rnaSPAdes typically outperforms other assemblers on such important properties as the number of assembled genes and isoforms, and at the same time has higher accuracy statistics on average compared to the closest competitors.
Affiliation(s)
- Elena Bushmanova
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, 199004, 6 linia V.O. 11d, Russia
| | - Dmitry Antipov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, 199004, 6 linia V.O. 11d, Russia
| | - Alla Lapidus
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, 199004, 6 linia V.O. 11d, Russia
| | - Andrey D Prjibelski
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, 199004, 6 linia V.O. 11d, Russia
|
65
|
MicroRNAs regulate innate immunity against uropathogenic and commensal-like Escherichia coli infections in the surrogate insect model Galleria mellonella. Sci Rep 2020; 10:2570. [PMID: 32054914 PMCID: PMC7018962 DOI: 10.1038/s41598-020-59407-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/14/2019] [Accepted: 01/15/2020] [Indexed: 12/23/2022] Open
Abstract
Uropathogenic Escherichia coli (UPEC) strains cause symptomatic urinary tract infections in humans whereas commensal-like E. coli strains in the urinary bladder cause long-term asymptomatic bacteriuria (ABU). We previously reported that UPEC and ABU strains differentially regulate key DNA methylation and histone acetylation components in the surrogate insect host Galleria mellonella to epigenetically modulate innate immunity-related gene expression, which in turn controls bacterial growth. In this follow-up study, we infected G. mellonella larvae with UPEC strain CFT073 or ABU strain 83972 to identify differences in the expression of microRNAs (miRNAs), a class of non-coding RNAs that regulate gene expression at the post-transcriptional level. Our small RNA sequencing analysis showed that UPEC and ABU infections caused significant changes in the abundance of miRNAs in the larvae, and highlighted the differential expression of 147 conserved miRNAs and 95 novel miRNA candidates. We annotated the G. mellonella genome sequence to investigate the miRNA-regulated expression of genes encoding antimicrobial peptides, signaling proteins, and enzymatic regulators of DNA methylation and histone acetylation in infected larvae. Our results indicate that miRNAs play a role in the epigenetic reprograming of innate immunity in G. mellonella larvae to distinguish between pathogenic and commensal strains of E. coli.
|
66
|
Tung LH, Shao M, Kingsford C. Quantifying the benefit offered by transcript assembly with Scallop-LR on single-molecule long reads. Genome Biol 2019; 20:287. [PMID: 31849338 PMCID: PMC6918626 DOI: 10.1186/s13059-019-1883-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/31/2019] [Accepted: 11/06/2019] [Indexed: 12/19/2022] Open
Abstract
Single-molecule long-read sequencing has been used to improve mRNA isoform identification. However, not all single-molecule long reads represent full transcripts due to incomplete cDNA synthesis and sequencing length limits. This drives a need for long-read transcript assembly. By adding long-read-specific optimizations to Scallop, we developed Scallop-LR, a reference-based long-read transcript assembler. Analyzing 26 PacBio samples, we quantified the benefit of performing transcript assembly on long reads. We demonstrate Scallop-LR identifies more known transcripts and potentially novel isoforms for the human transcriptome than Iso-Seq Analysis and StringTie, indicating that long-read transcript assembly by Scallop-LR can reveal a more complete human transcriptome.
Affiliation(s)
- Laura H Tung
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, 15213, PA, USA
- Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, 15213, PA, USA
| | - Mingfu Shao
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16802, PA, USA
| | - Carl Kingsford
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, 15213, PA, USA.
|
67
|
Comparative Analysis of Strategies for De Novo Transcriptome Assembly in Prokaryotes: Streptomyces clavuligerus as a Case Study. High Throughput 2019; 8:ht8040020. [PMID: 31801255 PMCID: PMC6970227 DOI: 10.3390/ht8040020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/31/2019] [Revised: 11/20/2019] [Accepted: 11/23/2019] [Indexed: 12/15/2022] Open
Abstract
The performance of software tools for de novo transcriptome assembly greatly depends on the selection of software parameters. Up to now, the development of de novo transcriptome assembly for prokaryotes has not been as remarkable as that for eukaryotes. In this contribution, Rockhopper2 was used to perform a comparative transcriptome analysis of Streptomyces clavuligerus exposed to diverse environmental conditions. The study focused on assessing the influence of software parameters on software performance, with the identification of differentially expressed genes as the final goal. For this, a statistical optimization was performed using the Transrate Assembly Score (TAS). TAS was also used for evaluating the software performance and for comparing it with related tools, e.g., Trinity. Transcriptome redundancy and completeness were also considered in this analysis. Rockhopper2 and Trinity reached TAS values of 0.55092 and 0.58337, respectively. Trinity assembles transcriptomes with high redundancy, with 55.6% of transcripts having some duplicates. Additionally, we observed that the total number of differentially expressed genes (DEG) and their annotation greatly depend on the method used for removing redundancy and the tools used for transcript quantification. To our knowledge, this is the first work aimed at assessing de novo assembly software for prokaryotic organisms.
|
68
|
Utilization of Tissue Ploidy Level Variation in de Novo Transcriptome Assembly of Pinus sylvestris. G3-GENES GENOMES GENETICS 2019; 9:3409-3421. [PMID: 31427456 PMCID: PMC6778806 DOI: 10.1534/g3.119.400357] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
Abstract
Compared to angiosperms, gymnosperms lag behind in the availability of assembled and annotated genomes. Most genomic analyses in gymnosperms, especially conifer tree species, rely on the use of de novo assembled transcriptomes. However, the level of allelic redundancy and transcript fragmentation in these assembled transcriptomes, and their effect on downstream applications, have not been fully investigated. Here, we assessed three assembly strategies for short-read data, including the utility of haploid megagametophyte tissue during de novo assembly as single-allele guides, for six individuals and five different tissues in Pinus sylvestris. We then contrasted haploid and diploid tissue genotype calls obtained from the assembled transcriptomes to evaluate the extent of paralog mapping. The use of the haploid tissue during assembly increased its completeness without reducing the number of assembled transcripts. Our results suggest that current strategies that rely on available genomic resources as guidance to minimize allelic redundancy are less effective than strategies that cluster redundant assembled transcripts. The strategy yielding the lowest levels of allelic redundancy among the assembled transcriptomes assessed here was the generation of SuperTranscripts with Lace followed by CD-HIT clustering. However, we still observed some levels of heterozygosity (multiple gene fragments per transcript reflecting allelic redundancy) in this assembled transcriptome on the haploid tissue, indicating that further filtering is required before using these assemblies for downstream applications. We discuss the influence of allelic redundancy when these reference transcriptomes are used to select regions for probe design of exome capture baits and for estimation of population genetic diversity.
|
69
|
Bendele KG, Guerrero FD, Cameron C, Bodine DM, Miller RJ. Gene expression during the early stages of host perception and attachment in adult female Rhipicephalus microplus ticks. EXPERIMENTAL & APPLIED ACAROLOGY 2019; 79:107-124. [PMID: 31552563 DOI: 10.1007/s10493-019-00420-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2019] [Accepted: 09/13/2019] [Indexed: 06/10/2023]
Abstract
The cattle tick, Rhipicephalus microplus, is a serious pest of cattle, with significant economic consequences to the livestock industries of tropical and semitropical countries. Rhipicephalus microplus belongs to the Metastriata group of the Ixodidae family known as hard ticks. When adult hard ticks feed, mating has not yet occurred and an initial host attachment phase of 1-2 days is followed by a slow feeding phase that can last several days. Once mating occurs, feeding concludes with a rapid engorgement phase that is completed in 12-36 h. Our group's interest in mining the genome and transcriptome of R. microplus for novel targets for development of tick control technologies led us to investigate the early transcriptional events occurring upon tick attachment and subsequent feeding. We placed newly molted unfed adult R. microplus females upon a bovine host and harvested the attached ticks after 3, 6, 12, and 24 h. We also placed a group of these ticks in a gas-permeable tube taped onto the side of the bovine host. These ticks were able to sense the host but unable to penetrate the tube to begin attachment and were ultimately harvested after 3 h. This study produced a comprehensive transcriptome from newly molted adult ticks and will provide a useful resource for studies of tick feeding and host perception and also assist genome annotation refinements.
Affiliation(s)
- Kylie G Bendele
- Agricultural Research Service Knipling-Bushland U. S. Livestock Insects Research Laboratory, United States Department of Agriculture, 2700 Fredericksburg Rd, Kerrville, TX, 78028, USA.
| | - Felix D Guerrero
- Agricultural Research Service Knipling-Bushland U. S. Livestock Insects Research Laboratory, United States Department of Agriculture, 2700 Fredericksburg Rd, Kerrville, TX, 78028, USA
| | - Connor Cameron
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
| | - Deanna M Bodine
- Agricultural Research Service Knipling-Bushland U. S. Livestock Insects Research Laboratory, United States Department of Agriculture, 2700 Fredericksburg Rd, Kerrville, TX, 78028, USA
| | - Robert J Miller
- Agricultural Research Service, Cattle Fever Tick Research Laboratory, United States Department of Agriculture, 22675 N. Moorefield Rd, Edinburg, TX, 78541, USA
|
70
|
Van de Weyer AL, Monteiro F, Furzer OJ, Nishimura MT, Cevik V, Witek K, Jones JDG, Dangl JL, Weigel D, Bemm F. A Species-Wide Inventory of NLR Genes and Alleles in Arabidopsis thaliana. Cell 2019; 178:1260-1272.e14. [PMID: 31442410 PMCID: PMC6709784 DOI: 10.1016/j.cell.2019.07.038] [Citation(s) in RCA: 220] [Impact Index Per Article: 36.7] [Received: 03/05/2019] [Revised: 06/13/2019] [Accepted: 07/19/2019] [Indexed: 12/18/2022]
Abstract
Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with approximately 40 well-chosen wild strains, with half of the pan-NLRome being present in most accessions. We chart NLR architectural diversity, identify new architectures, and quantify selective forces that act on specific NLRs and NLR domains. Our study provides a blueprint for defining pan-NLRomes.
Affiliation(s)
- Anna-Lena Van de Weyer
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Freddy Monteiro
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA; Department of Biology, University of North Carolina, Chapel Hill, NC 27599-3280, USA; Center for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, 08193 Barcelona, Spain
| | - Oliver J Furzer
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA; Department of Biology, University of North Carolina, Chapel Hill, NC 27599-3280, USA; The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK
| | - Marc T Nishimura
- Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
| | - Volkan Cevik
- The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK; Milner Centre for Evolution & Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
| | - Kamil Witek
- The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK
| | - Jonathan D G Jones
- The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK.
| | - Jeffery L Dangl
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA.
| | - Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany.
| | - Felix Bemm
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
|
71
|
Ojeda DI, Koenen E, Cervantes S, de la Estrella M, Banguera-Hinestroza E, Janssens SB, Migliore J, Demenou BB, Bruneau A, Forest F, Hardy OJ. Phylogenomic analyses reveal an exceptionally high number of evolutionary shifts in a florally diverse clade of African legumes. Mol Phylogenet Evol 2019; 137:156-167. [DOI: 10.1016/j.ympev.2019.05.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/03/2018] [Revised: 04/26/2019] [Accepted: 05/02/2019] [Indexed: 11/15/2022]
|
72
|
Drukewitz SH, von Reumont BM. The Significance of Comparative Genomics in Modern Evolutionary Venomics. Front Ecol Evol 2019. [DOI: 10.3389/fevo.2019.00163] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/21/2022] Open
|
73
|
Hölzer M, Marz M. De novo transcriptome assembly: A comprehensive cross-species comparison of short-read RNA-Seq assemblers. Gigascience 2019; 8:giz039. [PMID: 31077315 PMCID: PMC6511074 DOI: 10.1093/gigascience/giz039] [Citation(s) in RCA: 114] [Impact Index Per Article: 19.0] [Received: 08/14/2018] [Revised: 12/21/2018] [Accepted: 03/09/2019] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: In recent years, massively parallel complementary DNA sequencing (RNA sequencing [RNA-Seq]) has emerged as a fast, cost-effective, and robust technology to study entire transcriptomes in various manners. In particular, for non-model organisms and in the absence of an appropriate reference genome, RNA-Seq is used to reconstruct the transcriptome de novo. Although the de novo transcriptome assembly of non-model organisms has been on the rise recently and new tools are frequently being developed, there is still a knowledge gap about which assembly software should be used to build a comprehensive de novo assembly. RESULTS: Here, we present a large-scale comparative study in which 10 de novo assembly tools are applied to 9 RNA-Seq data sets spanning different kingdoms of life. Overall, we built >200 single assemblies and evaluated their performance on a combination of 20 biology-based and reference-free metrics. Our study is accompanied by a comprehensive and extensible Electronic Supplement that summarizes all data sets, assembly execution instructions, and evaluation results. Trinity, SPAdes, and Trans-ABySS, followed by Bridger and SOAPdenovo-Trans, generally outperformed the other tools compared. Moreover, we observed species-specific differences in the performance of each assembler. No tool delivered the best results for all data sets. CONCLUSIONS: We recommend a careful choice and normalization of evaluation metrics to select the best assembling results as a critical step in the reconstruction of a comprehensive de novo transcriptome assembly.
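The recommendation to normalize evaluation metrics before comparing assemblies can be illustrated with a minimal sketch. It assumes min-max scaling and an unweighted sum, which is one common choice and not necessarily the scheme used in the study; the assembler names, metric names, and values below are invented for illustration only.

```python
def min_max_normalize(values, higher_is_better=True):
    """Scale values to [0, 1]; flip the scale when lower raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All assemblies tie on this metric, so give each the full score.
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_assemblies(metrics, directions):
    """Sum normalized scores per assembly and sort from best to worst."""
    names = list(metrics)
    totals = {name: 0.0 for name in names}
    for metric, higher_is_better in directions.items():
        column = [metrics[name][metric] for name in names]
        for name, score in zip(names, min_max_normalize(column, higher_is_better)):
            totals[name] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values, not taken from any publication.
metrics = {
    "asm_A": {"complete_orthologs_pct": 90.0, "chimeric_pct": 5.0},
    "asm_B": {"complete_orthologs_pct": 80.0, "chimeric_pct": 1.0},
    "asm_C": {"complete_orthologs_pct": 70.0, "chimeric_pct": 3.0},
}
# True means a higher raw value is better; False means lower is better.
directions = {"complete_orthologs_pct": True, "chimeric_pct": False}

ranking = rank_assemblies(metrics, directions)
for name, total in ranking:
    print(f"{name}\t{total:.2f}")
```

Without normalization, a metric with a large numeric range would dominate the sum, so the ranking would depend on units rather than on relative performance.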
Affiliation(s)
- Martin Hölzer
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University, Leutragraben 1, 07743 Jena, Germany
- European Virus Bioinformatics Center, Friedrich Schiller University, Leutragraben 1, 07743 Jena, Germany
| | - Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University, Leutragraben 1, 07743 Jena, Germany
- European Virus Bioinformatics Center, Friedrich Schiller University, Leutragraben 1, 07743 Jena, Germany
- FLI Leibniz Institute for Age Research, Beutenbergstraße 11, 07743 Jena, Germany
|
74
|
Rivera-García L, Rivera-Vicéns RE, Veglia AJ, Schizas NV. De novo transcriptome assembly of the digitate morphotype of Briareum asbestinum (Octocorallia: Alcyonacea) from the southwest shelf of Puerto Rico. Mar Genomics 2019; 47:100676. [PMID: 31005610 DOI: 10.1016/j.margen.2019.04.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/18/2018] [Revised: 04/04/2019] [Accepted: 04/04/2019] [Indexed: 11/19/2022]
Abstract
Octocorals have now become the most visually dominant metazoan benthic taxa of most Caribbean reefs, following the precipitous decline of scleractinian corals. Yet taxonomic issues still abound because of their extensive phenotypic plasticity. Briareum asbestinum, one of the iconic octocorals of the shallow Caribbean coral reefs, exhibits a biform morphology: a digitate form and an encrusting form. The taxonomic status of each form has not yet been clarified. Until recently, there were few genetic resources for non-model metazoans; however, affordable high-throughput DNA sequencing has removed this hindrance. We present the first transcriptome of the digitate form of Briareum asbestinum from southwest Puerto Rico. We used paired-end sequencing (Illumina NextSeq 500), with a total yield of 159,754,702 raw reads. De novo assembly was performed utilizing a multi-assembler approach generating 371,554 biologically true, non-redundant transcripts. Open reading frame analysis identified 102,839 putative ORFs, of which 78,607 were annotated. BUSCO analysis indicated a total of 96.4% complete orthologous genes from the metazoan dataset. The assembly presented here serves as an important new genomic reference for the Briareum genus that will facilitate future population and phylogenetic studies aiming to better understand the molecular basis of phenotypic plasticity exhibited throughout the genus.
Affiliation(s)
- Liajay Rivera-García
- Department of Marine Sciences, University of Puerto Rico at Mayagüez, PO Box 9000, Mayagüez, PR 00681, USA
| | - Ramón E Rivera-Vicéns
- Department of Marine Sciences, University of Puerto Rico at Mayagüez, PO Box 9000, Mayagüez, PR 00681, USA; Department of Earth and Environmental Sciences, Paleontology and Geobiology, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Alex J Veglia
- Department of Marine Sciences, University of Puerto Rico at Mayagüez, PO Box 9000, Mayagüez, PR 00681, USA
| | - Nikolaos V Schizas
- Department of Marine Sciences, University of Puerto Rico at Mayagüez, PO Box 9000, Mayagüez, PR 00681, USA.
|
75
|
The assembled transcriptome of the adult horn fly, Haematobia irritans. Data Brief 2018; 19:1933-1940. [PMID: 30229068 PMCID: PMC6141423 DOI: 10.1016/j.dib.2018.06.095] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/05/2018] [Accepted: 06/27/2018] [Indexed: 11/20/2022] Open
Abstract
The horn fly, Haematobia irritans irritans (Linnaeus, 1758; Diptera: Muscidae), a hematophagous external parasite of cattle, causes considerable economic losses to the livestock industry worldwide. This pest is mainly controlled with insecticides; however, horn fly populations from several countries have developed resistance to many of the products available for their control. In an attempt to better understand the adult horn fly and the development of resistance in natural populations, we used an Illumina paired-end read HiSeq and GAII approach to determine the transcriptomes of untreated control adult females, untreated control adult males, permethrin-treated surviving adult males and permethrin + piperonyl butoxide-treated killed adult males from a Louisiana population of horn flies with a moderate level of pyrethroid resistance. A total of 128,769,829, 127,276,458, 67,653,920, and 64,270,124 quality-filtered Illumina reads were obtained for untreated control adult females, untreated control adult males, permethrin-treated surviving adult males and permethrin + piperonyl butoxide-treated killed adult males, respectively. The de novo assemblies using CLC Genomics Workbench 8.0.1 yielded 15,699, 11,961, 2672, and 7278 contigs (≥ 200 nt) for untreated control adult females, untreated control adult males, permethrin-treated surviving adult males and permethrin + piperonyl butoxide-treated killed adult males, respectively. More than 56% of the assembled contigs of each data set had significant BlastX hits against the UniProtKB/Swiss-Prot database (E < 0.001). The numbers of contigs in each data set with InterProScan, GO mapping, Enzyme code and KEGG pathway annotations were: untreated control adult females – 10,331, 8770, 2963, 2183; untreated control adult males – 8392, 7056, 2449, 1765; permethrin-treated surviving adult males – 1992, 1609, 641, 495; permethrin + PBO-treated killed adult males – 5561, 4463, 1628, 1211.
|
76
|
Khan H, Mohamadi H, Vandervalk BP, Warren RL, Chu J, Birol I. ChopStitch: exon annotation and splice graph construction using transcriptome assembly and whole genome sequencing data. Bioinformatics 2018; 34:1697-1704. [PMID: 29300846 PMCID: PMC5946899 DOI: 10.1093/bioinformatics/btx839] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/25/2017] [Accepted: 12/28/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation: Sequencing studies on non-model organisms often interrogate both genomes and transcriptomes with massive amounts of short sequences. Such studies require de novo analysis tools and techniques when the species and closely related species lack high-quality reference resources. For certain applications such as de novo annotation, information on putative exons and alternative splicing may be desirable. Results: Here we present ChopStitch, a new method for finding putative exons de novo and constructing splice graphs using an assembled transcriptome and whole genome shotgun sequencing (WGSS) data. ChopStitch identifies exon-exon boundaries in de novo assembled RNA-Seq data with the help of a Bloom filter that represents the k-mer spectrum of WGSS reads. The algorithm also accounts for base substitutions in transcript sequences that may be derived from sequencing or assembly errors, haplotype variations, or putative RNA editing events. The primary output of our tool is a FASTA file containing putative exons. Further, exon edges are interrogated for alternative exon-exon boundaries to detect transcript isoforms, which are represented as splice graphs in DOT output format. Availability and implementation: ChopStitch is written in Python and C++ and is released under the GPL license. It is freely available at https://github.com/bcgsc/ChopStitch. Supplementary information: Supplementary data are available at Bioinformatics online.
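The idea of locating exon-exon boundaries with a Bloom filter over genomic k-mers can be sketched as follows. This is an illustrative toy under stated assumptions, not the ChopStitch implementation: the `BloomFilter` class, the `putative_boundaries` helper, the sequences, and the rule "a run of exactly k-1 absent k-mers marks a junction" are all hypothetical simplifications, and the handling of base substitutions mentioned in the abstract is left out.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizes are illustrative and far too small for real data."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(item))


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def putative_boundaries(transcript, genomic_kmers, k):
    """Return indices of the first base after each putative exon-exon junction.

    The k-1 k-mers that span a junction join two exons that are not adjacent
    in the genome, so they should be missing from the genomic k-mer set.
    """
    present = [kmer in genomic_kmers for kmer in kmers(transcript, k)]
    boundaries = []
    i = 0
    while i < len(present):
        if present[i]:
            i += 1
            continue
        start = i
        while i < len(present) and not present[i]:
            i += 1
        flanked = start > 0 and i < len(present)
        if flanked and i - start == k - 1:
            boundaries.append(i)
    return boundaries


# Toy data: two 15-base exons separated by an intron in the "genome".
K = 5
exon1, intron, exon2 = "ATGGCGTACGTTAGC", "GTAAGTCCCCCCTTTTTTCAG", "CATTACAGGCTCTGA"
genome_filter = BloomFilter()
for kmer in kmers(exon1 + intron + exon2, K):  # stands in for WGSS reads
    genome_filter.add(kmer)

boundaries = putative_boundaries(exon1 + exon2, genome_filter, K)
print(boundaries)
```

A Bloom filter can return false positives but never false negatives, so a junction-spanning k-mer may occasionally be reported as present; a real tool has to tolerate such shortened runs, which this sketch does not attempt.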
Affiliation(s)
- Hamza Khan
  - Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Hamid Mohamadi
  - Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Benjamin P Vandervalk
  - Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Rene L Warren
  - Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Justin Chu
  - Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Inanc Birol
  - Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada
|
77
|
Mora-Márquez F, Vázquez-Poletti JL, López de Heredia U. NGScloud: RNA-seq analysis of non-model species using cloud computing. Bioinformatics 2018; 34:3405-3407. [DOI: 10.1093/bioinformatics/bty363] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/21/2017] [Accepted: 05/02/2018] [Indexed: 01/13/2023] Open
Affiliation(s)
- Fernando Mora-Márquez
  - GI Genética, Fisiología e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Spain
- José Luis Vázquez-Poletti
  - GI Arquitectura de Sistemas Distribuidos, Dpto. Arquitectura de Computadores y Automática, Facultad de Informática, Universidad Complutense de Madrid, Ciudad Universitaria, Madrid, Spain
- Unai López de Heredia
  - GI Genética, Fisiología e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Spain
|
78
|
Esteve-Codina A. RNA-Seq Data Analysis, Applications and Challenges. COMPREHENSIVE ANALYTICAL CHEMISTRY 2018. [DOI: 10.1016/bs.coac.2018.06.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 02/08/2023]
|
79
|
Shlemov A, Bankevich S, Bzikadze A, Turchaninova MA, Safonova Y, Pevzner PA. Reconstructing Antibody Repertoires from Error-Prone Immunosequencing Reads. THE JOURNAL OF IMMUNOLOGY 2017; 199:3369-3380. [PMID: 28978691 DOI: 10.4049/jimmunol.1700485] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 04/27/2017] [Accepted: 08/24/2017] [Indexed: 12/16/2022]
Abstract
Transforming error-prone immunosequencing datasets into Ab repertoires is a fundamental problem in immunogenomics, and a prerequisite for studies of immune responses. Although various repertoire reconstruction algorithms were released in the last 3 y, it remains unclear how to benchmark them and how to assess the accuracy of the reconstructed repertoires. We describe an accurate IgReC algorithm for constructing Ab repertoires from high-throughput immunosequencing datasets and a new framework for assessing the quality of reconstructed repertoires. Surprisingly, Ab repertoires constructed by IgReC from barcoded immunosequencing datasets in the blind mode (without using information about unique molecular identifiers) improved upon the repertoires constructed by the state-of-the-art tools that use barcoding. This finding suggests that IgReC may alleviate the need to generate repertoires using the barcoding technology (the workhorse of current immunogenomics efforts) because our computational approach to error correction of immunosequencing data is nearly as powerful as the experimental approach based on barcoding.
Affiliation(s)
- Alexander Shlemov
  - Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg University, St. Petersburg, Russia 199034
- Sergey Bankevich
  - Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg University, St. Petersburg, Russia 199034
- Andrey Bzikadze
  - Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg University, St. Petersburg, Russia 199034
- Maria A Turchaninova
  - Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia 117997
- Yana Safonova
  - Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg University, St. Petersburg, Russia 199034
  - Information Theory and Applications Center, University of California, San Diego, La Jolla, CA 92093
- Pavel A Pevzner
  - Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg University, St. Petersburg, Russia 199034
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093
|
80
|
Challenges and advances for transcriptome assembly in non-model species. PLoS One 2017; 12:e0185020. [PMID: 28931057 PMCID: PMC5607178 DOI: 10.1371/journal.pone.0185020] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 01/30/2017] [Accepted: 09/04/2017] [Indexed: 12/28/2022] Open
Abstract
Analyses of high-throughput transcriptome sequences of non-model organisms are based on two main approaches: de novo assembly and genome-guided assembly using mapping to assign reads prior to assembly. Given the limits of mapping reads to a reference when it is highly divergent, as is frequently the case for non-model species, we evaluate whether using blastn would outperform mapping methods for read assignment in such situations (>15% divergence). We demonstrate its high performance by using simulated reads of lengths corresponding to those generated by the most common sequencing platforms, and over a realistic range of genetic divergence (0% to 30% divergence). Here we focus on gene identification and not on resolving the whole set of transcripts (i.e. the complete transcriptome). For simulated datasets, the transcriptome-guided assembly based on blastn recovers 94.8% of genes irrespective of read length at 0% divergence; however, assignment rate of reads is negatively correlated with both increasing divergence level and reducing read lengths. Nevertheless, we still observe 92.6% of recovered genes at 30% divergence irrespective of read length. This analysis also produces a categorization of genes relative to their assignment, and suggests guidelines for data processing prior to analyses of comparative transcriptomics and gene expression to minimize potential inferential bias associated with incorrect transcript assignment. We also compare the performances of de novo assembly alone vs in combination with a transcriptome-guided assembly based on blastn both via simulation and empirically, using data from a cyprinid fish species and from an oak species. For any simulated scenario, the transcriptome-guided assembly using blastn outperforms the de novo approach alone, including when the divergence level is beyond the reach of traditional mapping methods. 
Combining de novo assembly and a related reference transcriptome for read assignment also addresses the bias/error in contigs caused by the dependence on a related reference alone. Empirical data corroborate these findings when assembling transcriptomes from the two non-model organisms: Parachondrostoma toxostoma (fish) and Quercus pubescens (plant). For the fish species, out of the 31,944 genes known from D. rerio, the guided and de novo assemblies recover respectively 20,605 and 20,032 genes but the performance of the guided assembly approach is much higher for both the contiguity and completeness metrics. For the oak, out of the 29,971 genes known from Vitis vinifera, the transcriptome-guided and de novo assemblies display similar performance, but the new guided approach detects 16,326 genes where the de novo assembly only detects 9,385 genes.
|
81
|
Duan J, Sanggaard KW, Schauser L, Lauridsen SE, Enghild JJ, Schierup MH, Wang T. Transcriptome analysis of the response of Burmese python to digestion. Gigascience 2017; 6:1-18. [PMID: 28873961 PMCID: PMC5597892 DOI: 10.1093/gigascience/gix057] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 11/23/2016] [Revised: 04/12/2017] [Accepted: 07/06/2017] [Indexed: 12/13/2022] Open
Abstract
Exceptional and extreme feeding behaviour makes the Burmese python (Python bivittatus) an interesting model to study physiological remodelling and metabolic adaptation in response to refeeding after prolonged starvation. In this study, we used transcriptome sequencing of 5 visceral organs during fasting as well as 24 hours and 48 hours after ingestion of a large meal to unravel the postprandial changes in Burmese pythons. We first used the pooled data to perform a de novo assembly of the transcriptome and supplemented this with a proteomic survey of enzymes in the plasma and gastric fluid. We constructed a high-quality transcriptome with 34 423 transcripts, of which 19 713 (57%) were annotated. Among highly expressed genes (fragments per kilo base per million sequenced reads > 100 in 1 tissue), we found that the transition from fasting to digestion was associated with differential expression of 43 genes in the heart, 206 genes in the liver, 114 genes in the stomach, 89 genes in the pancreas, and 158 genes in the intestine. We interrogated the function of these genes to test previous hypotheses on the response to feeding. We also used the transcriptome to identify 314 secreted proteins in the gastric fluid of the python. Digestion was associated with an upregulation of genes related to metabolic processes, and translational changes therefore appear to support the postprandial rise in metabolism. We identify stomach-related proteins from a digesting individual and demonstrate that the sensitivity of modern liquid chromatography/tandem mass spectrometry equipment allows the identification of gastric juice proteins that are present during digestion.
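The expression threshold quoted in this abstract (fragments per kilobase per million sequenced reads, FPKM, above 100 in one tissue) follows the standard FPKM definition, which can be sketched as below. The function names and all numeric inputs are hypothetical and are not taken from the study; the sketch only shows the arithmetic and the "above the threshold in at least one tissue" rule.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Equivalent to count / (length_kb * library_size_in_millions).
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_highly_expressed(fpkm_by_tissue, threshold=100.0):
    """True if the gene exceeds the threshold in at least one tissue."""
    return any(value > threshold for value in fpkm_by_tissue.values())
```

For example, under these assumptions a 2,000 bp transcript with 500 fragments in a library of 10 million mapped fragments has an FPKM of 25, which would not pass a threshold of 100 on its own.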
Affiliation(s)
- Jinjie Duan
  - Bioinformatics Research Center, Aarhus University, C.F. Moellers Alle 8, Aarhus C, Denmark
- Kristian Wejse Sanggaard
  - Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus C, Denmark
  - Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Gustav Wieds Vej 14, Aarhus C, Denmark
- Sanne Enok Lauridsen
  - Department of Bioscience, Aarhus University, Ny Munkegade 116, Aarhus C, Denmark
- Jan J. Enghild
  - Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus C, Denmark
  - Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Gustav Wieds Vej 14, Aarhus C, Denmark
- Mikkel Heide Schierup
  - Bioinformatics Research Center, Aarhus University, C.F. Moellers Alle 8, Aarhus C, Denmark
  - Department of Bioscience, Aarhus University, Ny Munkegade 116, Aarhus C, Denmark
- Tobias Wang
  - Department of Bioscience, Aarhus University, Ny Munkegade 116, Aarhus C, Denmark
|
82
|
Gravouil K, Ferru-Clément R, Colas S, Helye R, Kadri L, Bourdeau L, Moumen B, Mercier A, Ferreira T. Transcriptomics and Lipidomics of the Environmental Strain Rhodococcus ruber Point out Consumption Pathways and Potential Metabolic Bottlenecks for Polyethylene Degradation. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2017; 51:5172-5181. [PMID: 28345896 DOI: 10.1021/acs.est.7b00846] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Indexed: 06/06/2023]
Abstract
Polyethylene (PE), one of the most prominent synthetic polymers used worldwide, is very poorly biodegradable in the natural environment. Consequently, PE represents by itself more than half of all plastic wastes. PE biodegradation is achieved through the combination of abiotic and biotic processes. Several microorganisms have been shown to grow on the surface of PE materials, among which are the species of the Rhodococcus genus, suggesting a potent ability of these microorganisms to use, at least partly, PE as a potent carbon source. However, most of them, if not all, fail to induce a clear-cut degradation of PE samples, showing that bottlenecks to reach optimal biodegradation clearly exist. To identify the pathways involved in PE consumption, we used in the present study a combination of RNA-sequencing and lipidomic strategies. We show that short-term exposure to various forms of PE, displaying different molecular weight distributions and oxidation levels, leads to an increase in the expression of 158 genes in a Rhodococcus representative, R. ruber. Interestingly, one of the most up-regulated pathways is related to alkane degradation and β-oxidation of fatty acids. This approach also allowed us to identify metabolic limiting steps, which could be fruitfully targeted for optimized PE consumption by R. ruber.
Affiliation(s)
- Kévin Gravouil
  - Cooperative Laboratory ThanaplastSP-Carbios, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
- Romain Ferru-Clément
  - Laboratory of Signalisation and Membrane Ionic Transports, National Center for Scientific Research STIM CNRS ERL 7368, University of Poitiers, Poitiers 86073, France
- Steven Colas
  - Cooperative Laboratory ThanaplastSP-Carbios, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
- Reynald Helye
  - Laboratory of Signalisation and Membrane Ionic Transports, National Center for Scientific Research STIM CNRS ERL 7368, University of Poitiers, Poitiers 86073, France
- Linette Kadri
  - Cooperative Laboratory ThanaplastSP-Carbios, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
- Ludivine Bourdeau
  - Cooperative Laboratory ThanaplastSP-Carbios, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
- Bouziane Moumen
  - Team Ecology, Evolution, Symbiosis, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
- Anne Mercier
  - Cooperative Laboratory ThanaplastSP-Carbios, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
- Thierry Ferreira
  - Cooperative Laboratory ThanaplastSP-Carbios, Laboratory of Ecological and Biological Interactions, National Center for Scientific Research UMR 7267, University of Poitiers, Poitiers 86073, France
  - Laboratory of Signalisation and Membrane Ionic Transports, National Center for Scientific Research STIM CNRS ERL 7368, University of Poitiers, Poitiers 86073, France
|