1
Sessegolo C, Cruaud C, Da Silva C, Cologne A, Dubarry M, Derrien T, Lacroix V, Aury JM. Transcriptome profiling of mouse samples using nanopore sequencing of cDNA and RNA molecules. Sci Rep 2019; 9:14908. [PMID: 31624302 PMCID: PMC6797730 DOI: 10.1038/s41598-019-51470-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 03/20/2019] [Accepted: 09/28/2019] [Indexed: 01/27/2023] Open
Abstract
Our view of DNA transcription and splicing has changed dramatically with the introduction of short-read sequencing. These high-throughput sequencing technologies promised to unravel the complexity of any transcriptome. Gene expression levels are generally well captured by these technologies, but caveats remain because of the limited read length and because RNA molecules must be reverse transcribed before sequencing. Oxford Nanopore Technologies has recently launched a portable sequencer that offers the possibility of sequencing long reads and, most importantly, RNA molecules. Here we generated a full mouse transcriptome from brain and liver using the Oxford Nanopore device. As a comparison, we sequenced RNA (RNA-Seq) and cDNA (cDNA-Seq) molecules using both long- and short-read technologies and tested the TeloPrime preparation kit, which is dedicated to the enrichment of full-length transcripts. Using spike-in data, we confirmed that expression levels are efficiently captured by cDNA-Seq using short reads. More importantly, Oxford Nanopore RNA-Seq tends to be more efficient, while cDNA-Seq appears to be more biased. We further show that the cDNA library preparation of the Nanopore protocol induces read truncation for transcripts containing internal runs of T’s. This bias is marked for runs of at least 15 T’s, but is already detectable for runs of at least 9 T’s, and therefore concerns more than 20% of expressed transcripts in mouse brain and liver. Finally, we point out that bioinformatics challenges remain for quantification at the transcript level, especially when reads are not full-length. Accurate quantification of repeat-associated genes such as processed pseudogenes also remains difficult, and we show that current mapping protocols that map reads to the genome largely overestimate their expression, at the expense of their parent gene.
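The run-length thresholds mentioned in the abstract (at least 9 T’s, at least 15 T’s) can be illustrated with a minimal sketch that scans a transcript sequence for runs of T’s. This is not the authors' pipeline: the function names `find_t_runs` and `classify_transcript`, the category labels, and the choice to scan the whole sequence without separating internal runs from terminal ones are all assumptions made for illustration.

```python
import re


def find_t_runs(sequence, min_len=9):
    """Return (start, length) for each maximal run of at least min_len T's.

    Assumes a DNA/cDNA alphabet; the input is upper-cased before scanning.
    """
    pattern = re.compile(r"T{%d,}" % min_len)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(sequence.upper())]


def classify_transcript(sequence):
    """Label a transcript using the two thresholds quoted in the abstract.

    The labels are hypothetical names chosen for this sketch.
    """
    runs = find_t_runs(sequence, min_len=9)
    if not runs:
        return "no_long_t_run"
    longest = max(length for _, length in runs)
    # Runs of >= 15 T's are where the abstract reports a marked bias;
    # runs of 9 to 14 T's are where it is described as already detectable.
    return "marked_bias_expected" if longest >= 15 else "bias_detectable"


if __name__ == "__main__":
    example = "ACGT" * 3 + "T" * 12 + "GGCA"
    print(find_t_runs(example))
    print(classify_transcript(example))
```

Applied to every sequence in an annotation, a scan of this kind would give a rough count of transcripts that fall under each threshold; the proportions reported in the abstract additionally depend on which transcripts are expressed, which this sketch does not model.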
Affiliation(s)
- Camille Sessegolo
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France; EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Corinne Cruaud
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Corinne Da Silva
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Audric Cologne
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France; EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Marion Dubarry
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Thomas Derrien
  - Univ Rennes, CNRS, IGDR (Institut de génétique et développement de Rennes) - UMR 6290, F-35000, Rennes, France
- Vincent Lacroix
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France; EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Jean-Marc Aury
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
2
Sessegolo C, Burlet N, Haudry A. Strong phylogenetic inertia on genome size and transposable element content among 26 species of flies. Biol Lett 2017; 12:rsbl.2016.0407. [PMID: 27576524 PMCID: PMC5014035 DOI: 10.1098/rsbl.2016.0407] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 05/15/2016] [Accepted: 08/08/2016] [Indexed: 01/28/2023] Open
Abstract
While the evolutionary mechanisms driving eukaryote genome size evolution are still debated, repeated element content appears to be crucial. Here, we reconstructed the phylogeny and identified repeats in the genomes of 26 Drosophila species exhibiting a twofold variation in genome size. Transposable element (TE) content is highly correlated with genome size evolution among these closely related species. We detected a strong phylogenetic signal on the evolution of both genome size and TE content, and a genome contraction in the Drosophila melanogaster subgroup.
Affiliation(s)
- Camille Sessegolo
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Claude Bernard Lyon 1, CNRS, UMR5558, 69100 Villeurbanne, France
- Nelly Burlet
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Claude Bernard Lyon 1, CNRS, UMR5558, 69100 Villeurbanne, France
- Annabelle Haudry
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Claude Bernard Lyon 1, CNRS, UMR5558, 69100 Villeurbanne, France