For: Dozmorov MG, Adrianto I, Giles CB, Glass E, Glenn SB, Montgomery C, Sivils KL, Olson LE, Iwayama T, Freeman WM, Lessard CJ, Wren JD. Detrimental effects of duplicate reads and low complexity regions on RNA- and ChIP-seq data. BMC Bioinformatics 2015;16 Suppl 13:S10. [PMID: 26423047 PMCID: PMC4597324 DOI: 10.1186/1471-2105-16-s13-s10] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/30/2022] Open


Cited by Other Article(s)

Paiva I, Seguin J, Grgurina I, Singh AK, Cosquer B, Plassard D, Tzeplaeff L, Le Gras S, Cotellessa L, Decraene C, Gambi J, Alcala-Vida R, Eswaramoorthy M, Buée L, Cassel JC, Giacobini P, Blum D, Merienne K, Kundu TK, Boutillier AL. Dysregulated expression of cholesterol biosynthetic genes in Alzheimer's disease alters epigenomic signatures of hippocampal neurons. Neurobiol Dis 2024:106538. [PMID: 38789057 DOI: 10.1016/j.nbd.2024.106538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 05/18/2024] [Accepted: 05/20/2024] [Indexed: 05/26/2024] Open

Affiliation(s)

Isabel Paiva University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France.
Jonathan Seguin University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Iris Grgurina University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Akash Kumar Singh Transcription and Disease Laboratory, Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR), Bangalore, India
Brigitte Cosquer University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Damien Plassard University of Strasbourg, CNRS UMR7104, Inserm U1258 - GenomEast Platform - IGBMC - Institut de Génétique et de Biologie Moléculaire et Cellulaire, F-67404 Illkirch, France
Laura Tzeplaeff University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Stephanie Le Gras University of Strasbourg, CNRS UMR7104, Inserm U1258 - GenomEast Platform - IGBMC - Institut de Génétique et de Biologie Moléculaire et Cellulaire, F-67404 Illkirch, France
Ludovica Cotellessa University of Lille, Inserm, CHU Lille, Laboratory of Development and Plasticity of the Postnatal Brain, Lille Neuroscience & Cognition, UMR-S1172, FHU 1000 Days for Health, 59000 Lille, France
Charles Decraene University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Johanne Gambi University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Rafael Alcala-Vida University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Muthusamy Eswaramoorthy Chemistry and Physics of Materials Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore, India
Luc Buée University of Lille, Inserm, CHU Lille, UMR-S1172 LilNCog - Lille Neuroscience & Cognition, Lille, France; Alzheimer and Tauopathies, LabEx DISTALZ, France
Jean-Christophe Cassel University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Paolo Giacobini University of Lille, Inserm, CHU Lille, Laboratory of Development and Plasticity of the Postnatal Brain, Lille Neuroscience & Cognition, UMR-S1172, FHU 1000 Days for Health, 59000 Lille, France
David Blum University of Lille, Inserm, CHU Lille, UMR-S1172 LilNCog - Lille Neuroscience & Cognition, Lille, France; Alzheimer and Tauopathies, LabEx DISTALZ, France
Karine Merienne University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France
Tapas K Kundu Transcription and Disease Laboratory, Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR), Bangalore, India
Anne-Laurence Boutillier University of Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France; CNRS, UMR7364 - Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg F-67000, France.

Tzeplaeff L, Seguin J, Le Gras S, Megat S, Cosquer B, Plassard D, Dieterlé S, Paiva I, Picchiarelli G, Decraene C, Alcala-Vida R, Cassel JC, Merienne K, Dupuis L, Boutillier AL. Mutant FUS induces chromatin reorganization in the hippocampus and alters memory processes. Prog Neurobiol 2023;227:102483. [PMID: 37327984 DOI: 10.1016/j.pneurobio.2023.102483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 05/12/2023] [Accepted: 06/09/2023] [Indexed: 06/18/2023]

Affiliation(s)

Laura Tzeplaeff Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France; Université de Strasbourg, INSERM, UMR-S1118, Strasbourg, France
Jonathan Seguin Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Stéphanie Le Gras Université de Strasbourg, CNRS UMR 7104, INSERM U1258, GenomEast Platform, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Université de Strasbourg, Illkirch, France
Salim Megat Université de Strasbourg, INSERM, UMR-S1118, Strasbourg, France
Brigitte Cosquer Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Damien Plassard Université de Strasbourg, CNRS UMR 7104, INSERM U1258, GenomEast Platform, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Université de Strasbourg, Illkirch, France
Stéphane Dieterlé Université de Strasbourg, INSERM, UMR-S1118, Strasbourg, France
Isabel Paiva Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Gina Picchiarelli Université de Strasbourg, INSERM, UMR-S1118, Strasbourg, France
Charles Decraene Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Rafael Alcala-Vida Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Jean-Christophe Cassel Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Karine Merienne Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France; CNRS, UMR 7364, Strasbourg 67000, France
Luc Dupuis Université de Strasbourg, INSERM, UMR-S1118, Strasbourg, France.
Anne-Laurence Boutillier Université de Strasbourg, Laboratoire de Neuroscience Cognitives et Adaptatives (LNCA), Strasbourg, France.

Katsantoni M, van Nimwegen E, Zavolan M. Improved analysis of (e)CLIP data with RCRUNCH yields a compendium of RNA-binding protein binding sites and motifs. Genome Biol 2023;24:77. [PMID: 37069586 PMCID: PMC10108518 DOI: 10.1186/s13059-023-02913-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Accepted: 03/29/2023] [Indexed: 04/19/2023] Open

Adetunji MO, Abraham BJ. SEAseq: a portable and cloud-based chromatin occupancy analysis suite. BMC Bioinformatics 2022;23:77. [PMID: 35193506 PMCID: PMC8864840 DOI: 10.1186/s12859-022-04588-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 01/28/2022] [Indexed: 11/26/2022] Open

Abstract

Background

Genome-wide protein-DNA binding is popularly assessed using specific antibody pulldown in Chromatin Immunoprecipitation Sequencing (ChIP-Seq) or Cleavage Under Targets and Release Using Nuclease (CUT&RUN) sequencing experiments. These technologies generate high-throughput sequencing data that necessitate the use of multiple sophisticated, computationally intensive genomic tools to make discoveries, but these genomic tools often have a high barrier to use because of computational resource constraints.

Results

We present a comprehensive, infrastructure-independent computational pipeline called SEAseq, which leverages field-standard, open-source tools for processing and analyzing ChIP-Seq/CUT&RUN data. SEAseq performs extensive analyses from the raw output of the experiment, including alignment, peak calling, motif analysis, promoter and metagene coverage profiling, peak annotation distribution, clustered/stitched peak (e.g. super-enhancer) identification, and multiple relevant quality assessment metrics, as well as automatic interfacing with data in GEO/SRA. SEAseq provides a rapid and cost-effective resource for analysis of both new and publicly available datasets, as demonstrated in our comparative case studies.

Conclusions

The easy-to-use and versatile design of SEAseq makes it a reliable and efficient resource for ensuring high-quality analysis. Its cloud implementation enables a broad suite of analyses in environments with constrained computational resources. SEAseq is platform-independent and is intended to be usable by anyone, with or without programming skills. It is available on the cloud at https://platform.stjude.cloud/workflows/seaseq and can be locally installed from the repository at https://github.com/stjude/seaseq.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-022-04588-z.

Selberherr E, Penz T, König L, Conrady B, Siegl A, Horn M, Schmitz-Esser S. The life cycle-dependent transcriptional profile of the obligate intracellular amoeba symbiont Amoebophilus asiaticus. FEMS Microbiol Ecol 2022;98:6499296. [PMID: 34999767 PMCID: PMC8831229 DOI: 10.1093/femsec/fiac001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/29/2021] [Revised: 12/22/2021] [Accepted: 01/04/2022] [Indexed: 12/04/2022] Open

Qi X, Gu H, Qu L. Transcriptome-Wide Analyses Identify Dominant as the Predominantly Non-Conservative Alternative Splicing Inheritance Patterns in F1 Chickens. Front Genet 2021;12:774240. [PMID: 34925458 PMCID: PMC8678468 DOI: 10.3389/fgene.2021.774240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2021] [Accepted: 11/05/2021] [Indexed: 11/25/2022] Open

Gajos M, Jasnovidova O, van Bömmel A, Freier S, Vingron M, Mayer A. Conserved DNA sequence features underlie pervasive RNA polymerase pausing. Nucleic Acids Res 2021;49:4402-4420. [PMID: 33788942 PMCID: PMC8096220 DOI: 10.1093/nar/gkab208] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 12/11/2020] [Revised: 03/05/2021] [Accepted: 03/15/2021] [Indexed: 12/17/2022] Open

Direct Nanopore Sequencing of mRNA Reveals Landscape of Transcript Isoforms in Apicomplexan Parasites. mSystems 2021;6:6/2/e01081-20. [PMID: 33688018 PMCID: PMC8561664 DOI: 10.1128/msystems.01081-20] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/19/2022] Open

Abstract

Alternative splicing is a widespread phenomenon in metazoans by which single genes are able to produce multiple isoforms of the gene product. However, this has been poorly characterized in apicomplexans, a major phylum of some of the most important global parasites. Efforts have been hampered by atypical transcriptomic features, such as the high AU content of Plasmodium RNA, but also the limitations of short-read sequencing in deciphering complex splicing events. In this study, we utilized the long read direct RNA sequencing platform developed by Oxford Nanopore Technologies to survey the alternative splicing landscape of Toxoplasma gondii and Plasmodium falciparum. We find that while native RNA sequencing has a reduced throughput, it allows us to obtain full-length or nearly full-length transcripts with comparable quantification to Illumina sequencing. By comparing these data with available gene models, we find widespread alternative splicing, particularly intron retention, in these parasites. Most of these transcripts contain premature stop codons, suggesting that in these parasites, alternative splicing represents a pathway to transcriptomic diversity, rather than expanding proteomic diversity. Moreover, alternative splicing rates are comparable between parasites, suggesting a shared splicing machinery, despite notable transcriptomic differences between the parasites. This study highlights a strategy in using long-read sequencing to understand splicing events at the whole-transcript level and has implications in the future interpretation of transcriptome sequencing studies.

IMPORTANCE We have used a novel nanopore sequencing technology to directly analyze parasite transcriptomes. The very long reads of this technology reveal the full-length genes of the parasites that cause malaria and toxoplasmosis. Gene transcripts must undergo a process called splicing before they can be translated to protein. Our analysis reveals that these parasites very frequently only partially process their gene products, in a manner that departs dramatically from their human hosts.

Inheritance patterns of the transcriptome in hybrid chickens and their parents revealed by expression analysis. Sci Rep 2019;9:5750. [PMID: 30962479 PMCID: PMC6453914 DOI: 10.1038/s41598-019-42019-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/13/2018] [Accepted: 03/22/2019] [Indexed: 12/11/2022] Open

Owen N, Moosajee M. RNA-sequencing in ophthalmology research: considerations for experimental design and analysis. Ther Adv Ophthalmol 2019;11:2515841419835460. [PMID: 30911735 PMCID: PMC6421592 DOI: 10.1177/2515841419835460] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/03/2018] [Accepted: 02/08/2019] [Indexed: 12/13/2022] Open

Renaud G, Schubert M, Sawyer S, Orlando L. Authentication and Assessment of Contamination in Ancient DNA. Methods Mol Biol 2019;1963:163-194. [PMID: 30875054 DOI: 10.1007/978-1-4939-9176-1_17] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]

Wang Q, Mank JE, Li J, Yang N, Qu L. Allele-Specific Expression Analysis Does Not Support Sex Chromosome Inactivation on the Chicken Z Chromosome. Genome Biol Evol 2017;9:619-626. [PMID: 28391319 PMCID: PMC5381566 DOI: 10.1093/gbe/evx031] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Accepted: 02/22/2017] [Indexed: 12/27/2022] Open

Abstract

Heterogametic sex chromosomes have evolved many times independently, and in many cases, the loss of functional genes from the sex-limited Y or W chromosome leaves only one functional gene copy on the corresponding X or Z chromosome in the heterogametic sex. Because gene dose often correlates with gene expression level, this difference in gene dose between males and females for X- or Z-linked genes in some cases has selected for chromosome-wide transcriptional dosage compensation mechanisms to counteract any reduction in expression in the heterogametic sex. These mechanisms are thought to restore the balance between sex-linked loci and the autosomal genes they interact with, and this also typically results in equal expression between the sexes. However, dosage compensation in many other species is incomplete, and in the case of birds average expression from males (ZZ) remains higher than in females (ZW). Interestingly, recent reports in chickens and related species have shown that the Z chromosome is expressed less in males than would be expected from two copies of the chromosome, and recent data from cell-based approaches on 11 loci in chicken have suggested that one Z chromosome is partially inactivated in males, in a mechanism thought to be homologous to X inactivation in therian mammals. In the present study, we use controlled crosses in three tissues to test for the presence of Z inactivation in males, which would be expected to bias transcription to the active gene copy (allele-specific expression). We show that for the vast majority of genes on the chicken Z chromosome, males express both parental alleles at statistically similar levels, indicating no Z chromosome inactivation. For those Z chromosome loci with detectable ASE in males, we show that the most likely cause is cis-regulatory variation, rather than Z chromosome inactivation. Taken together, our results indicate that unlike the X chromosome in mammals, Z inactivation does not affect an appreciable number of loci in chicken.
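
The abstract above does not say which statistical test was used to decide whether two parental alleles are expressed "at statistically similar levels". As a minimal sketch only, one common way to frame the question is an exact two-sided binomial test of the allele-specific read counts against an expected 50:50 ratio. The function names, the 0.05 threshold, and the read counts below are illustrative assumptions, not the authors' method.

```python
from math import exp, lgamma, log


def binom_pmf(i: int, n: int, p: float) -> float:
    """Binomial probability of i successes in n trials, computed in log space
    so that large read counts do not overflow."""
    log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return exp(log_coeff + i * log(p) + (n - i) * log(1 - p))


def two_sided_binom_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed count k."""
    if n == 0:
        return 1.0
    probs = [binom_pmf(i, n, p) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(pr for pr in probs if pr <= cutoff))


def shows_ase(maternal_reads: int, paternal_reads: int, alpha: float = 0.05) -> bool:
    """Hypothetical helper: flag a locus as allele-specifically expressed when
    the two parental read counts depart significantly from a 50:50 ratio."""
    n = maternal_reads + paternal_reads
    return two_sided_binom_p(maternal_reads, n) < alpha


# Illustrative counts only: a balanced locus and a strongly biased one.
print(shows_ase(52, 48))
print(shows_ase(90, 10))
```

A real analysis would also need multiple-testing correction across loci and some handling of reference-mapping bias, neither of which this sketch attempts.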

Gebert D, Hewel C, Rosenkranz D. unitas: the universal tool for annotation of small RNAs. BMC Genomics 2017;18:644. [PMID: 28830358 PMCID: PMC5567656 DOI: 10.1186/s12864-017-4031-9] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 03/29/2017] [Accepted: 08/07/2017] [Indexed: 12/21/2022] Open

Klepikova AV, Kasianov AS, Chesnokov MS, Lazarevich NL, Penin AA, Logacheva M. Effect of method of deduplication on estimation of differential gene expression using RNA-seq. PeerJ 2017;5:e3091. [PMID: 28321364 PMCID: PMC5357343 DOI: 10.7717/peerj.3091] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/29/2016] [Accepted: 02/14/2017] [Indexed: 12/11/2022] Open

Wright MN, Gola D, Ziegler A. Preprocessing and Quality Control for Whole-Genome Sequences from the Illumina HiSeq X Platform. Methods Mol Biol 2017;1666:629-647. [PMID: 28980267 DOI: 10.1007/978-1-4939-7274-6_30] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/14/2022]

Chaitankar V, Karakülah G, Ratnapriya R, Giuste FO, Brooks MJ, Swaroop A. Next generation sequencing technology and genomewide data analysis: Perspectives for retinal research. Prog Retin Eye Res 2016;55:1-31. [PMID: 27297499 DOI: 10.1016/j.preteyeres.2016.06.001] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 03/15/2016] [Revised: 06/06/2016] [Accepted: 06/08/2016] [Indexed: 02/08/2023]

Trimming of sequence reads alters RNA-Seq gene expression estimates. BMC Bioinformatics 2016;17:103. [PMID: 26911985 PMCID: PMC4766705 DOI: 10.1186/s12859-016-0956-2] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Received: 10/09/2015] [Accepted: 02/19/2016] [Indexed: 01/08/2023] Open

Abstract

BACKGROUND

High-throughput RNA-Sequencing (RNA-Seq) has become the preferred technique for studying gene expression differences between biological samples and for discovering novel isoforms, though the techniques to analyze the resulting data are still immature. One pre-processing step that is widely but heterogeneously applied is trimming, in which low quality bases, identified by the probability that they are called incorrectly, are removed. However, the impact of trimming on subsequent alignment to a genome could influence downstream analyses including gene expression estimation; we hypothesized that this might occur in an inconsistent manner across different genes, resulting in differential bias.

RESULTS

To assess the effects of trimming on gene expression, we generated RNA-Seq data sets from four samples of larval Drosophila melanogaster sensory neurons, and used three trimming algorithms (SolexaQA, Trimmomatic, and ConDeTri) to perform quality-based trimming across a wide range of stringencies. After aligning the reads to the D. melanogaster genome with TopHat2, we used Cuffdiff2 to compare the original, untrimmed gene expression estimates to those following trimming. With the most aggressive trimming parameters, over ten percent of genes had significant changes in their estimated expression levels. This trend was seen with two additional RNA-Seq data sets and with alternative differential expression analysis pipelines. We found that the majority of the expression changes could be mitigated by imposing a minimum length filter following trimming, suggesting that the differential gene expression was primarily being driven by spurious mapping of short reads. Slight differences with the untrimmed data set remained after length filtering, which were associated with genes with low exon numbers and high GC content. Finally, an analysis of paired RNA-seq/microarray data sets suggests that no or modest trimming results in the most biologically accurate gene expression estimates.

CONCLUSIONS

We find that aggressive quality-based trimming has a large impact on the apparent makeup of RNA-Seq-based gene expression estimates, and that short reads can have a particularly strong impact. We conclude that implementation of trimming in RNA-Seq analysis workflows warrants caution, and if used, should be used in conjunction with a minimum read length filter to minimize the introduction of unpredictable changes in expression estimates.
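
The recommendation above, quality-based trimming followed by a minimum read length filter, can be illustrated with a short sketch. This is not the algorithm used by SolexaQA, Trimmomatic, or ConDeTri; it is a deliberately simple 3'-end trimmer, and the function names, the Phred threshold of 20, and the minimum length of 20 bases are assumptions chosen for illustration. It relies only on the standard Phred definition, in which a quality score Q corresponds to an error probability of 10^(-Q/10).

```python
from typing import List, Tuple

Read = Tuple[str, List[int]]  # (sequence, per-base Phred quality scores)


def phred_to_error_prob(q: int) -> float:
    """Convert a Phred score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def trim_3prime(seq: str, quals: List[int], min_q: int = 20) -> Read:
    """Remove trailing bases whose quality is below min_q.
    Low-quality bases in the interior of the read are left in place."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def trim_and_filter(reads: List[Read], min_q: int = 20, min_len: int = 20) -> List[Read]:
    """Trim every read, then discard reads shorter than min_len so that very
    short fragments, which are prone to spurious mapping, never reach the aligner."""
    kept = []
    for seq, quals in reads:
        trimmed_seq, trimmed_quals = trim_3prime(seq, quals, min_q)
        if len(trimmed_seq) >= min_len:
            kept.append((trimmed_seq, trimmed_quals))
    return kept


# Illustrative reads: the second collapses to 3 bases after trimming and is dropped.
reads = [
    ("ACGT" * 6, [35] * 22 + [12, 8]),
    ("ACGTACGTAC", [30, 30, 30] + [5] * 7),
]
print(len(trim_and_filter(reads)))
```

The length filter is applied after trimming rather than before it, because it is the trimmed length that determines how uniquely a read can be placed on the genome.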

Wren JD, Thakkar S, Homayouni R, Johann DJ, Dozmorov MG. Proceedings of the 2015 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference. BMC Bioinformatics 2015;16 Suppl 13:S1. [PMID: 26424691 PMCID: PMC4596983 DOI: 10.1186/1471-2105-16-s13-s1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open