Transcriptomic Analysis Pipeline (TAP) for quality control and functional assessment of transcriptomes.
RESEARCH SQUARE 2023:rs.3.rs-3390128. [PMID: 37886564; PMCID: PMC10602190; DOI: 10.21203/rs.3.rs-3390128/v1]
[Indexed: 10/28/2023]
Abstract
Background
RNA-sequencing (RNA-seq) has revolutionized the exploration of biological mechanisms, shedding light on the roles of non-coding RNAs, including long non-coding RNAs (lncRNAs), in biological processes such as stress responses. Despite these advances, it remains unclear how the choice of RNA-seq library protocol affects comprehensive lncRNA expression analysis, particularly in non-mammalian organisms.
Results
In this study, we sought to bridge this knowledge gap by investigating lncRNA expression patterns in Drosophila melanogaster under thermal stress conditions. To achieve this, we compared two RNA-seq library protocols: polyA+ RNA capture and rRNA-depletion. We developed and applied a Transcriptomic Analysis Pipeline (TAP) designed to systematically assess both the technical and functional dimensions of RNA-seq, enabling a robust comparison of these library protocols. Our findings show that the polyA+ protocol captures the majority of expressed lncRNAs in the D. melanogaster transcriptome. In contrast, rRNA-depletion offered limited advantages for D. melanogaster studies. Notably, the polyA+ protocol performed better in terms of usable read yield and accurate detection of splice junctions.
Conclusions
Our study introduces a versatile transcriptomic analysis pipeline, TAP, designed to uniformly process RNA-seq data from any organism with a reference genome. It also highlights the significance of selecting an appropriate RNA-seq library protocol tailored to the specific research context.
Background
Advances in next-generation sequencing (NGS) technologies enable comprehensive analysis of the genetic sequences of organisms in a relatively cost-effective manner [1, 2]. Among these technologies, RNA-sequencing (RNA-seq) has emerged as a preeminent method for studying fundamental biological mechanisms at the level of cells, tissues, and whole organisms. RNA-seq enables the detection and quantification of various RNA populations, including messenger RNA (mRNA) and several species of non-coding RNA, such as long non-coding RNA (lncRNA), as well as the assessment of features such as splice junctions.