1
|
The World of Stable Ribonucleoproteins and Its Mapping With Grad-Seq and Related Approaches. Front Mol Biosci 2021; 8:661448. [PMID: 33898526 PMCID: PMC8058203 DOI: 10.3389/fmolb.2021.661448] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/30/2021] [Accepted: 03/04/2021] [Indexed: 12/13/2022]
Abstract
Macromolecular complexes of proteins and RNAs are essential building blocks of cells. These stable supramolecular particles can be viewed as minimal biochemical units whose structural organization, i.e., the way the RNA and the protein interact with each other, is directly linked to their biological function. Whether those are dynamic regulatory ribonucleoproteins (RNPs) or integrated molecular machines involved in gene expression, the comprehensive knowledge of these units is critical to our understanding of key molecular mechanisms and cell physiology phenomena. Such is the goal of diverse complexomic approaches and in particular of the recently developed gradient profiling by sequencing (Grad-seq). By separating cellular protein and RNA complexes on a density gradient and quantifying their distributions genome-wide by mass spectrometry and deep sequencing, Grad-seq charts global landscapes of native macromolecular assemblies. In this review, we propose a function-based ontology of stable RNPs and discuss how Grad-seq and related approaches transformed our perspective of bacterial and eukaryotic ribonucleoproteins by guiding the discovery of new RNA-binding proteins and unusual classes of noncoding RNAs. We highlight some methodological aspects and developments that can further boost the power of this technique and help uncover exciting new biology in understudied and challenging biological models.
|
2
|
Abstract
The complexome of a cell is the entirety of its complexes. Complexome capture studies have mostly focused on protein-protein interactions, which has left a gap in our knowledge of the global interactions of RNAs. To overcome these limitations, we recently introduced gradient profiling by sequencing (Grad-seq), which analyzes in a high-throughput fashion soluble cellular complexes after their separation in a glycerol gradient by fraction-wise RNA-seq and mass spectrometry. Here, we describe a detailed Grad-seq protocol for Streptococcus pneumoniae, which should also be applicable to other bacterial species.
|
3
|
Synergistic defects in pre-rRNA processing from mutations in the U3-specific protein Rrp9 and U3 snoRNA. Nucleic Acids Res 2020; 48:3848-3868. [PMID: 31996908 PMCID: PMC7144924 DOI: 10.1093/nar/gkaa066] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/19/2019] [Revised: 01/17/2020] [Accepted: 01/22/2020] [Indexed: 01/24/2023]
Abstract
U3 snoRNA and the associated Rrp9/U3-55K protein are essential for 18S rRNA production by the SSU-processome complex. U3 and Rrp9 are required for early pre-rRNA cleavages at sites A0, A1 and A2, but the mechanism remains unclear. Substitution of Arg 289 in Rrp9 to Ala (R289A) specifically reduced cleavage at sites A1 and A2. Surprisingly, R289 is located on the surface of the Rrp9 β-propeller structure opposite to U3 snoRNA. To understand this, we first characterized the protein-protein interaction network of Rrp9 within the SSU-processome. This identified a direct interaction between the Rrp9 β-propeller domain and Rrp36, the strength of which was reduced by the R289A substitution, implicating this interaction in the observed processing phenotype. The Rrp9 R289A mutation also showed strong synergistic negative interactions with mutations in U3 that destabilize the U3/pre-rRNA base-pair interactions or reduce the length of their linking segments. We propose that the Rrp9 β-propeller and U3/pre-rRNA binding cooperate in the structure or stability of the SSU-processome. Additionally, our analysis of U3 variants gave insights into the function of individual segments of the 5′-terminal 72-nt sequence of U3. We interpret these data in the light of recently reported SSU-processome structures.
|
4
|
Identifying the Translatome of Mouse NEBD-Stage Oocytes via SSP-Profiling; A Novel Polysome Fractionation Method. Int J Mol Sci 2020; 21:ijms21041254. [PMID: 32070012 PMCID: PMC7072993 DOI: 10.3390/ijms21041254] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/07/2020] [Revised: 02/03/2020] [Accepted: 02/10/2020] [Indexed: 12/13/2022]
Abstract
Meiotic maturation of oocyte relies on pre-synthesised maternal mRNA, the translation of which is highly coordinated in space and time. Here, we provide a detailed polysome profiling protocol that demonstrates a combination of the sucrose gradient ultracentrifugation in small SW55Ti tubes with the qRT-PCR-based quantification of 18S and 28S rRNAs in fractionated polysome profile. This newly optimised method, named Scarce Sample Polysome Profiling (SSP-profiling), is suitable for both scarce and conventional sample sizes and is compatible with downstream RNA-seq to identify polysome associated transcripts. Utilising SSP-profiling we have assayed the translatome of mouse oocytes at the onset of nuclear envelope breakdown (NEBD)—a developmental point, the study of which is important for furthering our understanding of the molecular mechanisms leading to oocyte aneuploidy. Our analyses identified 1847 transcripts with moderate to strong polysome occupancy, including abundantly represented mRNAs encoding mitochondrial and ribosomal proteins, proteasomal components, glycolytic and amino acids synthetic enzymes, proteins involved in cytoskeleton organization plus RNA-binding and translation initiation factors. In addition to transcripts encoding known players of meiotic progression, we also identified several mRNAs encoding proteins of unknown function. Polysome profiles generated using SSP-profiling were more than comparable to those developed using existing conventional approaches, being demonstrably superior in their resolution, reproducibility, versatility, speed of derivation and downstream protocol applicability.
|
5
|
Extracellular RNA Profile in Mesenteric Lymph from Exemplar Rat Models of Acute and Critical Illness. Lymphat Res Biol 2019; 17:512-517. [PMID: 30864890 DOI: 10.1089/lrb.2018.0044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
Background: Mesenteric lymph (ML) has been implicated in the development of multiple organ dysfunction syndrome in critical illness. Extracellular RNAs play a role in cell-to-cell communication during physiological and disease processes but they are rarely studied in ML. We aimed at examining the RNA profiles of peripheral plasma, ML, and ML's extracellular vesicle (ML-EV) and triglyceride-rich lipoprotein (ML-TRL) fractions, obtained from rodent models of critical illness. Methods and Results: We collected ML for 5 hours from rodent models of critical illness [Acute Pancreatitis, Cecal Ligation and Incision (CLI), Gut Ischemia-Reperfusion (IR)] and matching Sham control rats. ML-EV and ML-TRL fractions were also isolated. RNA sequencing was performed on the RNA extracted from ML, ML-EV, ML-TRL, and plasma by using the Ion Torrent Personal Genome Machine platform. RNA sequences were searched using the Basic Local Alignment Search Tool against rat genome and RefSeq, microRNA (miRNA), genomic tRNA, functional RNA, and Genbank nucleotide databases, and the read counts were analyzed. Each sample type had a distinct RNA profile. ML contained more RNA per volume and a larger proportion of tRNA fragments than plasma. ML-EVs were enriched with miRNA, whereas ML-TRLs contained low absolute amounts of RNA. The RNA size profiles for CLI and Gut IR were different from Sham. ML carried intestinal RNAs and in a CLI model it was significantly enriched with bacterial RNA sequences. Conclusions: We found the distinct but diverse RNA profiles of ML and its compartments, and their different profiles in critical illness. Intestinal-derived small RNAs in ML may have a direct role in critical illness and utility as potential biomarkers.
|
6
|
Comparative transcriptomics in Leishmania braziliensis: disclosing differential gene expression of coding and putative noncoding RNAs across developmental stages. RNA Biol 2019; 16:639-660. [PMID: 30689499 DOI: 10.1080/15476286.2019.1574161] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/04/2023]
Abstract
Leishmaniasis is a worldwide public health problem caused by protozoan parasites of the genus Leishmania. Leishmania braziliensis is the most important species responsible for tegumentary leishmaniases in Brazil. An understanding of the molecular mechanisms underlying the success of this parasite is urgently needed. An in-depth study on the modulation of gene expression across the life cycle stages of L. braziliensis covering coding and noncoding RNAs (ncRNAs) was missing and is presented herein. Analyses of differentially expressed (DE) genes revealed that most prominent differences were observed between the transcriptomes of insect and mammalian proliferative forms (6,576 genes). Gene ontology (GO) analysis indicated stage-specific enriched biological processes. A computational pipeline and 5 ncRNA predictors allowed the identification of 11,372 putative ncRNAs. Most of the DE ncRNAs were found between the transcriptomes of insect and mammalian proliferative stages (38%). Of the DE ncRNAs, 295 were DE in all three stages and displayed a wide range of lengths, chromosomal distributions and locations; many of them had a distinct expression profile compared to that of their protein-coding neighbors. Thirty-five putative ncRNAs were submitted to northern blotting analysis, and one or more hybridization-positive signals were observed in 22 of these ncRNAs. This work presents an overview of the L. braziliensis transcriptome and its adjustments throughout development. In addition to determining the general features of the transcriptome at each life stage and the profile of protein-coding transcripts, we identified and characterized a variety of noncoding transcripts. The novel putative ncRNAs uncovered in L. braziliensis might be regulatory elements to be further investigated.
|
7
|
Discovery of new RNA classes and global RNA-binding proteins. Curr Opin Microbiol 2017; 39:152-160. [PMID: 29179042 DOI: 10.1016/j.mib.2017.11.016] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 10/05/2017] [Accepted: 11/17/2017] [Indexed: 12/15/2022]
Abstract
The identification of new RNA functions and the functional annotation of transcripts in genomes represent exciting yet challenging endeavours of modern biology. Crucial insights into the biological roles of RNA molecules can be gained from the identification of the proteins with which they form specific complexes. Modern interactome techniques make it possible to profile RNA-protein interactions in a genome-wide manner and identify new RNA classes associated with globally acting RNA-binding proteins. Applied to a variety of organisms, these methods are already revolutionising our understanding of RNA-mediated biological processes. Here, we focus on one such approach, gradient sequencing or Grad-seq, which has recently guided the discovery of protein ProQ and its associated small RNAs as a new domain of post-transcriptional control in bacteria.
|
8
|
Two genetic codes: Repetitive syntax for active non-coding RNAs; non-repetitive syntax for the DNA archives. Commun Integr Biol 2017; 10:e1297352. [PMID: 29149223 PMCID: PMC5398208 DOI: 10.1080/19420889.2017.1297352] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 02/14/2017] [Accepted: 02/16/2017] [Indexed: 02/06/2023]
Abstract
Current knowledge of the RNA world indicates 2 different genetic codes being present throughout the living world. In contrast to non-coding RNAs that are built of repetitive nucleotide syntax, the sequences that serve as templates for proteins share, as their main characteristic, a non-repetitive syntax. Whereas non-coding RNAs build groups that serve as regulatory tools in nearly all genetic processes, the coding sections represent the evolutionarily successful function of the genetic information storage medium. This indicates that the differences in their syntax structure are coherent with the differences of the functions they represent. Interestingly, these 2 genetic codes resemble the function of all natural languages, i.e., the repetitive non-coding sequences serve as an appropriate tool for organization, coordination and regulation of group behavior, and the non-repetitive coding sequences are for conservation of instrumental constructions, plans, blueprints for complex protein-body architecture. This differentiation may help to better understand RNA group behavioral motifs.
|
9
|
Grad-seq guides the discovery of ProQ as a major small RNA-binding protein. Proc Natl Acad Sci U S A 2016; 113:11591-11596. [PMID: 27671629 DOI: 10.1073/pnas.1609981113] [Citation(s) in RCA: 203] [Impact Index Per Article: 25.4] [Indexed: 12/11/2022]
Abstract
The functional annotation of transcriptomes and identification of noncoding RNA (ncRNA) classes has been greatly facilitated by the advent of next-generation RNA sequencing which, by reading the nucleotide order of transcripts, theoretically allows the rapid profiling of all transcripts in a cell. However, primary sequence per se is a poor predictor of function, as ncRNAs dramatically vary in length and structure and often lack identifiable motifs. Therefore, to visualize an informative RNA landscape of organisms with potentially new RNA biology that are emerging from microbiome and environmental studies requires the use of more functionally relevant criteria. One such criterion is the association of RNAs with functionally important cognate RNA-binding proteins. Here we analyze the full ensemble of cellular RNAs using gradient profiling by sequencing (Grad-seq) in the bacterial pathogen Salmonella enterica, partitioning its coding and noncoding transcripts based on their network of RNA-protein interactions. In addition to capturing established RNA classes based on their biochemical profiles, the Grad-seq approach enabled the discovery of an overlooked large collective of structured small RNAs that form stable complexes with the conserved protein ProQ. We show that ProQ is an abundant RNA-binding protein with a wide range of ligands and a global influence on Salmonella gene expression. Given its generic ability to chart a functional RNA landscape irrespective of transcript length and sequence diversity, Grad-seq promises to define functional RNA classes and major RNA-binding proteins in both model species and genetically intractable organisms.
|
10
|
Secondary structure-based analysis of mouse brain small RNA sequences obtained by using next-generation sequencing. Genomics 2015; 106:122-8. [DOI: 10.1016/j.ygeno.2015.05.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/23/2015] [Revised: 05/05/2015] [Accepted: 05/13/2015] [Indexed: 01/21/2023]
|
11
|
SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics. Bioinformatics 2015; 31:2489-96. [PMID: 25838465 PMCID: PMC4514930 DOI: 10.1093/bioinformatics/btv185] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 11/07/2014] [Accepted: 03/25/2015] [Indexed: 01/19/2023]
Abstract
Motivation: RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of O(n^6). Subsequently, numerous faster ‘Sankoff-style’ approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics have been limited to high complexity (≥ quartic time). Results: Breaking this barrier, we introduce the novel Sankoff-style algorithm ‘sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)’, which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff’s original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurately than RAF, which uses sequence-based heuristics. Availability and implementation: SPARSE is freely available at http://www.bioinf.uni-freiburg.de/Software/SPARSE. Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online.
|
12
|
Abstract
The revolution of miRNA discovery, in the early 2000s, shed new light on the exciting field of small non-coding RNAs. Since then, and owing to outstanding breakthroughs in RNomic techniques, novel small non-coding RNA families have been regularly discovered, e.g., piRNAs, tiRNAs, and many others. In this review, we provide a very succinct historical and functional overview of the most prominent small non-coding RNA families.
|
13
|
Profiling of small RNA cargo of extracellular vesicles shed by Trypanosoma cruzi reveals a specific extracellular signature. Mol Biochem Parasitol 2015; 199:19-28. [DOI: 10.1016/j.molbiopara.2015.03.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 09/09/2014] [Revised: 03/03/2015] [Accepted: 03/09/2015] [Indexed: 12/31/2022]
|
14
|
Generation of a neuro-specific microarray reveals novel differentially expressed noncoding RNAs in mouse models for neurodegenerative diseases. RNA (New York, N.Y.) 2014; 20:1929-43. [PMID: 25344396 PMCID: PMC4238357 DOI: 10.1261/rna.047225.114] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 07/09/2014] [Accepted: 08/27/2014] [Indexed: 05/24/2023]
Abstract
We have generated a novel, neuro-specific ncRNA microarray, covering 1472 ncRNA species, to investigate their expression in different mouse models for central nervous system diseases. Thereby, we analyzed ncRNA expression in two mouse models with impaired calcium channel activity, implicated in epilepsy or Parkinson's disease, respectively, as well as in a mouse model mimicking pathophysiological aspects of Alzheimer's disease. We identified well over a hundred differentially expressed ncRNAs, which either belonged to known classes of ncRNAs, such as miRNAs or snoRNAs, or represented entirely novel ncRNA species. Several differentially expressed ncRNAs in the calcium channel mouse models were assigned as miRNAs and target genes involved in calcium signaling, thus suggesting feedback regulation of miRNAs by calcium signaling. In the Alzheimer mouse model, we identified two snoRNAs, whose expression was deregulated prior to amyloid plaque formation. Interestingly, the presence of snoRNAs could be detected in cerebrospinal fluid samples in humans, thus potentially serving as early diagnostic markers for Alzheimer's disease. In addition to known ncRNA species, we also identified 63 differentially expressed, entirely novel ncRNA candidates, located in intronic or intergenic regions of the mouse genome, genomic locations that have previously been shown to harbor the majority of functional ncRNAs.
|
15
|
BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles. Bioinformatics 2014; 30:i274-82. [PMID: 24931994 PMCID: PMC4058930 DOI: 10.1093/bioinformatics/btu270] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/14/2023]
Abstract
Summary: Non-coding RNAs (ncRNAs) play a vital role in many cellular processes such as RNA splicing, translation, and gene regulation. However, the vast majority of ncRNAs still have no functional annotation. One prominent approach for putative function assignment is clustering of transcripts according to sequence and secondary structure. However, sequence information is changed by post-transcriptional modifications, and secondary structure is only a proxy for the true 3D conformation of the RNA polymer. A different type of information that does not suffer from these issues, and that can be used for the detection of RNA classes, is the pattern of processing and its traces in small RNA-seq read data. Here we introduce BlockClust, an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. We perform both unsupervised clustering and develop family specific discriminative models; finally we show how the proposed approach is scalable, accurate and robust across different organisms, tissues and cell lines. Availability: The whole BlockClust galaxy workflow including all tool dependencies is available at http://toolshed.g2.bx.psu.edu/view/rnateam/blockclust_workflow. Contact: backofen@informatik.uni-freiburg.de; costa@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online.
|
16
|
|
17
|
Abstract
Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs). We report 8,084 RefSeq-annotated isoforms detected as full-length and an additional 5,459 isoforms predicted through statistical inference. Over one-third of these are novel isoforms, including 273 RNAs from gene loci that have not previously been identified. Further characterization of the novel loci indicates that a subset is expressed in pluripotent cells but not in diverse fetal and adult tissues; moreover, their reduced expression perturbs the network of pluripotency-associated genes. Results suggest that gene identification, even in well-characterized human cell lines and tissues, is likely far from complete.
|
18
|
Alternative processing of the U2 small nuclear RNA produces a 19-22nt fragment with relevance for the detection of non-small cell lung cancer in human serum. PLoS One 2013; 8:e60134. [PMID: 23527303 PMCID: PMC3603938 DOI: 10.1371/journal.pone.0060134] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 11/23/2012] [Accepted: 02/21/2013] [Indexed: 12/28/2022]
Abstract
RNU2 exists in two functional forms (RNU2-1 and RNU2-2) distinguishable by the presence of a unique 4-base motif. Detailed investigation of datasets obtained from deep sequencing of five human lung primary tumors revealed that both forms express at a high rate a 19-22nt fragment (miR-U2-1 and -2) that derives from the 3' region and contains the 4-base motif. Deep sequencing of independent pools of serum samples from healthy donors and lung cancer patients revealed that miR-U2-1 and -2 are pervasively processed in lung tissue by means of endonucleolytic cleavages and stably exported to the blood. Microarray hybridization experiments of matched normal/tumor samples then revealed a significant over-expression of miR-U2-1 in 14 of 18 lung primary tumors. Subsequently, qRT-PCR of miR-U2-1 using serum from 62 lung cancer patients and 96 various controls demonstrated that its expression levels identify lung cancer patients with 79% sensitivity and 80% specificity. miR-U2-1 expression correlated with the presence or absence of lung cancer in patients with chronic obstructive pulmonary disease (COPD), in patients with other diseases of the lung (not cancer), and in healthy controls. These data suggest that RNU2-1 is a new bi-functional ncRNA that produces a 19-22nt fragment which may be useful in detecting lung cancer non-invasively in high risk patients.
|
19
|
Profiling and identification of small rDNA-derived RNAs and their potential biological functions. PLoS One 2013; 8:e56842. [PMID: 23418607 PMCID: PMC3572043 DOI: 10.1371/journal.pone.0056842] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Received: 10/10/2012] [Accepted: 01/14/2013] [Indexed: 12/13/2022]
Abstract
Small non-coding RNAs constitute a large family of regulatory molecules with diverse functions. Notably, some small non-coding RNAs matched to rDNA have been identified as qiRNAs and small guide RNAs involved in various biological processes. However, a large number of small rDNA-derived RNAs (srRNAs) are usually neglected and yet to be investigated. We systematically investigated srRNAs using small RNA datasets generated by high-throughput sequencing, and found srRNAs are mainly mapped to rRNA coding regions in sense direction. The datasets from immunoprecipitation and high-throughput sequencing demonstrate that srRNAs are co-immunoprecipitated with Argonaute (AGO) proteins. Furthermore, the srRNA expression profile in mouse liver is affected by diabetes. Overexpression or inhibition of srRNAs in cultured cells shows that srRNAs are involved in various signaling pathways. This study presents a global view of srRNAs in total small RNA and AGO protein complex from different species, and demonstrates that srRNAs are correlated with diabetes, and involved in some biological processes. These findings provide new insights into srRNAs and their functions in various physiological and pathological processes.
|
20
|
Abstract
Next-generation sequencing of noncoding RNA (ncRNA) libraries has become an essential tool for the profiling of ncRNAs and the identification of novel ncRNA species. Here, we describe the generation of a ncRNA-derived complementary DNA (cDNA) library by 3'-tailing of ncRNAs by CTP and poly(A) polymerase, followed by 5'-adapter ligation by T4 RNA ligase and reverse transcription of ncRNAs with an oligo-d(G) anchor primer. Preliminary selection of ncRNAs from ribonucleoprotein particles (RNPs) enables a strong enrichment of the generated libraries with functional regulatory ncRNAs compared to classical approaches.
|
21
|
Processing of snoRNAs as a new source of regulatory non-coding RNAs: snoRNA fragments form a new class of functional RNAs. Bioessays 2012. [PMID: 23180440 DOI: 10.1002/bies.201200117] [Citation(s) in RCA: 118] [Impact Index Per Article: 9.8] [Indexed: 12/21/2022]
Abstract
Recent experimental evidence suggests that most of the genome is transcribed into non-coding RNAs. The initial transcripts undergo further processing, generating shorter, metabolically stable RNAs with diverse functions. Small nucleolar RNAs (snoRNAs) are non-coding RNAs that modify rRNAs, tRNAs, and snRNAs that were considered stable. We review evidence that snoRNAs undergo further processing. High-throughput sequencing and RNase protection experiments showed widespread expression of snoRNA fragments, known as snoRNA-derived RNAs (sdRNAs). Some sdRNAs resemble miRNAs; these can associate with argonaute proteins and influence translation. Other sdRNAs are longer, form complexes with hnRNPs and influence gene expression. C/D box snoRNA fragmentation patterns are conserved across multiple cell types, suggesting a processing event, rather than degradation. The loss of expression from genetic loci that generate canonical snoRNAs and processed snoRNAs results in diseases, such as Prader-Willi Syndrome, indicating possible physiological roles for processed snoRNAs. We propose that processed snoRNAs acquire new roles in gene expression and represent a new class of regulatory RNAs distinct from canonical snoRNAs.
|
22
|
Expression Profiling of a Heterogeneous Population of ncRNAs Employing a Mixed DNA/LNA Microarray. J Nucleic Acids 2012; 2012:283560. [PMID: 22778910 PMCID: PMC3384982 DOI: 10.1155/2012/283560] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/30/2011] [Revised: 03/06/2012] [Accepted: 03/06/2012] [Indexed: 12/20/2022]
Abstract
Mammalian transcriptomes mainly consist of non-protein-coding RNAs. These ncRNAs play various roles in all cells and are involved in multiple regulation pathways. More recently, ncRNAs have also been described as valuable diagnostic tools. While RNA-seq approaches progressively replace microarray-based technologies for high-throughput expression profiling, they are still not routinely used in diagnostics. Microarrays, on the other hand, are more widely used for diagnostic profiling, especially for very small ncRNAs (e.g., miRNAs), employing locked nucleic acid (LNA) arrays. However, LNA microarrays are quite expensive for high-throughput studies targeting longer ncRNAs, while DNA arrays do not provide satisfying results for the analysis of small RNAs. Here, we describe a mixed DNA/LNA microarray platform, where directly labeled small and longer ncRNAs are hybridized on LNA probes or custom DNA probes, respectively, enabling sensitive and specific analysis of a complex RNA population on a unique array in one single experiment. The DNA/LNA system, requiring relatively low amounts of total RNA, which complies with diagnostic references, was successfully applied to the analysis of differential ncRNA expression in mouse embryonic stem cells and adult brain cells.
|
23
|
Identification of differentially expressed non-coding RNAs in embryonic stem cell neural differentiation. Nucleic Acids Res 2012; 40:6001-15. [PMID: 22492625 PMCID: PMC3401476 DOI: 10.1093/nar/gks311] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/21/2022] Open
Abstract
Protein-coding genes guiding the differentiation of ES cells into neural cells have been studied extensively. However, for the class of ncRNAs, only the involvement of some specific microRNAs (miRNAs) has been described. Thus, to characterize the entire small non-coding RNA (ncRNA) transcriptome involved in the differentiation of mouse ES cells into neural cells, we generated three specialized ribonucleoprotein particle (RNP)-derived cDNA libraries, i.e., from pluripotent ES cells, neural progenitors and differentiated neural cells. By high-throughput sequencing and transcriptional profiling, we identified several novel miRNAs involved in ES cell differentiation, as well as seven small nucleolar RNAs. In addition, expression of 7SL, 7SK and vault-2 RNAs was significantly up-regulated during ES cell differentiation. About half of the ncRNA sequences from the three cDNA libraries mapped to intergenic or intragenic regions, designated as interRNAs and intraRNAs, respectively. These novel ncRNA candidates exhibited a predominant size of 18-30 nt, thus resembling miRNA species, but, with few exceptions, lacked canonical miRNA features. Additionally, these novel intraRNAs and interRNAs were found to be differentially expressed not only in stem-cell derivatives but also in primary cultures of hippocampal neurons and astrocytes, strengthening their potential function in neural ES cell differentiation.
|
24
|
Revealing stable processing products from ribosome-associated small RNAs by deep-sequencing data analysis. Nucleic Acids Res 2012; 40:4013-24. [PMID: 22266655 PMCID: PMC3351166 DOI: 10.1093/nar/gks020] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 01/09/2023] Open
Abstract
The exploration of the non-protein-coding RNA (ncRNA) transcriptome is currently focused on profiling of microRNA expression and detection of novel ncRNA transcription units. However, recent studies suggest that RNA processing can be a multi-layer process leading to the generation of ncRNAs of diverse functions from a single primary transcript. To date, no methodology has been presented to distinguish stable functional RNA species from rapidly degraded side products of nucleases. Thus, the correct assessment of widespread RNA processing events is one of the major obstacles in transcriptome research. Here, we present a novel automated computational pipeline, named APART, providing a complete workflow for the reliable detection of RNA processing products from next-generation sequencing data. The major features include efficient handling of non-unique reads, detection of novel stable ncRNA transcripts and processing products, and annotation of known transcripts based on multiple sources of information. To demonstrate the potential of APART, we analyzed a cDNA library derived from small ribosome-associated RNAs in Saccharomyces cerevisiae. By employing the APART pipeline, we were able to detect, and confirm by independent experimental methods, multiple novel stable RNA molecules differentially processed from well-known ncRNAs, such as rRNAs, tRNAs or snoRNAs, in a stress-dependent manner.
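The abstract above mentions handling of non-unique reads but does not say how APART does it. As a minimal sketch of one common strategy, which is an assumption here and not APART's documented method, each read can be split evenly across all loci it maps to. The function name `fractional_counts` and the locus labels are hypothetical.

```python
from collections import defaultdict


def fractional_counts(alignments):
    """Distribute each read evenly over the distinct loci it maps to.

    `alignments` maps a read ID to the list of loci reported by the aligner.
    This is an illustrative weighting scheme, not the APART implementation.
    """
    counts = defaultdict(float)
    for read_id, loci in alignments.items():
        distinct = set(loci)
        if not distinct:
            continue  # unmapped read: contributes nothing
        weight = 1.0 / len(distinct)
        for locus in distinct:
            counts[locus] += weight
    return dict(counts)


# Toy input: r2 maps equally well to two tRNA gene copies.
example = {
    "r1": ["tRNA-1"],
    "r2": ["tRNA-1", "tRNA-2"],
    "r3": ["snoRNA-1"],
}
result = fractional_counts(example)
```

With this weighting, the total over all loci still equals the number of mapped reads, which keeps expression estimates for multi-copy genes such as rRNAs and tRNAs from being inflated.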
|
25
|
RNA-seq analysis of small RNPs in Trypanosoma brucei reveals a rich repertoire of non-coding RNAs. Nucleic Acids Res 2011; 40:1282-98. [PMID: 21976736 PMCID: PMC3273796 DOI: 10.1093/nar/gkr786] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open
Abstract
The discovery of a plethora of small non-coding RNAs (ncRNAs) has fundamentally changed our understanding of how genes are regulated. In this study, we employed the power of deep sequencing of RNA (RNA-seq) to examine the repertoire of ncRNAs present in small ribonucleoprotein particles (RNPs) of Trypanosoma brucei, an important protozoan parasite. We identified new C/D and H/ACA small nucleolar RNAs (snoRNAs), as well as tens of putative novel non-coding RNAs; several of these are processed from trans-spliced and polyadenylated transcripts. The RNA-seq analysis provided information on the relative abundance of the RNAs and on their 5'- and 3'-termini. The study demonstrated that three highly abundant snoRNAs are involved in rRNA processing and highlighted the unique trypanosome-specific repertoire of these RNAs. Novel RNAs were studied using in situ hybridization, association in RNP complexes, and 'RNA walk' to detect interaction with their target RNAs. Finally, we showed that the abundance of certain ncRNAs varies between the two stages of the parasite, suggesting that ncRNAs may contribute to gene regulation during the parasite's complex life cycle. This is the first study to provide a whole-genome analysis of the large repertoire of small RNPs in trypanosomes.
|
26
|
Direct cloning of double-stranded RNAs from RNase protection analysis reveals processing patterns of C/D box snoRNAs and provides evidence for widespread antisense transcript expression. Nucleic Acids Res 2011; 39:9720-30. [PMID: 21880592 PMCID: PMC3239178 DOI: 10.1093/nar/gkr684] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022] Open
Abstract
We describe a new method that allows cloning of double-stranded RNAs (dsRNAs) that are generated in RNase protection experiments. We demonstrate that the mouse C/D box snoRNA MBII-85 (SNORD116) is processed into at least five shorter RNAs using processing sites near known functional elements of C/D box snoRNAs. Surprisingly, the majority of cloned RNAs from RNase protection experiments were derived from endogenous cellular RNA, indicating widespread antisense expression. The cloned dsRNAs could be mapped to genome areas that show RNA expression on both DNA strands and partially overlapped with experimentally determined argonaute-binding sites. The data suggest a conserved processing pattern for some C/D box snoRNAs and abundant expression of longer, non-coding RNAs in the cell that can potentially form dsRNAs.
|
27
|
|
28
|
Fast and accurate clustering of noncoding RNAs using ensembles of sequence alignments and secondary structures. BMC Bioinformatics 2011; 12 Suppl 1:S48. [PMID: 21342580 PMCID: PMC3044305 DOI: 10.1186/1471-2105-12-s1-s48] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] Open
Abstract
Background: Clustering of unannotated transcripts is an important task to identify novel families of noncoding RNAs (ncRNAs). Several hierarchical clustering methods have been developed using similarity measures based on the scores of structural alignment. However, the high computational cost of exact structural alignment requires these methods to employ approximate algorithms. Such heuristics degrade the quality of clustering results, especially when the similarity among family members is not detectable at the primary sequence level. Results: We describe a new similarity measure for the hierarchical clustering of ncRNAs. The idea is that the reliability of approximate algorithms can be improved by utilizing the information of suboptimal solutions in their dynamic programming frameworks. We approximate structural alignment in a more simplified manner than the existing methods. Instead, our method utilizes all possible sequence alignments and all possible secondary structures, whereas the existing methods only use one optimal sequence alignment and one optimal secondary structure. We demonstrate that this strategy can achieve the best balance between the computational cost and the quality of the clustering. In particular, our method can keep its high performance even when the sequence identity of family members is less than 60%. Conclusions: Our method enables fast and accurate clustering of ncRNAs. The software is available for download at http://bpla-kernel.dna.bio.keio.ac.jp/clustering/.
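To illustrate the hierarchical clustering step that such a similarity measure feeds into, here is a minimal sketch of average-linkage agglomerative clustering on a precomputed similarity matrix. It does not reproduce the authors' measure: the similarity values, the sequence names, the linkage rule and the stopping threshold are all assumptions chosen for the example.

```python
def average_linkage(sim, names, threshold):
    """Agglomerative clustering on a symmetric similarity matrix.

    Repeatedly merges the two clusters with the highest average pairwise
    similarity and stops once that value falls below `threshold`.
    """
    clusters = [[i] for i in range(len(names))]

    def avg_sim(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        s, x, y = best
        if s < threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(names[i] for i in c) for c in clusters]


# Toy similarity scores in [0, 1]; in practice these would come from an
# alignment- or structure-based measure.
names = ["tRNA_a", "tRNA_b", "sno_a", "sno_b"]
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
families = average_linkage(sim, names, threshold=0.5)
```

The quadratic search for the best pair is kept for readability; the quality of the resulting families depends almost entirely on the similarity matrix, which is the part the abstract addresses.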
|
29
|
nocoRNAc: characterization of non-coding RNAs in prokaryotes. BMC Bioinformatics 2011; 12:40. [PMID: 21281482 PMCID: PMC3230914 DOI: 10.1186/1471-2105-12-40] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 08/17/2010] [Accepted: 01/31/2011] [Indexed: 11/10/2022] Open
Abstract
Background: Interest in non-coding RNAs (ncRNAs) has risen steadily during the past few years because of the wide spectrum of biological processes in which they are involved. This led to the discovery of numerous ncRNA genes across many species. However, for most organisms the non-coding transcriptome still remains unexplored to a great extent. Various experimental techniques for the identification of ncRNA transcripts are available, but as these methods are costly and time-consuming, there is a need for computational methods that allow the detection of functional RNAs in complete genomes in order to suggest elements for further experiments. Several programs for the genome-wide prediction of functional RNAs have been developed, but most of them predict a genomic locus with no indication of whether the element is transcribed or not. Results: We present nocoRNAc, a program for the genome-wide prediction of ncRNA transcripts in bacteria. nocoRNAc incorporates various procedures for the detection of transcriptional features, which are then integrated with functional ncRNA loci to determine the transcript coordinates. We applied RNAz and nocoRNAc to the genome of Streptomyces coelicolor and detected more than 800 putative ncRNA transcripts, most of them located antisense to protein-coding regions. Using a custom-designed microarray, we profiled the expression of about 400 of these elements and found more than 300 to be transcribed; 38 of them are predicted novel ncRNA genes in intergenic regions. The expression patterns of many ncRNAs are as complex as those of the protein-coding genes; in particular, many antisense ncRNAs show a high expression correlation with their protein-coding partner. Conclusions: We have developed nocoRNAc, a framework that facilitates the automated characterization of functional ncRNAs. nocoRNAc increases the confidence of predicted ncRNA loci, especially if they contain transcribed ncRNAs. nocoRNAc is not restricted to intergenic regions but is applicable to the prediction of ncRNA transcripts in whole microbial genomes. The software, as well as a user guide and example data, is available at http://www.zbit.uni-tuebingen.de/pas/nocornac.htm.
|
30
|
Abstract
Most, if not all, known noncoding RNAs (ncRNAs) are associated with RNA-binding proteins, thus forming ribonucleoprotein particles or RNPs. Here we describe a protocol for the generation of a specialized cDNA library from RNPs, thereby increasing the proportion of functional ncRNA species in the library. To that end, cellular extracts are fractionated on 10-30% glycerol gradients. Subsequently, RNP-derived ncRNAs are isolated and 3'-tailed with cytidine triphosphate by poly(A) polymerase; this is followed by 5' adapter ligation by T4 RNA ligase. Reverse transcription of ncRNAs into cDNAs is carried out with an oligo-d(G) anchor primer. The generated cDNA libraries are subsequently submitted to high-throughput sequencing. This RNP selection procedure increases the probability of the presence of biologically relevant ncRNA species in the library compared with library generation methods that use size-selected, protein-devoid ncRNAs. The protocol enables the generation of deep-sequencing-compatible cDNA libraries that code for functional ncRNAs within 1 week.
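Reads from a library built this way would be expected to carry the 5' adapter in front of the ncRNA sequence and a run of C residues (from the C-tailing step) behind it. The following is a minimal sketch of how such reads might be cleaned before mapping. The adapter sequence, the function name and the minimum tail length are hypothetical and are not taken from the protocol.

```python
import re

# Hypothetical 5' adapter; the real sequence depends on the protocol used.
ADAPTER_5P = "GTCAGCAATC"


def strip_adapter_and_c_tail(read, adapter=ADAPTER_5P, min_tail=3):
    """Remove a leading 5' adapter and a trailing poly(C) tail from a read.

    Only an exact adapter match at the read start is removed, and only a
    terminal run of at least `min_tail` C residues is treated as the tail.
    """
    if read.startswith(adapter):
        read = read[len(adapter):]
    match = re.search(r"C{%d,}$" % min_tail, read)
    if match:
        read = read[:match.start()]
    return read


trimmed = strip_adapter_and_c_tail("GTCAGCAATC" + "ACGTTGA" + "CCCCCCC")
untouched = strip_adapter_and_c_tail("ACGTTGACC")
```

Note that a homopolymer trim of this kind cannot tell tail-derived C residues from genuine 3'-terminal C residues of the RNA, so the exact 3' end of an ncRNA ending in C remains ambiguous.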
|
31
|
Abstract
In this chapter, we present an up-to-date view of the optimal characteristics of the yeast Saccharomyces cerevisiae as a model eukaryote for systems biology studies, with main molecular mechanisms, biological networks, and sub-cellular organization essentially conserved in all eukaryotes, derived from a complex common ancestor. The existence of advanced tools for molecular studies together with high-throughput experimental and computational methods, most of them being implemented and validated in yeast, with new ones being developed, is opening the way to the characterization of the core modular architecture and complex networks essential to all eukaryotes. Selected examples of the latest discoveries in eukaryote complexity and systems biology studies using yeast as a reference model and their applications in biotechnology and medicine are presented.
|
32
|
Whole transcriptome analysis: what are we still missing? Wiley Interdiscip Rev Syst Biol Med 2010; 3:527-43. [PMID: 21197667 DOI: 10.1002/wsbm.135] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Abstract
New technologies such as tag-based sequencing and tiling arrays have provided unique insights into the transcriptional output of cells. Many new RNA classes have been uncovered in the past decade, despite limitations in current technologies. Even as the repertoire of known functional elements of the transcriptome increases and contemporary technologies become mainstream, inadequacies in conventional protocols for library preparation, sequencing and mapping continue to hamper revelation of the entire transcriptome of cells. In this article, we review current protocols and outline their deficiencies. We also provide our view on what we may be overlooking in the transcriptome, despite exhaustive investigations, and indicate future areas of technological development and research.
|
33
|
Abstract
Genomic tiling arrays, cDNA sequencing and, more recently, RNA-Seq have provided initial insights into the extent and depth of transcribed sequence across human and other genomes. These methods have led to greatly improved annotations of protein-coding genes, but have also identified transcription outside of annotated exons. One resultant issue that has aroused dispute is the balance of transcription of known exons against transcription outside of known exons. While non-genic 'dark matter' transcription was found by tiling arrays to be pervasive, it was seen to contribute only a small percentage of the polyadenylated transcriptome in some RNA-Seq experiments. This apparent contradiction has been compounded by a lack of clarity about what exactly constitutes a protein-coding gene. It remains unclear, for example, whether or not all transcripts that overlap on either strand within a genomic locus should be assigned to a single gene locus, including those that fail to share promoters, exons and splice junctions. The inability of tiling arrays and RNA-Seq to count transcripts, rather than exons or exon pairs, adds to these difficulties. While there is agreement that thousands of apparently non-coding loci are present outside of protein-coding genes in the human genome, there is vigorous debate about what constitutes evidence for their functionality. These issues will only be resolved upon the demonstration, or otherwise, that organismal or cellular phenotypes frequently result when non-coding RNA loci are disrupted.
|