1. Emerging Functions for snoRNAs and snoRNA-Derived Fragments. Int J Mol Sci 2021; 22:ijms221910193. [PMID: 34638533 PMCID: PMC8508363 DOI: 10.3390/ijms221910193] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 08/31/2021] [Revised: 09/16/2021] [Accepted: 09/16/2021] [Indexed: 12/11/2022] Open
Abstract
The widespread implementation of mass sequencing has revealed a diverse landscape of small RNAs derived from larger precursors. Whilst many of these are likely to be byproducts of degradation, there are nevertheless metabolically stable fragments derived from tRNAs, rRNAs, snoRNAs, and other non-coding RNA, with a number of examples of the production of such fragments being conserved across species. Coupled with specific interactions to RNA-binding proteins and a growing number of experimentally reported examples suggesting function, a case is emerging whereby the biological significance of small non-coding RNAs extends far beyond miRNAs and piRNAs. Related to this, a similarly complex picture is emerging of non-canonical roles for the non-coding precursors, such as for snoRNAs that are also implicated in such areas as the silencing of gene expression and the regulation of alternative splicing. This is in addition to a body of literature describing snoRNAs as an additional source of miRNA-like regulators. This review seeks to highlight emerging roles for such non-coding RNA, focusing specifically on “new” roles for snoRNAs and the small fragments derived from them.
2. Shen ZA, Luo T, Zhou YK, Yu H, Du PF. NPI-GNN: Predicting ncRNA-protein interactions with deep graph neural networks. Brief Bioinform 2021; 22:6210071. [PMID: 33822882 DOI: 10.1093/bib/bbab051] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 11/17/2020] [Revised: 01/29/2021] [Accepted: 02/01/2021] [Indexed: 12/23/2022] Open
Abstract
Noncoding RNAs (ncRNAs) play crucial roles in many biological processes. Experimental methods for identifying ncRNA-protein interactions (NPIs) are costly and time-consuming, and many computational approaches have been developed as alternatives. In this work, we collected five benchmarking datasets for predicting NPIs. Based on these datasets, we evaluated and compared the prediction performance of existing machine-learning-based methods. The graph neural network (GNN) is a recently developed deep learning algorithm for link prediction on complex networks that had not previously been applied to predicting NPIs. We constructed a GNN-based method, called Noncoding RNA-Protein Interaction prediction using Graph Neural Networks (NPI-GNN), to predict NPIs. The NPI-GNN method achieved performance comparable with state-of-the-art methods in a 5-fold cross-validation. In addition, it is capable of predicting novel interactions based on network information and sequence information. We also found that insufficient sequence information does not much affect the NPI-GNN prediction performance, which makes NPI-GNN more robust than other methods. As far as we can tell, NPI-GNN is the first end-to-end GNN predictor for predicting NPIs. All benchmarking datasets in this work and all source code of the NPI-GNN method have been deposited with documentation in a GitHub repository (https://github.com/AshuiRUA/NPI-GNN).
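The 5-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the NPI-GNN repository: the function name `k_fold_splits`, the pair identifiers, and the fixed seed are hypothetical, and the sampling of negative (non-interacting) pairs that a link-prediction benchmark would also need is left out.

```python
import random


def k_fold_splits(pairs, k=5, seed=0):
    """Yield (train, test) lists so that each pair is tested exactly once.

    `pairs` is a list of (ncRNA_id, protein_id) tuples standing in for
    known interactions; the identifiers are illustrative only.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    folds = [shuffled[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


# Hypothetical toy edge list of ten ncRNA-protein pairs.
toy_pairs = [(f"ncRNA{i}", f"protein{i}") for i in range(10)]
for fold_index, (train, test) in enumerate(k_fold_splits(toy_pairs)):
    print(fold_index, len(train), len(test))
```

Each fold holds out a disjoint fifth of the known pairs, so every interaction contributes to exactly one test set.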
Affiliation(s)
- Zi-Ang Shen: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Tao Luo: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Yuan-Ke Zhou: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Han Yu: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Pu-Feng Du: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
3. Zhou YK, Shen ZA, Yu H, Luo T, Gao Y, Du PF. Predicting lncRNA-Protein Interactions With miRNAs as Mediators in a Heterogeneous Network Model. Front Genet 2020; 10:1341. [PMID: 32038709 PMCID: PMC6988623 DOI: 10.3389/fgene.2019.01341] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/09/2019] [Accepted: 12/09/2019] [Indexed: 01/20/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) play important roles in various biological processes, where lncRNA–protein interactions are usually involved. Therefore, identifying lncRNA–protein interactions is of great significance for understanding the molecular functions of lncRNAs. Since experiments to identify lncRNA–protein interactions are costly and time-consuming, computational methods have been developed as alternative approaches. However, existing lncRNA–protein interaction predictors usually require prior knowledge of lncRNA–protein interactions supported by experimental evidence, and their performance is limited by the number of known lncRNA–protein interactions. In this paper, we explored a novel way to predict lncRNA–protein interactions without direct prior knowledge. We used miRNAs as mediators to estimate potential interactions between lncRNAs and proteins. When validated against known lncRNA–protein interactions, our method achieved an AUROC (area under the receiver operating characteristic curve) of 0.821, which is comparable to state-of-the-art methods. Moreover, our method achieved an improved AUROC of 0.852 after the training dataset was further expanded. We believe that our method can be a useful supplement to existing methods, as it provides an alternative way to estimate lncRNA–protein interactions in a heterogeneous network without direct prior knowledge. All data and code for this work can be downloaded from GitHub (https://github.com/zyk2118216069/LncRNA-protein-interactions-prediction).
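As a rough illustration of the mediator idea and of how an AUROC is obtained, the sketch below scores each lncRNA-protein pair by the number of miRNAs linked to both, then computes the AUROC as the probability that a known pair outranks an unknown one. This is an assumption made for illustration and is not the scoring used in the paper; the function names and every identifier in the toy data are hypothetical.

```python
from itertools import product


def mediator_scores(lnc_to_mirna, prot_to_mirna):
    """Score each (lncRNA, protein) pair by its count of shared miRNAs.

    Assumption for illustration: a plain shared-neighbour count, not the
    heterogeneous network model described in the abstract.
    """
    return {
        (lnc, prot): len(lnc_to_mirna[lnc] & prot_to_mirna[prot])
        for lnc, prot in product(sorted(lnc_to_mirna), sorted(prot_to_mirna))
    }


def auroc(scores, positives):
    """Rank-based AUROC: P(score of a positive > score of a negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for pair, s in scores.items() if pair in positives]
    neg = [s for pair, s in scores.items() if pair not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy associations.
lnc_to_mirna = {"lncA": {"miR-1", "miR-2"}, "lncB": {"miR-3"}}
prot_to_mirna = {"P1": {"miR-1", "miR-2"}, "P2": {"miR-2", "miR-3"}}
known_pairs = {("lncA", "P1"), ("lncB", "P2")}

scores = mediator_scores(lnc_to_mirna, prot_to_mirna)
print(auroc(scores, known_pairs))
```

The pairwise comparison is quadratic in the number of pairs, which is acceptable for a toy example; a sorted-rank formulation would be preferable on real data.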
Affiliation(s)
- Yuan-Ke Zhou: College of Intelligence and Computing, Tianjin University, Tianjin, China
- Zi-Ang Shen: College of Intelligence and Computing, Tianjin University, Tianjin, China
- Han Yu: College of Intelligence and Computing, Tianjin University, Tianjin, China
- Tao Luo: College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yang Gao: School of Medicine, Nankai University, Tianjin, China
- Pu-Feng Du: College of Intelligence and Computing, Tianjin University, Tianjin, China
4.
Abstract
Long noncoding RNAs (lncRNAs) have gained widespread attention in recent years as a potentially new and crucial layer of biological regulation. lncRNAs of all kinds have been implicated in a range of developmental processes and diseases, but knowledge of the mechanisms by which they act is still surprisingly limited, and claims that almost the entirety of the mammalian genome is transcribed into functional noncoding transcripts remain controversial. At the same time, a small number of well-studied lncRNAs have given us important clues about the biology of these molecules, and a few key functional and mechanistic themes have begun to emerge, although the robustness of these models and classification schemes remains to be seen. Here, we review the current state of knowledge of the lncRNA field, discussing what is known about the genomic contexts, biological functions, and mechanisms of action of lncRNAs. We also reflect on how the recent interest in lncRNAs is deeply rooted in biology's longstanding concern with the evolution and function of genomes.
Affiliation(s)
- Johnny T Y Kung: Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02114, USA
5. Ong SAK, Lin HH, Chen YZ, Li ZR, Cao Z. Efficacy of different protein descriptors in predicting protein functional families. BMC Bioinformatics 2007; 8:300. [PMID: 17705863 PMCID: PMC1997217 DOI: 10.1186/1471-2105-8-300] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Received: 11/01/2006] [Accepted: 08/17/2007] [Indexed: 02/02/2023] Open
Abstract
Background: Sequence-derived structural and physicochemical descriptors have frequently been used in machine learning prediction of protein functional families. There is therefore a need to comparatively evaluate the effectiveness of these descriptor-sets using the same method and parameter optimization algorithm, and to examine whether the combined use of these descriptor-sets helps to improve predictive performance. Six individual descriptor-sets and four combination-sets were evaluated in support vector machine (SVM) prediction of six protein functional families. Results: The performance of these descriptor-sets was ranked by Matthews correlation coefficient (MCC), and the sets were categorized into two groups based on their performance. While there is no overwhelmingly favourable choice of descriptor-sets, certain trends were found. The combination-sets tend to give slightly but consistently higher MCC values and thus the best overall performance, such that three out of four combination-sets show slightly better performance compared to one out of six individual descriptor-sets. Conclusion: Our study suggests that currently used descriptor-sets are generally useful for classifying proteins and that prediction performance may be enhanced by exploring combinations of descriptors.
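The Matthews correlation coefficient used for the ranking can be computed from a binary confusion matrix as MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below applies that standard formula and ranks a few descriptor-sets by it; the set names and counts are hypothetical and are not results from the study.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts (tp, tn, fp, fn) per descriptor-set.
results = {
    "descriptor_set_A": (80, 900, 20, 20),
    "descriptor_set_B": (70, 910, 10, 30),
    "combination_set_AB": (85, 905, 15, 15),
}

# Rank the sets from highest to lowest MCC.
ranking = sorted(results, key=lambda name: mcc(*results[name]), reverse=True)
for name in ranking:
    print(name, round(mcc(*results[name]), 3))
```

MCC is often preferred over plain accuracy when the positive and negative classes differ greatly in size, because it uses all four cells of the confusion matrix.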
Affiliation(s)
- Serene AK Ong: Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 08-14, 3 Science Drive 2, Singapore 117543, Singapore
- Hong Huang Lin: Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 08-14, 3 Science Drive 2, Singapore 117543, Singapore
- Yu Zong Chen: Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 08-14, 3 Science Drive 2, Singapore 117543, Singapore
- Ze Rong Li: College of Chemistry, Sichuan University, Chengdu, 610064, P.R. China
- Zhiwei Cao: Shanghai Center for Bioinformatics Technology, 100, Qinzhou Road, Shanghai 200235 P.R. China
6. Lin HH, Han LY, Zhang HL, Zheng CJ, Xie B, Chen YZ. Prediction of the functional class of lipid binding proteins from sequence-derived properties irrespective of sequence similarity. J Lipid Res 2006; 47:824-31. [PMID: 16443826 DOI: 10.1194/jlr.m500530-jlr200] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022] Open
Abstract
Lipid binding proteins play important roles in signaling, regulation, membrane trafficking, immune response, lipid metabolism, and transport. Because of their functional and sequence diversity, it is desirable to explore additional methods for predicting lipid binding proteins irrespective of sequence similarity. This work explores the use of support vector machines (SVMs) as such a method. SVM prediction systems are developed using 14,776 lipid binding and 133,441 nonlipid binding proteins and are evaluated by an independent set of 6,768 lipid binding and 64,761 nonlipid binding proteins. The computed prediction accuracy is 78.9, 79.5, 82.2, 79.5, 84.4, 76.6, 90.6, 79.0, and 89.9% for lipid degradation, lipid metabolism, lipid synthesis, lipid transport, lipid binding, lipopolysaccharide biosynthesis, lipoprotein, lipoyl, and all lipid binding proteins, respectively. The accuracy for the nonmember proteins of each class is 99.9, 99.2, 99.6, 99.8, 99.9, 99.8, 98.5, 99.9, and 97.0%, respectively. Comparable accuracies are obtained when homologous proteins are considered as one, or by using a different SVM kernel function. Our method predicts 86.8% of the 76 lipid binding proteins nonhomologous to any protein in the Swiss-Prot database and 89.0% of the 73 known lipid binding domains as lipid binding. These findings suggest the usefulness of SVMs for facilitating the prediction of lipid binding proteins. Our software can be accessed at the SVMProt server (http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi).
Affiliation(s)
- H H Lin: Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Singapore 117543
7. Han LY, Cai CZ, Lo SL, Chung MCM, Chen YZ. Prediction of RNA-binding proteins from primary sequence by a support vector machine approach. RNA (New York, N.Y.) 2004; 10:355-68. [PMID: 14970381 PMCID: PMC1370931 DOI: 10.1261/rna.5890304] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.3] [Received: 05/19/2003] [Accepted: 10/06/2003] [Indexed: 05/20/2023]
Abstract
Elucidation of the interaction of proteins with different molecules is of significance in the understanding of cellular processes. Computational methods have been developed for the prediction of protein-protein interactions. But insufficient attention has been paid to the prediction of protein-RNA interactions, which play central roles in regulating gene expression and certain RNA-mediated enzymatic processes. This work explored the use of a machine learning method, support vector machines (SVM), for the prediction of RNA-binding proteins directly from their primary sequence. Based on the knowledge of known RNA-binding and non-RNA-binding proteins, an SVM system was trained to recognize RNA-binding proteins. A total of 4011 RNA-binding and 9781 non-RNA-binding proteins was used to train and test the SVM classification system, and an independent set of 447 RNA-binding and 4881 non-RNA-binding proteins was used to evaluate the classification accuracy. Testing results using this independent evaluation set show a prediction accuracy of 94.1%, 79.3%, and 94.1% for rRNA-, mRNA-, and tRNA-binding proteins, and 98.7%, 96.5%, and 99.9% for non-rRNA-, non-mRNA-, and non-tRNA-binding proteins, respectively. The SVM classification system was further tested on a small class of snRNA-binding proteins with only 60 available sequences. The prediction accuracy is 40.0% and 99.9% for snRNA-binding and non-snRNA-binding proteins, indicating a need for a sufficient number of proteins to train SVM. The SVM classification systems trained in this work were added to our Web-based protein functional classification software SVMProt, at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. Our study suggests the potential of SVM as a useful tool for facilitating the prediction of protein-RNA interactions.
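The paired figures in this abstract (accuracy for a binding class and for its non-binding counterpart) can be read as sensitivity and specificity. The sketch below, using hypothetical counts rather than the study's data, shows how the two are computed and why a single overall accuracy can hide weak performance on a small positive class such as the snRNA-binding proteins.

```python
def classwise_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity, overall accuracy) from counts.

    sensitivity: fraction of true class members predicted as members
    specificity: fraction of non-members predicted as non-members
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    overall = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, overall


# Hypothetical evaluation set: 10 members of a rare class, 1000 non-members.
sens, spec, overall = classwise_accuracy(tp=4, fn=6, tn=999, fp=1)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} overall={overall:.3f}")
```

With these toy counts most class members are missed, yet the overall accuracy stays high because non-members dominate the evaluation set, which is why reporting both class-wise figures is informative.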
Affiliation(s)
- Lian Yi Han: Department of Computational Science, National University of Singapore, Singapore 117543
8. Dimario PJ. Cell and Molecular Biology of Nucleolar Assembly and Disassembly. International Review of Cytology 2004; 239:99-178. [PMID: 15464853 DOI: 10.1016/s0074-7696(04)39003-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Indexed: 12/23/2022]
Abstract
Nucleoli disassemble in prophase of the metazoan mitotic cycle, and they begin their reassembly (nucleologenesis) in late anaphase–early telophase. Nucleolar disassembly and reassembly were obvious to the early cytologists of the eighteenth and nineteenth centuries, and although this has led to a plethora of literature describing these events, our understanding of the molecular mechanisms regulating nucleolar assembly and disassembly has expanded immensely just within the last 10-15 years. We briefly survey the findings of nineteenth-century cytologists on nucleolar assembly and disassembly, followed by the work of Heitz and McClintock on nucleolar organizers. A primer review of nucleolar structure and functions precedes detailed descriptions of modern molecular and microscopic studies of nucleolar assembly and disassembly. Nucleologenesis is concurrent with the reinitiation of rDNA transcription in telophase. The perichromosomal sheath, prenucleolar bodies, and nucleolus-derived foci serve as repositories for nucleolar processing components used in the previous interphase. Disassembly of the perichromosomal sheath along with the dynamic movements and compositional changes of the prenucleolar bodies and nucleolus-derived foci coincide with reactivation of rDNA synthesis within the chromosomal nucleolar organizers during telophase. Nucleologenesis is considered in various model organisms to provide breadth to our understanding. Nucleolar disassembly occurs at the onset of mitosis primarily as a result of the mitosis-specific phosphorylation of Pol I transcription factors and processing components. Although we have learned much regarding nucleolar assembly and disassembly, many questions still remain, and these questions are as vibrant for us today as early questions were for nineteenth- and early twentieth-century cytologists.
Affiliation(s)
- Patrick J Dimario: Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803-1715, USA
9. Fedorova L, Fedorov A. Introns in gene evolution. Contemporary Issues in Genetics and Evolution 2003. [DOI: 10.1007/978-94-010-0229-5_3] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022]
10. Okuwaki M, Tsujimoto M, Nagata K. The RNA binding activity of a ribosome biogenesis factor, nucleophosmin/B23, is modulated by phosphorylation with a cell cycle-dependent kinase and by association with its subtype. Mol Biol Cell 2002; 13:2016-30. [PMID: 12058066 PMCID: PMC117621 DOI: 10.1091/mbc.02-03-0036] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.0] [Indexed: 11/11/2022] Open
Abstract
Nucleophosmin/B23 is a nucleolar phosphoprotein. It has been shown that B23 binds to nucleic acids, digests RNA, and is localized in nucleolar granular components from which preribosomal particles are transported to the cytoplasm. The intracellular localization of B23 changes significantly during the cell cycle. Here, we have examined the cellular localization of B23 proteins and the effect of mitotic phosphorylation of B23.1 on its RNA binding activity. Two splicing variants of B23 proteins, termed B23.1 and B23.2, were complexed both in vivo and in vitro. The RNA binding activity of B23.1 was impaired by hetero-oligomer formation with B23.2. Both subtypes of B23 proteins were phosphorylated during mitosis by cyclin B/cdc2. The RNA binding activity of B23.1 was repressed through cyclin B/cdc2-mediated phosphorylation at specific sites in B23. Thus, the RNA binding activity of B23.1 is stringently modulated by its phosphorylation and subtype association. Interphase B23.1 was mainly localized in nucleoli, whereas B23.2 and mitotic B23.1, which were incapable of binding to RNA, were dispersed throughout the nucleoplasm and cytoplasm, respectively. These results suggest that nucleolar localization of B23.1 is mediated by its ability to associate with RNA.
Affiliation(s)
- Mitsuru Okuwaki: Department of Infection Biology, Institute of Basic Medical Sciences, University of Tsukuba, 1-1-1 Tennohdai, Tsukuba 305-8575, Japan
11. King TH, Decatur WA, Bertrand E, Maxwell ES, Fournier MJ. A well-connected and conserved nucleoplasmic helicase is required for production of box C/D and H/ACA snoRNAs and localization of snoRNP proteins. Mol Cell Biol 2001; 21:7731-46. [PMID: 11604509 PMCID: PMC99944 DOI: 10.1128/mcb.21.22.7731-7746.2001] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.0] [Indexed: 11/20/2022] Open
Abstract
Biogenesis of small nucleolar RNA-protein complexes (snoRNPs) consists of synthesis of the snoRNA and protein components, snoRNP assembly, and localization to the nucleolus. Recently, two nucleoplasmic proteins from mice were observed to bind to a model box C/D snoRNA in vitro, suggesting that they function at an early stage in snoRNP biogenesis. Both proteins have been described in other contexts. The proteins, called p50 and p55 in the snoRNA binding study, are highly conserved and related to each other. Both have Walker A and B motifs characteristic of ATP- and GTP-binding and nucleoside triphosphate-hydrolyzing domains, and the mammalian orthologs have DNA helicase activity in vitro. Here, we report that the Saccharomyces cerevisiae ortholog of p50 (Rvb2, Tih2p, and other names) is required for production of C/D snoRNAs in vivo and, surprisingly, H/ACA snoRNAs as well. Point mutations in the Walker A and B motifs cause temperature-sensitive or lethal growth phenotypes and severe defects in snoRNA accumulation. Notably, depletion of p50 (called Rvb2 in this study) also impairs localization of C/D and H/ACA core snoRNP proteins Nop1p and Gar1p, suggesting a defect(s) in snoRNP assembly or trafficking to the nucleolus. Findings from other studies link Rvb2 orthologs with chromatin remodeling and transcription. Taken together, the present results indicate that Rvb2 is involved in an early stage of snoRNP biogenesis and may play a role in coupling snoRNA synthesis with snoRNP assembly and localization.
Affiliation(s)
- T H King: Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, 01003, USA