51
Zhao W, Zhou Y, Cui Q, Zhou Y. PACES: prediction of N4-acetylcytidine (ac4C) modification sites in mRNA. Sci Rep 2019; 9:11112. [PMID: 31366994 PMCID: PMC6668381 DOI: 10.1038/s41598-019-47594-7] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 03/08/2019] [Accepted: 07/19/2019] [Indexed: 01/27/2023] Open
Abstract
N4-acetylcytidine (ac4C) is a highly conserved RNA modification and is the first acetylation event described in mRNA. ac4C in mRNA has been shown to be involved in the regulation of mRNA stability, processing and translation, but the exact mechanism by which ac4C works remains unclear. In addition, ac4C is widely distributed within the human transcriptome at physiologically relevant levels, yet so far only a small fraction of modified sequences have been detected experimentally. In this study, we developed a predictor of ac4C sites in human mRNA, named PACES, to help mine possible modified motifs. PACES combines two random forest classifiers, which use the position-specific dinucleotide sequence profile and K-nucleotide frequencies, respectively. With genomic sequences as input, PACES reports possible modified sequences based on the trained model. PACES is freely available at http://www.rnanut.net/paces/.
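As an illustration of the K-nucleotide frequency feature mentioned in this abstract, the following is a minimal sketch of how k-mer frequencies can be computed from a sequence. The function name, the ACGU alphabet, and the T-to-U conversion are assumptions made for this example; they are not taken from the PACES implementation.

```python
from collections import Counter
from itertools import product
from typing import Dict


def k_nucleotide_frequencies(seq: str, k: int = 2) -> Dict[str, float]:
    """Return the relative frequency of every possible k-mer over ACGU.

    Hypothetical helper for illustration only; not the PACES code.
    """
    # Treat DNA-style input as RNA so genomic sequences can be used directly.
    seq = seq.upper().replace("T", "U")
    all_kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    # Slide a window of length k over the sequence.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Ignore windows containing ambiguous characters such as N.
    counts = Counter(w for w in windows if set(w) <= set("ACGU"))
    total = sum(counts.values())
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


freqs = k_nucleotide_frequencies("ACGT", k=2)
print(len(freqs))  # one entry per possible dinucleotide
```

A vector like this has a fixed length (4^k entries) regardless of sequence length, which is what makes it usable as classifier input.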
Affiliation(s)
- Wanqing Zhao
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Yiran Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China.
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China.
- Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China.
52
Tahir M, Tayara H, Chong KT. iPseU-CNN: Identifying RNA Pseudouridine Sites Using Convolutional Neural Networks. Mol Ther Nucleic Acids 2019; 16:463-470. [PMID: 31048185 PMCID: PMC6488737 DOI: 10.1016/j.omtn.2019.03.010] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 02/07/2019] [Revised: 03/29/2019] [Accepted: 03/29/2019] [Indexed: 12/15/2022]
Abstract
Pseudouridine is the most prevalent RNA modification and has been found in both eukaryotes and prokaryotes. Pseudouridine has been identified in several kinds of RNA, such as small nuclear RNA, rRNA, tRNA, mRNA, and small nucleolar RNA, which makes it significant for academic research and drug development. Biochemical experiments can identify pseudouridine sites with good results, but these laboratory methods are expensive and time consuming. Therefore, it is important to introduce efficient methods for identifying pseudouridine sites. In this study, a deep-learning method for identifying pseudouridine sites was developed. The proposed prediction model is called iPseU-CNN (identifying pseudouridine by convolutional neural networks). Existing methods use handcrafted features and machine-learning approaches to identify pseudouridine sites, whereas the proposed predictor extracts the features of pseudouridine sites automatically using a convolutional neural network model. The iPseU-CNN model yields better outcomes than the current state-of-the-art models in all evaluation parameters. The iPseU-CNN predictor is therefore expected to become a helpful tool for academic research on pseudouridine site prediction in RNA, as well as in drug discovery.
Affiliation(s)
- Muhammad Tahir
- Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 54896, South Korea; Department of Computer Science, Abdul Wali Khan University, Mardan 23200, Pakistan
- Hilal Tayara
- Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 54896, South Korea.
- Kil To Chong
- Advanced Electronics and Information Research Center, Chonbuk National University, Jeonju 54896, South Korea.
53
He J, Fang T, Zhang Z, Huang B, Zhu X, Xiong Y. PseUI: Pseudouridine sites identification based on RNA sequence information. BMC Bioinformatics 2018; 19:306. [PMID: 30157750 PMCID: PMC6114832 DOI: 10.1186/s12859-018-2321-0] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 04/19/2018] [Accepted: 08/21/2018] [Indexed: 01/28/2023] Open
Abstract
Background: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, and it significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding these cellular processes. Due to the low efficiency and high cost of currently available experimental methods, it is highly desirable to develop computational methods for accurately and efficiently detecting Ψ sites in RNA sequences. However, the predictive accuracy of existing computational methods is not satisfactory and still needs improvement.
Results: In this study, we developed a new model, PseUI, for Ψ site identification in three species: H. sapiens, S. cerevisiae, and M. musculus. First, five different kinds of features, including nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP), were generated based on RNA segments. Then, a sequential forward feature selection strategy was used to obtain an effective feature subset with a compact representation but discriminative prediction power. Based on the selected feature subsets, we built our model using a support vector machine (SVM). Finally, the generalization of our model was validated by both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than the previously published models. We have also provided a user-friendly web server for our model at http://zhulab.ahu.edu.cn/PseUI, and brief instructions for the web server are provided in this paper. By following these instructions, academic users can conveniently get their desired results without complicated calculations.
Conclusion: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. Our model outperformed the existing state-of-the-art models. It is expected that PseUI will become a useful tool for accurate identification of RNA Ψ sites.
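The sequential forward feature selection step described in this abstract can be sketched as a greedy loop. The version below is a generic illustration, not the PseUI code: the function name, the stopping rule (stop when no candidate improves the score), and the toy scoring function are all assumptions. In practice the score would be something like cross-validated classifier accuracy on the chosen feature subset.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> List[int]:
    """Greedily add the feature that most improves `score` until nothing helps.

    Hypothetical helper for illustration; `score` maps a list of feature
    indices to a quality value (higher is better).
    """
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Evaluate every one-feature extension of the current subset.
        top_score, top_feature = max((score(selected + [f]), f) for f in candidates)
        if top_score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(top_feature)
        best_score = top_score
    return selected


# Toy score (an assumption for the demo): each feature has a fixed usefulness,
# and every selected feature costs 1 to favour compact subsets.
USEFULNESS = [2, 10, 0, 6]


def toy_score(subset: Sequence[int]) -> float:
    return sum(USEFULNESS[i] for i in subset) - len(subset)


print(sequential_forward_selection(4, toy_score, max_features=4))
```

With the toy score, the feature with zero usefulness is never added because including it would lower the score, which mirrors the goal of a compact but discriminative subset.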
Affiliation(s)
- Jingjing He
- School of Life Sciences, Anhui University, Hefei, 230601, Anhui, China
- Ting Fang
- School of Life Sciences, Anhui University, Hefei, 230601, Anhui, China
- Zizheng Zhang
- School of Life Sciences, Anhui University, Hefei, 230601, Anhui, China
- Bei Huang
- School of Life Sciences, Anhui University, Hefei, 230601, Anhui, China
- Xiaolei Zhu
- School of Life Sciences, Anhui University, Hefei, 230601, Anhui, China.
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
54
Morena F, Argentati C, Bazzucchi M, Emiliani C, Martino S. Above the Epitranscriptome: RNA Modifications and Stem Cell Identity. Genes (Basel) 2018; 9:E329. [PMID: 29958477 PMCID: PMC6070936 DOI: 10.3390/genes9070329] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 05/29/2018] [Revised: 06/15/2018] [Accepted: 06/25/2018] [Indexed: 02/07/2023] Open
Abstract
Sequence databases and transcriptome-wide mapping have revealed different reversible and dynamic chemical modifications of the nitrogen bases of RNA molecules. Modifications occur in coding RNAs and noncoding-RNAs post-transcriptionally and they can influence the RNA structure, metabolism, and function. The result is the expansion of the variety of the transcriptome. In fact, depending on the type of modification, RNA molecules enter into a specific program exerting the role of the player or/and the target in biological and pathological processes. Many research groups are exploring the role of RNA modifications (alias epitranscriptome) in cell proliferation, survival, and in more specialized activities. More recently, the role of RNA modifications has been also explored in stem cell biology. Our understanding in this context is still in its infancy. Available evidence addresses the role of RNA modifications in self-renewal, commitment, and differentiation processes of stem cells. In this review, we will focus on five epitranscriptomic marks: N6-methyladenosine, N1-methyladenosine, 5-methylcytosine, Pseudouridine (Ψ) and Adenosine-to-Inosine editing. We will provide insights into the function and the distribution of these chemical modifications in coding RNAs and noncoding-RNAs. Mainly, we will emphasize the role of epitranscriptomic mechanisms in the biology of naïve, primed, embryonic, adult, and cancer stem cells.
Affiliation(s)
- Francesco Morena
- Department of Chemistry, Biology and Biotechnologies, University of Perugia, 06126 Perugia, Italy.
- Chiara Argentati
- Department of Chemistry, Biology and Biotechnologies, University of Perugia, 06126 Perugia, Italy.
- Martina Bazzucchi
- Department of Chemistry, Biology and Biotechnologies, University of Perugia, 06126 Perugia, Italy.
- Carla Emiliani
- Department of Chemistry, Biology and Biotechnologies, University of Perugia, 06126 Perugia, Italy.
- CEMIN, Center of Excellence of Nanostructured Innovative Materials, University of Perugia, 06126 Perugia, Italy.
- Sabata Martino
- Department of Chemistry, Biology and Biotechnologies, University of Perugia, 06126 Perugia, Italy.
- CEMIN, Center of Excellence of Nanostructured Innovative Materials, University of Perugia, 06126 Perugia, Italy.
55
Zhang M, Xu Y, Li L, Liu Z, Yang X, Yu DJ. Accurate RNA 5-methylcytosine site prediction based on heuristic physical-chemical properties reduction and classifier ensemble. Anal Biochem 2018; 550:41-48. [DOI: 10.1016/j.ab.2018.03.027] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 12/25/2017] [Revised: 03/27/2018] [Accepted: 03/28/2018] [Indexed: 11/25/2022]
56
Li YH, Zhang GG. Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans. Oncotarget 2017; 7:19185-92. [PMID: 27027346 PMCID: PMC4991374 DOI: 10.18632/oncotarget.8313] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 01/05/2016] [Accepted: 03/16/2016] [Indexed: 11/25/2022] Open
Abstract
DAF-16, the C. elegans FOXO transcription factor, is an important determinant of aging and longevity. In this work, we manually curated FOXODB (http://lyh.pkmu.cn/foxodb/), a database of FOXO direct targets. It now covers 208 genes. Bioinformatics analysis of 109 DAF-16 direct targets in C. elegans yielded four findings: (i) DAF-16 and the transcription factor PQM-1 co-regulate some targets; (ii) seventeen targets directly regulate lifespan; (iii) four targets are involved in lifespan extension induced by dietary restriction; and (iv) DAF-16 direct targets might play global roles in lifespan regulation.
Affiliation(s)
- Yan-Hui Li
- Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, P. R. China
- Gai-Gai Zhang
- Special Medical Ward (Geratology Department), First Hospital of Tsinghua University, Beijing, P. R. China
57
Chen X, Sun YZ, Liu H, Zhang L, Li JQ, Meng J. RNA methylation and diseases: experimental results, databases, Web servers and computational models. Brief Bioinform 2017; 20:896-917. [DOI: 10.1093/bib/bbx142] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 08/04/2017] [Revised: 09/12/2017] [Indexed: 12/15/2022] Open
Affiliation(s)
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Ya-Zhou Sun
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Hui Liu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Lin Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Jian-Qiang Li
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Jia Meng
- Department of Biological Sciences, Xi’an Jiaotong-Liverpool University
58
Li YH, Zhang GG. Network-based characterization and prediction of human DNA repair genes and pathways. Sci Rep 2017; 8:45714. [PMID: 28368026 PMCID: PMC5377940 DOI: 10.1038/srep45714] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/31/2016] [Accepted: 03/03/2017] [Indexed: 11/14/2022] Open
Abstract
Network biology is a useful strategy for understanding the cell's functional organization. In this study, for the first time, we successfully introduced network approaches to study the properties of human DNA repair genes. We found features that distinguish DNA repair genes from non-DNA repair genes: (i) they tend to have higher degrees; (ii) they tend to be located at the global network center; (iii) they tend to interact directly with each other. Based on these features, we developed the first algorithm to predict new DNA repair genes. We tested several machine-learning models and found that a support vector machine with a radial basis function (RBF) kernel achieves the best performance, with precision = 0.74 and area under the curve (AUC) = 0.96. Finally, we applied the algorithm to predict new DNA repair genes and obtained 32 new candidates. Literature supporting four of the predictions was found. We believe the network approaches introduced here might open a new avenue to understand DNA repair genes and pathways. The suggested algorithm and the predicted genes might be helpful for scientists in the field.
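Two of the network features named in this abstract, node degree and the tendency to interact directly with other known genes, can be computed from an edge list in a few lines. The sketch below is an assumption-laden illustration: the function names and the placeholder gene labels (G1 to G4) are hypothetical, and the paper's actual feature definitions may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (gene, gene) interaction pairs."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degree(adjacency: Dict[str, Set[str]], gene: str) -> int:
    """Number of direct interaction partners of `gene` (0 if absent)."""
    return len(adjacency.get(gene, set()))


def fraction_of_partners_in_set(
    adjacency: Dict[str, Set[str]], gene: str, known_genes: Set[str]
) -> float:
    """Share of a gene's direct partners that belong to `known_genes`."""
    partners = adjacency.get(gene, set())
    if not partners:
        return 0.0
    return len(partners & known_genes) / len(partners)


# Placeholder network; G1..G4 are made-up labels, not real genes.
network = build_adjacency([("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G3", "G4")])
known = {"G1", "G2"}
print(degree(network, "G3"), fraction_of_partners_in_set(network, "G3", known))
```

Feature values like these, computed per gene, are the kind of input a classifier such as an SVM could be trained on.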
Affiliation(s)
- Yan-Hui Li
- Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, P. R. China
- Gai-Gai Zhang
- Special Medical Ward (Geratology Department), First Hospital of Tsinghua University, Beijing, P. R. China
59
Chen W, Lin H. Recent Advances in Identification of RNA Modifications. Noncoding RNA 2016; 3:ncrna3010001. [PMID: 29657273 PMCID: PMC5831996 DOI: 10.3390/ncrna3010001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/22/2016] [Revised: 12/19/2016] [Accepted: 12/23/2016] [Indexed: 12/18/2022] Open
Abstract
RNA modifications are involved in a broad spectrum of biological and physiological processes. To reveal the functions of RNA modifications, it is important to accurately predict their positions. Although high-throughput experimental techniques have been proposed, they are not cost-effective. As a good complement to experiments, many computational methods have been proposed in recent years to predict RNA modification sites. In this review, we summarize the existing computational approaches for predicting RNA modification sites. We also discuss the challenges and future perspectives in developing reliable methods for predicting RNA modification sites.
Affiliation(s)
- Wei Chen
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China.
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
60
Li YH, Zhang GG, Wang N. Systematic Characterization and Prediction of Human Hypertension Genes. Hypertension 2016; 69:349-355. [PMID: 27895194 DOI: 10.1161/hypertensionaha.116.08573] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/12/2016] [Revised: 10/19/2016] [Accepted: 11/09/2016] [Indexed: 01/25/2023]
Abstract
Hypertension is a major cardiovascular risk factor and accounts for a large part of cardiovascular mortality. In this work, we analyzed the properties of hypertension genes and found that when compared with genes not yet known to be involved in hypertension regulation, known hypertension genes display distinguishing features: (1) hypertension genes tend to be located at network center; (2) hypertension genes tend to interact with each other; and (3) hypertension genes tend to enrich in certain biological processes and show certain phenotypes. Based on these features, we developed a machine-learning algorithm to predict new hypertension genes. One hundred and seventy-seven candidates were predicted with a posterior probability >0.9. Evidence supporting 17 of the predictions has been found.
Affiliation(s)
- Yan-Hui Li
- From the Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, People's Republic of China (Y.-H.L., N.W.); Special Medical Ward (Geratology Department), First Hospital of Tsinghua University Beijing, People's Republic of China (G.-G.Z.); and The Advanced Institute for Medical Sciences, Dalian Medical University, China (N.W.).
- Gai-Gai Zhang
- From the Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, People's Republic of China (Y.-H.L., N.W.); Special Medical Ward (Geratology Department), First Hospital of Tsinghua University Beijing, People's Republic of China (G.-G.Z.); and The Advanced Institute for Medical Sciences, Dalian Medical University, China (N.W.)
- Nanping Wang
- From the Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, People's Republic of China (Y.-H.L., N.W.); Special Medical Ward (Geratology Department), First Hospital of Tsinghua University Beijing, People's Republic of China (G.-G.Z.); and The Advanced Institute for Medical Sciences, Dalian Medical University, China (N.W.).
61
Zhou Y, Zeng P, Li YH, Zhang Z, Cui Q. SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res 2016; 44:e91. [PMID: 26896799 PMCID: PMC4889921 DOI: 10.1093/nar/gkw104] [Citation(s) in RCA: 651] [Impact Index Per Article: 72.3] [Received: 10/16/2015] [Accepted: 02/11/2016] [Indexed: 12/26/2022] Open
Abstract
N6-methyladenosine (m6A) is a prevalent RNA methylation modification involved in the regulation of degradation, subcellular localization, splicing and local conformation changes of RNA transcripts. High-throughput experiments have demonstrated that only a small fraction of the m6A consensus motifs in mammalian transcriptomes are modified. Therefore, accurate identification of RNA m6A sites has become urgently important. To this end, a computational predictor of mammalian m6A sites named SRAMP is established here. To depict the sequence context around m6A sites, SRAMP combines three random forest classifiers that exploit the positional nucleotide sequence pattern, the K-nearest neighbor information and the position-independent nucleotide pair spectrum features, respectively. SRAMP uses either genomic sequences or cDNA sequences as its input. With either kind of input sequence, SRAMP achieves competitive performance in both cross-validation tests and rigorous independent benchmarking tests. Analyses of the informative features and overrepresented rules extracted from the random forest classifiers demonstrate that nucleotide usage preferences at the distal positions, in addition to those at the proximal positions, contribute to the classification. As a public prediction server, SRAMP is freely available at http://www.cuilab.cn/sramp/.
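One common way to represent a positional nucleotide sequence pattern around a candidate site is a one-hot encoding of a fixed-size window. The sketch below shows that general technique only; the function name, the window parameters, and the all-zero padding for positions outside the sequence are assumptions, and SRAMP's actual encoding may differ.

```python
from typing import List

ALPHABET = "ACGU"


def one_hot_window(seq: str, center: int, flank: int) -> List[int]:
    """One-hot encode the window [center - flank, center + flank] of `seq`.

    Hypothetical helper for illustration. Each position contributes four
    values (A, C, G, U); positions outside the sequence, or holding an
    unrecognised character, are encoded as all zeros.
    """
    # Accept DNA-style input (genomic or cDNA) by mapping T to U.
    seq = seq.upper().replace("T", "U")
    encoded: List[int] = []
    for pos in range(center - flank, center + flank + 1):
        base = seq[pos] if 0 <= pos < len(seq) else None
        encoded.extend(1 if base == letter else 0 for letter in ALPHABET)
    return encoded


# Window of one flanking base on each side of the A at index 2 in "GGACU".
print(one_hot_window("GGACU", center=2, flank=1))
```

Because each window position keeps its own four slots, a classifier trained on this vector can pick up preferences at specific proximal or distal positions relative to the candidate site.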
Affiliation(s)
- Yuan Zhou
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing 100191, China; MOE Key Lab of Molecular Cardiovascular Sciences, Peking University, Beijing 100191, China; Center for Noncoding RNA Medicine, Peking University Health Science Center, Beijing 100191, China
- Pan Zeng
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing 100191, China; MOE Key Lab of Molecular Cardiovascular Sciences, Peking University, Beijing 100191, China; Center for Noncoding RNA Medicine, Peking University Health Science Center, Beijing 100191, China
- Yan-Hui Li
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing 100191, China; MOE Key Lab of Molecular Cardiovascular Sciences, Peking University, Beijing 100191, China; Center for Noncoding RNA Medicine, Peking University Health Science Center, Beijing 100191, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Qinghua Cui
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing 100191, China; MOE Key Lab of Molecular Cardiovascular Sciences, Peking University, Beijing 100191, China; Center for Noncoding RNA Medicine, Peking University Health Science Center, Beijing 100191, China