1
Wang T, Liu Z. m6A-SPP: Identification of RNA N6-methyladenosine modification sites through multi-source biological features and a hybrid deep learning architecture. Int J Biol Macromol 2025; 316:144789. [PMID: 40449782 DOI: 10.1016/j.ijbiomac.2025.144789] [Received: 04/01/2025] [Revised: 05/20/2025] [Accepted: 05/28/2025] [Indexed: 06/03/2025]
Abstract
The N6-methyladenosine (m6A) modification plays crucial regulatory roles in various biological processes including gene expression regulation, RNA stability, splicing, and translation. Accurate prediction of m6A modification sites is essential for understanding their biological functions and implications in diseases. To address this, we introduce m6A-SPP, a novel deep learning framework for predicting m6A modification sites effectively. The model integrates both sequence features and physicochemical properties of RNA through two specialized modules. The sequence feature module leverages a pretrained bidirectional encoder representations from transformers (BERT) model (DNABERT), combined with convolutional neural networks (CNN), to provide refined processing of RNA sequence representations. The physicochemical feature module, on the other hand, computes feature embeddings by incorporating three crucial physicochemical properties. The feature matrices from both modules are then concatenated and passed through fully connected layers to produce predictions of m6A modification sites. Comprehensive evaluations were performed on a dataset with single-nucleotide resolution for m6A, encompassing eight cell lines (such as HEK293T and HeLa) and three tissue types (Brain, Liver, and Kidney). The experimental results demonstrate that m6A-SPP surpasses existing methods, highlighting its better performance in predicting m6A modification sites.
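As a rough illustration of the two-branch design described in this abstract, the Python sketch below concatenates a sequence-derived feature vector with a physicochemical feature vector. It is not the authors' implementation: a one-hot encoding stands in for the DNABERT and CNN branch, and because the abstract does not name the three physicochemical properties, the commonly used nucleotide chemical property (NCP) encoding (ring structure, functional group, hydrogen bonding) is assumed here. The names ONE_HOT, NCP, and fuse_features are hypothetical.

```python
# Stand-in for the learned sequence branch (assumption: one-hot, order A, C, G, U).
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Assumed physicochemical branch: NCP encoding as
# (purine ring, amino group, weak hydrogen bonding).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def fuse_features(seq: str) -> list[int]:
    """Concatenate the flattened sequence and physicochemical feature vectors."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as RNA
    seq_branch = [bit for nt in seq for bit in ONE_HOT[nt]]
    chem_branch = [bit for nt in seq for bit in NCP[nt]]
    # In a real model this fused vector would feed the fully connected layers.
    return seq_branch + chem_branch


if __name__ == "__main__":
    features = fuse_features("GGACU")
    print(len(features))  # 5 nucleotides * (4 + 3) features each
```

The only point of the sketch is the fusion step: each branch produces its own representation of the same window, and the classifier sees both side by side.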
Affiliation(s)
- Tong Wang
- School of Computer and Information Engineering, Institute for Artificial Intelligence, Shanghai Polytechnic University, Shanghai 201209, China.
- Zhendong Liu
- School of Computer and Information Engineering, Institute for Artificial Intelligence, Shanghai Polytechnic University, Shanghai 201209, China
2
Xie H, Wang L, Qian Y, Ding Y, Guo F. Methyl-GP: accurate generic DNA methylation prediction based on a language model and representation learning. Nucleic Acids Res 2025; 53:gkaf223. [PMID: 40156859 PMCID: PMC11952970 DOI: 10.1093/nar/gkaf223] [Received: 12/12/2024] [Revised: 01/24/2025] [Accepted: 03/12/2025] [Indexed: 04/01/2025] [Open Access]
Abstract
Accurate prediction of DNA methylation remains a challenge. Identifying DNA methylation is important for understanding its functions and elucidating its role in gene regulation mechanisms. In this study, we propose Methyl-GP, a general predictor that accurately predicts three types of DNA methylation from DNA sequences. We found that the conservation of sequence patterns among different species contributes to enhancing the generalizability of the model. By fine-tuning a language model on a dataset comprising multiple species with similar sequence patterns and employing a fusion module to integrate embeddings into a high-quality comprehensive representation, Methyl-GP demonstrates satisfactory predictive performance in methylation identification. Experiments on 17 benchmark datasets for three types of DNA methylation (4mC, 5hmC, and 6mA) demonstrate the superiority of Methyl-GP over existing predictors. Furthermore, by utilizing the attention mechanism, we have visualized the sequence patterns learned by the model, which may help us to gain a deeper understanding of methylation patterns across various species.
Affiliation(s)
- Hao Xie
- School of Computer Science and Engineering, Central South University, Hunan, Changsha 410000, China
- Leyao Wang
- College of Intelligence and Computing, Tianjin University, Tianjin, Tianjin 300350, China
- Yuqing Qian
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Sichuan, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Zhejiang, Quzhou 324000, China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Zhejiang, Quzhou 324000, China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Hunan, Changsha 410000, China
3
Su Q, Phan LT, Pham NT, Wei L, Manavalan B. MST-m6A: A Novel Multi-Scale Transformer-based Framework for Accurate Prediction of m6A Modification Sites Across Diverse Cellular Contexts. J Mol Biol 2025; 437:168856. [PMID: 39510345 DOI: 10.1016/j.jmb.2024.168856] [Received: 07/16/2024] [Revised: 10/23/2024] [Accepted: 11/02/2024] [Indexed: 11/15/2024]
Abstract
N6-methyladenosine (m6A) modification, a prevalent epigenetic mark in eukaryotic cells, is crucial in regulating gene expression and RNA metabolism. Accurately identifying m6A modification sites is essential for understanding their functions within biological processes and the intricate mechanisms that regulate them. Recent advances in high-throughput sequencing technologies have enabled the generation of extensive datasets characterizing m6A modification sites at single-nucleotide resolution, leading to the development of computational methods for identifying m6A RNA modification sites. However, most current methods focus on specific cell lines, limiting their generalizability and practical application across diverse biological contexts. To address the limitation, we propose MST-m6A, a novel approach for identifying m6A modification sites with higher accuracy across various cell lines and tissues. MST-m6A utilizes a multi-scale transformer-based architecture, employing dual k-mer tokenization to capture rich feature representations and global contextual information from RNA sequences at multiple levels of granularity. These representations are then effectively combined using a channel fusion mechanism and further processed by a convolutional neural network to enhance prediction accuracy. Rigorous validation demonstrates that MST-m6A significantly outperforms conventional machine learning models, deep learning models, and state-of-the-art predictors. We anticipate that the high precision and cross-cell-type adaptability of MST-m6A will provide valuable insights into m6A biology and facilitate advancements in related fields. The proposed approach is available at https://github.com/cbbl-skku-org/MST-m6A/ for prediction and reproducibility purposes.
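The dual k-mer tokenization mentioned in this abstract can be pictured with a short Python sketch. It assumes overlapping k-mers taken with a stride of one and uses k = 3 and k = 5 purely as illustrative values; the abstract does not state which k values or stride MST-m6A uses, and the function names kmer_tokens and dual_kmer_tokens are hypothetical.

```python
def kmer_tokens(seq: str, k: int) -> list[str]:
    """Split a sequence into overlapping k-mers with a stride of one."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as RNA
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def dual_kmer_tokens(seq: str, k_small: int = 3, k_large: int = 5):
    """Tokenize the same sequence at two granularities (k values are assumed)."""
    return kmer_tokens(seq, k_small), kmer_tokens(seq, k_large)


if __name__ == "__main__":
    small, large = dual_kmer_tokens("GGACUAGC")
    print(small)
    print(large)
```

Shorter k-mers give a longer token stream with a small vocabulary, while longer k-mers give fewer tokens that each carry more local context. Feeding both views to separate encoders before fusing them is one way to read "multiple levels of granularity".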
Affiliation(s)
- Qiaosen Su
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea
- Le Thi Phan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea
- Nhat Truong Pham
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea
- Leyi Wei
- Faculty of Applied Sciences, Macao Polytechnic University, Macau
- Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.
4
Li G, Zhao B, Su X, Yang Y, Zeng Z, Hu P, Hu L. Capturing short-range and long-range dependencies of nucleotides for identifying RNA N6-methyladenosine modification sites. Comput Biol Med 2025; 186:109625. [PMID: 39756188 DOI: 10.1016/j.compbiomed.2024.109625] [Received: 02/20/2024] [Revised: 11/17/2024] [Accepted: 12/23/2024] [Indexed: 01/07/2025]
Abstract
N6-methyladenosine (m6A) plays a crucial role in enriching RNA functional and genetic information, and the identification of m6A modification sites is therefore an important task to promote the understanding of RNA epigenetics. In the identification process, current studies are mainly concentrated on capturing the short-range dependencies between adjacent nucleotides in RNA sequences, while ignoring the impact of long-range dependencies between non-adjacent nucleotides for learning high-quality representation of RNA sequences. In this work, we propose an end-to-end prediction model, called m6ASLD, to improve the identification accuracy of m6A modification sites by capturing the short-range and long-range dependencies of nucleotides. Specifically, m6ASLD first encodes the type and position information of nucleotides to construct the initial embeddings of RNA sequences. A self-correlation map is then generated to characterize both short-range and long-range dependencies with a designed map generating block for each RNA sequence. After that, m6ASLD learns the global and local representations of RNA sequences by using a graph convolution process and a designed dependency searching block respectively, and finally achieves its identification task under a joint training scheme. Extensive experiments have demonstrated the promising performance of m6ASLD on 11 benchmark datasets across several evaluation metrics.
Affiliation(s)
- Guodong Li
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, 830011, Urumqi, China; University of Chinese Academy of Sciences, 100049, Beijing, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, 830011, Urumqi, China.
- Bowei Zhao
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, 830011, Urumqi, China; University of Chinese Academy of Sciences, 100049, Beijing, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, 830011, Urumqi, China.
- Xiaorui Su
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, 830011, Urumqi, China; University of Chinese Academy of Sciences, 100049, Beijing, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, 830011, Urumqi, China.
- Yue Yang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, 830011, Urumqi, China; University of Chinese Academy of Sciences, 100049, Beijing, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, 830011, Urumqi, China.
- Zhi Zeng
- College of Computer Science and Technology, Xi'an Jiaotong University, 710049, Xi'an, China.
- Pengwei Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, 830011, Urumqi, China; University of Chinese Academy of Sciences, 100049, Beijing, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, 830011, Urumqi, China.
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, 830011, Urumqi, China; University of Chinese Academy of Sciences, 100049, Beijing, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, 830011, Urumqi, China.
5
Wang H, Wang Y, Zhou J, Song B, Tu G, Nguyen A, Su J, Coenen F, Wei Z, Rigden DJ, Meng J. Statistical modeling of single-cell epitranscriptomics enabled trajectory and regulatory inference of RNA methylation. Cell Genomics 2025; 5:100702. [PMID: 39642887 PMCID: PMC11770222 DOI: 10.1016/j.xgen.2024.100702] [Received: 05/15/2024] [Revised: 10/07/2024] [Accepted: 11/06/2024] [Indexed: 12/09/2024]
Abstract
As a fundamental mechanism for gene expression regulation, post-transcriptional RNA methylation plays versatile roles in various biological processes and disease mechanisms. Recent advances in single-cell technology have enabled simultaneous profiling of transcriptome-wide RNA methylation in thousands of cells, holding the promise to provide deeper insights into the dynamics, functions, and regulation of RNA methylation. However, it remains a major challenge to determine how to best analyze single-cell epitranscriptomics data. In this study, we developed SigRM, a computational framework for effectively mining single-cell epitranscriptomics datasets with a large cell number, such as those produced by the scDART-seq technique from the SMART-seq2 platform. SigRM not only outperforms state-of-the-art models in RNA methylation site detection on both simulated and real datasets but also provides rigorous quantification metrics of RNA methylation levels. This facilitates various downstream analyses, including trajectory inference and regulatory network reconstruction concerning the dynamics of RNA methylation.
Affiliation(s)
- Haozhe Wang
- Department of Biosciences and Bioinformatics, Center for Intelligent RNA Therapeutics, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, School of Science, XJTLU Entrepreneur College, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Department of Computer Science, University of Liverpool, L7 8TX Liverpool, UK
- Yue Wang
- School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China.
- Jingxian Zhou
- School of AI and Advanced Computing, XJTLU Entrepreneur College, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Department of Computer Science, University of Liverpool, L7 8TX Liverpool, UK; Sino-French Hoffmann Institute, School of Basic Medical Sciences, Guangzhou Medical University, Guangzhou, Guangdong 511436, China
- Bowen Song
- Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China; Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Gang Tu
- Department of Biosciences and Bioinformatics, Center for Intelligent RNA Therapeutics, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, School of Science, XJTLU Entrepreneur College, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Anh Nguyen
- Department of Computer Science, University of Liverpool, L7 8TX Liverpool, UK
- Jionglong Su
- School of AI and Advanced Computing, XJTLU Entrepreneur College, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Frans Coenen
- Department of Computer Science, University of Liverpool, L7 8TX Liverpool, UK
- Zhi Wei
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
- Daniel J Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
- Jia Meng
- Department of Biosciences and Bioinformatics, Center for Intelligent RNA Therapeutics, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, School of Science, XJTLU Entrepreneur College, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Biomedical Research, Regulatory Mechanism and Targeted Therapy for Liver Cancer Shiyan Key Laboratory, Hubei Provincial Clinical Research Center for Precise Diagnosis and Treatment of Liver Cancer, Taihe Hospital, Hubei University of Medicine, Shiyan, Hubei 442000, China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK.
6
Xia R, Yin X, Huang J, Chen K, Ma J, Wei Z, Su J, Blake N, Rigden DJ, Meng J, Song B. Interpretable deep cross networks unveiled common signatures of dysregulated epitranscriptomes across 12 cancer types. Molecular Therapy. Nucleic Acids 2024; 35:102376. [PMID: 39618823 PMCID: PMC11605186 DOI: 10.1016/j.omtn.2024.102376] [Received: 03/31/2024] [Accepted: 10/25/2024] [Indexed: 01/12/2025]
Abstract
Cancer is a complex and multifaceted group of diseases characterized by uncontrolled cell growth that leads to the formation of malignant tumors. Recent studies suggest that N6-methyladenosine (m6A) RNA methylation plays pivotal roles in cancer pathology by influencing various cellular processes. However, the degree to which these mechanisms are shared across different cancer types remains unclear. In this study, we analyze an expansive array of 167 m6A epitranscriptome profiles covering 12 distinct cancer types and their originating normal tissues. We trained 12 distinct, cancer type-specific interpretable deep cross network models, which successfully distinguish between specific pairs of normal and cancer m6A contexts using integrated information from both the sequences and curated genomic knowledge. Interestingly, cross-cancer type testing indicated the existence of shared genomic patterns across various cancers at the epitranscriptome level. A pan-cancer model was subsequently developed to identify these shared patterns that could not be observed in a single cancer type. Our analysis uncovered, for the first time, a common epitranscriptome signature shared across multiple cancer types, particularly associated with the RNA hybridization process and aberrant splicing. This highlights the importance of a comprehensive understanding of the pan-cancer epitranscriptome and holds potential implications for the development of RNA methylation-based therapeutics for various cancers.
Affiliation(s)
- Rong Xia
- Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Department of Biological Sciences, School of Science, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- School of AI and Advanced Computing, XJTLU Entrepreneur College (Taicang), Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Xiangyu Yin
- Department of Biological Sciences, School of Science, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
- Jiaming Huang
- Department of Biological Sciences, School of Science, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Kunqi Chen
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, China
- Jiongming Ma
- Department of Biological Sciences, School of Science, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Zhen Wei
- Department of Biological Sciences, School of Science, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Institute of Infection, Veterinary & Ecological Sciences, University of Liverpool, L7 8TX Liverpool, UK
- Jionglong Su
- School of AI and Advanced Computing, XJTLU Entrepreneur College (Taicang), Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Neil Blake
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
- Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
- Jia Meng
- Institute of Biomedical Research, Regulatory Mechanism and Targeted Therapy for Liver Cancer Shiyan Key Laboratory, Hubei Provincial Clinical Research Center for Precise Diagnosis and Treatment of Liver Cancer, Taihe Hospital, Hubei University of Medicine, Shiyan, Hubei 442000, China
- Department of Biological Sciences, School of Science, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
- Bowen Song
- Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
7
Du C, Fan W, Zhou Y. Integrated Biochemical and Computational Methods for Deciphering RNA-Processing Codes. Wiley Interdisciplinary Reviews. RNA 2024; 15:e1875. [PMID: 39523464 DOI: 10.1002/wrna.1875] [Received: 02/02/2024] [Revised: 09/23/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
RNA processing involves steps such as capping, splicing, polyadenylation, modification, and nuclear export. These steps are essential for transforming genetic information in DNA into proteins and contribute to RNA diversity and complexity. Many biochemical methods have been developed to profile and quantify RNAs, as well as to identify the interactions between RNAs and RNA-binding proteins (RBPs), especially when coupled with high-throughput sequencing technologies. With the rapid accumulation of diverse data, it is crucial to develop computational methods to convert the big data into biological knowledge. In particular, machine learning and deep learning models are commonly utilized to learn the rules or codes governing the transformation from DNA sequences to intriguing RNAs based on manually designed or automatically extracted features. When precise enough, the RNA codes can be incredibly useful for predicting RNA products, decoding the molecular mechanisms, forecasting the impact of disease variants on RNA processing events, and identifying driver mutations. In this review, we systematically summarize the biochemical and computational methods for deciphering five important RNA codes related to alternative splicing, alternative polyadenylation, RNA localization, RNA modifications, and RBP binding. For each code, we review the main types of experimental methods used to generate training data, as well as the key features, strategic model structures, and advantages of representative tools. We also discuss the challenges encountered in developing predictive models using large language models and extensive domain knowledge. Additionally, we highlight useful resources and propose ways to improve computational tools for studying RNA codes.
Affiliation(s)
- Chen Du
- College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Weiliang Fan
- College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Yu Zhou
- College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Frontier Science Center for Immunology and Metabolism, Wuhan University, Wuhan, China
- State Key Laboratory of Virology, Wuhan University, Wuhan, China
8
Wang M, Ali H, Xu Y, Xie J, Xu S. BiPSTP: Sequence feature encoding method for identifying different RNA modifications with bidirectional position-specific trinucleotides propensities. J Biol Chem 2024; 300:107140. [PMID: 38447795 PMCID: PMC10997841 DOI: 10.1016/j.jbc.2024.107140] [Received: 12/21/2023] [Revised: 02/17/2024] [Accepted: 02/25/2024] [Indexed: 03/08/2024] [Open Access]
Abstract
RNA modification, a posttranscriptional regulatory mechanism, significantly influences RNA biogenesis and function. The accurate identification of modification sites is paramount for investigating their biological implications. Methods for encoding RNA sequence into numerical data play a crucial role in developing robust models for predicting modification sites. However, existing techniques suffer from limitations, including inadequate information representation, challenges in effectively integrating positional and sequential information, and the generation of irrelevant or redundant features when combining multiple approaches. These deficiencies hinder the effectiveness of machine learning models in addressing the performance challenges associated with predicting RNA modification sites. Here, we introduce a novel RNA sequence feature representation method, named BiPSTP, which utilizes bidirectional trinucleotide position-specific propensities. We employ the parameter ξ to denote the interval between the current nucleotide and its adjacent forward or backward dinucleotide, enabling the extraction of positional and sequential information from RNA sequences. Leveraging the BiPSTP method, we have developed the prediction model mRNAPred using a support vector machine classifier to identify multiple types of RNA modification sites. We evaluate the performance of our BiPSTP method and mRNAPred model across 12 distinct RNA modification types. Our experimental results demonstrate the superiority of the mRNAPred model compared to state-of-the-art models in the domain of RNA modification site identification. Importantly, our BiPSTP method enhances the robustness and generalization performance of prediction models. Notably, it can be applied to feature extraction from DNA sequences to predict other biological modification sites.
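The role of the interval parameter ξ can be made concrete with a small Python sketch. It is one plausible reading of the abstract rather than the published BiPSTP definition: each nucleotide is paired with the dinucleotide that lies ξ bases away in the forward or backward direction, and a position-specific propensity is taken as the difference between the frequency of that gapped trinucleotide in positive and negative training sequences. The function names and the frequency-difference scoring are assumptions.

```python
from collections import Counter


def gapped_trinucleotides(seq: str, xi: int, direction: str = "forward") -> dict[int, str]:
    """Map each position i to seq[i] joined with the dinucleotide xi bases away."""
    out = {}
    n = len(seq)
    for i in range(n):
        if direction == "forward":
            j = i + xi + 1              # start of the downstream dinucleotide
            if j + 1 < n:
                out[i] = seq[i] + seq[j:j + 2]
        else:
            j = i - xi - 2              # start of the upstream dinucleotide
            if j >= 0:
                out[i] = seq[j:j + 2] + seq[i]
    return out


def position_propensity(pos_seqs, neg_seqs, xi, direction="forward"):
    """Frequency of each (position, trinucleotide) in positives minus negatives."""
    def freq(seqs):
        counts = Counter()
        for s in seqs:
            for i, tri in gapped_trinucleotides(s, xi, direction).items():
                counts[(i, tri)] += 1
        return {key: c / len(seqs) for key, c in counts.items()}

    fp, fn = freq(pos_seqs), freq(neg_seqs)
    return {key: fp.get(key, 0.0) - fn.get(key, 0.0) for key in fp.keys() | fn.keys()}


def encode(seq, table, xi, direction="forward"):
    """Encode a sequence as one propensity score per position (0.0 if undefined)."""
    tris = gapped_trinucleotides(seq, xi, direction)
    return [table.get((i, tris[i]), 0.0) if i in tris else 0.0 for i in range(len(seq))]


if __name__ == "__main__":
    table = position_propensity(["GGACU", "GGACA"], ["CCAUU", "GGAUU"], xi=1)
    print(encode("GGACU", table, xi=1))
```

With ξ = 0 the gapped trinucleotide reduces to an ordinary contiguous trinucleotide. A bidirectional encoding would concatenate the forward and backward vectors, possibly over several ξ values, before passing them to a classifier such as the support vector machine mentioned in the abstract.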
Affiliation(s)
- Mingzhao Wang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
- Haider Ali
- School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yandi Xu
- School of Computer Science, Shaanxi Normal University, Xi'an, China; College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Juanying Xie
- School of Computer Science, Shaanxi Normal University, Xi'an, China.
- Shengquan Xu
- College of Life Sciences, Shaanxi Normal University, Xi'an, China.