1. Liu L, Lei X, Wang Z, Meng J, Song B. TransRM: Weakly supervised learning of translation-enhancing N6-methyladenosine (m6A) in circular RNAs. Int J Biol Macromol 2025; 306:141588. [PMID: 40023417 DOI: 10.1016/j.ijbiomac.2025.141588] [Received: 12/24/2024] [Revised: 02/24/2025] [Accepted: 02/26/2025] [Indexed: 03/04/2025]
Abstract
As our understanding of circular RNAs (circRNAs) continues to expand, accumulating evidence has demonstrated that circRNAs can interact with microRNAs and RNA-binding proteins to modulate gene expression. More importantly, a subset of circRNAs has been reported to possess coding potential, enabling them to be translated into functional proteins. Recent studies also indicate that the N6-methyladenosine (m6A)-modified start codon may function as an internal ribosome entry site (IRES), influencing the translation of circRNAs. Therefore, elucidating how m6A regulates circRNA translation potential could significantly advance circRNA research, including the development of circRNA-based vaccines. However, to our knowledge, there are currently no computational tools specifically designed for this purpose. To bridge this gap, we have developed the first computational model, termed TransRM, to predict the impact of base-resolution m6A sites on circRNA translation. Our model employs weakly supervised learning with two convolution layers. These layers extract RNA modification features, and a bidirectional gated recurrent unit predicts the contribution of each RNA modification to circRNA translation. The RNA modification features are then integrated with their contributions to assess the probability of circRNA translation using a random forest algorithm. TransRM identifies translation-enhancing m6A sites with an AUROC of 0.9188 and an AUPRC of 0.9371. We hope that our newly proposed model will help to broaden our understanding of circRNA regulation at the epitranscriptome layer, particularly in identifying translated circRNAs, thereby contributing to the development of more effective circular RNA-based therapeutics.
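The abstract does not say how the per-site modification features are integrated with their predicted contributions before the random forest step. As a minimal sketch, assuming a normalized contribution-weighted sum (a common choice in weakly supervised, multi-instance settings), the pooling could look like the following. The function name `aggregate_site_features` and the toy inputs are hypothetical, and the convolution layers, the bidirectional gated recurrent unit, and the random forest are not shown.

```python
from typing import List


def aggregate_site_features(
    features: List[List[float]], contributions: List[float]
) -> List[float]:
    """Pool per-site feature vectors into one circRNA-level vector.

    Assumption: each site's weight is its contribution divided by the
    sum of all contributions; the paper may integrate them differently.
    """
    if not features or len(features) != len(contributions):
        raise ValueError("need one contribution per non-empty feature list")
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")

    dim = len(features[0])
    pooled = [0.0] * dim
    for vector, contribution in zip(features, contributions):
        weight = contribution / total
        for k in range(dim):
            pooled[k] += weight * vector[k]
    return pooled


# Two hypothetical m6A sites with 2-dimensional features.
pooled = aggregate_site_features([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```

In such a design, the pooled vector would be the input to a bag-level classifier, so only circRNA-level labels are needed for training.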
Affiliation(s)
- Lian Liu
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi 710119, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi 710119, China
- Zheng Wang
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi 710119, China
- Jia Meng
  - Department of Biosciences and Bioinformatics, Center for Intelligent RNA Therapeutics, Suzhou Key Laboratory of Cancer Biology and Chronic Disease, School of Science, XJTLU Entrepreneur College, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L7 8TX, United Kingdom
- Bowen Song
  - Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
2. Li C, Xie P, Luo M, Lv K, Cong Z. EIF4A3-Induced hsa_circ_0118578 Expression Enhances the Tumorigenesis of Papillary Thyroid Cancer. Cancer Biother Radiopharm 2025; 40:285-292. [PMID: 39689861 DOI: 10.1089/cbr.2024.0133] [Indexed: 12/19/2024]
Abstract
Background: Circular RNA (circRNA) plays a regulatory role in the malignancy of papillary thyroid cancer (PTC). However, the role of a novel circRNA, hsa_circ_0118578, in PTC is not yet fully understood. This report examines the effect of hsa_circ_0118578 on PTC cell malignancy and its mechanism in PTC progression. Methods: Levels of hsa_circ_0118578 in PTC were assessed by quantitative real-time polymerase chain reaction (qRT-PCR). The functional roles of hsa_circ_0118578 in PTC cell malignancy were evaluated through Transwell, 5-ethynyl-2'-deoxyuridine (EdU), and wound healing assays. A xenograft model in nude mice was used to examine the effects of hsa_circ_0118578 in vivo. The interaction between eukaryotic translation initiation factor 4A3 (EIF4A3) and hsa_circ_0118578 was confirmed using RNA-binding protein immunoprecipitation, qRT-PCR, and Western blotting. Results: High hsa_circ_0118578 expression in PTC tissues was associated with higher tumor node metastasis stage, lymph node metastasis, and poor differentiation. Cell functional assays demonstrated that silencing hsa_circ_0118578 inhibited PTC cell proliferation, invasion, and migration. In the xenograft assay, tumorigenicity of PTC cells in vivo was reduced following hsa_circ_0118578 suppression. Additionally, EIF4A3, an RNA-binding protein, was shown to interact with hsa_circ_0118578 and stabilize its expression in PTC cells. Conclusions: Upregulated hsa_circ_0118578 in PTC interacts with EIF4A3, which enhances hsa_circ_0118578 stability and thereby exerts oncogenic effects that contribute to PTC development. These findings shed light on the oncogenic role of hsa_circ_0118578 in PTC and suggest it as a potential therapeutic target.
Affiliation(s)
- Chan Li
  - Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), Wuhan, China
- Ping Xie
  - Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), Wuhan, China
- Meng Luo
  - Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), Wuhan, China
- Kun Lv
  - Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), Wuhan, China
- Zewei Cong
  - Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), Wuhan, China
3. Wei Y, Tan Z, Liu L. CR-deal: Explainable Neural Network for circRNA-RBP Binding Site Recognition and Interpretation. Interdiscip Sci 2025:10.1007/s12539-025-00694-7. [PMID: 40146403 DOI: 10.1007/s12539-025-00694-7] [Received: 01/16/2024] [Revised: 02/01/2025] [Accepted: 02/06/2025] [Indexed: 03/28/2025]
Abstract
circRNAs are single-stranded non-coding RNA molecules distinguished by their closed circular structure. The interaction between circRNAs and RNA-binding proteins (RBPs) plays a key role in biological functions and is crucial for studying post-transcriptional regulatory mechanisms. The genome-wide circRNA binding event data obtained by cross-linking immunoprecipitation sequencing provide a foundation for constructing efficient computational prediction methods. Although machine learning techniques have been applied to predict circRNA-RBP interaction sites, existing methods still have room for improvement in accuracy and lack interpretability. We propose CR-deal, an interpretable joint deep learning network that predicts circRNA-RBP binding sites from genome-wide circRNA data. CR-deal uses a graph attention network to unify sequence and structural features into the same view, making more effective use of structural features to improve accuracy. It can infer marker genes in the binding site through integrated gradient feature interpretation, thereby inferring functional structural regions in the binding site. We benchmarked CR-deal on 37 circRNA datasets and 7 lncRNA datasets, and used 5 circRNA datasets to examine its interpretability and to discover functional structural regions. We believe that CR-deal can help researchers gain a deeper understanding of the functions and mechanisms of circRNA in living organisms and its critical role in the occurrence and development of diseases. The source code of CR-deal is freely available at https://github.com/liuliwei1980/CR.
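To make the integrated gradient interpretation step concrete, here is a minimal sketch of the integrated gradients attribution rule on a toy logistic model. The toy model, its weights, and the all-zero baseline are assumptions chosen for illustration; CR-deal applies the idea to a graph attention network, which is not reproduced here.

```python
import math
from typing import List


def model(x: List[float], w: List[float], b: float) -> float:
    """Toy stand-in for a trained network: logistic regression."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x: List[float], w: List[float], b: float) -> List[float]:
    """Analytic gradient of the logistic output with respect to x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(
    x: List[float], baseline: List[float], w: List[float], b: float, steps: int = 200
) -> List[float]:
    """Approximate the path integral of gradients with a midpoint Riemann sum."""
    dim = len(x)
    avg_grad = [0.0] * dim
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [baseline[k] + alpha * (x[k] - baseline[k]) for k in range(dim)]
        grad = model_grad(point, w, b)
        for k in range(dim):
            avg_grad[k] += grad[k] / steps
    # Attribution = (input - baseline) * average gradient along the path.
    return [(x[k] - baseline[k]) * avg_grad[k] for k in range(dim)]


w, b = [0.5, -1.0, 0.25], 0.0
x, baseline = [1.0, 0.0, 2.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum to approximately `model(x) - model(baseline)`, and a feature that equals its baseline value receives zero attribution.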
Affiliation(s)
- Yuxiao Wei
  - College of Software, Dalian Jiaotong University, Dalian, 116028, China
- Zhebin Tan
  - College of Software, Dalian Jiaotong University, Dalian, 116028, China
- Liwei Liu
  - College of Science, Dalian Jiaotong University, Dalian, 116028, China
4. Pan X, Fang Y, Liu X, Guo X, Shen HB. RBPsuite 2.0: an updated RNA-protein binding site prediction suite with high coverage on species and proteins based on deep learning. BMC Biol 2025; 23:74. [PMID: 40069726 PMCID: PMC11899677 DOI: 10.1186/s12915-025-02182-2] [Received: 05/20/2024] [Accepted: 03/03/2025] [Indexed: 03/14/2025]
Abstract
BACKGROUND RNA-binding proteins (RBPs) play crucial roles in many biological processes, and computationally identifying RNA-RBP interactions provides insights into the biological mechanism of diseases associated with RBPs. RESULTS To make the RBP-specific deep learning-based RBP binding sites prediction methods easily accessible, we developed an updated easy-to-use webserver, RBPsuite 2.0, with an updated web interface for predicting RBP binding sites from linear and circular RNA sequences. RBPsuite 2.0 has a higher coverage on the number of supported RBPs and species compared to the original RBPsuite, supporting an increased number of RBPs from 154 to 353 and expanding the supported species from one to seven. Additionally, RBPsuite 2.0 replaces the CRIP built into RBPsuite 1.0 with iDeepC, a more accurate RBP binding site predictor for circular RNAs. Furthermore, RBPsuite 2.0 estimates the contribution score of individual nucleotides on the input sequences as potential binding motifs and links to the UCSC browser track for better visualization of the prediction results. CONCLUSIONS RBPsuite 2.0 is an updated, more comprehensive webserver for predicting RBP binding sites in both linear and circular RNA sequences. It supports more RBPs and species and provides more accurate predictions for circular RNAs. The tool is freely available at http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/ .
Affiliation(s)
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Yi Fang
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Xiaojian Liu
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Xiaoyu Guo
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
5. Guo Y, Lei X, Li S. An Integrated TCN-CrossMHA Model for Predicting circRNA-RBP Binding Sites. Interdiscip Sci 2025; 17:86-100. [PMID: 39503827 DOI: 10.1007/s12539-024-00660-9] [Received: 02/04/2024] [Revised: 09/14/2024] [Accepted: 09/17/2024] [Indexed: 02/19/2025]
Abstract
Circular RNA (circRNA) can bind to RNA-binding proteins (RBPs), thereby exerting a substantial impact on diseases. Predicting binding sites aids in understanding the interaction mechanism and offers insights for disease treatment strategies. Here, we propose circTCA, a novel approach based on a temporal convolutional network (TCN) and a cross multi-head attention mechanism to predict circRNA-RBP binding sites. First, we employ two distinct encoding methods to obtain two raw matrices of circRNA sequences. Then, two parallel TCN blocks extract shallow and abstract features from the two matrices separately. The two sets of features are fused through a cross multi-head attention mechanism, after which global expectation pooling assigns weights to the concatenated feature. Finally, a fully connected (FC) layer classifies the input sequence. We compare circTCA with five other methods and conduct ablation experiments to demonstrate its effectiveness. We also conduct feature visualization and compare the motifs extracted by circTCA against existing motifs. Overall, circTCA is effective for predicting circRNA-RBP binding sites.
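The abstract names global expectation pooling without giving its formula. As a rough sketch, assuming the commonly used formulation in which each position is weighted by a softmax over its scaled activation, the pooling of a single feature channel could be written as below. The sharpness parameter `m` is hypothetical here (in trained models it is typically learned), and this is not taken from the circTCA source code.

```python
import math
from typing import List


def expectation_pool(values: List[float], m: float = 1.0) -> float:
    """Softmax-weighted average of one feature channel across positions.

    m = 0 reduces to average pooling; large m approaches max pooling.
    """
    if not values:
        raise ValueError("values must be non-empty")
    # Subtract the largest exponent for numerical stability.
    peak = max(m * v for v in values)
    weights = [math.exp(m * v - peak) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


mean_like = expectation_pool([1.0, 2.0, 3.0], m=0.0)
max_like = expectation_pool([1.0, 2.0, 3.0], m=50.0)
```

The appeal of this form is that it interpolates between average and max pooling, so the network can decide how strongly to focus on the most activated positions.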
Affiliation(s)
- Yajing Guo
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Shuyu Li
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
6. Wang Y, Zhu H, Wang Y, Yang Y, Huang Y, Zhang J, Wong KC, Li X. EnrichRBP: an automated and interpretable computational platform for predicting and analysing RNA-binding protein events. Bioinformatics 2024; 41:btaf018. [PMID: 39804669 PMCID: PMC11783304 DOI: 10.1093/bioinformatics/btaf018] [Received: 08/02/2024] [Revised: 12/18/2024] [Accepted: 01/10/2025] [Indexed: 02/01/2025]
Abstract
MOTIVATION Predicting RNA-binding proteins (RBPs) is central to understanding post-transcriptional regulatory mechanisms. Here, we introduce EnrichRBP, an automated and interpretable computational platform specifically designed for the comprehensive analysis of RBP interactions with RNA. RESULTS EnrichRBP is a web service that enables researchers to develop original deep learning and machine learning architectures to explore the complex dynamics of RBPs. The platform supports 70 deep learning algorithms, covering feature representation, selection, model training, comparison, optimization, and evaluation, all integrated within an automated pipeline. EnrichRBP is adept at providing comprehensive visualizations, enhancing model interpretability, and facilitating the discovery of functionally significant sequence regions crucial for RBP interactions. In addition, EnrichRBP supports base-level functional annotation tasks, offering explanations and graphical visualizations that confirm the reliability of the predicted RNA-binding sites. Leveraging high-performance computing, EnrichRBP provides ultra-fast predictions ranging from seconds to hours, applicable to both pre-trained and custom model scenarios, thus proving its utility in real-world applications. Case studies highlight that EnrichRBP provides robust and interpretable predictions, demonstrating the power of deep learning in the functional analysis of RBP interactions. Finally, EnrichRBP aims to enhance the reproducibility of computational method analyses for RBP sequences, as well as reduce the programming and hardware requirements for biologists, thereby offering meaningful functional insights. AVAILABILITY AND IMPLEMENTATION EnrichRBP is available at https://airbp.aibio-lab.com/. The source code is available at https://github.com/wangyb97/EnrichRBP, and detailed online documentation can be found at https://enrichrbp.readthedocs.io/en/latest/.
Affiliation(s)
- Yubo Wang
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Haoran Zhu
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Yansong Wang
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Yuning Yang
  - Information Science and Technology, Northeast Normal University, Changchun 130024, China
- Yujian Huang
  - College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu 610059, China
- Jian Zhang
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
- Ka-chun Wong
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR 999077, China
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
7. Cao C, Wang C, Dai Q, Zou Q, Wang T. CRBPSA: CircRNA-RBP interaction sites identification using sequence structural attention model. BMC Biol 2024; 22:260. [PMID: 39543602 PMCID: PMC11566611 DOI: 10.1186/s12915-024-02055-0] [Received: 07/13/2024] [Accepted: 10/30/2024] [Indexed: 11/17/2024]
Abstract
BACKGROUND Due to the ability of circRNA to bind with corresponding RBPs and play a critical role in gene regulation and disease prevention, numerous identification algorithms have been developed. Nevertheless, most of the current mainstream methods primarily capture one-dimensional sequence features through various descriptors, while neglecting the effective extraction of secondary structure features. Moreover, as the number of introduced descriptors increases, the issues of sparsity and ineffective representation also rise, causing a significant burden on computational models and leaving room for improvement in predictive performance. RESULTS Based on this, we focused on capturing the features of secondary structure in sequences and developed a new architecture called CRBPSA, which is based on a sequence-structure attention mechanism. Firstly, a base-pairing matrix is generated by calculating the matching probability between each base, with a Gaussian function introduced as a weight to construct the secondary structure. Then, a Structure_Transformer is employed to extract base-pairing information and spatial positional dependencies, enabling the identification of binding sites through deeper feature extraction. Experimental results using the same set of hyperparameters on 37 circRNA datasets, totaling 671,952 samples, show that the CRBPSA algorithm achieves an average AUC of 99.93%, surpassing all existing prediction methods. CONCLUSIONS CRBPSA is a lightweight and efficient prediction tool for circRNA-RBP, which can capture structural features of sequences with minimal computational resources and accurately predict protein-binding sites. This tool facilitates a deeper understanding of the biological processes and mechanisms underlying circRNA and protein interactions.
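The base-pairing matrix step can be illustrated with a small sketch. It assumes hypothetical pairing strengths for canonical and wobble pairs and a Gaussian weight that decays with distance along the stacked diagonal. The abstract specifies neither, so the scores, the `span` and `min_loop` parameters, and the function name are illustrative assumptions rather than CRBPSA's actual definitions.

```python
import math
from typing import List

# Hypothetical pairing strengths; not taken from the paper.
PAIR_SCORES = {
    ("A", "U"): 2.0, ("U", "A"): 2.0,
    ("G", "C"): 3.0, ("C", "G"): 3.0,
    ("G", "U"): 0.8, ("U", "G"): 0.8,
}


def base_pairing_matrix(seq: str, span: int = 3, min_loop: int = 3) -> List[List[float]]:
    """Score every (i, j) pair, adding Gaussian-weighted support from
    neighbouring stacked pairs (i + d, j - d) for |d| <= span."""
    n = len(seq)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Skip positions too close to form a hairpin, and non-pairing bases.
            if abs(i - j) <= min_loop or (seq[i], seq[j]) not in PAIR_SCORES:
                continue
            total = 0.0
            for offset in range(-span, span + 1):
                p, q = i + offset, j - offset
                if 0 <= p < n and 0 <= q < n and abs(p - q) > min_loop:
                    weight = math.exp(-0.5 * offset ** 2)  # Gaussian weight
                    total += weight * PAIR_SCORES.get((seq[p], seq[q]), 0.0)
            matrix[i][j] = total
    return matrix


pairing = base_pairing_matrix("GGGAAAACCC")
```

In this toy hairpin, the outermost G-C pair at positions 0 and 9 scores higher than an isolated G-C pair would, because the two stacked G-C pairs inside it contribute Gaussian-weighted support. A matrix of this kind is the sort of input a structure-aware attention module could consume.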
Affiliation(s)
- Chao Cao
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Qi Dai
  - College of Life Science and Medicine, Zhejiang Sci-Tech University, Hangzhou, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- Tao Wang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
8. Wang F, Li Y, Shen H, Martinez-Feduchi P, Ji X, Teng P, Krishnakumar S, Hu J, Chen L, Feng Y, Yao B. Identification of pathological pathways centered on circRNA dysregulation in association with irreversible progression of Alzheimer's disease. Genome Med 2024; 16:129. [PMID: 39529134 PMCID: PMC11552301 DOI: 10.1186/s13073-024-01404-6] [Received: 07/28/2024] [Accepted: 10/30/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND Circular RNAs (circRNAs) are highly stable regulators, often accumulated in mammalian brains and thought to serve as "memory molecules" that govern the long process of aging. Mounting evidence has demonstrated circRNA dysregulation in the brains of Alzheimer's disease (AD) patients. However, whether and how circRNA dysregulation underlies AD progression remains unexplored. METHODS We combined a poly(A)-tailing/RNase R digestion experimental approach with CARP, our published computational framework that uses pseudo-reference alignment for more sensitive and accurate circRNA detection, to identify genome-wide circRNA dysregulation and its downstream pathways in the 5xFAD mouse cerebral cortex between 5 and 7 months of age, a critical window that marks the transition from reversible to irreversible pathogenic progression. Dysregulated circRNAs and pathways associated with disease progression in 5xFAD cortex were systematically compared with circRNAs affected in postmortem subcortical areas of a large human AD cohort. A top-ranked circRNA conserved and commonly affected in AD patients and 5xFAD mice was depleted in cultured cells to examine AD-relevant molecular and cellular changes. RESULTS We discovered genome-wide circRNA alterations specifically in 5xFAD cortex associated with AD progression, many of which are commonly dysregulated in the subcortical areas of AD patients. Among these circRNAs, circGigyf2 is highly conserved and showed the highest net reduction specifically in the 7-month 5xFAD cortex. The circGIGYF2 level in AD patients' cortices negatively correlated with dementia severity. Mechanistically, we found multiple AD-affected splicing factors that are essential for circGigyf2 biogenesis. Functionally, we identified and experimentally validated the conserved roles of circGigyf2 in sponging AD-relevant miRNAs and AD-associated RNA binding proteins (RBPs), including the cleavage and polyadenylation factor 6 (CPSF6). Moreover, circGigyf2 downregulation in AD promoted silencing activities of its sponged miRNAs and enhanced polyadenylation site processing efficiency of CPSF6 targets. Furthermore, circGigyf2 depletion in a mouse neuronal cell line dysregulated circGigyf2-miRNA and circGigyf2-CPSF6 axes and potentiated apoptotic responses upon insults, which strongly supports the causative roles of circGigyf2 deficiency in AD neurodegeneration. CONCLUSIONS Together, our results unveiled brain circRNAs associated with irreversible disease progression in an AD mouse model that are also affected in AD patients and identified novel molecular mechanisms underlying the dysregulation of conserved circRNA pathways contributing to AD pathogenesis.
Affiliation(s)
- Feng Wang
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Yangping Li
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Huifeng Shen
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Paula Martinez-Feduchi
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Xingyu Ji
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Peng Teng
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Siddharth Krishnakumar
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jian Hu
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Li Chen
  - Department of Biostatistics, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Yue Feng
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Bing Yao
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
9. Sanadgol N, Amini J, Khalseh R, Bakhshi M, Nikbin A, Beyer C, Zendehdel A. Mitochondrial genome-derived circRNAs: Orphan epigenetic regulators in molecular biology. Mitochondrion 2024; 79:101968. [PMID: 39321951 DOI: 10.1016/j.mito.2024.101968] [Received: 05/17/2024] [Revised: 09/02/2024] [Accepted: 09/18/2024] [Indexed: 09/27/2024]
Abstract
Mitochondria are vital for cellular activities, influencing ATP production, Ca2+ signaling, and reactive oxygen species generation. It has been proposed that nuclear genome-derived circular RNAs (circRNAs) play a role in biological processes. For the first time, this study aims to comprehensively explore experimentally confirmed human mitochondrial genome-derived circRNAs (mt-circRNAs) via in-silico analysis. We utilized wide-ranging bioinformatics tools to anticipate their roles in molecular biology, involving miRNA sponging, protein antagonism, and peptide translation. Among five well-characterized mt-circRNAs, SCAR/mc-COX2 stands out as particularly significant with the potential to sponge around 41 different miRNAs, which target several genes mostly involved in endocytosis, MAP kinase, and PI3K-Akt pathways. Interestingly, circMNTND5 and mecciND1 specifically interact with miRNAs through their unique back-splice junction sequence. These exclusively targeted miRNAs (hsa-miR-5186, 6888-5p, 8081, 924, 672-5p) are predominantly associated with insulin secretion, proteoglycans in cancer, and MAPK signaling pathways. Moreover, all mt-circRNAs intricately affect the P53 pathway through miRNA sequestration. Remarkably, mc-COX2 and circMNTND5 appear to be involved in the RNA's biogenesis by antagonizing AGO1/2, EIF4A3, and DGCR8. All mt-circRNAs engaged with IGF2BP proteins crucial in redox signaling, and except mecciND1, they all potentially generate at least one protein resembling the immunoglobulin heavy chain protein. Given P53's function as a redox-sensitive transcription factor, and insulin's role as a crucial regulator of energy metabolism, their indirect interplay with mt-circRNAs could influence cellular outcomes. However, due to limited attention and infrequent data availability, it is advisable to conduct more thorough investigations to gain a deeper understanding of the functions of mt-circRNA.
Affiliation(s)
- Nima Sanadgol
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, 52074 Aachen, Germany
- Javad Amini
  - Department of Physiology and Pharmacology, School of Medicine, North Khorasan University of Medical Sciences, 94149-75516 Bojnurd, Iran
- Roghayeh Khalseh
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, 52074 Aachen, Germany
- Mostafa Bakhshi
  - Department of Electrical and Computer Engineering, Kharazmi University, 15719-14911 Tehran, Iran
- Arezoo Nikbin
  - Department of Oral and Maxillofacial Radiology, School of Dentistry, Golestan University of Medical Sciences, Gorgan, Iran
- Cordian Beyer
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, 52074 Aachen, Germany
- Adib Zendehdel
  - Institut of Anatomy, Department of Biomedicine, University of Basel, 4031 Basel, Switzerland
10. Zhou Y, Cui H, Liu D, Wang W. MSTCRB: Predicting circRNA-RBP interaction by extracting multi-scale features based on transformer and attention mechanism. Int J Biol Macromol 2024; 278:134805. [PMID: 39153682 DOI: 10.1016/j.ijbiomac.2024.134805] [Received: 05/30/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 08/19/2024]
Abstract
CircRNAs play vital roles in biological systems mainly through binding RNA-binding proteins (RBPs), which is essential for regulating physiological processes in vivo and for identifying causal disease variants. Therefore, predicting interactions between circRNAs and RBPs is a critical step in the discovery of new therapeutic agents. The application of various deep-learning models in bioinformatics has significantly improved prediction and classification performance. However, most existing prediction models are only applicable to a specific type of RNA or to RNA with simple characteristics. In this study, we propose a deep learning model, MSTCRB, based on a transformer and an attention mechanism, which extracts multi-scale features to predict circRNA-RBP interactions. K-mer and KNF encoding are employed to capture the global sequence features of circRNA, NCP and DPCP encoding are used to extract local sequence features, and the CDPfold method is applied to extract structural features. To improve prediction performance, an optimized transformer framework and attention mechanism are used to integrate these multi-scale features. We compared our model's performance with five other state-of-the-art methods on 37 circRNA datasets and 31 linear RNA datasets. The results show that the average AUC value of MSTCRB reaches 98.45%, which is better than the other methods compared. All of the above datasets are deposited at https://github.com/chy001228/MSTCRB_database.git and the source code is available from https://github.com/chy001228/MSTCRB.git.
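Two of the encodings named above are simple enough to sketch directly. The example below shows one common way to compute normalized k-mer frequencies and the widely used nucleotide chemical property (NCP) encoding, in which each base is described by ring structure, functional group, and hydrogen-bond strength. The function names are hypothetical, the sequence is treated as linear (k-mers spanning the back-splice junction are ignored), and MSTCRB's own implementation may differ in these details.

```python
from itertools import product
from typing import List, Tuple


def kmer_frequencies(seq: str, k: int = 3, alphabet: str = "ACGU") -> List[float]:
    """Frequency of every possible k-mer, in lexicographic alphabet order."""
    windows = len(seq) - k + 1
    if windows <= 0:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:  # k-mers with unknown bases are skipped
            counts[kmer] += 1
    return [counts[kmer] / windows for kmer in kmers]


# (ring structure, functional group, hydrogen bonding) per nucleotide.
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def ncp_encode(seq: str) -> List[Tuple[int, int, int]]:
    """Per-position chemical property triple; unknown bases map to zeros."""
    return [NCP.get(base, (0, 0, 0)) for base in seq]


freqs = kmer_frequencies("ACGU", k=2)
local = ncp_encode("AG")
```

The k-mer vector has a fixed length of 4^k regardless of sequence length, which suits it to global features, while the NCP output keeps one row per position and therefore preserves local context.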
Affiliation(s)
- Yun Zhou
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China; Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
- Haoyu Cui
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
- Dong Liu
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China; Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
- Wei Wang
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China; Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
11. Liu L, Wei Y, Tan Z, Zhang Q, Sun J, Zhao Q. Predicting circRNA-RBP Binding Sites Using a Hybrid Deep Neural Network. Interdiscip Sci 2024; 16:635-648. [PMID: 38381315 DOI: 10.1007/s12539-024-00616-z] [Received: 11/14/2023] [Revised: 01/26/2024] [Accepted: 01/29/2024] [Indexed: 02/22/2024]
Abstract
Circular RNAs (circRNAs) are non-coding RNAs generated by back-splicing. They are involved in biological processes and human diseases through interactions with specific RNA-binding proteins (RBPs). Because traditional biological experiments are costly, computational methods have been proposed to predict circRNA-RBP interactions. However, these methods are limited by extracting only a single type of feature. Therefore, we propose a novel model called circ-FHN, which uses only circRNA sequences to predict circRNA-RBP interactions. The circ-FHN approach involves feature coding and a hybrid deep learning model. Feature coding takes into account the physicochemical properties of circRNA sequences and employs four coding methods to extract sequence features. The hybrid deep structure comprises a convolutional neural network (CNN) and a bidirectional gated recurrent unit (BiGRU). The CNN learns high-level abstract features, while the BiGRU captures long-term dependencies in the sequence. To assess the effectiveness of circ-FHN, we compared it with other computational methods on 16 datasets and conducted ablation experiments. Additionally, we conducted motif analysis. The results demonstrate that circ-FHN exhibits exceptional performance and surpasses other methods. circ-FHN is freely available at https://github.com/zhaoqi106/circ-FHN.
Affiliation(s)
- Liwei Liu
- College of Science, Dalian Jiaotong University, Dalian, 116028, China
- Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou, 571158, China
- Yixin Wei
- College of Science, Dalian Jiaotong University, Dalian, 116028, China
- Zhebin Tan
- College of Software, Dalian Jiaotong University, Dalian, 116028, China
- Qi Zhang
- College of Science, Dalian Jiaotong University, Dalian, 116028, China
- Jianqiang Sun
- School of Information Science and Engineering, Linyi University, Linyi, 276000, China.
- Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
|
12
|
Zuo Y, Chen H, Yang L, Chen R, Zhang X, Deng Z. Research progress on prediction of RNA-protein binding sites in the past five years. Anal Biochem 2024; 691:115535. [PMID: 38643894 DOI: 10.1016/j.ab.2024.115535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 04/23/2024]
Abstract
Accurately predicting RNA-protein binding sites is essential to gain a deeper comprehension of protein-RNA interactions and their regulatory mechanisms, which are fundamental in gene expression and regulation. However, conventional biological approaches to detect these sites are often costly and time-consuming. In contrast, computational methods for predicting RNA-protein binding sites are both cost-effective and expeditious. This review synthesizes existing computational methods and summarizes commonly used databases for predicting RNA-protein binding sites. In addition, applications and innovations of computational methods using traditional machine learning and deep learning for RNA-protein binding site prediction during 2018-2023 are presented. These methods cover a wide range of aspects such as effective database utilization, feature selection and encoding, innovative classification algorithms, and evaluation strategies. After exploring the limitations of existing computational methods, this paper discusses potential directions for future development. DeepRKE, RDense, and DeepDW all employ convolutional neural networks and long short-term memory networks to construct prediction models, yet their algorithm design and feature encoding differ, resulting in diverse prediction performances.
Affiliation(s)
- Yun Zuo
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Huixian Chen
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Lele Yang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Ruoyan Chen
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Xiaoyao Zhang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Zhaohong Deng
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China.
|
13
|
Yuan L, Zhao L, Lai J, Jiang Y, Zhang Q, Shen Z, Zheng CH, Huang DS. iCRBP-LKHA: Large convolutional kernel and hybrid channel-spatial attention for identifying circRNA-RBP interaction sites. PLoS Comput Biol 2024; 20:e1012399. [PMID: 39173070 PMCID: PMC11373821 DOI: 10.1371/journal.pcbi.1012399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Revised: 09/04/2024] [Accepted: 08/08/2024] [Indexed: 08/24/2024] Open
Abstract
Circular RNAs (circRNAs) play vital roles in transcription and translation. Identification of circRNA-RBP (RNA-binding protein) interaction sites has become a fundamental step in molecular and cell biology. Deep learning (DL)-based methods have been proposed to predict circRNA-RBP interaction sites and have achieved impressive identification performance. However, those methods cannot effectively capture long-distance dependencies or make effective use of the interaction information among multiple features. To overcome those limitations, we propose a DL-based model, iCRBP-LKHA, which uses deep hybrid networks to identify circRNA-RBP interaction sites. iCRBP-LKHA adopts five encoding schemes. The neural network architecture, which consists of a large kernel convolutional neural network (LKCNN), a convolutional block attention module with one-dimensional convolution (CBAM-1D) and a bidirectional gated recurrent unit (BiGRU), can automatically explore local information, global context information and interaction information among multiple features. To verify the effectiveness of iCRBP-LKHA, we compared its performance with shallow learning algorithms on 37 circRNA datasets and 37 stringent circRNA datasets. We also compared its performance with state-of-the-art DL-based methods on 37 circRNA datasets, 37 stringent circRNA datasets and 31 linear RNA datasets. The experimental results not only show that iCRBP-LKHA outperforms other competing methods, but also demonstrate the potential of this model in identifying other RNA-RBP interaction sites.
Affiliation(s)
- Lin Yuan
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Ling Zhao
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Jinling Lai
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Yufeng Jiang
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Qinhu Zhang
- Eastern Institute for Advanced Study, Eastern Institute of Technology, Ningbo, China
- Zhen Shen
- School of Computer and Software, Nanyang Institute of Technology, Nanyang, China
- Chun-Hou Zheng
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei, China
- De-Shuang Huang
- Eastern Institute for Advanced Study, Eastern Institute of Technology, Ningbo, China
|
14
|
Li F, Ma C, Lei S, Pan Y, Lin L, Pan C, Li Q, Geng F, Min D, Tang X. Gingipains may be one of the key virulence factors of Porphyromonas gingivalis to impair cognition and enhance blood-brain barrier permeability: An animal study. J Clin Periodontol 2024; 51:818-839. [PMID: 38414291 DOI: 10.1111/jcpe.13966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 01/24/2024] [Accepted: 02/08/2024] [Indexed: 02/29/2024]
Abstract
AIM: Blood-brain barrier (BBB) disorder is one of the early findings in cognitive impairments. We have recently found that Porphyromonas gingivalis bacteraemia can cause cognitive impairment and increased BBB permeability. This study aimed to find out the possible key virulence factors of P. gingivalis contributing to the pathological process. MATERIALS AND METHODS: C57/BL6 mice were infected with P. gingivalis or gingipains or P. gingivalis lipopolysaccharide (P. gingivalis LPS group) by tail vein injection for 8 weeks. The cognitive behaviour changes in mice, the histopathological changes in the hippocampus and cerebral cortex, the alterations of BBB permeability, and the changes in Mfsd2a and Cav-1 levels were measured. The mechanisms of Ddx3x-induced regulation on Mfsd2a by arginine-specific gingipain A (RgpA) in BMECs were explored. RESULTS: P. gingivalis and gingipains significantly promoted cognitive impairment in mice and pathological changes in the hippocampus and cerebral cortex, increased BBB permeability, inhibited Mfsd2a expression and up-regulated Cav-1 expression. After RgpA stimulation, the permeability of the BBB model in vitro increased, and the Ddx3x/Mfsd2a/Cav-1 regulatory axis was activated. CONCLUSIONS: Gingipains may be one of the key virulence factors of P. gingivalis to impair cognition and enhance BBB permeability by the Ddx3x/Mfsd2a/Cav-1 axis.
Affiliation(s)
- Fulong Li
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Center of Implantology, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Chunliang Ma
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Shuang Lei
- Department of Pediatric Dentistry, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Yaping Pan
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Li Lin
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Chunling Pan
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Qian Li
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Fengxue Geng
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
- Dongyu Min
- Traditional Chinese Medicine Experimental Center, Affiliated Hospital of Liaoning University of Traditional Chinese Medicine, Shenyang, China
- Key Laboratory of Ministry of Education for TCM Viscera State Theory and Applications, Liaoning University of Traditional Chinese Medicine, Shenyang, China
- Xiaolin Tang
- Department of Periodontics, School and Hospital of Stomatology, Liaoning Provincial Key Laboratory of Oral Disease, China Medical University, Shenyang, China
|
15
|
Mou Y, Lv K. Extracellular vesicle-delivered hsa_circ_0090081 regulated by EIF4A3 enhances gastric cancer tumorigenesis. Cell Div 2024; 19:19. [PMID: 38862985 PMCID: PMC11165812 DOI: 10.1186/s13008-024-00123-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 06/03/2024] [Indexed: 06/13/2024] Open
Abstract
BACKGROUND: Circular RNA (circRNA) and extracellular vesicles (EVs) in tumors are crucial for the malignant phenotype of tumor cells. Nevertheless, the mechanisms and clinical effects of EV-delivered hsa_circ_0090081 in gastric cancer (GC) are unclear. This study aimed to reveal the effect of eukaryotic translation initiation factor 4A3 (EIF4A3)-mediated hsa_circ_0090081 expression and EV-delivered hsa_circ_0090081 on GC progression. METHODS: qRT-PCR was conducted to clarify hsa_circ_0090081 and EIF4A3 levels in GC tissues. Transmission electron microscopy (TEM), nanoparticle tracking analysis (NTA), and Western blotting identified the EVs isolated from GC cells by ultracentrifugation. The roles of hsa_circ_0090081, EIF4A3, and EV-delivered hsa_circ_0090081 in GC cells were analyzed using Transwell, EdU, and CCK-8 assays. The regulatory relationship between EIF4A3 and hsa_circ_0090081 was investigated using RIP, qRT-PCR, and Pearson's analysis. RESULTS: Our study showed that hsa_circ_0090081 and EIF4A3 were highly expressed in GC, and hsa_circ_0090081 was associated with poor prognosis. Data revealed that hsa_circ_0090081 inhibition restrained GC cell proliferation, invasion, and migration. Additionally, EIF4A3 could bind to the pre-mRNA of PHEX (linear form of hsa_circ_0090081) to enhance hsa_circ_0090081 expression in GC cells. Moreover, EIF4A3 overexpression nullified the malignant phenotypic suppression caused by hsa_circ_0090081 silencing in GC cells. Furthermore, EVs secreted by GC cells delivered hsa_circ_0090081 to facilitate the malignant progression of targeted GC cells. CONCLUSION: This study showed that hsa_circ_0090081 was enhanced by EIF4A3 to play a promotive role in GC development. The results may help understand the mechanism of EIF4A3 and EV-delivered hsa_circ_0090081 and offer a valuable GC therapeutic target.
Affiliation(s)
- Yanjie Mou
- Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), No. 241, Pengliuyang Road, Wuchang District, Wuhan, 430060, Hubei, China
- Kun Lv
- Department of Tradition Chinese Medicine, Wuhan Third Hospital (Tongren Hospital of Wuhan University), No. 241, Pengliuyang Road, Wuchang District, Wuhan, 430060, Hubei, China.
|
16
|
Digby B, Finn S, Ó Broin P. Computational approaches and challenges in the analysis of circRNA data. BMC Genomics 2024; 25:527. [PMID: 38807085 PMCID: PMC11134749 DOI: 10.1186/s12864-024-10420-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 05/15/2024] [Indexed: 05/30/2024] Open
Abstract
Circular RNAs (circRNA) are a class of non-coding RNA, forming a single-stranded covalently closed loop structure generated via back-splicing. Advancements in sequencing methods and technologies in conjunction with algorithmic developments of bioinformatics tools have enabled researchers to characterise the origin and function of circRNAs, with practical applications as a biomarker of diseases becoming increasingly relevant. Computational methods developed for circRNA analysis are predicated on detecting the chimeric back-splice junction of circRNAs whilst mitigating false-positive sequencing artefacts. In this review, we discuss in detail the computational strategies developed for circRNA identification, highlighting a selection of tool strengths, weaknesses and assumptions. In addition to circRNA identification tools, we describe methods for characterising the role of circRNAs within the competing endogenous RNA (ceRNA) network, their interactions with RNA-binding proteins, and publicly available databases for rich circRNA annotation.
Affiliation(s)
- Barry Digby
- School of Mathematical and Statistical Sciences, University of Galway, Galway, Ireland.
- Stephen Finn
- Discipline of Histopathology, School of Medicine, Trinity College Dublin and Cancer Molecular Diagnostic Laboratory, Dublin, Ireland
- Pilib Ó Broin
- School of Mathematical and Statistical Sciences, University of Galway, Galway, Ireland
|
17
|
Yuan Y, Tang X, Li H, Lang X, Song Y, Yang Y, Zhou Z. BiLSTM- and CNN-Based m6A Modification Prediction Model for circRNAs. Molecules 2024; 29:2429. [PMID: 38893304 PMCID: PMC11173551 DOI: 10.3390/molecules29112429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 05/13/2024] [Accepted: 05/20/2024] [Indexed: 06/21/2024] Open
Abstract
m6A methylation, a ubiquitous modification on circRNAs, exerts a profound influence on RNA function, intracellular behavior, and diverse biological processes, including disease development. While prediction algorithms exist for mRNA m6A modifications, a critical gap remains in the prediction of circRNA m6A modifications. Therefore, accurate identification and prediction of m6A sites are imperative for understanding RNA function and regulation. This study presents a novel hybrid model combining a convolutional neural network (CNN) and a bidirectional long short-term memory network (BiLSTM) for precise m6A methylation site prediction in circular RNAs (circRNAs) based on data from HEK293 cells. This model exploits the synergy between CNN's ability to extract intricate sequence features and BiLSTM's strength in capturing long-range dependencies. Furthermore, the integrated attention mechanism empowers the model to pinpoint critical biological information for studying circRNA m6A methylation. Our model, exhibiting over 78% prediction accuracy on independent datasets, offers not only a valuable tool for scientific research but also a strong foundation for future biomedical applications. This work not only furthers our understanding of gene expression regulation but also opens new avenues for the exploration of circRNA methylation in biological research.
Affiliation(s)
- Yuqian Yuan
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; (Y.Y.); (H.L.); (X.L.); (Y.S.)
- Xiaozhu Tang
- School of Medicine & Holistic Integrative Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China;
- Hongyan Li
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; (Y.Y.); (H.L.); (X.L.); (Y.S.)
- Xufeng Lang
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; (Y.Y.); (H.L.); (X.L.); (Y.S.)
- Yihua Song
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; (Y.Y.); (H.L.); (X.L.); (Y.S.)
- Ye Yang
- School of Medicine & Holistic Integrative Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China;
- Zuojian Zhou
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; (Y.Y.); (H.L.); (X.L.); (Y.S.)
|
18
|
Lasantha D, Vidanagamachchi S, Nallaperuma S. CRIECNN: Ensemble convolutional neural network and advanced feature extraction methods for the precise forecasting of circRNA-RBP binding sites. Comput Biol Med 2024; 174:108466. [PMID: 38615462 DOI: 10.1016/j.compbiomed.2024.108466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/29/2024] [Accepted: 04/08/2024] [Indexed: 04/16/2024]
Abstract
Circular RNAs (circRNAs) have surfaced as important non-coding RNA molecules in biology. Understanding interactions between circRNAs and RNA-binding proteins (RBPs) is crucial in circRNA research. Existing prediction models suffer from limited availability and accuracy, necessitating advanced approaches. In this study, we propose CRIECNN (Circular RNA-RBP Interaction predictor using an Ensemble Convolutional Neural Network), a novel ensemble deep learning model that enhances circRNA-RBP binding site prediction accuracy. CRIECNN employs advanced feature extraction methods and evaluates four distinct sequence datasets and encoding techniques (BERT, Doc2Vec, KNF, EIIP). The model consists of an ensemble convolutional neural network, a BiLSTM, and a self-attention mechanism for feature refinement. Our results demonstrate that CRIECNN outperforms state-of-the-art methods in accuracy and performance, effectively predicting circRNA-RBP interactions from both full-length sequences and fragments. This novel strategy makes an enormous advancement in the prediction of circRNA-RBP interactions, improving our understanding of circRNAs and their regulatory roles.
Affiliation(s)
- Dilan Lasantha
- Department of Computer Science, University of Ruhuna, Sri Lanka.
- Sam Nallaperuma
- Department of Engineering, University of Cambridge, United Kingdom.
|
19
|
Wu H, Liu X, Fang Y, Yang Y, Huang Y, Pan X, Shen HB. Decoding protein binding landscape on circular RNAs with base-resolution transformer models. Comput Biol Med 2024; 171:108175. [PMID: 38402841 DOI: 10.1016/j.compbiomed.2024.108175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/16/2024] [Accepted: 02/18/2024] [Indexed: 02/27/2024]
Abstract
Circular RNAs (circRNAs), a class of endogenous RNA with a covalent loop structure, can regulate gene expression by serving as sponges for microRNAs and RNA-binding proteins (RBPs). To date, most computational methods for predicting RBP binding sites on circRNAs focus on circRNA fragments rather than on full circRNA transcripts. These methods detect whether a circRNA fragment contains binding sites, but cannot determine where the binding sites are or how many there are on the circRNA transcript. We report a hybrid deep learning-based tool, CircSite, to predict RBP binding sites at single-nucleotide resolution and detect key contributing nucleotides on circRNA transcripts. CircSite takes advantage of convolutional neural networks (CNNs) and the Transformer to learn local and global representations, respectively, of circRNAs binding to RBPs. We construct 37 datasets of circRNAs interacting with proteins for benchmarking, and the experimental results show that CircSite offers accurate predictions of RBP binding nucleotides and detects key subsequences that align well with known binding motifs. CircSite is an easy-to-use online webserver for predicting RBP binding sites on circRNA transcripts and is freely available at http://www.csbio.sjtu.edu.cn/bioinf/CircSite/.
Affiliation(s)
- Hehe Wu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, And Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Xiaojian Liu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, And Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Yi Fang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, And Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Yang Yang
- Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Yan Huang
- State Key Laboratory of Infrared Physics, Shanghai Institute of Technical Physics Chinese Academy of Sciences, 500 Yutian Road, Shanghai, 200083, China
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, And Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China.
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, And Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China.
|
20
|
Cao C, Wang C, Yang S, Zou Q. CircSI-SSL: circRNA-binding site identification based on self-supervised learning. Bioinformatics 2024; 40:btae004. [PMID: 38180876 PMCID: PMC10789309 DOI: 10.1093/bioinformatics/btae004] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/26/2023] [Revised: 11/13/2023] [Accepted: 01/03/2024] [Indexed: 01/07/2024] Open
Abstract
MOTIVATION: In recent years, circular RNAs (circRNAs), a particular form of RNA with a closed-loop structure, have attracted widespread attention due to their physiological significance (they can directly bind proteins), leading to the development of numerous protein binding site identification algorithms. Unfortunately, these approaches are supervised and require a large proportion of labeled samples for training to achieve superior performance. Acquiring sample labels, however, requires a large number of biological experiments and is difficult. RESULTS: To address the need for a large number of labels in the circRNA-binding site prediction task, a self-supervised learning binding site identification algorithm named CircSI-SSL is proposed in this article. According to our survey, this is unprecedented in the research field. Specifically, CircSI-SSL first combines multiple feature coding schemes and employs RNA_Transformer for cross-view sequence prediction (the self-supervised task) to learn mutual information from the multi-view data, and is then fine-tuned with only a few sample labels. Comprehensive experiments on six widely used circRNA datasets indicate that our CircSI-SSL algorithm achieves excellent performance in comparison to previous algorithms, even in the extreme case where the ratio of training data to test data is 1:9. In addition, a transfer experiment on six linRNA datasets, performed without network modification or hyperparameter adjustment, shows that CircSI-SSL has good scalability. In summary, the prediction algorithm based on self-supervised learning proposed in this article is expected to replace previous supervised algorithms and has more extensive application value. AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/cc646201081/CircSI-SSL.
Affiliation(s)
- Chao Cao
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
- Chunyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Shuhong Yang
- Faculty of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang, Guangdong 524088, China
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
|
21
|
Shi M, Fang Y, Liang Y, Hu Y, Huang J, Xia W, Bian H, Zhuo Q, Wu L, Zhao C. Identification and characterization of differentially expressed circular RNAs in extraocular muscle of oculomotor nerve palsy. BMC Genomics 2023; 24:617. [PMID: 37848864 PMCID: PMC10583365 DOI: 10.1186/s12864-023-09733-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 10/11/2023] [Indexed: 10/19/2023] Open
Abstract
BACKGROUND: Oculomotor nerve palsy (ONP) is a neuroparalytic disorder resulting in dysfunction of the extraocular muscles (EOMs) it innervates, the pathological characteristics of which remain underexplored. METHODS: In this study, medial rectus muscle tissue samples from four ONP patients and four constant exotropia (CXT) patients were collected for RNA sequencing. Differentially expressed circular RNAs (circRNAs) were identified and included in functional enrichment analysis, followed by interaction analysis with microRNAs and mRNAs as well as RNA binding proteins. Furthermore, RT-qPCR was used to validate the expression level of the differentially expressed circRNAs. RESULTS: A total of 84 differentially expressed circRNAs were identified from 10,504 predicted circRNAs. Functional enrichment analysis indicated that the differentially expressed circRNAs significantly correlated with skeletal muscle contraction. In addition, interaction analyses showed that up-regulated circRNA_03628 interacted significantly with the RNA binding proteins AGO2 and EIF4A3 as well as the microRNAs hsa-miR-188-5p and hsa-miR-4529-5p. The up-regulation of circRNA_03628 was validated by RT-qPCR, followed by further elaboration of the expression, location and clinical significance of circRNA_03628 in EOMs of ONP. CONCLUSIONS: Our study may shed light on the role of differentially expressed circRNAs, especially circRNA_03628, in the pathological changes of EOMs in ONP.
Affiliation(s)
- Mingsu Shi
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Yanxi Fang
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Yu Liang
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Yuxiang Hu
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Jiaqiu Huang
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Weiyi Xia
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Hewei Bian
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Qiao Zhuo
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China
| | - Lianqun Wu
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China.
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China.
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China.
| | - Chen Zhao
- Eye Institute, Department of Ophthalmology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, China.
- NHC Key Laboratory of Myopia (Fudan University), Key Laboratory of Myopia, Chinese Academy of Medical Sciences, 83 Fenyang Road, Shanghai, 200031, China.
- Shanghai Key Laboratory of Visual Impairment and Restoration, 83 Fenyang Road, Shanghai, 200031, China.
| |
Collapse
|
22
|
Shen Z, Liu W, Zhao S, Zhang Q, Wang S, Yuan L. Nucleotide-level prediction of CircRNA-protein binding based on fully convolutional neural network. Front Genet 2023; 14:1283404. [PMID: 37867600 PMCID: PMC10587422 DOI: 10.3389/fgene.2023.1283404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 09/21/2023] [Indexed: 10/24/2023]
Abstract
Introduction: CircRNA-protein binding plays a critical role in complex biological activity and disease. Various deep learning-based algorithms have been proposed to identify CircRNA-protein binding sites. These methods predict at the sequence level whether a CircRNA sequence contains protein binding sites, and primarily concentrate on analysing the sequence specificity of CircRNA-protein binding. However, they are unsatisfactory at accurately predicting motif sites that have special functions in gene expression. Methods: In this study, drawing on deep learning models that perform pixel-level binary classification in computer vision, we treated CircRNA-protein binding site prediction as a nucleotide-level binary classification task and used a fully convolutional neural network (CPBFCN) to identify CircRNA-protein binding motif sites. Results: CPBFCN provides a new path to predict CircRNA motifs. Using the MEME tool and existing CircRNA-related and protein-related databases, we analysed the functions of the motifs discovered by CPBFCN. We also investigated the correlation between CircRNA sponges and motif distribution. Furthermore, by comparing the motif distributions obtained with different input sequence lengths, we found that some motifs in the flanking sequences of the CircRNA-protein binding region may contribute to CircRNA-protein binding. Conclusion: This study contributes to the identification of circRNA-protein binding and helps in understanding the role of circRNA-protein binding in gene expression regulation.
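The nucleotide-level framing described in this abstract can be illustrated with a small data-preparation sketch: each nucleotide gets a one-hot input row and its own 0/1 target, in the same way each pixel gets a label in image segmentation. The function names, the half-open interval convention and the example sequence are assumptions made for illustration; they are not taken from the CPBFCN paper or its code.

```python
# Illustrative sketch only: names and the interval convention are assumptions,
# not the CPBFCN implementation.

ALPHABET = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element one-hot rows (A, C, G, U)."""
    rows = []
    for base in seq.upper().replace("T", "U"):
        row = [0, 0, 0, 0]
        if base in ALPHABET:  # unknown bases (e.g. N) stay all-zero
            row[ALPHABET.index(base)] = 1
        rows.append(row)
    return rows

def nucleotide_labels(length, sites):
    """Return one 0/1 label per nucleotide; `sites` holds half-open (start, end) intervals."""
    labels = [0] * length
    for start, end in sites:
        for i in range(max(start, 0), min(end, length)):
            labels[i] = 1
    return labels

seq = "AUGGCUACGU"                              # hypothetical fragment
x = one_hot(seq)                                # model input, one row per nucleotide
y = nucleotide_labels(len(seq), [(2, 5)])       # hypothetical binding site at positions 2..4
```

A sequence-level classifier would reduce `y` to a single label for the whole fragment; keeping one label per position is what allows a fully convolutional network to localise motif sites.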
Affiliation(s)
- Zhen Shen
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan, China
- Wei Liu
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan, China
- ShuJun Zhao
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan, China
- QinHu Zhang
  - EIT Institute for Advanced Study, Ningbo, Zhejiang, China
- SiGuo Wang
  - EIT Institute for Advanced Study, Ningbo, Zhejiang, China
- Lin Yuan
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
|
23
|
Liu N, Zhang Z, Wu Y, Wang Y, Liang Y. CRBSP: Prediction of CircRNA-RBP Binding Sites Based on Multimodal Intermediate Fusion. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2898-2906. [PMID: 37130249 DOI: 10.1109/tcbb.2023.3272400] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]
Abstract
Circular RNA (CircRNA) is widely expressed and has physiological and pathological significance, regulating post-transcriptional processes via its protein-binding activity. However, whereas much work has been done on linear RNA and RNA-binding proteins (RBPs), little is known about the binding sites of CircRNA. This report describes the development of an intermediate multimodal data fusion strategy, CRBSP, to predict CircRNA-RBP binding sites. CRBSP represents the trinucleotide semantic, location, composition and frequency information of CircRNA using the coding methods Word to vector (Word2vec), position-specific trinucleotide propensity (PSTNP), pseudo trinucleotide composition (PseTNC) and trinucleotide composition (TNC), respectively. A convolutional neural network (CNN) was used to extract global information, and a bidirectional long short-term memory (BiLSTM) encoder with a long short-term memory (LSTM) decoder was used for local sequence information. The contributions of key features were enhanced by a self-attention mechanism, followed by intermediate fusion of the four enhanced features. With a logistic regression (LR) classifier, CRBSP achieved a mean AUC of 0.9362 in 5-fold cross-validation across all 37 datasets, outperforming five current state-of-the-art models. A similar evaluation on linear RNA-RBP binding sites gave an AUC of 0.7615, which is also higher than that of other prediction methods, demonstrating the robustness of CRBSP.
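Of the four encodings named in this abstract, trinucleotide composition is the simplest to show concretely. The sketch below assumes the commonly used definition (the normalised frequency of each of the 64 possible trinucleotides over overlapping windows); the abstract does not spell out the exact variant used in CRBSP, so the function name and the handling of ambiguous bases are assumptions.

```python
from itertools import product

# All 64 trinucleotides over A, C, G, U, in a fixed order (one feature each).
TRINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=3)]

def trinucleotide_composition(seq):
    """Return a 64-dimensional frequency vector over overlapping trinucleotides.

    Assumption: windows containing non-ACGU characters are skipped and the
    frequencies are normalised by the number of valid windows.
    """
    seq = seq.upper().replace("T", "U")
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    windows = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[t] / windows for t in TRINUCLEOTIDES]

features = trinucleotide_composition("AUGAUGA")  # hypothetical fragment
```

For the fragment above, the five overlapping windows are AUG, UGA, GAU, AUG and UGA, so only three of the 64 features are non-zero.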
|
24
|
Li L, Xue Z, Du X. ASCRB: Multi-view based attentional feature selection for CircRNA-binding site prediction. Comput Biol Med 2023; 162:107077. [PMID: 37290390 DOI: 10.1016/j.compbiomed.2023.107077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/10/2023]
Abstract
CircRNA is a non-coding RNA with a special circular structure, which plays a key role in a variety of life activities by interacting with RNA-binding proteins through CircRNA binding sites. Therefore, accurately identifying CircRNA binding sites is of great importance for gene regulation. Most previous methods are based on single-view or multi-view features. Because single-view methods provide less effective information, the current mainstream methods focus on extracting rich relevant features by constructing multiple views. However, the increasing number of views leads to a large amount of redundant information, which is detrimental to the detection of CircRNA binding sites. To solve this problem, we propose using a channel attention mechanism to obtain useful multi-view features by filtering out invalid information in each view. First, we use five feature encoding schemes to construct multiple views. Then, we calibrate the features by generating a global representation of each view, filtering out redundant information to retain important feature information. Finally, the features obtained from the multiple views are fused to detect RNA binding sites. To validate the effectiveness of the method, we compared its performance on 37 CircRNA-RBP datasets with that of existing methods. Experimental results show that the average AUC of our method is 93.85%, which is better than the current state-of-the-art methods. The source code is available at https://github.com/dxqllp/ASCRB.
Affiliation(s)
- Lei Li
  - Department of Neurology, Shuyang Hospital Affiliated to Yangzhou University School of Medicine (Shuyang Hospital of Traditional Chinese Medicine), Suqian, China
- Zhigang Xue
  - School of Computer Science and Technology, Anhui University, Hefei, China
- Xiuquan Du
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, China; School of Computer Science and Technology, Anhui University, Hefei, China
|
25
|
Cao C, Yang S, Li M, Li C. CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization. BMC Bioinformatics 2023; 24:220. [PMID: 37254080 DOI: 10.1186/s12859-023-05352-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/24/2023] [Accepted: 05/25/2023] [Indexed: 06/01/2023]
Abstract
BACKGROUND: Circular RNAs (circRNAs) play a significant role in some diseases by acting as transcription templates. Therefore, analyzing the interaction mechanism between circRNAs and RNA-binding proteins (RBPs) has far-reaching implications for the prevention and treatment of diseases. Existing models for circRNA-RBP identification usually adopt a convolutional neural network (CNN), a recurrent neural network (RNN), or their variants as feature extractors. Most of them have drawbacks such as poor parallelism, insufficient stability, and an inability to capture long-term dependencies. METHODS: In this paper, we propose a new method based entirely on the self-attention mechanism to capture deep semantic features of RNA sequences. On this basis, we construct the CircSSNN model for circRNA-RBP identification. The proposed model constructs a feature scheme by fusing circRNA sequence representations with statistical distributions, static local contexts, and dynamic global contexts. With a stable and efficient network architecture, the distance between any two positions in a sequence is reduced to a constant, so CircSSNN can quickly capture long-term dependencies and extract deep semantic features. RESULTS: Experiments on 37 circRNA datasets show that the proposed model has overall advantages in stability, parallelism, and prediction performance. Keeping the network structure and hyperparameters unchanged, we directly applied CircSSNN to linear RNA (linRNA) datasets. The favorable results show that CircSSNN can be transferred simply and efficiently without task-oriented tuning. CONCLUSIONS: CircSSNN can serve as an appealing circRNA-RBP identification tool with good identification performance, excellent scalability, and wide application scope without the need for task-oriented fine-tuning of parameters, which is expected to lower the level of expertise required for hyperparameter tuning in bioinformatics analysis.
Affiliation(s)
- Chao Cao
  - School of Computer Science and Technology, Guangxi University of Science and Technology, Liuzhou, China
- Shuhong Yang
  - Key Laboratory of Guangxi Universities on Intelligent Computing and Distributed Information Processing, Guangxi University of Science and Technology, Liuzhou, China
- Mengli Li
  - School of Technology, Guilin University, Guilin, China
- Chungui Li
  - School of Computer Science and Technology, Guangxi University of Science and Technology, Liuzhou, China
|
26
|
Ma Z, Sun ZL, Liu M. CRBP-HFEF: Prediction of RBP-Binding Sites on circRNAs Based on Hierarchical Feature Expansion and Fusion. Interdiscip Sci 2023:10.1007/s12539-023-00572-0. [PMID: 37233959 DOI: 10.1007/s12539-023-00572-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 04/20/2023] [Accepted: 04/21/2023] [Indexed: 05/27/2023]
Abstract
Circular RNAs (circRNAs) participate in the regulation of biological processes by binding to specific proteins and thus influence transcriptional processes. In recent years, circRNAs have become an emerging hotspot in RNA research. Owing to their powerful learning ability, various deep learning frameworks have been used to predict the binding sites of RNA-binding proteins (RBPs) on circRNAs. These methods usually perform only single-level feature extraction on sequence information, which may yield inadequate features. In general, the features of the deep and shallow layers of a neural network complement each other, and both are important for binding site prediction tasks. Based on this concept, we propose a method that combines deep and shallow features, namely CRBP-HFEF. Specifically, features are first extracted and expanded at different levels of the network. Then, the expanded deep and shallow features are fused and fed into the classification network, which finally determines whether they are binding sites. Compared with several existing methods, the experimental results on multiple datasets show that the proposed method achieves significant improvements in a number of metrics (with an average AUC of 0.9855). Moreover, extensive ablation experiments were also performed to verify the effectiveness of the hierarchical feature expansion strategy.
Affiliation(s)
- Zheng Ma
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, and School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
- Zhan-Li Sun
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, and School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
- Mengya Liu
  - School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui, China
|
27
|
Rebolledo C, Silva JP, Saavedra N, Maracaja-Coutinho V. Computational approaches for circRNAs prediction and in silico characterization. Brief Bioinform 2023; 24:7150741. [PMID: 37139555 DOI: 10.1093/bib/bbad154] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/22/2022] [Revised: 03/20/2023] [Accepted: 03/30/2023] [Indexed: 05/05/2023]
Abstract
Circular RNAs (circRNAs) are single-stranded, covalently closed non-coding RNA molecules originating from RNA splicing. Their functions include regulatory potential over other RNA species, such as microRNAs and messenger RNAs, and over RNA-binding proteins. For circRNA identification, several algorithms are available, and they can be classified into two major types: pseudo-reference-based and split-alignment-based approaches. In general, the data generated by circRNA transcriptome initiatives are deposited in dedicated public databases, which provide a large amount of information on different species and functional annotations. In this review, we describe the main computational resources for the identification and characterization of circRNAs, covering the algorithms and predictive tools used to evaluate their potential role in a particular transcriptomics project, as well as the public repositories containing relevant data and information on circRNAs, and we summarize their characteristics, reliability and the amount of data reported.
Affiliation(s)
- Camilo Rebolledo
  - Center of Molecular Biology & Pharmacogenetics, Department of Basic Sciences, Scientific and Technological Resources, Universidad de La Frontera, Temuco, Chile
  - Advanced Center for Chronic Diseases - ACCDiS, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
  - Centro de Modelamiento Molecular, Biofísica y Bioinformática - CM2B2, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
- Juan Pablo Silva
  - Centro de Modelamiento Molecular, Biofísica y Bioinformática - CM2B2, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
  - ANID Anillo ACT210004 SYSTEMIX, Rancagua, Chile
- Nicolás Saavedra
  - Center of Molecular Biology & Pharmacogenetics, Department of Basic Sciences, Scientific and Technological Resources, Universidad de La Frontera, Temuco, Chile
- Vinicius Maracaja-Coutinho
  - Advanced Center for Chronic Diseases - ACCDiS, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
  - Centro de Modelamiento Molecular, Biofísica y Bioinformática - CM2B2, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
  - ANID Anillo ACT210004 SYSTEMIX, Rancagua, Chile
  - Anillo Inflammation in HIV/AIDS - InflammAIDS, Santiago, Chile
|
28
|
Zhang L, Lu C, Zeng M, Li Y, Wang J. CRMSS: predicting circRNA-RBP binding sites based on multi-scale characterizing sequence and structure features. Brief Bioinform 2023; 24:6889442. [PMID: 36511222 DOI: 10.1093/bib/bbac530] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/21/2022] [Revised: 11/01/2022] [Accepted: 11/07/2022] [Indexed: 12/14/2022]
Abstract
Circular RNAs (circRNAs) are reverse-spliced and covalently closed RNAs. Their interactions with RNA-binding proteins (RBPs) have multiple effects on the progression of many diseases. Several computational methods have been proposed to identify RBP binding sites on circRNAs, but they suffer from insufficient accuracy, robustness and explainability. In this study, we first take the characteristics of both RNA and RBP into consideration. We propose a method for discriminating circRNA-RBP binding sites based on multi-scale characterizing sequence and structure features, called CRMSS. For circRNAs, we use sequence k-mer embedding and the formation probabilities of local secondary structures as features. For RBPs, we combine sequence and structure frequencies of RNA-binding domain regions to generate features. We capture binding patterns with multi-scale residual blocks. With BiLSTM and an attention mechanism, we obtain the contextual information of the high-level representation for circRNA-RBP binding. To validate the effectiveness of CRMSS, we compare its predictive performance with that of other methods on 37 RBPs. Taking the properties of both circRNAs and RBPs into account, CRMSS achieves superior performance over state-of-the-art methods. In the case study, our model provides reliable predictions and correctly identifies experimentally verified circRNA-RBP pairs. The code of CRMSS is freely available at https://github.com/BioinformaticsCSU/CRMSS.
Affiliation(s)
- Lishen Zhang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
- Chengqian Lu
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
- Min Zeng
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
- Yaohang Li
  - Department of Computer Science at Old Dominion University, USA
- Jianxin Wang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
|
29
|
Ruan H, Wang PC, Han L. Characterization of circular RNAs with advanced sequencing technologies in human complex diseases. Wiley Interdiscip Rev RNA 2023; 14:e1759. [PMID: 36164985 DOI: 10.1002/wrna.1759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 07/09/2022] [Accepted: 08/02/2022] [Indexed: 01/31/2023]
Abstract
Circular RNAs (circRNAs) are one category of non-coding RNAs that do not possess 5' caps and 3' free ends. Instead, they are derived in closed circular form from pre-mRNAs by a non-canonical splicing mechanism named "back-splicing." CircRNAs were discovered four decades ago and were initially called "scrambled exons." Compared with linear RNAs, the expression levels of circRNAs are considerably lower, and it is challenging to identify circRNAs specifically. Thus, the biological relevance of circRNAs was underappreciated until the advancement of next generation sequencing (NGS) technology. The biological features of circRNAs, such as their tissue-specific expression patterns, biogenesis factors, and functional effects in complex diseases, notably human cancers, have been extensively explored in the last decade. With the invention of third generation sequencing (TGS), which offers longer sequencing reads and newly designed strategies to characterize full-length circRNAs, the panorama of circRNAs in human complex diseases could be further unveiled. In this review, we first introduce the history of circular RNA detection. Next, we describe widely adopted NGS-based methods and the recently established TGS-based approaches capable of characterizing circRNAs at full length. We then summarize data resources and representative circRNA functional studies related to human complex diseases. In the last section, we review computational tools and discuss the potential advantages of applying advanced sequencing approaches to the functional interpretation of full-length circRNAs in complex diseases. This article is categorized under: RNA Evolution and Genomics > Computational Analyses of RNA; RNA in Disease and Development > RNA in Disease.
Affiliation(s)
- Hang Ruan
  - Institutes of Biology and Medical Sciences, Soochow University, Suzhou, China
- Peng-Cheng Wang
  - Institutes of Biology and Medical Sciences, Soochow University, Suzhou, China
- Leng Han
  - Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, Texas, USA; Department of Translational Medical Sciences, College of Medicine, Texas A&M University, Houston, Texas, USA
|
30
|
Wei Q, Zhang Q, Gao H, Song T, Salhi A, Yu B. DEEPStack-RBP: Accurate identification of RNA-binding proteins based on autoencoder feature selection and deep stacking ensemble classifier. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
|
31
|
Pepe G, Appierdo R, Carrino C, Ballesio F, Helmer-Citterich M, Gherardini PF. Artificial intelligence methods enhance the discovery of RNA interactions. Front Mol Biosci 2022; 9:1000205. [PMID: 36275611 PMCID: PMC9585310 DOI: 10.3389/fmolb.2022.1000205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022]
Abstract
Understanding how RNAs interact with proteins, RNAs, or other molecules remains a major challenge in biology, given the importance of these complexes in both normal and pathological cellular processes. Since experimental datasets are becoming available for hundreds of functional interactions between RNAs and other biomolecules, several machine learning and deep learning algorithms have been proposed for predicting RNA-RNA or RNA-protein interactions. However, most of these approaches were evaluated on a single dataset, making performance comparisons difficult. With this review, we aim to summarize recent computational methods developed in this broad research area, highlighting the feature encoding and machine learning strategies adopted. Given the magnitude of the effect that dataset size and quality have on performance, we explored the characteristics of these datasets. Additionally, we discuss multiple approaches to generate datasets of negative examples for training. Finally, we describe the best-performing methods to predict interactions between proteins and specific classes of RNA molecules, such as circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), and methods to predict RNA-RNA or RNA-RBP interactions independently of the RNA type.
Affiliation(s)
- G Pepe
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
  - *Correspondence: G Pepe; M Helmer-Citterich
- R Appierdo
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- C Carrino
  - PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- F Ballesio
  - PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- M Helmer-Citterich
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
  - *Correspondence: G Pepe; M Helmer-Citterich
- PF Gherardini
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
|
32
|
JLCRB: A unified multi-view-based joint representation learning for CircRNA binding sites prediction. J Biomed Inform 2022; 136:104231. [DOI: 10.1016/j.jbi.2022.104231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 10/14/2022] [Accepted: 10/14/2022] [Indexed: 11/07/2022]
|
33
|
A pseudo-Siamese framework for circRNA-RBP binding sites prediction integrating BiLSTM and soft attention mechanism. Methods 2022; 207:57-64. [PMID: 36113743 DOI: 10.1016/j.ymeth.2022.09.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/24/2022] [Revised: 08/24/2022] [Accepted: 09/09/2022] [Indexed: 11/20/2022]
Abstract
Circular RNAs (circRNAs) are widely expressed in tissues and play a key role in diseases by interacting with RNA-binding proteins (RBPs). Because of the high cost of traditional experimental techniques, computational methods have been developed to identify the binding sites between circRNAs and RBPs. Unfortunately, these methods suffer from insufficient feature learning and reliance on a single classifier for the output. To address these limitations, we propose a novel method named circ-pSBLA, which constructs a pseudo-Siamese framework integrating a bi-directional long short-term memory (BiLSTM) network and a soft attention mechanism for circRNA-RBP binding site prediction. A softmax function and CatBoost are adopted as separate classifiers to build the pseudo-Siamese framework, and circ-pSBLA combines their outputs to produce the final prediction. To validate the effectiveness of circ-pSBLA, we compare it with other state-of-the-art methods and carry out an ablation experiment on 17 sub-datasets. Moreover, we perform motif analysis on 3 sub-datasets. The results show that circ-pSBLA outperforms the other methods. All supporting source code can be downloaded from https://github.com/gyj9811/circ-pSBLA.
|
34
|
Wang Z, Lei X. A web server for identifying circRNA-RBP variable-length binding sites based on stacked generalization ensemble deep learning network. Methods 2022; 205:179-190. [PMID: 35810958 DOI: 10.1016/j.ymeth.2022.06.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/14/2022] [Revised: 06/23/2022] [Accepted: 06/28/2022] [Indexed: 11/28/2022]
Abstract
Circular RNA (circRNA) can exert biological functions by interacting with RNA-binding proteins (RBPs), and several deep learning-based methods have been developed to predict RBP binding sites on circRNA. However, most of these methods identify circRNA-RBP binding sites based on only a single data resource and cannot provide exact binding sites, giving only a probability value for a sequence fragment. To solve these problems, we propose a binding site localization algorithm that fuses binding sites from multiple databases, and we further design a stacked generalization ensemble deep learning model named CirRBP to identify RBP binding sites on circRNA. CirRBP is trained by combining the binding sites from multiple databases and makes predictions by weighted aggregation of the predictions of each sub-model. The results show that CirRBP outperforms every sub-model and existing online prediction models. For better access to our research results, we developed an open-source web application called CRWS (CircRNA-RBP Web Server). The back-end learning model of CRWS is the stacked generalization ensemble learning model CirRBP, which is based on different deep learning frameworks. Given a full-length circRNA or fragment sequence and a target RBP, CRWS can analyze and report the exact potential binding sites of the target RBP on the given sequence through the binding site localization algorithm, and visualize them. In addition, CRWS can discover the most widely distributed motif in each RBP dataset. To date, CRWS is the first online tool that uses multi-source data to train models and predict exact binding sites. CRWS is publicly and freely available without a login requirement at: http://www.bioinformatics.team.
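The weighted aggregation step mentioned in this abstract can be sketched as a weighted average of per-sample probabilities from several sub-models. The abstract does not say how CirRBP chooses its weights, so the function name, the weights and the probabilities below are all hypothetical; this is only a generic illustration of the idea, not the CirRBP implementation.

```python
def weighted_aggregate(sub_model_probs, weights):
    """Combine per-sample probabilities from several sub-models by weighted average.

    sub_model_probs: one list of probabilities per sub-model, all of equal length.
    weights: one non-negative weight per sub-model (hypothetical values here).
    """
    if not weights or len(sub_model_probs) != len(weights):
        raise ValueError("need exactly one weight per sub-model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(sub_model_probs[0])
    if any(len(p) != n for p in sub_model_probs):
        raise ValueError("all sub-models must score the same samples")
    return [
        sum(w * p[i] for w, p in zip(weights, sub_model_probs)) / total
        for i in range(n)
    ]

# Three hypothetical sub-models scoring two candidate fragments.
probs = weighted_aggregate(
    [[0.9, 0.2], [0.7, 0.4], [0.8, 0.1]],
    [2.0, 1.0, 1.0],
)
```

In a stacked generalization setting, the weights would typically be learned from held-out predictions rather than fixed by hand as in this sketch.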
Affiliation(s)
- Zhengfeng Wang
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China; College of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China; Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin 541004, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
|
35
|
Dong X, Chen K, Chen W, Wang J, Chang L, Deng J, Wei L, Han L, Huang C, He C. circRIP: an accurate tool for identifying circRNA-RBP interactions. Brief Bioinform 2022; 23:6596315. [PMID: 35641157 DOI: 10.1093/bib/bbac186] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/09/2022] [Revised: 04/07/2022] [Accepted: 04/23/2022] [Indexed: 12/25/2022]
Abstract
Circular ribonucleic acids (RNAs) (circRNAs) are formed by covalently linking the downstream splice donor and the upstream splice acceptor. One of the most important functions of circRNAs is mainly exerted through binding RNA-binding proteins (RBPs). However, there is no efficient algorithm for identifying genome-wide circRNA-RBP interactions. Here, we developed a unique algorithm, circRIP, for identifying circRNA-RBP interactions from RNA immunoprecipitation sequencing (RIP-Seq) data. A simulation test demonstrated the sensitivity and specificity of circRIP. By applying circRIP, we identified 95 IGF2BP3-binding circRNAs based on the IGF2BP3 RIP-Seq dataset. We further identified 2823 and 1333 circRNAs binding to >100 RBPs in K562 and HepG2 cell lines, respectively, based on enhanced cross-linking immunoprecipitation (eCLIP) data, demonstrating the significance to survey the potential interactions between circRNAs and RBPs. In this study, we provide an accurate and sensitive tool, circRIP (https://github.com/bioinfolabwhu/circRIP), to systematically identify RBP and circRNA interactions from RIP-Seq and eCLIP data, which can significantly benefit the research community for the functional exploration of circRNAs.
Affiliation(s)
- Xin Dong
  - School of Basic Medical Sciences, Wuhan University, Wuhan 430071, China
- Ke Chen
  - Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China
- Wenbo Chen
  - School of Basic Medical Sciences, Wuhan University, Wuhan 430071, China
- Jun Wang
  - School of Basic Medical Sciences, Wuhan University, Wuhan 430071, China
- Liuping Chang
  - College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Jin Deng
  - School of Basic Medical Sciences, Wuhan University, Wuhan 430071, China
- Lei Wei
  - School of Basic Medical Sciences, Wuhan University, Wuhan 430071, China
- Leng Han
  - Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, TX, 77030, USA
- Chunhua Huang
  - College of Basic Medicine, Guizhou University of Traditional Chinese Medicine, Guiyang, Key Laboratory of Traditional Chinese Medicine Toxicology in Forensic Medicine, Guizhou Education Department, Guiyang 550025, China
- Chunjiang He
  - School of Basic Medical Sciences, Wuhan University, Wuhan 430071, China; College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
|
36
|
Du X, Zhao X, Zhang Y. DeepBtoD: Improved RNA-binding proteins prediction via integrated deep learning. J Bioinform Comput Biol 2022; 20:2250006. [PMID: 35451938 DOI: 10.1142/s0219720022500068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
RNA-binding proteins (RBPs) have crucial roles in various cellular processes such as alternative splicing and gene regulation. Therefore, the analysis and identification of RBPs is an essential issue. However, although many computational methods have been developed for predicting RBPs, few studies simultaneously consider local and global information from the perspective of the RNA sequence. Facing this challenge, we present a novel method called DeepBtoD, which predicts RBPs directly from RNA sequences. First, a [Formula: see text]-BtoD encoding is designed, which takes into account the composition of [Formula: see text]-nucleotides and their relative positions and forms a local module. Second, we designed a multi-scale convolutional module embedded with a self-attention mechanism, the ms-focusCNN, which is used to learn more effective, diverse, and discriminative high-level features. Finally, global information is considered to supplement the local modules, with ensemble learning used to predict whether the target RNA binds to RBPs. Preliminary results on 24 independent test datasets show that our proposed method can classify RBPs with an area under the curve of 0.933. Remarkably, DeepBtoD shows competitive results against seven state-of-the-art methods, suggesting that RBPs can be well recognized by integrating local [Formula: see text]-BtoD and global information from RNA sequences alone. Hence, our integrative method may improve the power of RBP prediction, which might be particularly useful for modeling protein-nucleic acid interactions in systems biology studies. Our DeepBtoD server can be accessed at http://175.27.228.227/DeepBtoD/.
Affiliation(s)
- XiuQuan Du
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, Anhui, P. R. China; School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, P. R. China
- XiuJuan Zhao
  - School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, P. R. China
- YanPing Zhang
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, Anhui, P. R. China
|
37
|
Chalupová E, Vaculík O, Poláček J, Jozefov F, Majtner T, Alexiou P. ENNGene: an Easy Neural Network model building tool for Genomics. BMC Genomics 2022; 23:248. [PMID: 35361122 PMCID: PMC8973509 DOI: 10.1186/s12864-022-08414-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/05/2021] [Accepted: 02/23/2022] [Indexed: 11/17/2022]
Abstract
Background The recent big data revolution in Genomics, coupled with the emergence of Deep Learning as a set of powerful machine learning methods, has shifted the standard practices of machine learning for Genomics. Even though Deep Learning methods such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are becoming widespread in Genomics, developing and training such models is outside the ability of most researchers in the field. Results Here we present ENNGene—Easy Neural Network model building tool for Genomics. This tool simplifies training of custom CNN or hybrid CNN-RNN models on genomic data via an easy-to-use Graphical User Interface. ENNGene allows multiple input branches, including sequence, evolutionary conservation, and secondary structure, and performs all the necessary preprocessing steps, allowing simple input such as genomic coordinates. The network architecture is selected and fully customized by the user, from the number and types of the layers to each layer's precise set-up. ENNGene then deals with all steps of training and evaluation of the model, exporting valuable metrics such as multi-class ROC and precision-recall curve plots or TensorBoard log files. To facilitate interpretation of the predicted results, we deploy Integrated Gradients, providing the user with a graphical representation of an attribution level of each input position. To showcase the usage of ENNGene, we train multiple models on the RBP24 dataset, quickly reaching the state of the art while improving the performance on more than half of the proteins by including the evolutionary conservation score and tuning the network per protein. Conclusions As the role of DL in big data analysis in the near future is indisputable, it is important to make it available for a broader range of researchers. 
We believe that an easy-to-use tool such as ENNGene can allow Genomics researchers without a background in Computational Sciences to harness the power of DL to gain better insights into and extract important information from the large amounts of data available in the field. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08414-x.
Affiliation(s)
- Eliška Chalupová
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, Brno, Czechia; Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czechia
- Ondřej Vaculík
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, Brno, Czechia; Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czechia
- Jakub Poláček
  - Faculty of Informatics, Masaryk University, Brno, Czechia
- Filip Jozefov
  - Faculty of Informatics, Masaryk University, Brno, Czechia
- Tomáš Majtner
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czechia
- Panagiotis Alexiou
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czechia
|
38
|
Liu Q, Yu J, Cai Y, Zhang G, Dai X. SAAED: Embedding and Deep Learning Enhance Accurate Prediction of Association Between circRNA and Disease. Front Genet 2022; 13:832244. [PMID: 35273640 PMCID: PMC8902643 DOI: 10.3389/fgene.2022.832244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 01/17/2022] [Indexed: 11/13/2022]
Abstract
Emerging evidence indicates that circRNA can regulate various diseases. However, the mechanisms of circRNA in these diseases have not been fully understood. Therefore, detecting potential circRNA–disease associations has far-reaching significance for pathological development and treatment of these diseases. In recent years, deep learning models are used in association analysis of circRNA–disease, but a lack of circRNA–disease association data limits further improvement. Therefore, there is an urgent need to mine more semantic information from data. In this paper, we propose a novel method called Semantic Association Analysis by Embedding and Deep learning (SAAED), which consists of two parts, a neural network embedding model called Entity Relation Network (ERN) and a Pseudo-Siamese network (PSN) for analysis. ERN can fuse multiple sources of data and express the information with low-dimensional embedding vectors. PSN can extract the feature between circRNA and disease for the association analysis. CircRNA–disease, circRNA–miRNA, disease–gene, disease–miRNA, disease–lncRNA, and disease–drug association information are used in this paper. More association data can be introduced for analysis without restriction. Based on the CircR2Disease benchmark dataset for evaluation, a fivefold cross-validation experiment showed an AUC of 98.92%, an accuracy of 95.39%, and a sensitivity of 93.06%. Compared with other state-of-the-art models, SAAED achieves the best overall performance. SAAED can expand the expression of the biological related information and is an efficient method for predicting potential circRNA–disease association.
Affiliation(s)
- Qingyu Liu
  - School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou, China
- Junjie Yu
  - Macquarie Business School, Macquarie University, Sydney, NSW, Australia
- Yanning Cai
  - College of Information Science and Technology, Jinan University, Guangzhou, China
- Guishan Zhang
  - College of Engineering, Shantou University, Shantou, China
- Xianhua Dai
  - School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou, China
|
39
|
Yu B, Wang X, Zhang Y, Gao H, Wang Y, Liu Y, Gao X. RPI-MDLStack: Predicting RNA-protein interactions through deep learning with stacking strategy and LASSO. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108676] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023]
|
40
|
Yang Y, Hou Z, Wang Y, Ma H, Sun P, Ma Z, Wong KC, Li X. HCRNet: high-throughput circRNA-binding event identification from CLIP-seq data using deep temporal convolutional network. Brief Bioinform 2022; 23:6533504. [PMID: 35189638 DOI: 10.1093/bib/bbac027] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/09/2021] [Revised: 01/03/2022] [Accepted: 01/17/2022] [Indexed: 01/11/2023]
Abstract
Identifying genome-wide binding events between circular RNAs (circRNAs) and RNA-binding proteins (RBPs) can greatly facilitate our understanding of functional mechanisms within circRNAs. Thanks to the development of cross-linked immunoprecipitation sequencing technology, large amounts of genome-wide circRNA binding event data have accumulated, providing opportunities for designing high-performance computational models to discriminate RBP interaction sites and thus to interpret the biological significance of circRNAs. Unfortunately, there are still no computational models sufficiently flexible to accommodate circRNAs from different data scales and with various degrees of feature representation. Here, we present HCRNet, a novel end-to-end framework for identification of circRNA-RBP binding events. To capture the hierarchical relationships, the multi-source biological information is fused to represent circRNAs, including various natural language sequence features. Furthermore, a deep temporal convolutional network incorporating global expectation pooling was developed to exploit the latent nucleotide dependencies in an exhaustive manner. We benchmarked HCRNet on 37 circRNA datasets and 31 linear RNA datasets to demonstrate the effectiveness of our proposed method. To evaluate further the model's robustness, we performed HCRNet on a full-length dataset containing 740 circRNAs. Results indicate that HCRNet generally outperforms existing methods. In addition, motif analyses were conducted to exhibit the interpretability of HCRNet on circRNAs. All supporting source code and data can be downloaded from https://github.com/yangyn533/HCRNet and https://doi.org/10.6084/m9.figshare.16943722.v1. And the web server of HCRNet is publicly accessible at http://39.104.118.143:5001/.
Affiliation(s)
- Yuning Yang
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Zilong Hou
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
- Yansong Wang
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
- Hongli Ma
  - School of Mathematics, Shandong University, Jinan, Shandong, China
- Pingping Sun
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Zhiqiang Ma
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Ka-Chun Wong
  - School of Computer Science, City University of Hong Kong, Hong Kong SAR
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
|
41
|
Wang Y, Yang Y, Ma Z, Wong KC, Li X. EDCNN: identification of genome-wide RNA-binding proteins using evolutionary deep convolutional neural network. Bioinformatics 2022; 38:678-686. [PMID: 34694393 DOI: 10.1093/bioinformatics/btab739] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/12/2021] [Revised: 10/14/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION: RNA-binding proteins (RBPs) are a group of proteins associated with RNA regulation and metabolism, and play an essential role in mediating the maturation, transport, localization and translation of RNA. Recently, genome-wide RNA-binding event detection methods have been developed to predict RBPs. Unfortunately, the existing computational methods usually suffer from limitations such as high dimensionality, data sparsity and low model performance. RESULTS: Deep convolutional neural networks are well suited to high-dimensional and sparse data. To further improve their performance, we propose the evolutionary deep convolutional neural network (EDCNN) to identify protein-RNA interactions by synergizing evolutionary optimization with gradient descent to enhance the deep convolutional neural network. In particular, EDCNN combines evolutionary algorithms and different gradient descent models in a complementary algorithm, where the gradient descent and evolution steps alternately optimize the RNA-binding event search. To validate the performance of EDCNN, an experiment is conducted on two large-scale CLIP-seq datasets, and the results reveal that EDCNN provides superior performance to other state-of-the-art methods. Furthermore, time complexity analysis, parameter analysis and motif analysis are conducted to demonstrate the effectiveness of our proposed algorithm from several perspectives. AVAILABILITY AND IMPLEMENTATION: The EDCNN algorithm is available at GitHub: https://github.com/yaweiwang1232/EDCNN. Both the software and the supporting data can be downloaded from: https://figshare.com/articles/software/EDCNN/16803217. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yawei Wang
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
- Yuning Yang
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Zhiqiang Ma
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
|
42
|
DFpin: Deep learning-based protein-binding site prediction with feature-based non-redundancy from RNA level. Comput Biol Med 2022; 142:105216. [PMID: 35030497 DOI: 10.1016/j.compbiomed.2022.105216] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/01/2021] [Revised: 12/19/2021] [Accepted: 01/02/2022] [Indexed: 11/20/2022]
Abstract
The interaction between proteins and RNA is closely related to various human diseases. Computer-aided drug design can be facilitated by detecting the RNA sites that bind proteins. However, due to the aggregation of binding sites in RNA sequences, high sample similarity occurs when extracting RNA fragments by using a sliding window. Considering these problems, we present a method, DFpin, to predict protein-interacting nucleotides in RNA. To retain more key nucleotide sites, we used the redundancy method based on feature similarity, that is, feature redundancy is removed based on the RNA mono-nucleotide composition to maintain the diversity of RNA samples and avoid the residue of redundant data. In addition, to extract key abstract features and avoid over-fitting, we used the cascade structure of a deep forest model to predict protein-interacting nucleotides. Overall, DFpin demonstrated excellent classification with 85.4% accuracy and 93.3% area under the curve. Compared with other methods, the accuracy of DFpin was better, suggesting that feature-based redundancy removal and deep forest can help predict nucleotides of protein interactions. The source code and all dataset are available at: https://github.com/zhaoxj-tech/DFpin.git.
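The two preprocessing ideas in this abstract, sliding-window fragment extraction and redundancy removal based on mono-nucleotide composition, can be sketched roughly as follows. The abstract does not give the window size or the similarity criterion, so treating two fragments as redundant when their A/C/G/U counts match exactly is an assumption, as are the function names and the toy sequence.

```python
from typing import List, Tuple


def sliding_windows(seq: str, size: int, step: int = 1) -> List[str]:
    """Cut a sequence into overlapping fragments of fixed length."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def composition(fragment: str) -> Tuple[int, ...]:
    """Mono-nucleotide composition: counts of A, C, G and U in a fragment."""
    return tuple(fragment.count(base) for base in "ACGU")


def drop_redundant(fragments: List[str]) -> List[str]:
    """Keep the first fragment seen for each composition vector.

    Assumption: exact-match on composition stands in for the paper's
    feature-similarity criterion, which the abstract does not specify.
    """
    seen = set()
    kept = []
    for fragment in fragments:
        key = composition(fragment)
        if key not in seen:
            seen.add(key)
            kept.append(fragment)
    return kept


# Toy example: adjacent windows often share the same composition.
fragments = sliding_windows("AUGGCAUG", size=4)
unique = drop_redundant(fragments)
```

In this toy case the last two windows, `GCAU` and `CAUG`, have identical base counts, so only the first is kept.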
|
43
|
Niu M, Zou Q, Lin C. CRBPDL: Identification of circRNA-RBP interaction sites using an ensemble neural network approach. PLoS Comput Biol 2022; 18:e1009798. [PMID: 35051187 PMCID: PMC8806072 DOI: 10.1371/journal.pcbi.1009798] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 09/23/2021] [Revised: 02/01/2022] [Accepted: 01/02/2022] [Indexed: 02/06/2023]
Abstract
Circular RNAs (circRNAs) are non-coding RNAs with a special circular structure formed by the reverse splicing mechanism. Increasing evidence shows that circular RNAs can directly bind to RNA-binding proteins (RBPs) and play an important role in a variety of biological activities. The interactions between circRNAs and RBPs are key to comprehending the mechanism of posttranscriptional regulation. Accurately identifying binding sites is very useful for analyzing interactions. In past research, some predictors based on machine learning (ML) have been presented, but prediction accuracy still needs to be improved. Therefore, we present a novel computational model, CRBPDL, which uses an Adaboost integrated deep hierarchical network to identify circular RNA-RBP binding sites. CRBPDL combines five different feature encoding schemes to encode the original RNA sequence and uses deep multiscale residual networks (MSRN) and bidirectional gated recurrent units (BiGRUs) to effectively learn high-level feature representations, extracting local and global context information at the same time. Additionally, a self-attention mechanism is employed to improve the robustness of CRBPDL. Ultimately, the Adaboost algorithm is applied to integrate the deep learning (DL) model to improve the prediction performance and reliability of the model. To verify the usefulness of CRBPDL, we compared its efficiency with state-of-the-art methods on 37 circular RNA data sets and 31 linear RNA data sets. The results show that CRBPDL is general, reliable, and robust. The code and data sets are available at https://github.com/nmt315320/CRBPDL.git. A growing body of evidence shows that circular RNA can directly bind to proteins and participate in many different biological processes. Computational methods can quickly and accurately predict the binding sites of circular RNAs and RBPs. 
In order to identify the interaction of circRNA with 37 different types of circRNA binding proteins, we developed an integrated deep learning network based on hierarchical network, called CRBPDL. It can effectively learn high-level feature representations. The performance of the model was verified through comparative experiments of different feature extraction algorithms, different deep learning models and classifier models. Moreover, the CRBPDL model was applied to 31 linear RNAs, and the effectiveness of our method was proved by comparison with the results of current excellent algorithms. It is expected that the CRBPDL model can effectively predict the binding site of circular RNA-RBP and provide reliable candidates for further biological experiments.
Affiliation(s)
- Mengting Niu
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- Chen Lin
  - School of Informatics, Xiamen University, Xiamen, China
|
44
|
Wang Z, Lei X. Prediction of RBP binding sites on circRNAs using an LSTM-based deep sequence learning architecture. Brief Bioinform 2021; 22:6355419. [PMID: 34415289 DOI: 10.1093/bib/bbab342] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/03/2021] [Revised: 07/14/2021] [Accepted: 08/02/2021] [Indexed: 01/22/2023]
Abstract
Circular RNAs (circRNAs) are widely expressed in highly diverged eukaryotes. Although circRNAs have been known for many years, their function remains unclear. Interaction with RNA-binding protein (RBP) to influence post-transcriptional regulation is considered to be an important pathway for circRNA function, such as acting as an oncogenic RBP sponge to inhibit cancer. In this study, we design a deep learning framework, CRPBsites, to predict the binding sites of RBPs on circRNAs. In this model, the sequences of variable-length binding sites are transformed into embedding vectors by word2vec model. Bidirectional LSTM is used to encode the embedding vectors of binding sites, and then they are fed into another LSTM decoder for decoding and classification tasks. To train and test the model, we construct four datasets that contain sequences of variable-length binding sites on circRNAs, and each set corresponds to an RBP, which is overexpressed in bladder cancer tissues. Experimental results on four datasets and comparison with other existing models show that CRPBsites has superior performance. Afterwards, we found that there were highly similar binding motifs in the four binding site datasets. Finally, we applied well-trained CRPBsites to identify the binding sites of IGF2BP1 on circCDYL, and the results proved the effectiveness of this method. In conclusion, CRPBsites is an effective prediction model for circRNA-RBP interaction site identification. We hope that CRPBsites can provide valuable guidance for experimental studies on the influence of circRNA on post-transcriptional regulation.
Affiliation(s)
- Zhengfeng Wang
  - School of Computer Science, Shaanxi Normal University, Xi'an, China; College of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
|
45
|
Tayara H, Chong KT. Improved Predicting of The Sequence Specificities of RNA Binding Proteins by Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2526-2534. [PMID: 32191896 DOI: 10.1109/tcbb.2020.2981335] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/10/2023]
Abstract
RNA-binding proteins (RBPs) have a significant role in various regulatory tasks. However, the mechanism by which RBPs identify the subsequence target RNAs is still not clear. In recent years, several machine and deep learning-based computational models have been proposed for understanding the binding preferences of RBPs. These methods required integrating multiple features with raw RNA sequences such as secondary structure and their performances can be further improved. In this paper, we propose an efficient and simple convolution neural network, RBPCNN, that relies on the combination of the raw RNA sequence and evolutionary information. We show that conservation scores (evolutionary information) for the RNA sequences can significantly improve the overall performance of the proposed predictor. In addition, the automatic extraction of the binding sequence motifs can enhance our understanding of the binding specificities of RBPs. The experimental results show that RBPCNN outperforms significantly the current state-of-the-art methods. More specifically, the average area under the receiver operator curve was improved by 2.67 percent and the mean average precision was improved by 8.03 percent. The datasets and results can be downloaded from https://home.jbnu.ac.kr/NSCL/RBPCNN.htm.
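One simple way to combine a raw RNA sequence with evolutionary information is to append a per-position conservation score to a one-hot base encoding. The sketch below illustrates that idea only; the abstract does not state how RBPCNN actually merges the two inputs, and the function name `encode` and the score values are hypothetical.

```python
from typing import List

BASES = "ACGU"


def encode(seq: str, conservation: List[float]) -> List[List[float]]:
    """Return one row per nucleotide: 4 one-hot values plus 1 conservation score.

    Assumption: conservation is supplied as one float per position; how the
    published model integrates it is not described in the abstract.
    """
    if len(seq) != len(conservation):
        raise ValueError("sequence and conservation track must have equal length")
    rows = []
    for base, score in zip(seq.upper(), conservation):
        one_hot = [1.0 if base == b else 0.0 for b in BASES]
        rows.append(one_hot + [score])
    return rows


# Illustrative input: four nucleotides with made-up conservation scores.
features = encode("ACGU", [0.1, 0.9, 0.5, 0.0])
```

Each row has five channels, so a sequence of length L becomes an L x 5 matrix that a convolutional layer could consume.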
|
46
|
Li H, Deng Z, Yang H, Pan X, Wei Z, Shen HB, Choi KS, Wang L, Wang S, Wu J. circRNA-binding protein site prediction based on multi-view deep learning, subspace learning and multi-view classifier. Brief Bioinform 2021; 23:6375057. [PMID: 34571539 DOI: 10.1093/bib/bbab394] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/28/2021] [Revised: 08/08/2021] [Accepted: 08/30/2021] [Indexed: 12/22/2022]
Abstract
Circular RNAs (circRNAs) generally bind to RNA-binding proteins (RBPs) to play an important role in the regulation of autoimmune diseases. Thus, it is crucial to study the binding sites of RBPs on circRNAs. Although many methods, including traditional machine learning and deep learning, have been developed to predict the interactions between RNAs and RBPs, most of them focus on linear RNAs. At present, few studies have been done on the binding relationships between circRNAs and RBPs. Thus, in-depth research is urgently needed. In the existing circRNA-RBP binding site prediction methods, circRNA sequences are the main research subjects, but the relevant characteristics of circRNAs, such as the structure and composition information of circRNA sequences, have not been fully exploited. Some methods have extracted different views to construct recognition models, but how to efficiently use the multi-view data to construct recognition models is still not well studied. Considering the above problems, this paper proposes a multi-view classification method called DMSK based on multi-view deep learning, subspace learning and a multi-view classifier for the identification of circRNA-RBP interaction sites. In the DMSK method, first, we converted circRNA sequences into pseudo-amino acid sequences and pseudo-dipeptide components for extracting high-dimensional sequence features and component features of circRNAs, respectively. Then, the structure prediction method RNAfold was used to predict the secondary structure of the RNA sequences, and the sequence embedding model was used to extract the context-dependent features. Next, we fed the raw features of the above four views to a hybrid network, which is composed of a convolutional neural network and a long short-term memory network, to obtain the deep features of circRNAs. Furthermore, we used view-weighted generalized canonical correlation analysis to extract the common features of the four views by subspace learning. 
Finally, the learned subspace common features and multi-view deep features were fed to train the downstream multi-view TSK fuzzy system to construct a fuzzy rule and fuzzy inference-based multi-view classifier. The trained classifier was used to predict the specific positions of the RBP binding sites on the circRNAs. The experiments show that the prediction performance of the proposed method DMSK has been improved compared with the existing methods. The code and dataset of this study are available at https://github.com/Rebecca3150/DMSK.
Affiliation(s)
- Hui Li
- Jiangnan University, Wuxi, Jiangsu 214012, China
| | - Zhaohong Deng
- School of Artificial Intelligence and Computer Science of Jiangnan University, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (LCNBI) and ZJLab, Wuxi, Jiangsu 214012, China
| | - Haitao Yang
- Jiangnan University, Wuxi, Jiangsu 214012, China
| | - Xiaoyong Pan
- Department of Automation of Shanghai Jiao Tong University, Wuxi, Jiangsu 214012, China
| | - Zhisheng Wei
- School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University, Wuxi, Jiangsu 214012, China
| | - Hong-Bin Shen
- Shanghai Jiao Tong University, Wuxi, Jiangsu 214012, China
| | - Kup-Sze Choi
- Hong Kong Polytechnic University, Wuxi, Jiangsu 214012, China
| | - Lei Wang
- School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University, Wuxi, Jiangsu 214012, China
| | - Shitong Wang
- School of Artificial Intelligence and Computer Science of Jiangnan University, Wuxi, Jiangsu 214012, China
| | - Jing Wu
- School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University, Wuxi, Jiangsu 214012, China
| |
|
47
|
Das A, Sinha T, Shyamal S, Panda AC. Emerging Role of Circular RNA-Protein Interactions. Noncoding RNA 2021; 7:48. [PMID: 34449657 PMCID: PMC8395946 DOI: 10.3390/ncrna7030048] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 07/01/2021] [Revised: 07/26/2021] [Accepted: 07/29/2021] [Indexed: 12/17/2022]
Abstract
Circular RNAs (circRNAs) are emerging as novel regulators of gene expression in various biological processes. CircRNAs interact with cellular regulators such as microRNAs and RNA-binding proteins (RBPs) to regulate downstream gene expression. The accumulation of high-throughput RNA-protein interaction data has revealed the interaction of RBPs with coding and noncoding RNAs, including the recently discovered circRNAs. RBPs are a large family of proteins known to play a critical role in gene expression by modulating RNA splicing, nuclear export, mRNA stability, localization, and translation. However, the interaction of RBPs with circRNAs and its implications for circRNA biogenesis and function have only emerged in the last few years. Recent studies suggest that circRNA interaction with target proteins modulates the interaction of the protein with downstream target mRNAs or proteins. This review outlines the emerging mechanisms of circRNA-protein interactions and their functional role in cell physiology.
Affiliation(s)
- Arundhati Das
- Institute of Life Sciences, Nalco Square, Bhubaneswar 751023, India; (A.D.); (T.S.); (S.S.)
- School of Biotechnology, KIIT University, Bhubaneswar 751024, India
| | - Tanvi Sinha
- Institute of Life Sciences, Nalco Square, Bhubaneswar 751023, India; (A.D.); (T.S.); (S.S.)
| | - Sharmishtha Shyamal
- Institute of Life Sciences, Nalco Square, Bhubaneswar 751023, India; (A.D.); (T.S.); (S.S.)
| | - Amaresh Chandra Panda
- Institute of Life Sciences, Nalco Square, Bhubaneswar 751023, India; (A.D.); (T.S.); (S.S.)
| |
|
48
|
Wu H, Pan X, Yang Y, Shen HB. Recognizing binding sites of poorly characterized RNA-binding proteins on circular RNAs using attention Siamese network. Brief Bioinform 2021; 22:6326526. [PMID: 34297803 DOI: 10.1093/bib/bbab279] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/03/2021] [Revised: 06/04/2021] [Accepted: 07/01/2021] [Indexed: 12/24/2022]
Abstract
Circular RNAs (circRNAs) interact with RNA-binding proteins (RBPs) to play crucial roles in gene regulation and disease development. Computational approaches have attracted much attention for quickly predicting high-potential RBP binding sites on circRNAs using statistical binding knowledge from sequence or structure. Deep learning is one of the popular learning models in this area but usually requires a lot of labeled training data. It performs unsatisfactorily for less characterized RBPs with a limited number of known target circRNAs. How to improve the prediction performance for such RBPs with few labeled samples is a challenging task for deep learning-based models. In this study, we propose an RBP-specific method, iDeepC, for predicting RBP binding sites on circRNAs from sequences. It adopts a Siamese neural network consisting of a lightweight attention module and a metric module. We have found that the Siamese neural network effectively enhances the network's capability of capturing mutual information between circRNAs with pairwise metric learning. To further deal with the small-sample size problem, we pretrained the model using available labeled data from other RBPs and demonstrated the efficacy of this transfer-learning pipeline. We comprehensively evaluated iDeepC on the benchmark datasets of RBP-binding circRNAs, and the results suggest that iDeepC achieves promising performance on the poorly characterized RBPs. The source code is available at https://github.com/hehew321/iDeepC.
Affiliation(s)
- Hehe Wu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Yang Yang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| |
|
49
|
Qaid TS, Mazaar H, Alqahtani MS, Raweh AA, Alakwaa W. Deep sequence modelling for predicting COVID-19 mRNA vaccine degradation. PeerJ Comput Sci 2021; 7:e597. [PMID: 34239977 PMCID: PMC8237341 DOI: 10.7717/peerj-cs.597] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/19/2021] [Accepted: 05/26/2021] [Indexed: 06/13/2023]
Abstract
The worldwide coronavirus (COVID-19) pandemic made dramatic and rapid progress in the year 2020 and requires urgent global effort to accelerate the development of a vaccine to stop the daily infections and deaths. Several types of vaccine have been designed to teach the immune system how to fight off certain kinds of pathogens. mRNA vaccines are the most important candidate vaccines because of their capacity for rapid development, high potency, safe administration and potential for low-cost manufacture. An mRNA vaccine acts by training the body to recognize and respond to the proteins produced by disease-causing organisms such as viruses or bacteria. This type of vaccine is the fastest candidate to treat COVID-19, but it currently faces several limitations. In particular, it is a challenge to design stable mRNA molecules because of the inefficient in vivo delivery of mRNA, its tendency for spontaneous degradation and low protein expression levels. This work designed and implemented a deep sequence model based on bidirectional GRU and LSTM models, applied to the Stanford COVID-19 mRNA vaccine dataset, to predict the mRNA sequences responsible for degradation by predicting five reactivity values for every position in the sequence. Four of these values determine the likelihood of degradation with/without magnesium at high pH (pH 10) and high temperature (50 degrees Celsius), and the fifth reactivity value is used to determine the likely secondary structure of the RNA sample. The model relies on two types of features, namely numerical and categorical features, where the categorical features are extracted from the mRNA sequences, structure and predicted loop. These features are represented and encoded by numbers, and then the features are extracted using embedding layer learning. There are five numerical features depending on the likelihood for each pair of nucleotides in the RNA. The model gives promising results: it predicts the five reactivity values with a validation mean columnwise root mean square error (MCRMSE) of 0.125 using the LSTM model with augmentation and the codon encoding method. Codon encoding outperforms base encoding in MCRMSE validation error using the LSTM model, whereas base encoding shows less over-fitting, with a difference between the training and validation loss of 0.008.
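For readers unfamiliar with the metric, MCRMSE is the root mean square error computed separately for each target column (here, each of the five reactivity values) and then averaged across columns. The following is a minimal NumPy sketch of that definition; the function name `mcrmse` and the array layout (targets on the last axis) are assumptions made for this example and are not taken from the paper's code.

```python
import numpy as np


def mcrmse(y_true, y_pred) -> float:
    """Mean columnwise RMSE; the last axis indexes the target columns."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    n_targets = y_true.shape[-1]
    # Flatten every leading axis (e.g. sequence, position) into samples.
    err = (y_true - y_pred).reshape(-1, n_targets)
    rmse_per_column = np.sqrt(np.mean(err ** 2, axis=0))
    return float(rmse_per_column.mean())


if __name__ == "__main__":
    # Two sequences, three positions each, five targets per position.
    truth = np.zeros((2, 3, 5))
    pred = truth + 0.1
    print(mcrmse(truth, pred))
```

Because the square root is taken per column before averaging, a single poorly predicted target raises the score less than it would under one pooled RMSE over all values.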
Affiliation(s)
- Talal S. Qaid
- Computer Science Department, College of Computer Science, King Khalid University, Abha, Saudi Arabia
- Faculty of Computer Science, Hodeidah University, Hodeidah, Yemen
| | - Hussein Mazaar
- Computer Science Department, College of Science & Arts in Tanumah, King Khalid University, Abha, Saudi Arabia
| | - Mohammed S. Alqahtani
- Radiological Sciences Department, College of Applied Medical Sciences, King Khalid University, Abha, Saudi Arabia
| | - Abeer A. Raweh
- Computer Science Department, College of Computer Science, King Khalid University, Abha, Saudi Arabia
- Faculty of Computer Science, Hodeidah University, Hodeidah, Yemen
| | - Wafaa Alakwaa
- Computer Science Department, College of Science & Arts in Tanumah, King Khalid University, Abha, Saudi Arabia
| |
|
50
|
Yu J, Sun S, Mao W, Xu B, Chen M. Identification of Enzalutamide Resistance-Related circRNA-miRNA-mRNA Regulatory Networks in Patients with Prostate Cancer. Onco Targets Ther 2021; 14:3833-3848. [PMID: 34188491 PMCID: PMC8232970 DOI: 10.2147/ott.s309917] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/10/2021] [Accepted: 05/25/2021] [Indexed: 12/20/2022]
Abstract
Purpose: This study aimed to identify enzalutamide resistance-related (EnzR-related) circRNAs and to characterize and validate a circRNA-miRNA-mRNA ceRNA regulatory network and the corresponding prognostic signature in prostate cancer (PCa) patients. Methods: We obtained a circRNA expression microarray from the Gene Expression Omnibus (GEO) database and performed differential expression analysis to identify EnzR-related circRNAs using the limma package. The miRNA and mRNA expression profiles were downloaded and subjected to differential expression analysis, then overlapped with predicted candidates. Next, we established the circRNA-miRNA-mRNA ceRNA network and the PPI network using Cytoscape software and the STRING database, respectively. In addition, univariate and Lasso Cox regression analyses were applied to generate a prognostic signature. Receiver operating characteristic (ROC) curves and Kaplan–Meier analysis were used to evaluate the reliability and sensitivity of the signature. Ultimately, we chose hsa_circ_0047641 to validate the feasibility of the EnzR-related ceRNA regulatory pathway using qRT-PCR, CCK8 and Transwell assays. Results: We identified 13 EnzR-related circRNAs and constructed a ceRNA regulatory network that contained two downregulated circRNAs (hsa_circ_00000919 and hsa_circ_0000036) and two upregulated circRNAs (hsa_circ_0047641 and hsa_circ_0068697), together with the 6 miRNAs they sponge and 167 targeted mRNAs. Subsequently, these targeted mRNAs were used for PPI analysis, which identified 10 hub genes. Functional enrichment analysis provided new ways to seek potential biological functions. Besides, we established a prognostic signature for PCa patients based on 8 prognosis-associated mRNAs. We confirmed that the survival rates of PCa patients in the high-risk subgroup were slightly lower than those in the low-risk subgroup in the TCGA dataset (p<0.001), and ROC curves revealed that the AUC value for the prognostic signature was 0.816. Finally, the functional analysis suggested that knockdown of hsa_circ_0047641 could inhibit the progression of PCa and could reverse Enz-resistance in vitro. Conclusion: We identified 13 EnzR-related circRNAs, and constructed and confirmed the EnzR-related circRNA-miRNA-mRNA ceRNA network and corresponding prognostic signature, which could serve as a useful prognostic biomarker and therapeutic target.
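As a reading aid for the reported AUC, the area under a ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes that quantity directly from pairs. It assumes a simple binary outcome and ignores censoring, so it is not the survival-specific ROC analysis used in the study; the function name `roc_auc` and the toy data are assumptions introduced for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy risk scores: 1 = event observed, 0 = no event.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The double loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation gives the same value in O(n log n) for large cohorts.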
Affiliation(s)
- JunJie Yu
- Surgical Research Center, Institute of Urology, School of Medicine, Southeast University, Nanjing, People's Republic of China; Department of Medical College, Southeast University, Nanjing, Jiangsu, People's Republic of China
| | - Si Sun
- Surgical Research Center, Institute of Urology, School of Medicine, Southeast University, Nanjing, People's Republic of China; Department of Medical College, Southeast University, Nanjing, Jiangsu, People's Republic of China
| | - WeiPu Mao
- Surgical Research Center, Institute of Urology, School of Medicine, Southeast University, Nanjing, People's Republic of China; Department of Medical College, Southeast University, Nanjing, Jiangsu, People's Republic of China
| | - Bin Xu
- Department of Urology, Affiliated Zhongda Hospital of Southeast University, Nanjing, People's Republic of China
| | - Ming Chen
- Department of Urology, Affiliated Zhongda Hospital of Southeast University, Nanjing, People's Republic of China; Institute of Urology, Southeastern University, Nanjing, People's Republic of China; Department of Urology, Affiliated Lishui People's Hospital of Southeast University, Nanjing, People's Republic of China
| |
|