1
Zafarjafarzadeh N, Feridouni E, Sobhani-Moghaddam S, Amini J, Mollazadeh S, Ataei R, Ghomi H, Beyer C, Sanadgol N. Dynamics and role of covalently-closed circular RNAs in Alzheimer's disease: A review of experimental and bioinformatics studies. Neurobiol Aging 2025; 151:54-69. [PMID: 40239316 DOI: 10.1016/j.neurobiolaging.2025.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2024] [Revised: 04/02/2025] [Accepted: 04/06/2025] [Indexed: 04/18/2025]
Abstract
Alzheimer's disease (AD) is an age-associated disorder characterized by cognitive decline, with dementia representing the final stage of a complex clinical-biological process rather than simply a more severe form of cognitive decline. Circular RNAs (circRNAs), novel non-coding RNAs, have emerged as key regulators of brain function and associated disorders. This study explores the role of circRNAs in AD by reviewing experimentally validated circRNAs in human and animal models. We identified 10 human (seven pathogenic, three protective) and six animal (three pathogenic, three protective) AD-related circRNAs. Experimental studies have confirmed that human protective circRNAs are predominantly downregulated in AD, where they function by sequestering specific miRNAs within cells, particularly miR-7, miR-142-5p, and miR-217, which have well-recognized neuroinflammatory functions. In-silico analysis revealed that circLPAR1 (pathogenic), circHUWE1 (pathogenic), and circHOMER1 (protective) interact with miRNAs that mainly control AD-related genes. Notably, circHOMER1 plays a key role in regulating multiple AD-related pathways, including autophagy, apoptosis, PI3K-AKT signaling, and amyloid fiber formation. Furthermore, circRNA/protein interaction analysis revealed that circHUWE1 predominantly associates with RNA transport proteins, whereas circHOMER1 interacts with proteins involved in mRNA surveillance pathways. Remarkably, docking analysis demonstrated that circAβ-a (pathogenic) exhibits a strong affinity for eukaryotic translation initiation factor 4A3 protein, while circHOMER1 shows a higher binding affinity for DGCR8 microprocessor complex subunit protein. Our study presents a concise list of circRNAs as potential key targets for further investigation in AD research. Future experimental research is essential to uncover their precise mechanisms and assess their potential as biomarkers, offering promising avenues for developing interventions to alleviate cognitive decline in AD.
Affiliation(s)
- Nikta Zafarjafarzadeh
  - Department of Cellular and Molecular Biology, Faculty of Advanced Science and Technology, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Elham Feridouni
  - Department of Biology, Gonbad Kavous University, Golestan, Iran
- Javad Amini
  - Natural Products and Medicinal Plants Research Center, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Samaneh Mollazadeh
  - Natural Products and Medicinal Plants Research Center, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Reza Ataei
  - Department of Biology, Western University, London, Canada
- Hamed Ghomi
  - Department for Life Quality Studies, Alma Mater Studiorum, University of Bologna, Bologna, Italy
- Cordian Beyer
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, Aachen 52074, Germany
- Nima Sanadgol
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, Aachen 52074, Germany
2
Zeng M, Zhang X, Li Y, Lu C, Yin R, Guo F, Li M. RNALoc-LM: RNA subcellular localization prediction using pre-trained RNA language model. Bioinformatics 2025; 41:btaf127. [PMID: 40119908 PMCID: PMC11978386 DOI: 10.1093/bioinformatics/btaf127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2025] [Revised: 02/28/2025] [Accepted: 03/19/2025] [Indexed: 03/25/2025] Open
Abstract
MOTIVATION: Accurately predicting RNA subcellular localization is crucial for understanding the cellular functions and regulatory mechanisms of RNAs. Although many computational methods have been developed to predict the subcellular localization of lncRNAs, miRNAs, and circRNAs, very few of them are designed to simultaneously predict the subcellular localization of multiple types of RNAs. In addition, pre-trained RNA language models have recently shown remarkable performance in various bioinformatics tasks, such as structure prediction and functional annotation. Despite these advancements, there remains a significant gap in applying pre-trained RNA language models specifically to predicting RNA subcellular localization. RESULTS: In this study, we propose RNALoc-LM, the first interpretable deep-learning framework that leverages a pre-trained RNA language model for predicting RNA subcellular localization. RNALoc-LM uses a pre-trained RNA language model to encode RNA sequences, then captures local patterns and long-range dependencies through TextCNN and BiLSTM modules. A multi-head attention mechanism is used to focus on important regions within the RNA sequences. The results demonstrate that RNALoc-LM significantly outperforms both deep-learning baselines and existing state-of-the-art predictors. Additionally, motif analysis highlights RNALoc-LM's potential for discovering important motifs, while an ablation study confirms the effectiveness of the RNA sequence embeddings generated by the pre-trained RNA language model. AVAILABILITY AND IMPLEMENTATION: The RNALoc-LM web server is available at http://csuligroup.com:8000/RNALoc-LM. The source code can be obtained from https://github.com/CSUBioGroup/RNALoc-LM.
Affiliation(s)
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Xinyu Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yiming Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Chengqian Lu
  - School of Computer Science, Key Laboratory of Intelligent Computing and Information Processing, Xiangtan University, Xiangtan, Hunan 411105, China
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL 32603, United States
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
3
Wang S, Yu ZG, Han GS, Sun XG. CFPLncLoc: A multi-label lncRNA subcellular localization prediction based on Chaos game representation and centralized feature pyramid. Int J Biol Macromol 2025; 297:139519. [PMID: 39761904 DOI: 10.1016/j.ijbiomac.2025.139519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 01/01/2025] [Accepted: 01/03/2025] [Indexed: 01/20/2025]
Abstract
There is increasing evidence that the subcellular localization of long noncoding RNAs (lncRNAs) can provide valuable insights into their biological functions. At the transcriptome level, lncRNAs are usually found in multiple subcellular localizations. Although several computational methods have been developed to predict the subcellular localization of lncRNAs, few of them were designed for lncRNAs that have multiple subcellular localizations. In this study, we propose a novel deep learning model, called CFPLncLoc, which uses chaos game representation (CGR) images of lncRNA sequences to predict multi-label lncRNA subcellular localization. CFPLncLoc uses an image update strategy (IUS) to enhance the relative feature representation of the CGR images. To extract higher-level features from CGR images, CFPLncLoc introduces a multi-scale feature fusion (MFF) model, the centralized feature pyramid (CFP), from the field of computer vision (CV). Ablation studies confirmed the contribution of the IUS and CFP in improving the prediction performance. Statistical test results verify that CFPLncLoc outperforms existing state-of-the-art predictors under the evaluation metric MaAUC on the hold-out/independent test set. The source code can be obtained from https://github.com/ShengWang-XTU/CFPLncLoc.
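As a rough illustration of the chaos game representation mentioned in this abstract, the sketch below maps an RNA sequence to CGR points and bins them into a small count image. The corner assignment (A, C, G, U), the starting point, and the grid resolution are common conventions chosen here as assumptions; they are not taken from CFPLncLoc, and the IUS and CFP stages are not reproduced.

```python
from typing import List, Tuple

# Assumed corner layout of the unit square (a common convention, not from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return one CGR point per valid nucleotide, starting from the square's centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper().replace("T", "U"):
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_image(seq: str, resolution: int = 8) -> List[List[int]]:
    """Bin CGR points into a resolution x resolution grid of counts."""
    grid = [[0] * resolution for _ in range(resolution)]
    for x, y in cgr_points(seq):
        col = min(int(x * resolution), resolution - 1)
        row = min(int(y * resolution), resolution - 1)
        grid[row][col] += 1
    return grid


print(cgr_points("AG"))  # [(0.25, 0.25), (0.625, 0.625)]
```

A grid like the one returned by `cgr_image` is the kind of image-shaped input a vision-style model could consume; how CFPLncLoc actually builds and updates its images is described only in the paper and its repository.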
Affiliation(s)
- Sheng Wang
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
- Zu-Guo Yu
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
- Guo-Sheng Han
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
- Xin-Gen Sun
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
4
Hu W, Yue Y, Yan R, Guan L, Li M. An ensemble deep learning framework for multi-class LncRNA subcellular localization with innovative encoding strategy. BMC Biol 2025; 23:47. [PMID: 39984880 PMCID: PMC11846348 DOI: 10.1186/s12915-025-02148-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Accepted: 02/03/2025] [Indexed: 02/23/2025] Open
Abstract
BACKGROUND: Long non-coding RNAs (lncRNAs) play pivotal roles in various cellular processes, and elucidating their subcellular localization can offer crucial insights into their functional significance. Accurate prediction of lncRNA subcellular localization is therefore of paramount importance. Despite numerous computational methods developed for this purpose, existing approaches still encounter challenges stemming from the complexity of data representation and the difficulty in capturing nucleotide distribution information within sequences. RESULTS: In this study, we propose a novel deep learning-based model, termed MGBLncLoc, which incorporates a unique multi-class encoding technique known as generalized encoding based on the Distribution Density of Multi-Class Nucleotide Groups (MCD-ND). This encoding approach reflects nucleotide distributions more precisely, distinguishing between constant and discriminative regions within sequences, thereby enhancing prediction performance. Additionally, our deep learning model integrates advanced neural network modules, including Multi-Dconv Head Transposed Attention, Gated-Dconv Feed-forward Network, Convolutional Neural Network, and Bidirectional Gated Recurrent Unit, to comprehensively exploit sequence features of lncRNAs. CONCLUSIONS: Comparative analysis against commonly used sequence feature encoding methods and existing prediction models validates the effectiveness of MGBLncLoc, demonstrating superior performance. This research offers novel insights and effective solutions for predicting lncRNA subcellular localization, thereby providing valuable support for related biological investigations.
Affiliation(s)
- Wenxing Hu
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Yan Yue
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Ruomei Yan
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Lixin Guan
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Mengshan Li
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
5
Wang S, Yu ZG, Han GS. MVSLLnc: LncRNA subcellular localization prediction based on multi-source features and two-stage voting strategy. Methods 2025; 234:324-332. [PMID: 39837434 DOI: 10.1016/j.ymeth.2025.01.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2024] [Revised: 12/28/2024] [Accepted: 01/16/2025] [Indexed: 01/23/2025] Open
Abstract
The subcellular localization of long non-coding RNAs (lncRNAs) is crucial for understanding their function. Since traditional biological experimental methods are time-consuming and some existing computational methods rely on high computing power, we set out to find a simple and easy-to-implement method for more efficient prediction of the subcellular localization of lncRNAs. In this work, we propose a model based on multi-source features and a two-stage voting strategy for predicting the subcellular localization of lncRNAs (MVSLLnc). The multi-source features include k-mer frequency, features based on the coordinate values of Chaos Game Representation (CGR), and features based on physicochemical properties (PhyChe). We feed the multi-source features into the traditional machine learning classifiers RF, SVM and XGBoost, respectively, and perform the final prediction task with a two-stage voting strategy. Experimental results on three benchmark datasets show that the accuracy can reach 0.829, 0.793 and 0.968, respectively. The accuracy on three independent test sets is 0.642, 0.737 and 0.518, respectively, which is competitive with existing methods. Our ablation analyses show that the two-stage voting strategy can make full use of the advantages of multi-source features and multiple classifiers, and obtain more robust results.
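To make the k-mer frequency feature and the idea of voting in two stages concrete, here is a minimal standard-library sketch. The abstract does not specify how the two stages are organized, so the scheme below (first vote across classifiers within each feature source, then vote across feature sources) is purely an assumption for illustration, and `two_stage_vote` and its input layout are hypothetical names rather than MVSLLnc's API.

```python
from collections import Counter
from itertools import product
from typing import Dict, List


def kmer_frequencies(seq: str, k: int = 3) -> List[float]:
    """Relative frequency of every k-mer over the alphabet ACGU, in lexicographic order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores windows with ambiguous bases
    return [counts[m] / total if total else 0.0 for m in kmers]


def majority_vote(labels: List[str]) -> str:
    """Most frequent label; ties go to the label that appears first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label
    raise ValueError("no labels to vote on")


def two_stage_vote(predictions: Dict[str, Dict[str, str]]) -> str:
    """Hypothetical scheme: vote per feature source, then across feature sources.

    predictions maps feature source -> classifier name -> predicted label.
    """
    stage_one = [majority_vote(list(by_clf.values())) for by_clf in predictions.values()]
    return majority_vote(stage_one)


example = {
    "kmer":   {"RF": "nucleus",   "SVM": "nucleus",   "XGBoost": "cytoplasm"},
    "CGR":    {"RF": "cytoplasm", "SVM": "cytoplasm", "XGBoost": "nucleus"},
    "PhyChe": {"RF": "nucleus",   "SVM": "cytoplasm", "XGBoost": "nucleus"},
}
print(two_stage_vote(example))  # nucleus
```

In practice the per-classifier labels would come from trained RF, SVM and XGBoost models; they are hard-coded here only so the voting logic can be read in isolation.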
Affiliation(s)
- Sheng Wang
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
- Zu-Guo Yu
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
- Guo-Sheng Han
  - National Center for Applied Mathematics in Hunan, Xiangtan University, Hunan 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan 411105, China
6
Du K, Xia Y, Wu Q, Yin M, Zhao H, Chen XW. Analysis of whole transcriptome reveals the immune response to porcine reproductive and respiratory syndrome virus infection and tylvalosin tartrate treatment in the porcine alveolar macrophages. Front Immunol 2025; 15:1506371. [PMID: 39872536 PMCID: PMC11769836 DOI: 10.3389/fimmu.2024.1506371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2024] [Accepted: 12/23/2024] [Indexed: 01/30/2025] Open
Abstract
Introduction: Porcine reproductive and respiratory syndrome virus (PRRSV) is a major pathogen that has caused severe economic losses in the swine industry. Screening key host immune-related genetic factors in porcine alveolar macrophages (PAMs) is critical for improving antiviral ability in pigs. Methods: In this study, an in vivo model was set up to evaluate the anti-PRRSV effect of tylvalosin tartrate. Then, strand-specific RNA-sequencing (ssRNA-seq) and miRNA-sequencing (miRNA-seq) were carried out to profile the whole transcriptome of PAMs in the negative control, PRRSV-infected, and tylvalosin tartrate-treated groups. Results: The ssRNA-seq identified 11,740 long non-coding RNAs in PAMs. Based on our attention mechanism-improved graph convolutional network, 41.07% and 28.59% of lncRNAs were predicted to be located in the nucleus and cytoplasm, respectively. The miRNA-seq revealed that tylvalosin tartrate-enhanced miRNAs might play roles in regulating angiogenesis and innate immune-related functions, and that tylvalosin tartrate rescued the expression of three anti-inflammatory miRNAs (ssc-miR-30a-5p, ssc-miR-218-5p, and ssc-miR-218) that were downregulated due to PRRSV infection. The cytoplasmic lncRNAs enhanced by tylvalosin tartrate might form ceRNA networks with miRNAs to regulate PAM chemotaxis, whereas cytoplasmic lncRNAs rescued by tylvalosin tartrate might protect PAMs via efferocytosis-related ceRNA networks. On the other hand, the tylvalosin tartrate-rescued nuclear lncRNAs might negatively regulate T cell apoptosis and bind to the key anti-inflammatory factor IL37 to protect the lungs by cis- and trans-regulation. Conclusions: Our data provide a catalog of key non-coding RNAs in response to PRRSV and tylvalosin tartrate and might enrich the genetic basis for future PRRSV prevention and control.
Affiliation(s)
- Xi-wen Chen
  - Animal Disease Prevention and Control and Healthy Breeding Engineering Technology Research Centre, Mianyang Normal University, Mianyang, China
7
Wu L, Wang L, Hu S, Tang G, Chen J, Yi Y, Xie H, Lin J, Wang M, Wang D, Yang B, Huang Y. RNALocate v3.0: Advancing the Repository of RNA Subcellular Localization with Dynamic Analysis and Prediction. Nucleic Acids Res 2025; 53:D284-D292. [PMID: 39404071 PMCID: PMC11701552 DOI: 10.1093/nar/gkae872] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/21/2024] [Revised: 09/18/2024] [Accepted: 09/24/2024] [Indexed: 01/18/2025] Open
Abstract
Subcellular localization of RNA is a crucial mechanism for regulating diverse biological processes within cells. Dynamic RNA subcellular localizations are essential for maintaining cellular homeostasis; however, their distribution and changes during development and differentiation remain largely unexplored. To elucidate the dynamic patterns of RNA distribution within cells, we have upgraded RNALocate to version 3.0, a repository for RNA-subcellular localization (http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/). RNALocate v3.0 incorporates and analyzes RNA subcellular localization sequencing data from over 850 samples, with a specific focus on the dynamic changes in subcellular localizations under various conditions. The species coverage has also been expanded to encompass mammals, non-mammals, plants and microbes. Additionally, we provide an integrated prediction algorithm for the subcellular localization of seven RNA types across eleven subcellular compartments, utilizing convolutional neural networks (CNNs) and transformer models. Overall, RNALocate v3.0 contains a total of 1 844 013 RNA-localization entries covering 26 RNA types, 242 species and 177 subcellular localizations. It serves as a comprehensive and readily accessible data resource for RNA-subcellular localization, facilitating the elucidation of cellular function and disease pathogenesis.
Affiliation(s)
- Le Wu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Luqi Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Shijie Hu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
  - Department of Pathology, Harbin Medical University, 157th Rd of Baojian, Nangang District, Harbin 150081, China
- Guangjue Tang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Jia Chen
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Ying Yi
  - Dermatology Hospital, Southern Medical University, No.2, Lujing Road, Yuexiu District, Guangzhou 510091, China
- Hailong Xie
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Jiahao Lin
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Mei Wang
  - State Key Laboratory of Organ Failure Research, Department of Developmental Biology, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Dong Wang
  - Dermatology Hospital, Southern Medical University, No.2, Lujing Road, Yuexiu District, Guangzhou 510091, China
  - Department of Bioinformatics, Guangdong Province Key Laboratory of Molecular Tumor Pathology, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
- Bin Yang
  - Dermatology Hospital, Southern Medical University, No.2, Lujing Road, Yuexiu District, Guangzhou 510091, China
- Yan Huang
  - Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, No.1023, South Shatai Road, Baiyun District, Guangzhou 510515, China
8
Amini J, Zafarjafarzadeh N, Ghahramanlu S, Mohammadalizadeh O, Mozaffari E, Bibak B, Sanadgol N. Role of Circular RNA MMP9 in Glioblastoma Progression: From Interaction With hnRNPC and hnRNPA1 to Affecting the Expression of BIRC5 by Sequestering miR-149. J Mol Recognit 2025; 38:e3109. [PMID: 39401767 DOI: 10.1002/jmr.3109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Revised: 07/29/2024] [Accepted: 09/27/2024] [Indexed: 01/04/2025]
Abstract
Glioblastoma multiforme (GBM) presents a significant challenge in neuro-oncology due to its aggressive behavior and self-renewal capacity. Circular RNAs (circRNAs), a subset of non-coding RNAs (ncRNAs) generated through mRNA back-splicing, are gaining attention as potential targets for GBM research. In our study, we sought to explore the functional role of circMMP9 (circular form of matrix metalloproteinase-9) as a promising therapeutic target for GBM through bioinformatic predictions and human data analysis. Our results suggest that circMMP9 functions as a sponge for miR-149 and miR-542, both upregulated in GBM based on microarray data. Kaplan-Meier analysis indicated that reduced levels of miR-149 and miR-542 correlate with worse survival outcomes in GBM, suggesting their role as tumor suppressors. Importantly, miR-149 has been demonstrated to inhibit the expression of BIRC5 (baculoviral inhibitor of apoptosis repeat-containing 5 or survivin), a significant promoter of proliferation in GBM. BIRC5 is not only upregulated in GBM but also in various other cancers, including neuroblastoma and other brain cancers. Our protein-protein interaction analysis highlights the significance of BIRC5 as a central hub gene in GBM. CircMMP9 seems to influence this complex relationship by suppressing miR-149 and miR-542, despite their increased expression in GBM. Additionally, we found that circMMP9 directly interacts with heterogeneous nuclear ribonucleoproteins C and A1 (hnRNPC and A1), although not within their protein-binding domains. This suggests that hnRNPC/A1 may play a role in transporting circMMP9. Moreover, RNA-seq data from GBM patient samples confirmed the increased expression of BIRC5, PIK3CB, and hnRNPC/A1, further emphasizing the potential therapeutic significance of circMMP9 in GBM. In this study, we propose for the first time a new epigenetic regulatory role for circMMP9, highlighting a novel aspect of its oncogenic function. 
circMMP9 may regulate BIRC5 expression in GBM by sponging miR-149 and miR-542. BIRC5, in turn, suppresses apoptosis and enhances proliferation in GBM. Nonetheless, more extensive studies are advised to delve deeper into the roles of circMMP9, especially in the context of glioma.
Affiliation(s)
- Javad Amini
  - Natural Products and Medicinal Plants Research Center, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Nikta Zafarjafarzadeh
  - Department of Cellular and Molecular Biology, Faculty of Advanced Science and Technology, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Sara Ghahramanlu
  - Blood Transfusion Department of Samenolaemeh Hospital, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Omid Mohammadalizadeh
  - Department of Agronomy and Plant Breeding, College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran
- Elaheh Mozaffari
  - Biotechnology Research Center, Islamic Azad University of Shahrekord Branch, Tehran, Iran
- Bahram Bibak
  - Natural Products and Medicinal Plants Research Center, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Nima Sanadgol
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, Aachen, Germany
9
Deng X, Liu L. BiGM-lncLoc: Bi-level Multi-Graph Meta-Learning for Predicting Cell-Specific Long Noncoding RNAs Subcellular Localization. Interdiscip Sci 2024:10.1007/s12539-024-00679-y. [PMID: 39724386 DOI: 10.1007/s12539-024-00679-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 11/11/2024] [Accepted: 11/18/2024] [Indexed: 12/28/2024]
Abstract
The precise spatiotemporal expression of long noncoding RNAs (lncRNAs) plays a pivotal role in biological regulation, and aberrant expression of lncRNAs in different subcellular localizations has been intricately linked to the onset and progression of a variety of cancers. Computational methods provide effective means for predicting lncRNA subcellular localization, but current studies ignore either cell line and tissue specificity or the correlation and shared information among cell lines. In this study, we propose a novel approach, BiGM-lncLoc, treating the prediction of lncRNA subcellular localization across cell lines as a multi-graph meta-learning task. Our investigation involves two categories of data: the localization data of nucleotide sequences in different cell lines and cell line expression data. BiGM-lncLoc comprises a cell line-specific optimization network that learns specific knowledge from cell line expression data and a graph neural network optimized across cell lines. Subsequently, the specific and shared knowledge acquired through bi-level optimization is applied to a new cell-line prediction task without the need for re-training or fine-tuning. Additionally, through key feature analysis of the impact of different nucleotide combinations on the model, we confirm the necessity of cell line-specific studies based on correlation analysis. Finally, experiments conducted on various cell lines with different data sizes indicate that BiGM-lncLoc outperforms other methods in terms of prediction accuracy, with an average accuracy of 97.7%. After removing overlapping samples to ensure data independence for each cell line, the accuracy ranged from 82.4% to 94.7%, still surpassing existing models. Our code can be found at https://github.com/BioCL1/BiGM-lncLoc.
Affiliation(s)
- Xi Deng
  - School of Information, Yunnan Normal University, Kunming, 650500, China
- Lin Liu
  - School of Information, Yunnan Normal University, Kunming, 650500, China
  - Department of Education of Yunnan Province, Engineering Research Center of Computer Vision and Intelligent Control Technology, Kunming, 650500, China
10
Zhu L, Chen H, Yang S. LncSL: A Novel Stacked Ensemble Computing Tool for Subcellular Localization of lncRNA by Amino Acid-Enhanced Features and Two-Stage Automated Selection Strategy. Int J Mol Sci 2024; 25:13734. [PMID: 39769496 PMCID: PMC11678684 DOI: 10.3390/ijms252413734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Revised: 12/17/2024] [Accepted: 12/19/2024] [Indexed: 01/11/2025] Open
Abstract
Long non-coding RNA (lncRNA) is a non-coding RNA longer than 200 nucleotides, crucial for functions like cell cycle regulation and gene transcription. Accurate localization prediction from sequence information is vital for understanding lncRNA's biological roles. Computational methods offer an effective alternative to traditional experimental methods for annotating lncRNA subcellular positions. Existing machine learning-based methods are limited and often overlook regions with coding potential that affect the function of lncRNA. Therefore, we propose a new model called LncSL. For feature encoding, both lncRNA sequences and amino acid sequences from open reading frames (ORFs) are employed. We selected the most suitable features with CatBoost and integrated them into a new feature set. A voting process with seven feature selection algorithms then identified the most contributive features for training our final stacked model. In addition, an automatic model selection strategy was constructed to find a better-performing meta-model for assembling LncSL. This study specifically focuses on predicting the subcellular localization of lncRNA in the nucleus and cytoplasm. On two benchmark datasets, S1 and S2, LncSL outperformed existing methods by 6.3% to 12.3% in the Matthews correlation coefficient on a balanced test dataset. On an unbalanced independent test dataset sourced from S1, LncSL improved by 4.7% to 18.6% in the Matthews correlation coefficient, which further demonstrates that LncSL is superior to the other compared methods. In all, this study presents an effective method for predicting lncRNA subcellular localization by enhancing sequence information, which is often overlooked by traditional methods, and by addressing the meta-model selection problem, which can offer new insights for other bioinformatics problems.
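Because this abstract reports its gains in terms of the Matthews correlation coefficient (MCC), a short reference sketch of the binary MCC may help when reading the numbers. It uses the standard definition, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); the function name `mcc`, the label strings, and the choice to return 0.0 when the denominator is zero are assumptions of this sketch, not details from LncSL.

```python
import math
from typing import Sequence


def mcc(y_true: Sequence[str], y_pred: Sequence[str], positive: str = "nucleus") -> float:
    """Binary Matthews correlation coefficient, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: report 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


truth = ["nucleus", "nucleus", "cytoplasm", "cytoplasm"]
print(mcc(truth, truth))  # 1.0
```

Unlike plain accuracy, MCC stays informative on unbalanced test sets such as the independent set mentioned above, because it uses all four cells of the confusion matrix.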
Affiliation(s)
- Sen Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China; (L.Z.); (H.C.)
11
Yin R, Zhao H, Li L, Yang Q, Zeng M, Yang C, Bian J, Xie M. Gra-CRC-miRTar: The pre-trained nucleotide-to-graph neural networks to identify potential miRNA targets in colorectal cancer. Comput Struct Biotechnol J 2024; 23:3020-3029. [PMID: 39171252 PMCID: PMC11338065 DOI: 10.1016/j.csbj.2024.07.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 07/13/2024] [Accepted: 07/13/2024] [Indexed: 08/23/2024] Open
Abstract
Colorectal cancer (CRC) is the third most diagnosed cancer and the second deadliest cancer worldwide, representing a major public health problem. In recent years, increasing evidence has shown that microRNAs (miRNAs) can control the expression of targeted human messenger RNAs (mRNAs) by reducing their abundance or translation, acting as oncogenes or tumor suppressors in various cancers, including CRC. Due to the significant up-regulation of oncogenic miRNAs in CRC, elucidating the underlying mechanism and identifying dysregulated miRNA targets may provide a basis for improving current therapeutic interventions. In this paper, we propose Gra-CRC-miRTar, a pre-trained nucleotide-to-graph neural network framework, for identifying potential miRNA targets in CRC. Different from previous studies, we constructed two pre-trained models to encode RNA sequences and transformed them into de Bruijn graphs. We employed different graph neural networks to learn the latent representations. The embeddings generated from de Bruijn graphs were then fed into a Multilayer Perceptron (MLP) to perform the prediction tasks. Our extensive experiments show that Gra-CRC-miRTar achieves better performance than other deep learning algorithms and existing predictors. In addition, our analyses successfully revealed 172 out of 201 functional interactions through experimentally validated miRNA-mRNA pairs in CRC. Collectively, our effort provides an accurate and efficient framework to identify potential miRNA targets in CRC, which can also be used to reveal miRNA target interactions in other malignancies, facilitating the development of novel therapeutics. The Gra-CRC-miRTar web server can be found at: http://gra-crc-mirtar.com/.
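As a rough illustration of the sequence-to-graph step, the sketch below builds a de Bruijn graph in its textbook form: nodes are (k-1)-mers and every k-mer contributes one directed edge from its prefix to its suffix. The function name `de_bruijn_graph` and the choice k=3 are assumptions for illustration; the abstract does not specify the node definition or k used by Gra-CRC-miRTar.

```python
from collections import defaultdict


def de_bruijn_graph(seq, k=3):
    """Return {(prefix, suffix): count} for all k-mers of seq.

    Each k-mer adds one directed edge from its first k-1 bases to its
    last k-1 bases; repeated k-mers increase the edge multiplicity.
    """
    edges = defaultdict(int)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


graph = de_bruijn_graph("AUGAUG", k=3)
for (src, dst), weight in sorted(graph.items()):
    print(src, "->", dst, "x", weight)
```

Edge multiplicities of this kind could serve as edge weights for a graph neural network, with node embeddings supplied by a separate sequence encoder.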
Affiliation(s)
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Hongru Zhao
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Lu Li
  - Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
- Qiang Yang
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Carl Yang
  - Department of Computer Science, Emory University, Atlanta, GA, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Mingyi Xie
  - Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
|
12
|
Cao X, Lu P. DCSGMDA: A dual-channel convolutional model based on stacked deep learning collaborative gradient decomposition for predicting miRNA-disease associations. Comput Biol Chem 2024; 113:108201. [PMID: 39255626 DOI: 10.1016/j.compbiolchem.2024.108201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2024] [Revised: 08/17/2024] [Accepted: 08/31/2024] [Indexed: 09/12/2024]
Abstract
Numerous studies have shown that microRNAs (miRNAs) play a key role in human diseases as critical biomarkers. Their abnormal expression is often accompanied by the emergence of specific diseases. Therefore, studying the relationship between miRNAs and diseases can deepen insight into disease pathogenesis, clarify the process of disease onset and development, and promote drug research for specific diseases. However, many undiscovered relationships between miRNAs and diseases remain, significantly limiting research on miRNA-disease correlations. To explore more potential correlations, we propose a dual-channel convolutional model based on stacked deep learning collaborative gradient decomposition for predicting miRNA-disease associations (DCSGMDA). Firstly, we constructed similarity networks for miRNAs and diseases, as well as an association relationship network. Secondly, potential features were fully mined using stacked deep learning and gradient decomposition networks, along with dual-channel convolutional neural networks. Finally, correlations were scored by a multilayer perceptron. We performed 5-fold and 10-fold cross-validation experiments on DCSGMDA using two datasets based on the Human MicroRNA Disease Database (HMDD). Additionally, parametric, ablation, and comparative experiments, along with case studies, were conducted. The experimental results demonstrate that DCSGMDA performs well in predicting miRNA-disease associations.
Affiliation(s)
- Xu Cao
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China.
- Pengli Lu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China.
|
13
|
Wang K, Hu Y, Li S, Chen M, Li Z. LncLSTA: a versatile predictor unveiling subcellular localization of lncRNAs through long-short term attention. BIOINFORMATICS ADVANCES 2024; 5:vbae173. [PMID: 39758831 PMCID: PMC11700581 DOI: 10.1093/bioadv/vbae173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/27/2024] [Revised: 10/20/2024] [Accepted: 11/07/2024] [Indexed: 01/07/2025]
Abstract
Motivation: Much evidence suggests that the subcellular localization of long noncoding RNAs (LncRNAs) provides key insights for the study of their biological function. Results: This study proposes a novel deep learning framework, LncLSTA, designed for predicting the subcellular localization of LncRNAs. It first uses the LncRNA sequence, electron-ion interaction pseudopotentials, and nucleotide chemical properties as feature inputs. Departing from conventional k-mer approaches, the model uses a set of 1D convolutional and max-pooling operations for dynamic feature aggregation. Furthermore, LncLSTA integrates a long-short term attention module with a bidirectional long short-term memory network to comprehensively extract sequence information. In addition, it incorporates a TextCNN module to enhance accuracy and robustness in subcellular localization tasks. Experimental results demonstrate the efficacy of LncLSTA, which shows superior performance compared with other state-of-the-art methods. Notably, LncLSTA exhibits transfer learning capability, extending its utility to predicting the subcellular localization of mRNAs while maintaining consistently satisfactory prediction results. This research contributes valuable insights into understanding the biological functions of LncRNAs through subcellular localization, emphasizing the potential of deep learning approaches in advancing RNA-related studies. Availability and implementation: The source code is publicly available at https://bis.zju.edu.cn/LncLSTA.
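The three input feature types named in the abstract can be sketched as a per-nucleotide encoding. The numeric tables below are the values commonly cited in the literature for electron-ion interaction pseudopotentials (EIIP) and nucleotide chemical property (NCP) triples; whether LncLSTA uses exactly these values, this column order, or this handling of ambiguous bases is an assumption, and `encode` is a hypothetical helper name.

```python
# Commonly cited EIIP values per nucleotide (assumed, not taken from the paper).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "U": 0.1335}

# Commonly cited NCP triples: (ring structure, functional group, hydrogen bonding).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}

ALPHABET = "ACGU"


def encode(seq):
    """Encode each base as one-hot (4) + NCP (3) + EIIP (1) = 8 values.

    Bases outside ACGU (for example N) are encoded as all zeros, which is
    an illustrative choice rather than a documented convention.
    """
    rows = []
    for base in seq.upper().replace("T", "U"):
        one_hot = [1.0 if base == b else 0.0 for b in ALPHABET]
        ncp = [float(x) for x in NCP.get(base, (0, 0, 0))]
        rows.append(one_hot + ncp + [EIIP.get(base, 0.0)])
    return rows


matrix = encode("AGN")
```

A length-L sequence becomes an L x 8 matrix, which is the kind of input that 1D convolution and max-pooling layers can aggregate without a fixed k-mer vocabulary.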
Affiliation(s)
- Kai Wang
  - School of Information Engineering, Huzhou University, Huzhou, Zhejiang 313000, China
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China
- Yueming Hu
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310003, China
- Sida Li
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310003, China
- Ming Chen
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310003, China
- Zhong Li
  - School of Information Engineering, Huzhou University, Huzhou, Zhejiang 313000, China
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310003, China
|
14
|
Wei Y, Zhang Q, Liu L. The improved de Bruijn graph for multitask learning: predicting functions, subcellular localization, and interactions of noncoding RNAs. Brief Bioinform 2024; 26:bbae627. [PMID: 39592154 PMCID: PMC11596098 DOI: 10.1093/bib/bbae627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2024] [Revised: 11/13/2024] [Accepted: 11/15/2024] [Indexed: 11/28/2024] Open
Abstract
Noncoding RNA refers to RNA that does not encode proteins. The lncRNA and miRNA it contains play crucial regulatory roles in organisms, and their aberrant expression is closely related to various diseases. Traditional experimental methods for validating the interactions of these RNAs have limitations, and existing prediction models exhibit relatively limited functionality, relying on isolated feature extraction and performing poorly in handling various types of small sample tasks. This paper proposes an improved de Bruijn graph that can inject RNA structural information into the graph while preserving sequence information. Furthermore, the improved de Bruijn graph enables graph neural networks to learn broader dependencies and correlations among data by introducing richer edge relationships. Meanwhile, the multitask learning model, DVMnet, proposed in this paper can handle multiple related tasks, and we optimize model parameters by integrating the total loss of three tasks. This enables multitask prediction of RNA interactions, disease associations, and subcellular localization. Compared with the best existing models in this field, DVMnet has achieved the best performance with a 3% improvement in the area under the curve value and demonstrates robust results in predicting diseases and subcellular localization. The improved de Bruijn graph is also applicable to various scenarios and can unify the sequence and structural information of various nucleic acids into a single graph.
Affiliation(s)
- Yuxiao Wei
  - College of Software, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
- Qi Zhang
  - College of Science, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
- Liwei Liu
  - College of Science, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
|
15
|
Du C, Fan W, Zhou Y. Integrated Biochemical and Computational Methods for Deciphering RNA-Processing Codes. WILEY INTERDISCIPLINARY REVIEWS. RNA 2024; 15:e1875. [PMID: 39523464 DOI: 10.1002/wrna.1875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/02/2024] [Revised: 09/23/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
RNA processing involves steps such as capping, splicing, polyadenylation, modification, and nuclear export. These steps are essential for transforming genetic information in DNA into proteins and contribute to RNA diversity and complexity. Many biochemical methods have been developed to profile and quantify RNAs, as well as to identify the interactions between RNAs and RNA-binding proteins (RBPs), especially when coupled with high-throughput sequencing technologies. With the rapid accumulation of diverse data, it is crucial to develop computational methods to convert the big data into biological knowledge. In particular, machine learning and deep learning models are commonly utilized to learn the rules or codes governing the transformation from DNA sequences to intriguing RNAs based on manually designed or automatically extracted features. When precise enough, the RNA codes can be incredibly useful for predicting RNA products, decoding the molecular mechanisms, forecasting the impact of disease variants on RNA processing events, and identifying driver mutations. In this review, we systematically summarize the biochemical and computational methods for deciphering five important RNA codes related to alternative splicing, alternative polyadenylation, RNA localization, RNA modifications, and RBP binding. For each code, we review the main types of experimental methods used to generate training data, as well as the key features, strategic model structures, and advantages of representative tools. We also discuss the challenges encountered in developing predictive models using large language models and extensive domain knowledge. Additionally, we highlight useful resources and propose ways to improve computational tools for studying RNA codes.
Affiliation(s)
- Chen Du
  - College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Weiliang Fan
  - College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Yu Zhou
  - College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
  - Frontier Science Center for Immunology and Metabolism, Wuhan University, Wuhan, China
  - State Key Laboratory of Virology, Wuhan University, Wuhan, China
|
16
|
Li C, Wang H, Wen Y, Yin R, Zeng X, Li K. GenoM7GNet: An Efficient N7-Methylguanosine Site Prediction Approach Based on a Nucleotide Language Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:2258-2268. [PMID: 39302806 DOI: 10.1109/tcbb.2024.3459870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/22/2024]
Abstract
N7-methylguanosine (m7G), one of the mainstream post-transcriptional RNA modifications, occupies an exceedingly significant place in medical treatments. However, classic approaches for identifying m7G sites are costly in both time and equipment. Meanwhile, existing machine learning methods extract limited hidden information from RNA sequences, making it difficult to improve accuracy. Therefore, we put forward a deep learning network, called "GenoM7GNet," for m7G site identification. This model utilizes Bidirectional Encoder Representations from Transformers (BERT) and is pretrained on nucleotide sequence data to capture hidden patterns from RNA sequences for m7G site prediction. Moreover, through detailed comparative experiments with various deep learning models, we discovered that the one-dimensional convolutional neural network (CNN) exhibits outstanding performance in sequence feature learning and classification. In the performance evaluation, the proposed GenoM7GNet model achieved 0.953 in accuracy, 0.932 in sensitivity, 0.976 in specificity, 0.907 in Matthews Correlation Coefficient, and 0.984 in Area Under the receiver operating characteristic Curve. Extensive experimental results further show that our GenoM7GNet model markedly surpasses other state-of-the-art models in predicting m7G sites, exhibiting high computing performance.
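The threshold-based metrics reported above (accuracy, sensitivity, specificity, Matthews correlation coefficient) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity and MCC from confusion counts.

    Ratios with a zero denominator are reported as 0.0, a common convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Illustrative counts only: 100 positive and 100 negative sites.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Unlike accuracy, the MCC uses all four cells symmetrically, which is why it is often preferred when positive and negative sites are imbalanced.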
|
17
|
Sanadgol N, Amini J, Khalseh R, Bakhshi M, Nikbin A, Beyer C, Zendehdel A. Mitochondrial genome-derived circRNAs: Orphan epigenetic regulators in molecular biology. Mitochondrion 2024; 79:101968. [PMID: 39321951 DOI: 10.1016/j.mito.2024.101968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2024] [Revised: 09/02/2024] [Accepted: 09/18/2024] [Indexed: 09/27/2024]
Abstract
Mitochondria are vital for cellular activities, influencing ATP production, Ca2+ signaling, and reactive oxygen species generation. It has been proposed that nuclear genome-derived circular RNAs (circRNAs) play a role in biological processes. For the first time, this study aims to comprehensively explore experimentally confirmed human mitochondrial genome-derived circRNAs (mt-circRNAs) via in-silico analysis. We utilized wide-ranging bioinformatics tools to anticipate their roles in molecular biology, involving miRNA sponging, protein antagonism, and peptide translation. Among five well-characterized mt-circRNAs, SCAR/mc-COX2 stands out as particularly significant, with the potential to sponge around 41 different miRNAs, which target several genes mostly involved in endocytosis, MAP kinase, and PI3K-Akt pathways. Interestingly, circMNTND5 and mecciND1 specifically interact with miRNAs through their unique back-splice junction sequences. These exclusively targeted miRNAs (hsa-miR-5186, 6888-5p, 8081, 924, 672-5p) are predominantly associated with insulin secretion, proteoglycans in cancer, and MAPK signaling pathways. Moreover, all mt-circRNAs intricately affect the P53 pathway through miRNA sequestration. Remarkably, mc-COX2 and circMNTND5 appear to be involved in RNA biogenesis by antagonizing AGO1/2, EIF4A3, and DGCR8. All mt-circRNAs engaged with IGF2BP proteins, which are crucial in redox signaling, and, except for mecciND1, they all potentially generate at least one protein resembling the immunoglobulin heavy chain protein. Given P53's function as a redox-sensitive transcription factor and insulin's role as a crucial regulator of energy metabolism, their indirect interplay with mt-circRNAs could influence cellular outcomes. However, due to limited attention and infrequent data availability, it is advisable to conduct more thorough investigations to gain a deeper understanding of the functions of mt-circRNAs.
Affiliation(s)
- Nima Sanadgol
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, 52074 Aachen, Germany.
- Javad Amini
  - Department of Physiology and Pharmacology, School of Medicine, North Khorasan University of Medical Sciences, 94149-75516 Bojnurd, Iran
- Roghayeh Khalseh
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, 52074 Aachen, Germany
- Mostafa Bakhshi
  - Department of Electrical and Computer Engineering, Kharazmi University, 15719-14911 Tehran, Iran
- Arezoo Nikbin
  - Department of Oral and Maxillofacial Radiology, School of Dentistry, Golestan University of Medical Sciences, Gorgan, Iran
- Cordian Beyer
  - Institute of Neuroanatomy, RWTH University Hospital Aachen, 52074 Aachen, Germany
- Adib Zendehdel
  - Institut of Anatomy, Department of Biomedicine, University of Basel, 4031 Basel, Switzerland
|
18
|
Miller JR, Yi W, Adjeroh DA. Evaluation of machine learning models that predict lncRNA subcellular localization. NAR Genom Bioinform 2024; 6:lqae125. [PMID: 39296930 PMCID: PMC11409063 DOI: 10.1093/nargab/lqae125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2024] [Revised: 08/17/2024] [Accepted: 09/02/2024] [Indexed: 09/21/2024] Open
Abstract
The lncATLAS database quantifies the relative cytoplasmic versus nuclear abundance of long non-coding RNAs (lncRNAs) observed in 15 human cell lines. The literature describes several machine learning models trained and evaluated on these and similar datasets. These reports showed moderate performance, e.g. 72-74% accuracy, on test subsets of the data withheld from training. In all these reports, the datasets were filtered to include genes with extreme values while excluding genes with values in the middle range and the filters were applied prior to partitioning the data into training and testing subsets. Using several models and lncATLAS data, we show that this 'middle exclusion' protocol boosts performance metrics without boosting model performance on unfiltered test data. We show that various models achieve only about 60% accuracy when evaluated on unfiltered lncRNA data. We suggest that the problem of predicting lncRNA subcellular localization from nucleotide sequences is more challenging than currently perceived. We provide a basic model and evaluation procedure as a benchmark for future studies of this problem.
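The 'middle exclusion' protocol criticized above can be made concrete with a small sketch. It assumes each gene has a single real-valued localization score (for example a cytoplasmic-versus-nuclear log ratio) and that labels come from thresholds on that score; the function names, thresholds, and scores are all hypothetical and are not the authors' code or data.

```python
def label_with_middle_exclusion(scores, low, high):
    """Keep only extreme genes: score <= low gets label 0, score >= high gets label 1.

    Genes with low < score < high are dropped entirely. Applying this filter
    before the train/test split means the test set also lacks mid-range genes.
    """
    return [(i, 0 if s <= low else 1) for i, s in enumerate(scores) if s <= low or s >= high]


def label_all(scores, threshold):
    """Label every gene with a single threshold, keeping the ambiguous mid-range genes."""
    return [(i, 1 if s >= threshold else 0) for i, s in enumerate(scores)]


# Hypothetical scores for five genes.
scores = [-3.0, -0.4, 0.1, 0.6, 2.5]
filtered = label_with_middle_exclusion(scores, low=-1.0, high=1.0)
unfiltered = label_all(scores, threshold=0.0)
```

Accuracy measured on `filtered` reflects only the easy, well-separated genes, whereas accuracy on `unfiltered` includes the genes near the threshold, which is the evaluation setting the abstract argues for.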
Affiliation(s)
- Jason R Miller
  - Department of Computer Science and Information Technology; Hood College, Frederick, MD 21701, USA
  - Lane Department of Computer Science and Electrical Engineering; West Virginia University, Morgantown, WV 26506, USA
- Weijun Yi
  - Lane Department of Computer Science and Electrical Engineering; West Virginia University, Morgantown, WV 26506, USA
- Donald A Adjeroh
  - Lane Department of Computer Science and Electrical Engineering; West Virginia University, Morgantown, WV 26506, USA
|
19
|
Wu J, Lu P, Zhang W. Predicting associations between CircRNA and diseases through structure-aware graph transformer and path-integral convolution. Anal Biochem 2024; 692:115554. [PMID: 38710353 DOI: 10.1016/j.ab.2024.115554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2024] [Revised: 04/27/2024] [Accepted: 04/30/2024] [Indexed: 05/08/2024]
Abstract
A series of biological experiments has demonstrated that circular RNAs play a crucial regulatory role in cellular processes and may be associated with diseases. Uncovering these connections helps in understanding potential disease mechanisms and advancing the development of treatment strategies. However, traditional biological experiments face limitations in terms of efficiency and cost, especially when enumerating possible associations. To address these limitations, several computational methods have been proposed, but existing methods only measure similarity from a nodal perspective and cannot capture structural similarities between edges. In this study, we introduce an advanced computational method called SATPIC2CD for analyzing potential associations between circular RNAs and diseases. Specifically, we first employ a Structure-Aware Graph Transformer (SAT), which extracts five predefined metapath representations before calculating attention. This adaptive network integrates structural information into the original self-attention by aggregating information within and between paths. Subsequently, we use Path Integral Convolutional Networks (PACN) to integrate feature information for all path weights between two nodes. Afterward, we complement the network node features with feature loss and feature smoothing using Gated Recurrent Units (GRU) and node centrality. Finally, a Multi-Layer Perceptron (MLP) is employed to obtain the final prediction score for each circular RNA-disease pair. SATPIC2CD performs remarkably well, with an Area Under the Curve (AUC) of up to 0.9715 in 5-fold cross-validation, surpassing the other models compared. Case studies further emphasize the high precision of our method in identifying circular RNA-disease associations, laying a solid foundation for guiding future biological research efforts.
Affiliation(s)
- Jinkai Wu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China
- PengLi Lu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
- Wenqi Zhang
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China
|
20
|
Diao B, Luo J, Guo Y. A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs. Brief Funct Genomics 2024; 23:314-324. [PMID: 38576205 DOI: 10.1093/bfgp/elae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2023] [Revised: 02/25/2024] [Accepted: 03/14/2024] [Indexed: 04/06/2024] Open
Abstract
Long noncoding RNAs (lncRNAs) have been discovered to be extensively involved in eukaryotic epigenetic, transcriptional, and post-transcriptional regulatory processes with the advancements in sequencing technology and genomics research. Therefore, they play crucial roles in the body's normal physiology and various disease outcomes. Presently, numerous unknown lncRNA sequencing data require exploration. Establishing deep learning-based prediction models for lncRNAs provides valuable insights for researchers, substantially reducing time and costs associated with trial and error and facilitating the disease-relevant lncRNA identification for prognosis analysis and targeted drug development as the era of artificial intelligence progresses. However, most lncRNA-related researchers lack awareness of the latest advancements in deep learning models and model selection and application in functional research on lncRNAs. Thus, we elucidate the concept of deep learning models, explore several prevalent deep learning algorithms and their data preferences, conduct a comprehensive review of recent literature studies with exemplary predictive performance over the past 5 years in conjunction with diverse prediction functions, critically analyze and discuss the merits and limitations of current deep learning models and solutions, while also proposing prospects based on cutting-edge advancements in lncRNA research.
Affiliation(s)
- Biyu Diao
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Jin Luo
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Yu Guo
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
|
21
|
Lu P, Jiang J. AE-RW: Predicting miRNA-disease associations by using autoencoder and random walk on miRNA-gene-disease heterogeneous network. Comput Biol Chem 2024; 110:108085. [PMID: 38754260 DOI: 10.1016/j.compbiolchem.2024.108085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2024] [Revised: 04/04/2024] [Accepted: 04/23/2024] [Indexed: 05/18/2024]
Abstract
Since scientific investigations have demonstrated that aberrant expression of miRNAs contributes to the incidence of numerous complex diseases, precise determination of miRNA-disease relationships greatly contributes to medical progress. To overcome the inefficiency of conventional experimental approaches, numerous computational methods have been proposed to predict miRNA-disease associations with enhanced accuracy. However, constructing a miRNA-gene-disease heterogeneous network by incorporating gene information has been relatively under-explored in existing computational techniques. Accordingly, this paper puts forward a technique that predicts miRNA-disease associations by applying an autoencoder and a random walk on a miRNA-gene-disease heterogeneous network (AE-RW). Firstly, we integrate association information and similarities between miRNAs, genes, and diseases to construct a miRNA-gene-disease heterogeneous network. Subsequently, we consolidate two network feature representations extracted independently via an autoencoder and a random walk procedure. Finally, a deep neural network (DNN) is used to conduct association prediction. The experimental results demonstrate that the AE-RW model achieved an AUC of 0.9478 through 5-fold CV on the HMDD v3.2 dataset, outperforming the five most advanced existing models. Additionally, case studies on breast and lung cancer further validated the superior predictive capabilities of our model.
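One common way to extract features from a heterogeneous network by random walk is the random walk with restart, sketched below on a toy miRNA-gene-disease graph. The abstract only says "a random walk procedure", so the restart variant, the restart probability of 0.5, the node names, and the function name `random_walk_with_restart` are all assumptions for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.5, iters=100, tol=1e-10):
    """Iterate p <- restart * e_seed + (1 - restart) * W p until convergence.

    adj maps every node to a list of its neighbours; each step spreads a
    node's probability uniformly over those neighbours. Nodes without
    neighbours send their mass back to the seed (an illustrative choice).
    """
    nodes = sorted(adj)
    start = {n: 1.0 if n == seed else 0.0 for n in nodes}
    p = dict(start)
    for _ in range(iters):
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                nxt[seed] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        converged = max(abs(nxt[n] - p[n]) for n in nodes) < tol
        p = nxt
        if converged:
            break
    return p


# Toy heterogeneous network: one miRNA, one gene, one disease (hypothetical names).
network = {
    "miR-1": ["geneA"],
    "geneA": ["miR-1", "diseaseX"],
    "diseaseX": ["geneA"],
}
profile = random_walk_with_restart(network, seed="miR-1")
```

The stationary vector ranks nodes by proximity to the seed, so the profile of each miRNA or disease could be used as a topological feature vector next to autoencoder embeddings.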
Affiliation(s)
- Pengli Lu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
- Jicheng Jiang
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
|
22
|
Yin R, Zhao H, Li L, Yang Q, Zeng M, Yang C, Bian J, Xie M. Gra-CRC-miRTar: The pre-trained nucleotide-to-graph neural networks to identify potential miRNA targets in colorectal cancer. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.15.589599. [PMID: 38659732 PMCID: PMC11042274 DOI: 10.1101/2024.04.15.589599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/26/2024]
Abstract
Colorectal cancer (CRC) is the third most diagnosed cancer and the second deadliest cancer worldwide, representing a major public health problem. In recent years, increasing evidence has shown that microRNAs (miRNAs) can control the expression of targeted human messenger RNAs (mRNAs) by reducing their abundance or translation, acting as oncogenes or tumor suppressors in various cancers, including CRC. Due to the significant up-regulation of oncogenic miRNAs in CRC, elucidating the underlying mechanism and identifying dysregulated miRNA targets may provide a basis for improving current therapeutic interventions. In this paper, we propose Gra-CRC-miRTar, a pre-trained nucleotide-to-graph neural network framework, for identifying potential miRNA targets in CRC. Different from previous studies, we constructed two pre-trained models to encode RNA sequences and transformed them into de Bruijn graphs. We employed different graph neural networks to learn the latent representations. The embeddings generated from de Bruijn graphs were then fed into a Multilayer Perceptron (MLP) to perform the prediction tasks. Our extensive experiments show that Gra-CRC-miRTar achieves better performance than other deep learning algorithms and existing predictors. In addition, our analyses successfully revealed 172 out of 201 functional interactions through experimentally validated miRNA-mRNA pairs in CRC. Collectively, our effort provides an accurate and efficient framework to identify potential miRNA targets in CRC, which can also be used to reveal miRNA target interactions in other malignancies, facilitating the development of novel therapeutics.
Affiliation(s)
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
  - These authors contributed equally
- Hongru Zhao
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
  - These authors contributed equally
- Lu Li
  - Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
- Qiang Yang
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Carl Yang
  - Department of Computer Science, Emory University, Atlanta, GA, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Mingyi Xie
  - Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
|
23
|
Zhang ZY, Zhang Z, Ye X, Sakurai T, Lin H. A BERT-based model for the prediction of lncRNA subcellular localization in Homo sapiens. Int J Biol Macromol 2024; 265:130659. [PMID: 38462114 DOI: 10.1016/j.ijbiomac.2024.130659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2024] [Revised: 02/19/2024] [Accepted: 03/04/2024] [Indexed: 03/12/2024]
Abstract
Understanding the subcellular localization of lncRNAs is crucial for comprehending their regulatory activities. Conventional detection of lncRNA subcellular location usually relies on in situ detection techniques, which are resource intensive. Some machine learning-based algorithms have been proposed for lncRNA subcellular location prediction in mammals. However, due to the low level of conservation of lncRNA sequences, the performance of cross-species models remains unsatisfactory. In this study, we curated a novel dataset containing subcellular location information of lncRNAs in Homo sapiens. Subsequently, based on the BERT pre-trained language algorithm, we developed a model for lncRNA subcellular location prediction. Our model achieved a micro-average area under the receiver operating characteristic curve (AUROC) of 0.791 on the training set and an AUROC of 0.700 on the testing nucleus set. Additionally, we conducted cross-species validation and motif discovery to further investigate underlying patterns. In summary, our study provides valuable guidance and computational analysis tools for exploring the mechanisms of lncRNA subcellular localization and the dynamic spatial changes of RNA in abnormal physiological states.
Affiliation(s)
- Zhao-Yue Zhang
  - Tsukuba Life Science Innovation Program, University of Tsukuba, Tsukuba 3058577, Japan
- Zheng Zhang
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Hao Lin
  - Center for Information Biology, University of Electronic Science and Technology of China, Chengdu 611731, China

24
Wang RH, Ng YK, Zhang X, Wang J, Li SC. Coding genomes with gapped pattern graph convolutional network. Bioinformatics 2024; 40:btae188. [PMID: 38603603 PMCID: PMC11034989 DOI: 10.1093/bioinformatics/btae188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 03/11/2024] [Accepted: 04/05/2024] [Indexed: 04/13/2024] Open
Abstract
MOTIVATION: Genome sequencing technologies reveal a huge amount of genomic sequences. Neural network-based methods can be prime candidates for retrieving insights from these sequences because of their applicability to large and diverse datasets. However, the highly variable lengths of genome sequences severely impair the presentation of sequences as input to the neural network. Genetic variations further complicate tasks that involve sequence comparison or alignment. RESULTS: Inspired by the theory and applications of "spaced seeds," we propose a graph representation of genome sequences called "gapped pattern graph." These graphs can be transformed through a Graph Convolutional Network to form lower-dimensional embeddings for downstream tasks. On the basis of the gapped pattern graphs, we implemented a neural network model and demonstrated its performance on diverse tasks involving microbe and mammalian genome data. Our method consistently outperformed all the other state-of-the-art methods across various metrics on all tasks, especially for the sequences with limited homology to the training data. In addition, our model was able to identify distinct gapped pattern signatures from the sequences. AVAILABILITY AND IMPLEMENTATION: The framework is available at https://github.com/deepomicslab/GCNFrame.
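A spaced seed is a binary mask that keeps some positions of a sliding window and ignores the rest, so a mismatch at an ignored position does not change the extracted pattern. The sketch below shows that extraction and one simple way to turn the patterns into a graph whose size does not depend on sequence length. The mask, the function names, and the edge definition (patterns at consecutive positions) are assumptions made for illustration. The actual graph construction in GCNFrame may differ.

```python
from collections import Counter


def gapped_patterns(seq, mask):
    """List the gapped patterns of seq: keep bases where mask is '1', skip '0'."""
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    patterns = []
    for start in range(len(seq) - len(mask) + 1):
        window = seq[start:start + len(mask)]
        patterns.append("".join(window[i] for i in keep))
    return patterns


def pattern_graph(seq, mask):
    """Count directed edges between patterns found at consecutive positions."""
    patterns = gapped_patterns(seq, mask)
    return Counter(zip(patterns, patterns[1:]))


# Hypothetical mask "101": keep the first and third base of each 3-base window.
print(gapped_patterns("ACGTAC", "101"))
print(pattern_graph("ACGTAC", "101"))
```

With a mask that keeps w positions over a 4-letter alphabet, the graph has at most 4^w nodes however long the input is, which is what makes a fixed-size input to a graph convolutional network possible.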
Affiliation(s)
- Ruo Han Wang
  - Department of Computer Science, City University of Hong Kong Shenzhen Research Institute, Shen Zhen, 518063, China
  - Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China
- Yen Kaow Ng
  - Department of Computer Science, City University of Hong Kong Shenzhen Research Institute, Shen Zhen, 518063, China
  - Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China
- Xianglilan Zhang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, 100071, China
- Jianping Wang
  - Department of Computer Science, City University of Hong Kong Shenzhen Research Institute, Shen Zhen, 518063, China
  - Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China
- Shuai Cheng Li
  - Department of Computer Science, City University of Hong Kong Shenzhen Research Institute, Shen Zhen, 518063, China
  - Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China

25
Yao D, Li B, Zhan X, Zhan X, Yu L. GCNFORMER: graph convolutional network and transformer for predicting lncRNA-disease associations. BMC Bioinformatics 2024; 25:5. [PMID: 38166659 PMCID: PMC10763317 DOI: 10.1186/s12859-023-05625-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/22/2023] [Accepted: 12/18/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND: A growing body of research indicates that the disrupted expression of long non-coding RNA (lncRNA) is linked to a range of human disorders. Therefore, the effective prediction of lncRNA-disease association (LDA) can not only suggest solutions to diagnose a condition but also save significant time and labor costs. METHOD: In this work, we proposed a novel LDA prediction algorithm based on graph convolutional network and transformer, named GCNFORMER. Firstly, we integrated the intraclass similarity and interclass connections between miRNAs, lncRNAs and diseases, and built a graph adjacency matrix. Secondly, to completely obtain the features between various nodes, we employed a graph convolutional network for feature extraction. Finally, to obtain the global dependencies between inputs and outputs, we used a transformer encoder with a multiheaded attention mechanism to forecast lncRNA-disease associations. RESULTS: The results of a fivefold cross-validation experiment on the public dataset revealed that the AUC and AUPR of GCNFORMER achieved 0.9739 and 0.9812, respectively. We compared GCNFORMER with six advanced LDA prediction models, and the results indicated its superiority over the other six models. Furthermore, GCNFORMER's effectiveness in predicting potential LDAs is underscored by case studies on breast cancer, colon cancer and lung cancer. CONCLUSIONS: The combination of graph convolutional network and transformer can effectively improve the performance of LDA prediction models and promote the in-depth development of this research field.
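The feature-extraction step described in this abstract can be illustrated with the widely used symmetric-normalised graph-convolution rule, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), where A is the adjacency matrix, H the node features and W a weight matrix. The sketch below implements one such layer in plain Python. It is an assumption that GCNFORMER uses this exact rule, and the function names and toy graph are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One graph-convolution layer with self-loops, symmetric normalisation and ReLU."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Scale entry (i, j) by 1 / sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph: two connected nodes (say, one lncRNA and one disease), one-hot features.
adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
print(gcn_layer(adj, feats, weight))
```

Each layer mixes a node's features with those of its direct neighbours, so stacking layers spreads information across lncRNA, miRNA and disease nodes before any attention-based encoder is applied.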
Affiliation(s)
- Dengju Yao
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
- Bailin Li
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
- Xiaojuan Zhan
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
  - College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, 150050, China
- Xiaorong Zhan
  - Department of Endocrinology and Metabolism, Hospital of South, University of Science and Technology, Shenzhen, 518055, China
- Liyang Yu
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China

26
Zeng M, Wu Y, Li Y, Yin R, Lu C, Duan J, Li M. LncLocFormer: a Transformer-based deep learning model for multi-label lncRNA subcellular localization prediction by using localization-specific attention mechanism. Bioinformatics 2023; 39:btad752. [PMID: 38109668 PMCID: PMC10749772 DOI: 10.1093/bioinformatics/btad752] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/26/2023] [Revised: 11/13/2023] [Accepted: 12/17/2023] [Indexed: 12/20/2023] Open
Abstract
MOTIVATION: There is mounting evidence that the subcellular localization of lncRNAs can provide valuable insights into their biological functions. In the real world of transcriptomes, lncRNAs are usually localized in multiple subcellular localizations. Furthermore, lncRNAs have specific localization patterns for different subcellular localizations. Although several computational methods have been developed to predict the subcellular localization of lncRNAs, few of them are designed for lncRNAs that have multiple subcellular localizations, and none of them take motif specificity into consideration. RESULTS: In this study, we proposed a novel deep learning model, called LncLocFormer, which uses only lncRNA sequences to predict multi-label lncRNA subcellular localization. LncLocFormer utilizes eight Transformer blocks to model long-range dependencies within the lncRNA sequence and shares information across the lncRNA sequence. To exploit the relationship between different subcellular localizations and find distinct localization patterns for different subcellular localizations, LncLocFormer employs a localization-specific attention mechanism. The results demonstrate that LncLocFormer outperforms existing state-of-the-art predictors on the hold-out test set. Furthermore, we conducted a motif analysis and found LncLocFormer can capture known motifs. Ablation studies confirmed the contribution of the localization-specific attention mechanism in improving the prediction performance. AVAILABILITY AND IMPLEMENTATION: The LncLocFormer web server is available at http://csuligroup.com:9000/LncLocFormer. The source code can be obtained from https://github.com/CSUBioGroup/LncLocFormer.
Affiliation(s)
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yifan Wu
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yiming Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL 32603, United States
- Chengqian Lu
  - School of Computer Science, Key Laboratory of Intelligent Computing and Information Processing, Xiangtan University, Xiangtan, Hunan 411105, China
- Junwen Duan
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China

27
Fu X, Chen Y, Tian S. DlncRNALoc: A discrete wavelet transform-based model for predicting lncRNA subcellular localization. Math Biosci Eng 2023; 20:20648-20667. [PMID: 38124569 DOI: 10.3934/mbe.2023913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
The prediction of long non-coding RNA (lncRNA) subcellular localization is essential to the understanding of its function and involvement in cellular regulation. Traditional biological experimental methods are costly and time-consuming, making computational methods the preferred approach for predicting lncRNA subcellular localization (LSL). However, existing computational methods have limitations due to the structural characteristics of lncRNAs and the uneven distribution of data across subcellular compartments. We propose a discrete wavelet transform (DWT)-based model for predicting LSL, called DlncRNALoc. We construct a physicochemical property matrix of 2-tuple bases from lncRNA sequences, and we introduce a DWT-based lncRNA feature extraction method. We use the Synthetic Minority Over-sampling Technique (SMOTE) for oversampling and the local Fisher discriminant analysis (LFDA) algorithm to optimize feature information. The optimized feature vectors are fed into a support vector machine (SVM) to construct a predictive model. DlncRNALoc was evaluated with five-fold cross-validation on three benchmark datasets. Extensive experiments have demonstrated the superiority and effectiveness of the DlncRNALoc model in predicting LSL.
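A discrete wavelet transform splits a numeric signal into a smoothed half-length "approximation" and a half-length "detail" component, so a sequence encoded as a series of physicochemical values can be summarised at several scales. The sketch below shows one decomposition level with the Haar wavelet in plain Python. The choice of Haar, the function name, and the toy signal are assumptions, since the abstract does not state which wavelet or encoding DlncRNALoc uses.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: return (approximation, detail)."""
    values = list(signal)
    if len(values) % 2:
        # Pad odd-length input by repeating the last value (a simple assumption).
        values.append(values[-1])
    scale = math.sqrt(2.0)
    pairs = list(zip(values[::2], values[1::2]))
    approx = [(a + b) / scale for a, b in pairs]
    detail = [(a - b) / scale for a, b in pairs]
    return approx, detail


# Hypothetical property values for four consecutive dinucleotides.
approx, detail = haar_dwt([1.0, 1.0, 2.0, 0.0])
print(approx, detail)
```

Applying the function again to the approximation gives the next coarser level, and summary statistics of each level can serve as fixed-length features for sequences of different lengths.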
Affiliation(s)
- Xiangzheng Fu
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, China
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
  - Department of Basic Biology, Changsha Medical College, Changsha, Hunan, China
- Yifan Chen
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
  - Department of Basic Biology, Changsha Medical College, Changsha, Hunan, China
- Sha Tian
  - Department of Internal Medicine, College of Integrated Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China

28
Ballarino M, Pepe G, Helmer-Citterich M, Palma A. Exploring the landscape of tools and resources for the analysis of long non-coding RNAs. Comput Struct Biotechnol J 2023; 21:4706-4716. [PMID: 37841333 PMCID: PMC10568309 DOI: 10.1016/j.csbj.2023.09.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 09/28/2023] [Accepted: 09/28/2023] [Indexed: 10/17/2023] Open
Abstract
In recent years, research on long non-coding RNAs (lncRNAs) has gained considerable attention due to the increasing number of newly identified transcripts. Several characteristics make their functional evaluation challenging, creating an urgent need to combine molecular biology with other disciplines, including bioinformatics. Indeed, the recent development of computational pipelines and resources has greatly facilitated both the discovery of lncRNAs and the study of their mechanisms of action. In this review, we present a curated collection of the most recent computational resources, which have been categorized into distinct groups: databases and annotation, identification and classification, interaction prediction, and structure prediction. As the repertoire of lncRNAs and their analysis tools continues to expand over the years, standardizing the computational pipelines and improving the existing annotation of lncRNAs will be crucial to facilitate functional genomics studies.
Affiliation(s)
- Monica Ballarino
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Piazzale Aldo Moro 5, 00161 Rome, Italy
- Gerardo Pepe
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, 1, 00133 Rome, Italy
- Manuela Helmer-Citterich
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, 1, 00133 Rome, Italy
- Alessandro Palma
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Piazzale Aldo Moro 5, 00161 Rome, Italy

29
Ding P, Zeng M, Yin R. Editorial: Computational methods to analyze RNA data for human diseases. Front Genet 2023; 14:1270334. [PMID: 37674479 PMCID: PMC10478215 DOI: 10.3389/fgene.2023.1270334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 08/14/2023] [Indexed: 09/08/2023] Open
Affiliation(s)
- Pingjian Ding
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, United States
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, United States

30
Kim Y, Lee M. Deep Learning Approaches for lncRNA-Mediated Mechanisms: A Comprehensive Review of Recent Developments. Int J Mol Sci 2023; 24:10299. [PMID: 37373445 DOI: 10.3390/ijms241210299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 06/16/2023] [Accepted: 06/17/2023] [Indexed: 06/29/2023] Open
Abstract
This review paper provides an extensive analysis of the rapidly evolving convergence of deep learning and long non-coding RNAs (lncRNAs). Considering the recent advancements in deep learning and the increasing recognition of lncRNAs as crucial components in various biological processes, this review aims to offer a comprehensive examination of these intertwined research areas. The remarkable progress in deep learning necessitates thoroughly exploring its latest applications in the study of lncRNAs. Therefore, this review provides insights into the growing significance of incorporating deep learning methodologies to unravel the intricate roles of lncRNAs. By scrutinizing the most recent research spanning from 2021 to 2023, this paper provides a comprehensive understanding of how deep learning techniques are employed in investigating lncRNAs, thereby contributing valuable insights to this rapidly evolving field. The review is aimed at researchers and practitioners looking to integrate deep learning advancements into their lncRNA studies.
Affiliation(s)
- Yoojoong Kim
  - School of Computer Science and Information Engineering, The Catholic University of Korea, Bucheon 14662, Republic of Korea
- Minhyeok Lee
  - School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea

31
Li J, Zou Q, Yuan L. A review from biological mapping to computation-based subcellular localization. Mol Ther Nucleic Acids 2023; 32:507-521. [PMID: 37215152 PMCID: PMC10192651 DOI: 10.1016/j.omtn.2023.04.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Subcellular localization is crucial to the study of viruses and diseases. Specifically, research on protein subcellular localization can help identify clues between viruses and host cells that can aid in the design of targeted drugs. Research on RNA subcellular localization is significant for human diseases (such as Alzheimer's disease and colon cancer). To date, only reviews addressing the subcellular localization of proteins have been published, and these are now outdated, while reviews of RNA subcellular localization are not comprehensive. Therefore, we collated the most up-to-date literature on protein and RNA subcellular localization to help researchers understand changes in the field. Extensive and complete methods for constructing subcellular localization models have also been summarized, which can help readers understand the changes in the application of biotechnology and computer science in subcellular localization research and explore how to use biological data to construct improved subcellular localization models. This paper is the first review to cover both protein subcellular localization and RNA subcellular localization. We urge researchers from biology and computational biology to jointly pay attention to transformation patterns, interrelationships, differences, and causality of protein subcellular localization and RNA subcellular localization.
Affiliation(s)
- Jing Li
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
  - School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
- Lei Yuan
  - Department of Hepatobiliary Surgery, Quzhou People's Hospital, 100 Minjiang Main Road, Quzhou, Zhejiang 324000, China

32
Kulkarni V, Jayakumar S, Mohan M, Kulkarni S. Aid or Antagonize: Nuclear Long Noncoding RNAs Regulate Host Responses and Outcomes of Viral Infections. Cells 2023; 12:987. [PMID: 37048060 PMCID: PMC10093752 DOI: 10.3390/cells12070987] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/06/2022] [Revised: 03/12/2023] [Accepted: 03/15/2023] [Indexed: 04/14/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) are transcripts measuring >200 bp in length and devoid of protein-coding potential. LncRNAs exceed the number of protein-coding mRNAs and regulate cellular, developmental, and immune pathways through diverse molecular mechanisms. In recent years, lncRNAs have emerged as epigenetic regulators with prominent roles in health and disease. Many lncRNAs, either host or virus-encoded, have been implicated in critical cellular defense processes, such as cytokine and antiviral gene expression, the regulation of cell signaling pathways, and the activation of transcription factors. In addition, cellular and viral lncRNAs regulate virus gene expression. Viral infections and associated immune responses alter the expression of host lncRNAs regulating immune responses, host metabolism, and viral replication. The influence of lncRNAs on the pathogenesis and outcomes of viral infections is being widely explored because virus-induced lncRNAs can serve as diagnostic and therapeutic targets. Future studies should focus on thoroughly characterizing lncRNA expressions in virus-infected primary cells, investigating their role in disease prognosis, and developing biologically relevant animal or organoid models to determine their suitability for specific therapeutic targeting. Many cellular and viral lncRNAs localize in the nucleus and epigenetically modulate viral transcription, latency, and host responses to infection. In this review, we provide an overview of the role of nuclear lncRNAs in the pathogenesis and outcomes of viral infections, such as the Influenza A virus, Sendai Virus, Respiratory Syncytial Virus, Hepatitis C virus, Human Immunodeficiency Virus, and Herpes Simplex Virus. We also address significant advances and barriers in characterizing lncRNA function and explore the potential of lncRNAs as therapeutic targets.
Affiliation(s)
- Viraj Kulkarni
  - Disease Intervention and Prevention Program, Texas Biomedical Research Institute, San Antonio, TX 78227, USA
- Sahana Jayakumar
  - Host-Pathogen Interaction Program, Texas Biomedical Research Institute, San Antonio, TX 78227, USA
- Mahesh Mohan
  - Host-Pathogen Interaction Program, Texas Biomedical Research Institute, San Antonio, TX 78227, USA
- Smita Kulkarni
  - Host-Pathogen Interaction Program, Texas Biomedical Research Institute, San Antonio, TX 78227, USA