1
Zeng S, Zhang S, Wang Z, Yang C, Yuan S. GONNMDA: A Ordered Message Passing GNN Approach for miRNA-Disease Association Prediction. Genes (Basel) 2025; 16:425. [PMID: 40282386 PMCID: PMC12027447 DOI: 10.3390/genes16040425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2025] [Revised: 03/26/2025] [Accepted: 03/27/2025] [Indexed: 04/29/2025]
Abstract
Small non-coding molecules known as microRNAs (miRNAs) play a critical role in disease diagnosis, treatment, and prognosis evaluation. Traditional wet-lab methods for validating miRNA-disease associations are often time-consuming and inefficient. With the advancement of high-throughput sequencing technologies, deep learning methods have become effective tools for uncovering potential patterns in miRNA-disease associations and revealing novel biological insights. Most existing approaches focus primarily on individual molecular behavior, overlooking interactions at the multi-molecular level. Conventional graph neural network (GNN) models struggle to generalize to heterogeneous graphs, and as network depth increases, node representations become indistinguishable due to over-smoothing, resulting in reduced predictive performance. GONNMDA first integrates similarity features from multiple data sources and applies noise reduction to obtain a reconstructed, comprehensive similarity representation. It then constructs heterogeneous graphs and applies a root-tree hierarchical alignment, along with an ordered gating message-passing mechanism, to address the challenges of heterogeneity and over-smoothing. Finally, a multilayer perceptron produces the final association predictions. To evaluate the effectiveness of GONNMDA, we conducted extensive experiments in which the model achieved an AUC of 95.49% and an AUPR of 95.32%. The results demonstrate that GONNMDA outperforms several recent state-of-the-art methods. In addition, case studies and survival analyses on three common human cancers (breast cancer, rectal cancer, and lung cancer) further validate the effectiveness and reliability of GONNMDA in predicting miRNA-disease associations.
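The abstract does not spell out the ordered gating mechanism. As a rough illustration only, the sketch below shows one common way such a gate can be built: a cumulative softmax produces a monotone gate, and a soft-OR forces the gate to grow from layer to layer, so that part of each node state is protected from further mixing. The function names, the toy vectors, and the exact update rule are assumptions made for this example and are not taken from GONNMDA.

```python
import math

def cumax(logits):
    """Cumulative softmax: a non-decreasing gate with values in [0, 1]."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    gate, running = [], 0.0
    for e in exps:
        running += e / total
        gate.append(running)
    return gate

def ordered_update(h_prev, message, gate_prev, logits):
    """One hypothetical ordered-gating step for a single node.

    gate_prev is the gate from the previous layer (all zeros before layer 1).
    Returns the new node state and the new gate.
    """
    raw = cumax(logits)
    # Soft-OR: each gate entry is at least as large as in the previous layer.
    gate = [g + (1.0 - g) * r for g, r in zip(gate_prev, raw)]
    # Entries with a gate near 1 keep the old state; the rest take the message.
    h_new = [g * h + (1.0 - g) * m for g, h, m in zip(gate, h_prev, message)]
    return h_new, gate

# Toy run over two layers with made-up gate logits and a zero message.
h, gate = [1.0, 2.0, 3.0, 4.0], [0.0] * 4
for layer_logits in ([2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]):
    h, gate = ordered_update(h, [0.0] * 4, gate, layer_logits)
print([round(g, 3) for g in gate])
```

Because the gate can only increase with depth, entries that are already close to 1 stop absorbing neighbour messages, which is one intuition for how this kind of gating can limit over-smoothing.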
Affiliation(s)
- Shanwen Zhang
  - School of Electronic Information, Xijing University, Xi’an 710123, China; (S.Z.); (Z.W.); (C.Y.); (S.Y.)
2
Chen Y, Du Z, Ren X, Pan C, Zhu Y, Li Z, Meng T, Yao X. mRNA-CLA: An interpretable deep learning approach for predicting mRNA subcellular localization. Methods 2024; 227:17-26. [PMID: 38705502 DOI: 10.1016/j.ymeth.2024.04.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 03/30/2024] [Accepted: 04/28/2024] [Indexed: 05/07/2024]
Abstract
Messenger RNA (mRNA) is vital for post-transcriptional gene regulation, acting as the direct template for protein synthesis. However, the methods available for predicting mRNA subcellular localization still need improvement. Notably, few existing algorithms can annotate mRNA sequences with multiple localizations. In this work, we propose mRNA-CLA, an innovative multi-label subcellular localization prediction framework for mRNA that leverages a deep learning approach with a multi-head self-attention mechanism. The framework employs a multi-scale convolutional layer to extract sequence features across different regions and uses a self-attention mechanism explicitly designed for each sequence. Paired with Position Weight Matrices (PWMs) derived from the convolutional neural network layers, our model offers interpretability in the analysis. In particular, we perform a base-level analysis of mRNA sequences from diverse subcellular localizations to determine the nucleotide specificity corresponding to each site. Our evaluations demonstrate that the mRNA-CLA model substantially outperforms existing methods and tools.
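For readers unfamiliar with self-attention, the following is a minimal single-head sketch of scaled dot-product self-attention in plain Python. It is not the mRNA-CLA architecture: it assumes queries, keys, and values are the input features themselves (no learned projections, no multiple heads), and the toy feature vectors are invented stand-ins for convolutional outputs.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x):
    """Single-head scaled dot-product self-attention over feature vectors.

    Assumption: queries, keys and values are the inputs themselves
    (no learned projection matrices), to keep the sketch short.
    """
    d = len(x[0])
    scores = [[sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x] for q in x]
    weights = [softmax(row) for row in scores]
    out = [[sum(w * v[j] for w, v in zip(row, x)) for j in range(d)] for row in weights]
    return out, weights

# Toy input: three sequence positions with 2-dimensional features
# (hypothetical values standing in for convolutional feature maps).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, attn = self_attention(features)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of the attention matrix sums to 1 and indicates how strongly one position draws on every other position, which is the quantity usually inspected when attention is used for interpretation.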
Affiliation(s)
- Yifan Chen
  - Institute of Artificial Intelligence Application, College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, Hunan 410004, China
- Zhenya Du
  - Guangzhou Xinhua University, 510520, Guangzhou, China
- Xuanbai Ren
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Chu Pan
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Yangbin Zhu
  - Manufacturing and Electronic Engineering, Wenzhou University of Technology, 325027, Wenzhou, China
- Zhen Li
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China
- Tao Meng
  - Institute of Artificial Intelligence Application, College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, Hunan 410004, China
- Xiaojun Yao
  - Faculty of Applied Sciences, Macao Polytechnic University, 999078, Macao
3
Wei L, Zou Q, Zeng X. Editorial: Artificial intelligence in drug discovery and development. Methods 2024; 226:133-137. [PMID: 38582311 DOI: 10.1016/j.ymeth.2024.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2024]
Affiliation(s)
- Leyi Wei
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao 999078, China
  - School of Software, Shandong University, Jinan 250101, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
4
Peng L, Yang Y, Yang C, Li Z, Cheong N. HRGCNLDA: Forecasting of lncRNA-disease association based on hierarchical refinement graph convolutional neural network. Mathematical Biosciences and Engineering: MBE 2024; 21:4814-4834. [PMID: 38872515 DOI: 10.3934/mbe.2024212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Long non-coding RNA (lncRNA) is considered to be a crucial regulator involved in various human biological processes, including the regulation of tumor immune checkpoint proteins. It has great potential as both a cancer biomarker and a therapeutic target. Nevertheless, conventional biological experimental techniques are both resource-intensive and laborious, making it essential to develop an accurate and efficient computational method to facilitate the discovery of potential links between lncRNAs and diseases. In this study, we proposed HRGCNLDA, a computational approach utilizing hierarchical refinement of graph convolutional neural networks for forecasting potential lncRNA-disease associations. This approach effectively addresses the over-smoothing problem that arises from stacking multiple layers of graph convolutional neural networks. Specifically, HRGCNLDA enhances the layer representation during message propagation and node updates, thereby amplifying the contribution of hidden layers that resemble the ego layer while reducing discrepancies. The results of the experiments showed that HRGCNLDA achieved the highest AUC-ROC (area under the receiver operating characteristic curve, AUC for short) and AUC-PR (area under the precision versus recall curve, AUPR for short) values compared to other methods. Finally, to further demonstrate the reliability and efficacy of our approach, we performed case studies on three prevalent human diseases: breast cancer, lung cancer, and gastric cancer.
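Since several entries in this list report AUC and AUPR, a small worked sketch of both metrics may help. The code computes AUC as the fraction of positive-negative pairs ranked correctly, and estimates AUPR by average precision, which is one common step-wise approximation. Whether the cited papers used this estimator or trapezoidal integration is not stated in the abstracts, and the labels and scores below are invented.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision: mean of precision values at each positive's rank.

    Assumes no tied scores and at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Made-up predictions for five candidate associations (1 = known association).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
print(round(auc_roc(labels, scores), 4))
print(round(average_precision(labels, scores), 4))
```

On heavily imbalanced association data, AUPR tends to be the more demanding of the two numbers, which is why papers in this area usually report both.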
Affiliation(s)
- Li Peng
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
  - Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan 411201, China
- Yujie Yang
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Cheng Yang
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Zejun Li
  - School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang 421002, China
- Ngai Cheong
  - Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
5
Xu L, Fu X, Zhuo L, Zhou Z, Liao X, Tian S, Kang R, Chen Y. SGAE-MDA: Exploring the MiRNA-disease associations in herbal medicines based on semi-supervised graph autoencoder. Methods 2024; 221:73-81. [PMID: 38123109 DOI: 10.1016/j.ymeth.2023.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/28/2023] [Accepted: 12/12/2023] [Indexed: 12/23/2023]
Abstract
Research indicates that miRNAs present in herbal medicines are crucial for identifying disease markers, advancing gene therapy, facilitating drug delivery, and so on. These miRNAs maintain stability in the extracellular environment, making them viable tools for disease diagnosis. They can withstand the digestive processes in the gastrointestinal tract, positioning them as potential carriers for specific oral drug delivery. By engineering plants to generate effective, non-toxic miRNA interference sequences, it is possible to broaden their applicability, including the treatment of diseases such as hepatitis C. Consequently, delving into the miRNA-disease associations (MDAs) within herbal medicines holds immense promise for diagnosing and addressing miRNA-related diseases. In our research, we propose the SGAE-MDA model, which harnesses the strengths of a graph autoencoder (GAE) combined with a semi-supervised approach to uncover potential MDAs in herbal medicines more effectively. Leveraging the GAE framework, the SGAE-MDA model integrates the inherent feature vectors of miRNA and disease nodes with the regulatory data in the miRNA-disease network. Additionally, the proposed semi-supervised learning approach randomly hides part of the structure of the miRNA-disease network and subsequently reconstructs it within the GAE framework. This technique effectively minimizes network noise interference. In comparisons against other leading deep learning models, the results consistently highlighted the superior performance of the proposed SGAE-MDA model. Our code and dataset are available at: https://github.com/22n9n23/SGAE-MDA.
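The idea of randomly hiding part of the network and then reconstructing it can be illustrated with a short edge-masking sketch. This is not the SGAE-MDA code (see the repository linked in the abstract for that): the function name `mask_edges`, the hide fraction, and the toy miRNA and disease identifiers are all assumptions made for this example.

```python
import random

def mask_edges(edges, hide_fraction, seed=0):
    """Split known miRNA-disease edges into a visible set and a hidden set.

    The visible edges would be fed to an encoder; the hidden edges serve as
    reconstruction targets. hide_fraction is a hypothetical hyperparameter.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_hidden = int(len(shuffled) * hide_fraction)
    hidden = shuffled[:n_hidden]
    visible = shuffled[n_hidden:]
    return visible, hidden

# Toy association list with invented identifiers.
known = [("miR-a", "disease-1"), ("miR-a", "disease-2"),
         ("miR-b", "disease-1"), ("miR-c", "disease-3"),
         ("miR-d", "disease-2")]
visible, hidden = mask_edges(known, hide_fraction=0.4)
print(len(visible), len(hidden))
```

A model trained to recover the hidden edges from the visible ones cannot simply memorise the input graph, which is one reason masking of this kind is used to reduce sensitivity to noisy or missing associations.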
Affiliation(s)
- Lei Xu
  - Wenzhou University of Technology, Wenzhou, China
- Xiangzheng Fu
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, China
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Linlin Zhuo
  - Wenzhou University of Technology, Wenzhou, China
- Xuefeng Liao
  - Wenzhou University of Technology, Wenzhou, China
- Sha Tian
  - Department of Internal Medicine, College of Integrated Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China
- Ruofei Kang
  - Xuhui Excellent Health Information Technology Co., Ltd., China
- Yifan Chen
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
6
Yang C, Wang Z, Zhang S, Li X, Wang X, Liu J, Li R, Zeng S. MVNMDA: A Multi-View Network Combing Semantic and Global Features for Predicting miRNA-Disease Association. Molecules 2023; 29:230. [PMID: 38202814 PMCID: PMC10780172 DOI: 10.3390/molecules29010230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 12/23/2023] [Accepted: 12/28/2023] [Indexed: 01/12/2024]
Abstract
A growing body of experimental evidence suggests that microRNAs (miRNAs) are closely associated with specific human diseases and play critical roles in their development and progression. Therefore, identifying miRNAs related to specific diseases is of great significance for disease screening and treatment. In the early stages, the identification of associations between miRNAs and diseases demanded laborious and time-consuming biological experiments that often carried a substantial risk of failure. With the exponential growth in the number of potential miRNA-disease association combinations, traditional biological experimental methods face difficulties in processing massive amounts of data. Hence, developing more efficient computational methods to predict possible miRNA-disease associations and prioritize them is particularly necessary. In recent years, numerous deep learning-based computational methods have been developed and have demonstrated excellent performance. However, most of these methods rely on external databases or tools to compute various auxiliary information. Unfortunately, these external databases or tools often cover only a limited portion of miRNAs and diseases, so many miRNAs and diseases cannot be handled by these computational methods. Therefore, there are certain limitations associated with the practical application of these methods. To overcome the above limitations, this study proposes a multi-view computational model called MVNMDA, which predicts potential miRNA-disease associations by integrating features of miRNAs and diseases from local views, global views, and semantic views. Specifically, MVNMDA utilizes known association information to construct initial node features. Then, multiple networks are constructed based on known associations to extract low-dimensional feature embeddings of all nodes. Finally, a cascaded attention classifier is proposed to fuse features from coarse to fine, suppressing noise within the features and making precise predictions. To validate the effectiveness of the proposed method, extensive experiments were conducted on the HMDD v2.0 and HMDD v3.2 datasets. The experimental results demonstrate that MVNMDA achieves better performance compared to other computational methods. Additionally, the case study results further demonstrate the reliable predictive performance of MVNMDA.
Affiliation(s)
- Zhen Wang
  - School of Electronic Information, Xijing University, Xi’an 710123, China; (C.Y.); (S.Z.); (X.L.); (X.W.); (J.L.); (R.L.); (S.Z.)
7
Fu X, Chen Y, Tian S. DlncRNALoc: A discrete wavelet transform-based model for predicting lncRNA subcellular localization. Mathematical Biosciences and Engineering: MBE 2023; 20:20648-20667. [PMID: 38124569 DOI: 10.3934/mbe.2023913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
The prediction of long non-coding RNA (lncRNA) subcellular localization is essential to the understanding of its function and involvement in cellular regulation. Traditional biological experimental methods are costly and time-consuming, making computational methods the preferred approach for predicting lncRNA subcellular localization (LSL). However, existing computational methods have limitations due to the structural characteristics of lncRNAs and the uneven distribution of data across subcellular compartments. We propose a discrete wavelet transform (DWT)-based model for predicting LSL, called DlncRNALoc. We construct a physicochemical property matrix of 2-tuple bases based on lncRNA sequences, and we introduce a DWT-based lncRNA feature extraction method. We use the Synthetic Minority Over-sampling Technique (SMOTE) for oversampling and the local Fisher discriminant analysis (LFDA) algorithm to optimize feature information. The optimized feature vectors are fed into a support vector machine (SVM) to construct a predictive model. DlncRNALoc was evaluated with five-fold cross-validation on three benchmark datasets. Extensive experiments have demonstrated the superiority and effectiveness of the DlncRNALoc model in predicting LSL.
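To make the DWT step concrete, here is a minimal sketch of a one-level discrete wavelet transform and a simple way to turn repeated decompositions into a fixed-length feature vector. The Haar wavelet, the choice of summary statistics, and the toy numeric signal are assumptions for illustration; the abstract does not state which wavelet or statistics DlncRNALoc uses.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (even-length input).

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def dwt_features(signal, levels=2):
    """Hypothetical feature vector: mean |detail| per level, then mean of the
    final approximation. The signal length must be divisible by 2**levels."""
    feats = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        feats.append(sum(abs(d) for d in detail) / len(detail))
    feats.append(sum(current) / len(current))
    return feats

# Toy signal standing in for one physicochemical property along a sequence.
prop = [0.2, 0.4, 0.1, 0.9, 0.5, 0.5, 0.3, 0.7]
print([round(f, 4) for f in dwt_features(prop, levels=2)])
```

The appeal of this kind of encoding is that sequences of different lengths can be summarised at several resolutions into vectors of the same size, which a downstream classifier such as an SVM requires.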
Affiliation(s)
- Xiangzheng Fu
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, China
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
  - Department of Basic Biology, Changsha Medical College, Changsha, Hunan, China
- Yifan Chen
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
  - Department of Basic Biology, Changsha Medical College, Changsha, Hunan, China
- Sha Tian
  - Department of Internal Medicine, College of Integrated Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China