1
Zeng S, Zhang S, Wang Z, Yang C, Yuan S. GONNMDA: A Ordered Message Passing GNN Approach for miRNA-Disease Association Prediction. Genes (Basel) 2025; 16:425. [PMID: 40282386 PMCID: PMC12027447 DOI: 10.3390/genes16040425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2025] [Revised: 03/26/2025] [Accepted: 03/27/2025] [Indexed: 04/29/2025]
Abstract
Small non-coding molecules known as microRNAs (miRNAs) play a critical role in disease diagnosis, treatment, and prognosis evaluation. Traditional wet-lab methods for validating miRNA-disease associations are often time-consuming and inefficient. With the advancement of high-throughput sequencing technologies, deep learning methods have become effective tools for uncovering potential patterns in miRNA-disease associations and revealing novel biological insights. Most existing approaches focus primarily on individual molecular behavior, overlooking interactions at the multi-molecular level. Conventional graph neural network (GNN) models struggle to generalize to heterogeneous graphs, and as network depth increases, node representations become indistinguishable due to over-smoothing, resulting in reduced predictive performance. GONNMDA first integrates similarity features from multiple data sources and applies noise reduction to obtain a reconstructed, comprehensive similarity representation. It then constructs heterogeneous graphs and applies a root-tree hierarchical alignment, along with an ordered gating message-passing mechanism, to address the challenges of heterogeneity and over-smoothing. Finally, a multilayer perceptron is employed to produce the final association predictions. To evaluate the effectiveness of GONNMDA, we conducted extensive experiments in which the model achieved an AUC of 95.49% and an AUPR of 95.32%. The results demonstrate that GONNMDA outperforms several recent state-of-the-art methods. In addition, case studies and survival analyses on three common human cancers (breast cancer, rectal cancer, and lung cancer) further validate the effectiveness and reliability of GONNMDA in predicting miRNA-disease associations.
Affiliation(s)
- Shanwen Zhang
- School of Electronic Information, Xijing University, Xi’an 710123, China; (S.Z.); (Z.W.); (C.Y.); (S.Y.)
2
Yang G, Liu Y, Wen S, Chen W, Zhu X, Wang Y. DTI-MHAPR: optimized drug-target interaction prediction via PCA-enhanced features and heterogeneous graph attention networks. BMC Bioinformatics 2025; 26:11. [PMID: 39800678 PMCID: PMC11726937 DOI: 10.1186/s12859-024-06021-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Accepted: 12/20/2024] [Indexed: 01/16/2025]
Abstract
Drug-target interactions (DTIs) are pivotal in drug discovery and development, and their accurate identification can significantly expedite the process. Numerous DTI prediction methods have emerged, yet many fail to fully harness the feature information of drugs and targets or address the issue of feature redundancy. We aim to refine DTI prediction accuracy by eliminating redundant features and capitalizing on the node topological structure to enhance feature extraction. To achieve this, we introduce a PCA-augmented multi-layer heterogeneous graph-based network that concentrates on key features throughout the encoding-decoding phase. Our approach initiates with the construction of a heterogeneous graph from various similarity metrics, which is then encoded via a graph neural network. We concatenate and integrate the resultant representation vectors to merge multi-level information. Subsequently, principal component analysis is applied to distill the most informative features, with the random forest algorithm employed for the final decoding of the integrated data. Our method outperforms six baseline models in terms of accuracy, as demonstrated by extensive experimentation. Comprehensive ablation studies, visualization of results, and in-depth case analyses further validate our framework's efficacy and interpretability, providing a novel tool for drug discovery that integrates multimodal features.
Affiliation(s)
- Guang Yang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China
- Yinbo Liu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China
- Sijian Wen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China
- Wenxi Chen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China
- Xiaolei Zhu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China
- Yongmei Wang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China.
3
Liu Y, Wu Q, Zhou L, Liu Y, Li C, Wei Z, Peng W, Yue Y, Zhu X. Disentangled similarity graph attention heterogeneous biological memory network for predicting disease-associated miRNAs. BMC Genomics 2024; 25:1161. [PMID: 39623332 PMCID: PMC11610307 DOI: 10.1186/s12864-024-11078-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Accepted: 11/21/2024] [Indexed: 12/06/2024]
Abstract
BACKGROUND The association between microRNAs (miRNAs) and diseases is crucial in treating and exploring many diseases or cancers. Although wet-lab methods for predicting miRNA-disease associations (MDAs) are effective, they are often expensive and time-consuming. Significant advancements have been made using Graph Neural Network-based methods (GNN-MDAs) to address these challenges. However, these methods still face limitations, such as not considering nodes' deep-level similarity associations and hierarchical learning patterns. Additionally, current models do not retain the memory of previously learned heterogeneous historical information about miRNAs or diseases, only focusing on parameter learning without the capability to remember heterogeneous associations. RESULTS This study introduces the K-means disentangled high-level biological similarity to fully utilize potential hierarchical relationships and proposes a Graph Attention Heterogeneous Biological Memory Network architecture (DiGAMN) with memory capabilities. Extensive experiments were conducted across four datasets, comparing the DiGAMN model and its disentangling method against ten state-of-the-art non-disentangled methods and six traditional GNNs. DiGAMN excelled, achieving AUC scores of 96.35%, 96.10%, 96.01%, and 95.89% on the Data1 to Data4 datasets, respectively, surpassing all other models. These results confirm the superior performance of DiGAMN and its disentangling method. Additionally, various ablation studies were conducted to validate the contributions of different modules within the framework, and the encoding statuses and memory units of DiGAMN were visualized to explore the utility and functionality of its modules. Case studies confirmed the effectiveness of DiGAMN's predictions, identifying several new disease-associated miRNAs.
CONCLUSIONS DiGAMN introduces the use of a disentangled biological similarity approach for the first time and successfully constructs a Disentangled Graph Attention Heterogeneous Biological Memory Network model. This network can learn disentangled representations of similarity information and effectively store the potential biological entanglement information of miRNAs and diseases. By integrating disentangled similarity information with a heterogeneous attention memory network, DiGAMN enhances the model's ability to capture and utilize complex underlying biological data, significantly outperforming many existing models. The concepts used in this method also provide new perspectives for predicting miRNAs associated with diseases.
Affiliation(s)
- Yinbo Liu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Qi Wu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Le Zhou
- Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Yuchen Liu
- Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- China University of Petroleum, Beijing, Beijing, 102249, China
- Chao Li
- Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Zhuoyu Wei
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Wei Peng
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Yi Yue
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China.
- Xiaolei Zhu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China.
4
Ning Q, Zhao Y, Gao J, Chen C, Yin M. Hierarchical Hypergraph Learning in Association-Weighted Heterogeneous Network for miRNA-Disease Association Identification. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:2531-2542. [PMID: 39475747 DOI: 10.1109/tcbb.2024.3485788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
MicroRNAs (miRNAs) play a significant role in cell differentiation, biological development as well as the occurrence and growth of diseases. Although many computational methods contribute to predicting the association between miRNAs and diseases, they do not fully explore the attribute information contained in associated edges between miRNAs and diseases. In this study, we propose a new method, Hierarchical Hypergraph learning in Association-Weighted heterogeneous network for MiRNA-Disease association identification (HHAWMD). HHAWMD first adaptively fuses multi-view similarities based on channel attention and distinguishes the relevance of different associated relationships according to changes in expression levels of disease-related miRNAs, miRNA similarity information, and disease similarity information. Then, HHAWMD assigns edge weights and attribute features according to the association level to construct an association-weighted heterogeneous graph. Next, HHAWMD extracts the subgraph of the miRNA-disease node pair from the heterogeneous graph and builds the hyperedge (a kind of virtual edge) between the node pair to generate the hypergraph. Finally, HHAWMD proposes a hierarchical hypergraph learning approach, including node-aware attention and hyperedge-aware attention, which aggregates the abundant semantic information contained in deep and shallow neighborhoods to the hyperedge in the hypergraph. Our experiment results suggest that HHAWMD has better performance and can be used as a powerful tool for miRNA-disease association identification.
5
Luo L, Tan Z, Wang S. RSANMDA: Resampling based subview attention network for miRNA-disease association prediction. Methods 2024; 230:99-107. [PMID: 39097178 DOI: 10.1016/j.ymeth.2024.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 07/16/2024] [Accepted: 07/23/2024] [Indexed: 08/05/2024]
Abstract
Many studies have demonstrated the importance of accurately identifying miRNA-disease associations (MDAs) for understanding disease mechanisms. However, the number of known MDAs is significantly fewer than the unknown pairs. Here, we propose RSANMDA, a subview attention network for predicting MDAs. We first extract miRNA and disease features from multiple similarity matrices. Next, using resampling techniques, we generate different subviews from known MDAs. Each subview undergoes multi-head graph attention to capture its features, followed by semantic attention to integrate features across subviews. Finally, combining raw and training features, we use a multilayer scoring perceptron for prediction. In the experimental section, we conducted comparative experiments with other advanced models on both HMDD v2.0 and HMDD v3.2 datasets. We also performed a series of ablation studies and parameter tuning exercises. Comprehensive experiments conclusively demonstrate the superiority of our model. Case studies on lung, breast, and esophageal cancers further validate our method's predictive capability for identifying disease-related miRNAs.
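The resampling step described in this abstract can be pictured with a small sketch. The abstract does not specify the sampling scheme, so the version below assumes that each subview keeps all known MDAs as positives and draws an equally sized random set of unlabeled pairs as presumed negatives. The function name `make_subviews` and the toy indices are hypothetical and not taken from the RSANMDA code.

```python
import random


def make_subviews(known_pairs, n_mirnas, n_diseases, n_views, seed=0):
    """Build n_views balanced sample sets from known associations.

    Assumption: every subview reuses all known (positive) pairs and adds an
    equally sized random draw of unlabeled pairs as presumed negatives.
    """
    rng = random.Random(seed)
    positives = sorted(set(known_pairs))
    known = set(positives)
    # Every miRNA-disease pair that has no recorded association.
    unlabeled = [
        (m, d)
        for m in range(n_mirnas)
        for d in range(n_diseases)
        if (m, d) not in known
    ]
    views = []
    for _ in range(n_views):
        negatives = rng.sample(unlabeled, len(positives))
        views.append({"positives": positives, "negatives": negatives})
    return views


# Toy data: 4 miRNAs, 3 diseases, 3 known associations (hypothetical indices).
known = [(0, 0), (1, 2), (3, 1)]
views = make_subviews(known, n_mirnas=4, n_diseases=3, n_views=5)
```

Because unknown pairs far outnumber known ones, each draw yields a different negative set, which is what would make the subviews differ from one another under this assumption.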
Affiliation(s)
- Longfei Luo
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Zhuokun Tan
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
6
Wen S, Liu Y, Yang G, Chen W, Wu H, Zhu X, Wang Y. A method for miRNA diffusion association prediction using machine learning decoding of multi-level heterogeneous graph Transformer encoded representations. Sci Rep 2024; 14:20490. [PMID: 39227405 PMCID: PMC11371806 DOI: 10.1038/s41598-024-68897-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 07/29/2024] [Indexed: 09/05/2024]
Abstract
MicroRNAs (miRNAs) are a key class of endogenous non-coding RNAs that play a pivotal role in regulating diseases. Accurately predicting the intricate relationships between miRNAs and diseases carries profound implications for disease diagnosis, treatment, and prevention. However, these prediction tasks are highly challenging due to the complexity of the underlying relationships. While numerous effective prediction models exist for validating these associations, they often encounter information distortion due to limitations in efficiently retaining information during the encoding-decoding process. Inspired by the Multi-layer Heterogeneous Graph Transformer and the XGBoost machine learning classifier, this study introduces a novel computational approach based on a multi-layer heterogeneous encoder-machine learning decoder structure for miRNA-disease association prediction (MHXGMDA). First, we employ the multi-view similarity matrices as the input coding for MHXGMDA. Subsequently, we utilize the multi-layer heterogeneous encoder to learn the embeddings of miRNAs and diseases, aiming to capture the maximum amount of relevant features. Finally, the information from all layers is concatenated to serve as input to the machine learning classifier, ensuring maximal preservation of encoding details. We conducted a comprehensive comparison of seven different classifier models and ultimately selected the XGBoost algorithm as the decoder. This algorithm leverages miRNA embedding features and disease embedding features to decode and predict the association scores between miRNAs and diseases. We applied MHXGMDA to predict human miRNA-disease associations on two benchmark datasets. Experimental findings demonstrate that our approach surpasses several leading methods in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve.
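For readers unfamiliar with the two metrics named at the end of this abstract, the following is a minimal sketch of how they can be computed from labels and predicted scores. It is a generic illustration, not the evaluation code of MHXGMDA: the AUC uses the pairwise-ranking definition, and the precision-recall area is approximated by average precision. The function names and the toy labels and scores are made up for the example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Samples are ranked by descending score; tied scores keep input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Toy predictions: 1 marks a known association, 0 an unlabeled pair.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(round(roc_auc(labels, scores), 4))
print(round(average_precision(labels, scores), 4))
```

The pairwise AUC is quadratic in the number of samples, which is fine for illustration; a library implementation would normally be used on full benchmark datasets.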
Affiliation(s)
- SiJian Wen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- YinBo Liu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Guang Yang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- WenXi Chen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- HaiTao Wu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- XiaoLei Zhu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China.
- YongMei Wang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China.
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Hefei, 230036, China.
7
Sun W, Zhang P, Zhang W, Xu J, Huang Y, Li L. Synchronous Mutual Learning Network and Asynchronous Multi-Scale Embedding Network for miRNA-Disease Association Prediction. Interdiscip Sci 2024; 16:532-553. [PMID: 38310628 DOI: 10.1007/s12539-023-00602-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 02/06/2024]
Abstract
MicroRNA (miRNA) serves as a pivotal regulator of numerous cellular processes, and the identification of miRNA-disease associations (MDAs) is crucial for comprehending complex diseases. Recently, graph neural networks (GNN) have made significant advancements in MDA prediction. However, these methods tend to learn one type of node representation from a single heterogeneous network, ignoring the importance of multiple network topologies and node attributes. Here, we propose SMDAP (Sequence hierarchical modeling-based Mirna-Disease Association Prediction framework), a novel GNN-based framework that incorporates multiple network topologies and various node attributes including miRNA seed and full-length sequences to predict potential MDAs. Specifically, SMDAP consists of two types of MDA representation: following a heterogeneous pattern, we construct a transfer learning-like synchronous mutual learning network to learn the first MDA representation in conjunction with the miRNA seed sequence. Meanwhile, following a homogeneous pattern, we design a subgraph-inspired asynchronous multi-scale embedding network to obtain the second MDA representation based on the miRNA full-length sequence. Subsequently, an adaptive fusion approach is designed to combine the two branches such that we can score the MDAs by the downstream classifier and infer novel MDAs. Comprehensive experiments demonstrate that SMDAP integrates the advantages of multiple network topologies and node attributes into two branch representations. Moreover, the area under the receiver operating characteristic curve is 0.9622 on DB1, which is a 5.06% increase from the baselines. The area under the precision-recall curve is 0.9777, which is a 7.33% increase from the baselines. In addition, case studies on three human cancers validated the predictive performance of SMDAP. Overall, SMDAP represents a powerful tool for MDA prediction.
Affiliation(s)
- Weicheng Sun
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Ping Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Weihan Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Jinsheng Xu
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Li Li
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.
- Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China.
8
Liang X, Guo M, Jiang L, Fu Y, Zhang P, Chen Y. Predicting miRNA-Disease Associations by Combining Graph and Hypergraph Convolutional Network. Interdiscip Sci 2024; 16:289-303. [PMID: 38286905 DOI: 10.1007/s12539-023-00599-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 12/15/2023] [Accepted: 12/17/2023] [Indexed: 01/31/2024]
Abstract
miRNAs are important regulators for many crucial biological processes. Many recent studies have shown that miRNAs are closely related to various human diseases and can be potential biomarkers or therapeutic targets for some diseases, such as cancers. Therefore, accurately predicting miRNA-disease associations is of great importance for understanding and curing diseases. However, how to efficiently utilize the characteristics of miRNAs and diseases and the information on known miRNA-disease associations for prediction is still not fully explored. In this study, we propose a novel computational method for predicting miRNA-disease associations. The proposed method combines the graph convolutional network and the hypergraph convolutional network. The graph convolutional network is utilized to extract the information from miRNA-similarity data as well as disease-similarity data. Based on the representations of miRNAs and diseases learned by the graph convolutional network, we further use the hypergraph convolutional network to capture the complex high-order interactions in the known miRNA-disease associations. We conduct comprehensive experiments with different datasets and predictive tasks. The results show that the proposed method consistently outperforms several other state-of-the-art methods. We also discuss the influence of hyper-parameters and model structures on the performance of our method. Some case studies also demonstrate that the predictive results of the method can be verified by independent experiments.
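To make the notion of high-order interactions concrete, here is a small sketch of a hypergraph incidence matrix built from known associations. The abstract does not say how hyperedges are defined, so this example assumes one hyperedge per disease that joins all miRNAs associated with it. The function names and the toy associations are hypothetical.

```python
def incidence_matrix(associations, n_mirnas, n_diseases):
    """H[m][d] = 1 when miRNA m belongs to the hyperedge of disease d.

    Assumption: each disease defines one hyperedge over its known miRNAs.
    """
    H = [[0] * n_diseases for _ in range(n_mirnas)]
    for m, d in associations:
        H[m][d] = 1
    return H


def hyperedge_degrees(H):
    """Number of miRNAs joined by each hyperedge (column sums)."""
    return [sum(row[d] for row in H) for d in range(len(H[0]))]


def node_degrees(H):
    """Number of hyperedges each miRNA takes part in (row sums)."""
    return [sum(row) for row in H]


# Toy data: disease 0 joins miRNAs 0, 1, 2; disease 1 joins miRNAs 1 and 3.
associations = [(0, 0), (1, 0), (2, 0), (1, 1), (3, 1)]
H = incidence_matrix(associations, n_mirnas=4, n_diseases=2)
print(hyperedge_degrees(H))
print(node_degrees(H))
```

Unlike an ordinary edge, a hyperedge can connect more than two nodes at once, which is why a hypergraph convolution can propagate information among all miRNAs that share a disease in a single step.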
Affiliation(s)
- Xujun Liang
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China.
- National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China.
- Ming Guo
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- Longying Jiang
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- Department of Pathology, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- Ying Fu
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- Pengfei Zhang
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China
- Yongheng Chen
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China.
- National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, Xiangya Road, Changsha, 410008, China.
9
Liu Y, Zhang R, Dong X, Yang H, Li J, Cao H, Tian J, Zhang Y. DAE-CFR: detecting microRNA-disease associations using deep autoencoder and combined feature representation. BMC Bioinformatics 2024; 25:139. [PMID: 38553698 PMCID: PMC10981315 DOI: 10.1186/s12859-024-05757-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 03/20/2024] [Indexed: 04/01/2024]
Abstract
BACKGROUND MicroRNA (miRNA) has been shown to play a key role in the occurrence and progression of diseases, making uncovering miRNA-disease associations vital for disease prevention and therapy. However, traditional laboratory methods for detecting these associations are slow, strenuous, expensive, and uncertain. Although numerous advanced algorithms have emerged, it is still a challenge to develop more effective methods to explore underlying miRNA-disease associations. RESULTS In this study, we designed a novel approach based on a deep autoencoder and combined feature representation (DAE-CFR) to predict possible miRNA-disease associations. We began by creating integrated similarity matrices of miRNAs and diseases, performing a logistic function transformation, balancing positive and negative samples with k-means clustering, and constructing training samples. Then, a deep autoencoder was used to extract low-dimensional features from two kinds of feature representations for miRNAs and diseases, namely, original association information-based and similarity information-based. Next, we combined the resulting features for each miRNA-disease pair and used a logistic regression (LR) classifier to infer all unknown miRNA-disease interactions. Under five- and tenfold cross-validation (CV) frameworks, DAE-CFR not only outperformed six popular algorithms and nine classifiers, but also demonstrated superior performance on an additional dataset. Furthermore, case studies on three diseases (myocardial infarction, hypertension and stroke) confirmed the validity of DAE-CFR in practice. CONCLUSIONS DAE-CFR achieved outstanding performance in predicting miRNA-disease associations and can provide evidence to inform biological experiments and clinical therapy.
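The five- and tenfold cross-validation mentioned in this abstract follows a standard pattern, sketched below in generic form. This is not the DAE-CFR evaluation code; the function name `k_fold_indices`, the seed, and the sample count are assumptions chosen for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train, test) index splits over shuffled sample indices."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [order[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Toy run: 23 miRNA-disease samples, five folds.
splits = k_fold_indices(n_samples=23, k=5)
for train, test in splits:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so every known association is scored once by a model that did not see it during training.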
Affiliation(s)
- Yanling Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Department of Mathematics, Changzhi Medical College, Changzhi, China
- Ruiyan Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Xiaojing Dong
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hong Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jing Li
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hongyan Cao
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jing Tian
- Department of Cardiology, First Hospital of Shanxi Medical University, Taiyuan, China.
- Yanbo Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China.
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, China.
- School of Health and Service Management, Shanxi University of Chinese Medicine, Jinzhong, China.
10
Dong B, Sun W, Xu D, Wang G, Zhang T. DAEMDA: A Method with Dual-Channel Attention Encoding for miRNA-Disease Association Prediction. Biomolecules 2023; 13:1514. [PMID: 37892196 PMCID: PMC10604960 DOI: 10.3390/biom13101514] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/15/2023] [Accepted: 10/08/2023] [Indexed: 10/29/2023]
Abstract
A growing number of studies have shown that aberrant microRNA (miRNA) expression is closely associated with the evolution and development of various complex human diseases. Identifying and observing these key biomarkers is important for gaining a deeper understanding of disease pathogenesis and therapeutic mechanisms. Consequently, pinpointing potential miRNA-disease associations (MDA) has become a prominent bioinformatics subject, and advances in graph neural networks (GNN) have encouraged several new computational methods. Nevertheless, these existing methods commonly fail to exploit the network nodes' global feature information, leaving the generation of high-quality embedding representations using graph properties as a critical unsolved issue. To address these challenges, we introduce DAEMDA, a computational method designed to improve the efficacy of current models. First, we construct similarity and heterogeneous networks involving miRNAs and diseases, relying on experimentally corroborated miRNA-disease association data and analogous information. Next, a newly designed parallel dual-channel feature encoder captures the global information within the heterogeneous network and generates varying embedding representations. Finally, employing a neural network classifier, we merge the dual-channel embedding representations and predict associations between miRNA and disease nodes. The experimental results of five-fold cross-validation and case studies of major diseases based on the HMDD v3.2 database show that this method can generate high-quality embedded representations and effectively improve the accuracy of MDA prediction.
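A common way to combine similarity networks and known associations into one heterogeneous network is a block adjacency matrix, sketched below. The abstract does not give DAEMDA's exact construction, so this layout (miRNA similarity and disease similarity on the diagonal blocks, the association matrix and its transpose off the diagonal) is an assumption, and the function name and toy values are hypothetical.

```python
def heterogeneous_adjacency(mirna_sim, disease_sim, assoc):
    """Stack [[S_m, A], [A^T, S_d]] into one (n_m + n_d) square matrix.

    mirna_sim:   n_m x n_m miRNA similarity matrix (list of lists)
    disease_sim: n_d x n_d disease similarity matrix
    assoc:       n_m x n_d binary matrix of known associations
    """
    n_m, n_d = len(mirna_sim), len(disease_sim)
    # Upper block row: miRNA similarities followed by associations.
    top = [mirna_sim[i] + assoc[i] for i in range(n_m)]
    # Lower block row: transposed associations followed by disease similarities.
    bottom = [
        [assoc[i][j] for i in range(n_m)] + disease_sim[j]
        for j in range(n_d)
    ]
    return top + bottom


# Toy data: 2 miRNAs and 3 diseases with made-up similarity values.
mirna_sim = [[1.0, 0.2], [0.2, 1.0]]
disease_sim = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.0]]
assoc = [[1, 0, 0], [0, 1, 1]]
adj = heterogeneous_adjacency(mirna_sim, disease_sim, assoc)
for row in adj:
    print(row)
```

With symmetric similarity matrices the result is symmetric, so miRNA and disease nodes can be indexed in one shared node set by a downstream encoder.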
Affiliation(s)
- Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China; (B.D.)
- Tianjiao Zhang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China; (B.D.)
11
Sheng N, Wang Y, Huang L, Gao L, Cao Y, Xie X, Fu Y. Multi-task prediction-based graph contrastive learning for inferring the relationship among lncRNAs, miRNAs and diseases. Brief Bioinform 2023; 24:bbad276. [PMID: 37529914 DOI: 10.1093/bib/bbad276] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 04/04/2023] [Revised: 07/09/2023] [Accepted: 07/11/2023] [Indexed: 08/03/2023]
Abstract
MOTIVATION Identifying the relationships among long non-coding RNAs (lncRNAs), microRNAs (miRNAs) and diseases is highly valuable for diagnosing, preventing, treating and prognosing diseases. The development of effective computational prediction methods can reduce experimental costs. While numerous methods have been proposed, they often treat the prediction of lncRNA-disease associations (LDAs), miRNA-disease associations (MDAs) and lncRNA-miRNA interactions (LMIs) as separate tasks. Models capable of predicting all three relationships simultaneously remain relatively scarce. Our aim is to perform multi-task predictions, which not only construct a unified framework, but also facilitate mutual complementarity of information among lncRNAs, miRNAs and diseases. RESULTS In this work, we propose a novel unsupervised embedding method called graph contrastive learning for multi-task prediction (GCLMTP). Our approach aims to predict LDAs, MDAs and LMIs by simultaneously extracting embedding representations of lncRNAs, miRNAs and diseases. To achieve this, we first construct a triple-layer lncRNA-miRNA-disease heterogeneous graph (LMDHG) that integrates the complex relationships between these entities based on their similarities and correlations. Next, we employ an unsupervised embedding model based on graph contrastive learning to extract potential topological features of lncRNAs, miRNAs and diseases from the LMDHG. The graph contrastive learning leverages graph convolutional network architectures to maximize the mutual information between patch representations and corresponding high-level summaries of the LMDHG. Subsequently, for the three prediction tasks, multiple classifiers are explored to predict LDA, MDA and LMI scores. Comprehensive experiments are conducted on two datasets (from older and newer versions of the database, respectively). The results show that GCLMTP outperforms other state-of-the-art methods for the disease-related lncRNA and miRNA prediction tasks. Additionally, case studies on two datasets further demonstrate the ability of GCLMTP to accurately discover new associations. To ensure reproducibility of this work, we have made the datasets and source code publicly available at https://github.com/sheng-n/GCLMTP.
Affiliation(s)
- Nan Sheng
  - Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Yan Wang
  - Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
  - School of Artificial Intelligence, Jilin University, 130012 Changchun, China
- Lan Huang
  - Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Ling Gao
  - Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Yangkun Cao
  - School of Artificial Intelligence, Jilin University, 130012 Changchun, China
- Xuping Xie
  - Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Yuan Fu
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, UK
|
12
|
He Q, Qiao W, Fang H, Bao Y. Improving the identification of miRNA-disease associations with multi-task learning on gene-disease networks. Brief Bioinform 2023; 24:bbad203. [PMID: 37287133 DOI: 10.1093/bib/bbad203] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/13/2023] [Revised: 04/24/2023] [Accepted: 05/10/2023] [Indexed: 06/09/2023] Open
Abstract
MicroRNAs (miRNAs) are a family of non-coding RNA molecules with vital roles in regulating gene expression. Although researchers have recognized the importance of miRNAs in the development of human diseases, it is very resource-consuming to use experimental methods for identifying which dysregulated miRNA is associated with a specific disease. To reduce the cost of human effort, a growing body of studies has leveraged computational methods for predicting the potential miRNA-disease associations. However, the extant computational methods usually ignore the crucial mediating role of genes and suffer from the data sparsity problem. To address these limitations, we introduce the multi-task learning technique and develop a new model called MTLMDA (Multi-Task Learning model for predicting potential MicroRNA-Disease Associations). Different from existing models that only learn from the miRNA-disease network, our MTLMDA model exploits both miRNA-disease and gene-disease networks for improving the identification of miRNA-disease associations. To evaluate model performance, we compare our model with competitive baselines on a real-world dataset of experimentally supported miRNA-disease associations. Empirical results show that our model performs best across various performance metrics. We also examine the effectiveness of model components via an ablation study and further showcase the predictive power of our model for six types of common cancers. The data and source code are available from https://github.com/qwslle/MTLMDA.
Affiliation(s)
- Qiang He
  - College of Medicine and Biological Information Engineering, Northeastern University, 110169 Shenyang, China
- Wei Qiao
  - College of Medicine and Biological Information Engineering, Northeastern University, 110169 Shenyang, China
- Hui Fang
  - Research Institute for Interdisciplinary Science and School of Information Management and Engineering, Shanghai University of Finance and Economics, 200434 Shanghai, China
- Yang Bao
  - Antai College of Economics and Management, Shanghai Jiao Tong University, 200030 Shanghai, China
|
13
|
Wang W, Chen H. Predicting miRNA-disease associations based on graph attention networks and dual Laplacian regularized least squares. Brief Bioinform 2022; 23:6645486. [PMID: 35849099 DOI: 10.1093/bib/bbac292] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/30/2022] [Revised: 06/23/2022] [Accepted: 06/26/2022] [Indexed: 01/05/2023] Open
Abstract
Increasing biomedical evidence has proved that the dysregulation of miRNAs is associated with human complex diseases. Identification of disease-related miRNAs is of great importance for disease prevention, diagnosis and remedy. To reduce the time and cost of biomedical experiments, there is a strong incentive to develop efficient computational methods to infer potential miRNA-disease associations. Although many computational approaches have been proposed to address this issue, the prediction accuracy needs to be further improved. In this study, we present a computational framework, MKGAT, to predict possible associations between miRNAs and diseases through graph attention networks (GATs) using dual Laplacian regularized least squares. We use GATs to learn embeddings of miRNAs and diseases on each layer from initial input features of known miRNA-disease associations, intra-miRNA similarities and intra-disease similarities. We then calculate kernel matrices of miRNAs and diseases based on the Gaussian interaction profile (GIP) with the learned embeddings. We further fuse the kernel matrices of each layer and the initial similarities with an attention mechanism. Dual Laplacian regularized least squares are finally applied for new miRNA-disease association predictions with the fused miRNA and disease kernels. Compared with six state-of-the-art methods under 5-fold cross-validation, our method MKGAT achieves the highest AUROC value of 0.9627 and AUPR value of 0.7372. We use MKGAT to predict related miRNAs for three cancers and discover that all the top 50 predicted results in the three diseases are confirmed by existing databases. The excellent performance indicates that MKGAT would be a useful computational tool for revealing disease-related miRNAs.
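The Gaussian interaction profile (GIP) kernel mentioned in this abstract can be illustrated with a minimal sketch. It uses the commonly cited formulation, in which the bandwidth is normalised by the mean squared norm of the profiles; the abstract does not state the bandwidth MKGAT uses, so `gamma_prime`, the function name `gip_kernel` and the toy `associations` matrix are assumptions made for illustration only.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of `profiles`.

    Each row is one entity's profile (for example, a miRNA's row in the
    association matrix, or a learned embedding). gamma_prime is an
    assumed default, not a value taken from the paper.
    """
    n = len(profiles)
    # Bandwidth normalised by the mean squared norm of all profiles.
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are zero; bandwidth is undefined")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * sq_dist)
    return kernel

# Toy association matrix: 3 miRNAs (rows) x 4 diseases (columns).
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
K = gip_kernel(associations)
```

Identical profiles receive similarity 1, and similarity decays exponentially with the squared distance between profiles.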
Affiliation(s)
- Wengang Wang
  - School of Software, East China Jiaotong University, Nanchang 330013, China
- Hailin Chen
  - School of Software, East China Jiaotong University, Nanchang 330013, China
|
14
|
Ni J, Li L, Wang Y, Ji C, Zheng C. MDSCMF: Matrix Decomposition and Similarity-Constrained Matrix Factorization for miRNA-Disease Association Prediction. Genes (Basel) 2022; 13:1021. [PMID: 35741782 PMCID: PMC9223216 DOI: 10.3390/genes13061021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/01/2022] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 11/16/2022] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that are related to a number of complicated biological processes, and numerous studies have demonstrated that miRNAs are closely associated with many human diseases. In this study, we present matrix decomposition and similarity-constrained matrix factorization (MDSCMF) to predict potential miRNA-disease associations. First, we utilized a matrix decomposition (MD) algorithm to remove outliers from the miRNA-disease association matrix. Then, miRNA similarity was determined by utilizing similarity kernel fusion (SKF) to integrate miRNA function similarity and Gaussian interaction profile (GIP) kernel similarity, and disease similarity was determined by utilizing SKF to integrate disease semantic similarity and GIP kernel similarity. Furthermore, we added L2 regularization terms and similarity constraint terms to non-negative matrix factorization to form a similarity-constrained matrix factorization (SCMF) algorithm, which was applied to make predictions. MDSCMF achieved AUC values of 0.9488, 0.9540, and 0.8672 based on fivefold cross-validation (5-CV), global leave-one-out cross-validation (global LOOCV), and local leave-one-out cross-validation (local LOOCV), respectively. Case studies on three common human diseases were also implemented to demonstrate the prediction ability of MDSCMF. All experimental results confirmed that MDSCMF was effective in predicting underlying associations between miRNAs and diseases.
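As a rough illustration of the factorization step, the sketch below runs non-negative matrix factorization with L2 regularization using standard multiplicative updates. It is not the MDSCMF algorithm: the similarity constraint terms and the outlier-removal step are deliberately omitted, and the names `regularized_nmf`, `rank`, `lam` and the toy matrix `A` are assumptions introduced here.

```python
import random

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def objective(A, U, V, lam):
    """Squared reconstruction error plus L2 penalty on both factors."""
    R = matmul(U, transpose(V))
    fit = sum((a - r) ** 2 for ra, rr in zip(A, R) for a, r in zip(ra, rr))
    reg = sum(x * x for M in (U, V) for row in M for x in row)
    return fit + lam * reg

def regularized_nmf(A, rank=2, lam=0.1, iters=200, seed=0, eps=1e-12):
    """Factor A (n x m) as U V^T with U, V >= 0 via multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    U = [[rng.random() for _ in range(rank)] for _ in range(n)]
    V = [[rng.random() for _ in range(rank)] for _ in range(m)]
    history = [objective(A, U, V, lam)]
    for _ in range(iters):
        # U <- U * (A V) / (U V^T V + lam U)
        num = matmul(A, V)
        den = matmul(U, matmul(transpose(V), V))
        U = [[u * p / (q + lam * u + eps) for u, p, q in zip(ru, rp, rq)]
             for ru, rp, rq in zip(U, num, den)]
        # V <- V * (A^T U) / (V U^T U + lam V)
        num = matmul(transpose(A), U)
        den = matmul(V, matmul(transpose(U), U))
        V = [[v * p / (q + lam * v + eps) for v, p, q in zip(rv, rp, rq)]
             for rv, rp, rq in zip(V, num, den)]
        history.append(objective(A, U, V, lam))
    return U, V, history

# Toy miRNA-disease association matrix (4 miRNAs x 3 diseases).
A = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
]
U, V, history = regularized_nmf(A)
scores = matmul(U, transpose(V))  # predicted association scores
```

Entries of `scores` at positions where `A` is zero can be ranked as candidate associations; a similarity-constrained variant would add further terms to both the objective and the updates.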
Affiliation(s)
- Jiancheng Ni
  - Network Information Center, Qufu Normal University, Qufu 273165, China
- Lei Li
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Yutian Wang
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Cunmei Ji
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng
  - School of Artificial Intelligence, Anhui University, Hefei 230601, China
|
15
|
Zhong T, Li Z, You ZH, Nie R, Zhao H. Predicting miRNA-disease associations based on graph random propagation network and attention network. Brief Bioinform 2022; 23:6515233. [PMID: 35079767 DOI: 10.1093/bib/bbab589] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/27/2021] [Revised: 12/07/2021] [Accepted: 12/22/2021] [Indexed: 11/13/2022] Open
Abstract
Numerous experiments have demonstrated that abnormal expression of microRNAs (miRNAs) in organisms is often accompanied by the emergence of specific diseases. Research on miRNAs can promote the prevention of, and drug research for, specific diseases. However, there are still many undiscovered links between miRNAs and diseases, which greatly limits miRNA research. Therefore, to explore unknown miRNA-disease associations, we combine a graph random propagation network based on DropFeature with an attention network to propose a novel deep learning model for predicting miRNA-disease associations (GRPAMDA). Specifically, we first construct the miRNA-disease heterogeneous graph based on miRNA-disease association information. Secondly, we adopt DropFeature to randomly delete the features of nodes in the graph and then perform propagation operations to enhance the features of miRNA and disease nodes. Thirdly, we employ the attention mechanism to fuse the features of random propagation by aggregating the enhanced neighbor features of miRNA and disease nodes. Finally, miRNA-disease association scores are generated by a fully connected layer. The average area under the curve of the GRPAMDA model based on 5-fold cross-validation is 93.46% on HMDD v2.0. Case studies of esophageal tumors, lymphomas and prostate tumors show that 48, 47 and 46 of the top 50 miRNAs associated with these diseases, respectively, are confirmed by the dbDEMC and miR2Disease databases. In short, the GRPAMDA model can be used as a valuable method to study miRNA-disease associations.
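The DropFeature and propagation steps can be sketched as follows. This is a minimal illustration under stated assumptions rather than the GRPAMDA implementation: features are dropped element-wise and rescaled, the adjacency matrix is row-normalised with self-loops, and the propagated features are averaged over orders 0 to `order`. The abstract does not specify any of these choices, and all function names and the toy graph are hypothetical.

```python
import random

def normalize_adjacency(adj):
    """Add self-loops and row-normalise so every row sums to 1 (assumed scheme)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def drop_feature(X, p, rng):
    """Zero each feature with probability p and rescale the survivors by 1/(1-p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [[(x / keep if rng.random() < keep else 0.0) for x in row] for row in X]

def propagate(A_hat, X, order):
    """Average of A_hat^k X for k = 0..order (mixed-order propagation)."""
    n, d = len(X), len(X[0])
    total = [row[:] for row in X]
    current = X
    for _ in range(order):
        current = [[sum(A_hat[i][k] * current[k][j] for k in range(n))
                    for j in range(d)] for i in range(n)]
        total = [[t + c for t, c in zip(rt, rc)] for rt, rc in zip(total, current)]
    return [[t / (order + 1) for t in row] for row in total]

# Toy graph: a 3-node path, each node with 2 features.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A_hat = normalize_adjacency(adj)
enhanced = propagate(A_hat, drop_feature(X, 0.5, random.Random(0)), order=2)
```

Repeating the drop-and-propagate step with different random seeds yields several perturbed views of each node, which a downstream attention layer could then fuse.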
Affiliation(s)
- Tangbo Zhong
  - Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, China
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Zhengwei Li
  - Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, China
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Ru Nie
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Huan Zhao
  - Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, China
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
|
16
|
GCAEMDA: Predicting miRNA-disease associations via graph convolutional autoencoder. PLoS Comput Biol 2021; 17:e1009655. [PMID: 34890410 PMCID: PMC8694430 DOI: 10.1371/journal.pcbi.1009655] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/13/2021] [Revised: 12/22/2021] [Accepted: 11/17/2021] [Indexed: 01/02/2023] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs related to a number of complicated biological processes. A growing body of studies has suggested that miRNAs are closely associated with many human diseases. It is meaningful to consider disease-related miRNAs as potential biomarkers, which could greatly contribute to understanding the mechanisms of complex diseases and benefit the prevention, detection, diagnosis and treatment of extraordinary diseases. In this study, we presented a novel model named Graph Convolutional Autoencoder for miRNA-Disease Association Prediction (GCAEMDA). In the proposed model, we utilized miRNA-miRNA similarities, disease-disease similarities and verified miRNA-disease associations to construct a heterogeneous network, which is applied to learn the embeddings of miRNAs and diseases. In addition, we separately constructed miRNA-based and disease-based sub-networks. Combining the embeddings of miRNAs and diseases, a graph convolutional autoencoder (GCAE) was utilized to calculate miRNA-disease association scores on the two sub-networks, respectively. Furthermore, we obtained final prediction scores between miRNAs and diseases by averaging the prediction scores from the two types of sub-networks. To assess the accuracy of GCAEMDA, we applied different cross-validation methods to evaluate our model, whose performance was better than that of state-of-the-art models. Case studies on common human diseases were also implemented to demonstrate the effectiveness of GCAEMDA. The results demonstrated that GCAEMDA was beneficial for inferring potential miRNA-disease associations. Numerous studies have demonstrated that miRNAs are closely related to several common human diseases, so identifying unverified associations between miRNAs and diseases is conducive to the diagnosis and treatment of complex diseases. 
Numerous models proposed to infer potential miRNA-disease associations have made prediction more effective and productive. We constructed the GCAEMDA model to obtain more accurate prediction results by integrating a graph convolutional network and an autoencoder to make predictions based on multi-source miRNA and disease information. Five-fold cross-validation and global leave-one-out cross-validation were implemented to evaluate the performance of our model. GCAEMDA reached AUCs of 0.9415 and 0.9505, respectively, which were distinctly higher than the AUCs of the other comparative models. Furthermore, we carried out case studies on lung neoplasms and breast neoplasms to demonstrate the practical application of the model; 47 and 47 of the top-50 candidate miRNAs, respectively, were confirmed by experimental reports. In summary, GCAEMDA could be considered an effective and accurate model for revealing relationships between miRNAs and diseases.
|