1
Sun G, Zheng M, Fan Y, Pan X. MVGNCDA: Identifying Potential circRNA-Disease Associations Based on Multi-view Graph Convolutional Networks and Network Embeddings. Interdiscip Sci 2025; 17:449-462. [PMID: 39962021 DOI: 10.1007/s12539-025-00690-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 01/07/2025] [Accepted: 01/09/2025] [Indexed: 05/28/2025]
Abstract
Increasing evidence indicates that circular RNAs (circRNAs) play a crucial role in the onset and progression of various diseases. However, exploring potential disease-associated circRNAs using conventional experimental techniques remains both time-intensive and costly. Recently, various computational approaches have been developed to detect circRNA-disease associations. Nevertheless, due to the sparsity of the data and the inefficient utilization of similarity representations, it is still a challenge to effectively detect unknown circRNA-disease associations using multisource data. In this work, we propose an innovative computational framework, MVGNCDA, which merges a multi-view graph convolutional network (GCN) and biased random walk-based network embeddings to evaluate potential circRNA-disease associations from multisource data. First, we calculate disease semantic similarity, circRNA functional similarity, and their Gaussian interaction profile (GIP) kernel and cosine similarities. MVGNCDA utilizes multi-view GCNs to extract local node embeddings of diseases and circRNAs in the context of multisource information. Then, we construct a heterogeneous network utilizing integrated similarity and verified circRNA-disease associations, which is subsequently used to learn global node embeddings. Furthermore, the final fused local and global node embeddings are decoded to evaluate the circRNA-disease associations using a bilinear decoder. The fivefold cross-validation results demonstrate that MVGNCDA outperforms existing methods across five public datasets. Moreover, a case study confirms that MVGNCDA is capable of efficiently identifying unknown circRNA-disease associations.
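The GIP kernel and cosine similarities named in this abstract can be illustrated with a minimal sketch. It assumes the widely used GIP definition, K(i, j) = exp(-gamma * ||A_i - A_j||^2) with gamma equal to gamma_prime divided by the mean squared norm of the interaction profiles. The function names, the `gamma_prime` default, and the toy association matrix are hypothetical and not taken from MVGNCDA.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel between the rows of a 0/1 association matrix (assumed definition)."""
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalised by the mean profile norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


def cosine_similarity(profiles):
    """Cosine similarity between rows; rows with zero norm get similarity 0."""
    n = len(profiles)
    norms = [math.sqrt(sum(v * v for v in row)) for row in profiles]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(profiles[i], profiles[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim


# Hypothetical toy data: 3 circRNAs (rows) x 4 diseases (columns).
associations = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]
circ_gip = gip_kernel(associations)
circ_cos = cosine_similarity(associations)
```

Disease-side similarities would follow from the same functions applied to the transposed matrix. How MVGNCDA integrates the individual similarity views is not stated in the abstract, so no fusion step is shown.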
Affiliation(s)
- Guicong Sun
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China
- Mengxin Zheng
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China
- Yongxian Fan
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China
- Xiaoyong Pan
- Department of Automation, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China
2
Lan W, Peng C, Zhang H, Li C, Chen Q, Xiao X, Wang Z. Predicting CircRNA-Disease Associations Based on Heterogeneous Graph Neural Network and Knowledge Graph Attribute Mining Attention. Interdiscip Sci 2025:10.1007/s12539-025-00706-6. [PMID: 40358837 DOI: 10.1007/s12539-025-00706-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 03/23/2025] [Accepted: 03/25/2025] [Indexed: 05/15/2025]
Abstract
The exploration of associations between circular RNAs (circRNAs) and diseases contributes to a deeper understanding of the pathogenesis of diseases. Many computational methods have been proposed for identifying circRNA-disease associations. However, these methods still exhibit some limitations, such as ignoring the effect of noise. In this paper, we propose a new knowledge graph attribute mining attention network (KAATCDA) to predict circRNA-disease associations based on a knowledge graph attribute network (KGA) and an attribute mining attention network (AMA). First, KGA is used to learn the feature representation of diseases. Then, AMA is used to obtain circRNA features that are similar to the disease feature representations. Finally, the scores of circRNA-disease associations are predicted based on the circRNA and disease feature representations. Five-fold cross-validation experiments on two datasets demonstrate that KAATCDA outperforms other state-of-the-art methods. In addition, the case study shows that our method can effectively predict unknown circRNA-disease associations.
Affiliation(s)
- Wei Lan
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China
- Cong Peng
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China
- Hongyu Zhang
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China
- Chunling Li
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China
- Qingfeng Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China
- Xin Xiao
- Visual Science and Optometry Center of Guangxi, The People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, 530021, China
- Zhiqiang Wang
- Guangxi Key Laboratory of Eye Health, The People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, 530021, China
3
Guan Z, Jin X, Zhang X. MFF-nDA: A Computational Model for ncRNA-Disease Association Prediction Based on Multimodule Fusion. J Chem Inf Model 2025; 65:3324-3342. [PMID: 40129032 DOI: 10.1021/acs.jcim.5c00174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2025]
Abstract
Noncoding RNAs (ncRNAs), including piwi-interacting RNA (piRNA), long noncoding RNA (lncRNA), microRNA (miRNA), small nucleolar RNA (snoRNA), and circular RNA (circRNA), contribute significantly to gene expression regulation and serve as key factors in disease association studies and health-related exploration. Accurate prediction of ncRNA-disease associations is crucial for elucidating disease mechanisms and advancing therapeutic development. Recently, many computational models based on graph neural networks (GNNs) have emerged for identifying associations between various ncRNAs and diseases. However, existing computational models have not fully utilized integrative information on ncRNAs and diseases, and reliance on GNN-based models alone may limit performance because of oversmoothing. Moreover, existing models mainly target a specific type of ncRNA and may not be applicable to most ncRNAs. To overcome these limitations, we propose a computational model, MFF-nDA, based on multimodule fusion. Specifically, we first introduce five types of similarity network information, including three types of ncRNA and two types of disease similarity information, in order to fully explore and optimize the multisource feature information on these entities. Subsequently, we establish three modules: a heterogeneous network representation module based on Transformer, an association network representation module based on a graph convolutional network (GCN), and a topological structure representation module based on a graph attention network (GAT), which capture diverse features of nodes in heterogeneous networks and topological structure information reflected in association networks. The complementary effects of the three modules also help relieve the oversmoothing issue to some extent. By leveraging multimodule fusion learning to comprehensively capture the diverse features of these entities, our model outperforms the available state-of-the-art methods, achieving an AUC greater than 0.9000 for each dataset. This demonstrates the highest predictive performance, making it a valuable tool for identifying potential ncRNAs associated with diseases. The code of MFF-nDA can be accessed at https://github.com/Jack-Cxy/MFF-nDA.
Affiliation(s)
- Zhihao Guan
- College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- Xiu Jin
- College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- Xiaodan Zhang
- College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
4
Liang SZ, Wang L, You ZH, Yu CQ, Wei MM, Wei Y, Shi TL, Jiang C. Predicting circRNA-Disease Associations through Multisource Domain-Aware Embeddings and Feature Projection Networks. J Chem Inf Model 2025; 65:1666-1676. [PMID: 39829001 DOI: 10.1021/acs.jcim.4c02250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Recent studies have highlighted the significant role of circular RNAs (circRNAs) in various diseases. Accurately predicting circRNA-disease associations is crucial for understanding their biological functions and disease mechanisms. This work introduces the MNDCDA method, designed to address the challenges posed by the limited number of known circRNA-disease associations and the high cost of biological experiments. MNDCDA integrates multiple biological data sources with neighborhood-aware embedding models and deep feature projection networks to predict potential pathways linking circRNAs to diseases. Initially, comprehensive biometric data are used to construct four similarity networks, forming a diverse circRNA-disease interaction framework. Next, a neighborhood-aware embedding model captures structural information about circRNAs and diseases, while deep feature projection networks learn high-order feature interactions and nonlinear connections. Finally, a bilinear decoder identifies novel associations between circRNAs and diseases. The MNDCDA model achieved an AUC of 0.9070 on a constructed benchmark dataset. In case studies, 25 out of 30 predicted circRNA-disease pairs were validated through wet lab experiments and published literature. These extensive experimental results demonstrate that MNDCDA is a robust computational tool for predicting circRNA-disease associations, providing valuable insights while helping to reduce research costs.
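As a rough illustration of the bilinear decoder mentioned in this abstract, the sketch below scores a circRNA-disease pair as sigmoid(c^T W d). The sigmoid output, the embedding values, and the weight matrix are illustrative assumptions. In a trained model W would be learned, and the abstract does not specify MNDCDA's exact decoder.

```python
import math


def bilinear_score(circ_emb, weight, disease_emb):
    """Return sigmoid(c^T W d) for one circRNA embedding c and one disease embedding d."""
    # W d: project the disease embedding into the circRNA embedding space.
    projected = [sum(w * d for w, d in zip(row, disease_emb)) for row in weight]
    # c^T (W d): a scalar logit for the pair.
    logit = sum(c * p for c, p in zip(circ_emb, projected))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 2-dimensional embeddings and weight matrix.
circ_emb = [1.0, 0.0]
disease_emb = [0.0, 2.0]
weight = [[0.0, 1.0],
          [0.0, 0.0]]
score = bilinear_score(circ_emb, weight, disease_emb)  # logit is 2.0
```

Compared with a plain inner product, the weight matrix lets the two embeddings live in different dimensions and interact across coordinates.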
Affiliation(s)
- Si-Zhe Liang
- School of Information Engineering, Xijing University, Xi'an 710123, China
- Lei Wang
- Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Guangxi Academy of Sciences, Nanning 530007, China
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
- Chang-Qing Yu
- School of Information Engineering, Xijing University, Xi'an 710123, China
- Meng-Meng Wei
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
- Yu Wei
- School of Information Engineering, Xijing University, Xi'an 710123, China
- Tai-Long Shi
- School of Information Engineering, Xijing University, Xi'an 710123, China
- Chen Jiang
- School of Information Engineering, Xijing University, Xi'an 710123, China
5
Li H, Qian Y, Sun Z, Zhu H. Prediction of circRNA-Disease Associations via Graph Isomorphism Transformer and Dual-Stream Neural Predictor. Biomolecules 2025; 15:234. [PMID: 40001537 PMCID: PMC11853643 DOI: 10.3390/biom15020234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2024] [Revised: 01/31/2025] [Accepted: 02/05/2025] [Indexed: 02/27/2025] [Open Access]
Abstract
Circular RNAs (circRNAs) have attracted increasing attention for their roles in human diseases, making the prediction of circRNA-disease associations (CDAs) a critical research area for advancing disease diagnosis and treatment. However, traditional experimental methods for exploring CDAs are time-consuming and resource-intensive, while existing computational models often struggle with the sparsity of CDA data and fail to uncover potential associations effectively. To address these challenges, we propose a novel CDA prediction method named the Graph Isomorphism Transformer with Dual-Stream Neural Predictor (GIT-DSP), which leverages knowledge graph technology to address data sparsity and predict CDAs more effectively. Specifically, the model incorporates multiple associations between circRNAs, diseases, and other non-coding RNAs (e.g., lncRNAs, and miRNAs) to construct a multi-source heterogeneous knowledge graph, thereby expanding the scope of CDA exploration. Subsequently, a Graph Isomorphism Transformer model is proposed to fully exploit both local and global association information within the knowledge graph, enabling deeper insights into potential CDAs. Furthermore, a Dual-Stream Neural Predictor is introduced to accurately predict complex circRNA-disease associations in the knowledge graph by integrating dual-stream predictive features. Experimental results demonstrate that GIT-DSP outperforms existing state-of-the-art models, offering valuable insights for precision medicine and disease-related research.
Affiliation(s)
- Haodong Zhu
- School of Computer Science and Technology, Zhengzhou University of Light Industry, Zhengzhou 450000, China; (H.L.); (Y.Q.); (Z.S.)
6
Shang J, Zhao L, He X, Meng X, Zhang L, Ge D, Li F, Liu JX. SGFCCDA: Scale Graph Convolutional Networks and Feature Convolution for circRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:7006-7014. [PMID: 39250355 DOI: 10.1109/jbhi.2024.3456478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Circular RNAs (circRNAs) have emerged as a novel class of non-coding RNAs with regulatory roles in disease pathogenesis. Computational models aimed at predicting circRNA-disease associations offer valuable insights into disease mechanisms, thereby enabling the development of innovative diagnostic and therapeutic approaches while reducing the reliance on costly wet experiments. In this study, SGFCCDA is proposed for predicting potential circRNA-disease associations based on scale graph convolutional networks and feature convolution. Specifically, SGFCCDA integrates multiple measures of circRNA and disease similarity and combines known association information to construct a heterogeneous network. This network is then explored by scale graph convolutional networks to capture both topological and attribute information. Additionally, convolutional neural networks are employed to further learn the features and obtain higher-order feature representations containing richer information about nodes. The Hadamard product is utilized to effectively combine circRNA features with disease features, and a multilayer perceptron is applied to predict the association between each pair of circRNA and disease. Five-fold cross validation experiments conducted on the CircR2Disease dataset demonstrate the accurate prediction capabilities of SGFCCDA in identifying potential circRNA-disease associations. Furthermore, case studies provide further confirmation of SGFCCDA's ability to identify disease-associated circRNAs.
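The last prediction step described above, a Hadamard product of circRNA and disease features followed by a multilayer perceptron, can be sketched as follows. This is only an illustration under assumed settings: the single hidden layer, the ReLU and sigmoid activations, and all weights in `params` are hypothetical, and none of them come from SGFCCDA.

```python
import math


def hadamard(u, v):
    """Element-wise (Hadamard) product of two equal-length feature vectors."""
    return [a * b for a, b in zip(u, v)]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_pair(circ_feat, disease_feat, params):
    """Combine the pair with a Hadamard product, then score it with a small MLP."""
    pair = hadamard(circ_feat, disease_feat)
    hidden = relu(dense(pair, params["w1"], params["b1"]))
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return sigmoid(logit)


# Hypothetical fixed weights; a real model would learn them from labelled pairs.
params = {
    "w1": [[1.0, -1.0, 0.5], [0.0, 2.0, 1.0]],
    "b1": [0.0, -1.0],
    "w2": [[1.0, 1.0]],
    "b2": [0.0],
}
circ_feat = [1.0, 2.0, 0.0]
disease_feat = [2.0, 0.5, 3.0]
probability = predict_pair(circ_feat, disease_feat, params)
```

The Hadamard product requires both feature vectors to have the same length, and it keeps the pair representation at that length rather than doubling it as concatenation would.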
7
Cen K, Xing Z, Wang X, Wang Y, Li J. circ2DGNN: circRNA-Disease Association Prediction via Transformer-Based Graph Neural Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:2556-2567. [PMID: 39475749 DOI: 10.1109/tcbb.2024.3488281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
Abstract
Investigating the associations between circRNA and diseases is vital for comprehending the underlying mechanisms of diseases and formulating effective therapies. Computational prediction methods often rely solely on known circRNA-disease data, indirectly incorporating other biomolecules' effects by computing circRNA and disease similarities based on these molecules. However, this approach is limited, as other biomolecules also play significant roles in circRNA-disease interactions. To address this, we construct a comprehensive heterogeneous network incorporating data on human circRNAs, diseases, and other biomolecule interactions to develop a novel computational model, circ2DGNN, which is built upon a heterogeneous graph neural network. circ2DGNN directly takes heterogeneous networks as inputs and obtains the embedded representation of each node for downstream link prediction through graph representation learning. circ2DGNN employs a Transformer-like architecture, which can compute heterogeneous attention score for each edge, and perform message propagation and aggregation, using a residual connection to enhance the representation vector. It uniquely applies the same parameter matrix only to identical meta-relationships, reflecting diverse parameter spaces for different relationship types. After fine-tuning hyperparameters via five-fold cross-validation, evaluation conducted on a test dataset shows circ2DGNN outperforms existing state-of-the-art(SOTA) methods.
8
Wang S, Liu JX, Li F, Wang J, Gao YL. M3HOGAT: A Multi-View Multi-Modal Multi-Scale High-Order Graph Attention Network for Microbe-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:6259-6267. [PMID: 39012741 DOI: 10.1109/jbhi.2024.3429128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
Numerous scientific studies have found a link between diverse microorganisms in the human body and complex human diseases. Because traditional experimental approaches are time-consuming and expensive, using computational methods to identify microbes correlated with diseases is critical. In this paper, a new microbe-disease association prediction model, called M3HOGAT, is proposed that combines a multi-view multi-modal network and a multi-scale feature fusion mechanism. First, a microbe-disease association network and multiple similarity views are constructed based on multi-source information. Then, considering that neighbor information from different orders may help in learning node representations, a higher-order graph attention network (HOGAT) is devised to aggregate neighbor information from different orders and extract microbe and disease features from different networks and views. Given that the embedding features of microbes and diseases from different views vary in importance, a multi-scale feature fusion mechanism is employed to learn their interaction information, thereby generating the final features of microbes and diseases. Finally, an inner product decoder is used to reconstruct the microbe-disease association matrix. Compared with five state-of-the-art methods on the HMDAD and Disbiome datasets, the results of 5-fold cross-validation show that M3HOGAT achieves the best performance. Furthermore, case studies on asthma and obesity confirm the effectiveness of M3HOGAT in identifying potential disease-related microbes.
9
Lu P, Wang Y. RDGAN: Prediction of circRNA-Disease Associations via Resistance Distance and Graph Attention Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1445-1457. [PMID: 38787672 DOI: 10.1109/tcbb.2024.3402248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
As a class of single-stranded RNAs, circRNAs have been implicated in numerous diseases and can serve as valuable biomarkers for disease therapy and prevention. However, traditional biological experiments demand significant time and effort. Therefore, various computational methods have been proposed to address this limitation, but how to extract features more comprehensively remains a challenge that needs further attention. In this study, we propose a unique approach to predict circRNA-disease associations based on resistance distance and graph attention network (RDGAN). First, the associations of circRNAs and diseases are obtained by fusing multiple databases, and resistance distance is used as a similarity matrix to address the sparsity of the similarity matrices. Then the circRNA-disease heterogeneous network is constructed based on the circRNA-circRNA similarity, the disease-disease similarity, and the known circRNA-disease adjacency matrix. Second, leveraging three neural network modules (ResGatedGraphConv, GAT, and MFConv), we gather node feature embeddings from the heterogeneous network. Subsequently, all the features are supplied to a self-attention mechanism to predict new potential associations. Finally, our model obtains an AUC value of 0.9630 through five-fold cross-validation, surpassing the predictive performance of eight other state-of-the-art models.
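For readers unfamiliar with resistance distance, here is a minimal sketch of how it can be computed on a small connected graph. It uses the standard identity R_ij = M_ii + M_jj - 2 M_ij with M = (L + J/n)^(-1), where L is the graph Laplacian and J the all-ones matrix. The helper names and toy graphs are hypothetical, and how RDGAN turns these distances into similarities is not described in the abstract.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (adequate for small dense matrices)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (is the graph connected?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resistance_distance(adjacency):
    """All-pairs resistance distance for a connected, undirected weighted graph."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    # L + J/n is invertible for a connected graph; its inverse is L^+ + J/n.
    shifted = [[(degree[i] if i == j else 0.0) - adjacency[i][j] + 1.0 / n
                for j in range(n)] for i in range(n)]
    m = invert(shifted)
    return [[m[i][i] + m[j][j] - 2.0 * m[i][j] for j in range(n)] for i in range(n)]


# Toy path graph 0 - 1 - 2 with unit edge weights.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
dist = resistance_distance(path)
```

Unlike shortest-path length, resistance distance shrinks when several parallel routes connect two nodes, which is one reason it can be informative on sparse similarity graphs.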
10
Lan W, Li C, Chen Q, Yu N, Pan Y, Zheng Y, Chen YPP. LGCDA: Predicting CircRNA-Disease Association Based on Fusion of Local and Global Features. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1413-1422. [PMID: 38607720 DOI: 10.1109/tcbb.2024.3387913] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 04/14/2024]
Abstract
CircRNAs have been shown to be involved in the occurrence of many diseases. Several computational frameworks have been proposed to identify circRNA-disease associations. Although existing computational methods have achieved considerable success, they still need improvement, as their performance may degrade due to the sparsity of the data and the problem of memory overflow. We develop a novel computational framework called LGCDA to predict circRNA-disease associations by fusing local and global features to solve the above-mentioned problems. First, we construct closed local subgraphs by using k-hop closed subgraphs and label the subgraphs to obtain rich graph pattern information. Then, the local features are extracted by using a graph neural network (GNN). In addition, we fuse the Gaussian interaction profile (GIP) kernel and cosine similarity to obtain global features. Finally, the score of circRNA-disease associations is predicted by using a multilayer perceptron (MLP) based on local and global features. We perform five-fold cross-validation on five datasets for model evaluation, and our model surpasses other advanced methods.
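The k-hop closed subgraph step can be illustrated with a short sketch that collects every node within k hops of either endpoint of a candidate circRNA-disease pair. The labelling shown here, a tuple of hop distances to the two target nodes, is a simple stand-in because the abstract does not give LGCDA's labelling scheme. The toy graph and function names are hypothetical.

```python
from collections import deque


def hop_distances(graph, source, max_hops):
    """Breadth-first search that stops expanding at max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def enclosing_subgraph(graph, circ, disease, k=1):
    """Nodes, edges and distance labels of the k-hop subgraph around a target pair."""
    d_circ = hop_distances(graph, circ, k)
    d_dis = hop_distances(graph, disease, k)
    nodes = set(d_circ) | set(d_dis)
    edges = {tuple(sorted((u, v))) for u in nodes for v in graph.get(u, ()) if v in nodes}
    # Hide the link being predicted so it cannot leak into the features.
    edges.discard(tuple(sorted((circ, disease))))
    # Simplification: distances are measured before the target link is hidden.
    labels = {n: (d_circ.get(n), d_dis.get(n)) for n in nodes}
    return nodes, edges, labels


# Hypothetical bipartite association graph stored as adjacency lists.
graph = {
    "c1": ["d1", "d2"],
    "c2": ["d2"],
    "c3": ["d3"],
    "d1": ["c1"],
    "d2": ["c1", "c2"],
    "d3": ["c3"],
}
nodes, edges, labels = enclosing_subgraph(graph, "c1", "d2", k=1)
```

Because each prediction only touches a small neighbourhood, this kind of extraction avoids holding the full graph in one model pass, which is one plausible way to address the memory overflow the abstract mentions.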
11
Salooja CM, Sanker A, Deepthi K, Jereesh AS. An ensemble approach for circular RNA-disease association prediction using variational autoencoder and genetic algorithm. J Bioinform Comput Biol 2024; 22:2450018. [PMID: 39215523 DOI: 10.1142/s0219720024500185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Circular RNAs (circRNAs) are endogenous non-coding RNAs with a covalently closed loop structure. They have many biological functions, mainly regulatory ones. They have been proven to modulate protein-coding genes in the human genome. CircRNAs are linked to various diseases like Alzheimer's disease, diabetes, atherosclerosis, Parkinson's disease and cancer. Identifying the associations between circular RNAs and diseases is essential for disease diagnosis, prevention, and treatment. The proposed model, based on the variational autoencoder and genetic algorithm circular RNA disease association (VAGA-CDA), predicts novel circRNA-disease associations. First, the experimentally verified circRNA-disease associations are augmented with the synthetic minority oversampling technique (SMOTE) and regenerated using a variational autoencoder, and feature selection is applied to these vectors by a genetic algorithm (GA). The variational autoencoder effectively extracts features from the augmented samples. The optimized feature selection of the genetic algorithm effectively carried out dimensionality reduction. The sophisticated feature vectors extracted are then given to a Random Forest classifier to predict new circRNA-disease associations. The proposed model yields an AUC value of 0.9644 and 0.9628 under 5-fold and 10-fold cross-validations, respectively. The results of the case studies indicate the robustness of the proposed model.
Affiliation(s)
- C M Salooja
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
- Arjun Sanker
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
- K Deepthi
- Department of Computer Science, Central University of Kerala (Central Govt. of India), Kerala-671316, India
- A S Jereesh
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
12
Guo Y, Yi M. THGNCDA: circRNA-disease association prediction based on triple heterogeneous graph network. Brief Funct Genomics 2024; 23:384-394. [PMID: 37738503 DOI: 10.1093/bfgp/elad042] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/24/2023] [Revised: 08/21/2023] [Accepted: 09/04/2023] [Indexed: 09/24/2023] [Open Access]
Abstract
Circular RNAs (circRNAs) are a class of noncoding RNA molecules featuring a closed circular structure. They have been shown to play a significant role in the reduction of many diseases. In addition, many studies on the clinical diagnosis and treatment of disease have revealed that circRNAs can be considered potential biomarkers. Therefore, understanding the associations between circRNAs and diseases can help to forecast some disorders of life activities. However, traditional biological experimental methods are time-consuming. Machine learning-based prediction of circRNA-disease associations, which relies on diverse data, can avoid this. Nevertheless, topological information of circRNAs and diseases is usually not used in these methods. Moreover, circRNAs can be associated with diseases through miRNAs. With these considerations, we propose a novel method, named THGNCDA, to predict the associations between circRNAs and diseases. Specifically, for a given circRNA-disease pair, we employ a graph neural network with attention to learn the importance of each of its neighbors. In addition, we use a multilayer convolutional neural network to explore the relationship of a circRNA-disease pair based on their attributes. When calculating embeddings, we introduce the information of miRNAs. The experimental results show that THGNCDA outperforms the state-of-the-art (SOTA) methods. In addition, our method achieves a better recall rate. To confirm the significance of attention, we conducted extensive ablation studies. Case studies on Urinary Bladder and Prostatic Neoplasms further show THGNCDA's ability to discover known relationships between circRNA candidates and diseases.
Affiliation(s)
- Yuwei Guo
- School of Mathematics and Physics, China University of Geosciences, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China
- Ming Yi
- School of Mathematics and Physics, China University of Geosciences, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China
13
Tian Y, Zou Q, Wang C, Jia C. MAMLCDA: A Meta-Learning Model for Predicting circRNA-Disease Association Based on MAML Combined With CNN. IEEE J Biomed Health Inform 2024; 28:4325-4335. [PMID: 38578862 DOI: 10.1109/jbhi.2024.3385352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2024]
Abstract
Circular RNAs (circRNAs) exist in vivo and are a class of noncoding RNA molecules. They have a single-stranded, closed, annular structure. Many studies have shown that circRNAs and diseases are linked. Therefore, it is critical to build a reliable and accurate predictor to find the circRNA-disease association. In this paper, we presented a meta-learning model named MAMLCDA to identify the circRNA-disease association, which is based on model-agnostic meta-learning (MAML) combined with CNN classification. Specifically, similarities between diseases and circRNAs are extracted and integrated to characterize their relationships, and k-means is used to cluster majority samples and select a certain number of samples from each cluster to obtain the same number of negative samples as the positive samples. To further reduce the dimension of the features and save operation time, we applied probabilistic principal component analysis (PPCA) to compact the integrated circRNA and disease similarity network feature vectors. The feature vectors are converted into images. At this time, the prediction problem is transformed into the 2-way 1-shot problem of the image and input into the model with MAML as the meta-learner and CNN as the base-learner. Comparison results of five-fold cross-validation on two benchmark datasets illustrate that MAMLCDA outperforms several state-of-the-art approaches with the best accuracies of 95.33% and 98%. Therefore, MAMLCDA can help to understand the pathogenesis of complex diseases at the circRNA level.
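The negative-sampling step described above, clustering the majority (unlabelled) samples with k-means and drawing from each cluster until the negatives match the positives in number, can be sketched as follows. The farthest-first initialisation, the round-robin draw, and the toy points are assumptions made to keep the example small and deterministic. The abstract does not say how MAMLCDA initialises k-means or how many samples it takes per cluster.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns the cluster index of every point."""
    # Farthest-first initialisation (an assumption) keeps the sketch deterministic.
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(points, key=lambda p: min(sq_dist(p, c) for c in centers))))
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def balanced_negatives(negatives, n_positive, k, seed=0):
    """Pick n_positive negative indices, drawing round-robin from the k clusters."""
    rng = random.Random(seed)
    assignment = kmeans(negatives, k)
    clusters = [[i for i, a in enumerate(assignment) if a == c] for c in range(k)]
    for members in clusters:
        rng.shuffle(members)
    chosen = []
    while len(chosen) < n_positive and any(clusters):
        for members in clusters:
            if members and len(chosen) < n_positive:
                chosen.append(members.pop())
    return sorted(chosen)


# Hypothetical feature vectors of unlabelled pairs forming two obvious groups.
negatives = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
chosen = balanced_negatives(negatives, n_positive=4, k=2)
```

Drawing from every cluster keeps the selected negatives spread across the feature space instead of concentrating them in one dense region.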
14
Wang Y, Lu P. GEHGAN: CircRNA-disease association prediction via graph embedding and heterogeneous graph attention network. Comput Biol Chem 2024; 110:108079. [PMID: 38704917 DOI: 10.1016/j.compbiolchem.2024.108079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 04/03/2024] [Accepted: 04/16/2024] [Indexed: 05/07/2024]
Abstract
Growing evidence suggests that circRNAs play a crucial role in diverse biological processes related to human diseases. In recent years, a large number of wet-lab experiments have been carried out to uncover circRNA-disease associations. Since wet-lab experiments are expensive and laborious, computational approaches have increasingly attracted the attention of researchers. However, the performance of these methods is restricted by their inability to balance the distribution among various types of nodes. To remedy this problem, we present a novel computational method called GEHGAN to predict new associations, leveraging graph embedding and heterogeneous graph attention networks. First, we calculate circRNA sequence similarity, circRNA RBP similarity, disease semantic similarity, and the corresponding GIP kernel similarity to construct a heterogeneous graph. Second, a graph embedding method using random walks with jump and stay strategies is applied to obtain the preliminary embeddings of circRNAs and diseases, greatly improving the performance of the model. Third, a multi-head graph attention network is employed to further update the embeddings, followed by an MLP as a predictor. Five-fold cross-validation indicates that GEHGAN achieves an AUC score of 0.9829 and an AUPR value of 0.9815 on the CircR2Diseasev2.0 database, and case studies on osteosarcoma, gastric neoplasms, and colorectal neoplasms further confirm the model's efficacy at identifying circRNA-disease associations.
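A random walk with jump and stay strategies can be sketched roughly as below. At each step the walker either stays within the current node type or jumps to a neighbour of a different type, and the probability of staying decays as `alpha ** run_length`, so long runs within one type become unlikely. This is a simplified illustration, not GEHGAN's implementation: `alpha`, the toy graph, and the omission of any memory over recently visited node types are assumptions.

```python
import random


def jump_stay_walk(graph, node_type, start, length, alpha=0.5, seed=0):
    """Generate one walk over a heterogeneous graph with a decaying stay probability."""
    rng = random.Random(seed)
    walk = [start]
    same_type_run = 1  # consecutive nodes of the current type, including the current one
    while len(walk) < length:
        current = walk[-1]
        same = [n for n in graph[current] if node_type[n] == node_type[current]]
        other = [n for n in graph[current] if node_type[n] != node_type[current]]
        if not same and not other:
            break  # isolated node: the walk cannot continue
        if not other:
            stay = True
        elif not same:
            stay = False
        else:
            stay = rng.random() < alpha ** same_type_run
        if stay:
            walk.append(rng.choice(same))
            same_type_run += 1
        else:
            walk.append(rng.choice(other))
            same_type_run = 1
    return walk


# Hypothetical heterogeneous graph: similarity edges within a type, associations across types.
graph = {
    "c1": ["c2", "d1"],
    "c2": ["c1", "d2"],
    "d1": ["c1", "d2"],
    "d2": ["c2", "d1"],
}
node_type = {"c1": "circRNA", "c2": "circRNA", "d1": "disease", "d2": "disease"}
walk = jump_stay_walk(graph, node_type, "c1", length=8)
```

Walks generated this way would typically be fed to a skip-gram style model to produce the preliminary node embeddings. That training step is not shown here.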
Affiliation(s)
- Yuehao Wang
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
- Pengli Lu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
|
15
|
Wei H, Gao L, Wu S, Jiang Y, Liu B. DiSMVC: a multi-view graph collaborative learning framework for measuring disease similarity. Bioinformatics 2024; 40:btae306. [PMID: 38715444 PMCID: PMC11256965 DOI: 10.1093/bioinformatics/btae306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/19/2024] [Accepted: 05/05/2024] [Indexed: 05/30/2024] Open
Abstract
MOTIVATION Exploring potential associations between diseases can help in understanding pathological mechanisms of diseases and facilitating the discovery of candidate biomarkers and drug targets, thereby promoting disease diagnosis and treatment. Some computational methods have been proposed for measuring disease similarity. However, these methods describe diseases without considering their latent multi-molecule regulation and valuable supervision signal, resulting in limited biological interpretability and efficiency to capture association patterns. RESULTS In this study, we propose a new computational method named DiSMVC. Different from existing predictors, DiSMVC designs a supervised graph collaborative framework to measure disease similarity. Multiple bio-entity associations related to genes and miRNAs are integrated via cross-view graph contrastive learning to extract informative disease representation, and then association pattern joint learning is implemented to compute disease similarity by incorporating phenotype-annotated disease associations. The experimental results show that DiSMVC can draw discriminative characteristics for disease pairs, and outperform other state-of-the-art methods. As a result, DiSMVC is a promising method for predicting disease associations with molecular interpretability. AVAILABILITY AND IMPLEMENTATION Datasets and source codes are available at https://github.com/Biohang/DiSMVC.
Affiliation(s)
- Hang Wei
  - School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
- Shuai Wu
  - School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
- Yina Jiang
  - Department of Basic Medicine, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China
- Bin Liu
  - Faculty of Engineering, Shenzhen MSU-BIT University, Shenzhen, Guangdong 518172, China
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
|
16
|
Yang J, Lei X, Zhang F. Identification of circRNA-disease associations via multi-model fusion and ensemble learning. J Cell Mol Med 2024; 28:e18180. [PMID: 38506066 PMCID: PMC10951890 DOI: 10.1111/jcmm.18180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 01/21/2024] [Accepted: 02/05/2024] [Indexed: 03/21/2024] Open
Abstract
Circular RNA (circRNA) is a common non-coding RNA that plays an important role in the diagnosis and therapy of human diseases. Predicting circRNA-disease associations with computational methods can provide a new route to better clinical diagnosis. In this article, we propose a novel method for circRNA-disease association prediction based on ensemble learning, named ELCDA. First, a heterogeneous association network is constructed by collecting multiple sources of information on circRNAs and diseases, and multiple similarity measures are adopted. Then, metapath, matrix factorization and GraphSAGE-based models are used to extract node features from different views, and the final comprehensive features of circRNAs and diseases are obtained via ensemble learning. Finally, a soft voting ensemble strategy is used to integrate the predictions of all classifiers. The performance of ELCDA is evaluated by fivefold cross-validation and compared with other state-of-the-art methods; the experimental results show that ELCDA outperforms them. Furthermore, three common diseases are used as case studies, which also demonstrate that ELCDA is an effective method for predicting circRNA-disease associations.
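Illustrative note: soft voting, as named in this abstract, generally means averaging the class probabilities produced by several classifiers instead of counting their hard votes. The sketch below shows that generic idea only. The function `soft_vote`, the optional weights and the toy probabilities are assumptions, not details of ELCDA.

```python
import numpy as np


def soft_vote(prob_by_model, weights=None):
    """Weighted average of per-model positive-class probabilities.

    prob_by_model has shape (n_models, n_pairs); each row holds one
    classifier's predicted probability for every circRNA-disease pair.
    """
    probs = np.asarray(prob_by_model, dtype=float)
    if weights is None:
        weights = np.ones(probs.shape[0])  # equal weights by default
    weights = np.asarray(weights, dtype=float)
    return weights @ probs / weights.sum()


# Toy probabilities (made up) from three hypothetical classifiers.
model_probs = [[0.9, 0.2, 0.6],
               [0.8, 0.4, 0.3],
               [0.7, 0.3, 0.7]]
fused = soft_vote(model_probs)
labels = (fused >= 0.5).astype(int)  # threshold the fused score
```

Compared with hard voting, averaging keeps each classifier's confidence, so one confident model can outweigh two uncertain ones.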
Affiliation(s)
- Jing Yang
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, China
- Fa Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, China
|
17
|
Turgut H, Turanli B, Boz B. DCDA: CircRNA-Disease Association Prediction with Feed-Forward Neural Network and Deep Autoencoder. Interdiscip Sci 2024; 16:91-103. [PMID: 37978116 DOI: 10.1007/s12539-023-00590-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 10/13/2023] [Accepted: 10/15/2023] [Indexed: 11/19/2023]
Abstract
Circular RNA is a single-stranded RNA with a closed-loop structure. In recent years, research has revealed that circular RNAs play critical roles in biological processes and are related to human diseases. The discovery of potential circRNAs as disease biomarkers and drug targets is crucial, since it can help diagnose diseases in their early stages and support treatment. However, detecting associations between circular RNAs and diseases with conventional experimental methods is time-consuming and costly. To overcome this problem, various computational methodologies have been proposed to extract essential features for both circular RNAs and diseases and predict the associations. Studies have shown that computational methods achieve good predictive performance and make it possible to detect circular RNAs that are likely to be highly related to diseases. This study proposes a deep learning-based circRNA-disease association predictor called DCDA, which uses multiple data sources to create circRNA and disease features, reveals hidden feature codings of a circular RNA-disease pair with a deep autoencoder, and then predicts the relation score of the pair with a deep neural network. Fivefold cross-validation results on the benchmark dataset showed that our model outperforms state-of-the-art prediction methods in the literature, with an AUC score of 0.9794.
Affiliation(s)
- Hacer Turgut
  - Computer Engineering Department, Marmara University, 34854, Istanbul, Türkiye.
- Beste Turanli
  - Bioengineering Department, Marmara University, 34854, Istanbul, Türkiye
- Betül Boz
  - Computer Engineering Department, Marmara University, 34854, Istanbul, Türkiye.
|
18
|
Wang L, Li ZW, You ZH, Huang DS, Wong L. GSLCDA: An Unsupervised Deep Graph Structure Learning Method for Predicting CircRNA-Disease Association. IEEE J Biomed Health Inform 2024; 28:1742-1751. [PMID: 38127594 DOI: 10.1109/jbhi.2023.3344714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
A growing number of studies reveal that circular RNAs (circRNAs) are broadly engaged in the physiological processes of cell proliferation, differentiation, aging and apoptosis, and are closely associated with the pathogenesis of numerous diseases. Clarifying the associations between diseases and circRNAs is of great clinical importance for providing new therapeutic strategies for complex diseases. However, previous circRNA-disease association (CDA) prediction methods rely excessively on the graph network, and model performance is dramatically reduced when noisy connections occur in the graph structure. To address this problem, this paper proposes an unsupervised deep graph structure learning method, GSLCDA, to predict potential CDAs. Concretely, we first integrate circRNA and disease multi-source data to constitute the CDA heterogeneous network. Then the network topology is learned using the graph structure, and the original graph is enhanced in an unsupervised manner by maximizing the inter information of the learned and original graphs to uncover their essential features. Finally, a graph space sensitive k-nearest neighbor (KNN) algorithm is employed to search for latent CDAs. On the benchmark dataset, GSLCDA obtained 92.67% accuracy with 0.9279 AUC. GSLCDA also exhibits exceptional performance on independent datasets. Furthermore, 14, 12 and 14 of the top 16 circRNAs with the highest GSLCDA prediction scores were confirmed in the relevant literature in the breast cancer, colorectal cancer and lung cancer case studies, respectively. These results demonstrate that GSLCDA can effectively reveal underlying CDAs and offer new perspectives for the diagnosis and therapy of complex human diseases.
|
19
|
Niu M, Wang C, Chen Y, Zou Q, Qi R, Xu L. CircRNA identification and feature interpretability analysis. BMC Biol 2024; 22:44. [PMID: 38408987 PMCID: PMC10898045 DOI: 10.1186/s12915-023-01804-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 12/18/2023] [Indexed: 02/28/2024] Open
Abstract
BACKGROUND Circular RNAs (circRNAs) can regulate microRNA activity and are related to various diseases, such as cancer. Functional research on circRNAs is the focus of scientific research. Accurate identification of circRNAs is important for gaining insight into their functions. Although several circRNA prediction models have been developed, their prediction accuracy is still unsatisfactory. Therefore, providing a more accurate computational framework to predict circRNAs and analyse their looping characteristics is crucial for systematic annotation. RESULTS We developed a novel framework, CircDC, for classifying circRNAs from other lncRNAs. CircDC uses four different feature encoding schemes and adopts a multilayer convolutional neural network and bidirectional long short-term memory network to learn high-order feature representation and make circRNA predictions. The results demonstrate that the proposed CircDC model is more accurate than existing models. In addition, an interpretable analysis of the features affecting the model is performed, and the computational framework is applied to the extended application of circRNA identification. CONCLUSIONS CircDC is suitable for the prediction of circRNA. The identification of circRNA helps to understand and delve into the related biological processes and functions. Feature importance analysis increases model interpretability and uncovers significant biological properties. The relevant code and data in this article can be accessed for free at https://github.com/nmt315320/CircDC.git .
Affiliation(s)
- Mengting Niu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
  - Postdoctoral Innovation Practice Base, Shenzhen Polytechnic University, Shenzhen, 518055, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, 150000, Heilongjiang, China
- Yaojia Chen
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.4 Block 2 North Jianshe Road, Chengdu, 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.4 Block 2 North Jianshe Road, Chengdu, 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Ren Qi
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China.
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China.
|
20
|
Lu P, Zhang W, Wu J. AMPCDA: Prediction of circRNA-disease associations by utilizing attention mechanisms on metapaths. Comput Biol Chem 2024; 108:107989. [PMID: 38016366 DOI: 10.1016/j.compbiolchem.2023.107989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 10/24/2023] [Accepted: 11/15/2023] [Indexed: 11/30/2023]
Abstract
Researchers have been building an expanding corpus of experimental evidence in the biomedical field that reveals prevalent associations between circRNAs and human diseases. These linkages afford a new perspective for elucidating etiology and devising innovative therapeutic strategies. In recent years, many computational methods have been introduced to enumerate possible circRNA-disease associations and to remedy the inefficiency and high cost of conventional laboratory approaches, but the majority of existing methods still face challenges in effectively integrating node embeddings with higher-order neighborhood representations, which may keep the final predictive accuracy from reaching its optimum. To overcome these constraints, we propose AMPCDA, a computational technique that harnesses predefined metapaths to predict circRNA-disease associations. Specifically, an association graph is first built from three source databases and two similarity derivation procedures, and DeepWalk is then applied to the graph to obtain initial feature representations. Vectorial embeddings of metapath instances, concatenated from initial node features, are then fed through a customized encoder. Using a self-attention module, metapath-specific contributions to each node are accumulated before being combined with the node's intrinsic features and channeled into a graph attention module, which furnishes the input representations for the multilayer perceptron that predicts the final association probability scores. By integrating graph topology features and the node embeddings themselves, AMPCDA effectively leverages information carried by multiple nodes along paths and exhibits exceptional predictive performance, achieving AUC values of 0.9623, 0.9675, and 0.9711 under 5-fold cross-validation, 10-fold cross-validation, and leave-one-out cross-validation, respectively. These results signify substantial accuracy improvements compared to other prediction models. Case study assessments confirm the high predictive accuracy of our proposed technique in identifying circRNA-disease connections, highlighting its value in guiding future biological research to reveal new disease mechanisms.
Affiliation(s)
- Pengli Lu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
- Wenqi Zhang
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
- Jinkai Wu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
|
21
|
Chen L, Xu J, Zhou Y. PDATC-NCPMKL: Predicting drug's Anatomical Therapeutic Chemical (ATC) codes based on network consistency projection and multiple kernel learning. Comput Biol Med 2024; 169:107862. [PMID: 38150886 DOI: 10.1016/j.compbiomed.2023.107862] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/18/2023] [Revised: 11/19/2023] [Accepted: 12/17/2023] [Indexed: 12/29/2023]
Abstract
The development and discovery of new drugs is time-consuming and requires substantial human and material resources. Discovering novel effects of existing drugs is therefore an important alternative that can accelerate the process of designing "new" drugs. The Anatomical Therapeutic Chemical (ATC) classification system recommended by the World Health Organization (WHO) is a basic research area in this regard. A novel ATC code of an existing drug suggests its novel effects. Some computational models have been proposed to predict drug-ATC code associations. However, their performance is not very high, and there is still room for improvement. In this study, a new recommendation system (named PDATC-NCPMKL), which incorporates network consistency projection and multiple kernel learning, was designed to identify drug-ATC code associations. For drugs and ATC codes, several kernels were constructed and fused by a multiple kernel learning method and an additional kernel integration scheme. To enhance performance, the drug-ATC code association adjacency matrix was reformulated by a variant of weighted K nearest known neighbors (WKNKN). The reformulated adjacency matrix and the drug and ATC code kernels were fed into network consistency projection to generate the association score matrix. The proposed recommendation system was tested on ATC codes at the second, third and fourth levels of the drug ATC classification system using ten-fold cross-validation. The results indicated that all AUROC and AUPR values were close to or exceeded 0.96. Such performance was higher than that of some existing computational models. Additional tests were conducted to demonstrate the utility of adjacency matrix reformulation and to analyze the importance of the drug and ATC code kernels.
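Illustrative note: the abstract says the adjacency matrix was reformulated by "a variant of" WKNKN and does not describe the variant. The sketch below therefore follows the commonly described baseline WKNKN idea instead: each row and column borrows association evidence from its K most similar "known" neighbors, with geometrically decaying weights, and the result never falls below the original entries. The names `wknkn` and `_neighbor_estimate`, the parameters `k` and `decay`, the reading of "known" as "has at least one recorded association", and the toy matrices are all assumptions.

```python
import numpy as np


def _neighbor_estimate(Y, sim, k, decay):
    """Estimate each row of Y from its k most similar known rows."""
    est = np.zeros_like(Y)
    known = np.flatnonzero(Y.sum(axis=1) > 0)  # rows with >= 1 association
    for i in range(Y.shape[0]):
        cand = known[known != i]               # never use the row itself
        if cand.size == 0:
            continue
        nearest = cand[np.argsort(-sim[i, cand])[:k]]
        sims = sim[i, nearest]
        total = sims.sum()
        if total <= 0:
            continue
        # Closer neighbors get larger weights: decay^0, decay^1, ...
        weights = decay ** np.arange(nearest.size) * sims
        est[i] = weights @ Y[nearest] / total
    return est


def wknkn(assoc, row_sim, col_sim, k=3, decay=0.7):
    """Baseline WKNKN-style reformulation of an association matrix."""
    Y = np.asarray(assoc, dtype=float)
    row_est = _neighbor_estimate(Y, np.asarray(row_sim, dtype=float), k, decay)
    col_est = _neighbor_estimate(Y.T, np.asarray(col_sim, dtype=float), k, decay).T
    # Keep known associations; fill zeros with the averaged estimates.
    return np.maximum(Y, (row_est + col_est) / 2.0)


# Toy data (made up): 3 drugs x 2 ATC codes; drug 1 has no known code.
Y = np.array([[1, 0], [0, 0], [0, 1]])
drug_sim = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]])
code_sim = np.array([[1.0, 0.5], [0.5, 1.0]])
filled = wknkn(Y, drug_sim, code_sim, k=1)
```

In the toy case, the drug with no known code inherits a nonzero score for the code of its most similar drug, which is the point of the preprocessing step: reducing the number of uninformative zeros before scoring.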
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
- Jing Xu
  - College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
- Yubin Zhou
  - Department of Thoracic Surgery, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 610072, China.
|
22
|
Niu M, Wang C, Zhang Z, Zou Q. A computational model of circRNA-associated diseases based on a graph neural network: prediction and case studies for follow-up experimental validation. BMC Biol 2024; 22:24. [PMID: 38281919 PMCID: PMC10823650 DOI: 10.1186/s12915-024-01826-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/20/2023] [Accepted: 01/11/2024] [Indexed: 01/30/2024] Open
Abstract
BACKGROUND Circular RNAs (circRNAs) have been confirmed to play a vital role in the occurrence and development of diseases. Exploring the relationship between circRNAs and diseases is of far-reaching significance for studying etiopathogenesis and treating diseases. To this end, building on the graph Markov neural network (GMNN) algorithm constructed in our previous work, GMNN2CD, we further considered the multisource biological data that affect the association between circRNAs and diseases, developed an updated web server, CircDA, and used human hepatocellular carcinoma (HCC) tissue data to verify the prediction results of CircDA. RESULTS CircDA is built on a Tumarkov-based deep learning framework. The algorithm regards biomolecules as nodes and the interactions between molecules as edges, reasonably abstracts multiomics data, and models them as a heterogeneous biomolecular association network, which can reflect the complex relationships between different biomolecules. Case studies using literature data from HCC, cervical, and gastric cancers demonstrate that the CircDA predictor can identify missing associations between known circRNAs and diseases. In a quantitative real-time PCR (RT-qPCR) experiment on human HCC tissue samples, five circRNAs were found to be significantly differentially expressed, showing that CircDA can predict diseases related to new circRNAs. CONCLUSIONS This efficient computational prediction and case analysis with sufficient feedback allows us to identify circRNA-associated diseases and disease-associated circRNAs. Our work provides a method to predict circRNA-associated diseases and can provide guidance for the association of diseases with certain circRNAs. For ease of use, an online prediction server (http://server.malab.cn/CircDA) is provided, and the code is open-sourced (https://github.com/nmt315320/CircDA.git) for the convenience of algorithm improvement.
Affiliation(s)
- Mengting Niu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, 150000, Heilongjiang, China
- Zhanguo Zhang
  - Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 1095 Jiefang Avenue, Wuhan, 430030, China.
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 4 Block 2 North Jianshe Road, Chengdu, 610054, China.
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.
|
23
|
Chen L, Zhao X. PCDA-HNMP: Predicting circRNA-disease association using heterogeneous network and meta-path. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:20553-20575. [PMID: 38124565 DOI: 10.3934/mbe.2023909] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/23/2023]
Abstract
A growing number of experimental studies have shown that circular RNAs (circRNAs) play important regulatory roles in human diseases through interactions with related microRNAs (miRNAs). CircRNAs have become new potential disease biomarkers and therapeutic targets. Predicting circRNA-disease associations (CDAs) is of great significance for exploring the pathogenesis of complex diseases, and can improve disease diagnosis and promote targeted therapy. However, determining CDAs through traditional clinical trials is usually time-consuming and expensive. Computational methods now offer an alternative way to predict CDAs. In this study, a new computational method, named PCDA-HNMP, was designed. To obtain informative features of circRNAs and diseases, a heterogeneous network was first constructed, which defined circRNAs, mRNAs, miRNAs and diseases as nodes and the associations between them as edges. Then, a deep analysis was conducted on the heterogeneous network by extracting meta-paths connecting to circRNAs (diseases), thereby mining hidden associations between various circRNAs (diseases). These associations constituted the meta-path-induced networks for circRNAs and diseases. The features of circRNAs and diseases were derived from these networks via mashup. In addition, miRNA-disease associations (mDAs) were employed to improve the model's performance. miRNA features were derived from the meta-path-induced networks on miRNAs and circRNAs, which were constructed from the meta-paths connecting miRNAs and circRNAs in the heterogeneous network. A concatenation operation was adopted to build the features of CDAs and mDAs. These representations of CDAs and mDAs were fed into XGBoost to build the model. Five-fold cross-validation yielded an area under the curve (AUC) of 0.9846, which was better than those of some existing state-of-the-art methods. The use of mDAs clearly enhanced the model's performance, and the importance analysis of the meta-path-induced networks showed that networks produced by meta-paths containing validated CDAs provided the most important contributions.
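Illustrative note: a meta-path-induced network, as described in this abstract, can be pictured with the standard observation that multiplying the adjacency matrices along a meta-path counts the path instances between its end nodes. The sketch below builds a circRNA-circRNA network from a circRNA-miRNA-circRNA meta-path. The helper name `metapath_counts` and the toy matrix are assumptions, and how PCDA-HNMP actually selects and weights its meta-paths is not stated here.

```python
import numpy as np


def metapath_counts(*adjacencies):
    """Count meta-path instances by chaining adjacency matrices.

    Each matrix links one node type to the next along the meta-path,
    so consecutive shapes must be compatible for multiplication.
    """
    result = np.asarray(adjacencies[0], dtype=int)
    for adj in adjacencies[1:]:
        result = result @ np.asarray(adj, dtype=int)
    return result


# Toy data (made up): 3 circRNAs x 3 miRNAs.
circ_mirna = np.array([[1, 1, 0],
                       [0, 1, 0],
                       [0, 0, 1]])
# circRNA-miRNA-circRNA: number of miRNAs shared by each circRNA pair.
circ_circ = metapath_counts(circ_mirna, circ_mirna.T)
np.fill_diagonal(circ_circ, 0)  # drop trivial paths back to the same node
```

A nonzero entry links two circRNAs that share at least one miRNA, which is one way such a network can surface "hidden" circRNA-circRNA associations that are not recorded directly.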
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Xiaoyu Zhao
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
|
24
|
Liang J, Li ZW, Sun ZN, Bi Y, Cheng H, Zeng T, Guo WF. Latent space search based multimodal optimization with personalized edge-network biomarker for multi-purpose early disease prediction. Brief Bioinform 2023; 24:bbad364. [PMID: 37833844 DOI: 10.1093/bib/bbad364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 09/06/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
Considering that cancer results from the comutation of several essential genes in individual patients, researchers have begun to focus on identifying personalized edge-network biomarkers (PEBs) using personalized edge-network analysis for clinical practice. However, most existing methods ignore the optimization of PEBs when multimodal biomarkers exist in multi-purpose early disease prediction (MPEDP). To solve this problem, this study proposes a novel model (MMPDENB-RBM) that combines personalized dynamic edge-network biomarker (PDENB) theory, a multimodal optimization strategy and a latent space search scheme to identify biomarkers with different configurations of PDENB modules (i.e. to effectively identify multimodal PDENBs). Application to the three largest cancer omics datasets from The Cancer Genome Atlas database (i.e. breast invasive carcinoma, lung squamous cell carcinoma and lung adenocarcinoma) showed that the MMPDENB-RBM model could predict the critical cancer state more effectively than other advanced methods. Our model also had better convergence, diversity and multimodal properties, as well as more effective optimization ability, than the other state-of-the-art methods. In particular, the multimodal PDENBs identified were simultaneously more enriched with different functional biomarkers, such as tissue-specific synthetic lethality edge-biomarkers including cancer driver genes and disease marker genes. Importantly, and in line with our aim, these multimodal biomarkers carry diverse biological and biomedical significance for drug target screening, survival risk assessment and novel biomedical insight, as expected for the multiple purposes of personalized early disease prediction. In summary, the present study characterizes the multimodal property of PDENBs, especially the therapeutic biomarkers with greater biological significance, which can help with MPEDP of individual cancer patients.
Affiliation(s)
- Jing Liang
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - State Key Laboratory of Intelligent Agricultural Power Equipment, Zhengzhou University, Luoyang 471000, China
- Zong-Wei Li
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Ze-Ning Sun
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Ying Bi
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Han Cheng
  - School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
- Tao Zeng
  - Guangzhou National Laboratory, Guangzhou 510005, China
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, 510005, Guangzhou Medical University
- Wei-Feng Guo
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - State Key Laboratory of Intelligent Agricultural Power Equipment, Zhengzhou University, Luoyang 471000, China
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
|
25
|
Wu J, Ning Z, Ding Y, Wang Y, Peng Q, Fu L. KGETCDA: an efficient representation learning framework based on knowledge graph encoder from transformer for predicting circRNA-disease associations. Brief Bioinform 2023; 24:bbad292. [PMID: 37587836 DOI: 10.1093/bib/bbad292] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/31/2023] [Revised: 07/27/2023] [Accepted: 07/27/2023] [Indexed: 08/18/2023] Open
Abstract
Recent studies have demonstrated the significant role that circRNA plays in the progression of human diseases. Identifying circRNA-disease associations (CDA) in an efficient manner can offer crucial insights into disease diagnosis. While traditional biological experiments can be time-consuming and labor-intensive, computational methods have emerged as a viable alternative in recent years. However, these methods are often limited by data sparsity and their inability to explore high-order information. In this paper, we introduce a novel method named Knowledge Graph Encoder from Transformer for predicting CDA (KGETCDA). Specifically, KGETCDA first integrates more than 10 databases to construct a large heterogeneous non-coding RNA dataset, which contains multiple relationships between circRNA, miRNA, lncRNA and disease. Then, a biological knowledge graph is created based on this dataset and Transformer-based knowledge representation learning and attentive propagation layers are applied to obtain high-quality embeddings with accurately captured high-order interaction information. Finally, multilayer perceptron is utilized to predict the matching scores of CDA based on their embeddings. Our empirical results demonstrate that KGETCDA significantly outperforms other state-of-the-art models. To enhance user experience, we have developed an interactive web-based platform named HNRBase that allows users to visualize, download data and make predictions using KGETCDA with ease. The code and datasets are publicly available at https://github.com/jinyangwu/KGETCDA.
Affiliation(s)
- Jinyang Wu
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Zhiwei Ning
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Yidong Ding
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Ying Wang
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Qinke Peng
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Laiyi Fu
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
  - Research Institute of Xi'an Jiaotong University, 311200, Zhejiang, China
  - Sichuan Digital Economy Industry Development Research Institute, 610036, Sichuan, China
|
26
|
Liang S, Liu S, Song J, Lin Q, Zhao S, Li S, Li J, Liang S, Wang J. HMCDA: a novel method based on the heterogeneous graph neural network and metapath for circRNA-disease associations prediction. BMC Bioinformatics 2023; 24:335. [PMID: 37697297 PMCID: PMC10494331 DOI: 10.1186/s12859-023-05441-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 08/08/2023] [Indexed: 09/13/2023] Open
Abstract
Circular RNA (circRNA) is a type of non-coding RNA in which both ends are covalently linked. Researchers have demonstrated that many circRNAs can act as biomarkers of diseases. However, traditional experimental methods for identifying circRNA-disease associations are labor-intensive. In this work, we propose a novel method, termed HMCDA, based on a heterogeneous graph neural network and metapaths for circRNA-disease association prediction. First, a heterogeneous graph consisting of circRNA-disease associations, circRNA-miRNA associations, miRNA-disease associations and disease-disease associations is constructed. Then, six metapaths are defined and generated according to the biomedical pathways. Afterwards, entity content transformation, intra-metapath aggregation and inter-metapath aggregation are implemented to learn the embeddings of circRNA and disease entities. Finally, the learned embeddings are used to predict novel circRNA-disease associations. The results of extensive experiments demonstrate that HMCDA outperforms four state-of-the-art models in fivefold cross-validation. In addition, our case study indicates that HMCDA is able to identify novel circRNA-disease associations.
Affiliation(s)
- Shiyang Liang
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
  - Department of Internal Medicine, The No. 944 Hospital of Joint Logistic Support Force of PLA, Xiongguan Road, Jiuquan, China
- Siwei Liu
  - Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Junliang Song
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Qiang Lin
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Shihong Zhao
  - Department of Respiratory Medicine, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Shuaixin Li
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Jiahui Li
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Shangsong Liang
  - Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Jingjie Wang
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
|
27
|
Qiao LJ, Gao Z, Ji CM, Liu ZH, Zheng CH, Wang YT. Potential circRNA-Disease Association Prediction Using DeepWalk and Nonnegative Matrix Factorization. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3154-3162. [PMID: 37018084 DOI: 10.1109/tcbb.2023.3264466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Circular RNAs (circRNAs) are a category of noncoding RNAs that exist in great numbers in eukaryotes. They have recently been discovered to be crucial in the growth of tumors. Therefore, it is important to explore the association of circRNAs with disease. This paper proposes a new method based on DeepWalk and nonnegative matrix factorization (DWNMF) to predict circRNA-disease association. Based on the known circRNA-disease association, we calculate the topological similarity of circRNA and disease via the DeepWalk-based method to learn the node features on the association network. Next, the functional similarity of the circRNAs and the semantic similarity of the diseases are fused with their respective topological similarities at different scales. Then, we use the improved weighted K-nearest neighbor (IWKNN) method to preprocess the circRNA-disease association network and correct nonnegative associations by setting different parameters K1 and K2 in the circRNA and disease matrices. Finally, the L2,1-norm, dual-graph regularization term and Frobenius norm regularization term are introduced into the nonnegative matrix factorization model to predict the circRNA-disease correlation. We perform cross-validation on circR2Disease, circRNADisease, and MNDR. The numerical results show that DWNMF is an efficient tool for forecasting potential circRNA-disease relationships, outperforming other state-of-the-art approaches in terms of predictive performance.
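The factorization step in the abstract above can be illustrated with a minimal sketch. It assumes plain nonnegative matrix factorization with only a Frobenius-norm penalty, solved by multiplicative updates; the L2,1-norm and dual-graph regularization terms of DWNMF are deliberately omitted, and the function name `nmf`, the parameter `lam` and the toy matrix are hypothetical, not taken from the paper.

```python
import numpy as np

def nmf(A, rank, n_iter=200, lam=0.01, seed=0):
    """Factorize a nonnegative association matrix A (circRNAs x diseases) as W @ H.

    Assumption: this minimizes 0.5*||A - WH||_F^2 + 0.5*lam*(||W||_F^2 + ||H||_F^2)
    with multiplicative updates. It is NOT the full DWNMF objective.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    eps = 1e-9  # avoids division by zero
    for _ in range(n_iter):
        # Each factor is rescaled by (negative gradient part) / (positive gradient part),
        # which keeps every entry nonnegative.
        W *= (A @ H.T) / (W @ H @ H.T + lam * W + eps)
        H *= (W.T @ A) / (W.T @ W @ H + lam * H + eps)
    return W, H

# Toy association matrix: 4 circRNAs x 3 diseases (made-up values).
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
W, H = nmf(A, rank=2)
scores = W @ H  # predicted association scores for every circRNA-disease pair
```

In a prediction setting, the entries of `scores` that correspond to unobserved pairs would be ranked to propose candidate associations.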
|
28
|
Wu Q, Deng Z, Zhang W, Pan X, Choi KS, Zuo Y, Shen HB, Yu DJ. MLNGCF: circRNA-disease associations prediction with multilayer attention neural graph-based collaborative filtering. Bioinformatics 2023; 39:btad499. [PMID: 37561093 PMCID: PMC10457666 DOI: 10.1093/bioinformatics/btad499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 06/17/2023] [Accepted: 08/09/2023] [Indexed: 08/11/2023] Open
Abstract
MOTIVATION: CircRNAs play a critical regulatory role in physiological processes, and the abnormal expression of circRNAs can mediate the processes of diseases. Therefore, exploring circRNA-disease associations is gradually becoming an important area of research. Due to the high cost of validating circRNA-disease associations using traditional wet-lab experiments, novel computational methods based on machine learning are gaining more and more attention in this field. However, current computational methods suffer from insufficient consideration of latent features in circRNA-disease interactions. RESULTS: In this study, a multilayer attention neural graph-based collaborative filtering method (MLNGCF) is proposed. MLNGCF first enhances multiple sources of biological information with an autoencoder to obtain the initial features of circRNAs and diseases. Then, by constructing a central network of different diseases and circRNAs, multilayer cooperative attention-based message propagation is performed on the central network to obtain the high-order features of circRNAs and diseases. A neural network-based collaborative filtering model is constructed to predict the unknown circRNA-disease associations and update the model parameters. Experiments on the benchmark datasets demonstrate that MLNGCF outperforms state-of-the-art methods, and the prediction results are supported by the literature in the case studies. AVAILABILITY AND IMPLEMENTATION: The source code and benchmark datasets of MLNGCF are available at https://github.com/ABard0/MLNGCF.
Affiliation(s)
- Qunzhuo Wu
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Zhaohong Deng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Wei Zhang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China
- Kup-Sze Choi
  - The Centre for Smart Health, The Hong Kong Polytechnic University, Hong Kong
- Yun Zuo
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
|
29
|
Ai N, Liang Y, Yuan H, Ouyang D, Xie S, Liu X. GDCL-NcDA: identifying non-coding RNA-disease associations via contrastive learning between deep graph learning and deep matrix factorization. BMC Genomics 2023; 24:424. [PMID: 37501127 PMCID: PMC10373414 DOI: 10.1186/s12864-023-09501-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 07/02/2023] [Indexed: 07/29/2023] Open
Abstract
Non-coding RNAs (ncRNAs) have drawn wide attention in recent years because they play vital roles in life activities. As a good complement to wet-lab experimental methods, computational prediction methods can greatly reduce experimental costs. However, a high rate of false negatives in the data and insufficient use of multi-source information can affect the performance of computational prediction methods. Furthermore, many computational methods lack robustness and do not generalize well across datasets. In this work, we propose an effective end-to-end computing framework, called GDCL-NcDA, that combines deep graph learning and deep matrix factorization (DMF) with contrastive learning to identify latent ncRNA-disease associations on diverse multi-source heterogeneous networks (MHNs). The diverse MHNs include different similarity networks and proven associations among ncRNAs (miRNAs, circRNAs, and lncRNAs), genes, and diseases. First, GDCL-NcDA employs a deep graph convolutional network and multiple attention mechanisms to adaptively integrate the multiple sources of the MHNs and reconstruct the ncRNA-disease association graph. Then, GDCL-NcDA utilizes DMF to predict the latent disease-associated ncRNAs based on the reconstructed graphs, reducing the impact of false negatives in the original associations. Finally, GDCL-NcDA uses contrastive learning (CL) to generate a contrastive loss on the reconstructed graphs and the predicted graphs to improve the generalization and robustness of the framework. The experimental results show that GDCL-NcDA outperforms highly related computational methods. Moreover, case studies demonstrate the effectiveness of GDCL-NcDA in identifying the associations among diverse ncRNAs and diseases.
Affiliation(s)
- Ning Ai
  - Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China
  - School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, China
- Yong Liang
  - Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China
  - Pazhou Laboratory (Huangpu), Guangzhou, 510555, Guangdong, China
- Haoliang Yuan
  - School of Automation, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Dong Ouyang
  - Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China
  - School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, China
- Shengli Xie
  - Institute of Intelligent Information Processing, Guangdong University of Technology, Guangzhou, 510000, Guangdong, China
- Xiaoying Liu
  - Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai, Guangdong, 519090, China
|
30
|
Wang F, Yang H, Wu Y, Peng L, Li X. SAELGMDA: Identifying human microbe-disease associations based on sparse autoencoder and LightGBM. Front Microbiol 2023; 14:1207209. [PMID: 37415823 PMCID: PMC10320730 DOI: 10.3389/fmicb.2023.1207209] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/17/2023] [Accepted: 05/18/2023] [Indexed: 07/08/2023] Open
Abstract
Introduction: Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious. Methods: Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, each microbe-disease pair is represented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space by a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified by a Light Gradient Boosting Machine (LightGBM). Results: The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross-validation on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA achieved the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross-validation on diseases, 0.9838 and 0.9293 under cross-validation on microbes, and 0.9857 and 0.9358 under cross-validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threaten human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate that there are potential associations between Clostridium coccoides and colorectal cancer and between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may be associated with autism. The inferred MDAs need further validation. Conclusion: We anticipate that the proposed SAELGMDA method will contribute to the identification of new MDAs.
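The Gaussian interaction profile (GIP) kernel similarity mentioned in the Methods can be sketched as follows. This assumes the commonly used definition, in which the similarity of two entities is exp(-gamma * squared distance between their association profiles) and gamma is a base bandwidth divided by the mean squared profile norm; the abstract does not state the exact normalization, so `gamma_prime` and the toy matrix are hypothetical.

```python
import numpy as np

def gip_kernel(A, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a binary association matrix A.

    Row i of A is the interaction profile of entity i (e.g. one microbe's
    associations with all diseases). Assumes at least one nonzero entry in A.
    """
    A = np.asarray(A, dtype=float)
    sq_norms = (A ** 2).sum(axis=1)
    # Bandwidth normalized by the average squared norm of the profiles.
    gamma = gamma_prime / sq_norms.mean()
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (A @ A.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-gamma * sq_dists)

# Toy microbe-disease matrix: 3 microbes x 3 diseases (made-up values).
A = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 0]])
microbe_sim = gip_kernel(A)    # similarity between microbes (rows)
disease_sim = gip_kernel(A.T)  # similarity between diseases (columns)
```

Passing the transposed matrix gives the disease-side kernel, so the same function covers both entity types.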
Affiliation(s)
- Feixiang Wang
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Huandong Yang
  - Department of Gastrointestinal Surgery, Yidu Central Hospital of Weifang, Weifang, China
- Yan Wu
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Xiaoling Li
  - The Second Department of Oncology, Beidahuang Industry Group General Hospital, Harbin, China
  - The Second Department of Oncology, Heilongjiang Second Cancer Hospital, Harbin, China
|
31
|
Shi K, Li L, Wang Z, Chen H, Chen Z, Fang S. Identifying microbe-disease association based on graph convolutional attention network: Case study of liver cirrhosis and epilepsy. Front Neurosci 2023; 16:1124315. [PMID: 36741060 PMCID: PMC9892757 DOI: 10.3389/fnins.2022.1124315] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/15/2022] [Accepted: 12/31/2022] [Indexed: 01/20/2023] Open
Abstract
The interactions between the microbiota and the human host can affect the physiological functions of organs (such as the brain, liver and gut). Accumulating investigations indicate that the imbalance of the microbial community is closely related to the occurrence and development of diseases. Thus, the identification of potential links between microbes and diseases can provide insight into the pathogenesis of diseases. In this study, we propose a deep learning framework (MDAGCAN) based on a graph convolutional attention network to identify potential microbe-disease associations. In MDAGCAN, we first construct a heterogeneous network consisting of the known microbe-disease associations and multi-similarity fusion networks of microbes and diseases. Then, node embeddings that take into account the neighbor information of the heterogeneous network are learned by applying graph convolutional layers and graph attention layers. Finally, a bilinear decoder uses the node embedding representations to reconstruct the unknown microbe-disease associations. Experiments show that our method achieves reliable performance, with average AUCs of 0.9778 and 0.9454 ± 0.0038 under leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV), respectively. Furthermore, we apply MDAGCAN to predict latent microbes for two high-risk human diseases, i.e., liver cirrhosis and epilepsy, and the results illustrate that 16 and 17 of the top 20 predicted microbes, respectively, are verified by the published literature. In conclusion, our method displays effective and reliable prediction performance and can be expected to predict unknown microbe-disease associations, facilitating disease diagnosis and prevention.
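The bilinear decoding step described in the abstract above can be sketched in a few lines. This is an illustrative form only, assuming the usual score sigmoid(z_m^T W z_d) for a microbe embedding z_m and a disease embedding z_d; the function name, the embedding size and the random inputs are hypothetical, and in a trained model `W` would be learned together with the encoder.

```python
import numpy as np

def bilinear_decode(Z_m, Z_d, W):
    """Score every microbe-disease pair from node embeddings.

    Z_m: (n_microbes, d) microbe embeddings
    Z_d: (n_diseases, d) disease embeddings
    W:   (d, d) weight matrix of the bilinear form (learned in practice)
    Returns an (n_microbes, n_diseases) matrix of scores in (0, 1).
    """
    logits = Z_m @ W @ Z_d.T
    return 1.0 / (1.0 + np.exp(-logits))  # element-wise sigmoid

# Random stand-ins for the embeddings an encoder would produce.
rng = np.random.default_rng(0)
Z_m = rng.normal(size=(4, 3))
Z_d = rng.normal(size=(5, 3))
W = rng.normal(size=(3, 3))
scores = bilinear_decode(Z_m, Z_d, W)
```

Because the decoder is a single matrix product, all candidate pairs are scored at once, which is convenient when ranking the top predictions for one disease.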
Affiliation(s)
- Kai Shi
  - College of Information Science and Engineering, Guilin University of Technology, Guilin, China
  - Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin, China
- Lin Li
  - College of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Zhengfeng Wang
  - College of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Huazhou Chen
  - College of Science, Guilin University of Technology, Guilin, China
- Zilin Chen
  - Department of Developmental and Behavioural Pediatric Department & Department of Child Primary Care, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shuanfeng Fang
  - Department of Children Health Care, Children’s Hospital Affiliated to Zhengzhou University, Zhengzhou, China
|
32
|
Lu C, Zhang L, Zeng M, Lan W, Duan G, Wang J. Inferring disease-associated circRNAs by multi-source aggregation based on heterogeneous graph neural network. Brief Bioinform 2023; 24:6960978. [PMID: 36572658 DOI: 10.1093/bib/bbac549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 11/03/2022] [Accepted: 11/11/2022] [Indexed: 12/28/2022] Open
Abstract
Emerging evidence has shown that circular RNAs (circRNAs) are implicated in pathogenic processes. They are regarded as promising biomarkers for diagnosis due to their covalently closed loop structures. As opposed to traditional experiments, computational approaches can identify circRNA-disease associations at a lower cost. Aggregating multi-source pathogenesis data helps to alleviate data sparsity and infer potential associations at the system level. The majority of computational approaches construct a homologous network using multi-source data, but they lose the heterogeneity of the data. Effective methods that use the features of multi-source data are urgently needed. In this paper, we propose a model (CDHGNN) based on edge-weighted graph attention and heterogeneous graph neural networks for potential circRNA-disease association prediction. The circRNA network, microRNA network, disease network and heterogeneous network are constructed based on multi-source data. To reflect association probabilities between nodes, an edge-weighted graph attention network model is designed for node features. To assign attention weights to different types of edges and learn contextual meta-paths, CDHGNN infers potential circRNA-disease associations based on heterogeneous neural networks. CDHGNN outperforms state-of-the-art algorithms in terms of accuracy. Edge-weighted graph attention networks and heterogeneous graph networks both improve performance significantly. Furthermore, case studies suggest that CDHGNN is capable of identifying specific molecular associations and investigating biomolecular regulatory relationships in pathogenesis. The code of CDHGNN is freely available at https://github.com/BioinformaticsCSU/CDHGNN.
Affiliation(s)
- Chengqian Lu
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, Hunan, China
- Lishen Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
- Wei Lan
  - School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, Guangxi, China
- Guihua Duan
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
|
33
|
Lan W, Dong Y, Zhang H, Li C, Chen Q, Liu J, Wang J, Chen YPP. Benchmarking of computational methods for predicting circRNA-disease associations. Brief Bioinform 2023; 24:6972300. [PMID: 36611256 DOI: 10.1093/bib/bbac613] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 06/21/2022] [Revised: 10/29/2022] [Accepted: 12/11/2022] [Indexed: 01/09/2023] Open
Abstract
Accumulating evidence demonstrates that circular RNA (circRNA) plays an important role in human diseases. Identification of circRNA-disease associations can help in the diagnosis of human diseases, but the traditional approach based on biological experiments is time-consuming. To address this limitation, a series of computational methods have been proposed in recent years. However, few works have summarized these methods or compared their performance. In this paper, we divide the existing methods into three categories: information propagation, traditional machine learning and deep learning. Then, the baseline methods in each category are introduced in detail. Further, 5 different datasets are collected, and 14 representative methods drawn from these categories are selected and compared in 5-fold cross-validation, 10-fold cross-validation and a de novo experiment. To further evaluate the effectiveness of these methods, six common cancers are selected to compare the number of correctly identified circRNA-disease associations in the top-10, top-20, top-50, top-100 and top-200 predictions. In addition, based on the results, observations about the robustness and characteristics of these methods are summarized. Finally, future directions and challenges are discussed.
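The top-k comparison described in the abstract above amounts to ranking candidate circRNAs for a disease by predicted score and counting how many confirmed associations fall in the first k positions. The following is a minimal sketch of that count; the function name `hits_at_k` and the toy scores and labels are hypothetical and are not taken from the benchmark.

```python
import numpy as np

def hits_at_k(scores, known, ks=(10, 20, 50, 100, 200)):
    """Count confirmed associations among the k highest-scoring candidates.

    scores: predicted association score per candidate circRNA
    known:  1 if the association is confirmed, 0 otherwise (same order)
    """
    scores = np.asarray(scores, dtype=float)
    known = np.asarray(known, dtype=int)
    order = np.argsort(-scores, kind="stable")  # highest score first
    ranked = known[order]
    return {k: int(ranked[:k].sum()) for k in ks}

# Toy example with five candidates for one disease (made-up values).
scores = [0.9, 0.1, 0.8, 0.4, 0.7]
known = [1, 0, 0, 1, 1]
result = hits_at_k(scores, known, ks=(1, 3, 5))
```

If k exceeds the number of candidates, the slice simply covers the whole ranking, so the count saturates at the total number of confirmed associations.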
Affiliation(s)
- Wei Lan
  - School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Yi Dong
  - School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Hongyu Zhang
  - School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Chunling Li
  - School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Qingfeng Chen
  - School of Computer, Electronic and Information and State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, Nanning, Guangxi 530004, China
- Jin Liu
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Jianxin Wang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yi-Ping Phoebe Chen
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria 3086, Australia
|
34
|
Liu H, Bing P, Zhang M, Tian G, Ma J, Li H, Bao M, He K, He J, He B, Yang J. MNNMDA: Predicting human microbe-disease association via a method to minimize matrix nuclear norm. Comput Struct Biotechnol J 2023; 21:1414-1423. [PMID: 36824227 PMCID: PMC9941872 DOI: 10.1016/j.csbj.2022.12.053] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/26/2022] [Revised: 12/29/2022] [Accepted: 12/30/2022] [Indexed: 01/03/2023] Open
Abstract
Identifying the potential associations between microbes and diseases is the first step in revealing the pathological mechanisms of microbe-associated diseases. However, traditional culture-based microbial experiments are expensive and time-consuming. Thus, it is critical to prioritize disease-associated microbes by computational methods for further experimental validation. In this study, we proposed a novel method called MNNMDA to predict microbe-disease associations (MDAs) by applying a matrix nuclear norm method to known microbe and disease data. Specifically, we first calculated Gaussian interaction profile kernel similarity and functional similarity for diseases and microbes. Then we constructed a heterogeneous information network by combining the integrated disease similarity network, the integrated microbe similarity network and the known microbe-disease bipartite network. Finally, we formulated the microbe-disease association prediction problem as a low-rank matrix completion problem, which was solved by minimizing the nuclear norm of a matrix with a few regularization terms. We tested the performance of MNNMDA on three datasets, HMDAD, Disbiome, and Combined Data, of small, medium and large size, respectively. We also compared MNNMDA with 5 state-of-the-art methods: KATZHMDA, LRLSHMDA, NTSHMDA, GATMDA, and KGNMDA. MNNMDA achieved areas under the ROC curve (AUROC) of 0.9536 and 0.9364 on HMDAD and Disbiome, respectively, better than those of the compared methods under 5-fold cross-validation for all microbe-disease associations. It also obtained relatively good performance, with an AUROC of 0.8858, on the combined data. In addition, MNNMDA was better than the other methods in area under the precision-recall curve (AUPR) under 5-fold cross-validation for all associations, and in both AUROC and AUPR under 5-fold cross-validation for diseases and 5-fold cross-validation for microbes. Finally, the case studies on colon cancer and inflammatory bowel disease (IBD) also validated the effectiveness of MNNMDA. In conclusion, MNNMDA is an effective method for predicting microbe-disease associations. Availability: The code and data for this paper are freely available on GitHub at https://github.com/Haiyan-Liu666/MNNMDA.
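Low-rank matrix completion by nuclear norm minimization, as named in the abstract above, can be illustrated with a generic iteration. The sketch below assumes a Soft-Impute style scheme built on singular value soft-thresholding, which is the proximal operator of the nuclear norm; it is not the solver used by MNNMDA, it leaves out the additional regularization terms, and the names `svt_shrink`, `soft_impute`, `tau` and the toy data are hypothetical.

```python
import numpy as np

def svt_shrink(X, tau):
    """Soft-threshold the singular values of X by tau (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft_impute(A, mask, tau=0.5, n_iter=100):
    """Estimate a low-rank matrix X that agrees with A on the observed entries.

    A:    association matrix (values outside the mask are ignored)
    mask: boolean array, True where the entry of A is treated as observed
    Assumption: targets 0.5*||mask*(A - X)||_F^2 + tau*||X||_* by alternating
    between filling unobserved entries with the current estimate and shrinking.
    """
    X = np.zeros_like(A, dtype=float)
    for _ in range(n_iter):
        filled = np.where(mask, A, X)  # keep observed values, impute the rest
        X = svt_shrink(filled, tau)
    return X

# Toy matrix with one entry hidden (made-up values).
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
mask = np.ones_like(A, dtype=bool)
mask[0, 1] = False  # pretend this association is unknown
X_hat = soft_impute(A, mask)
```

A larger `tau` shrinks more singular values to zero and therefore yields a lower-rank estimate; `X_hat[0, 1]` is the score assigned to the hidden pair.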
Affiliation(s)
- Haiyan Liu
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - College of Information Engineering, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
- Meijun Zhang
  - Geneis Beijing Co., Ltd., Beijing 100102, PR China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing 100102, PR China
- Jun Ma
  - College of Information Engineering, Changsha Medical University, Changsha 410219, PR China
- Haigang Li
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
  - School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
- Meihua Bao
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
  - School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
- Kunhui He
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
  - School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
- Jianjun He
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
  - School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
  - Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
  - School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
  - Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha 410219, PR China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
  - Geneis Beijing Co., Ltd., Beijing 100102, PR China
  - School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
  - Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China
|
35
|
Hu W, Yang X, Wang L, Zhu X. MADGAN:A microbe-disease association prediction model based on generative adversarial networks. Front Microbiol 2023; 14:1159076. [PMID: 37032881 PMCID: PMC10076708 DOI: 10.3389/fmicb.2023.1159076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 03/02/2023] [Indexed: 04/11/2023] Open
Abstract
Research has demonstrated that microorganisms are indispensable for nutrient transport, growth and development in the human body, and that disorder and imbalance of the microbiota may lead to the occurrence of diseases. Therefore, it is crucial to study the relationships between microbes and diseases. In this manuscript, we propose a novel prediction model named MADGAN to infer potential microbe-disease associations by combining biological information on microbes and diseases with generative adversarial networks. To our knowledge, this is the first attempt to use a generative adversarial network for this task. In MADGAN, we first construct different features for microbes and diseases based on multiple similarity metrics. Then, we adopt a graph convolutional network (GCN) to derive different features for microbes and diseases automatically. Finally, we train MADGAN to identify latent microbe-disease associations through the game between the generation network and the decision network. In particular, to prevent over-smoothing during model training, we introduce a cross-level weight distribution structure, based on the idea of residual networks, to increase the depth of the network. Moreover, to validate the performance of MADGAN, we conducted comprehensive experiments and case studies on the HMDAD and Disbiome databases, and the experimental results demonstrate that MADGAN not only achieves satisfactory prediction performance but also outperforms existing state-of-the-art prediction models.
Affiliation(s)
- Weixin Hu
  - College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
- Xiaoyu Yang
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, China
- Lei Wang
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, China
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Xianyou Zhu
  - College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
- Correspondence: Lei Wang; Xianyou Zhu
|
36
|
Wang L, You ZH, Huang DS, Li JQ. MGRCDA: Metagraph Recommendation Method for Predicting CircRNA-Disease Association. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:67-75. [PMID: 34236991 DOI: 10.1109/tcyb.2021.3090756] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 06/13/2023]
Abstract
Accumulating clinical evidence suggests that circRNAs can be novel therapeutic targets for various diseases and play a critical role in human health. However, because of the complex mechanisms of circRNAs, it is difficult to explore the relationships between diseases and circRNAs quickly and at scale in wet-lab experiments. In this work, we design a new computational model, MGRCDA, based on metagraph recommendation theory to predict potential circRNA-disease associations. Specifically, we first treat circRNA-disease association prediction as a recommender system problem and design a series of metagraphs according to the heterogeneous biological networks; we then extract the semantic information of diseases and the Gaussian interaction profile kernel (GIPK) similarity of circRNAs and diseases as network attributes; finally, the iterative search of the metagraph recommendation algorithm is used to calculate the score of each circRNA-disease pair. On the gold standard dataset circR2Disease, MGRCDA achieved a prediction accuracy of 92.49% with an area under the ROC curve of 0.9298, which is significantly higher than other state-of-the-art models. Furthermore, among the top 30 disease-related circRNAs recommended by the model, 25 have been verified by recently published literature. The experimental results show that MGRCDA is feasible and efficient, and that it can recommend reliable candidates for further wet-lab experiments and narrow their scope.
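The Gaussian interaction profile kernel similarity mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definition, in which each row of a binary association matrix is an interaction profile and the kernel bandwidth is normalised by the mean squared norm of the profiles. The function name `gip_kernel`, the parameter `gamma_prime` and the toy matrix are illustrative choices, not taken from the paper.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix.

    Assumed definition: sim(i, j) = exp(-gamma * ||p_i - p_j||^2), where
    gamma = gamma_prime / mean(||p_k||^2) over all profiles p_k.
    """
    n = len(profiles)
    # Bandwidth: gamma' divided by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim

# Hypothetical toy association matrix: 3 circRNAs (rows) x 4 diseases (columns).
assoc = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
circ_sim = gip_kernel(assoc)
```

Passing the transposed matrix, for example `gip_kernel([list(col) for col in zip(*assoc)])`, would give the corresponding disease-disease similarity under the same assumptions.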
|
37
|
Liu ZH, Ji CM, Ni JC, Wang YT, Qiao LJ, Zheng CH. Convolution Neural Networks Using Deep Matrix Factorization for Predicting Circrna-Disease Association. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:277-284. [PMID: 34951853 DOI: 10.1109/tcbb.2021.3138339] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
CircRNAs have a stable structure, which gives them a higher tolerance to nucleases. This property makes circular RNAs useful in disease diagnosis. However, few associations between circRNAs and diseases are known, and identifying new associations through biological experiments is time-consuming and costly. As a result, efficient and practical computational models are needed to predict potential circRNA-disease associations. In this paper, we design a novel convolutional neural network framework (DMFCNNCD) that learns features from deep matrix factorization to predict circRNA-disease associations. First, we decompose the circRNA-disease association matrix to obtain the original features of diseases and circRNAs, and use a mapping module to extract potential nonlinear features. Then, we integrate these features with the similarity information to form a training set. Finally, we apply convolutional neural networks to predict unknown associations between circRNAs and diseases. Five-fold cross-validation across various experiments shows that our method can predict circRNA-disease associations and outperforms state-of-the-art methods.
|
38
|
Li P, Tiwari P, Xu J, Qian Y, Ai C, Ding Y, Guo F. Sparse regularized joint projection model for identifying associations of non-coding RNAs and human diseases. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
39
|
Zhang W, Liu B. iSnoDi-LSGT: identifying snoRNA-disease associations based on local similarity constraints and global topological constraints. RNA (NEW YORK, N.Y.) 2022; 28:1558-1567. [PMID: 36192132 PMCID: PMC9670808 DOI: 10.1261/rna.079325.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/22/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Growing evidence shows that small nucleolar RNAs (snoRNAs) have important functions in various biological processes, and that their malfunction leads to the emergence and development of complex diseases. However, identifying snoRNA-disease associations remains challenging because biological experiments are time-consuming and expensive. Therefore, efficient and economical methods for identifying snoRNA-disease associations are urgently needed. In this regard, we propose a computational method named iSnoDi-LSGT, which utilizes snoRNA sequence similarity and disease similarity as local similarity constraints. The iSnoDi-LSGT predictor further employs network embedding technology to extract topological features of snoRNAs and diseases, based on which snoRNA topological similarity and disease topological similarity are calculated as global topological constraints. To the best of our knowledge, iSnoDi-LSGT is the first computational method for snoRNA-disease association identification. The experimental results indicate that the iSnoDi-LSGT predictor can effectively predict unknown snoRNA-disease associations. The web server of the iSnoDi-LSGT predictor is freely available at http://bliulab.net/iSnoDi-LSGT.
Affiliation(s)
- Wenxiang Zhang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
|
40
|
DRGCNCDA: Predicting circRNA-disease interactions based on knowledge graph and disentangled relational graph convolutional network. Methods 2022; 208:35-41. [DOI: 10.1016/j.ymeth.2022.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 09/15/2022] [Accepted: 10/10/2022] [Indexed: 11/06/2022]
|
41
|
Chen Y, Wang J, Wang C, Liu M, Zou Q. Deep learning models for disease-associated circRNA prediction: a review. Brief Bioinform 2022; 23:6696465. [PMID: 36130259 DOI: 10.1093/bib/bbac364] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/03/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 12/14/2022]
Abstract
Emerging evidence indicates that circular RNAs (circRNAs) can provide new insights and potential therapeutic targets for disease diagnosis and treatment. However, traditional biological experiments are expensive and time-consuming. Recently, deep learning, with its more powerful capacity for representation learning, has become a promising technology for predicting disease-associated circRNAs. In this review, we introduce the most popular databases related to circRNAs and summarize three types of deep learning-based methods for predicting circRNA-disease associations: feature-generation-based, type-discrimination and hybrid-based methods. We further evaluate seven representative models on a benchmark with ground truth for both balanced and imbalanced classification tasks. In addition, we discuss the advantages and limitations of each type of method and highlight suggested applications for future research.
Affiliation(s)
- Yaojia Chen
  - College of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, China and the Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Jiacheng Wang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Chuyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Mingxin Liu
  - College of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, China
- Quan Zou
  - University of Electronic Science and Technology of China, China
|
42
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: towards systematic evaluation of computational models. Brief Bioinform 2022; 23:6712303. [PMID: 36151749 DOI: 10.1093/bib/bbac407] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 06/18/2022] [Revised: 08/11/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022]
Abstract
Currently, there are no generally accepted strategies for evaluating computational models for microRNA-disease associations (MDAs). Though K-fold cross-validation and case studies seem to be must-have procedures, the value of K, the evaluation metrics, the choice of query diseases and the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports) are all determined case by case, depending on the researchers' choices. In the current review, we include a comprehensive analysis of how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model to facilitate fair and systematic assessment of predictive performance.
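Two of the evaluation ingredients named in this abstract, K-fold splitting and a ranking metric, can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `k_fold_splits` and `auc`, the seed handling and the toy labels are hypothetical, and the review itself does not prescribe this code.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical usage: 10 labelled pairs split into 5 folds, plus a toy AUC.
splits = list(k_fold_splits(n_samples=10, k=5))
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch; a sort-based version would be preferable for large association matrices.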
Affiliation(s)
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
43
|
Li Y, Hu XG, Wang L, Li PP, You ZH. MNMDCDA: prediction of circRNA-disease associations by learning mixed neighborhood information from multiple distances. Brief Bioinform 2022; 23:6831006. [PMID: 36384071 DOI: 10.1093/bib/bbac479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 09/25/2022] [Accepted: 10/10/2022] [Indexed: 11/18/2022]
Abstract
Emerging evidence suggests that circular RNA (circRNA) is an important regulator of a variety of pathological processes and serves as a promising biomarker for many complex human diseases. Nevertheless, there are relatively few known circRNA-disease associations, and uncovering new circRNA-disease associations by wet-lab methods is time consuming and costly. Considering the limitations of existing computational methods, we propose a novel approach named MNMDCDA, which combines high-order graph convolutional networks (high-order GCNs) and deep neural networks to infer associations between circRNAs and diseases. Firstly, we computed different biological attribute information of circRNA and disease separately and used them to construct multiple multi-source similarity networks. Then, we used the high-order GCN algorithm to learn feature embedding representations with high-order mixed neighborhood information of circRNA and disease from the constructed multi-source similarity networks, respectively. Finally, the deep neural network classifier was implemented to predict associations of circRNAs with diseases. The MNMDCDA model obtained AUC scores of 95.16%, 94.53%, 89.80% and 91.83% on four benchmark datasets, i.e., CircR2Disease, CircAtlas v2.0, Circ2Disease and CircRNADisease, respectively, using the 5-fold cross-validation approach. Furthermore, 25 of the top 30 circRNA-disease pairs with the best scores of MNMDCDA in the case study were validated by recent literature. Numerous experimental results indicate that MNMDCDA can be used as an effective computational tool to predict circRNA-disease associations and can provide the most promising candidates for biological experiments.
Affiliation(s)
- Yang Li
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Xue-Gang Hu
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Lei Wang
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
  - College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
- Pei-Pei Li
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Zhu-Hong You
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
  - School of Computer Science, Northwestern Polytechnical University, Xi'an Shaanxi 710129, China
|
44
|
Lan W, Dong Y, Chen Q, Liu J, Wang J, Chen YPP, Pan S. IGNSCDA: Predicting CircRNA-Disease Associations Based on Improved Graph Convolutional Network and Negative Sampling. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3530-3538. [PMID: 34506289 DOI: 10.1109/tcbb.2021.3111607] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 06/13/2023]
Abstract
Accumulating evidence has shown that circRNAs play an important role in human diseases and can be used as potential biomarkers for the diagnosis and treatment of disease. Although some computational methods have been proposed to predict circRNA-disease associations, their performance still needs to be improved. In this paper, we propose a new computational model based on an Improved Graph convolutional network and Negative Sampling to predict CircRNA-Disease Associations (IGNSCDA). Our method first constructs a heterogeneous network based on known circRNA-disease associations. Then, an improved graph convolutional network is designed to obtain the feature vectors of circRNAs and diseases. Further, a multi-layer perceptron is employed to predict circRNA-disease associations based on these feature vectors. In addition, a negative sampling method is employed to reduce the effect of noisy samples; it selects negative samples based on circRNA expression profile similarity and Gaussian interaction profile kernel similarity. Five-fold cross-validation is used to evaluate the performance of the method. The results show that IGNSCDA outperforms other state-of-the-art methods in prediction performance. Moreover, the case study shows that IGNSCDA is an effective tool for predicting potential circRNA-disease associations.
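The abstract does not spell out how negatives are chosen, so the following is only one plausible reading of similarity-guided negative sampling, offered as a hedged sketch rather than the procedure used by IGNSCDA. It assumes that an unlabeled pair is a more reliable negative when the circRNA is dissimilar to every circRNA already linked to that disease. The function name `pick_reliable_negatives` and the toy matrices are hypothetical.

```python
def pick_reliable_negatives(assoc, circ_sim, n_negatives):
    """Return the unlabeled (circRNA, disease) pairs whose circRNA is least
    similar to any circRNA already linked to the disease (assumed heuristic)."""
    n_circ, n_dis = len(assoc), len(assoc[0])
    candidates = []
    for d in range(n_dis):
        linked = [c for c in range(n_circ) if assoc[c][d] == 1]
        for c in range(n_circ):
            if assoc[c][d] == 1:
                continue  # known positive, never a negative candidate
            # Highest similarity to any circRNA known to be linked to disease d.
            closeness = max((circ_sim[c][k] for k in linked), default=0.0)
            candidates.append((closeness, c, d))
    candidates.sort()  # lowest closeness first
    return [(c, d) for _, c, d in candidates[:n_negatives]]

# Hypothetical toy data: 3 circRNAs x 2 diseases and a circRNA similarity matrix.
assoc = [
    [1, 0],
    [0, 1],
    [0, 0],
]
circ_sim = [
    [1.0, 0.2, 0.9],
    [0.2, 1.0, 0.1],
    [0.9, 0.1, 1.0],
]
negatives = pick_reliable_negatives(assoc, circ_sim, n_negatives=2)
```

In this toy case the pair (circRNA 2, disease 0) is ranked last because circRNA 2 closely resembles circRNA 0, which is already linked to disease 0, so it is the least trustworthy negative.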
|
45
|
MHDMF: Prediction of miRNA-disease associations based on Deep Matrix Factorization with Multi-source Graph Convolutional Network. Comput Biol Med 2022; 149:106069. [PMID: 36115300 DOI: 10.1016/j.compbiomed.2022.106069] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/06/2022] [Revised: 07/31/2022] [Accepted: 08/27/2022] [Indexed: 11/24/2022]
Abstract
A growing number of studies have shown that microRNAs (miRNAs) are crucial biomarkers in diverse biological processes affecting various diseases. As a complement to costly wet-lab methods, numerous computational prediction methods have emerged. However, challenges remain in handling the many false-negative associations and in making effective use of multi-source information to find potential associations. In this work, we develop an end-to-end computational framework, called MHDMF, which integrates multi-source information on a heterogeneous network to discover latent disease-miRNA associations. Since many false negatives exist among the miRNA-disease associations, MHDMF utilizes a multi-source Graph Convolutional Network (GCN) to correct them by reformulating the miRNA-disease association score matrix. The score matrix reformulation is based on different similarity profiles and known associations between miRNAs, genes, and diseases. Then, MHDMF employs Deep Matrix Factorization (DMF) to predict miRNA-disease associations from the reformulated score matrix. The experimental results show that the proposed framework outperforms closely related comparison methods by a large margin on miRNA-disease association prediction tasks. Furthermore, case studies suggest that MHDMF could be a convenient and efficient tool and may offer a new way to think about miRNA-disease association prediction.
|
46
|
Xie X, Wang Y, Sheng N, Zhang S, Cao Y, Fu Y. Predicting miRNA-disease associations based on multi-view information fusion. Front Genet 2022; 13:979815. [PMID: 36238163 PMCID: PMC9552014 DOI: 10.3389/fgene.2022.979815] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/28/2022] [Accepted: 08/16/2022] [Indexed: 11/13/2022]
Abstract
MicroRNAs (miRNAs) play an important role in various biological processes, and their abnormal expression can lead to disease. Exploring the potential relationships between miRNAs and diseases can contribute to the diagnosis and treatment of complex diseases. The growing number of databases storing miRNA and disease information provides opportunities to develop computational methods for discovering unobserved disease-related miRNAs, but challenges remain in how to effectively learn and fuse information from multi-source data. In this study, we propose a multi-view information fusion based method for miRNA-disease association (MDA) prediction, named MVIFMDA. First, multiple heterogeneous networks are constructed by combining the known MDAs with different similarities of miRNAs and diseases based on multi-source information. Second, the topology features of miRNAs and diseases are obtained by applying a graph convolutional network to each heterogeneous network view. Moreover, we design an attention strategy at the topology representation level to adaptively fuse representations that carry different structural information. Meanwhile, we learn the attribute representations of miRNAs and diseases from their similarity attribute views with convolutional neural networks. Finally, the complicated associations between miRNAs and diseases are reconstructed by applying a bilinear decoder to the combined features, which merge the topology and attribute representations. Experimental results on the public dataset demonstrate that our proposed model consistently outperforms baseline methods. The case studies further show the ability of the MVIFMDA model to infer underlying associations between miRNAs and diseases.
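The bilinear decoder mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes the usual form, in which the score of a pair is the sigmoid of z_a^T W z_b for two embedding vectors and a weight matrix W that would normally be learned. The embeddings, the matrix `W` and the function name `bilinear_score` are illustrative values and names, not taken from the paper.

```python
import math

def bilinear_score(z_a, weight, z_b):
    """Return sigmoid(z_a^T W z_b) for two embedding vectors (assumed form)."""
    # W z_b: project the second embedding through the weight matrix.
    projected = [sum(w * b for w, b in zip(row, z_b)) for row in weight]
    # z_a^T (W z_b): inner product with the first embedding.
    logit = sum(a * p for a, p in zip(z_a, projected))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical 2-dimensional embeddings and a fixed (not learned) weight matrix.
z_mirna = [1.0, 0.0]
z_disease = [0.0, 2.0]
W = [[0.0, 1.0],
     [0.0, 0.0]]
score = bilinear_score(z_mirna, W, z_disease)
```

Unlike a plain inner product, the weight matrix lets the decoder relate different embedding dimensions of the two node types, which is why the toy score here is above 0.5 even though the two vectors are orthogonal.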
Affiliation(s)
- Xuping Xie
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Nan Sheng
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Shuangquan Zhang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Yangkun Cao
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Yuan Fu
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
- Correspondence: Yan Wang
|
47
|
Wang L, Wong L, Li Z, Huang Y, Su X, Zhao B, You Z. A machine learning framework based on multi-source feature fusion for circRNA-disease association prediction. Brief Bioinform 2022; 23:6693603. [PMID: 36070867 DOI: 10.1093/bib/bbac388] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/02/2022] [Revised: 07/26/2022] [Accepted: 08/11/2022] [Indexed: 11/14/2022]
Abstract
Circular RNAs (circRNAs) are involved in the regulatory mechanisms of multiple complex diseases, and the identification of their associations is critical to the diagnosis and treatment of diseases. In recent years, many computational methods have been designed to predict circRNA-disease associations. However, most of the existing methods rely on single correlation data. Here, we propose a machine learning framework for circRNA-disease association prediction, called MLCDA, which effectively fuses multiple sources of heterogeneous information including circRNA sequences and disease ontology. Comprehensive evaluation in the gold standard dataset showed that MLCDA can successfully capture the complex relationships between circRNAs and diseases and accurately predict their potential associations. In addition, the results of case studies on real data show that MLCDA significantly outperforms other existing methods. MLCDA can serve as a useful tool for circRNA-disease association prediction, providing mechanistic insights for disease research and thus facilitating the progress of disease treatment.
Affiliation(s)
- Lei Wang
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Leon Wong
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Zhengwei Li
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Yuan Huang
  - Department of Computing, Hong Kong Polytechnic University, Hong Kong 999077, China
- Xiaorui Su
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Bowei Zhao
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Zhuhong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
|
48
|
Dai Q, Liu Z, Wang Z, Duan X, Guo M. GraphCDA: a hybrid graph representation learning framework based on GCN and GAT for predicting disease-associated circRNAs. Brief Bioinform 2022; 23:6692549. [PMID: 36070619 DOI: 10.1093/bib/bbac379] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/03/2022] [Revised: 07/18/2022] [Accepted: 08/09/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Circular RNA (circRNA) is a class of noncoding RNA with high conservation and stability, and is considered an important disease biomarker and drug target. Accumulating evidence indicates that circRNAs play a crucial role in the pathogenesis and progression of many complex diseases. As biological experiments are time-consuming and labor-intensive, developing an accurate computational prediction method has become indispensable for identifying disease-related circRNAs. RESULTS: We present a hybrid graph representation learning framework, named GraphCDA, for predicting potential circRNA-disease associations. First, the circRNA-circRNA similarity network and the disease-disease similarity network were constructed to characterize the relationships among circRNAs and among diseases, respectively. Second, a hybrid graph embedding model combining Graph Convolutional Networks and Graph Attention Networks was introduced to learn the feature representations of circRNAs and diseases simultaneously. Finally, the learned representations were concatenated and used to build the prediction model for identifying circRNA-disease associations. A series of experimental results demonstrated that GraphCDA outperformed other state-of-the-art methods on several public databases. Moreover, GraphCDA achieved good performance even when only a small number of known circRNA-disease associations were used as the training set. In addition, case studies conducted on several human diseases further confirmed the capability of GraphCDA for predicting potential disease-related circRNAs. In conclusion, extensive experimental results indicated that GraphCDA could serve as a reliable tool for exploring the regulatory role of circRNAs in complex diseases.
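As a rough illustration of the graph convolution half of such a hybrid model, the sketch below applies one layer of the widely used propagation rule H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W) to a small similarity network. It is an assumption that GraphCDA uses exactly this rule; the attention (GAT) part is not shown, and the function names, adjacency matrix, features and weights are hypothetical toy values.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    # Symmetric normalisation of the adjacency matrix.
    a_norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
              for i in range(n)]
    hidden = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]  # ReLU

# Hypothetical 2-node similarity network with one-hot features.
adj = [[0.0, 1.0],
       [1.0, 0.0]]
features = [[1.0, 0.0],
            [0.0, 1.0]]
weight = [[1.0], [1.0]]
out = gcn_layer(adj, features, weight)
```

With similarity networks the adjacency entries would be real-valued weights rather than 0 or 1; the normalisation step works unchanged as long as the weights are non-negative.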
Affiliation(s)
- Qiguo Dai
  - School of Computer Science and Engineering, Dalian Minzu University, 116600, Dalian, China
  - SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
- Ziqiang Liu
  - School of Computer Science and Engineering, Dalian Minzu University, 116600, Dalian, China
  - SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
- Zhaowei Wang
  - SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
  - School of Computer Science and Technology, Dalian University of Technology, 116024, Dalian, China
- Xiaodong Duan
  - SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, 100044, Beijing, China
|
49
|
Huang Y, Li Y, Lin W, Fan S, Chen H, Xia J, Pi J, Xu JF. Promising Roles of Circular RNAs as Biomarkers and Targets for Potential Diagnosis and Therapy of Tuberculosis. Biomolecules 2022; 12:biom12091235. [PMID: 36139074 PMCID: PMC9496049 DOI: 10.3390/biom12091235] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/07/2022] [Revised: 09/01/2022] [Accepted: 09/02/2022] [Indexed: 12/02/2022]
Abstract
Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb) infection, remains one of the most threatening infectious diseases worldwide. A series of challenges still exist for TB prevention, diagnosis and treatment, and further work is required to clarify the pathological and immunological mechanisms in the development and progression of TB. Circular RNAs (circRNAs) are a large class of non-coding RNAs, mostly expressed in eukaryotic cells, which are generated by the spliceosome through the back-splicing of linear RNAs. A growing number of studies have found that circRNAs are widely involved in a variety of physiological and pathological processes, acting as sponges or decoys for microRNAs and proteins, scaffold platforms for proteins, modulators of transcription and special templates for translation. Because circRNAs are stable and widely distributed, they are expected to serve as promising prognostic/diagnostic biomarkers and therapeutic targets for diseases. In this review, we briefly describe the biogenesis, classification, detection technology and functions of circRNAs and, in particular, outline the dynamic and sometimes aberrant changes of circRNAs in TB. Moreover, we summarize recent progress in research linking circRNAs to TB-related pathogenetic processes, as well as the potential roles of circRNAs as diagnostic biomarkers and miRNA sponges in Mtb infection, which is expected to enhance our understanding of TB and provide some novel ideas about how to overcome the challenges associated with TB in the future.
Collapse
Affiliation(s)
- Yifan Huang
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
| | - Ying Li
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
| | - Wensen Lin
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
| | - Shuhao Fan
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
| | - Haorong Chen
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
| | - Jiaojiao Xia
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
| | - Jiang Pi
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
- Correspondence: (J.P.); (J.-F.X.)
| | - Jun-Fa Xu
- Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan 523808, China
- Institute of Laboratory Medicine, School of Medical Technology, Guangdong Medical University, Dongguan 523808, China
- Correspondence: (J.P.); (J.-F.X.)
| |
|
50
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022] Open
Abstract
Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources provides a more comprehensive research perspective, and brings a challenge to algorithm design for generating accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce the taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
| |
|