1
Guan Z, Jin X, Zhang X. MFF-nDA: A Computational Model for ncRNA-Disease Association Prediction Based on Multimodule Fusion. J Chem Inf Model 2025; 65:3324-3342. [PMID: 40129032 DOI: 10.1021/acs.jcim.5c00174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2025]
Abstract
Noncoding RNAs (ncRNAs), including piwi-interacting RNA (piRNA), long noncoding RNA (lncRNA), microRNA (miRNA), small nucleolar RNA (snoRNA), and circular RNA (circRNA), contribute significantly to gene expression regulation and serve as key factors in disease association studies and health-related exploration. Accurate prediction of ncRNA-disease associations is crucial for elucidating disease mechanisms and advancing therapeutic development. Recently, many computational models based on graph neural networks (GNNs) have emerged for identifying associations between various ncRNAs and diseases. However, existing computational models have not fully utilized integrative information on ncRNAs and diseases, and reliance on GNN-based models alone may limit performance because of oversmoothing. In addition, existing models mainly target a specific type of ncRNA and may not be applicable to most ncRNAs. To overcome these limitations, we propose MFF-nDA, a computational model based on multimodule fusion. Specifically, we first introduce five types of similarity network information, comprising three types of ncRNA similarity and two types of disease similarity, in order to fully explore and optimize the multisource feature information on these entities. Subsequently, we establish three modules: a heterogeneous network representation module based on Transformer, an association network representation module based on a graph convolutional network (GCN), and a topological structure representation module based on a graph attention network (GAT), which capture diverse features of nodes in heterogeneous networks and topological structure information reflected in association networks. The complementary effects of the three modules also help relieve the oversmoothing issue to some extent. By leveraging multimodule fusion learning to comprehensively capture the diverse features of these entities, our model outperforms the available state-of-the-art methods, achieving an AUC greater than 0.9000 on each dataset. This demonstrates strong predictive performance, making it a valuable tool for identifying potential ncRNAs associated with diseases. The code of MFF-nDA can be accessed at https://github.com/Jack-Cxy/MFF-nDA.
Affiliation(s)
- Zhihao Guan
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- Xiu Jin
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- Xiaodan Zhang
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
2
Fu HY, Liu YY, Zhang MY, Yang HX. Enrichment Analysis and Deep Learning in Biomedical Ontology: Applications and Advancements. CHINESE MEDICAL SCIENCES JOURNAL = CHUNG-KUO I HSUEH K'O HSUEH TSA CHIH 2025; 40:45-56. [PMID: 40164517 DOI: 10.24920/004464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Biomedical big data, characterized by its massive scale, multi-dimensionality, and heterogeneity, offers novel perspectives for disease research, elucidates biological principles, and simultaneously prompts changes in related research methodologies. Biomedical ontology, as a shared formal conceptual system, not only offers standardized terms for multi-source biomedical data but also provides a solid data foundation and framework for biomedical research. In this review, we summarize enrichment analysis and deep learning for biomedical ontology based on its structure and semantic annotation properties, highlighting how technological advancements are enabling the more comprehensive use of ontology information. Enrichment analysis represents an important application of ontology to elucidate the potential biological significance for a particular molecular list. Deep learning, on the other hand, represents an increasingly powerful analytical tool that can be more widely combined with ontology for analysis and prediction. With the continuous evolution of big data technologies, the integration of these technologies with biomedical ontologies is opening up exciting new possibilities for advancing biomedical research.
Affiliation(s)
- Hong-Yu Fu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yang-Yang Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Mei-Yi Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hai-Xiu Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
3
Yao B, Song Y. lncRNA-disease association prediction based on optimizing measures of multi-graph regularized matrix factorization. Comput Methods Biomech Biomed Engin 2025:1-16. [PMID: 40114384 DOI: 10.1080/10255842.2025.2479854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Revised: 02/05/2025] [Accepted: 02/17/2025] [Indexed: 03/22/2025]
Abstract
In this paper, we propose a novel lncRNA-disease association prediction algorithm based on optimizing measures of multi-graph regularized matrix factorization (OM-MGRMF). The method first calculates the semantic similarity of diseases, the functional similarity of lncRNAs, and the Gaussian similarity of both. It then constructs a new lncRNA-disease association matrix by using the K-nearest-neighbor (KNN) algorithm. Finally, the objective function is constructed through the utilization of ranking measures and multi-graph regularization constraints. This objective function is iteratively optimized by an adaptive gradient descent algorithm. The experimental results of OM-MGRMF outperform those of classical methods in both K-fold cross-validation.
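The "Gaussian similarity" step mentioned in this abstract is commonly computed as a Gaussian interaction profile (GIP) kernel over the rows of the association matrix. The following is a minimal sketch under that assumption; the function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are illustrative and are not taken from the paper.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile similarity between the rows of a 0/1 matrix.

    profiles: list of rows, one interaction profile per entity (assumed layout).
    gamma_prime: bandwidth hyperparameter (assumed default of 1.0).
    """
    n = len(profiles)
    # The bandwidth is normalised by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim

# Toy association matrix: 3 lncRNAs (rows) x 4 diseases (columns).
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
lncrna_sim = gip_kernel(associations)
```

Passing the transposed matrix would give the corresponding disease-disease similarity. Identical profiles receive a similarity of exactly 1, and the value decays as profiles diverge.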
Affiliation(s)
- Bin Yao
  - School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
  - Henan International Joint Laboratory of Direct Drive and General of Intelligent Equipment, Jiaozuo, China
- Yunzhong Song
  - School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
  - Henan International Joint Laboratory of Direct Drive and General of Intelligent Equipment, Jiaozuo, China
4
Zhang X, Zou Q, Niu M, Wang C. Predicting circRNA-disease associations with shared units and multi-channel attention mechanisms. Bioinformatics 2025; 41:btaf088. [PMID: 40045181 PMCID: PMC11919450 DOI: 10.1093/bioinformatics/btaf088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Revised: 02/05/2025] [Accepted: 02/22/2025] [Indexed: 03/20/2025] Open
Abstract
MOTIVATION: Circular RNAs (circRNAs) have been identified as key players in the progression of several diseases; however, their roles have not yet been determined because of the high financial burden of biological studies. This highlights the urgent need to develop efficient computational models that can predict circRNA-disease associations, offering an alternative approach to overcome the limitations of expensive experimental studies. Although multi-view learning methods have been widely adopted, most approaches fail to fully exploit the latent information across views, while simultaneously overlooking the fact that different views contribute to varying degrees of significance.
RESULTS: This study presents a method that combines multi-view shared units and multichannel attention mechanisms to predict circRNA-disease associations (MSMCDA). MSMCDA first constructs similarity and meta-path networks for circRNAs and diseases by introducing shared units to facilitate interactive learning across distinct network features. Subsequently, multichannel attention mechanisms were used to optimize the weights within similarity networks. Finally, contrastive learning strengthened the similarity features. Experiments on five public datasets demonstrated that MSMCDA significantly outperformed other baseline methods. Additionally, case studies on colorectal cancer, gastric cancer, and nonsmall cell lung cancer confirmed the effectiveness of MSMCDA in uncovering new associations.
AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/zhangxue2115/MSMCDA.git.
Affiliation(s)
- Xue Zhang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150000, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610000, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610000, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
- Mengting Niu
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610000, China
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen, Guangdong 518055, China
- Chunyu Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150000, China
5
Bi XA, Chen K, Jiang S, Luo S, Zhou W, Xing Z, Xu L, Liu Z, Liu T. Community Graph Convolution Neural Network for Alzheimer's Disease Classification and Pathogenetic Factors Identification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1959-1973. [PMID: 37204952 DOI: 10.1109/tnnls.2023.3269446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
In the brain, a complex neural network system, brain regions and genes collaborate to effectively store and transmit information. We abstract these collaboration correlations as the brain region gene community network (BG-CN) and present a new deep learning approach, the community graph convolutional neural network (Com-GCN), for investigating the transmission of information within and between communities. The results can be used for diagnosing and extracting causal factors for Alzheimer's disease (AD). First, an affinity aggregation model for BG-CN is developed to describe intercommunity and intracommunity information transmission. Second, we design the Com-GCN architecture with intercommunity convolution and intracommunity convolution operations based on the affinity aggregation model. Through sufficient experimental validation on the AD neuroimaging initiative (ADNI) dataset, the design of Com-GCN matches the physiological mechanism better and improves the interpretability and classification performance. Furthermore, Com-GCN can identify lesioned brain regions and disease-causing genes, which may assist precision medicine and drug design in AD and serve as a valuable reference for other neurological disorders.
6
Lan W, Li C, Chen Q, Yu N, Pan Y, Zheng Y, Chen YPP. LGCDA: Predicting CircRNA-Disease Association Based on Fusion of Local and Global Features. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1413-1422. [PMID: 38607720 DOI: 10.1109/tcbb.2024.3387913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2024]
Abstract
CircRNA has been shown to be involved in the occurrence of many diseases. Several computational frameworks have been proposed to identify circRNA-disease associations. Although existing computational methods have achieved considerable success, they still need improvement because their performance may degrade due to data sparsity and memory overflow. We develop a novel computational framework called LGCDA that predicts circRNA-disease associations by fusing local and global features to address these problems. First, we construct closed local subgraphs using k-hop closed subgraphs and label the subgraphs to obtain rich graph pattern information. Then, the local features are extracted using a graph neural network (GNN). In addition, we fuse the Gaussian interaction profile (GIP) kernel and cosine similarity to obtain global features. Finally, the score of each circRNA-disease association is predicted by a multilayer perceptron (MLP) based on the local and global features. We perform five-fold cross-validation on five datasets for model evaluation, and our model surpasses other advanced methods.
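The local-subgraph step in this abstract can be illustrated with a small sketch: collect every node within k hops of either endpoint of a candidate circRNA-disease pair, then keep the edges among those nodes. The abstract does not give the exact construction or the labeling scheme, so the function names (`k_hop_neighbors`, `enclosing_subgraph`), the adjacency-dict layout, and the toy graph below are assumptions, and node labeling is omitted.

```python
from collections import deque

def k_hop_neighbors(adj, start, k):
    """Return every node within k hops of start (start included), via BFS."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for nxt in adj.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return set(depth)

def enclosing_subgraph(adj, circ, disease, k=1):
    """Induced subgraph on the union of the two k-hop neighbourhoods."""
    nodes = k_hop_neighbors(adj, circ, k) | k_hop_neighbors(adj, disease, k)
    return {n: sorted(m for m in adj.get(n, ()) if m in nodes) for n in sorted(nodes)}

# Toy bipartite association graph (hypothetical circRNAs c* and diseases d*).
adj = {
    "c1": ["d1", "d2"],
    "c2": ["d2"],
    "c3": ["d3"],
    "d1": ["c1"],
    "d2": ["c1", "c2"],
    "d3": ["c3"],
}
sub = enclosing_subgraph(adj, "c1", "d2", k=1)
```

In link-prediction pipelines of this kind, the edge between the two target nodes is often removed from the subgraph before training so the model cannot read the label directly; whether LGCDA does so is not stated in the abstract.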
7
Wang XF, Yu CQ, You ZH, Wang Y, Huang L, Qiao Y, Wang L, Li ZW. BEROLECMI: a novel prediction method to infer circRNA-miRNA interaction from the role definition of molecular attributes and biological networks. BMC Bioinformatics 2024; 25:264. [PMID: 39127625 DOI: 10.1186/s12859-024-05891-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 08/01/2024] [Indexed: 08/12/2024] Open
Abstract
Circular RNA (CircRNA)-microRNA (miRNA) interaction (CMI) is an important model for the regulation of biological processes by non-coding RNA (ncRNA), which provides a new perspective for the study of human complex diseases. However, the existing CMI prediction models mainly rely on the nearest neighbor structure in the biological network, ignoring the molecular network topology, so it is difficult to improve the prediction performance. In this paper, we proposed a new CMI prediction method, BEROLECMI, which uses molecular sequence attributes, molecular self-similarity, and biological network topology to define the specific role feature representation for molecules to infer the new CMI. BEROLECMI effectively makes up for the lack of network topology in the CMI prediction model and achieves the highest prediction performance in three commonly used data sets. In the case study, 14 of the 15 pairs of unknown CMIs were correctly predicted.
Affiliation(s)
- Xin-Fei Wang
  - School of Information Engineering, Xijing University, Xi'an, China
- Chang-Qing Yu
  - School of Information Engineering, Xijing University, Xi'an, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Lan Huang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Yan Qiao
  - College of Agriculture and Forestry, Longdong University, Qingyang, China
- Lei Wang
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
  - Guangxi Academy of Sciences, Nanning, China
- Zheng-Wei Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
8
Yin W, Wang S, Qiao S, Zhao Y, Wu W, Pang S, Lv Z. DETHACDA: A Dual-View Edge and Topology Hybrid Attention Model for CircRNA-Disease Associations Prediction. IEEE J Biomed Health Inform 2024; 28:4421-4431. [PMID: 37307176 DOI: 10.1109/jbhi.2023.3284851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
There is growing evidence that circRNAs are involved in the physiological processes and pathogenesis of many complex diseases and may serve as critical therapeutic targets. Identifying disease-associated circRNAs through biological experiments is time-consuming, so designing an intelligent, precise computational model is essential. Recently, many models based on graph technology have been proposed to predict circRNA-disease associations. However, most existing methods only capture the neighborhood topology of the association network and ignore the complex semantic information. Therefore, we propose a Dual-view Edge and Topology Hybrid Attention model for predicting CircRNA-Disease Associations (DETHACDA), which effectively captures the neighborhood topology and various semantics of circRNA and disease nodes in a heterogeneous network. The 5-fold cross-validation experiments on circRNADisease indicate that the proposed DETHACDA achieves an area under the receiver operating characteristic curve of 0.9882, better than four state-of-the-art computational methods.
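For readers unfamiliar with the evaluation protocol reported here, the sketch below shows one way to compute the area under the ROC curve from labels and scores, together with a simple 5-fold index split. It is a generic illustration, not the authors' evaluation code; `roc_auc`, `k_fold_indices`, and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=5):
    """Assign sample i to fold i % k; each fold serves once as the test set."""
    return [list(range(i, n_samples, k)) for i in range(k)]

# Toy example: 1 marks a known association, 0 an unlabeled pair.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]
auc = roc_auc(labels, scores)
folds = k_fold_indices(10, k=5)
```

In practice the sample order is usually shuffled before splitting, and the AUC values from the five folds are averaged.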
9
Peng L, Ren M, Huang L, Chen M. GEnDDn: An lncRNA-Disease Association Identification Framework Based on Dual-Net Neural Architecture and Deep Neural Network. Interdiscip Sci 2024; 16:418-438. [PMID: 38733474 DOI: 10.1007/s12539-024-00619-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 02/02/2024] [Accepted: 02/03/2024] [Indexed: 05/13/2024]
Abstract
Accumulating studies have demonstrated close relationships between long non-coding RNAs (lncRNAs) and diseases. Identification of new lncRNA-disease associations (LDAs) enables us to better understand disease mechanisms and further provides promising insights into cancer targeted therapy and anti-cancer drug design. Here, we present an LDA prediction framework called GEnDDn based on deep learning. GEnDDn mainly comprises two steps: First, features of both lncRNAs and diseases are extracted by combining similarity computation, non-negative matrix factorization, and graph attention auto-encoder, respectively. And each lncRNA-disease pair (LDP) is depicted as a vector based on concatenation operation on the extracted features. Subsequently, unknown LDPs are classified by aggregating dual-net neural architecture and deep neural network. Using six different evaluation metrics, we found that GEnDDn surpassed four competing LDA identification methods (SDLDA, LDNFSGB, IPCARF, LDASR) on the lncRNADisease and MNDR databases under fivefold cross-validation experiments on lncRNAs, diseases, LDPs, and independent lncRNAs and independent diseases, respectively. Ablation experiments further validated the powerful LDA prediction performance of GEnDDn. Furthermore, we utilized GEnDDn to find underlying lncRNAs for lung cancer and breast cancer. The results elucidated that there may be dense linkages between IFNG-AS1 and lung cancer as well as between HIF1A-AS1 and breast cancer. The results require further biomedical experimental verification. GEnDDn is publicly available at https://github.com/plhhnu/GEnDDn.
Affiliation(s)
- Lihong Peng
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Mengnan Ren
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Liangliang Huang
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Min Chen
  - School of Computer Science, Hunan Institute of Technology, Hengyang, 421002, China
10
Li YC, You ZH, Yu CQ, Wang L, Hu L, Hu PW, Qiao Y, Wang XF, Huang YA. DeepCMI: a graph-based model for accurate prediction of circRNA-miRNA interactions with multiple information. Brief Funct Genomics 2024; 23:276-285. [PMID: 37539561 DOI: 10.1093/bfgp/elad030] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/18/2023] [Revised: 05/25/2023] [Accepted: 07/13/2023] [Indexed: 08/05/2023] Open
Abstract
Recently, the role of competing endogenous RNAs in regulating gene expression through the interaction of microRNAs has been closely associated with the expression of circular RNAs (circRNAs) in various biological processes such as reproduction and apoptosis. While the number of confirmed circRNA-miRNA interactions (CMIs) continues to increase, the conventional in vitro approaches for discovery are expensive, labor intensive, and time consuming. Therefore, there is an urgent need for effective prediction of potential CMIs through appropriate data modeling and prediction based on known information. In this study, we proposed a novel model, called DeepCMI, that utilizes multi-source information on circRNA/miRNA to predict potential CMIs. Comprehensive evaluations on the CMI-9905 and CMI-9589 datasets demonstrated that DeepCMI successfully infers potential CMIs. Specifically, DeepCMI achieved AUC values of 90.54% and 94.8% on the CMI-9905 and CMI-9589 datasets, respectively. These results suggest that DeepCMI is an effective model for predicting potential CMIs and has the potential to significantly reduce the need for downstream in vitro studies. To facilitate the use of our trained model and data, we have constructed a computational platform, which is available at http://120.77.11.78/DeepCMI/. The source code and datasets used in this work are available at https://github.com/LiYuechao1998/DeepCMI.
Affiliation(s)
- Yue-Chao Li
  - School of Information Engineering, Xijing University, Xi'an, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Chang-Qing Yu
  - School of Information Engineering, Xijing University, Xi'an, China
- Lei Wang
  - Guangxi Academy of Sciences, Nanning, China
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi, China
- Peng-Wei Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi, China
- Yan Qiao
  - College of Agriculture and Forestry, Longdong University, Qingyang 745000, China
- Xin-Fei Wang
  - School of Information Engineering, Xijing University, Xi'an, China
- Yu-An Huang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
11
Wang L, Li ZW, You ZH, Huang DS, Wong L. GSLCDA: An Unsupervised Deep Graph Structure Learning Method for Predicting CircRNA-Disease Association. IEEE J Biomed Health Inform 2024; 28:1742-1751. [PMID: 38127594 DOI: 10.1109/jbhi.2023.3344714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
A growing number of studies reveal that circular RNAs (circRNAs) are broadly engaged in the physiological processes of cell proliferation, differentiation, aging, and apoptosis, and are closely associated with the pathogenesis of numerous diseases. Clarifying the correlation between diseases and circRNAs is of great clinical importance for providing new therapeutic strategies for complex diseases. However, previous circRNA-disease association (CDA) prediction methods rely excessively on the graph network, and model performance is dramatically reduced when noisy connections occur in the graph structure. To address this problem, this paper proposes an unsupervised deep graph structure learning method, GSLCDA, to predict potential CDAs. Concretely, we first integrate circRNA and disease multi-source data to constitute the CDA heterogeneous network. Then the network topology is learned using the graph structure, and the original graph is enhanced in an unsupervised manner by maximizing the inter information of the learned and original graphs to uncover their essential features. Finally, a graph-space-sensitive k-nearest neighbor (KNN) algorithm is employed to search for latent CDAs. On the benchmark dataset, GSLCDA obtained 92.67% accuracy with 0.9279 AUC. GSLCDA also exhibits exceptional performance on independent datasets. Furthermore, 14, 12 and 14 of the top 16 circRNAs with the highest GSLCDA prediction scores were confirmed in the relevant literature in the breast cancer, colorectal cancer and lung cancer case studies, respectively. These results demonstrate that GSLCDA can effectively reveal underlying CDAs and offer new perspectives for the diagnosis and therapy of complex human diseases.
12
Wang L, Li ZW, You ZH, Huang DS, Wong L. MAGCDA: A Multi-Hop Attention Graph Neural Networks Method for CircRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:1752-1761. [PMID: 38145538 DOI: 10.1109/jbhi.2023.3346821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2023]
Abstract
A growing body of evidence establishes that circular RNAs (circRNAs) are widely present in eukaryotic cells and contribute significantly to the occurrence and development of many complex human diseases. Disease-associated circRNAs can serve as clinical diagnostic biomarkers and therapeutic targets, providing novel ideas for biopharmaceutical research. However, available computational methods for predicting circRNA-disease associations (CDAs) do not sufficiently consider the contextual information of biological network nodes, which limits their performance. In this work, we propose a multi-hop attention graph neural network-based approach, MAGCDA, to infer potential CDAs. Specifically, we first construct a multi-source attribute heterogeneous network of circRNAs and diseases, then use a multi-hop strategy of graph nodes to deeply aggregate node context information through attention diffusion, thus enhancing topological structure information and mining hidden data features, and finally use random forest to accurately infer potential CDAs. On the four gold standard data sets, MAGCDA achieved prediction accuracy of 92.58%, 91.42%, 83.46% and 91.12%, respectively. MAGCDA also performed prominently in ablation experiments and in comparisons with other models. Additionally, 18 and 17 of the 20 potential circRNAs with the highest MAGCDA prediction scores were confirmed in case studies of the complex diseases breast cancer and Alzheimer's disease, respectively. These results suggest that MAGCDA can be a practical tool to explore potential disease-associated circRNAs and provide a theoretical basis for disease diagnosis and treatment.
13
Zhang P, Zhang W, Sun W, Xu J, Hu H, Wang L, Wong L. Identification of gene biomarkers for brain diseases via multi-network topological semantics extraction and graph convolutional network. BMC Genomics 2024; 25:175. [PMID: 38350848 PMCID: PMC10865627 DOI: 10.1186/s12864-024-09967-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 01/03/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND: Brain diseases pose a significant threat to human health, and various network-based methods have been proposed for identifying gene biomarkers associated with these diseases. However, the brain is a complex system, and extracting topological semantics from different brain networks is necessary yet challenging to identify pathogenic genes for brain diseases.
RESULTS: In this study, we present a multi-network representation learning framework called M-GBBD for the identification of gene biomarkers in brain diseases. Specifically, we collected multi-omics data to construct eleven networks from different perspectives. M-GBBD extracts the spatial distributions of features from these networks and iteratively optimizes them using Kullback-Leibler divergence to fuse the networks into a common semantic space that represents the gene network for the brain. Subsequently, a graph consisting of both gene and large-scale disease proximity networks learns representations through graph convolution techniques and predicts whether a gene is associated with brain diseases while providing association scores. Experimental results demonstrate that M-GBBD outperforms several baseline methods. Furthermore, our analysis supported by bioinformatics revealed CAMP as a gene significantly associated with Alzheimer's disease identified by M-GBBD.
CONCLUSION: Collectively, M-GBBD provides valuable insights into identifying gene biomarkers for brain diseases and serves as a promising framework for brain network representation learning.
Affiliation(s)
- Ping Zhang
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, Shandong, China
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Weihan Zhang
  - CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design, Chinese Academy of Sciences, Hubei Hongshan Laboratory, Wuhan, 430074, China
- Weicheng Sun
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Jinsheng Xu
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Hua Hu
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, Shandong, China
- Lei Wang
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, Shandong, China
  - Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Guangxi Academy of Sciences, Nanning, 530007, China
- Leon Wong
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, 518118, China
14
Guo LX, Wang L, You ZH, Yu CQ, Hu ML, Zhao BW, Li Y. Biolinguistic graph fusion model for circRNA-miRNA association prediction. Brief Bioinform 2024; 25:bbae058. [PMID: 38426324 PMCID: PMC10939421 DOI: 10.1093/bib/bbae058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 01/19/2024] [Accepted: 01/27/2024] [Indexed: 03/02/2024] Open
Abstract
Emerging clinical evidence suggests that sophisticated associations between circular ribonucleic acids (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor in various pathological processes and play a critical role in most intricate human diseases. Nonetheless, identifying these correlations via wet experiments is error-prone and labor-intensive, and the underlying novel circRNA-miRNA association (CMA) has been validated by numerous existing computational methods that rely only on single correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features by Word2vec and interaction behavior features by two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization. Comprehensive experimental analyses revealed that BGF-CMAP successfully predicted the complex relationship between circRNAs and miRNAs with an accuracy of 82.90% and an area under the receiver operating characteristic curve of 0.9075. Furthermore, 23 of the top 30 miRNA-associated circRNAs in the case study were confirmed in relevant experiments, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model to provide a scientific theoretical basis for the study of CMA prediction.
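Applying Word2vec to RNA sequences generally requires turning each sequence into a list of "words" first. A common choice, assumed here because the abstract does not specify the tokenization, is to slide a window of length k over the sequence and treat each k-mer as a token. The function name `kmer_tokens`, the default `k=3`, and the example sequences are hypothetical.

```python
def kmer_tokens(sequence, k=3):
    """Split an RNA sequence into overlapping k-mers (stride 1).

    DNA-style input is normalised by upper-casing and mapping T to U.
    A sequence shorter than k yields an empty list.
    """
    seq = sequence.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# Each token list plays the role of one "sentence" for a Word2vec-style model.
sequences = ["AUGGCU", "atggct", "AU"]
corpus = [kmer_tokens(s, k=3) for s in sequences]
```

The resulting `corpus` could then be passed to any Word2vec implementation to learn one embedding per k-mer; averaging the k-mer vectors of a sequence is one simple way to obtain a fixed-length sequence feature.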
Affiliation(s)
- Lu-Xiang Guo
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
- Lei Wang
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
  - College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710129, China
- Chang-Qing Yu
  - College of Information Engineering, Xijing University, Xi’an 710123, China
- Meng-Lei Hu
  - School of Medicine, Peking University, Beijing, 100091, China
- Bo-Wei Zhao
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Yang Li
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
|
15
|
Li G, Zeng F, Luo J, Liang C, Xiao Q. MNCLCDA: predicting circRNA-drug sensitivity associations by using mixed neighbourhood information and contrastive learning. BMC Med Inform Decis Mak 2023; 23:291. [PMID: 38110886 PMCID: PMC10729363 DOI: 10.1186/s12911-023-02384-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 12/01/2023] [Indexed: 12/20/2023]
Abstract
BACKGROUND circRNAs play an important role in drug resistance and cancer development. Recently, many studies have shown that the expressions of circRNAs in human cells can affect the sensitivity of cells to therapeutic drugs, thus significantly influencing the therapeutic effects of these drugs. Traditional biomedical experiments required to verify this sensitivity relationship are not only time-consuming but also expensive. Hence, the development of an efficient computational approach that can accurately predict the novel associations between drug sensitivities and circRNAs is a crucial and pressing need. METHODS In this research, we present a novel computational framework called MNCLCDA, which aims to predict the potential associations between drug sensitivities and circRNAs to assist with medical research. First, MNCLCDA quantifies the similarity between the given drug and circRNA using drug structure information, circRNA gene sequence information, and GIP kernel information. Due to the existence of noise in similarity information, we employ a preprocessing approach based on random walk with restart for similarity networks to efficiently capture the useful features of circRNAs and drugs. Second, we use a mixed neighbourhood graph convolutional network to obtain the neighbourhood information of nodes. Then, a graph-based contrastive learning method is used to enhance the robustness of the model, and finally, a double Laplace-regularized least-squares method is used to predict potential circRNA-drug associations through the kernel matrices in the circRNA and drug spaces. RESULTS Numerous experimental results show that MNCLCDA outperforms six other advanced methods. In addition, the excellent performance of our proposed model in case studies illustrates that MNCLCDA also has the ability to predict the associations between drug sensitivity and circRNA in practical situations. 
CONCLUSIONS Extensive experiments show that MNCLCDA is an efficient tool for predicting the potential associations between drug sensitivities and circRNAs and can thereby provide some guidance for clinical trials.
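Two of the building blocks named in the METHODS section, the GIP (Gaussian interaction profile) kernel and random walk with restart, can be sketched as follows. This is a minimal pure-Python illustration using formulations that are common in the association-prediction literature; the bandwidth choice, the restart probability of 0.5, the function names, and the toy matrix are assumptions and are not taken from the MNCLCDA paper or its code.

```python
import math


def gip_kernel(profiles):
    """GIP kernel between the rows of a binary association matrix.

    K[i][j] = exp(-gamma * ||row_i - row_j||^2), where gamma is set to
    1 / (mean squared row norm), a common (assumed) bandwidth choice.
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_norm = sum(sq_norms) / n
    gamma = 1.0 / mean_norm if mean_norm > 0 else 1.0
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist)
    return kernel


def random_walk_with_restart(sim, restart=0.5, tol=1e-8, max_iter=1000):
    """Row i of the result is the RWR steady-state distribution seeded at node i."""
    n = len(sim)
    # Column-normalise the similarity matrix into a transition matrix.
    col_sums = [sum(sim[r][c] for r in range(n)) for c in range(n)]
    trans = [[sim[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n)]
             for r in range(n)]
    result = []
    for seed in range(n):
        p0 = [1.0 if i == seed else 0.0 for i in range(n)]
        p = p0[:]
        for _ in range(max_iter):
            nxt = [(1 - restart) * sum(trans[r][c] * p[c] for c in range(n))
                   + restart * p0[r] for r in range(n)]
            delta = sum(abs(a - b) for a, b in zip(nxt, p))
            p = nxt
            if delta < tol:
                break
        result.append(p)
    return result


# Hypothetical toy data: 3 circRNAs x 3 drugs (1 = known association).
toy_assoc = [[1, 0, 1], [1, 0, 0], [0, 1, 0]]
toy_kernel = gip_kernel(toy_assoc)
toy_smoothed = random_walk_with_restart(toy_kernel, restart=0.5)
```

Applying the random walk to a similarity matrix in this way spreads each node's similarity mass over its neighbourhood, which is one way such preprocessing can dampen noisy entries.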
Affiliation(s)
- Guanghui Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Feifan Zeng
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Qiu Xiao
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
|
16
|
Shen Z, Liu W, Zhao S, Zhang Q, Wang S, Yuan L. Nucleotide-level prediction of CircRNA-protein binding based on fully convolutional neural network. Front Genet 2023; 14:1283404. [PMID: 37867600 PMCID: PMC10587422 DOI: 10.3389/fgene.2023.1283404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 09/21/2023] [Indexed: 10/24/2023]
Abstract
Introduction: CircRNA-protein binding plays a critical role in complex biological activity and disease. Various deep learning-based algorithms have been proposed to identify CircRNA-protein binding sites. These methods predict at the sequence level whether a CircRNA sequence includes protein binding sites, and primarily concentrate on analysing the sequence specificity of CircRNA-protein binding. In terms of model performance, these methods are unsatisfactory at accurately predicting motif sites that have special functions in gene expression. Methods: In this study, drawing on deep learning models that implement pixel-level binary classification in computer vision, we treated CircRNA-protein binding site prediction as a nucleotide-level binary classification task and used a fully convolutional neural network to identify CircRNA-protein binding motif sites (CPBFCN). Results: CPBFCN provides a new path to predict CircRNA motifs. Using the MEME tool and existing CircRNA-related and protein-related databases, we analysed the functions of the motifs discovered by CPBFCN. We also investigated the correlation between CircRNA sponge and motif distribution. Furthermore, by comparing the motif distribution with different input sequence lengths, we found that some motifs in the flanking sequences of CircRNA-protein binding regions may contribute to CircRNA-protein binding. Conclusion: This study contributes to identifying circRNA-protein binding and helps in understanding the role of circRNA-protein binding in gene expression regulation.
Affiliation(s)
- Zhen Shen
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan, China
- Wei Liu
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan, China
- ShuJun Zhao
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan, China
- QinHu Zhang
  - EIT Institute for Advanced Study, Ningbo, Zhejiang, China
- SiGuo Wang
  - EIT Institute for Advanced Study, Ningbo, Zhejiang, China
- Lin Yuan
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
|
17
|
Wu J, Ning Z, Ding Y, Wang Y, Peng Q, Fu L. KGETCDA: an efficient representation learning framework based on knowledge graph encoder from transformer for predicting circRNA-disease associations. Brief Bioinform 2023; 24:bbad292. [PMID: 37587836 DOI: 10.1093/bib/bbad292] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/31/2023] [Revised: 07/27/2023] [Accepted: 07/27/2023] [Indexed: 08/18/2023]
Abstract
Recent studies have demonstrated the significant role that circRNA plays in the progression of human diseases. Identifying circRNA-disease associations (CDA) in an efficient manner can offer crucial insights into disease diagnosis. While traditional biological experiments can be time-consuming and labor-intensive, computational methods have emerged as a viable alternative in recent years. However, these methods are often limited by data sparsity and their inability to explore high-order information. In this paper, we introduce a novel method named Knowledge Graph Encoder from Transformer for predicting CDA (KGETCDA). Specifically, KGETCDA first integrates more than 10 databases to construct a large heterogeneous non-coding RNA dataset, which contains multiple relationships between circRNA, miRNA, lncRNA and disease. Then, a biological knowledge graph is created based on this dataset, and Transformer-based knowledge representation learning and attentive propagation layers are applied to obtain high-quality embeddings that accurately capture high-order interaction information. Finally, a multilayer perceptron is used to predict the matching scores of CDA based on their embeddings. Our empirical results demonstrate that KGETCDA significantly outperforms other state-of-the-art models. To enhance user experience, we have developed an interactive web-based platform named HNRBase that allows users to visualize, download data and make predictions using KGETCDA with ease. The code and datasets are publicly available at https://github.com/jinyangwu/KGETCDA.
Affiliation(s)
- Jinyang Wu
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Zhiwei Ning
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Yidong Ding
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Ying Wang
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Qinke Peng
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Laiyi Fu
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
  - Research Institute of Xi'an Jiaotong University, 311200, Zhejiang, China
  - Sichuan Digital Economy Industry Development Research Institute, 610036, Sichuan, China
|
18
|
Zhang P, Wu H. IChrom-Deep: An Attention-Based Deep Learning Model for Identifying Chromatin Interactions. IEEE J Biomed Health Inform 2023; 27:4559-4568. [PMID: 37402191 DOI: 10.1109/jbhi.2023.3292299] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 07/06/2023]
Abstract
Identification of chromatin interactions is crucial for advancing our knowledge of gene regulation. However, due to the limitations of high-throughput experimental techniques, there is an urgent need to develop computational methods for predicting chromatin interactions. In this study, we propose a novel attention-based deep learning model, termed IChrom-Deep, to identify chromatin interactions using sequence features and genomic features. The experimental results based on the datasets of three cell lines demonstrate that the IChrom-Deep achieves satisfactory performance and is superior to the previous methods. We also investigate the effect of DNA sequence and associated features and genomic features on chromatin interactions, and highlight the applicable scenarios of some features, such as sequence conservation and distance. Moreover, we identify a few genomic features that are extremely important across different cell lines, and IChrom-Deep achieves comparable performance with only these significant genomic features versus using all genomic features. It is believed that IChrom-Deep can serve as a useful tool for future studies that seek to identify chromatin interactions.
|
19
|
Ai N, Liang Y, Yuan H, Ouyang D, Xie S, Liu X. GDCL-NcDA: identifying non-coding RNA-disease associations via contrastive learning between deep graph learning and deep matrix factorization. BMC Genomics 2023; 24:424. [PMID: 37501127 PMCID: PMC10373414 DOI: 10.1186/s12864-023-09501-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 07/02/2023] [Indexed: 07/29/2023]
Abstract
Non-coding RNAs (ncRNAs) have drawn wide attention in recent years because they play vital roles in life activities. As a good complement to wet-lab experimental methods, computational prediction methods can greatly reduce experimental costs. However, a high rate of false-negative data and insufficient use of multi-source information can affect the performance of computational prediction methods. Furthermore, many computational methods lack robustness and generalization across different datasets. In this work, we propose an effective end-to-end computing framework, called GDCL-NcDA, which combines deep graph learning and deep matrix factorization (DMF) with contrastive learning to identify latent ncRNA-disease associations on diverse multi-source heterogeneous networks (MHNs). The diverse MHNs include different similarity networks and proven associations among ncRNAs (miRNAs, circRNAs, and lncRNAs), genes, and diseases. First, GDCL-NcDA employs a deep graph convolutional network and multiple attention mechanisms to adaptively integrate the multiple sources in the MHNs and reconstruct the ncRNA-disease association graph. Then, GDCL-NcDA uses DMF to predict the latent disease-associated ncRNAs based on the reconstructed graphs, reducing the impact of false negatives in the original associations. Finally, GDCL-NcDA uses contrastive learning (CL) to generate a contrastive loss on the reconstructed graphs and the predicted graphs to improve the generalization and robustness of the framework. The experimental results show that GDCL-NcDA outperforms highly related computational methods. Moreover, case studies demonstrate the effectiveness of GDCL-NcDA in identifying associations between diverse ncRNAs and diseases.
Affiliation(s)
- Ning Ai
  - Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China
  - School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, China
- Yong Liang
  - Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China
  - Pazhou Laboratory (Huangpu), Guangzhou, 510555, Guangdong, China
- Haoliang Yuan
  - School of Automation, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Dong Ouyang
  - Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China
  - School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, China
- Shengli Xie
  - Institute of Intelligent Information Processing, Guangdong University of Technology, Guangzhou, 510000, Guangdong, China
- Xiaoying Liu
  - Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai, Guangdong, 519090, China
|
20
|
Wang Y, Gao YL, Wang J, Li F, Liu JX. MSGCA: Drug-Disease Associations Prediction Based on Multi-Similarities Graph Convolutional Autoencoder. IEEE J Biomed Health Inform 2023; 27:3686-3694. [PMID: 37163398 DOI: 10.1109/jbhi.2023.3272154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Identifying drug-disease associations (DDAs) is critical to the development of drugs. Traditional methods to determine DDAs are expensive and inefficient. Therefore, it is imperative to develop more accurate and effective methods for DDA prediction. Most current DDA prediction methods use the original DDA matrix directly. However, the original DDA matrix is sparse, which greatly affects the prediction results. Hence, a prediction method based on a multi-similarities graph convolutional autoencoder (MSGCA) is proposed for DDA prediction. First, MSGCA integrates multiple drug similarities and disease similarities using the centered kernel alignment-based multiple kernel learning (CKA-MKL) algorithm to form a new drug similarity and a new disease similarity, respectively. Second, the new drug and disease similarities are improved by linear neighborhood, and the DDA matrix is reconstructed by weighted K nearest neighbor profiles. Next, the reconstructed DDAs and the improved drug and disease similarities are integrated into a heterogeneous network. Finally, a graph convolutional autoencoder with an attention mechanism is used to predict DDAs. Compared with existing methods, MSGCA shows superior results on three datasets. Furthermore, case studies demonstrate the reliability of MSGCA.
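The matrix reconstruction step based on weighted K nearest neighbor profiles can be illustrated with the sketch below. It follows one commonly used formulation (neighbor contributions weighted by similarity and a rank-based decay, averaged over the drug side and the disease side, then merged with the known entries by an element-wise maximum). The abstract does not give the exact variant used in MSGCA, so the formulation, the parameter values k = 2 and decay = 0.5, the function names, and the toy matrices are all assumptions.

```python
def knn_profile_fill(assoc, sim, k=2, decay=0.5):
    """Estimate each row's association profile from its k most similar rows."""
    n_rows, n_cols = len(assoc), len(assoc[0])
    filled = [[0.0] * n_cols for _ in range(n_rows)]
    for q in range(n_rows):
        # Rank the other rows by similarity to row q and keep the top k.
        neighbours = sorted((i for i in range(n_rows) if i != q),
                            key=lambda i: sim[q][i], reverse=True)[:k]
        norm = sum(sim[q][i] for i in neighbours)
        if norm == 0:
            continue  # no informative neighbours; leave the row at zero
        for rank, i in enumerate(neighbours):
            weight = (decay ** rank) * sim[q][i]
            for c in range(n_cols):
                filled[q][c] += weight * assoc[i][c] / norm
    return filled


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def wknkn(assoc, row_sim, col_sim, k=2, decay=0.5):
    """Combine row-side and column-side estimates; never lower a known entry."""
    by_rows = knn_profile_fill(assoc, row_sim, k, decay)
    by_cols = transpose(knn_profile_fill(transpose(assoc), col_sim, k, decay))
    return [[max(assoc[r][c], 0.5 * (by_rows[r][c] + by_cols[r][c]))
             for c in range(len(assoc[0]))] for r in range(len(assoc))]


# Hypothetical toy data: 3 drugs x 2 diseases; drug 1 has no known association.
toy_assoc = [[1, 0], [0, 0], [0, 1]]
toy_drug_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
toy_disease_sim = [[1.0, 0.5], [0.5, 1.0]]
toy_filled = wknkn(toy_assoc, toy_drug_sim, toy_disease_sim)
```

In the toy example, the all-zero row for drug 1 receives small non-zero scores borrowed from its most similar drugs, which is the intended effect of densifying a sparse association matrix before it is fed to a downstream model.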
|
21
|
Shan D, Wang S, Wang J, Lu J, Ren J, Chen J, Wang D, Qi P. Computed tomography angiography-based radiomics model for predicting carotid atherosclerotic plaque vulnerability. Front Neurol 2023; 14:1151326. [PMID: 37396779 PMCID: PMC10312009 DOI: 10.3389/fneur.2023.1151326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 05/30/2023] [Indexed: 07/04/2023]
Abstract
Vulnerable carotid atherosclerotic plaque (CAP) significantly contributes to ischemic stroke. Neovascularization within plaques is an emerging biomarker linked to plaque vulnerability that can be detected using contrast-enhanced ultrasound (CEUS). Computed tomography angiography (CTA) is a common method used in clinical cerebrovascular assessments that can be employed to evaluate the vulnerability of CAPs. Radiomics is a technique that automatically extracts radiomic features from images. This study aimed to identify radiomic features associated with the neovascularization of CAP and construct a prediction model for CAP vulnerability based on radiomic features. CTA data and clinical data of patients with CAPs who underwent CTA and CEUS between January 2018 and December 2021 in Beijing Hospital were retrospectively collected. The data were divided into a training cohort and a testing cohort using a 7:3 split. According to the CEUS examination, CAPs were dichotomized into vulnerable and stable groups. 3D Slicer software was used to delineate the region of interest in CTA images, and the Pyradiomics package was used to extract radiomic features in Python. Machine learning algorithms including logistic regression (LR), support vector machine (SVM), random forest (RF), light gradient boosting machine (LGBM), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP) were used to construct the models. The confusion matrix, receiver operating characteristic (ROC) curve, accuracy, precision, recall, and F1 score were used to evaluate the performance of the models. A total of 74 patients with 110 CAPs were included. In all, 1,316 radiomic features were extracted, and 10 radiomic features were selected for machine-learning model construction. After evaluating several models on the testing cohort, model_RF outperformed the others, achieving an AUC value of 0.93 (95% CI: 0.88-0.99). The accuracy, precision, recall, and F1 score of model_RF in the testing cohort were 0.85, 0.87, 0.85, and 0.85, respectively. Radiomic features associated with the neovascularization of CAP were obtained. Our study highlights the potential of radiomics-based models for improving the accuracy and efficiency of diagnosing vulnerable CAP. In particular, model_RF, using radiomic features extracted from CTA, provides a noninvasive and efficient method for accurately predicting the vulnerability status of CAP. This model shows great potential for offering clinical guidance for early detection and improving patient outcomes.
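For readers less familiar with the evaluation metrics listed in this abstract, the sketch below shows how accuracy, precision, recall, and F1 score follow from the four cells of a binary confusion matrix. It is a generic illustration: the labels are made up (1 standing for "vulnerable", 0 for "stable"), the function name is hypothetical, and the abstract does not say whether the study reported per-class or averaged values.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for eight plaques, purely for illustration.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_report = binary_metrics(toy_true, toy_pred)
```

With one false positive and one false negative among eight cases, all four summary metrics in the toy example come out at 0.75.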
Affiliation(s)
- Dezhi Shan
  - Department of Neurosurgery, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
  - Graduate School of Peking Union Medical College, Beijing, China
- Siyu Wang
  - Department of Ultrasound, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Junjie Wang
  - Department of Neurosurgery, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Jun Lu
  - Department of Neurosurgery, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Junhong Ren
  - Department of Ultrasound, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Juan Chen
  - Department of Radiology, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Daming Wang
  - Department of Neurosurgery, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
  - Graduate School of Peking Union Medical College, Beijing, China
- Peng Qi
  - Department of Neurosurgery, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
|
22
|
Wang H, Han J, Li H, Duan L, Liu Z, Cheng H. CDA-SKAG: Predicting circRNA-disease associations using similarity kernel fusion and an attention-enhancing graph autoencoder. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:7957-7980. [PMID: 37161181 DOI: 10.3934/mbe.2023345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Circular RNAs (circRNAs) constitute a category of circular non-coding RNA molecules whose abnormal expression is closely associated with the development of diseases. As biological data become abundant, a lot of computational prediction models have been used for circRNA-disease association prediction. However, existing prediction models ignore the non-linear information of circRNAs and diseases when fusing multi-source similarities. In addition, these models fail to take full advantage of the vital feature information of high-similarity neighbor nodes when extracting features of circRNAs or diseases. In this paper, we propose a deep learning model, CDA-SKAG, which introduces a similarity kernel fusion algorithm to integrate multi-source similarity matrices to capture the non-linear information of circRNAs or diseases, and construct a circRNA information space and a disease information space. The model embeds an attention-enhancing layer in the graph autoencoder to enhance the associations between nodes with higher similarity. A cost-sensitive neural network is introduced to address the problem of positive and negative sample imbalance, consequently improving our model's generalization capability. The experimental results show that the prediction performance of our model CDA-SKAG outperformed existing circRNA-disease association prediction models. The results of the case studies on lung and cervical cancer suggest that CDA-SKAG can be utilized as an effective tool to assist in predicting circRNA-disease associations.
Affiliation(s)
- Huiqing Wang
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
- Jiale Han
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
- Haolin Li
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
- Liguo Duan
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
- Zhihao Liu
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
- Hao Cheng
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
|
23
|
Zheng K, Zhang XL, Wang L, You ZH, Ji BY, Liang X, Li ZW. SPRDA: a link prediction approach based on the structural perturbation to infer disease-associated Piwi-interacting RNAs. Brief Bioinform 2023; 24:6850564. [PMID: 36445194 DOI: 10.1093/bib/bbac498] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 08/15/2022] [Revised: 10/17/2022] [Accepted: 10/19/2022] [Indexed: 11/30/2022]
Abstract
piRNAs and PIWI proteins have been confirmed as novel biomarkers for disease diagnosis and treatment due to their abnormal expression in various cancers. However, current research is not sufficient to further clarify the functions of piRNA in cancer and its underlying mechanism. Therefore, how to provide large-scale and reliable piRNA candidates for biological research has become a pressing issue. In this study, a novel computational model based on the structural perturbation method, called SPRDA, is proposed to predict potential disease-associated piRNAs. Notably, SPRDA belongs to positive-unlabeled learning, which, in contrast to previous approaches, is unaffected by negative examples. In 5-fold cross-validation, SPRDA shows high performance on the benchmark dataset piRDisease, with an AUC of 0.9529. Furthermore, the predictive performance of SPRDA for 10 diseases shows the robustness of the proposed method. Overall, the proposed approach can provide unique insights into the pathogenesis of disease and will advance the field of oncology diagnosis and treatment.
Affiliation(s)
- Kai Zheng
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
- Xin-Lu Zhang
  - Civil Product General Research Institute, The 36th Research Institute of China Electronics Technology Group Corporation, Jiaxing, 314000, China
- Lei Wang
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Zhu-Hong You
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Bo-Ya Ji
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410006, China
- Xiao Liang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Zheng-Wei Li
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
|
24
|
MFIDMA: A Multiple Information Integration Model for the Prediction of Drug-miRNA Associations. BIOLOGY 2022; 12:biology12010041. [PMID: 36671734 PMCID: PMC9855084 DOI: 10.3390/biology12010041] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/20/2022] [Revised: 12/19/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022]
Abstract
Abnormal microRNA (miRNA) functions play significant roles in various pathological processes. Thus, predicting drug-miRNA associations (DMA) may hold great promise for identifying the potential targets of drugs. However, discovering the associations between drugs and miRNAs through wet experiments is time-consuming and laborious. Therefore, it is significant to develop computational prediction methods to improve the efficiency of identifying DMA on a large scale. In this paper, a multiple features integration model (MFIDMA) is proposed to predict drug-miRNA association. Specifically, we first formulated known DMA as a bipartite graph and utilized structural deep network embedding (SDNE) to learn the topological features from the graph. Second, the Word2vec algorithm was utilized to construct the attribute features of the miRNAs and drugs. Third, two kinds of features were entered into the convolution neural network (CNN) and deep neural network (DNN) to integrate features and predict potential target miRNAs for the drugs. To evaluate the MFIDMA model, it was implemented on three different datasets under a five-fold cross-validation and achieved average AUCs of 0.9407, 0.9444 and 0.8919. In addition, the MFIDMA model showed reliable results in the case studies of Verapamil and hsa-let-7c-5p, confirming that the proposed model can also predict DMA in real-world situations. The model was effective in analyzing the neighbors and topological features of the drug-miRNA network by SDNE. The experimental results indicated that the MFIDMA is an accurate and robust model for predicting potential DMA, which is significant for miRNA therapeutics research and drug discovery.
|
25
|
Peng L, Yang J, Wang M, Zhou L. Editorial: Machine learning-based methods for RNA data analysis—Volume II. Front Genet 2022; 13:1010089. [DOI: 10.3389/fgene.2022.1010089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 09/20/2022] [Indexed: 12/02/2022]
|
26
|
Li Y, Hu XG, Wang L, Li PP, You ZH. MNMDCDA: prediction of circRNA-disease associations by learning mixed neighborhood information from multiple distances. Brief Bioinform 2022; 23:6831006. [PMID: 36384071 DOI: 10.1093/bib/bbac479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 09/25/2022] [Accepted: 10/10/2022] [Indexed: 11/18/2022]
Abstract
Emerging evidence suggests that circular RNA (circRNA) is an important regulator of a variety of pathological processes and serves as a promising biomarker for many complex human diseases. Nevertheless, there are relatively few known circRNA-disease associations, and uncovering new circRNA-disease associations by wet-lab methods is time consuming and costly. Considering the limitations of existing computational methods, we propose a novel approach named MNMDCDA, which combines high-order graph convolutional networks (high-order GCNs) and deep neural networks to infer associations between circRNAs and diseases. Firstly, we computed different biological attribute information of circRNA and disease separately and used them to construct multiple multi-source similarity networks. Then, we used the high-order GCN algorithm to learn feature embedding representations with high-order mixed neighborhood information of circRNA and disease from the constructed multi-source similarity networks, respectively. Finally, the deep neural network classifier was implemented to predict associations of circRNAs with diseases. The MNMDCDA model obtained AUC scores of 95.16%, 94.53%, 89.80% and 91.83% on four benchmark datasets, i.e., CircR2Disease, CircAtlas v2.0, Circ2Disease and CircRNADisease, respectively, using the 5-fold cross-validation approach. Furthermore, 25 of the top 30 circRNA-disease pairs with the best scores of MNMDCDA in the case study were validated by recent literature. Numerous experimental results indicate that MNMDCDA can be used as an effective computational tool to predict circRNA-disease associations and can provide the most promising candidates for biological experiments.
Affiliation(s)
- Yang Li
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Xue-Gang Hu
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Lei Wang
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
  - College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
- Pei-Pei Li
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Zhu-Hong You
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
  - School of Computer Science, Northwestern Polytechnical University, Xi'an Shaanxi 710129, China
|