1
Ha J. DeepWalk-Based Graph Embeddings for miRNA-Disease Association Prediction Using Deep Neural Network. Biomedicines 2025; 13:536. [PMID: 40149513 PMCID: PMC11940379 DOI: 10.3390/biomedicines13030536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2025] [Revised: 02/17/2025] [Accepted: 02/17/2025] [Indexed: 03/29/2025] Open
Abstract
Background: In recent years, micro ribonucleic acids (miRNAs) have been recognized as key regulators in numerous biological processes, particularly in the development and progression of diseases. As a result, extensive research has focused on uncovering the critical involvement of miRNAs in disease mechanisms to better comprehend the underlying causes of human diseases. Despite these efforts, relying solely on biological experiments to identify miRNA-disease associations is both time-consuming and costly, making it an impractical approach for large-scale studies. Methods: In this paper, we propose a novel DeepWalk-based graph embedding method for predicting miRNA-disease association (DWMDA). Using DeepWalk, we extracted meaningful low-dimensional vectors from the miRNA and disease networks. Then, we applied a deep neural network to identify miRNA-disease associations using the low-dimensional vectors of miRNAs and diseases extracted via DeepWalk. Results: An ablation study was conducted to assess the proposed graph embedding modules. Furthermore, DWMDA demonstrates exceptional performance in two major cancer case studies (breast and lung), with results based on statistically robust measures, further emphasizing its reliability as a method for identifying associations between miRNAs and diseases. Conclusions: We expect that our model will not only facilitate the accurate prediction of disease-associated miRNAs but also serve as a generalizable framework for exploring interactions among various biological entities.
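The walk-sampling stage of a DeepWalk-style embedding can be sketched as follows. This is a minimal illustration, not the DWMDA implementation: the toy miRNA similarity network, the function name `random_walks`, and all parameter values are assumptions made for the example. In DeepWalk, the sampled walks are subsequently fed to a skip-gram model to learn the low-dimensional vectors; that step is omitted here.

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Sample truncated uniform random walks from an adjacency dict."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical toy network (names and edges are illustrative only).
toy_adj = {
    "miR-21": ["miR-155", "miR-34a"],
    "miR-155": ["miR-21"],
    "miR-34a": ["miR-21", "let-7a"],
    "let-7a": ["miR-34a"],
}
walks = random_walks(toy_adj, walk_length=5, walks_per_node=3)
```

Each walk plays the role of a "sentence" whose "words" are nodes, which is why nodes that co-occur on many walks end up with similar vectors after skip-gram training.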
Affiliation(s)
- Jihwan Ha
- Major of Big Data Convergence, Division of Data Information Science, Pukyong National University, Busan 48513, Republic of Korea
2
Ha J. Graph Convolutional Network with Neural Collaborative Filtering for Predicting miRNA-Disease Association. Biomedicines 2025; 13:136. [PMID: 39857720 PMCID: PMC11762804 DOI: 10.3390/biomedicines13010136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2024] [Revised: 01/01/2025] [Accepted: 01/07/2025] [Indexed: 01/27/2025] Open
Abstract
Background: Over the past few decades, micro ribonucleic acids (miRNAs) have been shown to play significant roles in various biological processes, including disease incidence. Therefore, much effort has been devoted to discovering the pivotal roles of miRNAs in disease incidence to understand the underlying pathogenesis of human diseases. However, identifying miRNA-disease associations using biological experiments is inefficient in terms of cost and time. Methods: Here, we discuss a novel machine-learning model that effectively predicts disease-related miRNAs using a graph convolutional neural network with neural collaborative filtering (GCNCF). By applying the graph convolutional neural network, we could effectively capture important miRNAs and disease feature vectors present in the network while preserving the network structure. By exploiting neural collaborative filtering, miRNAs and disease feature vectors were effectively learned through matrix factorization and deep learning, and disease-related miRNAs were identified. Results: Extensive experimental results based on area under the curve (AUC) scores (0.9216 and 0.9018) demonstrated the superiority of our model over previous models. Conclusions: We anticipate that our model could not only serve as an effective tool for predicting disease-related miRNAs but could be employed as a universal computational framework for inferring relationships across biological entities.
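How a graph convolution produces feature vectors while preserving network structure can be illustrated with a single propagation step. The sketch below uses the widely cited symmetric-normalization rule, ReLU(D^-1/2 (A + I) D^-1/2 H W); whether GCNCF uses exactly this rule is not stated in the abstract, and the toy adjacency matrix, features, and weights are assumptions made for the example.

```python
import math

def normalized_adjacency(A):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix A."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def gcn_layer(A, H, W):
    """One propagation step: aggregate neighbor features, project, apply ReLU."""
    Z = matmul(matmul(normalized_adjacency(A), H), W)
    return [[max(0.0, z) for z in row] for row in Z]

# Hypothetical 3-node path graph with one-hot input features.
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
H = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]  # illustrative weights, not learned
H1 = gcn_layer(A, H, W)
```

Each output row mixes a node's own features with those of its direct neighbors, so stacking layers widens the neighborhood that influences an embedding.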
Affiliation(s)
- Jihwan Ha
- Major of Big Data Convergence, Division of Data Information Science, Pukyong National University, Busan 48513, Republic of Korea
3
Zhu R, Wang Y, Dai LY. CLHGNNMDA: Hypergraph Neural Network Model Enhanced by Contrastive Learning for miRNA-Disease Association Prediction. J Comput Biol 2025; 32:47-63. [PMID: 39602201 DOI: 10.1089/cmb.2024.0720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024] Open
Abstract
Numerous biological experiments have demonstrated that microRNA (miRNA) is involved in gene regulation within cells, and mutations and abnormal expression of miRNA can cause a myriad of intricate diseases. Forecasting the association between miRNA and diseases can enhance disease prevention and treatment and accelerate drug research, which holds considerable importance for the development of clinical medicine and drug research. This investigation introduces a contrastive learning-augmented hypergraph neural network model, termed CLHGNNMDA, aimed at predicting associations between miRNAs and diseases. Initially, CLHGNNMDA constructs multiple hypergraphs by leveraging diverse similarity metrics related to miRNAs and diseases. Subsequently, hypergraph convolution is applied to each hypergraph to extract feature representations for nodes and hyperedges. Following this, autoencoders are employed to reconstruct information regarding the feature representations of nodes and hyperedges and to integrate various features of miRNAs and diseases extracted from each hypergraph. Finally, a joint contrastive loss function is utilized to refine the model and optimize its parameters. The CLHGNNMDA framework employs multi-hypergraph contrastive learning for the construction of a contrastive loss function. This approach takes into account inter-view interactions and upholds the principle of consistency, thereby augmenting the model's representational efficacy. The results obtained from fivefold cross-validation substantiate that the CLHGNNMDA algorithm achieves a mean area under the receiver operating characteristic curve of 0.9635 and a mean area under the precision-recall curve of 0.9656. These metrics are notably superior to those attained by contemporary state-of-the-art methodologies.
Affiliation(s)
- Rong Zhu
- School of Computer Science, Qufu Normal University, Rizhao, China
- Yong Wang
- Laboratory Experimental Teaching and Equipment Management Center, Qufu Normal University, Rizhao, China
- Ling-Yun Dai
- School of Computer Science, Qufu Normal University, Rizhao, China
4
Xia H, Dong C, Chen X, Wei Z, Gu L, Zhu X. SGTCDA: Prediction of circRNA-drug sensitivity associations with interpretable graph transformers and effective assessment. BMC Genomics 2024; 25:1113. [PMID: 39567908 PMCID: PMC11577602 DOI: 10.1186/s12864-024-11022-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 11/08/2024] [Indexed: 11/22/2024] Open
Abstract
CircRNAs are a type of circular non-coding RNA whose associations with drug sensitivities have been demonstrated in recent studies. Due to the high cost of biomedical experiments for detecting the associations between circRNAs and drug sensitivities, several computational methods have been developed. However, these methods were evaluated mainly by five- or tenfold cross-validation, which often yields over-optimistic estimates. Furthermore, these models have technical issues, such as over-smoothing and over-squashing. To address these issues, we propose a strategy to evaluate models based on independent test sets for association prediction-related studies. In light of this assessment strategy, we constructed a model, SGTCDA, by integrating structural deep network embedding (SDNE) and a graph transformer to predict potential circRNA-drug sensitivity associations; the model can efficiently capture long-range dependencies and local structural information of nodes. Our results on the training sets and the independent test sets indicate that SGTCDA outperforms the other state-of-the-art models, demonstrating its capacity for accurate prediction of circRNA-drug sensitivity. Moreover, we leveraged EdgeSHAPer to explain the performance of the proposed SGTCDA model, which illustrates that the edges between drugs are more important than other edges for the performance of the model. The source code and dataset of SGTCDA are available at: https://github.com/hwxia/SGTCDA.
Affiliation(s)
- Hongwei Xia
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, 230036, China
- Research Center for Agricultural Information Perception and Intelligent Computing Engineering of Anhui Province, Hefei, Anhui, 230036, China
- Caiyue Dong
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, 230036, China
- Research Center for Agricultural Information Perception and Intelligent Computing Engineering of Anhui Province, Hefei, Anhui, 230036, China
- Xinxing Chen
- School of Life Sciences, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Zhuoyu Wei
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, 230036, China
- Research Center for Agricultural Information Perception and Intelligent Computing Engineering of Anhui Province, Hefei, Anhui, 230036, China
- Lichuan Gu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, 230036, China
- Research Center for Agricultural Information Perception and Intelligent Computing Engineering of Anhui Province, Hefei, Anhui, 230036, China
- Xiaolei Zhu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, 230036, China
- Research Center for Agricultural Information Perception and Intelligent Computing Engineering of Anhui Province, Hefei, Anhui, 230036, China
5
Sun W, Ren C, Xu J, Zhang P. SAGCN: Using Graph Convolutional Network With Subgraph-Aware for circRNA-Drug Sensitivity Identification. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1765-1774. [PMID: 38885113 DOI: 10.1109/tcbb.2024.3415058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Circular RNAs (circRNAs) play a significant role in cancer development and therapy resistance. There is substantial evidence indicating that the expression of circRNAs affects the sensitivity of cells to drugs. Identifying circRNAs-drug sensitivity association (CDA) is helpful for disease treatment and drug discovery. However, the identification of CDA through conventional biological experiments is both time-consuming and costly. Therefore, it is urgent to develop computational methods to predict CDA. In this study, we propose a new computational method, the subgraph-aware graph convolutional network (SAGCN), for predicting CDA. SAGCN first constructs a heterogeneous network composed of circRNA similarity network, drug similarity network, and circRNA-drug bipartite network. Then, a subgraph extractor is proposed to learn the latent subgraph structure of the heterogeneous network using a graph convolutional network. The extractor can capture 1-hop and 2-hop information and then a fusing attention mechanism is designed to integrate them adaptively. Simultaneously, a novel subgraph-aware attention mechanism is proposed to detect intrinsic subgraph structure. The final node feature representation is obtained to make the CDA prediction. Experimental results demonstrate that SAGCN obtained an average AUC of 0.9120 and AUPR of 0.8693, exceeding the performance of the most advanced models under 10-fold cross-validation. Case studies have demonstrated the potential of SAGCN in identifying associations between circRNA and drug sensitivity.
6
Jalali P, Aliyari S, Etesami M, Saeedi Niasar M, Taher S, Kavousi K, Nazemalhosseini Mojarad E, Salehi Z. GUCA2A dysregulation as a promising biomarker for accurate diagnosis and prognosis of colorectal cancer. Clin Exp Med 2024; 24:251. [PMID: 39485546 PMCID: PMC11530487 DOI: 10.1007/s10238-024-01512-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Accepted: 10/21/2024] [Indexed: 11/03/2024]
Abstract
Colorectal cancer is a leading cause of global mortality and presents a significant barrier to improving life expectancy. The primary objective of this study was to identify a unique differentially expressed gene (DEG) that exhibits a strong association with colorectal cancer, and thereby to contribute valuable insights to the field of translational medicine. We analyzed colorectal cancer microarray datasets and the TCGA colon adenocarcinoma (COAD) dataset to identify DEGs associated with COAD, and the common DEGs were selected. Furthermore, a pan-cancer analysis encompassing 33 different cancer types was performed to identify genes significantly differentially expressed only in COAD. Then, a comprehensive in silico analysis was carried out, including gene set enrichment analysis; construction of protein-protein interaction, co-expression, and competing endogenous RNA (ceRNA) networks; investigation of the correlation between tumor-immune signatures in distinct tumor microenvironments; and examination of the potential interactions between the identified gene and various drugs. Further, the candidate gene was experimentally validated in tumoral colorectal tissues and colorectal adenomatous polyps by quantitative real-time PCR. GUCA2A emerged as a significant DEG specific to colorectal cancer (|log2FC| > 1 and adjusted q-value < 0.05). Importantly, GUCA2A exhibited excellent diagnostic performance for COAD, with areas under the curve (AUC) of 99.6% and 78% based on TCGA-COAD and colon cancer patients, respectively. In addition, GUCA2A expression in adenomatous polyps equal to or larger than 5 mm was significantly lower than in polyps smaller than 5 mm. Moreover, low expression of GUCA2A significantly impacted overall patient survival. Significant correlations were observed between tumor-immune signatures and GUCA2A expression. The constructed ceRNA network included GUCA2A, 8 shared miRNAs, and 61 circRNAs.
This study identifies GUCA2A as a promising prognostic and diagnostic biomarker for colorectal cancer. Further investigations are warranted to explore the potential of GUCA2A as a therapeutic biomarker.
Affiliation(s)
- Pooya Jalali
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Centre, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, P.O. Box: 19857-17411, Tehran, Iran
- Shahram Aliyari
- Department of Bioinformatics, Kish International Campus University of Tehran, Kish, Iran
- Division of Applied Bioinformatics, German Cancer Research Center DKFZ, Heidelberg, Germany
- Marziyeh Etesami
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Centre, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, P.O. Box: 19857-17411, Tehran, Iran
- Mahsa Saeedi Niasar
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Centre, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, P.O. Box: 19857-17411, Tehran, Iran
- Sahar Taher
- Islamic Azad University, Tabriz Branch, Tabriz, Iran
- Kaveh Kavousi
- Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Ehsan Nazemalhosseini Mojarad
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Centre, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, P.O. Box: 19857-17411, Tehran, Iran
- Department of Surgery, Leiden University Medical Center, Leiden, Netherlands
- Zahra Salehi
- Hematology, Oncology and Stem Cell Transplantation Research Center, Research Institute for Oncology, Hematology and Cell Therapy, Tehran University of Medical Sciences, Tehran, Iran
7
Ouyang D, Miao R, Zeng J, Li X, Ai N, Wang P, Hou J, Zheng J. SPLHRNMTF: robust orthogonal non-negative matrix tri-factorization with self-paced learning and dual hypergraph regularization for predicting miRNA-disease associations. BMC Genomics 2024; 25:885. [PMID: 39304826 DOI: 10.1186/s12864-024-10729-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 08/20/2024] [Indexed: 09/22/2024] Open
Abstract
MicroRNAs (miRNAs) have been demonstrated to be closely related to human diseases. Studying the potential associations between miRNAs and diseases contributes to our understanding of disease pathogenic mechanisms. As traditional biological experiments are costly and time-consuming, computational models can be considered as effective complementary tools. In this study, we propose a novel model of robust orthogonal non-negative matrix tri-factorization (NMTF) with self-paced learning and dual hypergraph regularization, named SPLHRNMTF, to predict miRNA-disease associations. More specifically, SPLHRNMTF first uses a non-linear fusion method to obtain comprehensive miRNA and disease similarities. Subsequently, the improved miRNA-disease association matrix is reformulated based on weighted k-nearest neighbor profiles to correct false-negative associations. In addition, we use the $L_{2,1}$ norm in place of the Frobenius norm to calculate the residual error, alleviating the impact of noise and outliers on prediction performance. Then, we integrate self-paced learning into NMTF to keep the model from falling into poor local optima by gradually including samples from easy to complex. Finally, hypergraph regularization is introduced to capture high-order complex relations from hypergraphs related to miRNAs and diseases. In five repetitions of 5-fold cross-validation, SPLHRNMTF obtains higher average AUC values than other baseline models. Moreover, the case studies on breast neoplasms and lung neoplasms further demonstrate the accuracy of SPLHRNMTF. Meanwhile, the potential associations discovered are of biological significance.
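Why replacing the Frobenius norm with the L2,1 norm dampens outliers can be seen in a small numeric sketch. The row-wise convention used here (sum of the Euclidean norms of the rows) is an assumption, since some papers apply the norm column-wise, and the residual matrix is invented for the example.

```python
import math

def frobenius_sq(R):
    """Squared Frobenius norm: every entry is squared, so large rows dominate."""
    return sum(v * v for row in R for v in row)

def l21_norm(R):
    """L2,1 norm (row-wise convention): sum of the Euclidean norms of the rows."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in R)

# Hypothetical residual matrix: two small rows and one outlier row.
residual = [[0.1, 0.1], [0.1, -0.1], [3.0, 4.0]]
outlier = [residual[2]]

share_frobenius = frobenius_sq(outlier) / frobenius_sq(residual)
share_l21 = l21_norm(outlier) / l21_norm(residual)
```

In this toy case the outlier row accounts for a smaller fraction of the L2,1 loss than of the squared Frobenius loss, because its error enters linearly rather than quadratically.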
Affiliation(s)
- Dong Ouyang
- School of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
- Rui Miao
- Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhuhai, 519099, China
- Juan Zeng
- School of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
- Xing Li
- School of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
- Ning Ai
- The college of Mechanical and Electrical Engineering, Shihezi University, Shihezi, 832003, China
- Panke Wang
- School of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
- Jie Hou
- School of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
- Jinqiu Zheng
- School of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
8
Chen Z, Zhang L, Li J, Chen H. Microbe-disease associations prediction by graph regularized non-negative matrix factorization with $L_{2,1}$ norm regularization terms. J Cell Mol Med 2024; 28:e18553. [PMID: 39239860 PMCID: PMC11377990 DOI: 10.1111/jcmm.18553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 06/19/2024] [Accepted: 07/09/2024] [Indexed: 09/07/2024] Open
Abstract
Microbes are involved in a wide range of biological processes and are closely associated with disease. Inferring potential disease-associated microbes as the biomarkers or drug targets may help prevent, diagnose and treat complex human diseases. However, biological experiments are time-consuming and expensive. In this study, we introduced a new method called iPALM-GLMF, which modelled microbe-disease association prediction as a problem of non-negative matrix factorization with graph dual regularization terms and $L_{2,1}$ norm regularization terms. The graph dual regularization terms were used to capture potential features in the microbe and disease space, and the $L_{2,1}$ norm regularization terms were used to ensure the sparsity of the feature matrices obtained from the non-negative matrix factorization and to improve the interpretability. To solve the model, iPALM-GLMF used a non-negative double singular value decomposition to initialize the matrix factorization and adopted an inertial Proximal Alternating Linear Minimization iterative process to obtain the final matrix factorization results. As a result, iPALM-GLMF performed better than other existing methods in leave-one-out cross-validation and fivefold cross-validation. In addition, case studies of different diseases demonstrated that iPALM-GLMF could effectively predict potential microbial-disease associations. iPALM-GLMF is publicly available at https://github.com/LiangzheZhang/iPALM-GLMF.
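The link between an L2,1 penalty and sparse feature matrices can be illustrated with its proximal operator, which shrinks each row toward zero and removes rows whose norm falls below the threshold. This is a generic sketch under a row-wise convention; it ignores the non-negativity constraint and is not taken from the iPALM-GLMF code. The matrix and the threshold `tau` are assumptions made for the example.

```python
import math

def prox_l21(X, tau):
    """Row-wise soft thresholding: the proximal operator of tau * ||X||_{2,1}."""
    out = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows with norm <= tau are set to zero; longer rows are shrunk by tau.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * v for v in row])
    return out

# Hypothetical factor matrix: one strong row and one weak row.
X = [[3.0, 4.0], [0.3, 0.4]]
X_sparse = prox_l21(X, tau=1.0)
```

Because whole rows are zeroed together, the surviving rows indicate which latent features the model actually uses, which is one common reading of the interpretability benefit.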
Affiliation(s)
- Ziwei Chen
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Liangzhe Zhang
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Jingyi Li
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Hang Chen
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
9
Chu S, Duan G, Yan C. PGCNMDA: Learning node representations along paths with graph convolutional network for predicting miRNA-disease associations. Methods 2024; 229:71-81. [PMID: 38909974 DOI: 10.1016/j.ymeth.2024.06.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 05/26/2024] [Accepted: 06/16/2024] [Indexed: 06/25/2024] Open
Abstract
Identifying miRNA-disease associations (MDAs) is crucial for improving the diagnosis and treatment of various diseases. However, biological experiments can be time-consuming and expensive. To overcome these challenges, computational approaches have been developed, with Graph Convolutional Network (GCN) showing promising results in MDA prediction. The success of GCN-based methods relies on learning a meaningful spatial operator to extract effective node feature representations. To enhance the inference of MDAs, we propose a novel method called PGCNMDA, which employs graph convolutional networks with a learning graph spatial operator from paths. This approach enables the generation of meaningful spatial convolutions from paths in GCN, leading to improved prediction performance. On HMDD v2.0, PGCNMDA obtains a mean AUC of 0.9229 and an AUPRC of 0.9206 under 5-fold cross-validation (5-CV), and a mean AUC of 0.9235 and an AUPRC of 0.9212 under 10-fold cross-validation (10-CV), respectively. Additionally, the AUC of PGCNMDA also reaches 0.9238 under global leave-one-out cross-validation (GLOOCV). On HMDD v3.2, PGCNMDA obtains a mean AUC of 0.9413 and an AUPRC of 0.9417 under 5-CV, and a mean AUC of 0.9419 and an AUPRC of 0.9425 under 10-CV, respectively. Furthermore, the AUC of PGCNMDA also reaches 0.9415 under GLOOCV. The results show that PGCNMDA is superior to other compared methods. In addition, the case studies on pancreatic neoplasms, thyroid neoplasms and leukemia show that 50, 50 and 48 of the top 50 predicted miRNAs linked to these diseases are confirmed, respectively. It further validates the effectiveness and feasibility of PGCNMDA in practical applications.
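The "mean AUC under k-fold cross-validation" figures quoted above are averages of per-fold AUC values. A minimal sketch of that computation follows; the labels and scores are invented and stand in for the held-out predictions a trained model would produce on each fold. It does not reproduce the PGCNMDA evaluation pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(folds):
    """Average the per-fold AUC over a list of (labels, scores) pairs."""
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)

# Hypothetical held-out predictions for three folds.
folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    ([1, 0, 0], [0.8, 0.3, 0.2]),
    ([1, 0, 1, 0], [0.7, 0.7, 0.2, 0.1]),
]
cv_auc = mean_auc(folds)
```

Leave-one-out cross-validation cannot be averaged this way, because a single held-out pair has no positive/negative contrast; in that setting the scores are usually pooled before a single AUC is computed.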
Affiliation(s)
- Shuang Chu
- School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, China
- Guihua Duan
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Cheng Yan
- School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, China
10
Sun W, Zhang P, Zhang W, Xu J, Huang Y, Li L. Synchronous Mutual Learning Network and Asynchronous Multi-Scale Embedding Network for miRNA-Disease Association Prediction. Interdiscip Sci 2024; 16:532-553. [PMID: 38310628 DOI: 10.1007/s12539-023-00602-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 02/06/2024]
Abstract
MicroRNA (miRNA) serves as a pivotal regulator of numerous cellular processes, and the identification of miRNA-disease associations (MDAs) is crucial for comprehending complex diseases. Recently, graph neural networks (GNN) have made significant advancements in MDA prediction. However, these methods tend to learn one type of node representation from a single heterogeneous network, ignoring the importance of multiple network topologies and node attributes. Here, we propose SMDAP (Sequence hierarchical modeling-based Mirna-Disease Association Prediction framework), a novel GNN-based framework that incorporates multiple network topologies and various node attributes including miRNA seed and full-length sequences to predict potential MDAs. Specifically, SMDAP consists of two types of MDA representation: following a heterogeneous pattern, we construct a transfer learning-like synchronous mutual learning network to learn the first MDA representation in conjunction with the miRNA seed sequence. Meanwhile, following a homogeneous pattern, we design a subgraph-inspired asynchronous multi-scale embedding network to obtain the second MDA representation based on the miRNA full-length sequence. Subsequently, an adaptive fusion approach is designed to combine the two branches such that we can score the MDAs by the downstream classifier and infer novel MDAs. Comprehensive experiments demonstrate that SMDAP integrates the advantages of multiple network topologies and node attributes into two branch representations. Moreover, the area under the receiver operating characteristic curve is 0.9622 on DB1, which is a 5.06% increase from the baselines. The area under the precision-recall curve is 0.9777, which is a 7.33% increase from the baselines. In addition, case studies on three human cancers validated the predictive performance of SMDAP. Overall, SMDAP represents a powerful tool for MDA prediction.
Affiliation(s)
- Weicheng Sun
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Ping Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Weihan Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Jinsheng Xu
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Li Li
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
11
Salooja CM, Sanker A, Deepthi K, Jereesh AS. An ensemble approach for circular RNA-disease association prediction using variational autoencoder and genetic algorithm. J Bioinform Comput Biol 2024; 22:2450018. [PMID: 39215523 DOI: 10.1142/s0219720024500185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Circular RNAs (circRNAs) are endogenous non-coding RNAs with a covalently closed loop structure. They have many biological functions, mainly regulatory ones. They have been proven to modulate protein-coding genes in the human genome. CircRNAs are linked to various diseases like Alzheimer's disease, diabetes, atherosclerosis, Parkinson's disease and cancer. Identifying the associations between circular RNAs and diseases is essential for disease diagnosis, prevention, and treatment. The proposed model, based on the variational autoencoder and genetic algorithm circular RNA disease association (VAGA-CDA), predicts novel circRNA-disease associations. First, the experimentally verified circRNA-disease associations are augmented with the synthetic minority oversampling technique (SMOTE) and regenerated using a variational autoencoder, and feature selection is applied to these vectors by a genetic algorithm (GA). The variational autoencoder effectively extracts features from the augmented samples. The optimized feature selection of the genetic algorithm effectively carried out dimensionality reduction. The sophisticated feature vectors extracted are then given to a Random Forest classifier to predict new circRNA-disease associations. The proposed model yields an AUC value of 0.9644 and 0.9628 under 5-fold and 10-fold cross-validations, respectively. The results of the case studies indicate the robustness of the proposed model.
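The SMOTE augmentation step mentioned above creates synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbors. The sketch below shows only that interpolation idea; the feature vectors, the function name `smote`, and the parameter values are assumptions, and how VAGA-CDA encodes circRNA-disease pairs as vectors is not covered.

```python
import math
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic points on segments between minority samples and their k nearest neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest minority neighbors of x (excluding x itself), by Euclidean distance
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nn = rng.choice(neighbors)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([xi + lam * (ni - xi) for xi, ni in zip(x, nn)])
    return synthetic

# Hypothetical 2-D feature vectors of known (positive) associations.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote(minority, n_synthetic=6)
```

Every synthetic point lies on a segment between two real minority samples, so the augmentation densifies the positive class without leaving the region the real samples span.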
Affiliation(s)
- C M Salooja
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
- Arjun Sanker
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
- K Deepthi
- Department of Computer Science, Central University of Kerala (Central Govt. of India), Kerala-671316, India
- A S Jereesh
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
12
Peng H, Xu J, Liu K, Liu F, Zhang A, Zhang X. EIEPCF: accurate inference of functional gene regulatory networks by eliminating indirect effects from confounding factors. Brief Funct Genomics 2024; 23:373-383. [PMID: 37642217 DOI: 10.1093/bfgp/elad040] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 07/07/2023] [Accepted: 08/14/2023] [Indexed: 08/31/2023] Open
Abstract
Reconstructing functional gene regulatory networks (GRNs) is a primary prerequisite for understanding pathogenic mechanisms and curing diseases in animals, and it also provides an important foundation for cultivating vegetable and fruit varieties that are resistant to diseases and corrosion in plants. Many computational methods have been developed to infer GRNs, but most of the regulatory relationships between genes obtained by these methods are biased. Eliminating indirect effects in GRNs remains a significant challenge for researchers. In this work, we propose a novel approach for inferring functional GRNs, named EIEPCF (eliminating indirect effects produced by confounding factors), which eliminates indirect effects caused by confounding factors. This method eliminates the influence of confounding factors on regulatory factors and target genes by measuring the similarity between their residuals. Validation of the EIEPCF method on simulation studies, the gold-standard networks provided by the DREAM3 Challenge, and the real gene networks of Escherichia coli demonstrates that it achieves significantly higher accuracy than other popular computational methods for inferring GRNs. As a case study, we used the EIEPCF method to reconstruct the cold-resistance-specific GRN from cold-resistance gene expression data of Arabidopsis thaliana. The source code and data are available at https://github.com/zhanglab-wbgcas/EIEPCF.
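The idea of comparing residuals after removing a confounder can be sketched with ordinary least squares and Pearson correlation, which together give the classical partial correlation. Both choices are assumptions: the abstract does not say which regression model or similarity measure EIEPCF uses, and the expression values below are invented.

```python
import math

def mean(v):
    return sum(v) / len(v)

def residuals(y, z):
    """Residuals of y after a least-squares linear fit on the confounder z."""
    zm, ym = mean(z), mean(y)
    slope = sum((a - zm) * (b - ym) for a, b in zip(z, y)) / sum((a - zm) ** 2 for a in z)
    intercept = ym - slope * zm
    return [b - (intercept + slope * a) for a, b in zip(z, y)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    xm, ym = mean(x), mean(y)
    cov = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - xm) ** 2 for a in x) * sum((b - ym) ** 2 for b in y))

# Hypothetical expression profiles: confounder Z drives both regulator X and target Y.
z = [-2.0, -1.0, 0.0, 1.0, 2.0]
x = [-3.9, -2.2, 0.2, 1.8, 4.1]
y = [2.1, 0.8, 0.0, -0.8, -2.1]

raw_similarity = pearson(x, y)
residual_similarity = pearson(residuals(x, z), residuals(y, z))
```

In this toy data X and Y look strongly related only because both follow Z; once Z is regressed out, the residuals are essentially uncorrelated, so the X-Y edge would be treated as an indirect effect.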
Affiliation(s)
- Huixiang Peng
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - University of Chinese Academy of Sciences, Beijing 100049 China
- Jing Xu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - University of Chinese Academy of Sciences, Beijing 100049 China
- Kangchen Liu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - University of Chinese Academy of Sciences, Beijing 100049 China
- Fang Liu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Aidi Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Xiujun Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan 430074, China
|
13
|
Xuan P, Wang X, Cui H, Meng X, Nakaguchi T, Zhang T. Meta-Path Semantic and Global-Local Representation Learning Enhanced Graph Convolutional Model for Disease-Related miRNA Prediction. IEEE J Biomed Health Inform 2024; 28:4306-4316. [PMID: 38709611 DOI: 10.1109/jbhi.2024.3397003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Dysregulation of miRNAs is closely related to the progression of various diseases, so identifying disease-related miRNAs is crucial. Most recently proposed methods are based on graph reasoning, but they do not fully exploit the topological structure composed of the higher-order neighbor nodes or the global and local features of miRNA and disease nodes. We proposed a prediction method, MDAP, to learn semantic features of miRNA and disease nodes based on various meta-paths, as well as node features from the perspective of the entire heterogeneous network, and node pair attributes. Firstly, for both the miRNA and disease nodes, node category-wise meta-paths were constructed to integrate the similarity and association connection relationships. Each target node has its specific neighbor nodes for each meta-path, and the neighbors of longer meta-paths constitute its higher-order neighbor topological structure. Secondly, we constructed a meta-path specific graph convolutional network module to integrate the features of higher-order neighbors and their topology, and then learned the semantic representations of nodes. Thirdly, for the entire miRNA-disease heterogeneous network, a global-aware graph convolutional autoencoder was built to learn the network-view feature representations of nodes. We also designed semantic-level and representation-level attentions to obtain informative semantic features and node representations. Finally, a strategy based on parallel convolutional-deconvolutional neural networks was designed to enhance the local feature learning for a pair of miRNA and disease nodes. The experimental results showed that MDAP outperformed other state-of-the-art methods, and the ablation experiments demonstrated the effectiveness of MDAP's major innovations. MDAP's ability to discover potential disease-related miRNAs was further analyzed through case studies of three diseases.
|
14
|
Biyu H, Mengshan L, Yuxin H, Ming Z, Nan W, Lixin G. A miRNA-disease association prediction model based on tree-path global feature extraction and fully connected artificial neural network with multi-head self-attention mechanism. BMC Cancer 2024; 24:683. [PMID: 38840078 PMCID: PMC11151537 DOI: 10.1186/s12885-024-12420-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Accepted: 05/23/2024] [Indexed: 06/07/2024] Open
Abstract
BACKGROUND: MicroRNAs (miRNAs) occur in various organisms, ranging from viruses to humans, and play crucial regulatory roles within cells, participating in a variety of biological processes. In numerous prediction methods for miRNA-disease associations, the over-dependence on both similarity measurement data and the association matrix has still not been resolved. In this paper, a miRNA-disease association prediction model (called TP-MDA) based on tree-path global feature extraction and a fully connected artificial neural network (FANN) with a multi-head self-attention mechanism is proposed. The TP-MDA model uses an association tree structure to represent the data relationships, a multi-head self-attention mechanism to extract feature vectors, and a fully connected artificial neural network with 5-fold cross-validation for model training. RESULTS: The experimental results indicate that the TP-MDA model outperforms the other comparative models, with an AUC of 0.9714. In the case studies of miRNAs associated with colorectal cancer and lung cancer, among the top 15 miRNAs predicted by the model, 12 in colorectal cancer and 15 in lung cancer were validated, and the accuracy is as high as 0.9227. CONCLUSIONS: The model proposed in this paper can accurately predict miRNA-disease associations, and can serve as a valuable reference for data mining and association prediction in the fields of life sciences, biology, and disease genetics, among others.
Affiliation(s)
- Hou Biyu
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Li Mengshan
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Hou Yuxin
  - College of Computer Science and Engineering, Shanxi Datong University, Datong, Shanxi, 037000, China
- Zeng Ming
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Wang Nan
  - College of Life Sciences, Jiaying University, Meizhou, Guangdong, 514000, China
- Guan Lixin
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
|
15
|
Jia C, Wang F, Xing B, Li S, Zhao Y, Li Y, Wang Q. DGAMDA: Predicting miRNA-disease association based on dynamic graph attention network. Int J Numer Method Biomed Eng 2024; 40:e3809. [PMID: 38472636 DOI: 10.1002/cnm.3809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 01/22/2024] [Accepted: 01/27/2024] [Indexed: 03/14/2024]
Abstract
MiRNA (microRNA)-disease association prediction has essential applications for early disease screening. The process of traditional biological experimental validation is both time-consuming and expensive. However, as artificial intelligence technology continues to advance, computational methods have become efficient tools for predicting miRNA-disease associations. These methods often rely on the combination of multiple sources of association data and require improved feature mining. This study proposes a dynamic graph attention-based association prediction model, DGAMDA, which combines feature mapping and dynamic graph attention mechanisms through feature mining on a single miRNA-disease association network. DGAMDA effectively solves the problems of feature heterogeneity and inadequate feature mining by previous static graph attention mechanisms and achieves high-precision feature mining and association scoring prediction. We conducted a five-fold cross-validation experiment and obtained the mean values of Accuracy, Precision, Recall, and F1-score, which were .8986, .8869, .9115, and .8984, respectively. Our proposed model outperforms other advanced models in terms of experimental results, demonstrating its effectiveness in feature mining and association prediction based on a single association network. In addition, our model can also be used to predict miRNAs associated with unknown diseases.
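For readers unfamiliar with the four reported metrics, the following minimal sketch shows how Accuracy, Precision, Recall, and F1-score follow from a confusion matrix and how per-fold values are averaged. The fold counts are invented for illustration and are not taken from the paper. The function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical (tp, fp, fn, tn) counts for five cross-validation folds.
fold_counts = [
    (90, 12, 10, 88),
    (92, 11, 8, 89),
    (91, 13, 9, 87),
    (89, 10, 11, 90),
    (93, 12, 7, 88),
]

per_fold = [classification_metrics(*counts) for counts in fold_counts]
# Mean of each metric across folds, as is usual when reporting k-fold results.
mean_metrics = {
    name: sum(m[name] for m in per_fold) / len(per_fold) for name in per_fold[0]
}
print(mean_metrics)
```

Because each metric is averaged over folds separately, the mean F1-score need not equal the F1 computed from the mean precision and mean recall.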
Affiliation(s)
- ChangXin Jia
  - Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- FuYu Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, People's Republic of China
- Baoxiang Xing
  - Department of Obstetrics, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- ShaoNa Li
  - Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Yang Zhao
  - Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Yu Li
  - Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Qing Wang
  - Department of Endocrine and Metabolic, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
|
16
|
Sheng N, Xie X, Wang Y, Huang L, Zhang S, Gao L, Wang H. A Survey of Deep Learning for Detecting miRNA-Disease Associations: Databases, Computational Methods, Challenges, and Future Directions. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:328-347. [PMID: 38194377 DOI: 10.1109/tcbb.2024.3351752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
MicroRNAs (miRNAs) are an important class of non-coding RNAs that play an essential role in the occurrence and development of various diseases. Identifying the potential miRNA-disease associations (MDAs) can be beneficial in understanding disease pathogenesis. Traditional laboratory experiments are expensive and time-consuming. Computational models have enabled systematic large-scale prediction of potential MDAs, greatly improving the research efficiency. With recent advances in deep learning, it has become an attractive and powerful technique for uncovering novel MDAs. Consequently, numerous MDA prediction methods based on deep learning have emerged. In this review, we first summarize publicly available databases related to miRNAs and diseases for MDA prediction. Next, we outline commonly used miRNA and disease similarity calculation and integration methods. Then, we comprehensively review the 48 existing deep learning-based MDA computation methods, categorizing them into classical deep learning and graph neural network-based techniques. Subsequently, we investigate the evaluation methods and metrics that are frequently used to assess MDA prediction performance. Finally, we discuss the performance trends of different computational methods, point out some problems in current research, and propose 9 potential future research directions. Data resources and recent advances in MDA prediction methods are summarized in the GitHub repository https://github.com/sheng-n/DL-miRNA-disease-association-methods.
|
17
|
Wang T, Wang W, Jiang X, Mao J, Zhuo L, Liu M, Fu X, Yao X. ML-NPI: Predicting Interactions between Noncoding RNA and Protein Based on Meta-Learning in a Large-Scale Dynamic Graph. J Chem Inf Model 2024; 64:2912-2920. [PMID: 37920888 DOI: 10.1021/acs.jcim.3c01238] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/04/2023]
Abstract
Deep learning methods can accurately study noncoding RNA protein interactions (NPI), which is of great significance in gene regulation, human disease, and other fields. However, the computational method for predicting NPI in large-scale dynamic ncRNA protein bipartite graphs is rarely discussed, which is an online modeling and prediction problem. In addition, the results published by researchers on the Web site cannot meet real-time needs due to the large amount of basic data and long update cycles. Therefore, we propose a real-time method based on the dynamic ncRNA-protein bipartite graph learning framework, termed ML-GNN, which can model and predict the NPIs in real time. Our proposed method has the following advantages: first, the meta-learning strategy can alleviate the problem of large prediction errors in sparse neighborhood samples; second, dynamic modeling of newly added data can reduce computational pressure and predict NPIs in real-time. In the experiment, we built a dynamic bipartite graph based on 300000 NPIs from the NPInterv4.0 database. The experimental results indicate that our model achieved excellent performance in multiple experiments. The code for the model is available at https://github.com/taowang11/ML-NPI, and the data can be downloaded freely at http://bigdata.ibp.ac.cn/npinter4.
Affiliation(s)
- Tao Wang
  - Wenzhou University of Technology, 325000, Wenzhou, China
- Wentao Wang
  - Wenzhou University of Technology, 325000, Wenzhou, China
- Xin Jiang
  - Wenzhou University of Technology, 325000, Wenzhou, China
- Jiaxing Mao
  - Central South University of Forestry and Technology, 410000, Changsha, China
- Linlin Zhuo
  - Wenzhou University of Technology, 325000, Wenzhou, China
- Mingzhe Liu
  - Wenzhou University of Technology, 325000, Wenzhou, China
- Xiangzheng Fu
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, 999078, Macao, China
- Xiaojun Yao
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, 999078, Macao, China
|
18
|
Ouyang D, Liang Y, Wang J, Li L, Ai N, Feng J, Lu S, Liao S, Liu X, Xie S. HGCLAMIR: Hypergraph contrastive learning with attention mechanism and integrated multi-view representation for predicting miRNA-disease associations. PLoS Comput Biol 2024; 20:e1011927. [PMID: 38652712 PMCID: PMC11037542 DOI: 10.1371/journal.pcbi.1011927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 02/19/2024] [Indexed: 04/25/2024] Open
Abstract
Existing studies have shown that the abnormal expression of microRNAs (miRNAs) usually leads to the occurrence and development of human diseases. Identifying disease-related miRNAs contributes to studying the pathogenesis of diseases at the molecular level. As traditional biological experiments are time-consuming and expensive, computational methods have been used as an effective complement to infer the potential associations between miRNAs and diseases. However, most of the existing computational methods still face three main challenges: (i) learning of high-order relations; (ii) insufficient representation learning ability; (iii) importance learning and integration of multi-view embedding representation. To this end, we developed a HyperGraph Contrastive Learning with view-aware Attention Mechanism and Integrated multi-view Representation (HGCLAMIR) model to discover potential miRNA-disease associations. First, hypergraph convolutional network (HGCN) was utilized to capture high-order complex relations from hypergraphs related to miRNAs and diseases. Then, we combined HGCN with contrastive learning to improve and enhance the embedded representation learning ability of HGCN. Moreover, we introduced view-aware attention mechanism to adaptively weight the embedded representations of different views, thereby obtaining the importance of multi-view latent representations. Next, we innovatively proposed integrated representation learning to integrate the embedded representation information of multiple views for obtaining more reasonable embedding information. Finally, the integrated representation information was fed into a neural network-based matrix completion method to perform miRNA-disease association prediction. Experimental results on the cross-validation set and independent test set indicated that HGCLAMIR can achieve better prediction performance than other baseline models. 
Furthermore, the results of case studies and enrichment analysis further demonstrated the accuracy of HGCLAMIR and showed that the unconfirmed potential associations had biological significance.
Affiliation(s)
- Dong Ouyang
  - Peng Cheng Laboratory, Shenzhen, China
  - School of Biomedical Engineering, Guangdong Medical University, Dongguan, China
- Yong Liang
  - Peng Cheng Laboratory, Shenzhen, China
  - Pazhou Laboratory (Huangpu), Guangzhou, China
- Jinfeng Wang
  - College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Le Li
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
- Ning Ai
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
- Junning Feng
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
- Shanghui Lu
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
- Shuilin Liao
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
- Xiaoying Liu
  - Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai, China
- Shengli Xie
  - Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou, China
|
19
|
Peng L, Yang Y, Yang C, Li Z, Cheong N. HRGCNLDA: Forecasting of lncRNA-disease association based on hierarchical refinement graph convolutional neural network. Math Biosci Eng 2024; 21:4814-4834. [PMID: 38872515 DOI: 10.3934/mbe.2024212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Long non-coding RNA (lncRNA) is considered to be a crucial regulator involved in various human biological processes, including the regulation of tumor immune checkpoint proteins. It has great potential as both a cancer biomarker and a therapeutic target. Nevertheless, conventional biological experimental techniques are both resource-intensive and laborious, making it essential to develop an accurate and efficient computational method to facilitate the discovery of potential links between lncRNAs and diseases. In this study, we proposed HRGCNLDA, a computational approach that uses hierarchical refinement of graph convolutional neural networks to forecast potential lncRNA-disease associations. This approach effectively addresses the over-smoothing problem that arises from stacking multiple layers of graph convolutional neural networks. Specifically, HRGCNLDA enhances the layer representation during message propagation and node updates, thereby amplifying the contribution of hidden layers that resemble the ego layer while reducing discrepancies. The experimental results showed that HRGCNLDA achieved the highest AUC-ROC (area under the receiver operating characteristic curve, AUC for short) and AUC-PR (area under the precision versus recall curve, AUPR for short) values compared to other methods. Finally, to further demonstrate the reliability and efficacy of our approach, we performed case studies on three prevalent human diseases, namely, breast cancer, lung cancer and gastric cancer.
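The two evaluation measures named above can be sketched in a few lines. This is a generic illustration, not the evaluation code of HRGCNLDA: AUC is computed as the fraction of positive-negative pairs ranked correctly, and AUPR is approximated by average precision, which is one common estimator and ignores tied scores. The function names and toy data are hypothetical.

```python
def auc_roc(scores, labels):
    """AUC as the probability that a positive is scored above a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision: mean of precision values at each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical predicted association scores and known labels.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc_roc(scores, labels), average_precision(scores, labels))
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a sketch but not for large association matrices, where a rank-based formula or a library routine is the usual choice.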
Affiliation(s)
- Li Peng
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
  - Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan 411201, China
- Yujie Yang
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Cheng Yang
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Zejun Li
  - School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang 421002, China
- Ngai Cheong
  - Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
|
20
|
Chen Z, Zhang L, Li J, Fu M. MLFLHMDA: predicting human microbe-disease association based on multi-view latent feature learning. Front Microbiol 2024; 15:1353278. [PMID: 38371933 PMCID: PMC10869561 DOI: 10.3389/fmicb.2024.1353278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024] Open
Abstract
Introduction: A growing body of research indicates that microorganisms play a crucial role in human health. Imbalances in microbial communities are closely linked to human diseases, and identifying potential relationships between microbes and diseases can help elucidate the pathogenesis of diseases. However, traditional methods based on biological or clinical experiments are costly, so the use of computational models to predict potential microbe-disease associations is of great importance. Methods: In this paper, we present a novel computational model called MLFLHMDA, which is based on a Multi-View Latent Feature Learning approach to predict Human potential Microbe-Disease Associations. Specifically, we compute Gaussian interaction profile kernel similarity between diseases and microbes based on the known microbe-disease associations from the Human Microbe-Disease Association Database and perform a preprocessing step on the resulting microbe-disease association matrix, namely, weighting K nearest known neighbors (WKNKN) to reduce the sparsity of the microbe-disease association matrix. To obtain unobserved associations in the microbe and disease views, we extract different latent features based on the geometrical structure of microbes and diseases, and project multi-modal latent features into a common subspace. Next, we introduce graph regularization to preserve the local manifold structure of Gaussian interaction profile kernel similarity and add $L_{p,q}$-norms to the projection matrix to ensure the interpretability and sparsity of the model. Results: The AUC values for global leave-one-out cross-validation and 5-fold cross-validation implemented by MLFLHMDA are 0.9165 and 0.8942 ± 0.0041, respectively, which perform better than other existing methods. In addition, case studies of different diseases have demonstrated the superiority of the predictive power of MLFLHMDA.
The source code of our model and the data are available on https://github.com/LiangzheZhang/MLFLHMDA_master.
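The Gaussian interaction profile (GIP) kernel similarity mentioned in the Methods can be sketched as follows. This uses the commonly cited form, K(i, j) = exp(-γ ||IP(i) - IP(j)||²) with γ set to a bandwidth parameter divided by the mean squared norm of the interaction profiles. Whether MLFLHMDA uses exactly this normalization is an assumption, and the toy association matrix and function name are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel for the rows of a 0/1 association matrix.

    Each row is the interaction profile of one entity (for example a microbe),
    with one column per partner (for example a disease).
    """
    squared_norms = [sum(v * v for v in row) for row in profiles]
    # Bandwidth normalized by the average squared profile norm.
    gamma = gamma_prime / (sum(squared_norms) / len(squared_norms))
    n = len(profiles)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist2)
    return kernel


# Hypothetical microbe-by-disease association matrix (3 microbes, 3 diseases).
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
microbe_similarity = gip_kernel(associations)
# Disease similarity uses the columns as profiles, i.e. the transposed matrix.
disease_similarity = gip_kernel([list(col) for col in zip(*associations)])
```

Entities with identical profiles get similarity 1, and similarity decays with the number of partners on which two profiles disagree.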
|
21
|
Chang Z, Zhu R, Liu J, Shang J, Dai L. HGSMDA: miRNA-Disease Association Prediction Based on HyperGCN and Sørensen-Dice Loss. Noncoding RNA 2024; 10:9. [PMID: 38392964 PMCID: PMC10893088 DOI: 10.3390/ncrna10010009] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/15/2023] [Revised: 01/19/2024] [Accepted: 01/24/2024] [Indexed: 02/25/2024] Open
Abstract
Biological research has demonstrated the significance of identifying miRNA-disease associations in the context of disease prevention, diagnosis, and treatment. However, inferring these associations through biological experiments is both costly and inefficient. Consequently, there is a pressing need to devise novel approaches that offer enhanced accuracy and effectiveness. Presently, the predominant methods employed for predicting disease associations rely on Graph Convolutional Network (GCN) techniques. However, the GCN algorithm, which aggregates locally, incorporates information only from the immediate neighboring nodes of a given node at each layer. Consequently, GCN cannot simultaneously aggregate information from multiple nodes. This constraint significantly impacts the predictive efficacy of the model. To tackle this problem, we propose a novel approach, based on HyperGCN and Sørensen-Dice loss (HGSMDA), for predicting associations between miRNAs and diseases. In the initial phase, we developed multiple networks to represent the similarity between miRNAs and diseases and employed GCNs to extract information from diverse perspectives. Subsequently, we introduced HyperGCN to construct a miRNA-disease heteromorphic hypergraph using hypernodes and trained GCN on the graph to aggregate information. Finally, we utilized the Sørensen-Dice loss function to evaluate the degree of similarity between the predicted outcomes and the ground truth values, thereby enabling the prediction of associations between miRNAs and diseases. To assess the soundness of our methodology, an extensive series of experiments was conducted using the Human MicroRNA Disease Database (HMDD v3.2) as the dataset. The experimental outcomes indicate that HGSMDA performs remarkably well compared with alternative methodologies.
Furthermore, the predictive capacity of HGSMDA was corroborated through a case study focused on colon cancer. These findings strongly imply that HGSMDA represents a dependable and valid framework, thereby offering a novel avenue for investigating the intricate association between miRNAs and diseases.
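The Sørensen-Dice loss referred to above can be illustrated with a minimal sketch. It uses the common soft-Dice form, 1 - 2·Σ(p·y) / (Σp + Σy), with a small smoothing constant. The exact variant used in HGSMDA (for example squared terms in the denominator) is not stated in the abstract, so the formula, the `eps` parameter, and the function name here are assumptions.

```python
def dice_loss(predicted, target, eps=1e-7):
    """Soft Sørensen-Dice loss between predicted scores in [0, 1] and 0/1 labels."""
    intersection = sum(p * t for p, t in zip(predicted, target))
    denominator = sum(predicted) + sum(target)
    # eps keeps the ratio defined when both vectors are all zeros.
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# Hypothetical predicted association scores against known associations.
print(dice_loss([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0]))
```

The loss is near 0 when predictions overlap the known associations completely and near 1 when they do not overlap at all, which makes it less sensitive than plain accuracy to the large number of unlabeled pairs in a sparse association matrix.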
Affiliation(s)
- Rong Zhu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China; (Z.C.); (J.L.); (J.S.); (L.D.)
|
22
|
Bai T, Yan K, Liu B. DAmiRLocGNet: miRNA subcellular localization prediction by combining miRNA-disease associations and graph convolutional networks. Brief Bioinform 2023:bbad212. [PMID: 37332057 DOI: 10.1093/bib/bbad212] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/13/2023] [Revised: 05/17/2023] [Accepted: 05/18/2023] [Indexed: 06/20/2023] Open
Abstract
MicroRNAs (miRNAs) are post-transcriptional regulators in humans that are involved in various physiological processes by regulating gene expression. The subcellular localization of miRNAs plays a crucial role in the discovery of their biological functions. Although several computational methods based on miRNA functional similarity networks have been presented to identify the subcellular localization of miRNAs, it remains difficult for these approaches to effectively extract well-referenced miRNA functional representations due to insufficient miRNA-disease association representation and disease semantic representation. Currently, there has been a significant amount of research on miRNA-disease associations, making it possible to address the issue of insufficient miRNA functional representation. In this work, a novel model named DAmiRLocGNet is established, based on a graph convolutional network (GCN) and an autoencoder (AE), for identifying the subcellular localizations of miRNAs. DAmiRLocGNet constructs its features from miRNA sequence information, miRNA-disease association information and disease semantic information. GCN is used to gather the information of neighboring nodes and capture the implicit information of network structures from miRNA-disease association information and disease semantic information. AE is employed to capture sequence semantics from sequence similarity networks. The evaluation demonstrates that the performance of DAmiRLocGNet is superior to other competing computational approaches, benefiting from implicit features captured by using GCNs. DAmiRLocGNet has the potential to be applied to the identification of subcellular localization of other non-coding RNAs. Moreover, it can facilitate further investigation into the functional mechanisms underlying miRNA localization. The source code and datasets are accessible at http://bliulab.net/DAmiRLocGNet.
Affiliation(s)
- Tao Bai
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - School of Mathematics & Computer Science, Yan'an University, Shaanxi 716000, China
- Ke Yan
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
|
23
|
Gu C, Li X. Prediction of disease-related miRNAs by voting with multiple classifiers. BMC Bioinformatics 2023; 24:177. [PMID: 37122001 PMCID: PMC10150488 DOI: 10.1186/s12859-023-05308-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/31/2022] [Accepted: 04/26/2023] [Indexed: 05/02/2023] Open
Abstract
There is strong evidence that mutations and dysregulation of miRNAs are associated with a variety of diseases, including cancer. However, the experimental methods used to identify disease-related miRNAs are expensive and time-consuming. Effective computational approaches to identify disease-related miRNAs are in high demand and would aid in the detection of lncRNA biomarkers for disease diagnosis, treatment, and prevention. In this study, we develop an ensemble learning framework to reveal the potential associations between miRNAs and diseases (ELMDA). The ELMDA framework does not rely on the known associations when calculating miRNA and disease similarities and uses multi-classifier voting to predict disease-related miRNAs. The average AUC of the ELMDA framework was 0.9229 on the HMDD v2.0 database in fivefold cross-validation. All potential associations in the HMDD v2.0 database were predicted, and 90% of the top 50 results were verified with the updated HMDD v3.2 database. The ELMDA framework was applied to gastric neoplasms, prostate neoplasms and colon neoplasms, and 100%, 94%, and 90%, respectively, of the top 50 potential miRNAs were validated by the HMDD v3.2 database. Moreover, the ELMDA framework can predict isolated disease-related miRNAs. In conclusion, ELMDA appears to be a reliable method to uncover disease-associated miRNAs.
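Voting across several classifiers can be sketched generically as below. The abstract does not say which classifiers ELMDA combines or whether it votes on labels or on scores, so both a hard (majority) and a soft (mean probability) variant are shown as assumptions, with hypothetical function names and invented predictions.

```python
def majority_vote(label_predictions):
    """Hard voting: label 1 wins when more than half of the classifiers predict 1.

    label_predictions holds one list of 0/1 labels per classifier,
    all over the same candidate miRNA-disease pairs.
    """
    n_classifiers = len(label_predictions)
    return [
        1 if 2 * sum(votes) > n_classifiers else 0
        for votes in zip(*label_predictions)
    ]


def soft_vote(probability_predictions):
    """Soft voting: average the predicted probabilities across classifiers."""
    n_classifiers = len(probability_predictions)
    return [sum(probs) / n_classifiers for probs in zip(*probability_predictions)]


# Hypothetical outputs of three classifiers for three candidate pairs.
labels = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
print(majority_vote(labels))

probabilities = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.3]]
print(soft_vote(probabilities))
```

Soft voting keeps a continuous score per pair, which is what a ranking of the top 50 candidate miRNAs for a disease would need, while hard voting yields only a yes or no decision.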
Affiliation(s)
- Changlong Gu
  - College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiaoying Li
  - College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
|
24
|
Qu Q, Chen X, Ning B, Zhang X, Nie H, Zeng L, Chen H, Fu X. Prediction of miRNA-disease associations by neural network-based deep matrix factorization. Methods 2023; 212:1-9. [PMID: 36813017 DOI: 10.1016/j.ymeth.2023.02.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Revised: 01/17/2023] [Accepted: 02/10/2023] [Indexed: 02/23/2023] Open
Abstract
MicroRNAs (miRNAs) are a class of short non-coding RNAs, about 22 nucleotides long, that participate in various biological processes of cells. A number of studies have shown that miRNAs are closely related to the occurrence of cancer and various human diseases. Therefore, studying miRNA-disease associations helps in understanding the pathogenesis of diseases as well as their prevention, diagnosis, treatment and prognosis. Traditional biological experimental methods for studying miRNA-disease associations require expensive equipment and are time-consuming and labor-intensive. With the rapid development of bioinformatics, more and more researchers are committed to developing effective computational methods to predict miRNA-disease associations in order to reduce the time and cost of experiments. In this study, we proposed a neural network-based deep matrix factorization method named NNDMF to predict miRNA-disease associations. To address the problem that traditional matrix factorization methods can only extract linear features, NNDMF uses a neural network to perform deep matrix factorization and extract nonlinear features, which makes up for the shortcomings of traditional matrix factorization methods. We compared NNDMF with four previous classical prediction models (IMCMDA, GRMDA, SACMDA and ICFMDA) in global LOOCV and local LOOCV. The AUCs achieved by NNDMF in the two cross-validation methods were 0.9340 and 0.8763, respectively. Furthermore, we conducted case studies on three important human diseases (lymphoma, colorectal cancer and lung cancer) to validate the effectiveness of NNDMF. In conclusion, NNDMF could effectively predict potential miRNA-disease associations.
Affiliation(s)
- Qiang Qu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xia Chen
- School of Basic Education, Changsha Aeronautical Vocational and Technical College, Changsha, China
- Bin Ning
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiang Zhang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Hao Nie
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Li Zeng
- College of Life and Environmental Science, Hunan University of Art and Science, Changde, China
- Haowen Chen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiangzheng Fu
- Research Institute of Hunan University in Chongqing, Chongqing, China
|
25
|
Ha J, Park S. NCMD: Node2vec-Based Neural Collaborative Filtering for Predicting MiRNA-Disease Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1257-1268. [PMID: 35849666 DOI: 10.1109/tcbb.2022.3191972] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 05/04/2023]
Abstract
Numerous studies have reported that micro RNAs (miRNAs) play pivotal roles in disease pathogenesis based on the deregulation of the expressions of target messenger RNAs. Therefore, the identification of disease-related miRNAs is of great significance in understanding human complex diseases, which can also provide insight into the design of novel prognostic markers and disease therapies. Considering the time and cost involved in wet experiments, most recent works have focused on the effective and feasible modeling of computational frameworks to uncover miRNA-disease associations. In this study, we propose a novel framework called node2vec-based neural collaborative filtering for predicting miRNA-disease association (NCMD) based on deep neural networks. Initially, NCMD exploits Node2vec to learn low-dimensional vector representations of miRNAs and diseases. Next, it utilizes a deep learning framework that combines the linear ability of generalized matrix factorization and nonlinear ability of a multilayer perceptron. Experimental results clearly demonstrate the comparable performance of NCMD relative to the state-of-the-art methods according to statistical measures. In addition, case studies on breast cancer, lung cancer and pancreatic cancer validate the effectiveness of NCMD. Extensive experiments demonstrate the benefits of modeling a neural collaborative-filtering-based approach for discovering novel miRNA-disease associations.
|
26
|
Feng H, Jin D, Li J, Li Y, Zou Q, Liu T. Matrix reconstruction with reliable neighbors for predicting potential MiRNA-disease associations. Brief Bioinform 2023; 24:6960615. [PMID: 36567252 DOI: 10.1093/bib/bbac571] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/12/2022] [Revised: 10/16/2022] [Accepted: 11/23/2022] [Indexed: 12/27/2022] Open
Abstract
Numerous experimental studies have indicated that alteration and dysregulation of microRNAs (miRNAs) are associated with serious diseases. Identifying disease-related miRNAs is therefore an essential and challenging task in bioinformatics research. Computational methods are an efficient and economical alternative to conventional biomedical studies and can reveal underlying miRNA-disease associations for subsequent experimental confirmation with reasonable confidence. Despite the success of existing computational approaches, most of them rely only on known miRNA-disease associations, without adding other data to increase prediction accuracy, and they are affected by data sparsity. In this paper, we present MRRN, a model that combines matrix reconstruction with node reliability to predict probable miRNA-disease associations. In MRRN, the most reliable neighbors of each miRNA and disease are used to update the original miRNA-disease association matrix, which significantly reduces data sparsity. Unknown miRNA-disease associations are reconstructed by aggregating the most reliable first-order neighbors to increase prediction accuracy by representing the local and global structure of the heterogeneous network. Five-fold cross-validation of MRRN produced an area under the curve (AUC) of 0.9355 and an area under the precision-recall curve (AUPR) of 0.2646, values that were greater than those produced by comparable models. Two different types of case studies using three diseases were conducted to demonstrate the accuracy of MRRN, and all top 30 predicted miRNAs were verified.
Affiliation(s)
- Hailin Feng
- School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Dongdong Jin
- School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Jian Li
- School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Yane Li
- School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, West District, High-Tech Zone, 611731, Chengdu, China
- Tongcun Liu
- School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
|
27
|
Yang B, Chen H. Predicting circRNA-drug sensitivity associations by learning multimodal networks using graph auto-encoders and attention mechanism. Brief Bioinform 2023; 24:6972879. [PMID: 36617209 DOI: 10.1093/bib/bbac596] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/12/2022] [Revised: 11/11/2022] [Accepted: 12/04/2022] [Indexed: 01/09/2023] Open
Abstract
Recent studies have shown that the expression of circRNAs would affect drug sensitivity of cells and thus significantly influence the efficacy of drugs. Traditional biomedical experiments to validate such relationships are time-consuming and costly. Therefore, developing effective computational methods to predict potential associations between circRNAs and drug sensitivity is an important and urgent task. In this study, we propose a novel method, called MNGACDA, to predict possible circRNA-drug sensitivity associations for further biomedical screening. First, MNGACDA uses multiple sources of information from circRNAs and drugs to construct multimodal networks. It then employs node-level attention graph auto-encoders to obtain low-dimensional embeddings for circRNAs and drugs from the multimodal networks. Finally, an inner product decoder is applied to predict the association scores between circRNAs and drug sensitivity based on the embedding representations of circRNAs and drugs. Extensive experimental results based on cross-validations show that MNGACDA outperforms six other state-of-the-art methods. Furthermore, excellent performance in case studies demonstrates that MNGACDA is an effective tool for predicting circRNA-drug sensitivity associations in real situations. These results confirm the reliable prediction ability of MNGACDA in revealing circRNA-drug sensitivity associations.
Affiliation(s)
- Bo Yang
- School of Software, East China Jiaotong University
- Hailin Chen
- School of Software, East China Jiaotong University
|
28
|
Liu ZH, Ji CM, Ni JC, Wang YT, Qiao LJ, Zheng CH. Convolution Neural Networks Using Deep Matrix Factorization for Predicting Circrna-Disease Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:277-284. [PMID: 34951853 DOI: 10.1109/tcbb.2021.3138339] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
CircRNAs have a stable structure, which gives them a higher tolerance to nucleases. Therefore, the properties of circular RNAs are beneficial in disease diagnosis. However, there are few known associations between circRNAs and diseases, and identifying new associations through biological experiments is time-consuming and costly. As a result, there is a need to build efficient and practical computational models to predict potential circRNA-disease associations. In this paper, we design a novel convolutional neural network framework (DMFCNNCD) that learns features from deep matrix factorization to predict circRNA-disease associations. First, we decompose the circRNA-disease association matrix to obtain the original features of the diseases and circRNAs, and use the mapping module to extract potential nonlinear features. Then, we integrate these features with the similarity information to form a training set. Finally, we apply convolutional neural networks to predict unknown associations between circRNAs and diseases. Five-fold cross-validation across various experiments shows that our method can predict circRNA-disease associations and outperforms state-of-the-art methods.
|
29
|
Peng L, Yang J, Wang M, Zhou L. Editorial: Machine learning-based methods for RNA data analysis—Volume II. Front Genet 2022; 13:1010089. [DOI: 10.3389/fgene.2022.1010089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2022] [Accepted: 09/20/2022] [Indexed: 12/02/2022] Open
|
30
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: experimental results, databases, webservers and data fusion. Brief Bioinform 2022; 23:6696143. [PMID: 36094095 DOI: 10.1093/bib/bbac397] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 06/20/2022] [Revised: 07/19/2022] [Accepted: 08/15/2022] [Indexed: 12/14/2022] Open
Abstract
MicroRNAs (miRNAs) are gene regulators involved in the pathogenesis of complex diseases such as cancers, and thus serve as potential diagnostic markers and therapeutic targets. The prerequisite for designing effective miRNA therapies is accurate discovery of miRNA-disease associations (MDAs), which has attracted substantial research interest during the last 15 years, as reflected by more than 55 000 related entries available on PubMed. Abundant experimental data gathered from the wealth of literature could effectively support the development of computational models for predicting novel associations. In 2017, Chen et al. published the first-ever comprehensive review on MDA prediction, presenting various relevant databases, 20 representative computational models, and suggestions for building more powerful ones. In the current review, as the continuation of the previous study, we revisit miRNA biogenesis, detection techniques and functions; summarize recent experimental findings related to common miRNA-associated diseases; introduce recent updates of miRNA-relevant databases and novel database releases since 2017; present mainstream webservers and new webserver releases since 2017; and finally elaborate on how fusion of diverse data sources has contributed to accurate MDA prediction.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
31
|
Peng L, Tu Y, Huang L, Li Y, Fu X, Chen X. DAESTB: inferring associations of small molecule-miRNA via a scalable tree boosting model based on deep autoencoder. Brief Bioinform 2022; 23:6827720. [PMID: 36377749 DOI: 10.1093/bib/bbac478] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 07/16/2022] [Revised: 09/28/2022] [Accepted: 10/08/2022] [Indexed: 11/16/2022] Open
Abstract
MicroRNAs (miRNAs) are closely related to a variety of human diseases: they not only regulate gene expression but also play an important role in human life activities and are viable targets of small molecule drugs for disease treatment. Current computational techniques for predicting potential associations between small molecules and miRNAs have limited accuracy. Here, we proposed a new computational method based on a deep autoencoder and a scalable tree boosting model (DAESTB) to predict associations between small molecules and miRNAs. First, we constructed a high-dimensional feature matrix by integrating small molecule-small molecule similarity, miRNA-miRNA similarity and known small molecule-miRNA associations. Second, we reduced the feature dimensionality of the integrated matrix using a deep autoencoder to obtain the potential feature representation of each small molecule-miRNA pair. Finally, a scalable tree boosting model is used to predict potential small molecule-miRNA associations. Experiments on two datasets demonstrated the superiority of DAESTB over various state-of-the-art methods, with DAESTB achieving the best AUC value. Furthermore, in three case studies, a large number of the associations predicted by DAESTB are confirmed by publicly accessible literature. We envision that DAESTB could serve as a useful biological model for predicting potential small molecule-miRNA associations.
Affiliation(s)
- Li Peng
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China; Hunan Key Laboratory for Service computing and Novel Software Technology
- Yuan Tu
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Yang Li
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Xiangzheng Fu
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiang Chen
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
|
32
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: towards systematic evaluation of computational models. Brief Bioinform 2022; 23:6712303. [PMID: 36151749 DOI: 10.1093/bib/bbac407] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 06/18/2022] [Revised: 08/11/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022] Open
Abstract
Currently, there are no generally accepted strategies for evaluating computational models of microRNA-disease associations (MDAs). Though K-fold cross-validation and case studies seem to be must-have procedures, the value of K, the evaluation metrics, the choice of query diseases, and the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports) are all determined case by case at the researchers' discretion. In the current review, we include a comprehensive analysis of how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model and facilitate fair and systematic assessment of predictive performance.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
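The review above turns on choices such as the value of K and the evaluation metric. As a minimal sketch of those two ingredients (not the workflow the review recommends), the code below splits sample indices into K folds and computes AUC as the fraction of positive-negative pairs ranked correctly. The helper names `k_fold_indices` and `auc`, the fixed seed, and the toy labels and scores are assumptions for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the positive/negative pairs in which the positive sample gets
    the higher score; tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 10 samples dealt into 5 folds, and a 4-sample AUC.
folds = k_fold_indices(10, 5)
example_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would be the usual choice for large association matrices.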
|
33
|
Ouyang D, Liang Y, Wang J, Liu X, Xie S, Miao R, Ai N, Li L, Dang Q. Predicting multiple types of miRNA-disease associations using adaptive weighted nonnegative tensor factorization with self-paced learning and hypergraph regularization. Brief Bioinform 2022; 23:6720405. [PMID: 36168938 DOI: 10.1093/bib/bbac390] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/07/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 12/14/2022] Open
Abstract
More and more evidence indicates that the dysregulation of microRNAs (miRNAs) leads to diseases through various underlying mechanisms. Identifying the multiple types of disease-related miRNAs plays an important role in studying the molecular mechanisms of miRNAs in diseases. Moreover, compared with traditional biological experiments, computational models save time and cost. However, most tensor-based computational models still face three main challenges: (i) a tendency to fall into bad local minima; (ii) preservation of high-order relations; (iii) false-negative samples. To this end, we propose a novel tensor completion framework integrating self-paced learning, hypergraph regularization and an adaptive weight tensor into nonnegative tensor factorization, called SPLDHyperAWNTF, for the discovery of potential multiple types of miRNA-disease associations. We first combine self-paced learning with nonnegative tensor factorization to keep the model from falling into bad local minima. Then, hypergraphs for miRNAs and diseases are constructed, and hypergraph regularization is used to preserve the high-order complex relations of these hypergraphs. Finally, we introduce an adaptive weight tensor, which can effectively alleviate the impact of false-negative samples on prediction performance. The average results of 5-fold and 10-fold cross-validation on four datasets show that SPLDHyperAWNTF can achieve better prediction performance than baseline models in terms of Top-1 precision, Top-1 recall and Top-1 F1. Furthermore, we implement case studies to further evaluate the accuracy of SPLDHyperAWNTF. As a result, 98 (MDAv2.0) and 98 (MDAv2.0-2) of the top-100 predictions are confirmed by the HMDDv3.2 dataset. Moreover, the results of enrichment analysis illustrate that unconfirmed potential associations have biological significance.
Affiliation(s)
- Dong Ouyang
- Peng Cheng Laboratory, Shenzhen 518055, China; School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
- Yong Liang
- Peng Cheng Laboratory, Shenzhen 518055, China
- Jianjun Wang
- School of Mathematics and Statistics, Southwest University, Chongqing 400715, China
- Xiaoying Liu
- Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China
- Shengli Xie
- Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou 510000, China
- Rui Miao
- Basic Teaching Department, ZhuHai Campus of ZunYi Medical University, Zhuhai 519090, China
- Ning Ai
- School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
- Le Li
- School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
- Qi Dang
- School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
|
34
|
Yan C, Ding C, Duan G. PMMS: Predicting essential miRNAs based on multi-head self-attention mechanism and sequences. Front Med (Lausanne) 2022; 9:1015278. [DOI: 10.3389/fmed.2022.1015278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 10/25/2022] [Indexed: 11/18/2022] Open
Abstract
Increasing evidence has shown that miRNAs play a significant role in biological processes. In order to understand the etiology and mechanisms of various diseases, it is necessary to identify the essential miRNAs. However, identifying essential miRNAs through traditional biological experiments is time-consuming and expensive, so it is critical to develop computational methods to predict potential essential miRNAs. In this study, we provided a new computational method (called PMMS) to identify essential miRNAs by using multi-head self-attention and sequences. First, PMMS computes the statistical and structural features and obtains the static feature by concatenating them. Second, PMMS extracts the original deep learning feature (BiLSTM-based feature) by applying bi-directional long short-term memory (BiLSTM) to pre-miRNA sequences. In addition, we obtained the multi-head self-attention feature (MS-based feature) from the BiLSTM-based feature and a multi-head self-attention mechanism. By considering the importance of each pre-miRNA subsequence to the static feature of the miRNA, we obtained the final deep learning feature (WA-based feature) using a weighted attention mechanism. Finally, we concatenated the WA-based feature and the static feature as input to a multilayer perceptron model to predict essential miRNAs. We conducted five-fold cross-validation to evaluate the prediction performance of PMMS. The area under the ROC curve (AUC), the F1-score, and accuracy (ACC) are used as performance metrics. In the experiments, PMMS obtained the best prediction performance (AUC: 0.9556, F1-score: 0.9030, and ACC: 0.9097) and outperformed the other compared methods. The experimental results also illustrated that PMMS is an effective method for identifying essential miRNAs.
|
35
|
Chen L, Lin D, Xu H, Li J, Lin L. WLLP: A weighted reconstruction-based linear label propagation algorithm for predicting potential therapeutic agents for COVID-19. Front Microbiol 2022; 13:1040252. [PMID: 36466666 PMCID: PMC9713947 DOI: 10.3389/fmicb.2022.1040252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 10/06/2022] [Indexed: 11/18/2022] Open
Abstract
The global coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to huge health and economic crises. However, the research required to develop new drugs and vaccines is very expensive in terms of labor, money, and time. Owing to recent advances in data science, drug-repositioning technologies have become one of the most promising strategies available for developing effective treatment options. Using the previously reported human drug virus database (HDVD), we proposed a model to predict possible drug regimens based on a weighted reconstruction-based linear label propagation algorithm (WLLP). For the drug–virus association matrix, we used the weighted K-nearest known neighbors method for preprocessing, followed by label propagation on the network based on the linear neighborhood similarity of drugs and viruses, to obtain the final prediction results. In 10 repetitions of 10-fold cross-validation, WLLP exhibited excellent performance, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.8828 ± 0.0037 and an area under the precision-recall curve of 0.5277 ± 0.0053, outperforming the other four models used for comparison. We also predicted effective drug regimens against SARS-CoV-2, and this case study showed that WLLP can be used to suggest potential drugs for the treatment of COVID-19.
Affiliation(s)
- Langcheng Chen
- Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- Dongying Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Haojie Xu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Lieqing Lin
- Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- *Correspondence: Lieqing Lin
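The WLLP abstract above names label propagation over a similarity network but does not give the update rule. The sketch below shows a generic textbook form, F ← α·W·F + (1 − α)·Y with a row-normalized similarity matrix W, and should not be read as the paper's exact formulation. The function names, `alpha`, `n_iter`, and the three-node toy network are assumptions for illustration.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def propagate(sim, y, alpha=0.5, n_iter=100):
    """Iterate f <- alpha * W f + (1 - alpha) * y for a single label column.

    sim: square similarity matrix (list of lists) between nodes.
    y:   initial labels, e.g. 1.0 for a known association and 0.0 otherwise.
    """
    w = row_normalize(sim)
    f = list(y)
    for _ in range(n_iter):
        f = [
            alpha * sum(w_ij * f_j for w_ij, f_j in zip(row, f)) + (1 - alpha) * y_i
            for row, y_i in zip(w, y)
        ]
    return f


# Toy chain of three drugs: 0 - 1 - 2, where only drug 0 has a known label.
toy_sim = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
scores = propagate(toy_sim, [1.0, 0.0, 0.0])
```

In this toy network the score decays with distance from the labelled node, which is the behaviour label propagation relies on when ranking unlabelled candidates.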
|
36
|
Cao B, Li R, Xiao S, Deng S, Zhou X, Zhou L. Predicting miRNA-disease association through combining miRNA function and network topological similarities based on MINE. iScience 2022; 25:105299. [DOI: 10.1016/j.isci.2022.105299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Revised: 07/08/2022] [Accepted: 09/28/2022] [Indexed: 11/16/2022] Open
|
37
|
Li L, Gao Z, Zheng CH, Qi R, Wang YT, Ni JC. Predicting miRNA-Disease Association Based on Improved Graph Regression. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3604-3613. [PMID: 34757912 DOI: 10.1109/tcbb.2021.3127017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Recently, as a growing number of associations between microRNAs (miRNAs) and diseases have been discovered, researchers have come to realize that miRNAs are closely related to several complicated biological processes and human diseases. Hence, it is especially important to construct effective models to infer associations between miRNAs and diseases. In this study, we presented Improved Graph Regression for miRNA-Disease Association Prediction (IGRMDA) to identify potential relationships between miRNAs and diseases. To reduce the inherent noise in the acquired biological datasets, we used a matrix decomposition algorithm to process miRNA functional similarity and disease semantic similarity and then combined them with existing similarity information to obtain the final miRNA similarity data and disease similarity data. Then, we used the miRNA-disease association data, miRNA similarity data and disease similarity data to form the corresponding latent spaces. Furthermore, we performed an improved graph regression algorithm in these latent spaces, which included the miRNA-disease association space, the miRNA similarity space and the disease similarity space. Non-negative matrix factorization and partial least squares were used in the graph regression process to obtain important related attributes. Cross-validation experiments and case studies were also carried out to demonstrate the effectiveness of IGRMDA, and they showed that IGRMDA could predict potential associations between miRNAs and diseases.
|
38
|
Guo R, Chen H, Wang W, Wu G, Lv F. Predicting potential miRNA-disease associations based on more reliable negative sample selection. BMC Bioinformatics 2022; 23:432. [PMID: 36253735 PMCID: PMC9575264 DOI: 10.1186/s12859-022-04978-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/15/2022] [Accepted: 10/06/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A growing number of biomedical studies have shown that the dysfunction of miRNAs is closely related to many human diseases. Identifying disease-associated miRNAs would contribute to the understanding of pathological mechanisms of diseases. Supervised learning-based computational methods have continuously been developed for miRNA-disease association predictions. These approaches require negative samples of experimentally validated uncorrelated miRNA-disease pairs, which are not available due to a lack of biomedical research interest. Existing methods mainly choose negative samples randomly from the unlabelled ones. Therefore, the selection of more reliable negative samples is of great importance for these methods to achieve satisfactory prediction results. RESULTS In this study, we propose a computational method termed KR-NSSM, which integrates two semi-supervised algorithms to select more reliable negative samples for miRNA-disease association predictions. Our method uses a refined K-means algorithm for preliminary screening of likely negative and positive miRNA-disease samples. A Rocchio classification-based method is then applied for further screening to obtain more reliable negative and positive samples. We implement ablation tests in KR-NSSM and find that combining the two selection procedures yields more reliable negative samples for miRNA-disease association predictions. Comprehensive experiments based on fivefold cross-validation demonstrate improvements in prediction accuracy for six classic classifiers and five known miRNA-disease association prediction models when using negative samples chosen by our method rather than by previous negative sample selection strategies. Moreover, 469 out of 1123 positive miRNA-disease associations selected by our method are confirmed by existing databases.
CONCLUSIONS Our experiments show that KR-NSSM can screen out more reliable negative samples from the unlabelled ones, which greatly improves the performance of supervised machine learning methods in miRNA-disease association predictions. We expect that KR-NSSM would be a useful tool in negative sample selection in biomedical research.
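The second screening stage described above can be illustrated with a minimal sketch. It assumes the simplest form of Rocchio classification (nearest centroid with squared Euclidean distance) and uses hypothetical two-dimensional feature vectors; the function names, features, and distance measure are illustrative assumptions, not the KR-NSSM implementation.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def rocchio_label(sample, pos_centroid, neg_centroid):
    """Assign the sample to whichever class prototype is closer."""
    if sq_dist(sample, neg_centroid) < sq_dist(sample, pos_centroid):
        return "negative"
    return "positive"

# Hypothetical features for pairs that a first-stage clustering flagged
# as likely positive or likely negative.
likely_pos = [[0.9, 0.8], [0.8, 0.9]]
likely_neg = [[0.1, 0.2], [0.2, 0.1]]
pos_c = centroid(likely_pos)
neg_c = centroid(likely_neg)

# Unlabelled pairs are screened against the two prototypes.
unlabelled = [[0.15, 0.25], [0.7, 0.9]]
labels = [rocchio_label(u, pos_c, neg_c) for u in unlabelled]
print(labels)
```

Only unlabelled pairs that fall on the negative side would be kept as more reliable negative samples in a pipeline of this kind.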
Affiliation(s)
- Ruiyu Guo
- School of Software, East China Jiaotong University, Nanchang, 330013, China
| | - Hailin Chen
- School of Software, East China Jiaotong University, Nanchang, 330013, China.
| | - Wengang Wang
- School of Software, East China Jiaotong University, Nanchang, 330013, China
| | - Guangsheng Wu
- School of Mathematics and Computer Science, Xinyu University, Xinyu, 338004, China
| | - Fangliang Lv
- School of Software, East China Jiaotong University, Nanchang, 330013, China
|
39
|
Wang B, Wang X, Zheng X, Han Y, Du X. JSCSNCP-LMA: a method for predicting the association of lncRNA-miRNA. Sci Rep 2022; 12:17030. [PMID: 36220862 PMCID: PMC9552706 DOI: 10.1038/s41598-022-21243-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 09/26/2022] [Indexed: 12/29/2022] Open
Abstract
Non-coding RNAs (ncRNAs) have long been considered the "white elephant" of the genome because they lack the ability to encode proteins. However, in recent years, more and more biological experiments and clinical reports have shown that ncRNAs account for a large proportion of the RNAs in organisms. At the same time, they play a decisive role in biological processes such as gene expression and cell growth and development. Recently, it has been found that short non-coding RNAs (miRNAs) and long non-coding RNAs (lncRNAs) can regulate each other, and this mutual regulation plays an important role in various complex human diseases. In this paper, we used a new method (JSCSNCP-LMA) to predict unknown lncRNA-miRNA associations. This method combined the Jaccard similarity algorithm, the self-tuning spectral clustering similarity algorithm, the cosine similarity algorithm and known lncRNA-miRNA association networks, and used consistency projection to complete the final prediction. The results showed that the AUC values of JSCSNCP-LMA in fivefold cross validation (fivefold CV) and leave-one-out cross validation (LOOCV) were 0.9145 and 0.9268, respectively. Comparison with other models demonstrated its superiority and good extensibility. Meanwhile, the model was also applied to three different lncRNA-miRNA datasets in the fivefold CV experiment and obtained good results, with AUC values of 0.9145, 0.9662 and 0.9505, respectively. Therefore, JSCSNCP-LMA will help to predict the associations between lncRNAs and miRNAs.
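Two of the similarity measures named above, Jaccard and cosine, can be sketched on binary association profiles. The profiles below are hypothetical (each position marks whether an lncRNA is known to associate with a given miRNA), and the sketch does not cover the spectral clustering similarity or the consistency projection step.

```python
import math

def jaccard(a, b):
    """Jaccard similarity of two binary profiles: |intersection| / |union|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def cosine(a, b):
    """Cosine similarity of two numeric profiles; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical association profiles of two lncRNAs over four miRNAs.
lnc_a = [1, 1, 0, 1]
lnc_b = [1, 0, 0, 1]

j = jaccard(lnc_a, lnc_b)  # shared miRNAs / miRNAs linked to either lncRNA
c = cosine(lnc_a, lnc_b)
print(round(j, 4), round(c, 4))
```

Here the two lncRNAs share two of the three miRNAs linked to either of them, so the Jaccard value is 2/3, while the cosine value is 2/sqrt(6).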
Affiliation(s)
- Bo Wang
- College of Computer and Control Engineering, Qiqihar University, Qiqihar, 161006 People’s Republic of China
| | - Xinwei Wang
- College of Computer and Control Engineering, Qiqihar University, Qiqihar, 161006 People’s Republic of China
| | - Xiaodong Zheng
- College of Computer and Control Engineering, Qiqihar University, Qiqihar, 161006 People’s Republic of China
| | - Yu Han
- College of Computer and Control Engineering, Qiqihar University, Qiqihar, 161006 People’s Republic of China
| | - Xiaoxin Du
- College of Computer and Control Engineering, Qiqihar University, Qiqihar, 161006 People’s Republic of China
|
40
|
MHDMF: Prediction of miRNA-disease associations based on Deep Matrix Factorization with Multi-source Graph Convolutional Network. Comput Biol Med 2022; 149:106069. [PMID: 36115300 DOI: 10.1016/j.compbiomed.2022.106069] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/06/2022] [Revised: 07/31/2022] [Accepted: 08/27/2022] [Indexed: 11/24/2022]
Abstract
A growing number of works have shown that microRNAs (miRNAs) are crucial biomarkers in diverse bioprocesses affecting various diseases. As a good complement to high-cost wet experiment-based methods, numerous computational prediction methods have sprung up. However, challenges remain in handling the many false-negative associations and in making effective use of multi-source information to find potential associations. In this work, we develop an end-to-end computational framework, called MHDMF, which integrates multi-source information on a heterogeneous network to discover latent disease-miRNA associations. Since many false negatives exist among the miRNA-disease associations, MHDMF utilizes a multi-source Graph Convolutional Network (GCN) to correct the false-negative associations by reformulating the miRNA-disease association score matrix. The score matrix reformulation is based on different similarity profiles and known associations between miRNAs, genes, and diseases. Then, MHDMF employs Deep Matrix Factorization (DMF) to predict the miRNA-disease associations based on the reformulated miRNA-disease association score matrix. The experimental results show that the proposed framework outperforms highly related comparison methods by a large margin on tasks of miRNA-disease association prediction. Furthermore, case studies suggest that MHDMF could be a convenient and efficient tool and may supply a new way to think about miRNA-disease association prediction.
|
41
|
Peng L, Wang C, Tian G, Liu G, Li G, Lu Y, Yang J, Chen M, Li Z. Analysis of CT scan images for COVID-19 pneumonia based on a deep ensemble framework with DenseNet, Swin transformer, and RegNet. Front Microbiol 2022; 13:995323. [PMID: 36212877 PMCID: PMC9539545 DOI: 10.3389/fmicb.2022.995323] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/15/2022] [Accepted: 08/22/2022] [Indexed: 12/15/2022] Open
Abstract
COVID-19 has posed enormous challenges to the global economy and public health. Identifying patients with COVID-19 infection from CT scan images helps prevent the spread of the pandemic. Manual screening of COVID-19-related CT images takes a lot of time and resources. Artificial intelligence techniques including deep learning can effectively aid doctors and medical workers in screening COVID-19 patients. In this study, we developed an ensemble deep learning framework, DeepDSR, by combining DenseNet, Swin transformer, and RegNet for COVID-19 image identification. First, we integrate three available COVID-19-related CT image datasets into one larger dataset. Second, we pretrain the weights of DenseNet, Swin Transformer, and RegNet on the ImageNet dataset based on transfer learning. Third, we continue to train DenseNet, Swin Transformer, and RegNet on the integrated larger image dataset. Finally, the classification results are obtained by integrating the results from the above three models using the soft voting approach. The proposed DeepDSR model is compared to three state-of-the-art deep learning models (EfficientNetV2, ResNet, and Vision transformer) and three individual models (DenseNet, Swin transformer, and RegNet) on binary and three-class classification problems. The results show that DeepDSR achieves the best precision of 0.9833, recall of 0.9895, accuracy of 0.9894, F1-score of 0.9864, AUC of 0.9991 and AUPR of 0.9986 on the binary classification problem, and significantly outperforms the other methods. Furthermore, DeepDSR obtains the best precision of 0.9740, recall of 0.9653, accuracy of 0.9737, and F1-score of 0.9695 on the three-class classification problem, further suggesting its powerful image identification ability. We anticipate that the proposed DeepDSR framework will contribute to the diagnosis of COVID-19.
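The soft voting step mentioned above can be sketched as follows. This is a minimal illustration with equal model weights and hypothetical probability outputs; the abstract does not state the weighting DeepDSR uses, so equal weighting is an assumption.

```python
def soft_vote(prob_lists):
    """Average class-probability vectors from several models, then take the argmax.

    prob_lists: one probability vector per model, all of the same length.
    Returns (predicted_class_index, averaged_probabilities).
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg

# Hypothetical outputs for one CT image: [P(non-COVID), P(COVID)].
densenet_probs = [0.40, 0.60]
swin_probs = [0.55, 0.45]
regnet_probs = [0.20, 0.80]

label, avg = soft_vote([densenet_probs, swin_probs, regnet_probs])
print(label, [round(p, 4) for p in avg])
```

Unlike hard (majority) voting, soft voting lets a confident model outweigh a hesitant one: here one model leans towards class 0, but the averaged probabilities still favour class 1.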
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
| | - Chang Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Geng Tian
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Guangyi Liu
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Gan Li
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Yuankang Lu
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | | | - Min Chen
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
- *Correspondence: Min Chen; Zejun Li
| | - Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
- *Correspondence: Min Chen; Zejun Li
|
42
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022] Open
Abstract
Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources offers a more comprehensive research perspective, but it also challenges algorithm design to generate accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce a taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
43
|
Ma M, Na S, Zhang X, Chen C, Xu J. SFGAE: a self-feature-based graph autoencoder model for miRNA-disease associations prediction. Brief Bioinform 2022; 23:6678419. [PMID: 36037084 DOI: 10.1093/bib/bbac340] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/14/2022] [Revised: 07/21/2022] [Accepted: 07/25/2022] [Indexed: 11/13/2022] Open
Abstract
Increasing evidence has suggested that microRNAs (miRNAs) are important biomarkers of various diseases. Numerous graph neural network (GNN) models have been proposed for predicting miRNA-disease associations. However, the existing GNN-based methods suffer from an over-smoothing issue: the learned feature embeddings of miRNA nodes and disease nodes become indistinguishable when multiple GNN layers are stacked. This issue makes the performance of the methods sensitive to the number of layers, and significantly hurts the performance when more layers are employed. In this study, we resolve this issue with a novel self-feature-based graph autoencoder model, abbreviated as SFGAE. The key novelty of SFGAE is to construct miRNA-self embeddings and disease-self embeddings, and let them be independent of graph interactions between the two types of nodes. The novel self-feature embeddings enrich the information of typical aggregated feature embeddings, which aggregate the information from direct neighbors and hence heavily rely on graph interactions. SFGAE adopts a graph encoder with an attention mechanism to concatenate aggregated feature embeddings and self-feature embeddings, and adopts a bilinear decoder to predict links. Our experiments show that SFGAE achieves state-of-the-art performance. In particular, SFGAE improves the average AUC over the recent GAEMDA [1] on the benchmark datasets HMDD v2.0 and HMDD v3.2, and consistently performs better when fewer (e.g. 10%) training samples are used. Furthermore, SFGAE effectively overcomes the over-smoothing issue and performs stably well on deeper models (e.g. eight layers). Finally, we carry out case studies on three human diseases, colon neoplasms, esophageal neoplasms and kidney neoplasms, and perform a survival analysis using kidney neoplasm as an example. The results suggest that SFGAE is a reliable tool for predicting potential miRNA-disease associations.
Affiliation(s)
- Mingyuan Ma
- Key Laboratory of High Confidence Software Technologies of Ministry of Education, School of Computer Science, Peking University, Beijing, China
| | - Sen Na
- International Computer Science Institute and Department of Statistics, University of California, Berkeley, Berkeley CA, USA
| | - Xiaolu Zhang
- Department of Information Systems, City University of Hong Kong, Hong Kong, China
| | - Congzhou Chen
- Key Laboratory of High Confidence Software Technologies of Ministry of Education, School of Computer Science, Peking University, Beijing, China
| | - Jin Xu
- Key Laboratory of High Confidence Software Technologies of Ministry of Education, School of Computer Science, Peking University, Beijing, China
|
44
|
Rao Y, Xie M, Wang H. Predict potential miRNA-disease associations based on bounded nuclear norm regularization. Front Genet 2022; 13:978975. [PMID: 36072658 PMCID: PMC9441603 DOI: 10.3389/fgene.2022.978975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Accepted: 07/18/2022] [Indexed: 11/13/2022] Open
Abstract
Increasing evidence shows that abnormal microRNA (miRNA) expression is related to a variety of complex human diseases. However, the current biological experiments to determine miRNA-disease associations are time consuming and expensive. Therefore, computational models to predict potential miRNA-disease associations are urgently needed. Though many miRNA-disease association prediction methods have been proposed, there is still room to improve the prediction accuracy. In this paper, we propose a matrix completion model with bounded nuclear norm regularization to predict potential miRNA-disease associations, which is called BNNRMDA. BNNRMDA first constructs a heterogeneous miRNA-disease network integrating the information of miRNA self-similarity, disease self-similarity, and the known miRNA-disease associations, which is represented by an adjacency matrix. Then, it models the miRNA-disease prediction as a relaxed matrix completion with error tolerance, value boundary and nuclear norm minimization. Finally, it applies the alternating direction method to solve the matrix completion problem. BNNRMDA makes full use of the available information on miRNAs and diseases, and can deal with data containing noise. Compared with four state-of-the-art methods, the experimental results show that BNNRMDA achieved the best performance in five-fold cross-validation and leave-one-out cross-validation. The case studies on two complex human diseases showed that 47 of the top 50 prediction results of BNNRMDA have been verified in the latest HMDD database.
|
45
|
Wang W, Chen H. Predicting miRNA-disease associations based on graph attention networks and dual Laplacian regularized least squares. Brief Bioinform 2022; 23:6645486. [PMID: 35849099 DOI: 10.1093/bib/bbac292] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/30/2022] [Revised: 06/23/2022] [Accepted: 06/26/2022] [Indexed: 01/05/2023] Open
Abstract
Increasing biomedical evidence has proved that the dysregulation of miRNAs is associated with human complex diseases. Identification of disease-related miRNAs is of great importance for disease prevention, diagnosis and remedy. To reduce the time and cost of biomedical experiments, there is a strong incentive to develop efficient computational methods to infer potential miRNA-disease associations. Although many computational approaches have been proposed to address this issue, the prediction accuracy needs to be further improved. In this study, we present a computational framework MKGAT to predict possible associations between miRNAs and diseases through graph attention networks (GATs) using dual Laplacian regularized least squares. We use GATs to learn embeddings of miRNAs and diseases on each layer from initial input features of known miRNA-disease associations, intra-miRNA similarities and intra-disease similarities. We then calculate kernel matrices of miRNAs and diseases based on Gaussian interaction profile (GIP) with the learned embeddings. We further fuse the kernel matrices of each layer and initial similarities with attention mechanism. Dual Laplacian regularized least squares are finally applied for new miRNA-disease association predictions with the fused miRNA and disease kernels. Compared with six state-of-the-art methods by 5-fold cross-validations, our method MKGAT receives the highest AUROC value of 0.9627 and AUPR value of 0.7372. We use MKGAT to predict related miRNAs for three cancers and discover that all the top 50 predicted results in the three diseases are confirmed by existing databases. The excellent performance indicates that MKGAT would be a useful computational tool for revealing disease-related miRNAs.
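The Gaussian interaction profile (GIP) kernel mentioned above can be sketched as follows. This is a minimal version under stated assumptions: it uses the commonly cited form K(i, j) = exp(-gamma * ||v_i - v_j||^2), with gamma set to a bandwidth parameter divided by the mean squared norm of the profiles; the vectors are hypothetical stand-ins for learned embeddings, and the exact normalisation used by MKGAT is not given in the abstract.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel matrix for a list of vectors.

    gamma is normalised by the mean squared norm of the profiles, so the
    kernel width adapts to the scale of the input (assumes the profiles are
    not all zero).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    return [[math.exp(-gamma * sq_dist(p, q)) for q in profiles] for p in profiles]

# Hypothetical vectors for three miRNAs (e.g. association profiles or embeddings).
vectors = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
K = gip_kernel(vectors)
for row in K:
    print([round(v, 4) for v in row])
```

Identical profiles get similarity 1, and similarity decays towards 0 as the profiles diverge; the resulting matrix is symmetric with a unit diagonal.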
Affiliation(s)
- Wengang Wang
- School of Software, East China Jiaotong University, Nanchang 330013, China
| | - Hailin Chen
- School of Software, East China Jiaotong University, Nanchang 330013, China
|
46
|
Yang Y, Shang J, Sun Y, Li F, Zhang Y, Kong XZ, Li S, Liu JX. TLNPMD: Prediction of miRNA-Disease Associations Based on miRNA-Drug-Disease Three-Layer Heterogeneous Network. Molecules 2022; 27:4371. [PMID: 35889243 PMCID: PMC9324587 DOI: 10.3390/molecules27144371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 07/06/2022] [Indexed: 12/10/2022] Open
Abstract
Many microRNAs (miRNAs) have been confirmed to be associated with the generation of human diseases. Capturing miRNA-disease associations (M-DAs) provides an effective way to understand the etiology of diseases. Many models for predicting M-DAs have been constructed; nevertheless, there are still several limitations, such as generally considering direct information between miRNAs and diseases, usually ignoring potential knowledge hidden in isolated miRNAs or diseases. To overcome these limitations, in this study a novel method for predicting M-DAs was developed named TLNPMD, highlights of which are the introduction of drug heuristic information and a bipartite network reconstruction strategy. Specifically, three bipartite networks, including drug-miRNA, drug-disease, and miRNA-disease, were reconstructed as weighted ones using such reconstruction strategy. Based on these weighted bipartite networks, as well as three corresponding similarity networks of drugs, miRNAs and diseases, the miRNA-drug-disease three-layer heterogeneous network was constructed. Then, this heterogeneous network was converted into three two-layer heterogeneous networks, for each of which the network path computational model was employed to predict association scores. Finally, both direct and indirect miRNA-disease paths were used to predict M-DAs. Comparative experiments of TLNPMD and other four models were performed and evaluated by five-fold and global leave-one-out cross validations, results of which show that TLNPMD has the highest AUC values among those of compared methods. In addition, case studies of two common diseases were carried out to validate the effectiveness of the TLNPMD. These experiments demonstrate that the TLNPMD may serve as a promising alternative to existing methods for predicting M-DAs.
Affiliation(s)
- Yi Yang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Yan Sun
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Feng Li
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China
| | - Xiang-Zhen Kong
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Shengjun Li
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
|
47
|
Chen B, Wang T, Zhang J, Zhang S, Shang X. Identification of Colon Cancer-Related RNAs Based on Heterogeneous Networks and Random Walk. Biology 2022; 11:1003. [PMID: 36101384 PMCID: PMC9312154 DOI: 10.3390/biology11071003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 06/25/2022] [Accepted: 06/28/2022] [Indexed: 11/17/2022]
Abstract
Colon cancer is considered as a complex disease that consists of metastatic seeding in early stages. Such disease is not simply caused by the action of a single RNA, but is associated with disorders of many kinds of RNAs and their regulation relationships. Hence, it is of great significance to study the complex regulatory roles among mRNAs, miRNAs and lncRNAs for further understanding the pathogenic mechanism of colon cancer. In this study, we constructed a heterogeneous network consisting of differentially expressed mRNAs, miRNAs and lncRNAs. This contains three kinds of vertices and six types of edges. All RNAs were re-divided into three categories, which were "related", "irrelevant" and "unlabeled". They were processed by dynamic excitation restart random walk (RW-DIR) for identifying colon cancer-related RNAs. Ten RNAs were finally obtained related to colon cancer, which were hsa-miR-2682-5p, hsa-miR-1277-3p, ANGPTL1, SLC22A18AS, FENDRR, PHLPP2, hsa-miR-302a-5p, APCDD1, MEX3A and hsa-miR-509-3-5p. Numerical experiments have indicated that the proposed network construction framework and the following RW-DIR algorithm are effective for identifying colon cancer-related RNAs, and this kind of analysis framework can also be easily extended to other diseases, effectively narrowing the scope of biological experimental research.
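For orientation, the sketch below shows a plain random walk with restart (RWR) on a small graph. It is not the RW-DIR variant described above (the dynamic excitation mechanism is not specified in the abstract); the toy graph, the restart probability of 0.3, and the function name are all illustrative assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until convergence.

    adj:   symmetric adjacency matrix as a list of lists
    seeds: set of node indices known to be disease-related
    W is the column-normalised adjacency matrix, p0 the uniform seed vector.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Toy path graph 0 - 1 - 2 - 3; node 0 plays the role of a "related" RNA.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seeds={0})
print([round(s, 4) for s in scores])
```

Nodes closer to the seed receive higher steady-state scores, which is the basis for ranking unlabeled nodes by their proximity to known disease-related ones.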
Affiliation(s)
- Bolin Chen
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
| | - Teng Wang
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
| | - Jinlei Zhang
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
| | - Shengli Zhang
- School of Information Technology, Minzu Normal University of Xingyi, Xingyi 562400, China
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
|
48
|
Ji C, Wang Y, Gao Z, Li L, Ni J, Zheng C. A Semi-Supervised Learning Method for MiRNA-Disease Association Prediction Based on Variational Autoencoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2049-2059. [PMID: 33735084 DOI: 10.1109/tcbb.2021.3067338] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs that play a critical role in many biological processes, such as cell growth, development, differentiation and aging. A growing number of studies have revealed that miRNAs are closely involved in many human diseases. Therefore, the prediction of miRNA-disease associations is of great significance to the study of the pathogenesis, diagnosis and intervention of human disease. However, biological experimental methods are usually expensive in time and money, while computational methods can provide an efficient way to infer the underlying disease-related miRNAs. In this study, we propose a novel method to predict potential miRNA-disease associations, called SVAEMDA. Our method treats miRNA-disease association prediction as a semi-supervised learning problem. SVAEMDA integrates disease semantic similarity, miRNA functional similarity and the respective Gaussian interaction profile (GIP) similarities. The integrated similarities are used to learn the representations of diseases and miRNAs. SVAEMDA trains a variational autoencoder-based predictor using known miRNA-disease associations in the form of concatenated dense vectors. The reconstruction probability of the predictor is used to measure the correlation of the miRNA-disease pairs. Experimental results show that SVAEMDA outperforms other state-of-the-art methods. The AUC values of SVAEMDA in global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV) are 0.9464 and 0.9428, respectively. In addition, case studies of three common human diseases indicate that 100 percent of the top 50 candidates predicted by SVAEMDA are found in the benchmark databases. Therefore, SVAEMDA can efficiently and accurately predict the potential associations between diseases and miRNAs.
|
49
|
Paolini A, Baldassarre A, Bruno SP, Felli C, Muzi C, Ahmadi Badi S, Siadat SD, Sarshar M, Masotti A. Improving the Diagnostic Potential of Extracellular miRNAs Coupled to Multiomics Data by Exploiting the Power of Artificial Intelligence. Front Microbiol 2022; 13:888414. [PMID: 35756065 PMCID: PMC9218639 DOI: 10.3389/fmicb.2022.888414] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/02/2022] [Accepted: 05/11/2022] [Indexed: 12/15/2022] Open
Abstract
In recent years, the clinical use of extracellular miRNAs as potential biomarkers of disease has increasingly emerged as a new and powerful tool. Serum, urine, saliva and stool contain miRNAs that not only exert regulatory effects in surrounding epithelial cells but can also modulate bacterial gene expression, thus acting as “master regulators” of many biological processes. We think that in order to have a holistic picture of the health status of an individual, we have to consider many “omics” data comprehensively, such as miRNA profiling from different parts of the body and the interactions of miRNAs with cells and bacteria. Moreover, Artificial Intelligence (AI) and Machine Learning (ML) algorithms coupled to other multiomics data (i.e., big data) could help researchers to better classify the patient’s molecular characteristics and guide clinicians in identifying personalized therapeutic strategies. Here, we highlight how the integration of “multiomic” data (i.e., miRNA profiling and microbiota signature) with other omics (i.e., metabolomics, exposomics) analyzed by AI algorithms could improve the diagnostic and prognostic potential of specific biomarkers of disease.
Affiliation(s)
- Alessandro Paolini
- Research Laboratories, Bambino Gesù Children's Hospital-IRCCS, Rome, Italy
| | | | - Stefania Paola Bruno
- Research Laboratories, Bambino Gesù Children's Hospital-IRCCS, Rome, Italy; Department of Science, University Roma Tre, Rome, Italy
| | - Cristina Felli
- Research Laboratories, Bambino Gesù Children's Hospital-IRCCS, Rome, Italy
| | - Chantal Muzi
- Research Laboratories, Bambino Gesù Children's Hospital-IRCCS, Rome, Italy
| | - Sara Ahmadi Badi
- Microbiology Research Center (MRC), Pasteur Institute of Iran, Tehran, Iran; Mycobacteriology and Pulmonary Research Department, Pasteur Institute of Iran, Tehran, Iran
| | - Seyed Davar Siadat
- Microbiology Research Center (MRC), Pasteur Institute of Iran, Tehran, Iran; Mycobacteriology and Pulmonary Research Department, Pasteur Institute of Iran, Tehran, Iran
| | - Meysam Sarshar
- Research Laboratories, Bambino Gesù Children's Hospital-IRCCS, Rome, Italy
| | - Andrea Masotti
- Research Laboratories, Bambino Gesù Children's Hospital-IRCCS, Rome, Italy
|
50
|
Li G, Fang T, Zhang Y, Liang C, Xiao Q, Luo J. Predicting miRNA-disease associations based on graph attention network with multi-source information. BMC Bioinformatics 2022; 23:244. [PMID: 35729531 PMCID: PMC9215044 DOI: 10.1186/s12859-022-04796-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 03/31/2022] [Accepted: 06/15/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND There is a growing body of evidence from biological experiments suggesting that microRNAs (miRNAs) play a significant regulatory role in both diverse cellular activities and pathological processes. Exploring miRNA-disease associations not only can decipher pathogenic mechanisms but also provide treatment solutions for diseases. As it is inefficient to identify undiscovered relationships between diseases and miRNAs using biotechnology, an explosion of computational methods have been advanced. However, the prediction accuracy of existing models is hampered by the sparsity of known association network and single-category feature, which is hard to model the complicated relationships between diseases and miRNAs. RESULTS In this study, we advance a new computational framework (GATMDA) to discover unknown miRNA-disease associations based on graph attention network with multi-source information, which effectively fuses linear and non-linear features. In our method, the linear features of diseases and miRNAs are constructed by disease-lncRNA correlation profiles and miRNA-lncRNA correlation profiles, respectively. Then, the graph attention network is employed to extract the non-linear features of diseases and miRNAs by aggregating information of each neighbor with different weights. Finally, the random forest algorithm is applied to infer the disease-miRNA correlation pairs through fusing linear and non-linear features of diseases and miRNAs. As a result, GATMDA achieves impressive performance: an average AUC of 0.9566 with five-fold cross validation, which is superior to other previous models. In addition, case studies conducted on breast cancer, colon cancer and lymphoma indicate that 50, 50 and 48 out of the top fifty prioritized candidates are verified by biological experiments. 
CONCLUSIONS The extensive experimental results support the accuracy and utility of GATMDA, and we anticipate that it may be regarded as a useful tool for identifying unobserved disease-miRNA relationships.
Affiliation(s)
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China.
| | - Tao Fang
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Yuejin Zhang
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
|