1
Han GS, Gao Q, Peng LZ, Tang J. Hessian Regularized L2,1-Nonnegative Matrix Factorization and Deep Learning for miRNA-Disease Associations Prediction. Interdiscip Sci 2024; 16:176-191. [PMID: 38099958 DOI: 10.1007/s12539-023-00594-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 11/05/2023] [Accepted: 11/07/2023] [Indexed: 02/22/2024]
Abstract
Since the identification of microRNAs (miRNAs), empirical research has demonstrated their crucial involvement in the functioning of organisms. Investigating miRNAs significantly bolsters efforts related to averting, diagnosing, and treating intricate human maladies. Yet, exploring every conceivable miRNA-disease association consumes significant resources and time within conventional wet experiments. On the computational front, forecasting potential miRNA-disease connections serves as a valuable source of preliminary insights for medical investigators. As a result, we have developed a novel matrix factorization model known as Hessian-regularized L2,1 nonnegative matrix factorization in combination with deep learning for predicting associations between miRNAs and diseases, denoted as HRL2,1-NMF-DF. In particular, we introduce a novel iterative fusion approach to integrate all similarities. This method effectively diminishes the sparsity of the initial miRNA-disease association matrix. Additionally, we devise a mixed model framework that utilizes deep learning, matrix decomposition, and singular value decomposition to capture and depict the intricate nonlinear features of miRNAs and diseases. The prediction performance of the six matrix factorization methods is improved through comparison and analysis, similarity matrix fusion, data preprocessing, and parameter adjustment. The AUC and AUPR obtained by the new matrix factorization model under fivefold cross-validation are comparable to or better than those of other matrix factorization models. Finally, we select three diseases, namely lung tumor, bladder tumor, and breast tumor, for case analysis, and further extend the matrix factorization model based on deep learning. The results show that the hybrid algorithm combining matrix factorization with deep learning proposed in this paper can predict miRNAs related to different diseases with high accuracy.
Affiliation(s)
- Guo-Sheng Han
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China.
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China.
- Qi Gao
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China
- Ling-Zhi Peng
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China
- Jing Tang
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China
2
Qu Q, Chen X, Ning B, Zhang X, Nie H, Zeng L, Chen H, Fu X. Prediction of miRNA-disease associations by neural network-based deep matrix factorization. Methods 2023; 212:1-9. [PMID: 36813017 DOI: 10.1016/j.ymeth.2023.02.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Revised: 01/17/2023] [Accepted: 02/10/2023] [Indexed: 02/23/2023] Open
Abstract
MicroRNAs (miRNAs) are a class of short non-coding RNAs with a length of about 22 nucleotides that participate in various biological processes of cells. A number of studies have shown that miRNAs are closely related to the occurrence of cancer and various human diseases. Therefore, studying miRNA-disease associations helps in understanding the pathogenesis of diseases as well as their prevention, diagnosis, treatment and prognosis. Traditional biological experimental methods for studying miRNA-disease associations have disadvantages: they require expensive equipment and are time-consuming and labor-intensive. With the rapid development of bioinformatics, more and more researchers are committed to developing effective computational methods to predict miRNA-disease associations in order to reduce the time and money cost of experiments. In this study, we proposed a neural network-based deep matrix factorization method named NNDMF to predict miRNA-disease associations. To address the problem that traditional matrix factorization methods can only extract linear features, NNDMF used a neural network to perform deep matrix factorization and extract nonlinear features, which makes up for the shortcomings of traditional matrix factorization methods. We compared NNDMF with four previous classical prediction models (IMCMDA, GRMDA, SACMDA and ICFMDA) in global LOOCV and local LOOCV, respectively. The AUCs achieved by NNDMF in the two cross-validation methods were 0.9340 and 0.8763, respectively. Furthermore, we conducted case studies on three important human diseases (lymphoma, colorectal cancer and lung cancer) to validate the effectiveness of NNDMF. In conclusion, NNDMF could effectively predict potential miRNA-disease associations.
Affiliation(s)
- Qiang Qu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xia Chen
  - School of Basic Education, Changsha Aeronautical Vocational and Technical College, Changsha, China
- Bin Ning
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiang Zhang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Hao Nie
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Li Zeng
  - College of Life and Environmental Science, Hunan University of Art and Science, Changde, China
- Haowen Chen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
- Xiangzheng Fu
  - Research Institute of Hunan University in Chongqing, Chongqing, China.
3
Li L, Gao Z, Zheng CH, Qi R, Wang YT, Ni JC. Predicting miRNA-Disease Association Based on Improved Graph Regression. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3604-3613. [PMID: 34757912 DOI: 10.1109/tcbb.2021.3127017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Recently, as a growing number of associations between microRNAs (miRNAs) and diseases have been discovered, researchers have gradually realized that miRNAs are closely related to several complicated biological processes and human diseases. Hence, it is especially important to construct effective models to infer associations between miRNAs and diseases. In this study, we presented Improved Graph Regression for miRNA-Disease Association Prediction (IGRMDA) to uncover potential relationships between miRNAs and diseases. In order to reduce the inherent noise in the acquired biological datasets, we utilized a matrix decomposition algorithm to process miRNA functional similarity and disease semantic similarity and then combined them with existing similarity information to obtain the final miRNA similarity data and disease similarity data. Then, we applied miRNA-disease association data, miRNA similarity data and disease similarity data to form the corresponding latent spaces. Furthermore, we performed an improved graph regression algorithm in the latent spaces, which included the miRNA-disease association space, the miRNA similarity space and the disease similarity space. Non-negative matrix factorization and partial least squares were used in the graph regression process to obtain important related attributes. Cross-validation experiments and case studies were also implemented to assess the effectiveness of IGRMDA, and they showed that IGRMDA could predict potential associations between miRNAs and diseases.
4
Ji C, Wang Y, Gao Z, Li L, Ni J, Zheng C. A Semi-Supervised Learning Method for MiRNA-Disease Association Prediction Based on Variational Autoencoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2049-2059. [PMID: 33735084 DOI: 10.1109/tcbb.2021.3067338] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in many biological processes, such as cell growth, development, differentiation and aging. A growing number of studies have revealed that miRNAs are closely involved in many human diseases. Therefore, the prediction of miRNA-disease associations is of great significance to the study of the pathogenesis, diagnosis and intervention of human disease. However, biological experimental methods are usually expensive in time and money, while computational methods can provide an efficient way to infer the underlying disease-related miRNAs. In this study, we propose a novel method to predict potential miRNA-disease associations, called SVAEMDA. Our method mainly considers miRNA-disease association prediction as a semi-supervised learning problem. SVAEMDA integrates disease semantic similarity, miRNA functional similarity and the respective Gaussian interaction profile (GIP) similarities. The integrated similarities are used to learn the representations of diseases and miRNAs. SVAEMDA trains a variational autoencoder-based predictor by using known miRNA-disease associations, in the form of concatenated dense vectors. The reconstruction probability of the predictor is used to measure the correlation of the miRNA-disease pairs. Experimental results show that SVAEMDA outperforms other state-of-the-art methods. The AUC values of SVAEMDA under global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV) are 0.9464 and 0.9428, respectively. In addition, case studies of three common human diseases indicate that SVAEMDA obtains 100 percent of the top 50 predicted candidates in the benchmark databases. Therefore, SVAEMDA can efficiently and accurately predict the potential associations between diseases and miRNAs.
5
Gao Z, Wang YT, Wu QW, Li L, Ni JC, Zheng CH. A New Method Based on Matrix Completion and Non-Negative Matrix Factorization for Predicting Disease-Associated miRNAs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:763-772. [PMID: 32991287 DOI: 10.1109/tcbb.2020.3027444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Numerous studies have shown that microRNAs are associated with the occurrence and development of human diseases. Thus, studying disease-associated miRNAs is highly valuable for the prevention, diagnosis and treatment of diseases. In this paper, we proposed a novel method based on matrix completion and non-negative matrix factorization (MCNMF) for predicting disease-associated miRNAs. Because the information on miRNA similarities and disease similarities is inadequate, we calculated the latter via two models and introduced the Gaussian interaction profile kernel similarity. In addition, matrix completion (MC) was employed to further replenish the miRNA and disease similarities and improve the prediction performance. To reduce the sparsity of the miRNA-disease association matrix, the weighted K nearest neighbor (WKNKN) method was used as a pre-processing step. We also utilized non-negative matrix factorization (NMF) with a dual L2,1-norm, graph Laplacian regularization, and Tikhonov regularization to effectively avoid overfitting during prediction. Finally, several experiments and a case study were implemented to evaluate the effectiveness and performance of the proposed MCNMF model. The results indicated that our method could reliably and effectively predict disease-associated miRNAs.
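The Gaussian interaction profile (GIP) kernel similarity mentioned in this and several other abstracts in the list can be illustrated with a small sketch. It uses the commonly cited definition, in which the similarity of two entities is exp(-gamma * squared distance between their association profiles) and gamma is a bandwidth divided by the mean squared profile norm. The function name `gip_kernel`, the `gamma_prime` parameter, and the toy matrix are assumptions made for illustration and are not taken from the MCNMF paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of an association matrix.

    profiles[i] is the 0/1 association profile of entity i (hypothetical toy input).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("at least one known association is required")
    # Bandwidth normalised by the average squared norm of the profiles.
    gamma = gamma_prime / mean_sq
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * dist_sq)
    return sim


# Toy miRNA-by-disease association matrix (rows: miRNAs, columns: diseases).
associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
mirna_sim = gip_kernel(associations)
# Disease similarity uses the columns as profiles.
disease_sim = gip_kernel([list(col) for col in zip(*associations)])
```

In this toy matrix, the first two miRNAs share a disease and therefore receive a higher similarity than the first and third, which share none.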
6
Luo J, Liu Y, Liu P, Lai Z, Wu H. Data Integration Using Tensor Decomposition for The Prediction of miRNA-Disease Associations. IEEE J Biomed Health Inform 2021; 26:2370-2378. [PMID: 34748505 DOI: 10.1109/jbhi.2021.3125573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Dysfunction of miRNAs has an important relationship with diseases through its impact on their target genes. Identifying disease-related miRNAs is of great significance for preventing and treating diseases. Integrating information on genes related to miRNAs and/or diseases into computational methods for miRNA-disease association studies is meaningful because of the complexity of biological mechanisms. Therefore, in this study, we propose a novel method based on tensor decomposition, termed TDMDA, to integrate multi-type data for identifying pathogenic miRNAs. First, we construct a three-order association tensor to express the associations of miRNA-disease pairs, miRNA-gene pairs, and gene-disease pairs simultaneously. Then, a tensor decomposition-based method with auxiliary information is applied to reconstruct the association tensor for predicting miRNA-disease associations; the auxiliary information includes biological similarity information and adjacency information. The performance of TDMDA is compared with other advanced methods under 5-fold cross-validation. The experimental results indicate that TDMDA is a competitive method.
7
Zhang ZW, Gao Z, Zheng CH, Li L, Qi SM, Wang YT. WVMDA: Predicting miRNA-Disease Association Based on Weighted Voting. Front Genet 2021; 12:742992. [PMID: 34659363 PMCID: PMC8511643 DOI: 10.3389/fgene.2021.742992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2021] [Accepted: 09/09/2021] [Indexed: 11/15/2022] Open
Abstract
An increasing number of experiments have verified that miRNA expression is related to human diseases. The miRNA expression profile may be an indicator for clinical diagnosis and provides a new direction for the prevention and treatment of complex diseases. In this work, we present a weighted voting-based model for predicting miRNA-disease association (WVMDA). To build a reasonable similarity network, we established a credibility similarity based on the reliability of known associations and used it to improve the original incomplete similarity. To eliminate noise interference as much as possible while retaining the more reliable similarity information, we developed a filter. More importantly, to ensure the fairness and efficiency of weighted voting, we focused on the design of the weighting. Finally, cross-validation experiments and case studies were undertaken to verify the efficacy of the proposed model. The results showed that WVMDA could efficiently identify miRNAs associated with diseases.
Affiliation(s)
- Zhen-Wei Zhang
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
- Zhen Gao
  - School of Computer Science and Technology, Anhui University, Hefei, China
- Chun-Hou Zheng
  - School of Cyberspace Security, Qufu Normal University, Qufu, China; School of Computer Science and Technology, Anhui University, Hefei, China
- Lei Li
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
- Su-Min Qi
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
- Yu-Tian Wang
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
8
Wang YT, Li L, Ji CM, Zheng CH, Ni JC. ILPMDA: Predicting miRNA-Disease Association Based on Improved Label Propagation. Front Genet 2021; 12:743665. [PMID: 34659364 PMCID: PMC8514753 DOI: 10.3389/fgene.2021.743665] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/19/2021] [Accepted: 08/30/2021] [Indexed: 12/21/2022] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that have been demonstrated to be related to numerous complex human diseases. Numerous studies have suggested that miRNAs affect many complicated bioprocesses. Hence, the investigation of disease-related miRNAs by computational methods is warranted. In this study, we presented an improved label propagation for miRNA-disease association prediction (ILPMDA) method to identify disease-related miRNAs. First, we utilized similarity kernel fusion to integrate different types of biological information for generating miRNA and disease similarity networks. Second, we applied the weighted k-nearest known neighbor algorithm to update verified miRNA-disease association data. Third, we utilized improved label propagation in the disease and miRNA similarity networks to make association predictions. Furthermore, we obtained final prediction scores by adopting an average ensemble method to integrate the two kinds of prediction results. To evaluate the prediction performance of ILPMDA, two types of cross-validation and case studies on three significant human diseases were carried out. All results demonstrated that ILPMDA was able to discover potential miRNA-disease associations.
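The propagation and averaging steps described in this abstract can be sketched as follows. The abstract does not spell out how the "improved" variant differs, so the code uses the standard label propagation iteration F <- alpha * W * F + (1 - alpha) * Y on a row-normalised similarity matrix W, followed by a plain average of the miRNA-side and disease-side scores. All names (`label_propagation`, `alpha`) and the toy matrices are assumptions for illustration, not the ILPMDA implementation.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def label_propagation(sim, labels, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W F + (1 - alpha) * Y until the scores stop changing."""
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(max_iter):
        new = [
            [
                alpha * sum(w[i][k] * scores[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
        delta = max(abs(new[i][j] - scores[i][j]) for i in range(n) for j in range(m))
        scores = new
        if delta < tol:
            break
    return scores


# Hypothetical toy data: 3 miRNAs, 2 diseases.
mirna_sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
disease_sim = [
    [1.0, 0.3],
    [0.3, 1.0],
]
known = [
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 1.0],
]

# Propagate over the miRNA network, then over the disease network (on the transpose).
from_mirna = label_propagation(mirna_sim, known)
from_disease = label_propagation(disease_sim, [list(c) for c in zip(*known)])
# Average ensemble of the two score matrices.
final = [
    [(from_mirna[i][j] + from_disease[j][i]) / 2 for j in range(2)]
    for i in range(3)
]
```

Here the second miRNA has no known association, yet it scores higher for the first disease because it is most similar to the miRNA already linked to that disease.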
Affiliation(s)
- Yu-Tian Wang
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Lei Li
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Cun-Mei Ji
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Chun-Hou Zheng
  - School of Artificial Intelligence, Anhui University, Hefei, China
- Jian-Cheng Ni
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
9
Wu Y, Zhu D, Wang X, Zhang S. An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data. Comput Biol Chem 2021; 95:107566. [PMID: 34534906 DOI: 10.1016/j.compbiolchem.2021.107566] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/08/2021] [Revised: 08/13/2021] [Accepted: 08/18/2021] [Indexed: 11/17/2022]
Abstract
To explore the pathogenic mechanisms of microRNAs (miRNAs) in diverse diseases, many researchers have concentrated on discovering potential associations between miRNAs and diseases using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised Kmeans (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs, and a subagging method to generate diverse training sample sets that make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
Affiliation(s)
- Yao Wu
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Donghua Zhu
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Xuefeng Wang
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China.
- Shuo Zhang
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
10
Toprak A, Eryilmaz Dogan E. Prediction of Potential MicroRNA-Disease Association Using Kernelized Bayesian Matrix Factorization. Interdiscip Sci 2021; 13:595-602. [PMID: 34370220 DOI: 10.1007/s12539-021-00469-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/22/2020] [Revised: 07/05/2021] [Accepted: 07/30/2021] [Indexed: 10/20/2022]
Abstract
MicroRNA (miRNA) molecules, which play a role in the formation and progression of many different diseases, are 18-22 nucleotides in length and make up a type of non-coding RNA. Predicting disease-related microRNAs is crucial for understanding the pathogenesis of disease and for the diagnosis, treatment, and prevention of diseases. Many computational techniques have been studied and developed, as the experimental techniques used to find novel miRNA-disease associations in biology are costly. In this paper, a Kernelized Bayesian Matrix Factorization (KBMF) technique is proposed to predict new relations among miRNAs and diseases using several kinds of information, such as miRNA functional similarity, disease semantic similarity, and known relations among miRNAs and diseases. An AUC value of 0.9450 was obtained under fivefold cross-validation for the KBMF technique. We also carried out three case studies (breast, lung, and colon neoplasms) to assess the performance of the KBMF technique, and the results confirmed the predictive reliability of this method. Thus, the KBMF technique can be used as a reliable computational model to infer possible miRNA-disease associations.
Affiliation(s)
- Ahmet Toprak
  - Department of Electricity and Energy, Bozkır Vocational School, Selcuk University, Bozkır, Konya, Turkey
- Esma Eryilmaz Dogan
  - Department of Biomedical Engineering, Faculty of Technology, Selcuk University, Selçuklu, Konya, Turkey.
11
SCMFMDA: Predicting microRNA-disease associations based on similarity constrained matrix factorization. PLoS Comput Biol 2021; 17:e1009165. [PMID: 34252084 PMCID: PMC8345837 DOI: 10.1371/journal.pcbi.1009165] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 04/05/2021] [Revised: 08/06/2021] [Accepted: 06/08/2021] [Indexed: 11/21/2022] Open
Abstract
miRNAs are small non-coding RNAs that are related to a number of complicated biological processes. Numerous studies have suggested that miRNAs are closely associated with many human diseases. In this study, we proposed a computational model based on Similarity Constrained Matrix Factorization for miRNA-Disease Association Prediction (SCMFMDA). In order to effectively combine different disease and miRNA similarity data, we applied the similarity network fusion algorithm to obtain an integrated disease similarity (composed of disease functional similarity, disease semantic similarity and disease Gaussian interaction profile kernel similarity) and an integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity and miRNA Gaussian interaction profile kernel similarity). In addition, L2 regularization terms and similarity constraint terms were added to the traditional Nonnegative Matrix Factorization algorithm to predict disease-related miRNAs. SCMFMDA achieved AUCs of 0.9675 and 0.9447 based on global leave-one-out cross validation and five-fold cross validation, respectively. Furthermore, case studies on two common human diseases were also implemented to demonstrate the prediction accuracy of SCMFMDA. Most of the top 50 predicted miRNAs were confirmed by experimental reports, which indicated that SCMFMDA was effective for predicting relationships between miRNAs and diseases.
Numerous studies have suggested that miRNAs are closely associated with many human diseases, so predicting potential associations between miRNAs and diseases can contribute to the diagnosis and treatment of diseases. Several models for discovering unknown miRNA-disease associations make the prediction more productive and effective. We proposed SCMFMDA to obtain more accurate prediction results by applying similarity network fusion to fuse multi-source disease and miRNA information and utilizing similarity constrained matrix factorization to make predictions based on biological information. Global leave-one-out cross validation and five-fold cross validation were applied to evaluate our model. SCMFMDA achieved AUCs of 0.9675 and 0.9447, which were clearly higher than those of previous computational models. Furthermore, we implemented case studies on significant human diseases including colon neoplasms and lung neoplasms, in which 47 and 46 of the top 50 predictions, respectively, were confirmed by experimental reports. All results indicated that SCMFMDA could be regarded as an effective way to discover unverified miRNA-disease connections.
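As a rough illustration of the factorization idea in this abstract, the sketch below fits a nonnegative factorization Y ≈ U Vᵀ with L2 (Frobenius) penalties using the standard multiplicative update rules. It deliberately omits the similarity constraint terms of SCMFMDA, whose exact form is not given in the abstract. The function names, `rank`, `lam`, and the toy association matrix are assumptions for illustration only.

```python
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def objective(y, u, v, lam):
    """Squared reconstruction error plus lam * (||U||^2 + ||V||^2)."""
    recon = matmul(u, transpose(v))
    fit = sum((a - b) ** 2 for yr, rr in zip(y, recon) for a, b in zip(yr, rr))
    reg = sum(x * x for row in u + v for x in row)
    return fit + lam * reg


def update(factor, numer, denom, lam, eps):
    """Multiplicative step: factor * numer / (denom + lam * factor)."""
    return [
        [f * n / (d + lam * f + eps) for f, n, d in zip(fr, nr, dr)]
        for fr, nr, dr in zip(factor, numer, denom)
    ]


def regularized_nmf(y, rank=2, lam=0.1, iters=200, seed=0, eps=1e-12):
    rng = random.Random(seed)
    u = [[rng.random() for _ in range(rank)] for _ in y]      # miRNA factors
    v = [[rng.random() for _ in range(rank)] for _ in y[0]]   # disease factors
    history = [objective(y, u, v, lam)]
    for _ in range(iters):
        # U <- U * (Y V) / (U V^T V + lam U)
        u = update(u, matmul(y, v), matmul(u, matmul(transpose(v), v)), lam, eps)
        # V <- V * (Y^T U) / (V U^T U + lam V)
        v = update(v, matmul(transpose(y), u), matmul(v, matmul(transpose(u), u)), lam, eps)
        history.append(objective(y, u, v, lam))
    return u, v, history


# Hypothetical toy association matrix: 4 miRNAs by 3 diseases.
known = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
]
u, v, history = regularized_nmf(known)
scores = matmul(u, transpose(v))  # predicted association scores
```

Because every update multiplies nonnegative quantities, the factors stay nonnegative throughout, and entries of `scores` at positions with no known association can be ranked as candidate predictions.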
12
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, experimental methods are costly and time-consuming. Thus, there is an urgent need for computational methods for the prediction of MDAs. Based on graph theory, MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve training efficiency and accuracy. This method models both the potential connections of the feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, we considered six tasks simultaneously on the MDA prediction problem for the first time, which ensures that, under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method achieved satisfactory performance on all six tasks and is significantly superior to classic machine learning methods and state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs were conducted and achieved satisfactory performance.
Affiliation(s)
- Yanyi Chu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Xuhong Wang
  - School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
- Qiuying Dai
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Yanjing Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Qiankun Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Shaoliang Peng
  - College of Computer Science and Electronic Engineering, Hunan University, China
- Dennis Russell Salahub
  - Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
13
Ji C, Gao Z, Ma X, Wu Q, Ni J, Zheng C. AEMDA: inferring miRNA-disease associations based on deep autoencoder. Bioinformatics 2021; 37:66-72. [PMID: 32726399 DOI: 10.1093/bioinformatics/btaa670] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 03/22/2020] [Revised: 05/27/2020] [Accepted: 07/20/2020] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION: MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in various biological processes. Many studies have shown that miRNAs are closely related to the occurrence, development and diagnosis of human diseases. Traditional biological experiments are costly and time consuming. As a result, effective computational models have become increasingly popular for predicting associations between miRNAs and diseases, which could effectively boost human disease diagnosis and prevention. RESULTS: We propose a novel computational framework, called AEMDA, to identify associations between miRNAs and diseases. AEMDA applies a learning-based method to extract dense and high-dimensional representations of diseases and miRNAs from integrated disease semantic similarity, miRNA functional similarity and heterogeneous related interaction data. In addition, AEMDA adopts a deep autoencoder that does not need negative samples to retrieve the underlying associations between miRNAs and diseases. Furthermore, the reconstruction error is used as a measurement to predict disease-associated miRNAs. Our experimental results indicate that AEMDA can effectively predict disease-related miRNAs and outperforms state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/CunmeiJi/AEMDA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cunmei Ji: School of Software, Qufu Normal University, Qufu 273165, China
- Zhen Gao: School of Software, Qufu Normal University, Qufu 273165, China
- Xu Ma: School of Software, Qufu Normal University, Qufu 273165, China
- Qingwen Wu: School of Software, Qufu Normal University, Qufu 273165, China
- Jiancheng Ni: School of Software, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng: School of Software, Qufu Normal University, Qufu 273165, China; School of Computer Science and Technology, Anhui University, Hefei 230601, China
14
Wang YT, Wu QW, Gao Z, Ni JC, Zheng CH. MiRNA-disease association prediction via hypergraph learning based on high-dimensionality features. BMC Med Inform Decis Mak 2021; 21:133. [PMID: 33882934 PMCID: PMC8061020 DOI: 10.1186/s12911-020-01320-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/01/2020] [Accepted: 11/09/2020] [Indexed: 11/10/2022]
Abstract
Background: MicroRNAs (miRNAs) have been confirmed to have a close relationship with various complex human diseases. The identification of disease-related miRNAs provides great insights into the underlying pathogenesis of diseases. However, it is still a big challenge to identify which miRNAs are related to diseases. As experimental methods are in general expensive and time-consuming, it is important to develop efficient computational models to discover potential miRNA-disease associations. Methods: This study presents a novel prediction method called HFHLMDA, which is based on high-dimensionality features and hypergraph learning, to reveal the association between diseases and miRNAs. Firstly, the miRNA functional similarity and the disease semantic similarity are integrated to form an informative high-dimensionality feature vector. Then, a hypergraph is constructed by the K-Nearest-Neighbor (KNN) method, in which each miRNA-disease pair and its k most relevant neighbors are linked as one hyperedge to represent the complex relationships among miRNA-disease pairs. Finally, the hypergraph learning model is designed to learn the projection matrix, which is used to calculate the association scores of uncertain miRNA-disease pairs. Result: Compared with four state-of-the-art computational models, HFHLMDA achieved the best results of 92.09% and 91.87% in leave-one-out cross validation and fivefold cross validation, respectively. Moreover, in case studies on Esophageal Neoplasms, Hepatocellular Carcinoma and Breast Neoplasms, 90%, 98%, and 96% of the top 50 predictions have been manually confirmed by previous experimental studies. Conclusion: MiRNAs have complex connections with many human diseases. In this study, we proposed a novel computational model to predict the underlying miRNA-disease associations. All results show that the proposed method is effective for miRNA-disease association prediction.
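The KNN hyperedge construction described in the Methods can be illustrated with a minimal sketch. This is not the HFHLMDA implementation: the function name `knn_hypergraph`, the use of Euclidean distance, and the toy feature vectors are all assumptions made for illustration. Each sample (standing in for a miRNA-disease pair) is linked with its k nearest neighbors into one hyperedge, recorded as a column of an incidence matrix.

```python
import math


def knn_hypergraph(features, k):
    """Build a vertex-by-hyperedge incidence matrix (hypothetical helper).

    Hyperedge e contains sample e plus its k nearest neighbors under
    Euclidean distance, which is an assumed choice of metric.
    """
    n = len(features)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of samples")
    incidence = [[0] * n for _ in range(n)]
    for e, center in enumerate(features):
        # Sort the other samples by distance to the hyperedge's center.
        dists = sorted(
            (math.dist(center, other), v)
            for v, other in enumerate(features)
            if v != e
        )
        members = [e] + [v for _, v in dists[:k]]
        for v in members:
            incidence[v][e] = 1
    return incidence


# Toy feature vectors forming two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
incidence = knn_hypergraph(features, k=1)
```

Every column of `incidence` then has exactly k + 1 nonzero entries, one for the center and one for each neighbor.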
Affiliation(s)
- Yu-Tian Wang: School of Software, Qufu Normal University, Qufu, China
- Qing-Wen Wu: School of Software, Qufu Normal University, Qufu, China
- Zhen Gao: School of Software, Qufu Normal University, Qufu, China
- Jian-Cheng Ni: School of Software, Qufu Normal University, Qufu, China
- Chun-Hou Zheng: School of Computer Science and Technology, Anhui University, Hefei, China; College of Mathematics and System Science, Xinjiang University, Urumqi, China
15
Wu Q, Wang Y, Gao Z, Ni J, Zheng C. MSCHLMDA: Multi-Similarity Based Combinative Hypergraph Learning for Predicting MiRNA-Disease Association. Front Genet 2020; 11:354. [PMID: 32351545 PMCID: PMC7174776 DOI: 10.3389/fgene.2020.00354] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 11/22/2019] [Accepted: 03/23/2020] [Indexed: 12/17/2022]
Abstract
Accumulating biological and clinical evidence has confirmed the important associations between microRNAs (miRNAs) and a variety of human diseases. Predicting disease-related miRNAs is beneficial for understanding the molecular mechanisms of pathological conditions at the miRNA level, and for facilitating the discovery of new biomarkers for the prevention, diagnosis and treatment of complex human diseases. However, the challenge for researchers is to establish methods that can effectively combine different datasets and make reliable predictions. In this work, we propose the method of Multi-Similarity based Combinative Hypergraph Learning for Predicting MiRNA-disease Association (MSCHLMDA). To establish this method, complex features were extracted by two measures for each miRNA-disease pair. Then, the K-nearest neighbor (KNN) and K-means algorithms were used to construct two different hypergraphs. Finally, results from combinative hypergraph learning were used for predicting miRNA-disease associations. To evaluate the prediction performance of our method, leave-one-out cross validation and 5-fold cross validation were implemented, showing that our method had significantly improved prediction performance compared to previously used methods. Moreover, three case studies on different complex human diseases were performed, which further demonstrated the predictive performance of MSCHLMDA. It is anticipated that MSCHLMDA will become an excellent complement to the biomedical research field in the future.
Affiliation(s)
- Qingwen Wu: School of Software, Qufu Normal University, Qufu, China
- Yutian Wang: School of Software, Qufu Normal University, Qufu, China
- Zhen Gao: School of Software, Qufu Normal University, Qufu, China
- Jiancheng Ni: School of Software, Qufu Normal University, Qufu, China
- Chunhou Zheng: School of Software, Qufu Normal University, Qufu, China; School of Computer Science and Technology, Anhui University, Hefei, China
16
Gao Z, Wang YT, Wu QW, Ni JC, Zheng CH. Graph regularized L2,1-nonnegative matrix factorization for miRNA-disease association prediction. BMC Bioinformatics 2020; 21:61. [PMID: 32070280 PMCID: PMC7029547 DOI: 10.1186/s12859-020-3409-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/28/2019] [Accepted: 02/11/2020] [Indexed: 01/24/2023]
Abstract
BACKGROUND: The aberrant expression of microRNAs is closely connected to the occurrence and development of a great many human diseases. To study human diseases, researchers have presented numerous effective computational models. RESULTS: Here, we present a computational framework based on graph Laplacian regularized L2,1-nonnegative matrix factorization (GRL2,1-NMF) for inferring possible human disease-connected miRNAs. First, manually validated disease-connected microRNAs were integrated, and microRNA functional similarity information along with two kinds of disease semantic similarities were calculated. Next, we measured Gaussian interaction profile (GIP) kernel similarities for both diseases and microRNAs. Then, we adopted a preprocessing step, namely, weighted K nearest known neighbours (WKNKN), to decrease the sparsity of the miRNA-disease association matrix. Finally, the GRL2,1-NMF framework was used to predict links between microRNAs and diseases. CONCLUSIONS: The new method (GRL2,1-NMF) achieved AUC values of 0.9280 and 0.9276 in global leave-one-out cross validation (global LOOCV) and five-fold cross validation (5-CV), respectively, showing that GRL2,1-NMF can powerfully discover potential disease-related miRNAs, even if there is no known associated disease.
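The GIP kernel step named in the abstract can be sketched as follows. The abstract does not give the formula, so this uses the form commonly found in the literature, K(i, j) = exp(-γ ‖a_i − a_j‖²) with γ = γ′ divided by the mean squared norm of the interaction profiles; that normalization, the function name `gip_kernel`, and the toy association matrix are assumptions, not details taken from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    `profiles` is a 0/1 association matrix whose rows are the interaction
    profiles (e.g. one row per miRNA). The bandwidth normalization by the
    mean squared row norm is the commonly used convention, assumed here.
    """
    n = len(profiles)
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_norm = sum(sq_norms) / n
    if mean_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between the two profiles.
            dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist)
    return kernel


# Toy miRNA-by-disease association matrix (3 miRNAs, 3 diseases).
assoc = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
mirna_sim = gip_kernel(assoc)
```

Transposing `assoc` before the call would give the corresponding disease-side similarity under the same assumptions.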
Affiliation(s)
- Zhen Gao: School of Software, Qufu Normal University, Qufu, 273165, China
- Yu-Tian Wang: School of Software, Qufu Normal University, Qufu, 273165, China
- Qing-Wen Wu: School of Software, Qufu Normal University, Qufu, 273165, China
- Jian-Cheng Ni: School of Software, Qufu Normal University, Qufu, 273165, China
- Chun-Hou Zheng: School of Software, Qufu Normal University, Qufu, 273165, China
17
Huang Z, Liu L, Gao Y, Shi J, Cui Q, Li J, Zhou Y. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol 2019; 20:202. [PMID: 31594544 PMCID: PMC6781296 DOI: 10.1186/s13059-019-1811-3] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 05/17/2019] [Accepted: 09/03/2019] [Indexed: 01/06/2023]
Abstract
BACKGROUND: A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. RESULTS: Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform systematic comparison among 36 readily available prediction methods. Their overall performances are evaluated with rigorous precision-recall curve analysis, where 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential of performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in up to 16% and 46% of AUPRC augmentations compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods, which is that the prediction results are severely biased toward well-annotated diseases with many associated miRNAs known and cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. CONCLUSION: Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest the future directions for the development of more robust miRNA-disease association predictors.
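To make the AUPRC figures above more concrete, the following sketch computes average precision, a standard step-wise estimate of the area under the precision-recall curve. It is not the benchmark's own evaluation code; the function name `average_precision` and the toy scores are hypothetical, and tied scores are not given any special treatment.

```python
def average_precision(scores, labels):
    """Average precision of a ranking (hypothetical helper).

    `scores` are predicted association scores and `labels` are 1 for known
    positives and 0 otherwise. Precision is averaged over the ranks at
    which a positive is retrieved.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidates from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


# Toy example: positives are retrieved at ranks 1 and 3.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
ap = average_precision(scores, labels)  # (1/1 + 2/3) / 2
```

Because positives are rare among all candidate miRNA-disease pairs, a measure of this kind is far more sensitive to false positives near the top of the ranking than ROC-based AUC.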
Affiliation(s)
- Zhou Huang: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Leibo Liu: Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Yuanxu Gao: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Jiangcheng Shi: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Qinghua Cui: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China; Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Jianwei Li: Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Yuan Zhou: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
18
Wang Y, Nie C, Zang T, Wang Y. Predicting circRNA-Disease Associations Based on circRNA Expression Similarity and Functional Similarity. Front Genet 2019; 10:832. [PMID: 31572444 PMCID: PMC6751509 DOI: 10.3389/fgene.2019.00832] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 06/24/2019] [Accepted: 08/13/2019] [Indexed: 12/19/2022]
Abstract
Circular RNAs (circRNAs) are a novel class of endogenous noncoding RNAs that have well-conserved sequences. Emerging evidence has shown that circRNAs can be novel biomarkers or therapeutic targets for many diseases and play an important role in the development of various pathological conditions. Therefore, identifying potential disease-related circRNAs is helpful in improving the efficiency of finding therapeutic targets for diseases. Here, we propose a computational model (PreCDA) to predict potential circRNA-disease associations. First, we calculated the circRNA expression similarity based on circRNA expression profiles. The circRNA functional similarity is calculated based on cosine similarity, and the disease similarity is used as the dimension of each circRNA vector. The associations between circRNAs and diseases are defined based on the circRNA functional similarity and expression similarity. We constructed a disease-related circRNA association network and used a graph-based recommendation algorithm (PersonalRank) to sort candidate disease-related circRNAs. As a result, PreCDA has an average area under the receiver operating characteristic curve value of 78.15% in predicting candidate disease-related circRNAs. In addition, we discuss the factors that affect the performance of this method and find some unknown circRNAs related to diseases, with several common diseases used as case studies. These results show that PreCDA has good performance in predicting potential circRNA-disease associations and is helpful for the diagnosis and treatment of human diseases.
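The abstract names PersonalRank as the ranking step but does not spell out the update rule. The sketch below uses the commonly described form, a random walk with restart at the query node, so the damping factor `alpha`, the iteration count, and the toy circRNA-disease graph are assumptions rather than PreCDA's actual settings.

```python
def personal_rank(graph, root, alpha=0.85, iterations=50):
    """Random walk with restart on an undirected graph (illustrative only).

    `graph` maps each node to the list of its neighbours. At every step a
    node passes a fraction `alpha` of its score, split evenly, to its
    neighbours, and the remaining 1 - alpha is returned to `root`.
    """
    rank = {node: 0.0 for node in graph}
    rank[root] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                continue
            share = alpha * rank[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        updated[root] += 1.0 - alpha
        rank = updated
    return rank


# Hypothetical bipartite network: diseases d1..d3 and circRNAs c1..c4.
graph = {
    "d1": ["c1", "c2"],
    "d2": ["c2", "c3"],
    "d3": ["c3", "c4"],
    "c1": ["d1"],
    "c2": ["d1", "d2"],
    "c3": ["d2", "d3"],
    "c4": ["d3"],
}
rank = personal_rank(graph, root="d1")
# Candidates are the circRNAs not yet linked to d1, sorted by score.
candidates = sorted(["c3", "c4"], key=rank.get, reverse=True)
```

In this toy network `c3` ranks above `c4`, because every walk from `d1` to `c4` has to pass through `c3` first.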
Affiliation(s)
- Tianyi Zang: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Yadong Wang: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
19
PWCDA: Path Weighted Method for Predicting circRNA-Disease Associations. Int J Mol Sci 2018; 19:ijms19113410. [PMID: 30384427 PMCID: PMC6274797 DOI: 10.3390/ijms19113410] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 10/04/2018] [Revised: 10/25/2018] [Accepted: 10/26/2018] [Indexed: 12/22/2022]
Abstract
CircRNAs have a particular biological structure and have been shown to play important roles in diseases. It is time-consuming and costly to identify circRNA-disease associations by biological experiments. Therefore, it is appealing to develop computational methods for predicting circRNA-disease associations. In this study, we propose a new computational path weighted method for predicting circRNA-disease associations. Firstly, we calculate the functional similarity scores of diseases based on disease-related gene annotations and the semantic similarity scores of circRNAs based on circRNA-related gene ontology. To address missing similarity scores of diseases and circRNAs, we calculate the Gaussian Interaction Profile (GIP) kernel similarity scores for diseases and circRNAs, based on the circRNA-disease associations downloaded from the circR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). Then, we integrate disease functional similarity scores and circRNA semantic similarity scores with their related GIP kernel similarity scores to construct a heterogeneous network made up of three sub-networks: a disease similarity network, a circRNA similarity network and a circRNA-disease association network. Finally, we compute an association score for each circRNA-disease pair based on the paths connecting them in the heterogeneous network to determine whether the pair is associated. We adopt leave-one-out cross validation (LOOCV) and five-fold cross validation to evaluate the performance of our proposed method. In addition, three common diseases, Breast Cancer, Gastric Cancer and Colorectal Cancer, are used for case studies. Experimental results illustrate the reliability and usefulness of our computational method in terms of different validation measures, which indicates that PWCDA can effectively predict potential circRNA-disease associations.