51
Ni J, Li L, Wang Y, Ji C, Zheng C. MDSCMF: Matrix Decomposition and Similarity-Constrained Matrix Factorization for miRNA-Disease Association Prediction. Genes (Basel) 2022; 13:1021. [PMID: 35741782 PMCID: PMC9223216 DOI: 10.3390/genes13061021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/01/2022] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 11/16/2022] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that are related to a number of complicated biological processes, and numerous studies have demonstrated that miRNAs are closely associated with many human diseases. In this study, we present a matrix decomposition and similarity-constrained matrix factorization (MDSCMF) method to predict potential miRNA-disease associations. First, we utilized a matrix decomposition (MD) algorithm to remove outliers from the miRNA-disease association matrix. Then, miRNA similarity was determined by utilizing similarity kernel fusion (SKF) to integrate miRNA functional similarity and Gaussian interaction profile (GIP) kernel similarity, and disease similarity was determined by utilizing SKF to integrate disease semantic similarity and GIP kernel similarity. Furthermore, we added L2 regularization terms and similarity constraint terms to non-negative matrix factorization to form a similarity-constrained matrix factorization (SCMF) algorithm, which was applied to make predictions. MDSCMF achieved AUC values of 0.9488, 0.9540, and 0.8672 based on fivefold cross-validation (5-CV), global leave-one-out cross-validation (global LOOCV), and local leave-one-out cross-validation (local LOOCV), respectively. Case studies on three common human diseases were also implemented to demonstrate the prediction ability of MDSCMF. All experimental results confirmed that MDSCMF was effective in predicting underlying associations between miRNAs and diseases.
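The Gaussian interaction profile (GIP) kernel similarity named in the abstract above can be illustrated with a short sketch. This is a generic illustration and not the MDSCMF authors' code: the function name `gip_kernel`, the toy association matrix, and the bandwidth convention (one over the mean squared norm of the interaction profiles) are assumptions.

```python
import numpy as np


def gip_kernel(assoc):
    """Gaussian interaction profile kernel between the rows of `assoc`.

    Each row is the binary interaction profile of one entity (for example,
    one miRNA across all diseases). The name and the bandwidth choice are
    illustrative assumptions, not taken from the cited paper.
    """
    assoc = np.asarray(assoc, dtype=float)
    sq_norms = (assoc ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    # Assumed bandwidth: normalise by the average squared profile norm.
    gamma = 1.0 / mean_sq if mean_sq > 0 else 1.0
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * assoc @ assoc.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negatives
    return np.exp(-gamma * sq_dists)


# Toy miRNA-by-disease association matrix (hypothetical data).
A = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 0]])
K_mirna = gip_kernel(A)      # similarity between miRNAs (rows)
K_disease = gip_kernel(A.T)  # similarity between diseases (columns)
```

Identical profiles receive similarity 1, and the similarity decays as the profiles diverge; a fusion step such as SKF would then combine this kernel with the other similarity sources.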
Affiliation(s)
- Jiancheng Ni
  - Network Information Center, Qufu Normal University, Qufu 273165, China
- Lei Li
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Yutian Wang
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Cunmei Ji
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng
  - School of Artificial Intelligence, Anhui University, Hefei 230601, China
52
Zhong J, Zhou W, Kang J, Fang Z, Xie M, Xiao Q, Peng W. DNRLCNN: A CNN Framework for Identifying MiRNA-Disease Associations Using Latent Feature Matrix Extraction with Positive Samples. Interdiscip Sci 2022; 14:607-622. [PMID: 35428965 DOI: 10.1007/s12539-022-00509-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/09/2021] [Revised: 02/24/2022] [Accepted: 03/01/2022] [Indexed: 06/14/2023]
Abstract
Emerging evidence indicates that miRNAs have strong relationships with many human diseases. Investigating these associations will help elucidate the activities of miRNAs and pathogenesis mechanisms, and provide new opportunities for disease diagnosis and drug discovery. Therefore, it is important to identify potential associations between miRNAs and diseases. The existing databases of miRNA-disease associations (MDAs) only provide the known MDAs, which can be regarded as positive samples. However, the unknown MDAs cannot be regarded as reliable negative samples. To deal with this uncertainty, we proposed a convolutional neural network (CNN) framework, named DNRLCNN, which predicts MDAs based on a latent feature matrix extracted from positive samples only. First, by using only the positive samples in the calculation, we captured a latent feature matrix for the complex interactions between miRNAs and diseases in a low-dimensional space. Then, we constructed a feature vector for each miRNA-disease pair based on this feature representation. Finally, we applied a modified CNN to the feature vectors to predict MDAs. As a result, our model achieves better performance than other state-of-the-art CNN-based methods in fivefold cross-validation on both the miRNA-disease association prediction task (average AUC of 0.9030) and the miRNA-phenotype association prediction task (average AUC of 0.9442). In addition, we carried out case studies on two human diseases: all of the top-50 predicted miRNAs for lung neoplasms are confirmed by the HMDD v3.2 and dbDEMC 2.0 databases, and 98% of the top-50 predicted miRNAs for heart failure are confirmed. The experimental results show that our model is capable of inferring potential disease-related miRNAs.
Affiliation(s)
- Jiancheng Zhong
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Wubin Zhou
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410083, China
- Jiedong Kang
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410083, China
- Zhuo Fang
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410083, China
- Minzhu Xie
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410083, China
- Qiu Xiao
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410083, China
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China
53
Peng L, Yang J, Wang M, Zhou L. Editorial: Machine Learning-Based Methods for RNA Data Analysis. Front Genet 2022; 13:828575. [PMID: 35692815 PMCID: PMC9175173 DOI: 10.3389/fgene.2022.828575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Accepted: 04/12/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Lihong Peng
  - College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
  - School of Computer, Hunan University of Technology, Zhuzhou, China
- Minxian Wang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Liqian Zhou
  - College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
  - *Correspondence: Liqian Zhou,
54
Ji BY, Pan LR, Zhou JR, You ZH, Peng SL. SMMDA: Predicting miRNA-Disease Associations by Incorporating Multiple Similarity Profiles and a Novel Disease Representation. Biology (Basel) 2022; 11:777. [PMID: 35625505 PMCID: PMC9138858 DOI: 10.3390/biology11050777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 05/17/2022] [Accepted: 05/17/2022] [Indexed: 12/24/2022]
Abstract
Simple Summary: Predicting possible associations between miRNAs and diseases would provide new perspectives on disease diagnosis, pathogenesis, and gene therapy. In this work, considering the limited accessibility, high time consumption and high cost of traditional biological research, we presented a novel computational method called SMMDA that incorporates multiple similarity profiles and a novel disease representation to accelerate the identification of potential miRNA-disease associations. SMMDA is intended to be useful for predicting associations between miRNAs and diseases, and to be effective for the prevention, diagnosis, treatment and prognosis of human diseases.
Abstract: Increasing evidence has suggested that microRNAs (miRNAs) are significant in research on human diseases. Predicting possible associations between miRNAs and diseases would provide new perspectives on disease diagnosis, pathogenesis, and gene therapy. However, because traditional in vitro studies are intrinsically time-consuming and expensive, there is an urgent need for a computational approach that would allow researchers to identify potential associations between miRNAs and diseases for further research. In this paper, we presented a novel computational method called SMMDA to predict potential miRNA-disease associations. In particular, SMMDA first utilized a new disease representation method (MeSHHeading2vec) based on a network embedding algorithm and then fused it with Gaussian interaction profile kernel similarity information of miRNAs and diseases, disease semantic similarity, and miRNA functional similarity. Second, SMMDA utilized a deep autoencoder network to transform the original features further to achieve a better feature representation. Finally, the ensemble learning model XGBoost was used as the underlying training and prediction method for SMMDA. In the results, SMMDA achieved a mean accuracy of 86.68% with a standard deviation of 0.42% and a mean AUC of 94.07% with a standard deviation of 0.23%, outperforming many previous works. Moreover, we also compared the predictive ability of SMMDA with different classifiers and different feature descriptors. In the case studies of three common human diseases, 47 (esophageal neoplasms), 48 (breast neoplasms), and 48 (colon neoplasms) of the top 50 candidate miRNAs were successfully verified by two other databases. The experimental results showed that SMMDA has a reliable ability to predict potential miRNA-disease associations. Therefore, it is anticipated that SMMDA could be an effective tool for biomedical researchers.
Affiliation(s)
- Bo-Ya Ji
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410200, China
- Liang-Rui Pan
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410200, China
- Ji-Ren Zhou
  - College of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
- Zhu-Hong You
  - College of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
  - Correspondence: (Z.-H.Y.); (S.-L.P.)
- Shao-Liang Peng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410200, China
  - Correspondence: (Z.-H.Y.); (S.-L.P.)
55
Liu W, Sun X, Yang L, Li K, Yang Y, Fu X. NSCGRN: a network structure control method for gene regulatory network inference. Brief Bioinform 2022; 23:6585392. [PMID: 35554485 DOI: 10.1093/bib/bbac156] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/31/2022] [Revised: 03/27/2022] [Accepted: 04/06/2022] [Indexed: 01/18/2023] Open
Abstract
Accurate inference of gene regulatory networks (GRNs) is an essential premise for understanding pathogenesis and curing diseases. Various computational methods have been developed for GRN inference, but the identification of redundant regulation remains a challenge faced by researchers. Although combining global and local topology can identify and reduce redundant regulations, the topologies' specific forms and cooperation modes are unclear and real regulations may be sacrificed. Here, we propose a network structure control method [network-structure-controlling-based GRN inference method (NSCGRN)] that stipulates the global and local topology's specific forms and cooperation mode. The method is carried out in a cooperative mode of 'global topology dominates and local topology refines'. Global topology requires layering and sparseness of the network, and local topology requires consistency of the subgraph association pattern with the network motifs (fan-in, fan-out, cascade and feedforward loop). Specifically, an ordered gene list is obtained by network topology centrality sorting. A Bernaola-Galvan mutation detection algorithm applied to the list gives the hierarchy of GRNs to control the upstream and downstream regulations within the global scope. Finally, four network motifs are integrated into the hierarchy to optimize local complex regulations and form a cooperative mode where global and local topologies play the dominant and refined roles, respectively. NSCGRN is compared with state-of-the-art methods on three different datasets (six networks in total), and it achieves the highest F1 and Matthews correlation coefficient. Experimental results show its unique advantages in GRN inference.
Affiliation(s)
- Wei Liu
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Xingen Sun
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Li Yang
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Kaiwen Li
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
- Yu Yang
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Xiangzheng Fu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China
56
Peng L, Yang C, Huang L, Chen X, Fu X, Liu W. RNMFLP: Predicting circRNA-disease associations based on robust nonnegative matrix factorization and label propagation. Brief Bioinform 2022; 23:6582881. [PMID: 35534179 DOI: 10.1093/bib/bbac155] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 01/21/2022] [Revised: 03/09/2022] [Accepted: 04/06/2022] [Indexed: 12/22/2022] Open
Abstract
Circular RNAs (circRNAs) are a class of structurally stable endogenous noncoding RNA molecules. Increasing studies indicate that circRNAs play vital roles in human diseases. However, validating disease-related circRNAs in vivo is costly and time-consuming. A reliable and effective computational method to identify circRNA-disease associations deserves further studies. In this study, we propose a computational method called RNMFLP that combines robust nonnegative matrix factorization (RNMF) and label propagation algorithm (LP) to predict circRNA-disease associations. First, to reduce the impact of false negative data, the original circRNA-disease adjacency matrix is updated by matrix multiplication using the integrated circRNA similarity and the disease similarity information. Subsequently, the RNMF algorithm is used to obtain the restricted latent space to capture potential circRNA-disease pairs from the association matrix. Finally, the LP algorithm is utilized to predict more accurate circRNA-disease associations from the integrated circRNA similarity network and integrated disease similarity network, respectively. Fivefold cross-validation of four datasets shows that RNMFLP is superior to the state-of-the-art methods. In addition, case studies on lung cancer, hepatocellular carcinoma and colorectal cancer further demonstrate the reliability of our method to discover disease-related circRNAs.
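The label propagation step mentioned in the abstract above can be sketched in a few lines. This is a generic label propagation routine on a similarity network and not the RNMFLP update rule itself: the function name `label_propagation`, the retention weight `alpha`, the row normalisation, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def label_propagation(sim, labels, alpha=0.5, n_iter=100, tol=1e-6):
    """Propagate known association labels over a similarity network.

    Iterates F <- alpha * W @ F + (1 - alpha) * Y, where W is the
    row-normalised similarity matrix and Y holds the known labels.
    All names and defaults here are hypothetical.
    """
    sim = np.asarray(sim, dtype=float)
    labels = np.asarray(labels, dtype=float)
    row_sums = sim.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    walk = sim / row_sums
    scores = labels.copy()
    for _ in range(n_iter):
        updated = alpha * walk @ scores + (1 - alpha) * labels
        converged = np.abs(updated - scores).max() < tol
        scores = updated
        if converged:
            break
    return scores


# Toy circRNA similarity (3 circRNAs) and known circRNA-disease labels.
sim = np.array([[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
known = np.array([[1, 0],
                  [0, 0],
                  [0, 1]])
scores = label_propagation(sim, known)
```

In this toy example the second circRNA has no known association, but it inherits a positive score for the first disease from its similar neighbour; the same routine could be run on the disease similarity network with the transposed label matrix.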
Affiliation(s)
- Li Peng
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
  - Hunan Key Laboratory for Service computing and Novel Software Technology
- Cheng Yang
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Li Huang
  - Academy of Arts and Design, Tsinghua University, 10084, Beijing, China
  - The Future Laboratory, Tsinghua University, 10084, Beijing, China
- Xiang Chen
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Xiangzheng Fu
  - College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Wei Liu
  - College of Information Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
57
Lou Z, Cheng Z, Li H, Teng Z, Liu Y, Tian Z. Predicting miRNA-disease associations via learning multimodal networks and fusing mixed neighborhood information. Brief Bioinform 2022; 23:6582005. [PMID: 35524503 DOI: 10.1093/bib/bbac159] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 02/10/2022] [Revised: 03/29/2022] [Accepted: 04/10/2022] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION: In recent years, a large number of biological experiments have strongly shown that miRNAs play an important role in understanding disease pathogenesis. The discovery of miRNA-disease associations is beneficial for disease diagnosis and treatment. Since inferring these associations through biological experiments is time-consuming and expensive, researchers have sought to identify the associations utilizing computational approaches. Graph Convolutional Networks (GCNs), which exhibit excellent performance in link prediction problems, have been successfully used in miRNA-disease association prediction. However, GCNs only consider 1st-order neighborhood information at one layer but fail to capture information from high-order neighbors to learn miRNA and disease representations through information propagation. Therefore, how to aggregate information from high-order neighborhoods effectively and explicitly is still challenging.
RESULTS: To address this challenge, we propose a novel method called mixed neighborhood information for miRNA-disease association (MINIMDA), which fuses mixed high-order neighborhood information of miRNAs and diseases in multimodal networks. First, MINIMDA constructs the integrated miRNA similarity network and the integrated disease similarity network from their respective multisource information. Then, the embedding representations of miRNAs and diseases are obtained by fusing mixed high-order neighborhood information from the multimodal networks, namely the integrated miRNA similarity network, the integrated disease similarity network and the miRNA-disease association network. Finally, we concatenate the multimodal embedding representations of miRNAs and diseases and feed them into a multilayer perceptron (MLP) to predict their underlying associations. Extensive experimental results show that MINIMDA is superior to other state-of-the-art methods overall. Moreover, the outstanding performance on case studies for esophageal cancer, colon tumor and lung cancer further demonstrates the effectiveness of MINIMDA.
AVAILABILITY AND IMPLEMENTATION: https://github.com/chengxu123/MINIMDA and http://120.79.173.96/.
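The idea of mixing neighborhood information of several orders, as opposed to using only 1st-order neighbors per layer, can be sketched as follows. This is a simplified, weight-free illustration and not the MINIMDA architecture: the function names, the symmetric normalisation with self-loops, and the toy graph are assumptions, and a real model would apply learned weights and nonlinearities to each block.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalise an adjacency matrix after adding self-loops."""
    adj = np.asarray(adj, dtype=float)
    adj = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def mixed_hop_features(adj, feats, max_hop=2):
    """Concatenate 0-hop, 1-hop, ..., max_hop-hop aggregated features.

    Block k holds A_hat^k @ X, so each node keeps separate views of
    its own features and of progressively larger neighborhoods.
    """
    a_hat = normalize_adjacency(adj)
    hop = np.asarray(feats, dtype=float)
    blocks = [hop]
    for _ in range(max_hop):
        hop = a_hat @ hop
        blocks.append(hop)
    return np.concatenate(blocks, axis=1)


# Toy path graph 0 - 1 - 2 with one-hot node features (hypothetical data).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
feats = np.eye(3)
mixed = mixed_hop_features(adj, feats, max_hop=2)
```

Here node 0 sees nothing of node 2 in the 1-hop block, but picks up a nonzero contribution from it in the 2-hop block, which is the kind of high-order signal the abstract refers to.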
Affiliation(s)
- Zhengzheng Lou
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhaoxu Cheng
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Hui Li
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhixia Teng
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
- Yang Liu
  - Departments of Cerebrovascular Diseases, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Zhen Tian
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
58
Liu P, Luo J, Chen X. miRCom: Tensor Completion Integrating Multi-View Information to Deduce the Potential Disease-Related miRNA-miRNA Pairs. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1747-1759. [PMID: 33180730 DOI: 10.1109/tcbb.2020.3037331] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
MicroRNAs (miRNAs) are capable of regulating gene expression synergistically in combination and play a key role in various biological processes associated with the initiation and development of human diseases, which indicates that comprehending the synergistic molecular mechanism of miRNAs may facilitate understanding the pathogenesis of diseases or even overcoming them. However, most existing computational methods have an incomplete understanding of the synergistic effect of miRNAs on the pathogenesis of complex diseases, or are hard to extend to large-scale prediction of synergistic miRNA combinations for different diseases. In this article, we propose a novel tensor completion framework integrating multi-view miRNA and disease information, called miRCom, for the discovery of potential disease-associated miRNA-miRNA pairs. We first construct an incomplete three-order association tensor and several types of similarity matrices based on existing biological knowledge. Then, we formulate an objective function by performing the factorizations of the coupled tensor and matrices simultaneously. Finally, we build an optimization scheme by adopting the ADMM algorithm. After that, we obtain the prediction of miRNA-miRNA pairs for different diseases from the full tensor. Comparative experiments with other approaches verified that miRCom effectively identifies potential disease-related miRNA-miRNA pairs. Moreover, case study results further illustrated that miRNA-miRNA pairs have more biological significance and prognostic value than single miRNAs.
59
Lombardo SD, Wangsaputra IF, Menche J, Stevens A. Network Approaches for Charting the Transcriptomic and Epigenetic Landscape of the Developmental Origins of Health and Disease. Genes (Basel) 2022; 13:764. [PMID: 35627149 PMCID: PMC9141211 DOI: 10.3390/genes13050764] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/04/2022] [Revised: 04/04/2022] [Accepted: 04/13/2022] [Indexed: 02/04/2023] Open
Abstract
The early developmental phase is of critical importance for human health and disease later in life. To decipher the molecular mechanisms at play, current biomedical research is increasingly relying on large quantities of diverse omics data. The integration and interpretation of the different datasets pose a critical challenge towards the holistic understanding of the complex biological processes that are involved in early development. In this review, we outline the major transcriptomic and epigenetic processes and the respective datasets that are most relevant for studying the periconceptional period. We cover both basic data processing and analysis steps, as well as more advanced data integration methods. A particular focus is given to network-based methods. Finally, we review the medical applications of such integrative analyses.
Affiliation(s)
- Salvo Danilo Lombardo
  - Max Perutz Labs, Department of Structural and Computational Biology, University of Vienna, 1030 Vienna, Austria
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1030 Vienna, Austria
- Ivan Fernando Wangsaputra
  - Maternal and Fetal Health Research Group, Division of Developmental Biology and Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK
- Jörg Menche
  - Max Perutz Labs, Department of Structural and Computational Biology, University of Vienna, 1030 Vienna, Austria
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1030 Vienna, Austria
  - Faculty of Mathematics, University of Vienna, 1030 Vienna, Austria
- Adam Stevens
  - Maternal and Fetal Health Research Group, Division of Developmental Biology and Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK
60
Long Y, Wu M, Liu Y, Fang Y, Kwoh CK, Chen J, Luo J, Li X. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics 2022; 38:2254-2262. [PMID: 35171981 DOI: 10.1093/bioinformatics/btac100] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 10/20/2021] [Revised: 01/15/2022] [Accepted: 02/14/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Graphs or networks are widely utilized to model the interactions between different entities (e.g. proteins, drugs, etc.) for biomedical applications. Predicting potential interactions/links in biomedical networks is important for understanding the pathological mechanisms of various complex human diseases, as well as screening compound targets for drug discovery. Graph neural networks (GNNs) have been utilized for link prediction in various biomedical networks, and they rely on node features extracted from different data sources, e.g. sequence, structure and network data. However, it is challenging to effectively integrate these data sources and automatically extract features for different link prediction tasks.
RESULTS: In this article, we propose a novel Pre-Training Graph Neural Networks-based framework named PT-GNN to integrate different data sources for link prediction in biomedical networks. First, we design expressive deep learning methods [e.g. convolutional neural network and graph convolutional network (GCN)] to learn features for individual nodes from sequence and structure data. Second, we further propose a GCN-based encoder to effectively refine the node features by modelling the dependencies among nodes in the network. Third, the node features are pre-trained based on graph reconstruction tasks. The pre-trained features can be used for model initialization in downstream tasks. Extensive experiments have been conducted on two critical link prediction tasks, i.e. synthetic lethality (SL) prediction and drug-target interaction (DTI) prediction. Experimental results demonstrate that PT-GNN outperforms the state-of-the-art methods for SL prediction and DTI prediction. In addition, the pre-trained features help improve the performance and reduce the training time of existing models.
AVAILABILITY AND IMPLEMENTATION: Python codes and dataset are available at: https://github.com/longyahui/PT-GNN.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yahui Long
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research, Singapore, Singapore
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
- Yong Liu
  - Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Singapore, Singapore
- Yuan Fang
  - School of Information Systems, Singapore Management University, 178902 Singapore, Singapore
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Jinmiao Chen
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research, Singapore, Singapore
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiaoli Li
  - Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
61
Liu W, Lin H, Huang L, Peng L, Tang T, Zhao Q, Yang L. Identification of miRNA-disease associations via deep forest ensemble learning based on autoencoder. Brief Bioinform 2022; 23:6553934. [PMID: 35325038 DOI: 10.1093/bib/bbac104] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 01/11/2022] [Revised: 02/18/2022] [Accepted: 03/01/2022] [Indexed: 12/31/2022] Open
Abstract
Increasing evidence shows that the occurrence of complex human diseases is closely related to microRNA (miRNA) variation and imbalance. For this reason, predicting disease-related miRNAs is essential for the diagnosis and treatment of complex human diseases. Although some current computational methods can effectively predict potential disease-related miRNAs, the accuracy of prediction should be further improved. In our study, a new computational method via deep forest ensemble learning based on autoencoder (DFELMDA) is proposed to predict miRNA-disease associations. Specifically, a new feature representation strategy is proposed to obtain different types of feature representations (from miRNA and disease) for each miRNA-disease association. Then, two types of low-dimensional feature representations are extracted by two deep autoencoders for predicting miRNA-disease associations. Finally, two prediction scores of the miRNA-disease associations are obtained by the deep random forest and combined to determine the final results. DFELMDA is compared with several classical methods on the Human microRNA Disease Database (HMDD) dataset. Results reveal that the performance of this method is superior. The area under the receiver operating characteristic curve (AUC) values obtained by DFELMDA through 5-fold and 10-fold cross-validation are 0.9552 and 0.9560, respectively. In addition, case studies on colon, breast and lung tumors further demonstrate the excellent ability of DFELMDA to predict disease-associated miRNAs. Performance analysis shows that DFELMDA can be used as an effective computational tool for predicting miRNA-disease associations.
Affiliation(s)
- Wei Liu
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Hui Lin
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Peng
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, China
- Ting Tang
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Li Yang
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
62
|
Yu L, Zheng Y, Ju B, Ao C, Gao L. Research progress of miRNA-disease association prediction and comparison of related algorithms. Brief Bioinform 2022; 23:6542222. [PMID: 35246678 DOI: 10.1093/bib/bbac066] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/25/2021] [Revised: 01/30/2022] [Accepted: 02/08/2022] [Indexed: 11/13/2022]
Abstract
With an in-depth understanding of noncoding ribonucleic acid (RNA), many studies have shown that microRNA (miRNA) plays an important role in human diseases. Because traditional biological experiments are time-consuming and laborious, new computational methods have recently been developed to predict associations between miRNA and diseases. In this review, we collected various miRNA-disease association prediction models proposed in recent years and used two common data sets to evaluate the performance of the prediction models. First, we systematically summarized the commonly used databases and similarity data for predicting miRNA-disease associations, and then divided the various computational models into four categories for summary and detailed introduction. In this study, two independent datasets (D5430 and D6088) were compiled to systematically evaluate 11 publicly available prediction tools for miRNA-disease associations. The experimental results indicate that the methods based on information dissemination and the method based on scoring function require shorter running time. The method based on matrix transformation often requires a longer running time, but its overall prediction result is better than that of the previous two methods. We hope that this summary of work related to miRNA and disease will provide comprehensive knowledge for predicting the relationship between miRNA and disease and contribute to more advanced computational tools in the future.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Yujia Zheng
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Bingyi Ju
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Chunyan Ao
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
|
63
|
Gao Z, Wang YT, Wu QW, Li L, Ni JC, Zheng CH. A New Method Based on Matrix Completion and Non-Negative Matrix Factorization for Predicting Disease-Associated miRNAs. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:763-772. [PMID: 32991287 DOI: 10.1109/tcbb.2020.3027444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Numerous studies have shown that microRNAs are associated with the occurrence and development of human diseases. Thus, studying disease-associated miRNAs is significantly valuable to the prevention, diagnosis and treatment of diseases. In this paper, we proposed a novel method based on matrix completion and non-negative matrix factorization (MCNMF) for predicting disease-associated miRNAs. Due to the information inadequacy on miRNA similarities and disease similarities, we calculated the latter via two models, and introduced the Gaussian interaction profile kernel similarity. In addition, matrix completion (MC) was employed to further replenish the miRNA and disease similarities to improve the prediction performance. To reduce the sparsity of the miRNA-disease association matrix, the weighted K nearest neighbor (WKNKN) method was used as a pre-processing step. We also utilized non-negative matrix factorization (NMF) with a dual L2,1-norm, graph Laplacian regularization, and Tikhonov regularization to effectively avoid overfitting during prediction. Finally, several experiments and a case study were implemented to evaluate the effectiveness and performance of the proposed MCNMF model. The results indicated that our method could reliably and effectively predict disease-associated miRNAs.
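The Gaussian interaction profile (GIP) kernel similarity mentioned in this abstract, and in several others in this list, can be illustrated with a minimal sketch. It assumes the commonly used form K(i, j) = exp(-γ‖A_i − A_j‖²), with γ equal to a bandwidth γ' divided by the mean squared norm of the interaction profiles. The function name `gip_kernel`, the default `gamma_prime = 1.0`, and the toy matrix are illustrative assumptions, not details taken from the MCNMF paper.

```python
import numpy as np

def gip_kernel(assoc: np.ndarray, gamma_prime: float = 1.0) -> np.ndarray:
    """Gaussian interaction profile kernel between the rows of `assoc`.

    Each row of `assoc` is treated as the interaction profile of one entity
    (e.g. one miRNA across all diseases). `gamma_prime` is an assumed default.
    """
    profiles = assoc.astype(float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)

# Toy association matrix (hypothetical): 3 miRNAs (rows) x 3 diseases (columns).
A = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 0]])
K_mirna = gip_kernel(A)      # miRNA-miRNA similarity from row profiles
K_disease = gip_kernel(A.T)  # disease-disease similarity from column profiles
```

Identical profiles receive similarity 1, and similarity decays with the distance between profiles, which is why the kernel can stand in where functional or semantic similarity is missing.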
|
64
|
A miRNA-Disease Association Identification Method Based on Reliable Negative Sample Selection and Improved Single-Hidden Layer Feedforward Neural Network. INFORMATION 2022. [DOI: 10.3390/info13030108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]
Abstract
miRNAs are a category of important endogenous non-coding small RNAs and are ubiquitous in eukaryotes. They are widely involved in the regulatory process of post-transcriptional gene expression and play a critical part in the development of human diseases. With recent advancements in big data technology, using bioinformatics methods to identify causative miRNAs has become a hot spot. In this paper, a method called RNSSLFN is proposed to identify miRNA-disease associations by reliable negative sample selection and an improved single-hidden layer feedforward neural network (SLFN). It involves, firstly, obtaining integrated similarity for miRNAs and diseases; next, selecting reliable negative samples from unknown miRNA-disease associations by distinguishing up-regulated or down-regulated miRNAs; then, introducing an improved SLFN to solve the prediction task. The experimental results on the latest data set, HMDD v3.2, under 5-fold cross-validation (CV) show that the average AUC and AUPR of RNSSLFN reach 0.9316 and 0.9065, respectively, which are superior to those of the other three state-of-the-art methods. Furthermore, in the case studies of 10 common cancers, more than 70% of the top 30 predicted miRNA-disease association pairs are verified in the databases, which further confirms the reliability and effectiveness of the RNSSLFN model. Overall, RNSSLFN shows considerable potential and broad prospects for predicting miRNA-disease associations.
|
65
|
Yu L, Zheng Y, Gao L. MiRNA-disease association prediction based on meta-paths. Brief Bioinform 2022; 23:6501422. [PMID: 35018405 DOI: 10.1093/bib/bbab571] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 11/03/2021] [Revised: 12/02/2021] [Accepted: 12/11/2021] [Indexed: 01/09/2023]
Abstract
Since miRNAs can participate in the posttranscriptional regulation of gene expression, they may provide ideas for the development of new drugs or become new biomarkers for drug targets or disease diagnosis. In this work, we propose an miRNA-disease association prediction method based on meta-paths (MDPBMP). First, an miRNA-disease-gene heterogeneous information network was constructed, and seven symmetrical meta-paths were defined according to different semantics. After constructing the initial feature vector for the node, the vector information carried by all nodes on the meta-path instance is extracted and aggregated to update the feature vector of the starting node. Then, the vector information obtained by the nodes on different meta-paths is aggregated. Finally, miRNA and disease embedding feature vectors are used to calculate their associated scores. Compared with the other methods, MDPBMP obtained the highest AUC value of 0.9214. Among the top 50 predicted miRNAs for lung neoplasms, esophageal neoplasms, colon neoplasms and breast neoplasms, 49, 48, 49 and 50 have been verified. Furthermore, for breast neoplasms, we deleted all the known associations between breast neoplasms and miRNAs from the training set. These results also show that for new diseases without known related miRNA information, our model can predict their potential miRNAs. Code and data are available at https://github.com/LiangYu-Xidian/MDPBMP.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an 710071, P.R. China
- Yujia Zheng
- School of Computer Science and Technology, Xidian University, Xi'an 710071, P.R. China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an 710071, P.R. China
|
66
|
Kong H, Sun ML, Zhang XA, Wang XQ. Crosstalk Among circRNA/lncRNA, miRNA, and mRNA in Osteoarthritis. Front Cell Dev Biol 2022; 9:774370. [PMID: 34977024 PMCID: PMC8714905 DOI: 10.3389/fcell.2021.774370] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 09/11/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022]
Abstract
Osteoarthritis (OA) is a pervasive joint disease, and the incidence and mortality of OA are increasing, causing many adverse effects on people's lives. Therefore, it is vital to identify new biomarkers and therapeutic targets for the clinical diagnosis and treatment of OA. ncRNA is non-protein-coding RNA that is not translated into proteins but participates in protein translation and performs biological functions at the RNA level. Many studies have found that miRNA, lncRNA, and circRNA are closely related to the course of OA and play important regulatory roles in transcription, post-transcription, and post-translation, so they can be used as biological targets for the prevention, diagnosis, and treatment of OA. In this review, we summarized and described the various roles of different types of miRNA, lncRNA, and circRNA in OA, the roles of different lncRNA/circRNA-miRNA-mRNA axes in OA, and the possible prospects of these ncRNAs in clinical application.
Affiliation(s)
- Hui Kong
- College of Kinesiology, Shenyang Sport University, Shenyang, China
- Ming-Li Sun
- College of Kinesiology, Shenyang Sport University, Shenyang, China
- Xin-An Zhang
- College of Kinesiology, Shenyang Sport University, Shenyang, China
- Xue-Qiang Wang
- Department of Sport Rehabilitation, Shanghai University of Sport, Shanghai, China; Department of Rehabilitation Medicine, Shanghai Shangti Orthopaedic Hospital, Shanghai, China
|
67
|
Predicting miRNA-Disease Association Based on Neural Inductive Matrix Completion with Graph Autoencoders and Self-Attention Mechanism. Biomolecules 2022; 12:biom12010064. [PMID: 35053212 PMCID: PMC8774034 DOI: 10.3390/biom12010064] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 12/14/2021] [Revised: 12/29/2021] [Accepted: 12/31/2021] [Indexed: 02/06/2023]
Abstract
Many studies have shown that microRNAs (miRNAs) are associated with many human diseases. Therefore, it is essential to predict potential miRNA-disease associations for understanding disease pathogenesis and treatment. Numerous machine learning and deep learning approaches have been applied to this problem. In this paper, we propose a Neural Inductive Matrix completion-based method with Graph Autoencoders (GAE) and Self-Attention mechanism for miRNA-disease associations prediction (NIMGSA). Some previous works based on matrix completion ignore the importance of the label propagation procedure for inferring miRNA-disease associations, while others cannot integrate matrix completion and label propagation effectively. Unlike previous studies, NIMGSA unifies inductive matrix completion and label propagation via a neural network architecture, through the collaborative training of two graph autoencoders. This neural inductive matrix completion-based method is also an implementation of the self-attention mechanism for miRNA-disease associations prediction. This end-to-end framework can strengthen the robustness and precision of both matrix completion and label propagation. Cross validations indicate that NIMGSA outperforms current miRNA-disease prediction methods. Case studies demonstrate that NIMGSA is capable of detecting potential miRNA-disease associations.
|
68
|
Wang Y, Chen L, Jo J, Wang Y. Joint t-SNE for Comparable Projections of Multiple High-Dimensional Datasets. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:623-632. [PMID: 34587021 DOI: 10.1109/tvcg.2021.3114765] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
We present Joint t-Stochastic Neighbor Embedding (Joint t-SNE), a technique to generate comparable projections of multiple high-dimensional datasets. Although t-SNE has been widely employed to visualize high-dimensional datasets from various domains, it is limited to projecting a single dataset. When a series of high-dimensional datasets, such as datasets changing over time, is projected independently using t-SNE, misaligned layouts are obtained. Even items with identical features across datasets are projected to different locations, making the technique unsuitable for comparison tasks. To tackle this problem, we introduce edge similarity, which captures the similarities between two adjacent time frames based on the Graphlet Frequency Distribution (GFD). We then integrate a novel loss term into the t-SNE loss function, which we call vector constraints, to preserve the vectors between projected points across the projections, allowing these points to serve as visual landmarks for direct comparisons between projections. Using synthetic datasets whose ground-truth structures are known, we show that Joint t-SNE outperforms existing techniques, including Dynamic t-SNE, in terms of local coherence error, Kullback-Leibler divergence, and neighborhood preservation. We also showcase a real-world use case to visualize and compare the activation of different layers of a neural network.
|
69
|
Abstract
MicroRNAs (miRNAs) are small noncoding elements that play essential roles in the posttranscriptional regulation of biochemical processes. miRNAs recognize and target multiple mRNAs; therefore, investigating miRNA dysregulation is an indispensable strategy to understand pathological conditions and to design innovative drugs. Targeting miRNAs in diseases improves the outcomes of several therapeutic strategies; thus, the present study highlights miRNA targeting methods through experimental assays and bioinformatics tools. The first part of this review focuses on experimental miRNA targeting approaches for elucidating key biochemical pathways. A growing body of evidence about the miRNA world reveals that it is not possible to uncover these molecules' structural and functional characteristics related to biological processes with a deterministic approach. Instead, a systemic point of view is needed to truly understand the natural complexity of the interactions and regulations that miRNAs present. This task depends heavily on both computational and experimental capabilities. Fortunately, several miRNA bioinformatics tools catering to nonexperts are available to complement wet-lab approaches. For this purpose, this work provides recent research and information about computational tools for miRNA targeting research.
Affiliation(s)
- Hossein Ghanbarian
- Biotechnology Department & Cellular and Molecular Biology Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mehmet Taha Yıldız
- Division of Molecular Medicine, Hamidiye Institute of Health Sciences, University of Health Sciences-Turkey, Istanbul, Turkey
- Yusuf Tutar
- Division of Biochemistry, Department of Basic Pharmaceutical Sciences, Hamidiye Faculty of Pharmacy & Division of Molecular Medicine, Hamidiye Institute of Health Sciences, University of Health Sciences-Turkey, Istanbul, Turkey.
|
70
|
Zhang S, Li J, Zhou W, Li T, Zhang Y, Wang J. Higher-Order Proximity-Based MiRNA-Disease Associations Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:501-512. [PMID: 32750847 DOI: 10.1109/tcbb.2020.2994971] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
MiRNA-disease association prediction plays an important role in identifying human disease-related miRNAs. This approach is helpful not only for formulating individualized diagnosis schemes, but also for understanding the pathogenesis of diseases. Many studies have focused on enhancing the prediction performance using explicit side information, such as miRNA functional similarity and disease semantic similarity. The existing approaches, however, often ignore the higher-order implicit proximity among miRNAs and diseases. To this end, in this paper, we first propose a novel approach, HOP_MDA (Higher-Order Proximity based MiRNA and Disease Association Prediction), for predicting potential associations between miRNAs and diseases. Both explicit interaction information and implicit higher-order proximity information between miRNAs and diseases are encoded with proximity matrices of different orders, which are combined by weighting into a parameterized prediction matrix. A supervised learning approach based on the known miRNA-disease associations is proposed to determine the optimal weight parameters. The prediction matrix is then used to achieve effective prediction. Additionally, a higher-order proximity approximation technique (HOPA_MDA) is presented to make more efficient predictions. 5-fold cross validation is used to evaluate the performance of our proposed method. The average AUC values of HOPA_MDA for two real datasets are 0.921 ± 0.002 and 0.944 ± 0.0015, respectively. Our method can also predict potential miRNAs specific to new diseases with no known related miRNAs.
|
71
|
Neal ZP, Domagalski R, Sagan B. Comparing alternatives to the fixed degree sequence model for extracting the backbone of bipartite projections. Sci Rep 2021; 11:23929. [PMID: 34907253 PMCID: PMC8671427 DOI: 10.1038/s41598-021-03238-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/18/2021] [Accepted: 11/12/2021] [Indexed: 12/02/2022]
Abstract
Projections of bipartite or two-mode networks capture co-occurrences, and are used in diverse fields (e.g., ecology, economics, bibliometrics, politics) to represent unipartite networks. A key challenge in analyzing such networks is determining whether an observed number of co-occurrences between two nodes is significant, and therefore whether an edge exists between them. One approach, the fixed degree sequence model (FDSM), evaluates the significance of an edge's weight by comparison to a null model in which the degree sequences of the original bipartite network are fixed. Although the FDSM is an intuitive null model, it is computationally expensive because it requires Monte Carlo simulation to estimate each edge's p value, and therefore is impractical for large projections. In this paper, we explore four potential alternatives to FDSM: fixed fill model, fixed row model, fixed column model, and stochastic degree sequence model (SDSM). We compare these models to FDSM in terms of accuracy, speed, statistical power, similarity, and ability to recover known communities. We find that the computationally-fast SDSM offers a statistically conservative but close approximation of the computationally-impractical FDSM under a wide range of conditions, and that it correctly recovers a known community structure even when the signal is weak. Therefore, although each backbone model may have particular applications, we recommend SDSM for extracting the backbone of bipartite projections when FDSM is impractical.
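As a rough illustration of the setting described in this abstract, the sketch below builds a weighted projection from a small bipartite matrix and tests one edge weight against a simple null model. It assumes a null in which each row keeps its degree and its ones are placed uniformly at random among the columns, so the co-occurrence count of two rows is hypergeometric. This corresponds to a fixed-row style null, not to the FDSM or SDSM procedures compared in the paper; the function names and the toy matrix are hypothetical.

```python
from math import comb

import numpy as np

def project(bipartite) -> np.ndarray:
    """Row projection: entry (i, j) counts the columns shared by rows i and j."""
    B = np.asarray(bipartite, dtype=int)
    P = B @ B.T
    np.fill_diagonal(P, 0)  # ignore self co-occurrence
    return P

def fixed_row_pvalue(n_cols: int, deg_i: int, deg_j: int, observed: int) -> float:
    """Upper-tail probability P(X >= observed) for hypergeometric X.

    Assumed null: row i has deg_i ones, and row j places its deg_j ones
    uniformly at random among n_cols columns.
    """
    total = comb(n_cols, deg_j)
    upper = min(deg_i, deg_j)
    tail = sum(
        comb(deg_i, k) * comb(n_cols - deg_i, deg_j - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Toy bipartite matrix (hypothetical): 3 row nodes x 6 column nodes.
B = np.array([[1, 1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0, 0]])
P = project(B)
degrees = B.sum(axis=1)
p_01 = fixed_row_pvalue(B.shape[1], int(degrees[0]), int(degrees[1]), int(P[0, 1]))
p_02 = fixed_row_pvalue(B.shape[1], int(degrees[0]), int(degrees[2]), int(P[0, 2]))
```

An edge would be kept in the backbone only when its p value falls below a chosen significance level; the closed-form tail here avoids the Monte Carlo simulation that makes FDSM expensive.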
Affiliation(s)
- Zachary P Neal
- Psychology Department, Michigan State University, East Lansing, MI, USA.
- Rachel Domagalski
- Mathematics Department, Michigan State University, East Lansing, MI, USA
- Bruce Sagan
- Mathematics Department, Michigan State University, East Lansing, MI, USA
|
72
|
GCAEMDA: Predicting miRNA-disease associations via graph convolutional autoencoder. PLoS Comput Biol 2021; 17:e1009655. [PMID: 34890410 PMCID: PMC8694430 DOI: 10.1371/journal.pcbi.1009655] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/13/2021] [Revised: 12/22/2021] [Accepted: 11/17/2021] [Indexed: 01/02/2023]
Abstract
microRNAs (miRNAs) are small non-coding RNAs related to a number of complicated biological processes. A growing body of studies has suggested that miRNAs are closely associated with many human diseases. It is meaningful to consider disease-related miRNAs as potential biomarkers, which could greatly contribute to understanding the mechanisms of complex diseases and benefit the prevention, detection, diagnosis and treatment of extraordinary diseases. In this study, we presented a novel model named Graph Convolutional Autoencoder for miRNA-Disease Association Prediction (GCAEMDA). In the proposed model, we utilized miRNA-miRNA similarities, disease-disease similarities and verified miRNA-disease associations to construct a heterogeneous network, which is applied to learn the embeddings of miRNAs and diseases. In addition, we separately constructed miRNA-based and disease-based sub-networks. Combining the embeddings of miRNAs and diseases, a graph convolutional autoencoder (GCAE) was utilized to calculate miRNA-disease association scores on the two sub-networks, respectively. Furthermore, we obtained final prediction scores between miRNAs and diseases by averaging the prediction scores from the two types of sub-networks. To assess the accuracy of GCAEMDA, we applied different cross-validation methods to evaluate our model, whose performance was better than that of state-of-the-art models. Case studies on common human diseases were also implemented to demonstrate the effectiveness of GCAEMDA. The results demonstrated that GCAEMDA was beneficial for inferring potential miRNA-disease associations. Numerous studies have demonstrated that miRNAs are closely related to several common human diseases, so identifying unverified associations between miRNAs and diseases is conducive to the diagnosis and treatment of complex diseases.
Many models proposed to infer potential miRNA-disease associations have made prediction more effective and productive. We constructed the GCAEMDA model to obtain more accurate prediction results by integrating a graph convolutional network and an autoencoder to make predictions based on multi-source miRNA and disease information. Five-fold cross validation and global leave-one-out cross validation were implemented to evaluate the performance of our model. GCAEMDA reached AUCs of 0.9415 and 0.9505, respectively, which were distinctly higher than the AUCs of the other comparative models. Furthermore, we carried out case studies on lung neoplasms and breast neoplasms to demonstrate the practical application of the model; 47 and 47 of the top-50 candidate miRNAs were confirmed by experimental reports. In summary, GCAEMDA can be considered an effective and accurate model for revealing relationships between miRNAs and diseases.
|
73
|
Zhang G, Li M, Deng H, Xu X, Liu X, Zhang W. SGNNMD: signed graph neural network for predicting deregulation types of miRNA-disease associations. Brief Bioinform 2021; 23:6455665. [PMID: 34875683 DOI: 10.1093/bib/bbab464] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/05/2021] [Revised: 10/08/2021] [Accepted: 10/11/2021] [Indexed: 11/13/2022]
Abstract
MiRNAs are a class of small non-coding RNA molecules that play an important role in many biological processes, and determining miRNA-disease associations can benefit drug development and clinical diagnosis. Although great efforts have been made to develop miRNA-disease association prediction methods, little attention has been paid to in-depth classification of miRNA-disease associations, e.g. up/down-regulation of miRNAs in diseases. In this paper, we regard known miRNA-disease associations as a signed bipartite network, which has miRNA nodes, disease nodes and two types of edges representing up/down-regulation of miRNAs in diseases, and propose a signed graph neural network method (SGNNMD) for predicting deregulation types of miRNA-disease associations. SGNNMD extracts subgraphs around miRNA-disease pairs from the signed bipartite network and learns structural features of subgraphs via a labeling algorithm and a neural network, and then combines them with biological features (i.e. miRNA-miRNA functional similarity and disease-disease semantic similarity) to build the prediction model. In the computational experiments, SGNNMD achieves highly competitive performance when compared with several baselines, including signed graph link prediction methods, multi-relation prediction methods and one existing deregulation type prediction method. Moreover, SGNNMD has good inductive capability and can generalize to miRNAs/diseases unseen during training.
Affiliation(s)
- Guangzhan Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Menglu Li
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Huan Deng
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xinran Xu
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xuan Liu
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
|
74
|
Chen X, Jiang Z. ISFMDA: Learning Interactions of Selected Features-Based Method for Predicting Potential MicroRNA-Disease Associations. J Comput Biol 2021; 28:1219-1227. [PMID: 34847740 DOI: 10.1089/cmb.2021.0149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Prediction of potential microRNA-disease associations is one of the important tasks in computational biology fields. Mining more sophisticated features can improve the performance of the prediction methods. This article proposes a novel algorithm (ISFMDA) that can effectively learn low- or high-order interactions of recursive feature elimination selected features by an extreme gradient boosting, a factorization machine, and a deep neural network. As a result, ISFMDA can obtain an area under receiver operating characteristic curve (AUROC) of 0.9342 ± 0.0007 in fivefold cross-validation tests with 51.25% of original features, which verifies the effectiveness of the methods.
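Most abstracts in this list report an area under the ROC curve (AUC or AUROC). As a minimal sketch of what that number measures, the function below uses the rank interpretation: the AUC is the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `auc_score` and the toy labels and scores are hypothetical and are not taken from any of the listed papers.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Toy example (hypothetical): 1 = known association, 0 = unlabelled pair.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = auc_score(labels, scores)  # 3 of the 4 positive/negative pairs are ordered correctly
```

In k-fold cross-validation this computation is repeated on each held-out fold and the results are averaged, which is where figures such as a mean with a ± spread come from. The pairwise loop is quadratic, so a sort-based version is preferable for large test sets.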
Affiliation(s)
- Xuejun Chen
- School of Computer Science and Technology, East China Normal University, Shanghai, China
- Zhenran Jiang
- School of Computer Science and Technology, East China Normal University, Shanghai, China
|
75
|
Graph convolutional network approach to discovering disease-related circRNA-miRNA-mRNA axes. Methods 2021; 198:45-55. [PMID: 34758394 DOI: 10.1016/j.ymeth.2021.10.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/30/2021] [Revised: 10/07/2021] [Accepted: 10/19/2021] [Indexed: 02/05/2023]
Abstract
Non-coding RNAs are gaining prominence in biology and medicine, as they play major roles in cellular homeostasis among which the circRNA-miRNA-mRNA axes are involved in a series of disease-related pathways, such as apoptosis, cell invasion and metastasis. Recently, many computational methods have been developed for the prediction of the relationship between ncRNAs and diseases, which can alleviate the time-consuming and labor-intensive exploration involved with biological experiments. However, these methods handle ncRNAs separately, ignoring the impact of the interactions among ncRNAs on the diseases. In this paper we present a novel approach to discovering disease-related circRNA-miRNA-mRNA axes from the disease-RNA information network. Our method, using graph convolutional network, learns the characteristic representation of each biological entity by propagating and aggregating local neighbor information based on the global structure of the network. The approach is evaluated using the real-world datasets and the results show that it outperforms other state-of-the-art baselines on most of the metrics.
|
76
|
Yang X, Kuang L, Chen Z, Wang L. Multi-Similarities Bilinear Matrix Factorization-Based Method for Predicting Human Microbe-Disease Associations. Front Genet 2021; 12:754425. [PMID: 34721543 PMCID: PMC8551558 DOI: 10.3389/fgene.2021.754425] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/06/2021] [Accepted: 09/21/2021] [Indexed: 11/13/2022]
Abstract
Accumulating studies have shown that microbes are closely related to human diseases. In this paper, a novel method called MSBMFHMDA was designed to predict potential microbe-disease associations by adopting multi-similarities bilinear matrix factorization. In MSBMFHMDA, a microbe multiple similarities matrix was first constructed based on the Gaussian interaction profile kernel similarity and cosine similarity for microbes. Then, we used the Gaussian interaction profile kernel similarity, cosine similarity, and symptom similarity for diseases to compose the disease multiple similarities matrix. Finally, we integrated these two similarity matrices and the microbe-disease association matrix into our model to predict potential associations. The results indicate that our method can achieve reliable AUCs of 0.9186 and 0.9043 ± 0.0048 in the framework of leave-one-out cross validation (LOOCV) and fivefold cross validation, respectively. Moreover, 10, 10, and 8 of the top 10 predicted microbes for asthma, inflammatory bowel disease, and type 2 diabetes mellitus, respectively, were confirmed by experiments and the literature. Therefore, our model has favorable performance in predicting potential microbe-disease associations.
Affiliation(s)
- Xiaoyu Yang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China; College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
- Zhiping Chen
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Lei Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China; College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
|
77
|
Nguyen VT, Le TTK, Than K, Tran DH. Predicting miRNA-disease associations using improved random walk with restart and integrating multiple similarities. Sci Rep 2021; 11:21071. [PMID: 34702958 PMCID: PMC8548500 DOI: 10.1038/s41598-021-00677-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/13/2021] [Accepted: 10/15/2021] [Indexed: 12/20/2022] Open
Abstract
Predicting beneficial and valuable miRNA-disease associations (MDAs) through biological laboratory experiments is costly and time-consuming. Developing effective computational methods for predicting MDAs is therefore essential and has attracted many computer scientists in recent years. In this paper, we proposed a new computational method to predict miRNA-disease associations using improved random walk with restart and integrating multiple similarities (RWRMMDA). We used a WKNKN algorithm as a pre-processing step to address the sparsity and incompleteness of the data and to reduce the negative impact of a large number of missing associations. Two heterogeneous networks, one in disease space and one in miRNA space, were built by integrating multiple similarity networks, and different walk probabilities could be assigned to each linked neighbor node of a disease or miRNA node according to its degree in the respective network. Finally, an improved extended random walk with restart algorithm based on the miRNA similarity-based and disease similarity-based heterogeneous networks was used to calculate miRNA-disease association prediction probabilities. The experiments showed that the proposed method achieved global LOOCV AUC (area under the ROC curve) and AUPR (area under the precision-recall curve) values of 0.9882 and 0.9066, respectively. The best AUC and AUPR values under fivefold cross-validation were 0.9855 and 0.8642, respectively, and were supported by statistical tests. The method outperformed NTSHMDA, PMFMDA, IMCMDA and MCLPMDA in both AUC and AUPR. In case studies of breast neoplasms, hepatocellular carcinoma and stomach neoplasms, it inferred 1, 12 and 7 new associations among the top 40 predicted miRNAs for each disease, respectively. All of these newly inferred associations have been confirmed in different databases or the literature.
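For readers unfamiliar with the underlying technique, the following is a minimal sketch of plain random walk with restart on a single small graph. It is not the improved, degree-weighted variant on heterogeneous networks that the abstract describes; the toy graph, the restart probability of 0.3, and the function names are assumptions chosen for illustration.

```python
def column_normalize(adj):
    """Turn an adjacency matrix into a column-stochastic transition matrix."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

def random_walk_with_restart(W, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol.

    W[i][j] is the probability of stepping from node j to node i,
    so every column of W sums to 1.
    """
    n = len(seed)
    p = list(seed)
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * seed[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical path graph 0 - 1 - 2 - 3, with the walker restarting at node 0.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = column_normalize(adj)
seed = [1.0, 0.0, 0.0, 0.0]
scores = random_walk_with_restart(W, seed)
```

The stationary scores decrease with distance from the seed node, which is the property that lets such scores be read as association strengths.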
Affiliation(s)
- Van Tinh Nguyen
  - Faculty of Information Technology, Hanoi National University of Education, Hanoi, Vietnam
  - Faculty of Information Technology, Hanoi University of Industry, 298 Cau Dien Street, Bac Tu Liem District, Hanoi, Vietnam
- Thi Tu Kien Le
  - Faculty of Information Technology, Hanoi National University of Education, Hanoi, Vietnam
- Khoat Than
  - Hanoi University of Science and Technology, Hanoi, Vietnam
- Dang Hung Tran
  - Faculty of Information Technology, Hanoi National University of Education, Hanoi, Vietnam
|
78
|
Ding P, Ouyang W, Luo J, Kwoh CK. Heterogeneous information network and its application to human health and disease. Brief Bioinform 2021; 21:1327-1346. [PMID: 31566212 DOI: 10.1093/bib/bbz091] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/06/2019] [Revised: 06/29/2019] [Accepted: 06/30/2019] [Indexed: 12/11/2022] Open
Abstract
Molecular components with functional interdependencies in the human cell form a complicated biological network. Diseases are mostly caused by perturbations of a set of interacting biomolecules rather than by an abnormality of a single biomolecule. Furthermore, new biological functions and processes could be revealed by discovering novel relationships between biological entities. Hence, more and more biologists focus on studying the complex biological system instead of its individual components. The emergence of the heterogeneous information network (HIN) offers a promising way to systematically explore complicated and heterogeneous relationships between various molecules for apparently distinct phenotypes. In this review, we first present the basic definition of HIN and describe the biological system as a complex HIN. Then, we discuss the topological properties of HIN and how these can be applied to detect network motifs and functional modules. Afterwards, methodologies for discovering relationships between diseases and biomolecules are presented. Useful insights on how HIN aids drug development and the exploration of the human interactome are provided. Finally, we analyze the challenges and opportunities for uncovering combinatorial patterns in pharmacogenomics and for cell-type detection based on single-cell genomic data.
Affiliation(s)
- Pingjian Ding
  - School of Computer Science, University of South China, Hengyang, China
- Wenjue Ouyang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Chee-Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
|
79
|
Wu Y, Zhu D, Wang X, Zhang S. An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data. Comput Biol Chem 2021; 95:107566. [PMID: 34534906 DOI: 10.1016/j.compbiolchem.2021.107566] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/08/2021] [Revised: 08/13/2021] [Accepted: 08/18/2021] [Indexed: 11/17/2022]
Abstract
To explore the pathogenic mechanisms of microRNAs (miRNAs) in diverse diseases, many researchers have concentrated on discovering potential associations between miRNAs and diseases using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised Kmeans (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs, and a subagging method to generate diverse training sample sets that make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
Affiliation(s)
- Yao Wu
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Donghua Zhu
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Xuefeng Wang
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Shuo Zhang
  - School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
|
80
|
Ji C, Wang Y, Ni J, Zheng C, Su Y. Predicting miRNA-Disease Associations Based on Heterogeneous Graph Attention Networks. Front Genet 2021; 12:727744. [PMID: 34512733 PMCID: PMC8424198 DOI: 10.3389/fgene.2021.727744] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/19/2021] [Accepted: 08/02/2021] [Indexed: 11/23/2022] Open
Abstract
In recent years, more and more evidence has shown that microRNAs (miRNAs) play an important role in the regulation of post-transcriptional gene expression and are closely related to human diseases. Many studies have also revealed that miRNAs can serve as promising biomarkers for the diagnosis and treatment of human diseases. However, few interactions between miRNAs and human diseases have been demonstrated, and the underlying mechanisms of miRNAs are not clear. Therefore, computational approaches have attracted the attention of researchers, as they can not only save time and money but also improve the efficiency and accuracy of biological experiments. In this work, we proposed a heterogeneous graph attention network (GAT) based method for miRNA-disease association prediction, named HGATMDA. We constructed a heterogeneous graph for miRNAs and diseases and introduced weighted DeepWalk and GAT methods to extract features of miRNAs and diseases from the graph. Moreover, a fully connected neural network was used to predict correlation scores for miRNA-disease pairs. Experimental results under five-fold cross-validation (five-fold CV) showed that HGATMDA achieved better prediction performance than other state-of-the-art methods. In addition, we performed three case studies on breast neoplasms, lung neoplasms and kidney neoplasms. The results showed that, for each of the three diseases, 50 out of the top 50 candidates were confirmed by the validation datasets. Therefore, HGATMDA is suitable as an effective tool to identify potential disease-related miRNAs.
Affiliation(s)
- Cunmei Ji
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Yutian Wang
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Jiancheng Ni
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Chunhou Zheng
  - School of Artificial Intelligence, Anhui University, Hefei, China
- Yansen Su
  - School of Artificial Intelligence, Anhui University, Hefei, China
|
81
|
Dai Q, Chu Y, Li Z, Zhao Y, Mao X, Wang Y, Xiong Y, Wei DQ. MDA-CF: Predicting MiRNA-Disease associations based on a cascade forest model by fusing multi-source information. Comput Biol Med 2021; 136:104706. [PMID: 34371319 DOI: 10.1016/j.compbiomed.2021.104706] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/17/2021] [Revised: 07/26/2021] [Accepted: 07/26/2021] [Indexed: 01/17/2023]
Abstract
MicroRNAs (miRNAs) are significant regulators in various biological processes. They may become promising biomarkers or therapeutic targets, which provide a new perspective in diagnosis and treatment of multiple diseases. Since the experimental methods are always costly and resource-consuming, prediction of disease-related miRNAs using computational methods is in great need. In this study, we developed MDA-CF to identify underlying miRNA-disease associations based on a cascade forest model. In this method, multi-source information was integrated to represent miRNAs and diseases comprehensively, and the autoencoder was utilized for dimension reduction to obtain the optimal feature space. The cascade forest model was then employed for miRNA-disease association prediction. As a result, the average AUC of MDA-CF was 0.9464 on HMDD v3.2 in five-fold cross-validation. Compared with previous computational methods, MDA-CF performed better on HMDD v2.0 with an average AUC of 0.9258. Moreover, MDA-CF was implemented to investigate colon neoplasm, breast neoplasm, and gastric neoplasm, and 100%, 86%, 88% of the top 50 potential miRNAs were validated by authoritative databases. In conclusion, MDA-CF appears to be a reliable method to uncover disease-associated miRNAs. The source code of MDA-CF is available at https://github.com/a1622108/MDA-CF.
Affiliation(s)
- Qiuying Dai
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yanyi Chu
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Zhiqi Li
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yusong Zhao
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xueying Mao
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yanjing Wang
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Peng Cheng Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China
|
82
|
Nie R, Li Z, You ZH, Bao W, Li J. Efficient framework for predicting MiRNA-disease associations based on improved hybrid collaborative filtering. BMC Med Inform Decis Mak 2021; 21:254. [PMID: 34461870 PMCID: PMC8406577 DOI: 10.1186/s12911-021-01616-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/08/2021] [Accepted: 08/23/2021] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Accumulating studies indicate that microRNAs (miRNAs) play vital roles in the development and progression of many complex human diseases. However, traditional biochemical experimental methods for identifying disease-related miRNAs cost a large amount of time, manpower, and material and financial resources. METHODS In this study, we developed a framework named hybrid collaborative filtering for miRNA-disease association prediction (HCFMDA) by integrating heterogeneous data, e.g., miRNA functional similarity, disease semantic similarity, known miRNA-disease association networks, and Gaussian kernel similarity of miRNAs and diseases. To capture the intrinsic interaction patterns embedded in the sparse association matrix, we prioritized the predictive score by fusing three types of information: similar disease associations, similar miRNA associations, and similar disease-miRNA associations. Meanwhile, singular value decomposition was adopted to reduce the impact of noise and accelerate prediction. RESULTS We then validated HCFMDA with leave-one-out cross-validation (LOOCV) and two types of case studies. In the LOOCV, we achieved an AUC (area under the curve) of 0.8379. To evaluate the performance of HCFMDA on real diseases, we further implemented the first type of case validation on three important human diseases: Colon Neoplasms, Esophageal Neoplasms and Prostate Neoplasms. As a result, 44, 46 and 44 out of the top 50 predicted disease-related miRNAs were confirmed by experimental evidence. Moreover, the second type of case validation on Breast Neoplasms indicates that HCFMDA could also be applied to predict potential miRNAs for diseases without any known associated miRNA. CONCLUSIONS The satisfactory prediction performance demonstrates that our model could serve as a reliable tool to guide subsequent research on identifying candidate miRNAs associated with human diseases.
Affiliation(s)
- Ru Nie
  - Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
- Zhengwei Li
  - Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
  - Institute of Machine Learning and Systems Biology, College of Electronics and Information Engineering, Tongji University, Shanghai, 201804, China
  - KUNPAND Communications (Kunshan) Co., Ltd., Suzhou, 215300, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Wenzheng Bao
  - School of Information Engineering, Xuzhou University of Technology, Xuzhou, 221018, China
- Jiashu Li
  - Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
|
83
|
Qu J, Wang CC, Cai SB, Zhao WD, Cheng XL, Ming Z. Biased Random Walk With Restart on Multilayer Heterogeneous Networks for MiRNA-Disease Association Prediction. Front Genet 2021; 12:720327. [PMID: 34447416 PMCID: PMC8384471 DOI: 10.3389/fgene.2021.720327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/04/2021] [Accepted: 07/13/2021] [Indexed: 01/07/2023] Open
Abstract
Numerous experiments have shown that microRNAs (miRNAs) can be used as diagnostic biomarkers for many complex diseases. Predicting the unobserved associations between miRNAs and diseases is therefore highly significant for the medical field. Here, based on heterogeneous networks built from known miRNA-disease associations, miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity for miRNAs and diseases, we developed a computational model of biased random walk with restart on multilayer heterogeneous networks for miRNA-disease association prediction (BRWRMHMDA) by applying degree-based biased random walk with restart (BRWR). The model achieved an AUC of 0.8310 in local leave-one-out cross-validation (LOOCV), indicating good performance. In addition, we applied BRWRMHMDA to prioritize candidate miRNAs for esophageal neoplasms based on HMDD v2.0 and for breast neoplasms based on HMDD v1.0. The local LOOCV results and the case-study analysis both showed that the proposed model has good and stable performance.
Affiliation(s)
- Jia Qu
  - School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, China
- Chun-Chun Wang
  - Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Shu-Bin Cai
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Wen-Di Zhao
  - School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, China
- Xiao-Long Cheng
  - School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, China
- Zhong Ming
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
|
84
|
Li J, Liu T, Wang J, Li Q, Ning C, Yang Y. MvKFN-MDA: Multi-view Kernel Fusion Network for miRNA-disease association prediction. Artif Intell Med 2021; 118:102115. [PMID: 34412838 DOI: 10.1016/j.artmed.2021.102115] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/20/2020] [Revised: 05/13/2021] [Accepted: 05/21/2021] [Indexed: 12/01/2022]
Abstract
Predicting the associations between microRNAs (miRNAs) and diseases is of great significance for identifying miRNAs related to human diseases. Since it is time-consuming and costly to identify associations between miRNAs and diseases through biological experiments, computational methods are currently used as an effective supplement for identifying potential associations. This paper presents a Multi-view Kernel Fusion Network (MvKFN) based prediction method (MvKFN-MDA) to address the problem of miRNA-disease association prediction. A novel multiple kernel fusion framework, MvKFN, is first proposed to fuse similarity kernels from different views, constructed from different data sources, in a highly nonlinear way. Using MvKFNs, the different base similarity kernels for miRNAs (such as sequence, functional, semantic, and Gaussian profile kernels) and for diseases (such as semantic and Gaussian profile kernels) are nonlinearly fused into two integrated similarity kernels, one for miRNAs and one for diseases. Then, miRNA and disease feature representations are extracted from the respective integrated similarity kernels. These features are fed into a neural matrix completion framework, which outputs the association prediction scores. The parameters of MvKFN-MDA are learned from the known miRNA-disease association matrix in a supervised, end-to-end way. We compare the proposed method with other state-of-the-art methods. The AUCs of our proposed method were superior to those of the existing methods in both 5-FCV and LOOCV on two open experimental datasets. Furthermore, 49, 48, and 47 of the top 50 predicted miRNAs for three high-risk human diseases, namely colon cancer, lymphoma, and kidney cancer, respectively, were verified using experimental literature. Finally, 100% accuracy on the top 50 predicted miRNAs was achieved when breast cancer was used as a case study to evaluate the ability of MvKFN-MDA to make predictions for a new disease without any known related miRNAs.
Affiliation(s)
- Jin Li
  - School of Software, Yunnan University, Kunming, China
  - Kunming Key Laboratory of Data Science and Intelligent Computing, Kunming, China
- Tao Liu
  - School of Software, Yunnan University, Kunming, China
- Jingru Wang
  - School of Software, Yunnan University, Kunming, China
- Qing Li
  - First Affiliated Hospital of Kunming Medical University, Kunming, China
- Chenxi Ning
  - School of Software, Yunnan University, Kunming, China
- Yun Yang
  - School of Software, Yunnan University, Kunming, China
  - Kunming Key Laboratory of Data Science and Intelligent Computing, Kunming, China
|
85
|
Toprak A, Eryilmaz Dogan E. Prediction of Potential MicroRNA-Disease Association Using Kernelized Bayesian Matrix Factorization. Interdiscip Sci 2021; 13:595-602. [PMID: 34370220 DOI: 10.1007/s12539-021-00469-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/22/2020] [Revised: 07/05/2021] [Accepted: 07/30/2021] [Indexed: 10/20/2022]
Abstract
MicroRNA (miRNA) molecules, which play a role in the formation and progression of many different diseases, are 18-22 nucleotides in length and make up a type of non-coding RNA. Predicting disease-related microRNAs is crucial for understanding the pathogenesis of disease and for the diagnosis, treatment, and prevention of diseases. Because the experimental techniques used to find novel miRNA-disease associations are costly, many computational techniques have been studied and developed. In this paper, a Kernelized Bayesian Matrix Factorization (KBMF) technique was proposed to predict new relations between miRNAs and diseases using several types of information, such as miRNA functional similarity, disease semantic similarity, and known relations between miRNAs and diseases. An AUC value of 0.9450 was obtained for the KBMF technique under fivefold cross-validation. We also carried out three case studies (breast, lung, and colon neoplasms) to assess the performance of the KBMF technique, and the results confirmed the predictive reliability of the method. Thus, the KBMF technique can be used as a reliable computational model to infer possible miRNA-disease associations.
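Since this abstract, like most in this list, reports an AUC under fivefold cross-validation, a small sketch of how such a figure can be computed may help. The AUC below is the fraction of positive-negative pairs that are ranked correctly, and the fold splitter simply shuffles and deals out sample indices. The labels, scores, and function names are invented for illustration and are unrelated to the KBMF experiments.

```python
import random

def auc_score(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Hypothetical labels (1 = known association) and predicted scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
example_auc = auc_score(labels, scores)  # 3 of 4 pairs ranked correctly

# Each fold would be held out once while the model is trained on the rest.
folds = k_fold_indices(10, k=5)
```

In a full evaluation the model is retrained once per fold, the held-out associations are scored against the unknown pairs, and the per-fold AUCs are averaged.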
Affiliation(s)
- Ahmet Toprak
  - Department of Electricity and Energy, Bozkır Vocational School, Selcuk University, Bozkır, Konya, Turkey
- Esma Eryilmaz Dogan
  - Department of Biomedical Engineering, Faculty of Technology, Selcuk University, Selçuklu, Konya, Turkey
|
86
|
Ji C, Liu Z, Wang Y, Ni J, Zheng C. GATNNCDA: A Method Based on Graph Attention Network and Multi-Layer Neural Network for Predicting circRNA-Disease Associations. Int J Mol Sci 2021; 22:8505. [PMID: 34445212 PMCID: PMC8395191 DOI: 10.3390/ijms22168505] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/21/2021] [Revised: 07/30/2021] [Accepted: 08/03/2021] [Indexed: 12/30/2022] Open
Abstract
Circular RNAs (circRNAs) are a new class of endogenous non-coding RNAs with a covalently closed loop structure. Researchers have revealed that circRNAs play an important role in human diseases. As experimental identification of interactions between circRNAs and diseases is time-consuming and expensive, effective computational methods for predicting potential circRNA-disease associations are urgently needed. In this study, we proposed a novel computational method named GATNNCDA, which combines a Graph Attention Network (GAT) and a multi-layer neural network (NN) to infer disease-related circRNAs. Specifically, GATNNCDA first integrates disease semantic similarity, circRNA functional similarity and the respective Gaussian Interaction Profile (GIP) kernel similarities. The integrated similarities are used as initial node features, and GAT is then applied for further feature extraction in the heterogeneous circRNA-disease graph. Finally, the NN-based classifier is introduced for prediction. The results of fivefold cross-validation demonstrated that GATNNCDA achieved an average AUC of 0.9613 and AUPR of 0.9433 on the CircR2Disease dataset and outperformed other state-of-the-art methods. In addition, case studies on breast cancer and hepatocellular carcinoma showed that 20 and 18 of the top 20 candidates, respectively, were confirmed in the validation datasets or published literature. Therefore, GATNNCDA is an effective and reliable tool for discovering circRNA-disease associations.
Affiliation(s)
- Cunmei Ji
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Zhihao Liu
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Yutian Wang
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Jiancheng Ni
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng
  - School of Artificial Intelligence, Anhui University, Hefei 230601, China
|
87
|
Zhu CC, Wang CC, Zhao Y, Zuo M, Chen X. Identification of miRNA-disease associations via multiple information integration with Bayesian ranking. Brief Bioinform 2021; 22:6338537. [PMID: 34347021 DOI: 10.1093/bib/bbab302] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/07/2021] [Revised: 07/06/2021] [Accepted: 07/14/2021] [Indexed: 12/22/2022] Open
Abstract
In recent years, an increasing number of microRNA (miRNA)-disease associations have been identified through traditional biological experiments. These associations contribute to revealing the molecular mechanisms of diseases and to preventing and curing diseases. To improve the efficiency of miRNA-disease association discovery, some computational methods have been developed as auxiliary tools for researchers. In the current study, we proposed a novel model named Bayesian Ranking for MiRNA-Disease Association prediction (BRMDA) by improving Bayesian Personalized Ranking in three respects: (i) taking advantage of the similarity of diseases and miRNAs; (ii) incorporating a miRNA bias for miRNAs associated with different numbers of diseases; and (iii) implementing a neighborhood-based approach for new miRNAs and diseases. For each investigated disease, BRMDA used as its training set the set of triples (i.e. disease, labeled miRNA, unlabeled miRNA) that reflected the association preference of the disease for miRNAs, which made full use of unknown samples rather than simply treating them as negative samples. To investigate the predictive performance of BRMDA, we employed leave-one-out cross-validation and obtained an Area Under the Curve of 0.8697, which outperformed many classical methods. In addition, we implemented three distinct classes of case studies for three common neoplasms. As a result, 44 (Colon Neoplasms), 49 (Esophageal Neoplasms) and 49 (Lung Neoplasms) of the top 50 predicted miRNAs were validated through experiments. In short, BRMDA would be a reliable tool for inferring valuable associations.
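The triple-based training set described in this abstract can be pictured with a short sketch. It enumerates (disease, labeled miRNA, unlabeled miRNA) triples from a binary association matrix and evaluates the standard Bayesian Personalized Ranking objective, which is the mean of -ln(sigmoid(score difference)), without regularization. The similarity, bias, and neighborhood extensions that distinguish BRMDA are not included, and the toy matrix, scores, and function names are assumptions for illustration.

```python
import math

def build_triples(assoc):
    """Enumerate (disease, labeled miRNA, unlabeled miRNA) index triples.

    assoc is a disease x miRNA 0/1 matrix; 1 marks a known association.
    """
    triples = []
    for d, row in enumerate(assoc):
        labeled = [m for m, v in enumerate(row) if v == 1]
        unlabeled = [m for m, v in enumerate(row) if v == 0]
        triples.extend((d, i, j) for i in labeled for j in unlabeled)
    return triples

def bpr_loss(scores, triples):
    """Mean of -ln(sigmoid(scores[d][i] - scores[d][j])) over all triples."""
    total = 0.0
    for d, i, j in triples:
        diff = scores[d][i] - scores[d][j]
        # -ln(sigmoid(x)) equals ln(1 + exp(-x)).
        total += math.log(1.0 + math.exp(-diff))
    return total / len(triples)

# Hypothetical data: 2 diseases (rows) x 3 miRNAs (columns).
assoc = [
    [1, 0, 0],
    [0, 1, 1],
]
triples = build_triples(assoc)

uninformed = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # every pair is a coin flip
informed = [[2.0, 0.0, 0.0], [0.0, 2.0, 2.0]]     # labeled miRNAs rank higher
loss_uninformed = bpr_loss(uninformed, triples)
loss_informed = bpr_loss(informed, triples)
```

The loss falls as labeled miRNAs are scored above unlabeled ones for the same disease, so unlabeled pairs are only ranked lower and are never treated as confirmed negatives.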
Affiliation(s)
- Chi-Chi Zhu
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou 221116, China
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Chun-Chun Wang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Yan Zhao
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Mingcheng Zuo
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou 221116, China
- Xing Chen
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou 221116, China
|
88
|
Zhang L, Yang P, Feng H, Zhao Q, Liu H. Using Network Distance Analysis to Predict lncRNA-miRNA Interactions. Interdiscip Sci 2021; 13:535-545. [PMID: 34232474 DOI: 10.1007/s12539-021-00458-z] [Citation(s) in RCA: 149] [Impact Index Per Article: 37.3] [Received: 10/25/2020] [Revised: 06/26/2021] [Accepted: 06/29/2021] [Indexed: 01/08/2023]
Abstract
LncRNA-miRNA interactions contribute to the regulation of therapeutic targets and diagnostic biomarkers in multifarious human diseases. However, it remains difficult to experimentally identify lncRNA-miRNA associations at large scale, and computational prediction methods are limited. In this study, we developed a network distance analysis model for lncRNA-miRNA association prediction (NDALMA). Similarity networks for lncRNAs and miRNAs were calculated and integrated with Gaussian interaction profile (GIP) kernel similarity. Then, network distance analysis was applied to the integrated similarity networks, and final scores were obtained after confidence calculation and score conversion. Our model obtained satisfactory results in fivefold cross validation, achieving an AUC of 0.8810 and an AUPR of 0.8315. Moreover, NDALMA showed superior prediction performance over several other network algorithms, and we tested the suitability and flexibility of the model by comparing different types of similarity. In addition, case studies of the relationships between lncRNAs and miRNAs were conducted, which verified the reliability of our method in predicting lncRNA-miRNA associations. The datasets and source code used in this study are available at https://github.com/Liu-Lab-Lnu/NDALMA .
Affiliation(s)
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Research Center for Computer Simulating and Information Processing of Bio-Macromolecules of Shenyang, Liaoning University, Shenyang, 110036, China
- Technology Innovation Center for Computer Simulating and Information Processing of Bio-Macromolecules of Shenyang, Shenyang, 110036, China
| | - Pengyu Yang
- School of Information, Liaoning University, Shenyang, 110036, China
| | - Huawei Feng
- School of Life Science, Liaoning University, Shenyang, 110036, China
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
| | - Hongsheng Liu
- Research Center for Computer Simulating and Information Processing of Bio-Macromolecules of Shenyang, Liaoning University, Shenyang, 110036, China
- Technology Innovation Center for Computer Simulating and Information Processing of Bio-Macromolecules of Shenyang, Shenyang, 110036, China
- School of Pharmacy, Liaoning University, Shenyang, 110036, China
| |
|
89
|
Min X, Lu F, Li C. Sequence-Based Deep Learning Frameworks on Enhancer-Promoter Interactions Prediction. Curr Pharm Des 2021; 27:1847-1855. [PMID: 33234095 DOI: 10.2174/1381612826666201124112710] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/15/2020] [Revised: 07/29/2020] [Accepted: 08/06/2020] [Indexed: 11/22/2022]
Abstract
Enhancer-promoter interactions (EPIs) in the human genome are of great significance to transcriptional regulation, which tightly controls gene expression. Identification of EPIs can help us better decipher gene regulation and understand disease mechanisms. However, experimental methods to identify EPIs are constrained by funds, time, and manpower, while computational methods using DNA sequences and genomic features are viable alternatives. Deep learning methods have shown promise in classification and have been applied to identify EPIs. In this survey, we focus specifically on sequence-based deep learning methods and conduct a comprehensive review of the literature. First, we briefly introduce existing sequence-based frameworks for EPI prediction and their technical details. After that, we elaborate on the datasets, pre-processing methods, and evaluation strategies. Finally, we conclude with the challenges these methods face and suggest several future opportunities. We hope this review will provide a useful reference for further studies on enhancer-promoter interactions.
Affiliation(s)
- Xiaoping Min
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Fengqing Lu
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Chunyan Li
- Graduate School, Yunnan Minzu University, Kunming 650504, China
| |
|
90
|
Gui T, Yao C, Jia B, Shen K. Identification and analysis of genes associated with epithelial ovarian cancer by integrated bioinformatics methods. PLoS One 2021; 16:e0253136. [PMID: 34143800 PMCID: PMC8213194 DOI: 10.1371/journal.pone.0253136] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/27/2020] [Accepted: 05/31/2021] [Indexed: 12/24/2022] Open
Abstract
Background: Although considerable efforts have been made to improve the treatment of epithelial ovarian cancer (EOC), the prognosis of patients has remained poor. Identifying differentially expressed genes (DEGs) involved in EOC progression and exploiting them as novel biomarkers or therapeutic targets is of great value. Methods: Overlapping DEGs were screened out from three independent Gene Expression Omnibus (GEO) datasets and were subjected to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. The protein-protein interaction (PPI) network of DEGs was constructed based on the STRING database. The expression of hub genes was validated in GEPIA and GEO. The relationship of hub gene expression with tumor stage, overall survival, and progression-free survival of EOC patients was investigated using The Cancer Genome Atlas data. Results: A total of 306 DEGs were identified, including 265 up-regulated and 41 down-regulated genes. Through PPI network analysis, the top 20 genes were screened out, among which 4 hub genes that have not been researched in depth so far (CDC45, CDCA5, KIF4A, and ESPL1) were selected after literature retrieval. The four genes were up-regulated in EOC tissues compared with normal tissues, but their expression decreased gradually with the continuous progression of EOC. Survival curves illustrated that patients with lower levels of CDCA5 and ESPL1 had statistically better overall survival and progression-free survival. Conclusion: Two hub genes, CDCA5 and ESPL1, identified as probably playing tumor-promotive roles, have great potential to be utilized as novel therapeutic targets for EOC treatment.
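The screening steps in the Methods can be sketched as a set intersection followed by a ranking of genes in the interaction network. This is only an illustration: ranking by node degree is an assumption (the abstract does not say how the top 20 genes were scored), and the function names, placeholder gene labels, and edges below are hypothetical.

```python
from collections import Counter


def overlapping_degs(*deg_sets):
    """Genes reported as differentially expressed in every dataset."""
    return set.intersection(*map(set, deg_sets))


def rank_by_degree(edges, genes):
    """Rank genes by their number of interaction partners within `genes`.

    edges is an iterable of (gene_a, gene_b) pairs; degree ranking is an
    assumed stand-in for whatever hub-scoring method a study actually uses.
    """
    degree = Counter()
    for a, b in edges:
        if a in genes and b in genes and a != b:
            degree[a] += 1
            degree[b] += 1
    # Highest degree first; ties broken alphabetically for reproducibility.
    return sorted(genes, key=lambda g: (-degree[g], g))


# Placeholder DEG lists from three datasets (not real gene symbols).
dataset_1 = {"G1", "G2", "G3", "G4"}
dataset_2 = {"G2", "G3", "G4", "G5"}
dataset_3 = {"G2", "G3", "G4", "G6"}
shared = overlapping_degs(dataset_1, dataset_2, dataset_3)

# Placeholder interaction edges; the G1 edge is ignored because G1 is not shared.
ppi_edges = [("G2", "G3"), ("G3", "G4"), ("G1", "G3")]
ranking = rank_by_degree(ppi_edges, shared)
```

Restricting the edges to the shared genes before counting keeps the ranking focused on the overlapping DEG network rather than the full interaction database.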
Affiliation(s)
- Ting Gui
- Department of Obstetrics and Gynecology, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
| | - Chenhe Yao
- Department of R&D Technology Center, Beijing Zhicheng Biomedical Technology Co, Ltd, Beijing, China
| | - Binghan Jia
- Department of R&D Technology Center, Beijing Zhicheng Biomedical Technology Co, Ltd, Beijing, China
| | - Keng Shen
- Department of Obstetrics and Gynecology, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
| |
|
91
|
Li A, Deng Y, Tan Y, Chen M. A novel miRNA-disease association prediction model using dual random walk with restart and space projection federated method. PLoS One 2021; 16:e0252971. [PMID: 34138933 PMCID: PMC8211179 DOI: 10.1371/journal.pone.0252971] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/30/2021] [Accepted: 05/26/2021] [Indexed: 12/27/2022] Open
Abstract
A large number of studies have shown that the variation and disorder of miRNAs are important causes of diseases. The recognition of disease-related miRNAs has become an important topic in the field of biological research. However, the identification of disease-related miRNAs by biological experiments is expensive and time-consuming. Thus, computational models that predict disease-related miRNAs must be developed. A novel network projection-based dual random walk with restart (NPRWR) was used to predict potential disease-related miRNAs. The NPRWR model aims to estimate miRNA-disease associations by using dual random walk with restart and to predict them accurately by using network projection technology. Leave-one-out cross-validation (LOOCV) was adopted to evaluate the prediction performance of NPRWR. The results show that the area under the receiver operating characteristic curve (AUC) of NPRWR was 0.9029, which is superior to that of other advanced miRNA-disease association prediction methods. In addition, lung and kidney neoplasms were selected to present a case study. Among the top 50 miRNAs predicted for these two diseases, 50 and 49, respectively, have been confirmed by databases or relevant literature. Moreover, NPRWR can be used to predict isolated diseases and new miRNAs. LOOCV and the case study achieved good prediction results. Thus, NPRWR will become an effective and accurate disease-miRNA association prediction model.
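The random walk with restart that this abstract builds on can be sketched for a single network as follows. The sketch uses the standard iteration p(t+1) = (1 - r) * W * p(t) + r * p(0) with a column-normalised transition matrix W. It does not reproduce the dual walk or the network projection step of NPRWR, and the function name, the restart probability of 0.5, and the toy graph are assumptions.

```python
def random_walk_with_restart(weights, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker restarting at `seeds`.

    weights is a square list of lists (non-negative edge weights);
    seeds is a list of node indices that share the restart probability.
    """
    n = len(weights)
    # Column-normalise so that each non-empty column sums to 1.
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [weights[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3 with unit weights (hypothetical network).
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path, seeds=[0])
```

Nodes closer to the seed end up with higher scores, which is how such walks rank candidate miRNAs for a disease when the seeds are its known associated miRNAs.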
Affiliation(s)
- Ang Li
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| | - Yingwei Deng
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
- Hainan Key Laboratory for Computational Science and Application, Haikou, China
| | - Yan Tan
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| | - Min Chen
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| |
|
92
|
Zuo ZL, Cao RF, Wei PJ, Xia JF, Zheng CH. Double matrix completion for circRNA-disease association prediction. BMC Bioinformatics 2021; 22:307. [PMID: 34103016 PMCID: PMC8185931 DOI: 10.1186/s12859-021-04231-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/10/2021] [Accepted: 05/28/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Circular RNAs (circRNAs) are a class of single-stranded RNA molecules with a closed-loop structure. A growing body of research has shown that circRNAs are closely related to the development of diseases. Because biological experiments to verify circRNA-disease associations are time-consuming and wasteful of resources, a reliable computational method is needed to predict candidate circRNA-disease associations and make biological experiments more efficient. RESULTS: In this paper, we propose a double matrix completion method (DMCCDA) for predicting potential circRNA-disease associations. First, we constructed a similarity matrix of circRNA and disease according to circRNA sequence information and semantic disease information. We also built a Gaussian interaction profile similarity matrix for circRNA and disease based on experimentally verified circRNA-disease associations. Then, the corresponding circRNA sequence similarity and semantic similarity of disease are used to update the association matrix from the perspective of circRNA and disease, respectively, by matrix multiplication. Finally, from the perspective of circRNA and disease, matrix completion is used to update the matrix block, which is formed by splicing the association matrix obtained in the previous step with the corresponding Gaussian similarity matrix. Compared with other approaches, DMCCDA achieves relatively good results in leave-one-out cross-validation and five-fold cross-validation. Additionally, the results of the case studies illustrate the effectiveness of the DMCCDA model. CONCLUSION: The results show that our method works well for recommending potential circRNAs for a disease for biological experiments.
Affiliation(s)
- Zong-Lan Zuo
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China
| | - Rui-Fen Cao
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China
- Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, Fujian, China
| | - Pi-Jing Wei
- Institute of Physical Science and Information Technology, Anhui University, Hefei, China
| | - Jun-Feng Xia
- Institute of Physical Science and Information Technology, Anhui University, Hefei, China
| | - Chun-Hou Zheng
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China
| |
|
93
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, experimental methods are costly and time-consuming. Thus, it is urgent to develop computational methods for the prediction of MDAs. Based on graph theory, MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve training efficiency and accuracy. This method models both the potential connections of the feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity, and Gaussian interaction profile kernel similarity. Moreover, for the first time, we considered six tasks simultaneously in the MDA prediction problem, which ensures that, under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method achieved satisfactory performance on all six tasks and is significantly superior to classic machine learning methods and state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and of the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs were conducted and achieved satisfactory performance.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Xuhong Wang
- School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
| | - Qiuying Dai
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, China
| | | | | | - Dennis Russell Salahub
- Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| |
|
94
|
Pal JK, Ray SS, Pal SK. Identifying Drug Resistant miRNAs Using Entropy Based Ranking. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:973-984. [PMID: 31398129 DOI: 10.1109/tcbb.2019.2933205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
MicroRNAs play an important role in controlling drug sensitivity and resistance in cancer. Identification of the miRNAs responsible for drug resistance can enhance the effectiveness of treatment. A new set theoretic entropy measure (SPEM) is defined to determine the relevance and level of confidence of miRNAs in deciding their drug resistant nature. Here, a pattern is represented by a pair of values. One of them implies the degree of its belongingness (fuzzy membership) to a class and the other represents the actual class of origin (crisp membership). A measure, called granular probability, is defined that determines the confidence level of having a particular pair of membership values. The granules used to compute this probability are formed by a histogram-based method in which each bin of a histogram is considered as one granule. The width and number of the bins are automatically determined by the algorithm. The set thus defined, comprising a pair of membership values and the confidence level for having them, is used for the computation of SPEM and thereby for identifying the drug resistant miRNAs. The efficiency of SPEM is demonstrated extensively on six data sets. While the F-score achieved in classifying sensitive and resistant samples ranges between 0.31 and 0.50 using all the miRNAs with an SVM classifier, the same score varies from 0.67 to 0.94 using only the top 1 percent drug resistant miRNAs. Superiority of the proposed method as compared to some existing ones is established in terms of F-score. The significance of the top 1 percent miRNAs in the corresponding cancer is also verified by different articles based on biological investigations. Source code of SPEM is available at http://www.jayanta.droppages.com/SPEM.html.
|
95
|
|
96
|
Li J, Zhao H, Xuan Z, Yu J, Feng X, Liao B, Wang L. A Novel Approach for Potential Human LncRNA-Disease Association Prediction Based on Local Random Walk. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1049-1059. [PMID: 31425046 DOI: 10.1109/tcbb.2019.2934958] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 06/10/2023]
Abstract
In recent years, lncRNAs (long non-coding RNAs) have been shown to be closely related to many diseases that are seriously harmful to human health. Although research on clarifying the relationships between lncRNAs and diseases is developing rapidly, associations between lncRNAs and diseases remain largely unknown. In this manuscript, a novel local random walk based prediction model called LRWHLDA is proposed for inferring potential associations between human lncRNAs and diseases. In LRWHLDA, a new heterogeneous network is established first, which allows LRWHLDA to be applied even when known lncRNA-disease associations are lacking. Then, an improved local random walk method is designed for prediction of novel lncRNA-disease associations, which helps LRWHLDA achieve high prediction accuracy with low time complexity. Finally, to evaluate the prediction performance of LRWHLDA, different frameworks such as LOOCV, 2-fold CV, and 5-fold CV were implemented. Simulation results indicate that LRWHLDA achieves reliable AUCs of 0.8037, 0.8354, and 0.8556 under the frameworks of 2-fold CV, 5-fold CV, and LOOCV, respectively. Hence, LRWHLDA has the potential to be a representative emerging method for potential lncRNA-disease association prediction.
|
97
|
Ji C, Gao Z, Ma X, Wu Q, Ni J, Zheng C. AEMDA: inferring miRNA-disease associations based on deep autoencoder. Bioinformatics 2021; 37:66-72. [PMID: 32726399 DOI: 10.1093/bioinformatics/btaa670] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 03/22/2020] [Revised: 05/27/2020] [Accepted: 07/20/2020] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION: MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in various biological processes. Many studies have shown that miRNAs are closely related to the occurrence, development and diagnosis of human diseases. Traditional biological experiments are costly and time-consuming. As a result, effective computational models have become increasingly popular for predicting associations between miRNAs and diseases, which could effectively boost human disease diagnosis and prevention. RESULTS: We propose a novel computational framework, called AEMDA, to identify associations between miRNAs and diseases. AEMDA applies a learning-based method to extract dense and high-dimensional representations of diseases and miRNAs from integrated disease semantic similarity, miRNA functional similarity and heterogeneous related interaction data. In addition, AEMDA adopts a deep autoencoder that does not need negative samples to retrieve the underlying associations between miRNAs and diseases. Furthermore, the reconstruction error is used as a measurement to predict disease-associated miRNAs. Our experimental results indicate that AEMDA can effectively predict disease-related miRNAs and outperforms state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/CunmeiJi/AEMDA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cunmei Ji
- School of Software, Qufu Normal University, Qufu 273165, China
| | - Zhen Gao
- School of Software, Qufu Normal University, Qufu 273165, China
| | - Xu Ma
- School of Software, Qufu Normal University, Qufu 273165, China
| | - Qingwen Wu
- School of Software, Qufu Normal University, Qufu 273165, China
| | - Jiancheng Ni
- School of Software, Qufu Normal University, Qufu 273165, China
| | - Chunhou Zheng
- School of Software, Qufu Normal University, Qufu 273165, China; School of Computer Science and Technology, Anhui University, Hefei 230601, China
| |
|
98
|
Wang YT, Wu QW, Gao Z, Ni JC, Zheng CH. MiRNA-disease association prediction via hypergraph learning based on high-dimensionality features. BMC Med Inform Decis Mak 2021; 21:133. [PMID: 33882934 PMCID: PMC8061020 DOI: 10.1186/s12911-020-01320-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/01/2020] [Accepted: 11/09/2020] [Indexed: 11/10/2022] Open
Abstract
Background: MicroRNAs (miRNAs) have been confirmed to have a close relationship with various complex human diseases. The identification of disease-related miRNAs provides great insights into the underlying pathogenesis of diseases. However, it is still a big challenge to identify which miRNAs are related to diseases. As experimental methods are in general expensive and time-consuming, it is important to develop efficient computational models to discover potential miRNA-disease associations. Methods: This study presents a novel prediction method called HFHLMDA, which is based on high-dimensionality features and hypergraph learning, to reveal the association between diseases and miRNAs. Firstly, the miRNA functional similarity and the disease semantic similarity are integrated to form an informative high-dimensionality feature vector. Then, a hypergraph is constructed by the K-Nearest-Neighbor (KNN) method, in which each miRNA-disease pair and its k most relevant neighbors are linked as one hyperedge to represent the complex relationships among miRNA-disease pairs. Finally, the hypergraph learning model is designed to learn the projection matrix, which is used to calculate the association scores of uncertain miRNA-disease pairs. Results: Compared with four state-of-the-art computational models, HFHLMDA achieved the best results of 92.09% and 91.87% in leave-one-out cross-validation and fivefold cross-validation, respectively. Moreover, in case studies on esophageal neoplasms, hepatocellular carcinoma, and breast neoplasms, 90%, 98%, and 96% of the top 50 predictions, respectively, have been manually confirmed by previous experimental studies. Conclusion: MiRNAs have complex connections with many human diseases. In this study, we proposed a novel computational model to predict the underlying miRNA-disease associations. All results show that the proposed method is effective for miRNA-disease association prediction.
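The KNN hypergraph construction described in the Methods can be sketched as below: every sample (here, a feature vector standing in for a miRNA-disease pair) forms one hyperedge together with its k nearest neighbours. Euclidean distance, the function names, and the toy feature vectors are assumptions; the hypergraph learning step that follows in HFHLMDA is not covered.

```python
import math


def knn_hyperedges(features, k):
    """One hyperedge per sample: the sample itself plus its k nearest neighbours."""
    n = len(features)
    edges = []
    for i in range(n):
        # Sort all other samples by Euclidean distance to sample i.
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        edges.append([i] + [j for _, j in dists[:k]])
    return edges


def incidence_matrix(edges, n_vertices):
    """H[v][e] is 1 when vertex v belongs to hyperedge e, else 0."""
    H = [[0] * len(edges) for _ in range(n_vertices)]
    for e, members in enumerate(edges):
        for v in members:
            H[v][e] = 1
    return H


# Toy feature vectors for four miRNA-disease pairs (hypothetical data).
pairs = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
edges = knn_hyperedges(pairs, k=1)
H = incidence_matrix(edges, len(pairs))
```

The incidence matrix is the usual input to hypergraph learning methods, since vertex and hyperedge degrees can be read off as its row and column sums.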
Affiliation(s)
- Yu-Tian Wang
- School of Software, Qufu Normal University, Qufu, China
| | - Qing-Wen Wu
- School of Software, Qufu Normal University, Qufu, China
| | - Zhen Gao
- School of Software, Qufu Normal University, Qufu, China
| | - Jian-Cheng Ni
- School of Software, Qufu Normal University, Qufu, China
| | - Chun-Hou Zheng
- School of Computer Science and Technology, Anhui University, Hefei, China; College of Mathematics and System Science, Xinjiang University, Urumqi, China
| |
|
99
|
Li HY, You ZH, Wang L, Yan X, Li ZW. DF-MDA: An effective diffusion-based computational model for predicting miRNA-disease association. Mol Ther 2021; 29:1501-1511. [PMID: 33429082 DOI: 10.1016/j.ymthe.2021.01.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/07/2020] [Revised: 12/21/2020] [Accepted: 01/01/2021] [Indexed: 12/28/2022] Open
Abstract
It is reported that microRNAs (miRNAs) play an important role in various human diseases. However, the mechanisms of miRNAs in these diseases have not been fully understood. Therefore, detecting potential miRNA-disease associations has far-reaching significance for understanding pathological development and for the diagnosis and treatment of complex diseases. In this study, we propose a novel diffusion-based computational method, DF-MDA, for predicting miRNA-disease associations based on the assumption that molecules are related to each other in human physiological processes. Specifically, we first construct a heterogeneous network by integrating various known associations among miRNAs, diseases, proteins, long non-coding RNAs (lncRNAs), and drugs. Then, more representative features are extracted through a diffusion-based machine-learning method. Finally, the Random Forest classifier is adopted to classify miRNA-disease associations. In the 5-fold cross-validation experiment, the proposed model obtained an average area under the curve (AUC) of 0.9321 on the HMDD v3.0 dataset. To further verify the prediction performance of the proposed model, DF-MDA was applied to three significant human diseases: lymphoma, lung neoplasms, and colon neoplasms. As a result, 47, 46, and 47 of the top 50 predictions, respectively, were validated by independent databases. These experimental results demonstrated that DF-MDA is a reliable and efficient method for predicting potential miRNA-disease associations.
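The AUC reported for cross-validation experiments such as this one can be computed from predicted scores with the rank-based (Mann-Whitney) formulation: the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below is a generic illustration; the function name and the toy labels and scores are hypothetical and are not taken from DF-MDA.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both positive and negative samples are required")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy held-out fold: 1 = known association, 0 = unlabeled pair (hypothetical data).
fold_labels = [1, 0, 1, 0, 0]
fold_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
fold_auc = roc_auc(fold_labels, fold_scores)
```

In k-fold cross-validation this value would be computed on each held-out fold and then averaged. The pairwise loop is quadratic in the number of samples, so a sort-based variant is preferable for large association matrices.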
Affiliation(s)
- Hao-Yuan Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Lei Wang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
| | - Xin Yan
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China; School of Foreign Languages, Zaozhuang University, Zaozhuang, Shandong 277100, China
| | - Zheng-Wei Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| |
|
100
|
Prediction of miRNA-Disease Association Using Deep Collaborative Filtering. BIOMED RESEARCH INTERNATIONAL 2021; 2021:6652948. [PMID: 33681362 PMCID: PMC7929672 DOI: 10.1155/2021/6652948] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/01/2020] [Revised: 02/01/2021] [Accepted: 02/10/2021] [Indexed: 12/12/2022]
Abstract
Existing studies have shown that miRNAs are related to human diseases by regulating gene expression. Identifying miRNA associations with diseases will contribute to the diagnosis, treatment, and prognosis of diseases. The experimental identification of miRNA-disease associations is time-consuming, tremendously expensive, and prone to a high failure rate. In recent years, many researchers have predicted potential associations between miRNAs and diseases by computational approaches. In this paper, we propose a novel method using deep collaborative filtering, called DCFMDA, to predict potential miRNA-disease associations. To improve prediction performance, we integrated neural network matrix factorization (NNMF) and a multilayer perceptron (MLP) in a deep collaborative filtering framework. We utilized known miRNA-disease associations to capture miRNA-disease interaction features by NNMF and utilized miRNA similarity and disease similarity to extract the miRNA feature vector and disease feature vector, respectively, by MLP. Finally, we merged the outputs of the NNMF and MLP to obtain the prediction matrix. The experimental results indicate that, compared with other existing computational methods, our method can achieve an AUC of 0.9466 based on 10-fold cross-validation. In addition, case studies show that DCFMDA can effectively predict candidate miRNAs for breast neoplasms, colon neoplasms, kidney neoplasms, leukemia, and lymphoma.
|