1. Zhang W, Zhang P, Sun W, Xu J, Liao L, Cao Y, Han Y. Improving plant miRNA-target prediction with self-supervised k-mer embedding and spectral graph convolutional neural network. PeerJ 2024; 12:e17396. [PMID: 38799058 PMCID: PMC11122044 DOI: 10.7717/peerj.17396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 04/25/2024] [Indexed: 05/29/2024]
Abstract
Deciphering the targets of microRNAs (miRNAs) in plants is crucial for comprehending their function and the variation in phenotype that they cause. Because miRNA regulation is highly cell-specific, recent computational approaches usually utilize expression data to identify the most physiologically relevant targets. Although these methods are effective, they typically require a large sample size and high-depth sequencing to detect potential miRNA-target pairs, thereby limiting their applicability in improving plant breeding. In this study, we propose a novel miRNA-target prediction framework named kmerPMTF (k-mer-based prediction framework for plant miRNA-target). Our framework effectively extracts the latent semantic embeddings of sequences by utilizing k-mer splitting and a deep self-supervised neural network. We construct multiple similarity networks based on k-mer embeddings and employ graph convolutional networks to derive deep representations of miRNAs and targets and calculate the probabilities of potential associations. We evaluated the performance of kmerPMTF on four typical plant datasets: Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, and Prunus persica. The results demonstrate its ability to achieve AUPRC values of 84.9%, 91.0%, 80.1%, and 82.1%, respectively, in 5-fold cross-validation. Compared with several state-of-the-art methods, our framework achieves better performance on threshold-independent evaluation metrics. Overall, our study provides an efficient and simplified methodology for identifying plant miRNA-target associations, which will contribute to a deeper comprehension of miRNA regulatory mechanisms in plants.
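The k-mer splitting step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the value of k, the stride, or how tokens are fed to the self-supervised network, so `k=3`, `stride=1`, the function names, and the toy sequence below are assumptions for illustration only, not the kmerPMTF implementation.

```python
from collections import Counter


def split_kmers(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a nucleotide sequence into overlapping k-mers (hypothetical helper)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count k-mer occurrences; a simple bag-of-k-mers view of the sequence."""
    return Counter(split_kmers(sequence, k))


if __name__ == "__main__":
    toy_sequence = "AUGGCUA"  # toy example, not taken from any dataset
    print(split_kmers(toy_sequence, k=3))
    print(kmer_counts(toy_sequence, k=3))
```

Each k-mer then plays the role of a "word", so that sequence embedding models developed for text can, in principle, be applied to the resulting token list.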
Affiliation(s)
- Weihan Zhang
  - CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
  - Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Ping Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, Hubei Province, China
- Weicheng Sun
  - College of Informatics, Huazhong Agricultural University, Wuhan, Hubei Province, China
- Jinsheng Xu
  - College of Informatics, Huazhong Agricultural University, Wuhan, Hubei Province, China
- Liao Liao
  - CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
  - Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Yunpeng Cao
  - CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
  - Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Yuepeng Han
  - CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
  - Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
2. Jiao CN, Zhou F, Liu BM, Zheng CH, Liu JX, Gao YL. Multi-Kernel Graph Attention Deep Autoencoder for MiRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:1110-1121. [PMID: 38055359 DOI: 10.1109/jbhi.2023.3336247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
Accumulating evidence indicates that microRNAs (miRNAs) can control and coordinate various biological processes. Consequently, abnormal expression of miRNAs has been linked to various complex diseases. Identification of miRNA-disease associations (MDAs) will contribute to the diagnosis and treatment of human diseases. Nevertheless, traditional experimental verification of MDAs is laborious and limited to small scale. Therefore, it is necessary to develop reliable and effective computational methods to predict novel MDAs. In this work, a multi-kernel graph attention deep autoencoder (MGADAE) method is proposed to predict potential MDAs. In detail, MGADAE first employs the multiple kernel learning (MKL) algorithm to construct an integrated miRNA similarity and disease similarity, providing more biological information for further feature learning. Second, MGADAE combines the known MDAs, disease similarity, and miRNA similarity into a heterogeneous network, then learns the representations of miRNAs and diseases through graph convolution operations. After that, an attention mechanism is introduced into MGADAE to integrate the representations from multiple graph convolutional network (GCN) layers. Lastly, the integrated representations of miRNAs and diseases are input into a bilinear decoder to obtain the final predicted association scores. Experiments show that the proposed method outperforms existing advanced approaches in MDA prediction. Furthermore, case studies related to two human cancers provide further confirmation of the reliability of MGADAE in practice.
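The final step described above, a bilinear decoder that turns a miRNA representation and a disease representation into an association score, can be sketched as follows. This is a generic bilinear form with a sigmoid output; the weight matrix, the embedding values, and the use of a sigmoid are assumptions for illustration, since the abstract does not give these details of MGADAE.

```python
import math


def bilinear_score(z_mirna: list[float],
                   weight: list[list[float]],
                   z_disease: list[float]) -> float:
    """Return sigmoid(z_mirna^T * W * z_disease) as an association probability."""
    # W * z_disease: one dot product per row of the weight matrix.
    projected = [sum(w * d for w, d in zip(row, z_disease)) for row in weight]
    # z_mirna^T * (W * z_disease)
    logit = sum(m * p for m, p in zip(z_mirna, projected))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical (untrained) weight matrix
    print(bilinear_score([1.0, 0.0], identity, [1.0, 0.0]))  # aligned embeddings
    print(bilinear_score([1.0, 0.0], identity, [0.0, 1.0]))  # orthogonal embeddings
```

In a trained model the weight matrix is learned jointly with the encoder, so that known associations receive scores close to 1.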
3. Han Y, Zhou Q, Liu L, Li J, Zhou Y. DNI-MDCAP: improvement of causal MiRNA-disease association prediction based on deep network imputation. BMC Bioinformatics 2024; 25:22. [PMID: 38216907 PMCID: PMC10785389 DOI: 10.1186/s12859-024-05644-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 01/08/2024] [Indexed: 01/14/2024]
Abstract
BACKGROUND MiRNAs are involved in the occurrence and development of many diseases. Extensive literature studies have demonstrated that miRNA-disease associations are stratified and encompass ~ 20% causal associations. Computational models that predict causal miRNA-disease associations provide effective guidance in identifying novel interpretations of disease mechanisms and potential therapeutic targets. Although several predictive models for miRNA-disease associations exist, it is still challenging to discriminate causal miRNA-disease associations from non-causal ones. Hence, there is a pressing need to develop an efficient prediction model for causal miRNA-disease association prediction. RESULTS We developed DNI-MDCAP, an improved computational model that incorporated additional miRNA similarity metrics, deep graph embedding learning-based network imputation and semi-supervised learning framework. Through extensive predictive performance evaluation, including tenfold cross-validation and independent test, DNI-MDCAP showed excellent performance in identifying causal miRNA-disease associations, achieving an area under the receiver operating characteristic curve (AUROC) of 0.896 and 0.889, respectively. Regarding the challenge of discriminating causal miRNA-disease associations from non-causal ones, DNI-MDCAP exhibited superior predictive performance compared to existing models MDCAP and LE-MDCAP, reaching an AUROC of 0.870. Wilcoxon test also indicated significantly higher prediction scores for causal associations than for non-causal ones. Finally, the potential causal miRNA-disease associations predicted by DNI-MDCAP, exemplified by diabetic nephropathies and hsa-miR-193a, have been validated by recently published literature, further supporting the reliability of the prediction model. CONCLUSIONS DNI-MDCAP is a dedicated tool to specifically distinguish causal miRNA-disease associations with substantially improved accuracy. 
DNI-MDCAP is freely accessible at http://www.rnanut.net/DNIMDCAP/ .
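The AUROC values reported in this and several other abstracts in this list can be read as the probability that a randomly chosen positive pair is scored higher than a randomly chosen negative pair. A minimal sketch of that pairwise definition follows; the labels and scores are made-up toy values, and the quadratic-time loop is chosen for clarity rather than efficiency.

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    toy_labels = [1, 1, 0, 0]          # toy ground truth
    toy_scores = [0.9, 0.4, 0.5, 0.1]  # toy prediction scores
    print(auroc(toy_labels, toy_scores))
```

Because the measure depends only on the ranking of scores, it is threshold-independent, which is why it is often reported alongside AUPR for heavily imbalanced association data.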
Affiliation(s)
- Yu Han
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Qiong Zhou
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Leibo Liu
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Jianwei Li
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Yuan Zhou
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
  - State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing, China
4. Zhou F, Yin MM, Jiao CN, Zhao JX, Zheng CH, Liu JX. Predicting miRNA-Disease Associations Through Deep Autoencoder With Multiple Kernel Learning. IEEE Trans Neural Netw Learn Syst 2023; 34:5570-5579. [PMID: 34860656 DOI: 10.1109/tnnls.2021.3129772] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 06/13/2023]
Abstract
Determining microRNA (miRNA)-disease associations (MDAs) is an integral part of the prevention, diagnosis, and treatment of complex diseases. However, wet-lab experiments to discern MDAs are inefficient and expensive. Hence, the development of reliable and efficient data-integrative models for predicting MDAs is of great significance. In the present work, a novel deep learning method for predicting MDAs through deep autoencoder with multiple kernel learning (DAEMKL) is presented. First, DAEMKL applies multiple kernel learning (MKL) in miRNA space and disease space to construct a miRNA similarity network and a disease similarity network, respectively. Then, for each disease or miRNA, its feature representation is learned from the miRNA similarity network and disease similarity network via a regression model. After that, the integrated miRNA feature representation and disease feature representation are input into a deep autoencoder (DAE). Novel MDAs are then predicted through reconstruction error. The AUC results show that DAEMKL achieves outstanding performance. In addition, case studies of three complex diseases further indicate that DAEMKL has excellent predictive performance and can discover a large number of underlying MDAs. Overall, DAEMKL is an effective method for identifying MDAs.
5. Zhang W, Liu B. iSnoDi-MDRF: Identifying snoRNA-Disease Associations Based on Multiple Biological Data by Ranking Framework. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:3013-3019. [PMID: 37030816 DOI: 10.1109/tcbb.2023.3258448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Accumulating evidence indicates that the dysregulation of small nucleolar RNAs (snoRNAs) is associated with diseases. Identifying snoRNA-disease associations by computational methods is desirable for biologists, as it can save considerable cost and time compared with biological experiments. However, it still faces the following challenges: (i) many snoRNAs have been detected in recent years, but only a few have been proved to be associated with diseases; (ii) computational predictors trained with only a few known snoRNA-disease associations fail to accurately identify snoRNA-disease associations. In this study, we propose a ranking framework, called iSnoDi-MDRF, to identify potential snoRNA-disease associations based on multiple biological data, which has the following highlights: (i) iSnoDi-MDRF integrates a ranking framework, which can not only identify potential associations between known snoRNAs and diseases but also identify diseases associated with new snoRNAs; (ii) known gene-disease associations are employed to help train a mature model for predicting snoRNA-disease associations. Experimental results illustrate that iSnoDi-MDRF is very suitable for identifying potential snoRNA-disease associations. The web server of the iSnoDi-MDRF predictor is freely available at http://bliulab.net/iSnoDi-MDRF/.
6. Jabeer A, Temiz M, Bakir-Gungor B, Yousef M. miRdisNET: Discovering microRNA biomarkers that are associated with diseases utilizing biological knowledge-based machine learning. Front Genet 2023; 13:1076554. [PMID: 36712859 PMCID: PMC9877296 DOI: 10.3389/fgene.2022.1076554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 12/30/2022] [Indexed: 01/14/2023]
Abstract
In recent years, biological experiments and increasing evidence have shown that microRNAs play an important role in the diagnosis and treatment of human complex diseases. Therefore, to diagnose and treat human complex diseases, it is necessary to reveal the associations between a specific disease and related miRNAs. Although current computational models based on machine learning attempt to determine miRNA-disease associations, the accuracy of these models needs to be improved, and candidate miRNA-disease relations need to be evaluated from a biological perspective. In this paper, we propose a computational model named miRdisNET to predict potential miRNA-disease associations. Specifically, miRdisNET requires two types of data as input files: miRNA expression profiles and known disease-miRNA associations. First, we generate subsets of specific diseases by applying the grouping component. These subsets contain miRNA expressions with class labels associated with each specific disease. Then, we assign an importance score to each group by using a machine learning method for classification. Finally, we apply a modeling component and obtain outputs. One of the most important outputs of miRdisNET is the performance of miRNA-disease prediction. Compared with the existing methods, miRdisNET obtained the highest AUC value of 0.9998. Another output of miRdisNET is a list of significant miRNAs for the disease under study. The miRNAs identified by miRdisNET are validated by referring to the gold-standard databases that hold information on experimentally verified microRNA-disease associations. miRdisNET has been developed to predict candidate miRNAs for new diseases, where the miRNA-disease relation is not yet known. In addition, miRdisNET presents candidate disease-disease associations based on shared miRNA knowledge. The miRdisNET tool and other supplementary files are publicly available at: https://github.com/malikyousef/miRdisNET.
Affiliation(s)
- Amhar Jabeer
  - Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, Turkey
- Mustafa Temiz
  - Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, Turkey
- Burcu Bakir-Gungor
  - Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, Turkey
- Malik Yousef
  - Department of Information Systems, Zefat Academic College, Zefat, Israel
  - Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel
- *Correspondence: Malik Yousef; Mustafa Temiz
7. Zhang W, Liu B. iSnoDi-LSGT: identifying snoRNA-disease associations based on local similarity constraints and global topological constraints. RNA 2022; 28:1558-1567. [PMID: 36192132 PMCID: PMC9670808 DOI: 10.1261/rna.079325.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/22/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Growing evidence proves that small nucleolar RNAs (snoRNAs) have important functions in various biological processes, the malfunction of which leads to the emergence and development of complex diseases. However, identifying snoRNA-disease associations is an ongoing challenging task due to the considerable time- and money-consuming biological experiments. Therefore, it is urgent to design efficient and economical methods for the identification of snoRNA-disease associations. In this regard, we propose a computational method named iSnoDi-LSGT, which utilizes snoRNA sequence similarity and disease similarity as local similarity constraints. The iSnoDi-LSGT predictor further employs network embedding technology to extract topological features of snoRNAs and diseases, based on which snoRNA topological similarity and disease topological similarity are calculated as global topological constraints. To the best of our knowledge, the iSnoDi-LSGT is the first computational method for snoRNA-disease association identification. The experimental results indicate that the iSnoDi-LSGT predictor can effectively predict unknown snoRNA-disease associations. The web server of the iSnoDi-LSGT predictor is freely available at http://bliulab.net/iSnoDi-LSGT.
Affiliation(s)
- Wenxiang Zhang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
8. Li W, Wang S, Xu J, Xiang J. Inferring Latent MicroRNA-Disease Associations on a Gene-Mediated Tripartite Heterogeneous Multiplexing Network. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3190-3201. [PMID: 35041612 DOI: 10.1109/tcbb.2022.3143770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding single-stranded RNA molecules encoded by endogenous genes, with a length of about 22 nucleotides. MiRNAs have been successfully identified as differentially expressed in various cancers. There is evidence that disorders of miRNAs are associated with a variety of complex diseases. Therefore, inferring potential miRNA-disease associations (MDAs) is very important for understanding the aetiology and pathogenesis of many diseases and is useful for disease diagnosis, prognosis and treatment. First, we fused multiple similarity subnetworks from multiple sources for miRNAs, genes and diseases, respectively, by multiplexing technology. Then, the three multiplexed biological subnetworks are connected through the extended binary association to form a tripartite complete heterogeneous multiplexed network (Tri-HM). Finally, because the constructed Tri-HM network retains the subnetworks' original topology and biological functions and expands the binary association and dependence between the three biological entities, rich neighbourhood information is obtained iteratively from neighbours by a non-equilibrium random walk. Our Tri-HM-RWR model obtained an AUC value of 0.8657 and an AUPR value of 0.2139 in the global 5-fold cross-validation, which shows that our model can more fully infer disease-related miRNAs.
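To make the random-walk idea concrete, the following sketch implements the standard random walk with restart (RWR) on a single small graph. It is not the non-equilibrium walk on the multiplexed Tri-HM network described in the abstract; the toy adjacency matrix, the restart probability of 0.5, and the function name are assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency: list[list[float]],
                             seed: int,
                             restart: float = 0.5,
                             tol: float = 1e-10,
                             max_iter: int = 1000) -> list[float]:
    """Iterate p <- (1 - restart) * W * p + restart * p0 until convergence.

    W is the column-normalised adjacency matrix and p0 is a one-hot
    vector on the seed node.
    """
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    p0[seed] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        # Stop when the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2; the walker restarts at node 0.
    path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(random_walk_with_restart(path_graph, seed=0))
```

The stationary vector ranks nodes by proximity to the seed, which is how walk-based methods of this kind prioritise candidate miRNAs for a query disease.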
9. Dong N, Mucke S, Khosla M. MuCoMiD: A Multitask Graph Convolutional Learning Framework for miRNA-Disease Association Prediction. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3081-3092. [PMID: 35594217 DOI: 10.1109/tcbb.2022.3176456] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Growing evidence from recent studies implies that microRNAs or miRNAs could serve as biomarkers in various complex human diseases. Since wet-lab experiments for detecting miRNAs associated with a disease are expensive and time-consuming, machine learning techniques for miRNA-disease association prediction have attracted much attention in recent years. A big challenge in building reliable machine learning models is that of data scarcity. In particular, existing approaches trained on the available small datasets, even when combined with precalculated handcrafted input features, often suffer from bad generalization and data leakage problems. We overcome the limitations of existing works by proposing a novel multitask graph convolution-based approach, which we refer to as MuCoMiD. MuCoMiD allows automatic feature extraction while incorporating knowledge from five heterogeneous biological information sources (associations between miRNAs/diseases and protein-coding genes (PCGs), interactions between protein-coding genes, miRNA family information, and disease ontology) in a multitask setting which is a novel perspective and has not been studied before. To effectively test the generalization capability of our model, we conduct large-scale experiments on the standard benchmark datasets as well as on our proposed large independent testing sets and case studies. MuCoMiD obtains significantly higher Average Precision (AP) scores than all benchmarked models on three large independent testing sets, especially those with many new miRNAs, as well as in the detection of false positives. Thanks to its capability of learning directly from raw input information, MuCoMiD is easier to maintain and update than handcrafted feature-based methods, which would require recomputation of features every time there is a change in the original information sources (e.g., disease ontology, miRNA/disease-PCG associations, etc.). 
We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/cmtt.
10. Lu S, Liang Y, Li L, Liao S, Ouyang D. Inferring human miRNA–disease associations via multiple kernel fusion on GCNII. Front Genet 2022; 13:980497. [PMID: 36134032 PMCID: PMC9483142 DOI: 10.3389/fgene.2022.980497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 07/20/2022] [Indexed: 11/16/2022]
Abstract
Increasing evidence shows that the occurrence of human complex diseases is closely related to the mutation and abnormal expression of microRNAs (miRNAs). MiRNAs have complex and fine regulatory mechanisms, which makes them promising targets for drug discovery and disease diagnosis. Therefore, predicting potential miRNA–disease associations has practical significance. In this paper, we propose a miRNA–disease association prediction method based on multiple kernel fusion on Graph Convolutional Network via Initial residual and Identity mapping (GCNII), called MKFGCNII. First, we built a heterogeneous network of miRNAs and diseases to extract multi-layer features via GCNII. Second, a multiple kernel fusion method was applied for weighted fusion of the embeddings at each layer. Finally, Dual Laplacian Regularized Least Squares was used to predict new miRNA–disease associations with the combined kernel in miRNA and disease spaces. Compared with the other methods, MKFGCNII obtained the highest AUC value of 0.9631. Code is available at https://github.com/cuntjx/bioInfo.
Affiliation(s)
- Shanghui Lu
  - School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, China
  - School of Mathematics and Physics, Hechi University, Hechi, China
- Yong Liang
  - School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, China
  - Peng Cheng Laboratory, Shenzhen, China
  - *Correspondence: Yong Liang
- Le Li
  - School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, China
- Shuilin Liao
  - School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, China
- Dong Ouyang
  - School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, China
11. Yang M, Huang ZA, Gu W, Han K, Pan W, Yang X, Zhu Z. Prediction of biomarker-disease associations based on graph attention network and text representation. Brief Bioinform 2022; 23:6651308. [PMID: 35901464 DOI: 10.1093/bib/bbac298] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/03/2022] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 02/06/2023]
Abstract
MOTIVATION The associations between biomarkers and human diseases play a key role in understanding complex pathology and developing targeted therapies. Wet lab experiments for biomarker discovery are costly, laborious and time-consuming. Computational prediction methods can be used to greatly expedite the identification of candidate biomarkers. RESULTS Here, we present a novel computational model named GTGenie for predicting the biomarker-disease associations based on graph and text features. In GTGenie, a graph attention network is utilized to characterize diverse similarities of biomarkers and diseases from heterogeneous information resources. Meanwhile, a pretrained BERT-based model is applied to learn the text-based representation of biomarker-disease relation from biomedical literature. The captured graph and text features are then integrated in a bimodal fusion network to model the hybrid entity representation. Finally, inductive matrix completion is adopted to infer the missing entries for reconstructing relation matrix, with which the unknown biomarker-disease associations are predicted. Experimental results on HMDD, HMDAD and LncRNADisease data sets showed that GTGenie can obtain competitive prediction performance with other state-of-the-art methods. AVAILABILITY The source code of GTGenie and the test data are available at: https://github.com/Wolverinerine/GTGenie.
Affiliation(s)
- Minghao Yang
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
- Zhi-An Huang
  - Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
- Wenhao Gu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
  - GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Kun Han
  - GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Wenying Pan
  - GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Xiao Yang
  - GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Zexuan Zhu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
12. Wang W, Chen H. Predicting miRNA-disease associations based on graph attention networks and dual Laplacian regularized least squares. Brief Bioinform 2022; 23:6645486. [PMID: 35849099 DOI: 10.1093/bib/bbac292] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/30/2022] [Revised: 06/23/2022] [Accepted: 06/26/2022] [Indexed: 01/05/2023]
Abstract
Increasing biomedical evidence has proved that the dysregulation of miRNAs is associated with human complex diseases. Identification of disease-related miRNAs is of great importance for disease prevention, diagnosis and remedy. To reduce the time and cost of biomedical experiments, there is a strong incentive to develop efficient computational methods to infer potential miRNA-disease associations. Although many computational approaches have been proposed to address this issue, the prediction accuracy needs to be further improved. In this study, we present a computational framework MKGAT to predict possible associations between miRNAs and diseases through graph attention networks (GATs) using dual Laplacian regularized least squares. We use GATs to learn embeddings of miRNAs and diseases on each layer from initial input features of known miRNA-disease associations, intra-miRNA similarities and intra-disease similarities. We then calculate kernel matrices of miRNAs and diseases based on Gaussian interaction profile (GIP) with the learned embeddings. We further fuse the kernel matrices of each layer and initial similarities with attention mechanism. Dual Laplacian regularized least squares are finally applied for new miRNA-disease association predictions with the fused miRNA and disease kernels. Compared with six state-of-the-art methods by 5-fold cross-validations, our method MKGAT receives the highest AUROC value of 0.9627 and AUPR value of 0.7372. We use MKGAT to predict related miRNAs for three cancers and discover that all the top 50 predicted results in the three diseases are confirmed by existing databases. The excellent performance indicates that MKGAT would be a useful computational tool for revealing disease-related miRNAs.
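The Gaussian interaction profile (GIP) kernel mentioned above can be sketched as follows. The kernel is K(i, j) = exp(-gamma * ||x_i - x_j||^2), with the bandwidth gamma normalised by the mean squared norm of the input vectors; that normalisation is the convention commonly used for GIP kernels and is an assumption here, as the abstract does not specify it. In MKGAT the inputs would be learned embeddings, whereas the vectors below are made-up toy profiles.

```python
import math


def gip_kernel(profiles: list[list[float]],
               gamma_prime: float = 1.0) -> list[list[float]]:
    """Gaussian interaction profile kernel over the rows of `profiles`.

    gamma = gamma_prime / (mean squared Euclidean norm of the rows).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are zero; bandwidth is undefined")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


if __name__ == "__main__":
    # Toy profiles: rows 0 and 1 are identical, row 2 differs.
    toy_profiles = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    for row in gip_kernel(toy_profiles):
        print([round(v, 4) for v in row])
```

Entities with identical profiles get a similarity of 1, and the similarity decays smoothly toward 0 as their profiles diverge.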
Affiliation(s)
- Wengang Wang
  - School of Software, East China Jiaotong University, Nanchang 330013, China
- Hailin Chen
  - School of Software, East China Jiaotong University, Nanchang 330013, China
13. Tarallo S, Ferrero G, De Filippis F, Francavilla A, Pasolli E, Panero V, Cordero F, Segata N, Grioni S, Pensa RG, Pardini B, Ercolini D, Naccarati A. Stool microRNA profiles reflect different dietary and gut microbiome patterns in healthy individuals. Gut 2022; 71:1302-1314. [PMID: 34315772 PMCID: PMC9185830 DOI: 10.1136/gutjnl-2021-325168] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 05/16/2021] [Accepted: 07/15/2021] [Indexed: 02/07/2023]
Abstract
OBJECTIVES MicroRNA (miRNA) profiles have been evaluated in several biospecimens in relation to common diseases for which diet may have a considerable impact. We aimed at characterising how specific diets are associated with the miRNome in stool of vegans, vegetarians and omnivores and how this is reflected in the gut microbial composition, as this is still poorly explored. DESIGN We performed small RNA and shotgun metagenomic sequencing in faecal samples and dietary recording from 120 healthy volunteers, equally distributed for the different diets and matched for sex and age. RESULTS We found 49 miRNAs differentially expressed among vegans, vegetarians and omnivores (adj. p <0.05) and confirmed trends of expression levels of such miRNAs in vegans and vegetarians compared with an independent cohort of 45 omnivores. Two miRNAs related to lipid metabolism, miR-636 and miR-4739, were inversely correlated to the non-omnivorous diet duration, independently of subject age. Seventeen miRNAs correlated (|rho|>0.22, adj. p <0.05) with the estimated intake of nutrients, particularly animal proteins, phosphorus and, interestingly, lipids. In omnivores, higher Prevotella and Roseburia and lower Bacteroides abundances than in vegans and vegetarians were observed. Lipid metabolism-related miR-425-3p and miR-638 expression levels were associated with increased abundances of microbial species, such as Roseburia sp. CAG 182 and Akkermansia muciniphila, specific to different diets. An integrated analysis identified 25 miRNAs, 25 taxa and 7 dietary nutrients that clearly discriminated (area under the receiver operating characteristic curve=0.89) the three diets. CONCLUSION Stool miRNA profiles are associated with specific diets and support the role of lipids as a driver of epigenetic changes and host-microbial molecular interactions in the gut.
Affiliation(s)
- Sonia Tarallo
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy,Candiolo Cancer Institute - FPO IRCCS, Candiolo, Torino, Italy
| | - Giulio Ferrero
- Department of Computer Science, University of Torino, Torino, Italy,Department of Clinical and Biological Sciences, University of Torino, Torino, Italy
| | - Francesca De Filippis
- Department Agricultural Sciences, University of Naples Federico II, Portici, Napoli, Italy,Task Force on Microbiome Studies, University of Naples Federico II, Napoli, Italy
| | - Antonio Francavilla
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy,Candiolo Cancer Institute - FPO IRCCS, Candiolo, Torino, Italy
| | - Edoardo Pasolli
- Department Agricultural Sciences, University of Naples Federico II, Portici, Napoli, Italy,Task Force on Microbiome Studies, University of Naples Federico II, Napoli, Italy
| | - Valentina Panero
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy
| | | | - Nicola Segata
- Centre for Integrative Biology, University of Trento, Trento, Italy
| | - Sara Grioni
- Epidemiology and Prevention Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
| | | | - Barbara Pardini
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy,Candiolo Cancer Institute - FPO IRCCS, Candiolo, Torino, Italy
| | - Danilo Ercolini
- Department Agricultural Sciences, University of Naples Federico II, Portici, Napoli, Italy .,Task Force on Microbiome Studies, University of Naples Federico II, Napoli, Italy
| | - Alessio Naccarati
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy .,Candiolo Cancer Institute - FPO IRCCS, Candiolo, Torino, Italy
|
14
|
Yu G, Yang Y, Yan Y, Guo M, Zhang X, Wang J. DeepIDA: Predicting Isoform-Disease Associations by Data Fusion and Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2166-2176. [PMID: 33571094 DOI: 10.1109/tcbb.2021.3058801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Alternative splicing produces different isoforms from the same gene locus and is an important mechanism for regulating gene expression and proteome diversity. Although the prediction of gene(ncRNA)-disease associations has been extensively studied, few (or no) computational solutions have been proposed for the prediction of isoform-disease association (IDA) at a large scale, mainly due to the lack of disease annotations of isoforms. However, increasing evidence confirms the associations between diseases and isoforms, which can more precisely uncover the pathology of complex diseases. Therefore, it is highly desirable to predict IDAs. To bridge this gap, we propose a deep neural network based solution (DeepIDA) to fuse multi-type genomics and transcriptomics data to predict IDAs. Particularly, DeepIDA uses gene-isoform relations to dispatch gene-disease associations to isoforms. In addition, it utilizes two DNN sub-networks with different structures to capture nucleotide and expression features of isoforms, Gene Ontology data and miRNA target data, respectively. After that, these two sub-networks are merged in a dense layer to predict IDAs. The experimental results on public datasets show that DeepIDA can effectively predict IDAs with AUPRC (area under the precision-recall curve) of 0.9141, macro F-measure of 0.9155, G-mean of 0.9278 and balanced accuracy of 0.9303 across 732 diseases, which are much higher than those of competitive methods. Further study on sixteen isoform-disease association cases again corroborates the superiority of DeepIDA. The code of DeepIDA is available at http://mlda.swu.edu.cn/codes.php?name=DeepIDA.
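The threshold-dependent metrics named in this abstract (F-measure, G-mean, balanced accuracy) can be illustrated for a single disease label. The following is a minimal sketch under stated assumptions, not DeepIDA's evaluation code: the function name `binary_metrics` and the toy labels are hypothetical, and a macro F-measure would additionally average the per-disease F-measures.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return F-measure, G-mean and balanced accuracy for 0/1 labels.

    Hypothetical helper for illustration; not taken from DeepIDA.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Recall (sensitivity) and specificity are the per-class accuracies.
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0

    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "f_measure": f_measure,
        # G-mean: geometric mean of recall and specificity.
        "g_mean": math.sqrt(recall * specificity),
        # Balanced accuracy: arithmetic mean of recall and specificity.
        "balanced_accuracy": (recall + specificity) / 2,
    }


metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

Both G-mean and balanced accuracy weight the positive and negative classes equally, which is presumably why they are reported alongside AUPRC when associated pairs are rare.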
|
15
|
Ji C, Wang Y, Gao Z, Li L, Ni J, Zheng C. A Semi-Supervised Learning Method for MiRNA-Disease Association Prediction Based on Variational Autoencoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2049-2059. [PMID: 33735084 DOI: 10.1109/tcbb.2021.3067338] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 06/12/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs that play a critical role in many biological processes, such as cell growth, development, differentiation and aging. A growing number of studies have revealed that miRNAs are closely involved in many human diseases. Therefore, the prediction of miRNA-disease associations is of great significance to the study of the pathogenesis, diagnosis and intervention of human disease. However, biological experimental methods are usually expensive in time and money, while computational methods can provide an efficient way to infer the underlying disease-related miRNAs. In this study, we propose a novel method to predict potential miRNA-disease associations, called SVAEMDA. Our method mainly considers miRNA-disease association prediction as a semi-supervised learning problem. SVAEMDA integrates disease semantic similarity, miRNA functional similarity and respective Gaussian interaction profile (GIP) similarities. The integrated similarities are used to learn the representations of diseases and miRNAs. SVAEMDA trains a variational autoencoder based predictor by using known miRNA-disease associations, with the form of concatenated dense vectors. Reconstruction probability of the predictor is used to measure the correlation of the miRNA-disease pairs. Experimental results show that SVAEMDA outperforms other state-of-the-art methods. AUC values of SVAEMDA of global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV) are 0.9464 and 0.9428 respectively. In addition, case studies of three common human diseases indicate that SVAEMDA obtains 100 percent of the top 50 predicted candidates in the benchmark databases. Therefore, SVAEMDA can efficiently and accurately predict the potential associations between diseases and miRNAs.
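The Gaussian interaction profile (GIP) similarity mentioned in this abstract is commonly defined as a Gaussian kernel over rows of the known association matrix, with the bandwidth normalised by the mean squared norm of the profiles. The sketch below follows that common definition; it is an assumption that SVAEMDA uses exactly this form, and the function name `gip_similarity`, the parameter `gamma_prime` and the toy matrix are illustrative only.

```python
import math


def gip_similarity(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a 0/1 matrix.

    profiles: one interaction profile (list of 0/1 values) per entity,
    e.g. one row per miRNA with one column per disease.
    gamma_prime: bandwidth before normalisation (assumed default of 1).
    """
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / len(profiles)
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm

    n = len(profiles)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum(
                (a - b) ** 2 for a, b in zip(profiles[i], profiles[j])
            )
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Three hypothetical miRNAs described by associations with three diseases.
sim = gip_similarity([[1, 0, 1], [1, 0, 1], [0, 1, 0]])
```

Entities with identical profiles receive a similarity of 1, and the similarity decays as the profiles diverge. The same function applied to the transposed matrix would give the disease-side similarity.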
|
16
|
Wang KR, McGeachie MJ. DisiMiR: Predicting Pathogenic miRNAs Using Network Influence and miRNA Conservation. Noncoding RNA 2022; 8:ncrna8040045. [PMID: 35893228 PMCID: PMC9326518 DOI: 10.3390/ncrna8040045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Revised: 06/17/2022] [Accepted: 06/21/2022] [Indexed: 11/16/2022] Open
Abstract
MiRNAs have been shown to play a powerful regulatory role in the progression of serious diseases, including cancer, Alzheimer's, and others, raising the possibility of new miRNA-based therapies for these conditions. Current experimental methods, such as differential expression analysis, can discover disease-associated miRNAs, yet many of these miRNAs play no functional role in disease progression. Interventional experiments used to discover disease causal miRNAs can be time consuming and costly. We present DisiMiR: a novel computational method that predicts pathogenic miRNAs by inferring biological characteristics of pathogenicity, including network influence and evolutionary conservation. DisiMiR separates disease causal miRNAs from merely disease-associated miRNAs, and was accurate in four diseases: breast cancer (0.826 AUC), Alzheimer's (0.794 AUC), gastric cancer (0.853 AUC), and hepatocellular cancer (0.957 AUC). Additionally, DisiMiR can generate hypotheses effectively: 78.4% of its false positives that are mentioned in the literature have been confirmed to be causal through recently published research. In this work, we show that DisiMiR is a powerful tool that can be used to efficiently and flexibly predict pathogenic miRNAs in an expression dataset, for the further elucidation of disease mechanisms, and the potential identification of novel drug targets.
Affiliation(s)
| | - Michael J. McGeachie
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA;
|
17
|
Paul S, Ruiz-Manriquez LM, Ambriz-Gonzalez H, Medina-Gomez D, Valenzuela-Coronado E, Moreno-Gomez P, Pathak S, Chakraborty S, Srivastava A. Impact of smoking-induced dysregulated human miRNAs in chronic disease development and their potential use in prognostic and therapeutic purposes. J Biochem Mol Toxicol 2022; 36:e23134. [PMID: 35695328 DOI: 10.1002/jbt.23134] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/17/2021] [Revised: 04/20/2022] [Accepted: 05/29/2022] [Indexed: 12/14/2022]
Abstract
MicroRNAs (miRNAs) are evolutionary conserved small noncoding RNA molecules with a significant ability to regulate gene expression at the posttranscriptional level either through translation repression or messenger RNA degradation. miRNAs are differentially expressed in various pathophysiological conditions, affecting the course of the disease by modulating several critical target genes. As the persistence of irreversible molecular changes caused by cigarette smoking is central to the pathogenesis of various chronic diseases, several studies have shown its direct correlation with the dysregulation of different miRNAs, affecting numerous essential biological processes. This review provides an insight into the current status of smoking-induced miRNAs dysregulation in chronic diseases such as COPD, atherosclerosis, pulmonary hypertension, and different cancers and explores the diagnostic/prognostic potential of miRNA-based biomarkers and their efficacy as therapeutic targets.
Affiliation(s)
- Sujay Paul
- Tecnologico de Monterrey, School of Engineering and Sciences, Campus Queretaro, Av. Epigmenio Gonzalez, San Pablo, Queretaro, Mexico
| | - Luis M Ruiz-Manriquez
- Tecnologico de Monterrey, School of Engineering and Sciences, Campus Queretaro, Av. Epigmenio Gonzalez, San Pablo, Queretaro, Mexico
| | - Hector Ambriz-Gonzalez
- Tecnologico de Monterrey, School of Engineering and Sciences, Campus Queretaro, Av. Epigmenio Gonzalez, San Pablo, Queretaro, Mexico
| | - Daniel Medina-Gomez
- Tecnologico de Monterrey, School of Engineering and Sciences, Campus Queretaro, Av. Epigmenio Gonzalez, San Pablo, Queretaro, Mexico
| | - Estefania Valenzuela-Coronado
- Tecnologico de Monterrey, School of Engineering and Sciences, Campus Queretaro, Av. Epigmenio Gonzalez, San Pablo, Queretaro, Mexico
| | - Paloma Moreno-Gomez
- Tecnologico de Monterrey, School of Engineering and Sciences, Campus Queretaro, Av. Epigmenio Gonzalez, San Pablo, Queretaro, Mexico
| | - Surajit Pathak
- Department of Medical Biotechnology, Faculty of Allied Health Sciences, Chettinad Hospital and Research Institute (CHRI), Chettinad Academy of Research and Education (CARE), Kelambakkam, Chennai, Tamil Nadu, India
| | - Samik Chakraborty
- Division of Nephrology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Aashish Srivastava
- Section of Bioinformatics, Clinical Laboratory, Haukeland University Hospital, Bergen, Norway.,Department of Clinical Science, University of Bergen, Bergen, Norway
|
18
|
Huang Z, Han Y, Liu L, Cui Q, Zhou Y. LE-MDCAP: A Computational Model to Prioritize Causal miRNA-Disease Associations. Int J Mol Sci 2021; 22:ijms222413607. [PMID: 34948403 PMCID: PMC8706837 DOI: 10.3390/ijms222413607] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/15/2021] [Revised: 12/12/2021] [Accepted: 12/14/2021] [Indexed: 01/03/2023] Open
Abstract
MicroRNAs (miRNAs) are associated with various complex human diseases and some miRNAs can be directly involved in the mechanisms of disease. Identifying disease-causative miRNAs can provide novel insight into disease pathogenesis from a miRNA perspective and facilitate disease treatment. To date, various computational models have been developed to predict general miRNA-disease associations, but few models are available to further prioritize causal miRNA-disease associations from non-causal associations. Therefore, in this study, we constructed a Levenshtein-Distance-Enhanced miRNA-disease Causal Association Predictor (LE-MDCAP), to predict potential causal miRNA-disease associations. Specifically, Levenshtein distance matrices covering the sequence, expression and functional miRNA similarities were introduced to enhance the previous Gaussian interaction profile kernel-based similarity matrix. LE-MDCAP integrated miRNA similarity matrices, disease semantic similarity matrix and known causal miRNA-disease associations to make predictions. For the regular causal vs. non-disease association discrimination task, LE-MDCAP achieved area under the receiver operating characteristic curve (AUROC) of 0.911 and 0.906 in 10-fold cross-validation and independent test, respectively. More importantly, LE-MDCAP prominently outperformed the previous MDCAP model in distinguishing causal versus non-causal miRNA-disease associations (AUROC 0.820 vs. 0.695). Case studies performed on diabetic retinopathy and hsa-mir-361 also validated the accuracy of our model. In summary, LE-MDCAP could be useful for screening causal miRNA-disease associations from general miRNA-disease associations.
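For readers unfamiliar with the Levenshtein distance underlying LE-MDCAP, the standard dynamic-programming computation can be sketched as follows. This shows only the textbook edit distance between two strings; how LE-MDCAP encodes expression and functional profiles before applying the distance is not stated in the abstract, and the example sequences below are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        cur = [i]
        for j, char_b in enumerate(b, start=1):
            cur.append(
                min(
                    prev[j] + 1,  # delete char_a
                    cur[j - 1] + 1,  # insert char_b
                    prev[j - 1] + (char_a != char_b),  # substitute
                )
            )
        prev = cur
    return prev[-1]


# Two hypothetical RNA fragments that differ by one trailing base.
distance = levenshtein("UGAGGUAG", "UGAGGUAGU")
```

Keeping only the previous row reduces memory to O(len(b)), which is convenient when a full pairwise distance matrix over many sequences is built.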
|
19
|
Zhou F, Yin MM, Jiao CN, Cui Z, Zhao JX, Liu JX. Bipartite graph-based collaborative matrix factorization method for predicting miRNA-disease associations. BMC Bioinformatics 2021; 22:573. [PMID: 34837953 PMCID: PMC8627000 DOI: 10.1186/s12859-021-04486-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/27/2020] [Accepted: 11/17/2021] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND: With the rapid development of various advanced biotechnologies, researchers in related fields have realized that microRNAs (miRNAs) play critical roles in many serious human diseases. However, experimental identification of new miRNA-disease associations (MDAs) is expensive and time-consuming. Practitioners have shown growing interest in methods for predicting potential MDAs. In recent years, an increasing number of computational methods for predicting novel MDAs have been developed, making a huge contribution to the research of human diseases and saving considerable time. In this paper, we proposed an efficient computational method, named bipartite graph-based collaborative matrix factorization (BGCMF), which is highly advantageous for predicting novel MDAs. RESULTS: By combining two improved recommendation methods, a new model for predicting MDAs is generated. Based on the idea that some new miRNAs and diseases do not have any associations, we adopt the bipartite graph based on the collaborative matrix factorization method to complete the prediction. The BGCMF achieves a desirable result, with AUC of up to 0.9514 ± 0.0007 in the five-fold cross-validation experiments. CONCLUSIONS: Five-fold cross-validation is used to evaluate the capabilities of our method. Simulation experiments are implemented to predict new MDAs. More importantly, the AUC value of our method is higher than those of some state-of-the-art methods. Finally, many associations between new miRNAs and new diseases are successfully predicted by performing simulation experiments, indicating that BGCMF is a useful method to predict more potential miRNAs with roles in various diseases.
Affiliation(s)
- Feng Zhou
- The School of Computer Science, Qufu Normal University, Rizhao, 276826, China
| | - Meng-Meng Yin
- The School of Computer Science, Qufu Normal University, Rizhao, 276826, China
| | - Cui-Na Jiao
- The School of Computer Science, Qufu Normal University, Rizhao, 276826, China
| | - Zhen Cui
- The School of Computer Science, Qufu Normal University, Rizhao, 276826, China
| | - Jing-Xiu Zhao
- The School of Computer Science, Qufu Normal University, Rizhao, 276826, China
| | - Jin-Xing Liu
- The School of Computer Science, Qufu Normal University, Rizhao, 276826, China.
|
20
|
Wu Y, Zhu D, Wang X, Zhang S. An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data. Comput Biol Chem 2021; 95:107566. [PMID: 34534906 DOI: 10.1016/j.compbiolchem.2021.107566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/08/2021] [Revised: 08/13/2021] [Accepted: 08/18/2021] [Indexed: 11/17/2022]
Abstract
To explore the pathogenic mechanisms of microRNA (miRNA) in diverse diseases, many researchers have concentrated on discovering the potential associations between miRNA and disease using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Different from traditional miRNA-disease prediction models using randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps, i.e., a novel semi-supervised Kmeans (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs and a subagging method to generate diverse training sample sets to make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
Affiliation(s)
- Yao Wu
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
| | - Donghua Zhu
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
| | - Xuefeng Wang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China.
| | - Shuo Zhang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
|
21
|
Ji C, Wang Y, Ni J, Zheng C, Su Y. Predicting miRNA-Disease Associations Based on Heterogeneous Graph Attention Networks. Front Genet 2021; 12:727744. [PMID: 34512733 PMCID: PMC8424198 DOI: 10.3389/fgene.2021.727744] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/19/2021] [Accepted: 08/02/2021] [Indexed: 11/23/2022] Open
Abstract
In recent years, more and more evidence has shown that microRNAs (miRNAs) play an important role in the regulation of post-transcriptional gene expression, and are closely related to human diseases. Many studies have also revealed that miRNAs can serve as promising biomarkers for the potential diagnosis and treatment of human diseases. The interactions between miRNA and human disease have rarely been demonstrated, and the underlying mechanism of miRNA is not clear. Therefore, computational approaches have attracted the attention of researchers, as they can not only save time and money, but also improve the efficiency and accuracy of biological experiments. In this work, we proposed a Heterogeneous Graph Attention Networks (GAT) based method for miRNA-disease association prediction, named HGATMDA. We constructed a heterogeneous graph for miRNAs and diseases, and introduced weighted DeepWalk and GAT methods to extract features of miRNAs and diseases from the graph. Moreover, a fully connected neural network is used to predict correlation scores between miRNA-disease pairs. Experimental results under five-fold cross validation (five-fold CV) showed that HGATMDA achieved better prediction performance than other state-of-the-art methods. In addition, we performed three case studies on breast neoplasms, lung neoplasms and kidney neoplasms. The results showed that for the three diseases mentioned above, 50 out of top 50 candidates were confirmed by the validation datasets. Therefore, HGATMDA is suitable as an effective tool to identify potential disease-related miRNAs.
Affiliation(s)
- Cunmei Ji
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
| | - Yutian Wang
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
| | - Jiancheng Ni
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
| | - Chunhou Zheng
- School of Artificial Intelligence, Anhui University, Hefei, China
| | - Yansen Su
- School of Artificial Intelligence, Anhui University, Hefei, China
|
22
|
Yu DL, Yu ZG, Han GS, Li J, Anh V. Heterogeneous Types of miRNA-Disease Associations Stratified by Multi-Layer Network Embedding and Prediction. Biomedicines 2021; 9:biomedicines9091152. [PMID: 34572337 PMCID: PMC8465678 DOI: 10.3390/biomedicines9091152] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/29/2021] [Revised: 08/15/2021] [Accepted: 08/30/2021] [Indexed: 12/02/2022] Open
Abstract
Abnormal miRNA functions are widely involved in many diseases recorded in the database of experimentally supported human miRNA-disease associations (HMDD). Some of the associations are complicated: There can be up to five heterogeneous association types of miRNA with the same disease, including genetics type, epigenetics type, circulating miRNAs type, miRNA tissue expression type and miRNA-target interaction type. When one type of association is known for an miRNA-disease pair, it is important to predict any other types of the association for a better understanding of the disease mechanism. It is even more important to reveal associations for currently unassociated miRNAs and diseases. Methods have been recently proposed to make predictions on the association types of miRNA-disease pairs through restricted Boltzmann machines, label propagation theories and tensor completion algorithms. None of them has exploited the non-linear characteristics in the miRNA-disease association network to improve the performance. We propose to use attributed multi-layer heterogeneous network embedding to learn the latent representations of miRNAs and diseases from each association type and then to predict the existence of the association type for all the miRNA-disease pairs. The performance of our method is compared with two newest methods via 10-fold cross-validation on the database HMDD v3.2 to demonstrate the superior prediction achieved by our method under different settings. Moreover, our real predictions made beyond the HMDD database can all be validated by NCBI literature, confirming that our method is capable of accurately predicting new associations of miRNAs with diseases and their association types as well.
Affiliation(s)
- Dong-Ling Yu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China; (D.-L.Y.); (G.-S.H.)
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan 411105, China
| | - Zu-Guo Yu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China; (D.-L.Y.); (G.-S.H.)
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan 411105, China
- Correspondence: (Z.-G.Y.); (J.L.)
| | - Guo-Sheng Han
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China; (D.-L.Y.); (G.-S.H.)
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan 411105, China
| | - Jinyan Li
- Data Science Institute, University of Technology Sydney, Broadway, NSW 2007, Australia
- Correspondence: (Z.-G.Y.); (J.L.)
| | - Vo Anh
- Faculty of Science, Engineering and Technology, Swinburne University of Technology, Hawthorn, VIC 3122, Australia;
|
23
|
Peng W, Du J, Dai W, Lan W. Predicting miRNA-Disease Association Based on Modularity Preserving Heterogeneous Network Embedding. Front Cell Dev Biol 2021; 9:603758. [PMID: 34178973 PMCID: PMC8223753 DOI: 10.3389/fcell.2021.603758] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/07/2020] [Accepted: 03/23/2021] [Indexed: 12/12/2022] Open
Abstract
MicroRNAs (miRNAs) are a category of small non-coding RNAs that profoundly impact various biological processes related to human disease. Inferring the potential miRNA-disease associations benefits the study of human diseases, such as disease prevention, disease diagnosis, and drug development. In this work, we propose a novel heterogeneous network embedding-based method called MDN-NMTF (Module-based Dynamic Neighborhood Non-negative Matrix Tri-Factorization) for predicting miRNA-disease associations. MDN-NMTF constructs a heterogeneous network of disease similarity network, miRNA similarity network and a known miRNA-disease association network. After that, it learns the latent vector representation for miRNAs and diseases in the heterogeneous network. Finally, the association probability is computed by the product of the latent miRNA and disease vectors. MDN-NMTF not only successfully integrates diverse biological information of miRNAs and diseases to predict miRNA-disease associations, but also considers the module properties of miRNAs and diseases in the course of learning vector representation, which can maximally preserve the heterogeneous network structural information and the network properties. At the same time, we also extend MDN-NMTF to a new version (called MDN-NMTF2) by using modular information to improve the miRNA-disease association prediction ability. Our methods and the other four existing methods are applied to predict miRNA-disease associations in four databases. The prediction results show that our methods can improve the miRNA-disease association prediction to a high level compared with the four existing methods.
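The final scoring step described in this abstract, in which an association score is the product of a latent miRNA vector and a latent disease vector, can be sketched as a plain inner product. This is an illustration only: the embeddings below are made-up numbers, the function name `association_scores` is hypothetical, and the abstract does not say whether MDN-NMTF inserts an additional core matrix between the two factors, as tri-factorization models often do.

```python
def association_scores(mirna_vecs, disease_vecs):
    """Score every miRNA-disease pair by the inner product of their
    latent vectors; returns a miRNA x disease matrix."""
    return [
        [sum(m * d for m, d in zip(m_vec, d_vec)) for d_vec in disease_vecs]
        for m_vec in mirna_vecs
    ]


# Hypothetical two-dimensional embeddings for two miRNAs and two diseases.
mirna_vecs = [[1.0, 0.0], [0.5, 0.5]]
disease_vecs = [[0.2, 0.8], [1.0, 0.0]]
scores = association_scores(mirna_vecs, disease_vecs)
```

Candidate associations would then be ranked per disease by score, with the known pairs held out during cross-validation.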
Affiliation(s)
- Wei Peng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China.,Computer Technology Application Key Laboratory of Yunnan Province, Kunming University of Science and Technology, Kunming, China
| | - Jielin Du
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
| | - Wei Dai
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China.,Computer Technology Application Key Laboratory of Yunnan Province, Kunming University of Science and Technology, Kunming, China
| | - Wei Lan
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, China
|
24
|
Li Z, Jiang K, Qin S, Zhong Y, Elofsson A. GCSENet: A GCN, CNN and SENet ensemble model for microRNA-disease association prediction. PLoS Comput Biol 2021; 17:e1009048. [PMID: 34081706 PMCID: PMC8205154 DOI: 10.1371/journal.pcbi.1009048] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/22/2020] [Revised: 06/15/2021] [Accepted: 05/06/2021] [Indexed: 12/21/2022] Open
Abstract
Recently, an increasing number of studies have demonstrated that miRNAs are involved in human diseases, indicating that miRNAs might be a potential pathogenic factor for various diseases. Therefore, figuring out the relationship between miRNAs and diseases plays a critical role in not only the development of new drugs, but also the formulation of individualized diagnosis and treatment. As the prediction of miRNA-disease association via biological experiments is expensive and time-consuming, computational methods have a positive effect on revealing the association. In this study, a novel prediction model integrating GCN, CNN and Squeeze-and-Excitation Networks (GCSENet) was constructed for the identification of miRNA-disease association. The model first captured features by GCN based on a heterogeneous graph including diseases, genes and miRNAs. Then, considering the different effects of genes on each type of miRNA and disease, as well as the different effects of the miRNA-gene and disease-gene relationships on miRNA-disease association, a feature weight was set and a combination of miRNA-gene and disease-gene associations was added as feature input for the convolution operation in CNN. Furthermore, the squeeze and excitation blocks of SENet were applied to determine the importance of each feature channel and enhance useful features by means of the attention mechanism, thus achieving a satisfactory prediction of miRNA-disease association. The proposed method was compared against other state-of-the-art methods. It achieved an AUROC score of 95.02% and an AUPR score of 95.55% in a 10-fold cross-validation, which led to the finding that the proposed method is superior to these popular methods on most of the performance evaluation indexes. Identifying miRNA-disease associations accelerates the understanding towards pathogenicity, which is beneficial for the development of treatment tools for diseases. 
Different from existing methods, our GCSENet captures the deep relationship between miRNA and disease through three heterogeneous graphs (disease, gene and miRNA) to promote an accurate prediction result. We performed 10-fold cross validation to evaluate the performance of GCSENet, which can outperform many classic methods. Furthermore, we carried out case studies on four important diseases to evaluate the performance of our model against associations with experimental evidence in the literature. The results show that most predicted miRNAs (48 for lung neoplasms, 48 for heart failure, 48 for breast cancer and 50 for glioblastoma) in the top 50 predictions were confirmed in HMDD v3.0. This indicates that GCSENet can make reliable predictions and guide experiments to uncover more miRNA-disease associations.
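The squeeze-and-excitation step that GCSENet borrows from SENet can be sketched in a few lines: each feature channel is summarised by its mean (squeeze), passed through a small bottleneck network ending in a sigmoid (excitation), and then rescaled by the resulting gate. The version below is a simplified sketch with bias terms omitted; the function name `se_reweight`, the weight matrices `w1` and `w2`, and the toy inputs are all hypothetical and not taken from GCSENet.

```python
import math


def se_reweight(channels, w1, w2):
    """Apply a squeeze-and-excitation gate to a list of feature channels.

    channels: C feature maps, each a flat list of floats.
    w1: r x C reduction weights; w2: C x r expansion weights
    (hypothetical values; biases are omitted for brevity).
    Returns the rescaled channels and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in channels]
    # Excitation: bottleneck layer with ReLU ...
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    # ... followed by a sigmoid that yields a gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, channels)]
    return scaled, gates


scaled, gates = se_reweight(
    channels=[[1.0, 3.0], [0.0, 0.0]],
    w1=[[1.0, 1.0]],
    w2=[[1.0], [-1.0]],
)
```

In a trained network the weights are learned, so that informative channels receive gates close to 1 and uninformative ones are suppressed.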
Affiliation(s)
- Zhong Li
- Department of Mathematical Sciences, School of Science, Zhejiang Sci-Tech University, Hangzhou, China
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Solna, Sweden
| | - Kaiyancheng Jiang
- Department of Mathematical Sciences, School of Science, Zhejiang Sci-Tech University, Hangzhou, China
| | - Shengwei Qin
- Department of Mathematical Sciences, School of Science, Zhejiang Sci-Tech University, Hangzhou, China
| | - Yijun Zhong
- Department of Mathematical Sciences, School of Science, Zhejiang Sci-Tech University, Hangzhou, China
| | - Arne Elofsson
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Solna, Sweden
| |
Collapse
|
25
|
Mortazavi SS, Bahmanpour Z, Daneshmandpour Y, Roudbari F, Sheervalilou R, Kazeminasab S, Emamalizadeh B. An updated overview and classification of bioinformatics tools for MicroRNA analysis, which one to choose? Comput Biol Med 2021; 134:104544. [PMID: 34119921 DOI: 10.1016/j.compbiomed.2021.104544] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/17/2021] [Revised: 05/30/2021] [Accepted: 05/30/2021] [Indexed: 12/16/2022]
Abstract
The term 'MicroRNA' (miRNA) refers to a class of small endogenous non-coding RNAs (ncRNAs) generated from hairpin transcripts. Recent studies reveal miRNAs' regulatory involvement in essential biological processes through translational repression or mRNA degradation. A growing body of literature focuses on the importance of miRNAs and their functions. In this respect, several databases have been developed to manage the dispersed data produced. Therefore, it is necessary to know the parameters and characteristics of each database to benefit from their data. Besides, selecting the correct database is of great importance to scientists who do not have enough experience in this field. A comprehensive classification, along with an explanation of the information contained in each database, facilitates access to these resources. In this regard, we have classified relevant databases into several categories, including miRNA sequencing and annotation, validated/predicted miRNA targets, disease-related miRNA, SNP in miRNA sequence or target site, miRNA-related pathways or gene ontology, and mRNA-miRNA interactions. Hence, this review introduces available miRNA databases and presents a convenient overview to help researchers of different backgrounds find suitable miRNA-related bioinformatics web tools and relevant information rapidly.
Affiliation(s)
- Zahra Bahmanpour
- Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Yousef Daneshmandpour
- Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Somayeh Kazeminasab
- Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran; Research Vice-Chancellor, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Babak Emamalizadeh
- Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
|
26
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, the experimental methods are costly and time-consuming. Thus, it is urgent to develop computational methods for the prediction of MDAs. Based on graph theory, MDA prediction is treated as a node classification task in the present study. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve the training efficiency and accuracy. This method models both the potential connections of the feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, we considered six tasks simultaneously on the MDA prediction problem for the first time, which ensures that, under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method achieved satisfactory performance on all six tasks and is significantly superior to the classic machine learning methods and the state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies on two diseases and three miRNAs also achieved satisfactory performance.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Xuhong Wang
- School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
- Qiuying Dai
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, China
- Dennis Russell Salahub
- Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
|
27
|
Ji C, Gao Z, Ma X, Wu Q, Ni J, Zheng C. AEMDA: inferring miRNA-disease associations based on deep autoencoder. Bioinformatics 2021; 37:66-72. [PMID: 32726399 DOI: 10.1093/bioinformatics/btaa670] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 03/22/2020] [Revised: 05/27/2020] [Accepted: 07/20/2020] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in various biological processes. Many studies have shown that miRNAs are closely related to the occurrence, development and diagnosis of human diseases. Traditional biological experiments are costly and time consuming. As a result, effective computational models have become increasingly popular for predicting associations between miRNAs and diseases, which could effectively boost human disease diagnosis and prevention. RESULTS We propose a novel computational framework, called AEMDA, to identify associations between miRNAs and diseases. AEMDA applies a learning-based method to extract dense and high-dimensional representations of diseases and miRNAs from integrated disease semantic similarity, miRNA functional similarity and heterogeneous related interaction data. In addition, AEMDA adopts a deep autoencoder that does not need negative samples to retrieve the underlying associations between miRNAs and diseases. Furthermore, the reconstruction error is used as a measurement to predict disease-associated miRNAs. Our experimental results indicate that AEMDA can effectively predict disease-related miRNAs and outperforms state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION The source code and data are available at https://github.com/CunmeiJi/AEMDA. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cunmei Ji
- School of Software, Qufu Normal University, Qufu 273165, China
- Zhen Gao
- School of Software, Qufu Normal University, Qufu 273165, China
- Xu Ma
- School of Software, Qufu Normal University, Qufu 273165, China
- Qingwen Wu
- School of Software, Qufu Normal University, Qufu 273165, China
- Jiancheng Ni
- School of Software, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng
- School of Software, Qufu Normal University, Qufu 273165, China; School of Computer Science and Technology, Anhui University, Hefei 230601, China
|
28
|
Wang J, Li J, Yue K, Wang L, Ma Y, Li Q. NMCMDA: neural multicategory MiRNA-disease association prediction. Brief Bioinform 2021; 22:6189772. [PMID: 33778850 DOI: 10.1093/bib/bbab074] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/28/2020] [Revised: 02/05/2021] [Indexed: 01/20/2023] Open
Abstract
MOTIVATION There is growing evidence that the dysregulation of miRNAs causes diseases through various underlying mechanisms. Thus, predicting the multiple-category associations between microRNAs (miRNAs) and diseases plays an important role in investigating the roles of miRNAs in diseases. Moreover, in contrast with traditional biological experiments, which are time-consuming and expensive, computational approaches for the prediction of multicategory miRNA-disease associations are time-saving and cost-effective, and are therefore highly desirable. RESULTS We present a novel data-driven end-to-end learning-based method of neural multiple-category miRNA-disease association prediction (NMCMDA) for predicting multiple-category miRNA-disease associations. The NMCMDA has two main components: (i) an encoder that operates directly on the miRNA-disease heterogeneous network and leverages a Graph Neural Network to learn miRNA and disease latent representations, and (ii) a decoder that yields miRNA-disease association scores with the learned latent representations as input. Various kinds of encoders and decoders are proposed for NMCMDA. Finally, the NMCMDA with the encoder of Relational Graph Convolutional Network and the neural multirelational decoder (NMR-RGCN) achieves the best prediction performance. We compared the NMCMDA with other baselines on three experimental datasets. The experimental results show that the NMR-RGCN is significantly superior to the state-of-the-art method TDRC in terms of Top-1 precision, Top-1 recall, and Top-1 F1. Additionally, case studies are provided for two high-risk human diseases (namely, breast cancer and lung cancer) and we also provide the prediction and validation of top-10 miRNA-disease-category associations based on all known data of HMDD v3.2, which further validate the effectiveness and feasibility of the proposed method.
Affiliation(s)
- Jin Li
- School of Software, Yunnan University, China
- Kun Yue
- School of Information, Yunnan University, China
- Qing Li
- Kunming Medical University, China
|
29
|
Tang Q, Kang J, Yuan J, Tang H, Li X, Lin H, Huang J, Chen W. DNA4mC-LIP: a linear integration method to identify N4-methylcytosine site in multiple species. Bioinformatics 2020; 36:3327-3335. [PMID: 32108866 DOI: 10.1093/bioinformatics/btaa143] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 01/03/2020] [Revised: 02/12/2020] [Accepted: 02/25/2020] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION DNA N4-methylcytosine (4mC) is a crucial epigenetic modification. However, knowledge about its biological functions is limited. Effective and accurate identification of 4mC sites will help to reveal its biological functions and mechanisms. Since experimental methods are costly and ineffective, a number of machine learning-based approaches have been proposed to detect 4mC sites. Although these methods yielded acceptable accuracy, there is still room to improve the prediction performance and the stability of existing methods in practical applications. RESULTS In this work, we first systematically assessed the existing methods on an independent dataset. We then proposed DNA4mC-LIP, a linear integration method that combines existing predictors to identify 4mC sites in multiple species. The results obtained on the independent dataset demonstrated that DNA4mC-LIP outperformed existing methods for identifying 4mC sites. To serve the scientific community, a web server for DNA4mC-LIP was developed. We anticipate that DNA4mC-LIP could serve as a powerful computational technique for identifying 4mC sites and facilitate the interpretation of 4mC mechanisms. AVAILABILITY AND IMPLEMENTATION http://i.uestc.edu.cn/DNA4mC-LIP/. CONTACT hlin@uestc.edu.cn or hj@uestc.edu.cn or chenweiimu@gmail.com. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qiang Tang
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Juanjuan Kang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jiaqing Yuan
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Hua Tang
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Xianhai Li
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jian Huang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Chen
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China; Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063000, China
|
30
|
Shi M, Sheng Z, Tang H. Prognostic outcome prediction by semi-supervised least squares classification. Brief Bioinform 2020; 22:5935498. [PMID: 33094318 DOI: 10.1093/bib/bbaa249] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/15/2020] [Revised: 09/04/2020] [Accepted: 09/04/2020] [Indexed: 11/13/2022] Open
Abstract
Although great progress has been made in prognostic outcome prediction, small sample size remains a challenge in obtaining accurate and robust classifiers. We proposed the Rescaled linear square Regression based Least Squares Learning (RRLSL), a jointly developed semi-supervised feature selection and classifier, for predicting prognostic outcome of cancer patients. RRLSL used least square regression to identify the scale factors and then rank the features in the available multiple types of molecular data. We applied the unlabeled multiple molecular data in conjunction with the labeled data to develop a similarity graph. RRLSL produced the constraint with kernel functions to bridge the gap between label information and geometry information from messenger RNA and microRNA expression profiling. Importantly, this semi-supervised model used least squares learning with L2 regularization to develop a semi-supervised classifier. RRLSL improved prognostic outcome prediction and successfully discriminated between recurrent and non-recurrent patients. We also demonstrated that RRLSL improved the accuracy and Area Under the Precision Recall Curve (AUPRC) as compared to the baseline semi-supervised methods. RRLSL is available as a stand-alone software package (https://github.com/ShiMGLab/RRLSL).
Affiliation(s)
- Mingguang Shi
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, Anhui, 230009 China
- Zhou Sheng
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, Anhui, 230009 China
- Hao Tang
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, Anhui, 230009 China
|
31
|
Li Z, Li J, Nie R, You ZH, Bao W. A graph auto-encoder model for miRNA-disease associations prediction. Brief Bioinform 2020; 22:5929824. [PMID: 34293850 DOI: 10.1093/bib/bbaa240] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 08/05/2020] [Revised: 08/26/2020] [Accepted: 08/27/2020] [Indexed: 02/06/2023] Open
Abstract
Emerging evidence indicates that the abnormal expression of miRNAs is involved in the evolution and progression of various complex human diseases. Identifying disease-related miRNAs as new biomarkers can promote the development of disease pathology and clinical medicine. However, designing biological experiments to validate disease-related miRNAs is usually time-consuming and expensive. Therefore, it is urgent to design effective computational methods for predicting potential miRNA-disease associations. Inspired by the great progress of graph neural networks in link prediction, we propose a novel graph auto-encoder model, named GAEMDA, to identify potential miRNA-disease associations in an end-to-end manner. More specifically, the GAEMDA model applies a graph neural network-based encoder, which contains an aggregator function and a multi-layer perceptron for aggregating nodes' neighborhood information, to generate low-dimensional embeddings of miRNA and disease nodes and realize the effective fusion of heterogeneous information. Then, the embeddings of miRNA and disease nodes are fed into a bilinear decoder to identify the potential links between miRNA and disease nodes. The experimental results indicate that GAEMDA achieves an average area under the curve of $93.56\pm 0.44\%$ under 5-fold cross-validation. Besides, we further carried out case studies on colon neoplasms, esophageal neoplasms and kidney neoplasms. As a result, 48 of the top 50 predicted miRNAs associated with these diseases are confirmed by the database of differentially expressed miRNAs in human cancers and the microRNA deregulation in human disease database, respectively. The satisfactory prediction performance suggests that the GAEMDA model could serve as a reliable tool to guide subsequent research on the regulatory role of miRNAs. Besides, the source codes are available at https://github.com/chimianbuhetang/GAEMDA.
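To illustrate the bilinear decoding step mentioned in the abstract above, the following minimal Python sketch scores one miRNA-disease pair as sigmoid(m^T W d). The embeddings, the weight matrix `W`, and the function name `bilinear_score` are hypothetical placeholders and are not taken from the GAEMDA source code; in a trained model the embeddings would come from the encoder and `W` would be learned.

```python
import math


def bilinear_score(m, W, d):
    """Return sigmoid(m^T W d) for embeddings m, d and weight matrix W."""
    # Compute the row vector m^T W, then take its dot product with d.
    mW = [sum(m[i] * W[i][j] for i in range(len(m))) for j in range(len(d))]
    logit = sum(mW[j] * d[j] for j in range(len(d)))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 2-dimensional embeddings and an identity weight matrix
# (illustrative values only, not learned parameters).
mirna = [1.0, 0.0]
disease = [0.5, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]

score = bilinear_score(mirna, W, disease)  # sigmoid of the logit 0.5
```

A score close to 1 would be read as a likely association; with an all-zero `W` every pair scores exactly 0.5, which is why the weight matrix has to be trained.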
Affiliation(s)
- Zhengwei Li
- Engineering Research Center of Mine Digitalization of Ministry of Education and School of Computer Science and Technology, China University of Mining and Technology
- Jiashu Li
- School of Computer Science and Technology, China University of Mining and Technology
- Ru Nie
- School of Computer Science and Technology, China University of Mining and Technology
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science
- Wenzheng Bao
- School of Information Engineering, Xuzhou University of Technology
|
32
|
Huang F, Yue X, Xiong Z, Yu Z, Liu S, Zhang W. Tensor decomposition with relational constraints for predicting multiple types of microRNA-disease associations. Brief Bioinform 2020; 22:5876601. [PMID: 32725161 DOI: 10.1093/bib/bbaa140] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 04/09/2020] [Revised: 05/27/2020] [Accepted: 06/06/2020] [Indexed: 01/02/2023] Open
Abstract
MicroRNAs (miRNAs) play crucial roles in multifarious biological processes associated with human diseases. Identifying potential miRNA-disease associations contributes to understanding the molecular mechanisms of miRNA-related diseases. Most of the existing computational methods mainly focus on predicting whether a miRNA-disease association exists or not. However, the roles of miRNAs in diseases are highly diverse. For instance, genetic variants of a miRNA (mir-15) may affect the expression level of miRNAs, leading to B cell chronic lymphocytic leukemia, while circulating miRNAs (including mir-1246, mir-1307-3p, etc.) have the potential to detect breast cancer at an early stage. In this paper, we aim to predict multi-type miRNA-disease associations instead of taking them as binary. To this end, we innovatively represent miRNA-disease-type triples as a tensor and introduce tensor decomposition methods to solve the prediction task. Experimental results on two widely adopted miRNA-disease datasets, HMDD v2.0 and HMDD v3.2, show that tensor decomposition methods improve on a recent baseline by a large margin (up to $38\%$ in Top-1 F1). We then propose a novel method, Tensor Decomposition with Relational Constraints (TDRC), which incorporates biological features as relational constraints to extend the existing tensor decomposition methods. Compared with two existing tensor decomposition methods, TDRC can produce better performance while being more efficient.
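The idea of representing miRNA-disease-type triples as a tensor can be sketched with a CP (CANDECOMP/PARAFAC) style reconstruction, in which entry (i, j, k) is the sum over rank components of the product of three factor values. This is only an assumed, generic form of tensor decomposition: the abstract does not state which decomposition TDRC builds on, the relational constraints are omitted, and all factor values and function names below are hypothetical.

```python
def cp_score(mirna_factors, disease_factors, type_factors, i, j, k):
    """Reconstruct tensor entry (i, j, k) from rank-R CP factor matrices."""
    return sum(
        a * b * c
        for a, b, c in zip(mirna_factors[i], disease_factors[j], type_factors[k])
    )


def predict_type(mirna_factors, disease_factors, type_factors, i, j):
    """Return the index of the association type with the highest score."""
    scores = [
        cp_score(mirna_factors, disease_factors, type_factors, i, j, k)
        for k in range(len(type_factors))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical rank-2 factors: 2 miRNAs, 1 disease, 2 association types.
A = [[1.0, 0.0], [0.0, 1.0]]  # miRNA factors
B = [[1.0, 1.0]]              # disease factors
C = [[1.0, 0.0], [0.0, 1.0]]  # association-type factors

best_type_for_mirna0 = predict_type(A, B, C, 0, 0)
best_type_for_mirna1 = predict_type(A, B, C, 1, 0)
```

Ranking the types by reconstructed score for a fixed miRNA-disease pair is one way a Top-1 prediction, as evaluated in the abstract, could be read off the factors.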
Affiliation(s)
- Feng Huang
- College of Informatics, Huazhong Agricultural University
- Xiang Yue
- Department of Computer Science & Engineering, The Ohio State University
- Zhankun Xiong
- College of Informatics, Huazhong Agricultural University
- Zhouxin Yu
- College of Informatics, Huazhong Agricultural University
- Shichao Liu
- College of Informatics, Huazhong Agricultural University
|
33
|
Le DH, Tran TTH. RWRMTN: a tool for predicting disease-associated microRNAs based on a microRNA-target gene network. BMC Bioinformatics 2020; 21:244. [PMID: 32539680 PMCID: PMC7296691 DOI: 10.1186/s12859-020-03578-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/04/2019] [Accepted: 06/01/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The misregulation of microRNA (miRNA) has been shown to cause diseases. Recently, we have proposed a computational method based on a random walk framework on a miRNA-target gene network to predict disease-associated miRNAs. The prediction performance of our method is better than that of some existing state-of-the-art network- and machine learning-based methods since it exploits the mutual regulation between miRNAs and their target genes in the miRNA-target gene interaction networks. RESULTS To facilitate the use of this method, we have developed a Cytoscape app, named RWRMTN, to predict disease-associated miRNAs. RWRMTN can work on any miRNA-target gene network. Highly ranked miRNAs are supported with evidence from the literature. They then can also be visualized based on the rankings and in relationships with the query disease and their target genes. In addition, automation functions are also integrated, which allow RWRMTN to be used in workflows from external environments. We demonstrate the ability of RWRMTN in predicting breast and lung cancer-associated miRNAs via workflows in Cytoscape and other environments. CONCLUSIONS Considering a few computational methods have been developed as software tools for convenient uses, RWRMTN is among the first GUI-based tools for the prediction of disease-associated miRNAs which can be used in workflows in different environments.
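The random walk framework referred to in the abstract above can be sketched as a random walk with restart, assuming that is what "RWR" in the tool name denotes. The iteration is p_{t+1} = (1 - r) W p_t + r p_0, where W is the column-normalized adjacency matrix and p_0 puts its mass on the seed nodes. The toy graph, the restart probability, and the function name are illustrative assumptions; the actual network construction and parameters of RWRMTN are not given here.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol."""
    n = len(adj)
    # Column-normalize the adjacency matrix so each non-empty column sums to 1.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # The restart vector spreads probability uniformly over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Toy undirected path graph 0 - 1 - 2 with node 0 as the only seed.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
ranking_scores = random_walk_with_restart(adjacency, seeds=[0])
```

Nodes closer to the seed receive higher steady-state probability, which is the property a ranking of candidate miRNAs would rely on.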
Affiliation(s)
- Duc-Hau Le
- Department of Computational Biomedicine, Vingroup Big Data Institute, No 7, Bang Lang 1 Street, Viet Hung Ward, Long Bien District, Hanoi, Vietnam.
- Trang T H Tran
- Department of Computational Biomedicine, Vingroup Big Data Institute, No 7, Bang Lang 1 Street, Viet Hung Ward, Long Bien District, Hanoi, Vietnam
|
34
|
Huang Z, Liu L, Gao Y, Shi J, Cui Q, Li J, Zhou Y. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol 2019; 20:202. [PMID: 31594544 PMCID: PMC6781296 DOI: 10.1186/s13059-019-1811-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/17/2019] [Accepted: 09/03/2019] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. RESULTS Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform systematic comparison among 36 readily available prediction methods. Their overall performances are evaluated with rigorous precision-recall curve analysis, where 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential of performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in up to 16% and 46% of AUPRC augmentations compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods, which is that the prediction results are severely biased toward well-annotated diseases with many associated miRNAs known and cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. CONCLUSION Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest the future directions for the development of more robust miRNA-disease association predictors.
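The AUPRC figures quoted in the abstract above summarize a precision-recall curve. One common estimator is average precision: the mean of the precision values at the ranks where a true association appears. The sketch below is a plain-Python illustration under that assumption; the benchmark's exact procedure is not stated here, ties between scores are not handled specially, and the labels and scores are made up.

```python
def average_precision(labels, scores):
    """Average precision: mean precision at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision among the top `rank` predictions
    return ap / total_pos


# Made-up example: 1 marks a known association, 0 an unlabeled pair.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.1]
ap_value = average_precision(example_labels, example_scores)
```

Because the baseline of this metric equals the fraction of positives, values such as 0.2 or 0.3 can still indicate useful ranking when true associations are rare among the candidate pairs.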
Affiliation(s)
- Zhou Huang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Leibo Liu
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Yuanxu Gao
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Jiangcheng Shi
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
|