1
|
Chen M, Deng Y, Li Z, Ye Y, Zeng L, He Z, Peng G. SCPLPA: An miRNA-disease association prediction model based on spatial consistency projection and label propagation algorithm. J Cell Mol Med 2024; 28:e18345. [PMID: 38693850 PMCID: PMC11063733 DOI: 10.1111/jcmm.18345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2023] [Revised: 04/01/2024] [Accepted: 04/08/2024] [Indexed: 05/03/2024] Open
Abstract
Identifying the association between miRNA and diseases is helpful for disease prevention, diagnosis and treatment. It is of great significance to use computational methods to predict potential human miRNA disease associations. Considering the shortcomings of existing computational methods, such as low prediction accuracy and weak generalization, we propose a new method called SCPLPA to predict miRNA-disease associations. First, a heterogeneous disease similarity network was constructed using the disease semantic similarity network and the disease Gaussian interaction spectrum kernel similarity network, while a heterogeneous miRNA similarity network was constructed using the miRNA functional similarity network and the miRNA Gaussian interaction spectrum kernel similarity network. Then, the estimated miRNA-disease association scores were evaluated by integrating the outcomes obtained by implementing label propagation algorithms in the heterogeneous disease similarity network and the heterogeneous miRNA similarity network. Finally, the spatial consistency projection algorithm of the network was used to extract miRNA disease association features to predict unverified associations between miRNA and diseases. SCPLPA was compared with four classical methods (MDHGI, NSEMDA, RFMDA and SNMFMDA), and the results of multiple evaluation metrics showed that SCPLPA exhibited the most outstanding predictive performance. Case studies have shown that SCPLPA can effectively identify miRNAs associated with colon neoplasms and kidney neoplasms. In summary, our proposed SCPLPA algorithm is easy to implement and can effectively predict miRNA disease associations, making it a reliable auxiliary tool for biomedical research.
Collapse
Affiliation(s)
- Min Chen
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| | - Yingwei Deng
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| | - Zejun Li
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| | - Yifan Ye
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| | - Lijun Zeng
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| | - Ziyi He
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| | - Guofang Peng
- Hunan Institute of TechnologySchool of Computer Science and EngineeringHengyang 421002China
| |
Collapse
|
2
|
Guo Y, Zhou D, Ruan X, Cao J. Variational gated autoencoder-based feature extraction model for inferring disease-miRNA associations based on multiview features. Neural Netw 2023; 165:491-505. [PMID: 37336034 DOI: 10.1016/j.neunet.2023.05.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2022] [Revised: 05/19/2023] [Accepted: 05/28/2023] [Indexed: 06/21/2023]
Abstract
MicroRNAs (miRNA) play critical roles in diverse biological processes of diseases. Inferring potential disease-miRNA associations enable us to better understand the development and diagnosis of complex human diseases via computational algorithms. The work presents a variational gated autoencoder-based feature extraction model to extract complex contextual features for inferring potential disease-miRNA associations. Specifically, our model fuses three different similarities of miRNAs into a comprehensive miRNA network and then combines two various similarities of diseases into a comprehensive disease network, respectively. Then, a novel graph autoencoder is designed to extract multilevel representations based on variational gate mechanisms from heterogeneous networks of miRNAs and diseases. Finally, a gate-based association predictor is devised to combine multiscale representations of miRNAs and diseases via a novel contrastive cross-entropy function, and then infer disease-miRNA associations. Experimental results indicate that our proposed model achieves remarkable association prediction performance, proving the efficacy of the variational gate mechanism and contrastive cross-entropy loss for inferring disease-miRNA associations.
Collapse
Affiliation(s)
- Yanbu Guo
- College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China.
| | - Dongming Zhou
- School of Information Science and Engineering, Yunnan University, Kunming 650500, China.
| | - Xiaoli Ruan
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China.
| | - Jinde Cao
- School of Mathematics, Southeast University, Nanjing 211189, China; Yonsei Frontier Lab, Yonsei University, Seoul 03722, South Korea.
| |
Collapse
|
3
|
Chen M, Deng Y, Li Z, Ye Y, He Z. KATZNCP: a miRNA-disease association prediction model integrating KATZ algorithm and network consistency projection. BMC Bioinformatics 2023; 24:229. [PMID: 37268893 DOI: 10.1186/s12859-023-05365-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2022] [Accepted: 05/26/2023] [Indexed: 06/04/2023] Open
Abstract
BACKGROUND Clinical studies have shown that miRNAs are closely related to human health. The study of potential associations between miRNAs and diseases will contribute to a profound understanding of the mechanism of disease development, as well as human disease prevention and treatment. MiRNA-disease associations predicted by computational methods are the best complement to biological experiments. RESULTS In this research, a federated computational model KATZNCP was proposed on the basis of the KATZ algorithm and network consistency projection to infer the potential miRNA-disease associations. In KATZNCP, a heterogeneous network was initially constructed by integrating the known miRNA-disease association, integrated miRNA similarities, and integrated disease similarities; then, the KATZ algorithm was implemented in the heterogeneous network to obtain the estimated miRNA-disease prediction scores. Finally, the precise scores were obtained by the network consistency projection method as the final prediction results. KATZNCP achieved the reliable predictive performance in leave-one-out cross-validation (LOOCV) with an AUC value of 0.9325, which was better than the state-of-the-art comparable algorithms. Furthermore, case studies of lung neoplasms and esophageal neoplasms demonstrated the excellent predictive performance of KATZNCP. CONCLUSION A new computational model KATZNCP was proposed for predicting potential miRNA-drug associations based on KATZ and network consistency projections, which can effectively predict the potential miRNA-disease interactions. Therefore, KATZNCP can be used to provide guidance for future experiments.
Collapse
Affiliation(s)
- Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| | - Yingwei Deng
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China.
| | - Zejun Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| | - Yifan Ye
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| | - Ziyi He
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| |
Collapse
|
4
|
Zhou F, Yin MM, Zhao JX, Shang J, Liu JX. A Method Based On Dual-Network Information Fusion to Predict MiRNA-Disease Associations. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:52-60. [PMID: 34882558 DOI: 10.1109/tcbb.2021.3133006] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
MicroRNAs (miRNAs) are single-stranded small RNAs. An increasing number of studies have shown that miRNAs play a vital role in many important biological processes. However, some experimental methods to predict unknown miRNA-disease associations (MDAs) are time-consuming and costly. Only a small percentage of MDAs are verified by researchers. Therefore, there is a great need for high-speed and efficient methods to predict novel MDAs. In this paper, a new computational method based on Dual-Network Information Fusion (DNIF) is developed to predict potential MDAs. Specifically, on the one hand, two enhanced sub-models are integrated to reconstruct an effective prediction framework; on the other hand, the prediction performance of the algorithm is improved by fully fusing multiple omics data information, including validated miRNA-disease associations network, miRNA functional similarity, disease semantic similarity and Gaussian interaction profile (GIP) kernel network associations. As a result, DNIF achieves the excellent performance under situation of 5-fold cross validation (average AUC of 0.9571). In the cases study of three important human diseases, our model has achieved satisfactory performance in predicting potential miRNAs for certain diseases. The reliable experimental results demonstrate that DNIF could serve as an effective calculation method to accelerate the identification of MDAs.
Collapse
|
5
|
Lee M, Kim PJ, Joe H, Kim HG. Gene-centric multi-omics integration with convolutional encoders for cancer drug response prediction. Comput Biol Med 2022; 151:106192. [PMID: 36327883 DOI: 10.1016/j.compbiomed.2022.106192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Revised: 08/26/2022] [Accepted: 10/08/2022] [Indexed: 12/27/2022]
Abstract
MOTIVATION Tumor heterogeneity, including genetic and transcriptomic characteristics, can reduce the efficacy of anticancer pharmacological therapy, resulting in clinical variability in patient response to therapeutic medications. Multi-omics integration can allow in silico models to provide an additional perspective on a biological system. METHODS In this study, we propose a gene-centric multi-channel (GCMC) architecture to integrate multi-omics for predicting cancer drug response. GCMC transformed multi-omics profiles into a three-dimensional tensor with an additional dimension for omics types. GCMC's convolutional encoders captures multi-omics profiles for each gene and yields gene-centric features to predict drug responses. RESULTS We evaluated GCMC on various datasets, including The Cancer Genome Atlas (TCGA) patients, patient-derived xenografts (PDX) mice models, and the Genomics of Drug Sensitivity in Cancer (GDSC) cell line datasets. GCMC achieved better performance than baseline models, including single-omics models, in more than 75% of 265 drugs from GDSC cell line datasets. Furthermore, as for the clinical applicability of GCMC, it achieved the best performance on TCGA and PDX datasets in terms of both AUPR and AUC. We also analyzed models' capability of integrating multi-omics profiles by measuring the contribution ratio of omics types. GCMC can incorporate multi-omics profiles in various manners to enhance performance for each drug type. These results suggested that GCMC can improve performance and feature extraction capability by integrating multi-omics profiles in a gene-centric manner.
Collapse
Affiliation(s)
- Munhwan Lee
- Biomedical Knowledge Engineering Lab., Seoul National University, 1 Gwanak-ro, Seoul, 08826, Republic of Korea.
| | - Pil-Jong Kim
- Biomedical Knowledge Engineering Lab., Seoul National University, 1 Gwanak-ro, Seoul, 08826, Republic of Korea.
| | - Hyunwhan Joe
- Biomedical Knowledge Engineering Lab., Seoul National University, 1 Gwanak-ro, Seoul, 08826, Republic of Korea.
| | - Hong-Gee Kim
- Biomedical Knowledge Engineering Lab., Seoul National University, 1 Gwanak-ro, Seoul, 08826, Republic of Korea.
| |
Collapse
|
6
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: towards systematic evaluation of computational models. Brief Bioinform 2022; 23:6712303. [PMID: 36151749 DOI: 10.1093/bib/bbac407] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2022] [Revised: 08/11/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022] Open
Abstract
Currently, there exist no generally accepted strategies of evaluating computational models for microRNA-disease associations (MDAs). Though K-fold cross validations and case studies seem to be must-have procedures, the value of K, the evaluation metrics, and the choice of query diseases as well as the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports) are all determined on a case-by-case basis and depending on the researchers' choices. In the current review, we include a comprehensive analysis on how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model to facilitate fair and systematic assessment of predictive performance.
Collapse
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China.,The Future Laboratory, Tsinghua University, Beijing, 10084, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.,Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
| |
Collapse
|
7
|
Ai N, Liang Y, Yuan HL, Ou-Yang D, Liu XY, Xie SL, Ji YH. MHDMF: Prediction of miRNA-disease associations based on Deep Matrix Factorization with Multi-source Graph Convolutional Network. Comput Biol Med 2022; 149:106069. [PMID: 36115300 DOI: 10.1016/j.compbiomed.2022.106069] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2022] [Revised: 07/31/2022] [Accepted: 08/27/2022] [Indexed: 11/24/2022]
Abstract
A growing number of works have proved that microRNAs (miRNAs) are a crucial biomarker in diverse bioprocesses affecting various diseases. As a good complement to high-cost wet experiment-based methods, numerous computational prediction methods have sprung up. However, there are still challenges that exist in making effective use of high false-negative associations and multi-source information for finding the potential associations. In this work, we develop an end-to-end computational framework, called MHDMF, which integrates the multi-source information on a heterogeneous network to discover latent disease-miRNA associations. Since high false-negative exist in the miRNA-disease associations, MHDMF utilizes the multi-source Graph Convolutional Network (GCN) to correct the false-negative association by reformulating the miRNA-disease association score matrix. The score matrix reformulation is based on different similarity profiles and known associations between miRNAs, genes, and diseases. Then, MHDMF employs Deep Matrix Factorization (DMF) to predict the miRNA-disease associations based on reformulated miRNA-disease association score matrix. The experimental results show that the proposed framework outperforms highly related comparison methods by a large margin on tasks of miRNA-disease association prediction. Furthermore, case studies suggest that MHDMF could be a convenient and efficient tool and may supply a new way to think about miRNA-disease association prediction.
Collapse
|
8
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022] Open
Abstract
Since the problem proposed in late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources gains a more comprehensive research perspective, and brings a challenge to algorithm design for generating accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combing multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce the taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progresses that have been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborated on how future models can be designed according to a set of new schemas. Lastly, we analysed the strengths and weaknesses of each model category in the proposed taxonomy and proposed future research directions from diverse perspectives for enhancing model performance.
Collapse
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China.,The Future Laboratory, Tsinghua University, Beijing, 10084, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.,Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
| |
Collapse
|
9
|
Ding Y, Lei X, Liao B, Wu FX. MLRDFM: a multi-view Laplacian regularized DeepFM model for predicting miRNA-disease associations. Brief Bioinform 2022; 23:6552270. [PMID: 35323901 DOI: 10.1093/bib/bbac079] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 02/07/2022] [Accepted: 02/15/2022] [Indexed: 01/20/2023] Open
Abstract
MOTIVATION MicroRNAs (miRNAs), as critical regulators, are involved in various fundamental and vital biological processes, and their abnormalities are closely related to human diseases. Predicting disease-related miRNAs is beneficial to uncovering new biomarkers for the prevention, detection, prognosis, diagnosis and treatment of complex diseases. RESULTS In this study, we propose a multi-view Laplacian regularized deep factorization machine (DeepFM) model, MLRDFM, to predict novel miRNA-disease associations while improving the standard DeepFM. Specifically, MLRDFM improves DeepFM from two aspects: first, MLRDFM takes the relationships among items into consideration by regularizing their embedding features via their similarity-based Laplacians. In this study, miRNA Laplacian regularization integrates four types of miRNA similarity, while disease Laplacian regularization integrates two types of disease similarity. Second, to judiciously train our model, Laplacian eigenmaps are utilized to initialize the weights in the dense embedding layer. The experimental results on the latest HMDD v3.2 dataset show that MLRDFM improves the performance and reduces the overfitting phenomenon of DeepFM. Besides, MLRDFM is greatly superior to the state-of-the-art models in miRNA-disease association prediction in terms of different evaluation metrics with the 5-fold cross-validation. Furthermore, case studies further demonstrate the effectiveness of MLRDFM.
Collapse
Affiliation(s)
- Yulian Ding
- Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, 620 West Chang'an Avenue, 710119, Shaanxi, China
| | - Bo Liao
- School of Mathematics and Statistics, Hainan Normal University, 99 Longkun South Road, 571158, Hainan, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada.,Department of Mechanical Engineering and Department of Computer Science, University of Saskatchewan, 57 Campus Drive, S7N5A9, Saskatchewan, Canada
| |
Collapse
|
10
|
Uthayopas K, de Sá AGC, Alavi A, Pires DEV, Ascher DB. TSMDA: Target and symptom-based computational model for miRNA-disease-association prediction. Mol Ther Nucleic Acids 2021; 26:536-546. [PMID: 34631283 PMCID: PMC8479276 DOI: 10.1016/j.omtn.2021.08.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/14/2021] [Accepted: 08/19/2021] [Indexed: 02/06/2023]
Abstract
The emergence of high-throughput sequencing techniques has revealed a primary role of microRNAs (miRNAs) in a wide range of diseases, including cancers and neurodegenerative disorders. Understanding novel relationships between miRNAs and diseases can potentially unveil complex pathogenesis mechanisms, leading to effective diagnosis and treatment. The investigation of novel miRNA-disease associations, however, is currently costly and time consuming. Over the years, several computational models have been proposed to prioritize potential miRNA-disease associations, but with limited usability or predictive capability. In order to fill this gap, we introduce TSMDA, a novel machine-learning method that leverages target and symptom information and negative sample selection to predict miRNA-disease association. TSMDA significantly outperforms similar methods, achieving an area under the receiver operating characteristic (ROC) curve (AUC) of 0.989 and 0.982 under 5-fold cross-validation and blind test, respectively. We also demonstrate the capability of the method to uncover potential miRNA-disease associations in breast, prostate, and lung cancers, as case studies. We believe TSMDA will be an invaluable tool for the community to explore and prioritize potentially new miRNA-disease associations for further experimental characterization. The method was made available as a freely accessible and user-friendly web interface at http://biosig.unimelb.edu.au/tsmda/.
Collapse
Affiliation(s)
- Korawich Uthayopas
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia
| | - Alex G C de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, VIC, Australia
| | - Azadeh Alavi
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia
| | - Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia.,School of Computing and Information Systems, University of Melbourne, Parkville 3052, VIC, Australia
| | - David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, VIC, Australia.,Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
| |
Collapse
|
11
|
Toprak A, Eryilmaz Dogan E. Prediction of Potential MicroRNA-Disease Association Using Kernelized Bayesian Matrix Factorization. Interdiscip Sci 2021; 13:595-602. [PMID: 34370220 DOI: 10.1007/s12539-021-00469-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2020] [Revised: 07/05/2021] [Accepted: 07/30/2021] [Indexed: 10/20/2022]
Abstract
MicroRNA (miRNA) molecules, which are effective in the formation and progression of many different diseases, are 18-22 nucleotides in length and make up a type of non-coding RNA. Predicting disease-related microRNAs is crucial for understanding the pathogenesis of disease and for diagnosis, treatment, and prevention of diseases. Many computational techniques have been studied and developed, as the experimental techniques used to find novel miRNA-disease associations in biology are costly. In this paper, a Kernelized Bayesian Matrix Factorization (KBMF) technique was suggested to predict new relations among miRNAs and diseases with several information such as miRNA functional similarity, disease semantic similarity, and known relations among miRNAs and diseases. AUC value of 0.9450 was obtained by implementing fivefold cross-validation for KBMF technique. We also carried out three kinds of case studies (breast, lung, and colon neoplasms) to prove the performance of KBMF technique, and the predictive reliability of this method was confirmed by the results. Thus, KBMF technique can be used as a reliable computational model to infer possible miRNA-disease associations.
Collapse
Affiliation(s)
- Ahmet Toprak
- Department of Electricity and Energy, Bozkır Vocational School, Selcuk University, Bozkır, Konya, Turkey
| | - Esma Eryilmaz Dogan
- Department of Biomedical Engineering, Faculty of Technology, Selcuk University, Selçuklu, Konya, Turkey.
| |
Collapse
|
12
|
Dai Q, Chu Y, Li Z, Zhao Y, Mao X, Wang Y, Xiong Y, Wei DQ. MDA-CF: Predicting MiRNA-Disease associations based on a cascade forest model by fusing multi-source information. Comput Biol Med 2021; 136:104706. [PMID: 34371319 DOI: 10.1016/j.compbiomed.2021.104706] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2021] [Revised: 07/26/2021] [Accepted: 07/26/2021] [Indexed: 01/17/2023]
Abstract
MicroRNAs (miRNAs) are significant regulators in various biological processes. They may become promising biomarkers or therapeutic targets, which provide a new perspective in diagnosis and treatment of multiple diseases. Since the experimental methods are always costly and resource-consuming, prediction of disease-related miRNAs using computational methods is in great need. In this study, we developed MDA-CF to identify underlying miRNA-disease associations based on a cascade forest model. In this method, multi-source information was integrated to represent miRNAs and diseases comprehensively, and the autoencoder was utilized for dimension reduction to obtain the optimal feature space. The cascade forest model was then employed for miRNA-disease association prediction. As a result, the average AUC of MDA-CF was 0.9464 on HMDD v3.2 in five-fold cross-validation. Compared with previous computational methods, MDA-CF performed better on HMDD v2.0 with an average AUC of 0.9258. Moreover, MDA-CF was implemented to investigate colon neoplasm, breast neoplasm, and gastric neoplasm, and 100%, 86%, 88% of the top 50 potential miRNAs were validated by authoritative databases. In conclusion, MDA-CF appears to be a reliable method to uncover disease-associated miRNAs. The source code of MDA-CF is available at https://github.com/a1622108/MDA-CF.
Collapse
Affiliation(s)
- Qiuying Dai
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yanyi Chu
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Zhiqi Li
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yusong Zhao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xueying Mao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yanjing Wang
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; Peng Cheng Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China.
| |
Collapse
|
13
|
Li A, Deng Y, Tan Y, Chen M. A novel miRNA-disease association prediction model using dual random walk with restart and space projection federated method. PLoS One 2021; 16:e0252971. [PMID: 34138933 PMCID: PMC8211179 DOI: 10.1371/journal.pone.0252971] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2021] [Accepted: 05/26/2021] [Indexed: 12/27/2022] Open
Abstract
A large number of studies have shown that the variation and disorder of miRNAs are important causes of diseases. The recognition of disease-related miRNAs has become an important topic in the field of biological research. However, the identification of disease-related miRNAs by biological experiments is expensive and time consuming. Thus, computational prediction models that predict disease-related miRNAs must be developed. A novel network projection-based dual random walk with restart (NPRWR) was used to predict potential disease-related miRNAs. The NPRWR model aims to estimate and accurately predict miRNA-disease associations by using dual random walk with restart and network projection technology, respectively. The leave-one-out cross validation (LOOCV) was adopted to evaluate the prediction performance of NPRWR. The results show that the area under the receiver operating characteristic curve(AUC) of NPRWR was 0.9029, which is superior to that of other advanced miRNA-disease associated prediction methods. In addition, lung and kidney neoplasms were selected to present a case study. Among the first 50 miRNAs predicted, 50 and 49 miRNAs have been proven by in databases or relevant literature. Moreover, NPRWR can be used to predict isolated diseases and new miRNAs. LOOCV and the case study achieved good prediction results. Thus, NPRWR will become an effective and accurate disease-miRNA association prediction model.
Collapse
Affiliation(s)
- Ang Li
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| | - Yingwei Deng
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
- Hainan Key Laboratory for Computational Science and Application, Haikou, China
| | - Yan Tan
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| | - Min Chen
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| |
Collapse
|
14
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of the miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, the experimental methods are costly and time-consuming. Thus, it is urgent to develop computational methods towards the prediction of MDAs. Based on the graph theory, the MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve the training efficiency and accuracy. This method models both the potential connections of feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, we considered six tasks simultaneously on the MDA prediction problem at the first time, which ensure that under both balanced and unbalanced sample distribution, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method has achieved satisfactory performance on all six tasks and is significantly superior to the classic machine learning methods and the state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs are conducted and achieved satisfactory performance.
Collapse
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Xuhong Wang
- School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
| | - Qiuying Dai
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, China
| | | | | | - Dennis Russell Salahub
- Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| |
Collapse
|
15
|
Abstract
Biomolecules, such as microRNAs, circRNAs, lncRNAs and genes, are functionally interdependent in human cells, and all play critical roles in diverse fundamental and vital biological processes. The dysregulations of such biomolecules can cause diseases. Identifying the associations between biomolecules and diseases can uncover the mechanisms of complex diseases, which is conducive to their diagnosis, treatment, prognosis and prevention. Due to the time consumption and cost of biologically experimental methods, many computational association prediction methods have been proposed in the past few years. In this study, we provide a comprehensive review of machine learning-based approaches for predicting disease-biomolecule associations with multi-view data sources. Firstly, we introduce some databases and general strategies for integrating multi-view data sources in the prediction models. Then we discuss several feature representation methods for machine learning-based prediction models. Thirdly, we comprehensively review machine learning-based prediction approaches in three categories: basic machine learning methods, matrix completion-based methods and deep learning-based methods, while discussing their advantages and disadvantages. Finally, we provide some perspectives for further improving biomolecule-disease prediction methods.
Collapse
Affiliation(s)
- Yulian Ding
- Division of Biomedical Engineering at the University of Saskatchewan
| | - Xiujuan Lei
- School of Computer Science at Shaanxi Normal University
| | - Bo Liao
- School of Mathematics and Statistics at Hainan Normal University, Haikou, China
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan
| |
Collapse
|
16
|
Lei X, Mudiyanselage TB, Zhang Y, Bian C, Lan W, Yu N, Pan Y. A comprehensive survey on computational methods of non-coding RNA and disease association prediction. Brief Bioinform 2020; 22:6042241. [PMID: 33341893 DOI: 10.1093/bib/bbaa350] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2020] [Revised: 10/20/2020] [Accepted: 11/01/2020] [Indexed: 02/06/2023] Open
Abstract
The studies on relationships between non-coding RNAs and diseases are widely carried out in recent years. A large number of experimental methods and technologies of producing biological data have also been developed. However, due to their high labor cost and production time, nowadays, calculation-based methods, especially machine learning and deep learning methods, have received a lot of attention and been used commonly to solve these problems. From a computational point of view, this survey mainly introduces three common non-coding RNAs, i.e. miRNAs, lncRNAs and circRNAs, and the related computational methods for predicting their association with diseases. First, the mainstream databases of above three non-coding RNAs are introduced in detail. Then, we present several methods for RNA similarity and disease similarity calculations. Later, we investigate ncRNA-disease prediction methods in details and classify these methods into five types: network propagating, recommend system, matrix completion, machine learning and deep learning. Furthermore, we provide a summary of the applications of these five types of computational methods in predicting the associations between diseases and miRNAs, lncRNAs and circRNAs, respectively. Finally, the advantages and limitations of various methods are identified, and future researches and challenges are also discussed.
Collapse
Affiliation(s)
- Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | | | - Yuchen Zhang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Chen Bian
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Wei Lan
- School of Computer, Electronics and Information at Guangxi University, Nanning, China
| | - Ning Yu
- Department of Computing Sciences at the College at Brockport, State University of New York, Rochester, NY, USA
| | - Yi Pan
- Computer Science Department at Georgia State University, Atlanta, GA, USA
| |
Collapse
|
17
|
Toprak A, Eryilmaz E. Prediction of miRNA-disease associations based on Weighted [Formula: see text]-Nearest known neighbors and network consistency projection. J Bioinform Comput Biol 2020; 19:2050041. [PMID: 33148093 DOI: 10.1142/s0219720020500419] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
MicroRNAs (miRNA) are a type of non-coding RNA molecules that are effective on the formation and the progression of many different diseases. Various researches have reported that miRNAs play a major role in the prevention, diagnosis, and treatment of complex human diseases. In recent years, researchers have made a tremendous effort to find the potential relationships between miRNAs and diseases. Since the experimental techniques used to find that new miRNA-disease relationships are time-consuming and expensive, many computational techniques have been developed. In this study, Weighted [Formula: see text]-Nearest Known Neighbors and Network Consistency Projection techniques were suggested to predict new miRNA-disease relationships using various types of knowledge such as known miRNA-disease relationships, functional similarity of miRNA, and disease semantic similarity. An average AUC of 0.9037 and 0.9168 were calculated in our method by 5-fold and leave-one-out cross validation, respectively. Case studies of breast, lung, and colon neoplasms were applied to prove the performance of our proposed technique, and the results confirmed the predictive reliability of this method. Therefore, reported experimental results have shown that our proposed method can be used as a reliable computational model to reveal potential relationships between miRNAs and diseases.
Collapse
Affiliation(s)
- Ahmet Toprak
- Department of Electricity and Energy, Bozkır Vocational School, Selcuk University, Konya, Turkey
| | - Esma Eryilmaz
- Department of Biomedical Engineering, Faculty of Technology, Selcuk University, Konya, Turkey
| |
Collapse
|
18
|
Zhang Y, Chen M, Cheng X, Wei H. MSFSP: A Novel miRNA-Disease Association Prediction Model by Federating Multiple-Similarities Fusion and Space Projection. Front Genet 2020; 11:389. [PMID: 32425980 PMCID: PMC7204399 DOI: 10.3389/fgene.2020.00389] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2020] [Accepted: 03/27/2020] [Indexed: 12/11/2022] Open
Abstract
Growing evidences have indicated that microRNAs (miRNAs) play a significant role relating to many important bioprocesses; their mutations and disorders will cause the occurrence of various complex diseases. The prediction of miRNAs associated with underlying diseases via computational approaches is beneficial to identify biomarkers and discover specific medicine, which can greatly reduce the cost of diagnosis, cure, prognosis, and prevention of human diseases. However, how to further achieve a more reliable prediction of potential miRNA-disease associations with effective integration of different biological data is a challenge for researchers. In this study, we proposed a computational model by using a federated method of combined multiple-similarities fusion and space projection (MSFSP). MSFSP firstly fused the integrated disease similarity (composed of disease semantic similarity, disease functional similarity, and disease Hamming similarity) with the integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity, and miRNA Hamming similarity). Secondly, it constructed the weighted network of miRNA-disease associations from the experimentally verified Boolean network of miRNA-disease associations by using similarity networks. Finally, it calculated the prediction results by weighting miRNA space projection scores and the disease space projection scores. Leave-one-out cross-validation demonstrated that MSFSP has the distinguished predictive accuracy with area under the receiver operating characteristics curve (AUC) of 0.9613 better than that of five other existing models. In case studies, the predictive ability of MSFSP was further confirmed as 96 and 98% of the top 50 predictions for prostatic neoplasms and lung neoplasms were successfully validated by experimental evidences and supporting experimental evidences were also found for 100% of the top 50 predictions for isolated diseases.
Collapse
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Xiaohui Cheng
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Hanyan Wei
- School of Pharmacy, Guilin Medical University, Guilin, China
| |
Collapse
|
19
|
Xiao Q, Zhang N, Luo J, Dai J, Tang X. Adaptive multi-source multi-view latent feature learning for inferring potential disease-associated miRNAs. Brief Bioinform 2020; 22:2043-2057. [PMID: 32186712 DOI: 10.1093/bib/bbaa028] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Revised: 02/16/2020] [Accepted: 01/14/2020] [Indexed: 12/13/2022] Open
Abstract
Accumulating evidence has shown that microRNAs (miRNAs) play crucial roles in different biological processes, and their mutations and dysregulations have been proved to contribute to tumorigenesis. In silico identification of disease-associated miRNAs is a cost-effective strategy to discover those most promising biomarkers for disease diagnosis and treatment. The increasing available omics data sources provide unprecedented opportunities to decipher the underlying relationships between miRNAs and diseases by computational models. However, most existing methods are biased towards a single representation of miRNAs or diseases and are also not capable of discovering unobserved associations for new miRNAs or diseases without association information. In this study, we present a novel computational method with adaptive multi-source multi-view latent feature learning (M2LFL) to infer potential disease-associated miRNAs. First, we adopt multiple data sources to obtain similarity profiles and capture different latent features according to the geometric characteristic of miRNA and disease spaces. Then, the multi-modal latent features are projected to a common subspace to discover unobserved miRNA-disease associations in both miRNA and disease views, and an adaptive joint graph regularization term is developed to preserve the intrinsic manifold structures of multiple similarity profiles. Meanwhile, the Lp,q-norms are imposed into the projection matrices to ensure the sparsity and improve interpretability. The experimental results confirm the superior performance of our proposed method in screening reliable candidate disease miRNAs, which suggests that M2LFL could be an efficient tool to discover diagnostic biomarkers for guiding laborious clinical trials.
Collapse
|
20
|
Peng LH, Zhou LQ, Chen X, Piao X. A Computational Study of Potential miRNA-Disease Association Inference Based on Ensemble Learning and Kernel Ridge Regression. Front Bioeng Biotechnol 2020; 8:40. [PMID: 32117922 PMCID: PMC7015868 DOI: 10.3389/fbioe.2020.00040] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2019] [Accepted: 01/17/2020] [Indexed: 12/11/2022] Open
Abstract
As increasing experimental studies have shown that microRNAs (miRNAs) are closely related to multiple biological processes and the prevention, diagnosis and treatment of human diseases, a growing number of researchers are focusing on the identification of associations between miRNAs and diseases. Identifying such associations purely via experiments is costly and demanding, which prompts researchers to develop computational methods to complement the experiments. In this paper, a novel prediction model named Ensemble of Kernel Ridge Regression based MiRNA-Disease Association prediction (EKRRMDA) was developed. EKRRMDA obtained features of miRNAs and diseases by integrating the disease semantic similarity, the miRNA functional similarity and the Gaussian interaction profile kernel similarity for diseases and miRNAs. Under the computational framework that utilized ensemble learning and feature dimensionality reduction, multiple base classifiers that combined two Kernel Ridge Regression classifiers from the miRNA side and disease side, respectively, were obtained based on random selection of features. Then average strategy for these base classifiers was adopted to obtain final association scores of miRNA-disease pairs. In the global and local leave-one-out cross validation, EKRRMDA attained the AUCs of 0.9314 and 0.8618, respectively. Moreover, the model’s average AUC with standard deviation in 5-fold cross validation was 0.9275 ± 0.0008. In addition, we implemented three different types of case studies on predicting miRNAs associated with five important diseases. As a result, there were 90% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 86% (Lymphoma), 98% (Lung Neoplasms), and 96% (Breast Neoplasms) of the top 50 predicted miRNAs verified to have associations with these diseases.
Collapse
Affiliation(s)
- Li-Hong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Li-Qian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Xue Piao
- School of Medical Informatics, Xuzhou Medical University, Xuzhou, China
| |
Collapse
|
21
|
Wu M, Yang Y, Wang H, Ding J, Zhu H, Xu Y. IMPMD: An Integrated Method for Predicting Potential Associations Between miRNAs and Diseases. Curr Genomics 2020; 20:581-591. [PMID: 32581646 PMCID: PMC7290057 DOI: 10.2174/1389202920666191023090215] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 10/16/2019] [Indexed: 01/06/2023] Open
Abstract
Background With the rapid development of biological research, microRNAs (miRNAs) have increasingly attracted worldwide attention. The increasing biological studies and scientific experiments have proven that miRNAs are related to the occurrence and development of a large number of key biological processes which cause complex human diseases. Thus, identifying the association between miRNAs and disease is helpful to diagnose the diseases. Although some studies have found considerable associations between miRNAs and diseases, there are still a lot of associations that need to be identified. Experimental methods to uncover miRNA-disease associations are time-consuming and expensive. Therefore, effective computational methods are urgently needed to predict new associations. Methodology In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combination of functional similarity, gaussian similarity and Jaccard similarity. To diseases, it is obtained by combination of semantic similarity, gaussian similarity and Jaccard similarity. Then, we use these two enhanced similarities to construct the features and calculate cumulative score to choose robust features. Finally, the general linear regression is applied to assign weights for Support Vector Machine, K-Nearest Neighbor and Logistic Regression algorithms. Results IMPMD obtains AUC of 0.9386 in 10-fold cross-validation, which is better than most of the previous models. To further evaluate our model, we implement IMPMD on two types of case studies for lung cancer and breast cancer. 49 (Lung Cancer) and 50 (Breast Cancer) out of the top 50 related miRNAs are validated by experimental discoveries. Conclusion We built a software named IMPMD which can be freely downloaded from https://github.com/Sunmile/IMPMD.
Collapse
Affiliation(s)
- Meiqi Wu
- 1Department of Information and Computer Science, University of Science and Technology Beijing, Beijing100083, China; 2Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3Institute of Computing Technology, Chinese Academy of Sciences, Beijing100080, China
| | - Yingxi Yang
- 1Department of Information and Computer Science, University of Science and Technology Beijing, Beijing100083, China; 2Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3Institute of Computing Technology, Chinese Academy of Sciences, Beijing100080, China
| | - Hui Wang
- 1Department of Information and Computer Science, University of Science and Technology Beijing, Beijing100083, China; 2Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3Institute of Computing Technology, Chinese Academy of Sciences, Beijing100080, China
| | - Jun Ding
- 1Department of Information and Computer Science, University of Science and Technology Beijing, Beijing100083, China; 2Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3Institute of Computing Technology, Chinese Academy of Sciences, Beijing100080, China
| | - Huan Zhu
- 1Department of Information and Computer Science, University of Science and Technology Beijing, Beijing100083, China; 2Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3Institute of Computing Technology, Chinese Academy of Sciences, Beijing100080, China
| | - Yan Xu
- 1Department of Information and Computer Science, University of Science and Technology Beijing, Beijing100083, China; 2Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3Institute of Computing Technology, Chinese Academy of Sciences, Beijing100080, China
| |
Collapse
|
22
|
Oskooei A, Manica M, Mathis R, Martínez MR. Network-based Biased Tree Ensembles (NetBiTE) for Drug Sensitivity Prediction and Drug Sensitivity Biomarker Identification in Cancer. Sci Rep 2019; 9:15918. [PMID: 31685861 DOI: 10.1038/s41598-019-52093-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2019] [Accepted: 10/07/2019] [Indexed: 12/15/2022] Open
Abstract
We present the Network-based Biased Tree Ensembles (NetBiTE) method for drug sensitivity prediction and drug sensitivity biomarker identification in cancer using a combination of prior knowledge and gene expression data. Our devised method consists of a biased tree ensemble that is built according to a probabilistic bias weight distribution. The bias weight distribution is obtained from the assignment of high weights to the drug targets and propagating the assigned weights over a protein-protein interaction network such as STRING. The propagation of weights, defines neighborhoods of influence around the drug targets and as such simulates the spread of perturbations within the cell, following drug administration. Using a synthetic dataset, we showcase how application of biased tree ensembles (BiTE) results in significant accuracy gains at a much lower computational cost compared to the unbiased random forests (RF) algorithm. We then apply NetBiTE to the Genomics of Drug Sensitivity in Cancer (GDSC) dataset and demonstrate that NetBiTE outperforms RF in predicting IC50 drug sensitivity, only for drugs that target membrane receptor pathways (MRPs): RTK, EGFR and IGFR signaling pathways. We propose based on the NetBiTE results, that for drugs that inhibit MRPs, the expression of target genes prior to drug administration is a biomarker for IC50 drug sensitivity following drug administration. We further verify and reinforce this proposition through control studies on, PI3K/MTOR signaling pathway inhibitors, a drug category that does not target MRPs, and through assignment of dummy targets to MRP inhibiting drugs and investigating the variation in NetBiTE accuracy.
Collapse
|
23
|
Huang Z, Liu L, Gao Y, Shi J, Cui Q, Li J, Zhou Y. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol 2019; 20:202. [PMID: 31594544 PMCID: PMC6781296 DOI: 10.1186/s13059-019-1811-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2019] [Accepted: 09/03/2019] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. RESULTS Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform systematic comparison among 36 readily available prediction methods. Their overall performances are evaluated with rigorous precision-recall curve analysis, where 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential of performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in up to 16% and 46% of AUPRC augmentations compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods, which is that the prediction results are severely biased toward well-annotated diseases with many associated miRNAs known and cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. CONCLUSION Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest the future directions for the development of more robust miRNA-disease association predictors.
Collapse
Affiliation(s)
- Zhou Huang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Leibo Liu
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
| | - Yuanxu Gao
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Jiangcheng Shi
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China.
| | - Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China.
| |
Collapse
|
24
|
Zhang L, Chen X, Yin J. Prediction of Potential miRNA-Disease Associations Through a Novel Unsupervised Deep Learning Framework with Variational Autoencoder. Cells 2019; 8:cells8091040. [PMID: 31489920 PMCID: PMC6770222 DOI: 10.3390/cells8091040] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2019] [Revised: 08/31/2019] [Accepted: 09/02/2019] [Indexed: 12/22/2022] Open
Abstract
The important role of microRNAs (miRNAs) in the formation, development, diagnosis, and treatment of diseases has attracted much attention among researchers recently. In this study, we present an unsupervised deep learning model of the variational autoencoder for MiRNA–disease association prediction (VAEMDA). Through combining the integrated miRNA similarity and the integrated disease similarity with known miRNA–disease associations, respectively, we constructed two spliced matrices. These matrices were applied to train the variational autoencoder (VAE), respectively. The final predicted association scores between miRNAs and diseases were obtained by integrating the scores from the two trained VAE models. Unlike previous models, VAEMDA can avoid noise introduced by the random selection of negative samples and reveal associations between miRNAs and diseases from the perspective of data distribution. Compared with previous methods, VAEMDA obtained higher area under the receiver operating characteristics curves (AUCs) of 0.9118, 0.8652, and 0.9091 ± 0.0065 in global leave-one-out cross validation (LOOCV), local LOOCV, and five-fold cross validation, respectively. Further, the AUCs of VAEMDA were 0.8250 and 0.8237 in global leave-one-disease-out cross validation (LODOCV), and local LODOCV, respectively. In three different types of case studies on three important diseases, the results showed that most of the top 50 potentially associated miRNAs were verified by databases and the literature.
Collapse
Affiliation(s)
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
| | - Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
| |
Collapse
|
25
|
Wiczling P, Daghir-Wojtkowiak E, Kaliszan R, Markuszewski MJ, Limon J, Koczkowska M, Stukan M, Kuźniacka A, Ratajska M. Bayesian multilevel model of micro RNA levels in ovarian-cancer and healthy subjects. PLoS One 2019; 14:e0221764. [PMID: 31465488 PMCID: PMC6715278 DOI: 10.1371/journal.pone.0221764] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2018] [Accepted: 08/14/2019] [Indexed: 12/31/2022] Open
Abstract
In transcriptomics, micro RNAs (miRNAs) has gained much interest especially as potential disease indicators. However, apart from holding a great promise related to their clinical application, a lot of inconsistent results have been published. Our aim was to compare the miRNA expression levels in ovarian cancer and healthy subjects using the Bayesian multilevel model and to assess their potential usefulness in diagnosis. We have analyzed a case-control observational data on expression profiling of 49 preselected miRNA-based ovarian cancer indicators in 119 controls and 59 patients. A Bayesian multilevel model was used to characterize the effect of disease on miRNA levels controlling for differences in age and body weight. The difference between the miRNA level and health status of the patient on the scale of the data variability were discussed in the context of their potential usefulness in diagnosis. Additionally, the cross-validated area under the ROC curve (AUC) was used to assess the expected out-of-sample discrimination index of a different sets of miRNAs. The proposed model allowed us to describe the set of miRNA levels in patients and controls. Three highly correlated miRNAs: miR-101-3p, miR-142-5p, miR-148a-3p rank the highest with almost identical effect sizes that ranges from 0.45 to 1.0. For those miRNAs the credible interval for AUC ranged from 0.63 to 0.67 indicating their limited discrimination potential. A little benefit in adding information from other miRNAs was observed. There were several miRNAs in the dataset (miR-604, hsa-miR-221-5p) for which inferences were uncertain. For those miRNAs more experimental effort is needed to fully assess their effect in the context of new hits discovery and usefulness as disease indicators. The proposed multilevel Bayesian model can be used to characterize the panel of miRNA profile and to assess the difference in expression levels between healthy and cancer individuals.
Collapse
Affiliation(s)
- Paweł Wiczling
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
- * E-mail:
| | - Emilia Daghir-Wojtkowiak
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Roman Kaliszan
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Michał Jan Markuszewski
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Janusz Limon
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| | - Magdalena Koczkowska
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| | - Maciej Stukan
- Department of Gynecological Oncology, Gdynia Oncology Centre, Powstania Styczniowego, Gdynia, Poland
| | - Alina Kuźniacka
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| | - Magdalena Ratajska
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| |
Collapse
|
26
|
Zhang Y, Chen M, Cheng X, Chen Z. LSGSP: a novel miRNA–disease association prediction model using a Laplacian score of the graphs and space projection federated method. RSC Adv 2019; 9:29747-29759. [PMID: 35531537 PMCID: PMC9071959 DOI: 10.1039/c9ra05554a] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2019] [Accepted: 09/09/2019] [Indexed: 12/31/2022] Open
Abstract
Lots of research findings have indicated that miRNAs (microRNAs) are involved in many important biological processes; their mutations and disorders are closely related to diseases, therefore, determining the associations between human diseases and miRNAs is key to understand pathogenic mechanisms. Existing biological experimental methods for identifying miRNA–disease associations are usually expensive and time consuming. Therefore, the development of efficient and reliable computational methods for identifying disease-related miRNAs has become an important topic in the field of biological research in recent years. In this study, we developed a novel miRNA–disease association prediction model using a Laplacian score of the graphs and space projection federated method (LSGSP). This integrates experimentally validated miRNA–disease associations, disease semantic similarity scores, miRNA functional scores, and miRNA family information to build a new disease similarity network and miRNA similarity network, and then obtains the global similarities of these networks through calculating the Laplacian score of the graphs, based on which the miRNA–disease weighted network can be constructed through combination with the miRNA–disease Boolean network. Finally, the miRNA–disease score was obtained via projecting the miRNA space and disease space onto the miRNA–disease weighted network. Compared with several other state-of-the-art methods, using leave-one-out cross validation (LOOCV) to evaluate the accuracy of LSGSP with respect to a benchmark dataset, prediction dataset and compare dataset, LSGSP showed excellent predictive performance with high AUC values of 0.9221, 0.9745 and 0.9194, respectively. In addition, for prostate neoplasms and lung neoplasms, the consistencies between the top 50 predicted miRNAs (obtained from LSGSP) and the results (confirmed from the updated HMDD, miR2Disease, and dbDEMC databases) reached 96% and 100%, respectively. Similarly, for isolated diseases (diseases not associated with any miRNAs), the consistencies between the top 50 predicted miRNAs (obtained from LSGSP) and the results (confirmed from the above-mentioned three databases) reached 98% and 100%, respectively. These results further indicate that LSGSP can effectively predict potential associations between miRNAs and diseases. Lots of research findings have indicated that the mutations and disorders of miRNAs (microRNAs) are closely related to diseases. Therefore, determining the associations between human diseases and miRNAs is key to understand the pathogenic mechanisms.![]()
Collapse
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering
- Guilin University of Technology
- 541004 Guilin
- China
| | - Min Chen
- School of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| | - Xiaohui Cheng
- School of Information Science and Engineering
- Guilin University of Technology
- 541004 Guilin
- China
| | - Zheng Chen
- School of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| |
Collapse
|