1
Liu T, Chen Q, Liu R, Sun Y, Wang Y, Zhu Y, Zhao T. DMGAT: predicting ncRNA-drug resistance associations based on diffusion map and heterogeneous graph attention network. Brief Bioinform 2025; 26:bbaf179. [PMID: 40251829 PMCID: PMC12008124 DOI: 10.1093/bib/bbaf179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2025] [Revised: 03/26/2025] [Accepted: 03/30/2025] [Indexed: 04/21/2025] Open
Abstract
Non-coding RNAs (ncRNAs) play crucial roles in drug resistance and sensitivity, making them important biomarkers and therapeutic targets. However, predicting ncRNA-drug associations is challenging due to issues such as dataset imbalance and sparsity, limiting the identification of robust biomarkers. Existing models often fall short in capturing local and global sequence information, limiting the reliability of predictions. This study introduces DMGAT (diffusion map and heterogeneous graph attention network), a novel deep learning model designed to predict ncRNA-drug associations. DMGAT integrates diffusion maps for sequence embedding, graph convolutional networks for feature extraction, and GAT for heterogeneous information fusion. To address dataset imbalance, the model incorporates sensitivity associations and employs a random forest classifier to select reliable negative samples. DMGAT embeds ncRNA sequences and drug SMILES using the word2vec technique, capturing local and global sequence information. The model constructs a heterogeneous network by combining sequence similarity and Gaussian Interaction Profile kernel similarity, providing a comprehensive representation of ncRNA-drug interactions. Evaluated through five-fold cross-validation on a curated dataset from NoncoRNA and ncDR, DMGAT outperforms seven state-of-the-art methods, achieving the highest area under the receiver operating characteristic curve (0.8964), area under the precision-recall curve (0.8984), recall (0.9576), and F1-score (0.8285). The raw data are released to Zenodo with identifier 13929676. The source code of DMGAT is available at https://github.com/liutingyu0616/DMGAT/tree/main.
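The Gaussian Interaction Profile (GIP) kernel similarity mentioned in this abstract can be illustrated with a minimal sketch. It uses the commonly cited formulation, in which the bandwidth is normalized by the mean squared norm of the interaction profiles. The function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are assumptions for illustration and are not taken from the DMGAT code.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Return the GIP kernel similarity between rows of a binary association matrix.

    Each row is the interaction profile of one entity (for example, one ncRNA
    across all drugs). gamma_prime is a hypothetical bandwidth parameter.
    """
    n = len(profiles)
    # Normalize the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Toy association matrix: 3 entities (rows) by 3 partners (columns).
toy = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
similarity = gip_kernel(toy)
```

In this toy case, rows 0 and 1 share a partner and therefore receive a higher similarity than rows 0 and 2, which share none.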
Affiliation(s)
- Tingyu Liu
  - School of Medicine and Health, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, China
- Qiuhao Chen
  - Zhengzhou Research Institute, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, Heilongjiang, China
- Renjie Liu
  - Zhengzhou Research Institute, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, Heilongjiang, China
- Yuzhi Sun
  - School of Computer Science and Technology, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, Heilongjiang, China
- Yadong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, Heilongjiang, China
- Yan Zhu
  - College of Veterinary Medicine, Northeast Agricultural University, 150038, Xiangfang District, Changjiang Road No. 600, Harbin, China
- Tianyi Zhao
  - School of Medicine and Health, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, 150000, Nangang District, Xidazhi Street No. 90, Harbin, Heilongjiang, China
2
Wang S, Liu JX, Li F, Wang J, Gao YL. M3HOGAT: A Multi-View Multi-Modal Multi-Scale High-Order Graph Attention Network for Microbe-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:6259-6267. [PMID: 39012741 DOI: 10.1109/jbhi.2024.3429128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
Numerous scientific studies have found a link between diverse microorganisms in the human body and complex human diseases. Because traditional experimental approaches are time-consuming and expensive, using computational methods to identify microbes correlated with diseases is critical. In this paper, a new microbe-disease association prediction model, called M3HOGAT, is proposed that combines a multi-view multi-modal network and a multi-scale feature fusion mechanism. First, a microbe-disease association network and multiple similarity views are constructed based on multi-source information. Then, because neighbor information from different orders may help in learning node representations, a higher-order graph attention network (HOGAT) is devised to aggregate neighbor information from different orders and extract microbe and disease features from the different networks and views. Given that the embedding features of microbes and diseases from different views vary in importance, a multi-scale feature fusion mechanism is employed to learn their interaction information, thereby generating the final features of microbes and diseases. Finally, an inner product decoder is used to reconstruct the microbe-disease association matrix. Compared with five state-of-the-art methods on the HMDAD and Disbiome datasets, the results of 5-fold cross-validation show that M3HOGAT achieves the best performance. Furthermore, case studies on asthma and obesity confirm the effectiveness of M3HOGAT in identifying potential disease-related microbes.
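The inner product decoder named in this abstract can be sketched as follows. The abstract does not say how scores are squashed, so the logistic sigmoid used here is an assumption borrowed from common graph autoencoder practice. The function name and the toy embeddings are likewise hypothetical.

```python
import math


def inner_product_decoder(microbe_emb, disease_emb):
    """Score every microbe-disease pair as sigmoid(z_microbe . z_disease).

    The sigmoid is an assumed choice; the abstract only states that an inner
    product decoder reconstructs the association matrix.
    """
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    return [
        [sigmoid(sum(a * b for a, b in zip(m, d))) for d in disease_emb]
        for m in microbe_emb
    ]


# Toy embeddings: 2 microbes and 3 diseases in a 2-dimensional latent space.
microbe_emb = [[1.0, 0.0], [0.0, 2.0]]
disease_emb = [[1.0, 1.0], [-1.0, 0.0], [0.0, 0.0]]
scores = inner_product_decoder(microbe_emb, disease_emb)
```

Each entry of `scores` lies strictly between 0 and 1 and can be read as a predicted association strength. A zero embedding yields a score of exactly 0.5.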
3
Shi K, Huang K, Li L, Liu Q, Zhang Y, Zheng H. Predicting microbe-disease association based on graph autoencoder and inductive matrix completion with multi-similarities fusion. Front Microbiol 2024; 15:1438942. [PMID: 39355422 PMCID: PMC11443509 DOI: 10.3389/fmicb.2024.1438942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Accepted: 08/02/2024] [Indexed: 10/03/2024] Open
Abstract
Background: Clinical studies have demonstrated that microbes play a crucial role in human health and disease. The identification of microbe-disease interactions can provide insights into pathogenesis and promote the diagnosis, treatment, and prevention of disease. Although a large number of computational methods have been designed to screen novel microbe-disease associations, accurate and efficient methods are still lacking due to data inconsistency, underutilization of prior information, and limited model performance. Methods: In this study, we propose an improved deep learning-based framework, named GIMMDA, to identify latent microbe-disease associations, which is based on a graph autoencoder and inductive matrix completion. By co-training the information from the microbe and disease spaces, the new representations of microbes and diseases are used to reconstruct microbe-disease associations in an end-to-end framework. In particular, a similarity fusion strategy is applied to improve prediction performance. Results: The experimental results show that the performance of GIMMDA is competitive with that of existing state-of-the-art methods on 3 datasets (i.e., HMDAD, Disbiome, and multiMDA). In particular, it performs best, with areas under the receiver operating characteristic curve (AUC) of 0.9735, 0.9156, and 0.9396 on these 3 datasets, respectively. The results also confirm that fusing different similarities can improve prediction performance. Furthermore, case studies on two diseases, i.e., asthma and obesity, validate the effectiveness and reliability of the proposed model. Conclusion: The proposed GIMMDA model shows a strong capability in predicting microbe-disease associations. We expect that GIMMDA will help identify potential microbe-related diseases in the future.
Affiliation(s)
- Kai Shi
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
  - Guangxi Key Laboratory of Embedded Technology and Intelligent Systems, Guilin University of Technology, Guilin, China
- Kai Huang
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Lin Li
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Qiaohui Liu
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Yi Zhang
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Huilin Zheng
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
4
Zhang L, Chen M, Hu X, Deng L. Graph Convolutional Network and Contrastive Learning Small Nucleolar RNA (snoRNA) Disease Associations (GCLSDA): Predicting snoRNA-Disease Associations via Graph Convolutional Network and Contrastive Learning. Int J Mol Sci 2023; 24:14429. [PMID: 37833876 PMCID: PMC10572952 DOI: 10.3390/ijms241914429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/17/2023] [Accepted: 09/18/2023] [Indexed: 10/15/2023] Open
Abstract
Small nucleolar RNAs (snoRNAs) constitute a prevalent class of noncoding RNAs localized within the nucleoli of eukaryotic cells. Their involvement in diverse diseases underscores the significance of forecasting associations between snoRNAs and diseases. However, conventional experimental techniques for identifying such associations suffer from limited scalability, protracted timelines, and suboptimal success rates. Consequently, efficient computational methodologies are imperative for accurate prediction of snoRNA-disease associations. Herein, we introduce GCLSDA, which uses a graph convolutional network and contrastive learning to predict snoRNA-disease associations. GCLSDA is a framework that combines graph convolutional networks and self-supervised learning for snoRNA-disease association prediction. Leveraging the MNDR v4.0 and ncRPheno databases, we construct a snoRNA-disease association dataset, which serves as the foundation for creating bipartite graphs. The light graph convolutional network (LightGCN) is used to learn embedded representations of both snoRNAs and diseases. GCLSDA incorporates contrastive learning to address sparsity and over-smoothing in the correlation matrices. This combination not only ensures the precision of predictions but also improves the model's robustness. Moreover, we introduce random-noise augmentation to refine the embedded snoRNA representations, consequently enhancing the precision of predictions. Within the contrastive learning component, we unite the contrastive and recommendation tasks. This streamlines the cross-layer contrast process, simplifying information propagation and reducing computational complexity. In extensive experiments, GCLSDA consistently shows promising capacity for predicting snoRNA-disease associations. This success not only contributes valuable insights into the functional roles of snoRNAs in disease etiology, but also plays an instrumental role in identifying potential drug targets and catalyzing innovative treatment modalities.
Affiliation(s)
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; (L.Z.); (M.C.); (X.H.)
5
Peng L, Huang L, Tian G, Wu Y, Li G, Cao J, Wang P, Li Z, Duan L. Predicting potential microbe-disease associations with graph attention autoencoder, positive-unlabeled learning, and deep neural network. Front Microbiol 2023; 14:1244527. [PMID: 37789848 PMCID: PMC10543759 DOI: 10.3389/fmicb.2023.1244527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 08/16/2023] [Indexed: 10/05/2023] Open
Abstract
Background: Microbes have dense linkages with human diseases. Balanced microorganisms protect the human body against physiological disorders, while unbalanced ones may cause diseases. Thus, identification of potential associations between microbes and diseases can contribute to the diagnosis and therapy of various complex diseases. Biological experiments for microbe-disease association (MDA) identification are expensive, time-consuming, and labor-intensive. Methods: We developed a computational MDA prediction method called GPUDMDA by combining a graph attention autoencoder, positive-unlabeled learning, and a deep neural network. First, GPUDMDA computes disease similarity and microbe similarity matrices by integrating their functional similarity and Gaussian association profile kernel similarity, respectively. Next, it learns the feature representation of each microbe-disease pair using a graph attention autoencoder based on the obtained disease similarity and microbe similarity matrices. Third, it selects a few reliable negative MDAs based on positive-unlabeled learning. Finally, it takes the learned MDA features and the selected negative MDAs as inputs to a deep neural network that predicts potential MDAs. Results: GPUDMDA was compared with four state-of-the-art MDA identification models (i.e., MNNMDA, GATMDA, LRLSHMDA, and NTSHMDA) on the HMDAD and Disbiome databases under five-fold cross validations on microbes, diseases, and microbe-disease pairs. Under the three five-fold cross validations, GPUDMDA computed the best AUCs of 0.7121, 0.9454, and 0.9501 on the HMDAD database and 0.8372, 0.8908, and 0.8948 on the Disbiome database, respectively, outperforming the other four MDA prediction methods. Asthma is the most common chronic respiratory condition and affects ~339 million people worldwide. Inflammatory bowel disease is a class of chronic intestinal diseases, found worldwide, that affect the gut, the gastrointestinal tract, and extraintestinal organs of patients. In particular, inflammatory bowel disease severely affects the growth and development of children. We used the proposed GPUDMDA method and found that Enterobacter hormaechei had potential associations with both asthma and inflammatory bowel disease, which need further biological experimental validation. Conclusion: The proposed GPUDMDA demonstrated powerful MDA prediction ability. We anticipate that GPUDMDA will help screen therapeutic clues for microbe-related diseases.
Affiliation(s)
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
  - College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
- Liangliang Huang
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Geng Tian
  - Geneis (Beijing) Co. Ltd., Beijing, China
- Yan Wu
  - Geneis (Beijing) Co. Ltd., Beijing, China
- Guang Li
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
- Jianying Cao
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
- Peng Wang
  - School of Computer Science, Hunan Institute of Technology, Hengyang, China
- Zejun Li
  - School of Computer Science, Hunan Institute of Technology, Hengyang, China
- Lian Duan
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
6
Wang F, Yang H, Wu Y, Peng L, Li X. SAELGMDA: Identifying human microbe-disease associations based on sparse autoencoder and LightGBM. Front Microbiol 2023; 14:1207209. [PMID: 37415823 PMCID: PMC10320730 DOI: 10.3389/fmicb.2023.1207209] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/17/2023] [Accepted: 05/18/2023] [Indexed: 07/08/2023] Open
Abstract
Introduction: Identification of complex associations between diseases and microbes is important for understanding the pathogenesis of diseases and designing therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious. Methods: Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, each microbe-disease pair is represented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space using a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified using a Light Gradient Boosting Machine. Results: The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross validations on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA computed the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross validation on diseases, 0.9838 and 0.9293 under cross validation on microbes, and 0.9857 and 0.9358 under cross validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threaten human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate that there are potential associations between Clostridium coccoides and colorectal cancer and between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may be associated with autism. The inferred MDAs need further validation. Conclusion: We anticipate that the proposed SAELGMDA method will contribute to the identification of new MDAs.
Affiliation(s)
- Feixiang Wang
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Huandong Yang
  - Department of Gastrointestinal Surgery, Yidu Central Hospital of Weifang, Weifang, China
- Yan Wu
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Xiaoling Li
  - The Second Department of Oncology, Beidahuang Industry Group General Hospital, Harbin, China
  - The Second Department of Oncology, Heilongjiang Second Cancer Hospital, Harbin, China
7
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023] Open
Abstract
Thanks to advances in sequencing technology and microbiology, microorganisms have been linked to a variety of critical human diseases. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis of diseases and related drug discovery can predict new connections and mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review elaborates on the computational work conducted on microbe-disease and microbe-drug topics, discusses the computational modeling approaches used for predicting associations, and provides comprehensive information on the related databases. Finally, we discuss potential prospects and obstacles in this field of study and outline some recommendations for further enhancing predictive capabilities.
Affiliation(s)
- Hassan Shokri Garjan
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Yadollah Omidi
  - Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
- Reza Ferdousi
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
8
Gong H, You X, Jin M, Meng Y, Zhang H, Yang S, Xu J. Graph neural network and multi-data heterogeneous networks for microbe-disease prediction. Front Microbiol 2022; 13:1077111. [PMID: 36620040 PMCID: PMC9814480 DOI: 10.3389/fmicb.2022.1077111] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/22/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
The research on microbe association networks is greatly significant for understanding the pathogenic mechanism of microbes and promoting the application of microbes in precision medicine. In this paper, we studied the prediction of microbe-disease associations based on multi-data biological network and graph neural network algorithm. The HMDAD database provided a dataset that included 39 diseases, 292 microbes, and 450 known microbe-disease associations. We proposed a Microbe-Disease Heterogeneous Network according to the microbe similarity network, disease similarity network, and known microbe-disease associations. Furthermore, we integrated the network into the graph convolutional neural network algorithm and developed the GCNN4Micro-Dis model to predict microbe-disease associations. Finally, the performance of the GCNN4Micro-Dis model was evaluated via 5-fold cross-validation. We randomly divided all known microbe-disease association data into five groups. The results showed that the average AUC value and standard deviation were 0.8954 ± 0.0030. Our model had good predictive power and can help identify new microbe-disease associations. In addition, we compared GCNN4Micro-Dis with three advanced methods to predict microbe-disease associations, KATZHMDA, BiRWHMDA, and LRLSHMDA. The results showed that our method had better prediction performance than the other three methods. Furthermore, we selected breast cancer as a case study and found the top 12 microbes related to breast cancer from the intestinal flora of patients, which further verified the model's accuracy.
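The 5-fold cross-validation split described in this abstract, in which the known associations are randomly divided into five groups, can be sketched as below. The helper `k_fold_splits`, the seed, and the placeholder index pairs are assumptions for illustration. The pairs are not HMDAD data, and only their count (450) mirrors the abstract.

```python
import random


def k_fold_splits(pairs, k=5, seed=0):
    """Yield (train, test) lists, holding out each of k random groups in turn."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled pairs into k groups of (nearly) equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


# 450 placeholder (microbe_index, disease_index) pairs standing in for known associations.
known_pairs = [(m, d) for m in range(30) for d in range(15)]
splits = list(k_fold_splits(known_pairs))
```

Each known association appears in exactly one test group, so an average AUC over the five held-out groups uses every association once.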
Affiliation(s)
- Houwu Gong
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
  - Academy of Military Sciences, Beijing, China
- Xiong You
  - Center of Rehabilitation Diagnosis and Treatment, Hunan Provincial Rehabilitation Hospital, Changsha, China
- Min Jin
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
  - Correspondence: Min Jin
- Yajie Meng
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Hanxue Zhang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Shuaishuai Yang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Junlin Xu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
  - Correspondence: Junlin Xu
9
Liu D, Liu J, Luo Y, He Q, Deng L. MGATMDA: Predicting Microbe-Disease Associations via Multi-Component Graph Attention Network. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3578-3585. [PMID: 34587092 DOI: 10.1109/tcbb.2021.3116318] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/13/2023]
Abstract
Microbes are parasitic in various human body organs and play significant roles in a wide range of diseases. Identifying microbe-disease associations is conducive to the identification of potential drug targets. Considering the high cost and risk of biological experiments, developing computational approaches to explore the relationship between microbes and diseases is an alternative choice. However, most existing methods are based on unreliable or noisy similarity, which can affect prediction accuracy. Besides, it is still a great challenge for most previous methods to make predictions on large-scale datasets. In this work, we develop a multi-component Graph Attention Network (GAT) based framework, termed MGATMDA, for predicting microbe-disease associations. MGATMDA is built on a bipartite graph of microbes and diseases. It contains three essential parts: decomposer, combiner, and predictor. The decomposer first decomposes the edges in the bipartite graph to identify the latent components using a node-level attention mechanism. The combiner then recombines these latent components automatically to obtain a unified embedding for prediction using a component-level attention mechanism. Finally, a fully connected network is used to predict unknown microbe-disease associations. Experimental results showed that our proposed method outperformed eight state-of-the-art methods. Case studies for two common diseases further demonstrated the effectiveness of MGATMDA in predicting potential microbe-disease associations. The code is available on GitHub at https://github.com/dayunliu/MGATMDA.
10
Yu CQ, Wang XF, Li LP, You ZH, Huang WZ, Li YC, Ren ZH, Guan YJ. SGCNCMI: A New Model Combining Multi-Modal Information to Predict circRNA-Related miRNAs, Diseases and Genes. Biology 2022; 11:biology11091350. [PMID: 36138829 PMCID: PMC9495879 DOI: 10.3390/biology11091350] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/11/2022] [Revised: 08/21/2022] [Accepted: 09/08/2022] [Indexed: 11/16/2022]
Abstract
Computational prediction of miRNAs, diseases, and genes associated with circRNAs has important implications for circRNA research and provides a reference for wet-lab experiments, saving cost and time. This study presents SGCNCMI, a computational model that combines multimodal information with graph convolutional networks: it uses node similarity to form node information and then predicts associated nodes using a GCN with a distributive contribution mechanism. The model can be used not only to predict molecular-level circRNA–miRNA interactions but also to predict circRNA–cancer and circRNA–gene associations. In five-fold cross-validation, SGCNCMI achieves AUCs of 89.42%, 84.18%, and 82.44% for circRNA–miRNA, circRNA–disease, and circRNA–gene associations, respectively. SGCNCMI is one of the few models in this field and achieved the best results. In addition, in our case study, six of the top ten relationship pairs with the highest prediction scores were verified in PubMed.
Affiliation(s)
- Chang-Qing Yu
  - School of Information Engineering, Xijing University, Xi’an 710123, China
  - Correspondence:
- Xin-Fei Wang
  - School of Information Engineering, Xijing University, Xi’an 710123, China
- Li-Ping Li
  - College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi 830052, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
- Wen-Zhun Huang
  - School of Information Engineering, Xijing University, Xi’an 710123, China
- Yue-Chao Li
  - School of Information Engineering, Xijing University, Xi’an 710123, China
- Zhong-Hao Ren
  - School of Information Engineering, Xijing University, Xi’an 710123, China
- Yong-Jian Guan
  - School of Information Engineering, Xijing University, Xi’an 710123, China
11
Hua M, Yu S, Liu T, Yang X, Wang H. MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes. Interdiscip Sci 2022; 14:669-682. [PMID: 35428964 DOI: 10.1007/s12539-022-00514-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/23/2021] [Revised: 03/06/2022] [Accepted: 03/13/2022] [Indexed: 06/14/2023]
Abstract
Motivation: Exploring the interrelationships between microbes and disease can help microbiologists make decisions and plan treatments. Predicting new microbe-disease associations currently relies on biological experiments and domain knowledge, which is time-consuming and inefficient. Automated algorithms are used to uncover the intrinsic link between microbes and disease. However, due to data noise and inadequate understanding of the relevant biology, the efficient prediction of microbe-disease associations is still crucial. This study develops a multi-view graph augmentation convolutional network (MVGCNMDA) to predict potential disease-associated microbes. Methods: First, we use two data augmentation methods, edge perturbation and node dropping, to remove data noise in the preprocessing stage. Second, we calculate Gaussian interaction profile kernel similarity and cosine similarity, so that the Graph Convolutional Network (GCN) can make full use of multi-view features. Then, the multi-view features are fed into the multi-attention block to learn the weights of different features adaptively. Finally, the embedding results are obtained using a Convolutional Neural Network (CNN) combiner, and matrix completion is used to predict the relationship between potential microbes and diseases. Results: We test our model on the Human Microbe-Disease Association Database (HMDAD), Disbiome, and the Combined Dataset (Peryton and MicroPhenoDB). The area under the PR curve (AUPR), area under the ROC curve (AUC), F1 score, and recall are calculated to evaluate the performance of the developed MVGCNMDA. The AUPR is 0.9440, the AUC is 0.9428, the F1 score is 0.9383, and the recall is 0.8858. The experiments show that our model can accurately predict potential microbe-disease associations compared with state-of-the-art works under global Leave-One-Out Cross-Validation (LOOCV) and fivefold Cross-Validation (fivefold CV). To further verify the effectiveness of the proposed graph data augmentation, we designed five different settings in the ablation study. Furthermore, we present two case studies that validate the prediction of potential associations between microbes and diseases by MVGCNMDA.
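The cosine similarity view mentioned in this abstract can be sketched as follows. The abstract does not state which vectors the similarity is computed on, so using the rows of the binary association matrix is an assumption here. The function name and the toy profiles are hypothetical.

```python
import math


def cosine_similarity_matrix(profiles):
    """Return pairwise cosine similarity between rows of an association matrix.

    Rows with no known associations have zero norm; their similarities are
    left at 0.0 to avoid dividing by zero (an illustrative convention).
    """
    n = len(profiles)
    norms = [math.sqrt(sum(v * v for v in row)) for row in profiles]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] == 0 or norms[j] == 0:
                continue
            dot = sum(a * b for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = dot / (norms[i] * norms[j])
    return sim


# Toy association matrix: 4 microbes (rows) by 3 diseases (columns).
toy_profiles = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
cos_sim = cosine_similarity_matrix(toy_profiles)
```

Microbes with overlapping disease profiles score above zero, while microbes with disjoint profiles score zero. This makes the cosine view sparser than a Gaussian kernel view of the same matrix.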
Affiliation(s)
- Meifang Hua
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Shengpeng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Tianyu Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Xue Yang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Hong Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
|
12
|
Wang XF, Yu CQ, Li LP, You ZH, Huang WZ, Li YC, Ren ZH, Guan YJ. KGDCMI: A New Approach for Predicting circRNA–miRNA Interactions From Multi-Source Information Extraction and Deep Learning. Front Genet 2022; 13:958096. [PMID: 36051691 PMCID: PMC9426772 DOI: 10.3389/fgene.2022.958096] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/31/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022]
Abstract
Emerging evidence has revealed that circular RNA (circRNA) is widely distributed in mammalian cells and functions as a microRNA (miRNA) sponge involved in transcriptional and posttranscriptional regulation of gene expression. Recognizing circRNA–miRNA interactions provides a new perspective for the detection and treatment of complex human diseases. Compared with traditional biological experiments, which are small in scale, time-consuming, and laborious, computational models can provide a basis for biological experiments at low cost. Because the computational models proposed so far are limited, it is necessary to develop an effective computational method to predict circRNA–miRNA interactions. This study thus proposes a novel computational method, named KGDCMI, to predict the interactions between circRNA and miRNA based on multi-source information extraction and fusion. KGDCMI obtains RNA attribute information from sequence and similarity, and captures the behavior information in RNA associations through a graph-embedding algorithm. The obtained feature vectors are then reduced further by principal component analysis and sent to a deep neural network for information fusion and prediction. KGDCMI achieves an area under the curve (AUC) of 89.30% and an area under the precision–recall curve (AUPR) of 87.67%. On the same dataset, these values are 2.37% and 3.08% higher, respectively, than those of the only existing model. We also conducted three groups of comparative experiments to determine the best classification strategy, feature extraction parameters, and dimensions. In addition, in the case study, 7 of the top 10 interaction pairs were confirmed in PubMed. These results suggest that KGDCMI is a feasible and useful method to predict circRNA–miRNA interactions and can act as a reliable candidate for related RNA biological experiments.
Affiliation(s)
- Xin-Fei Wang
- School of Information Engineering, Xijing University, Xi’an, China
- Chang-Qing Yu
- School of Information Engineering, Xijing University, Xi’an, China
- *Correspondence: Chang-Qing Yu; Li-Ping Li
- Li-Ping Li
- School of Information Engineering, Xijing University, Xi’an, China
- College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi, China
- *Correspondence: Chang-Qing Yu; Li-Ping Li
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- Wen-Zhun Huang
- School of Information Engineering, Xijing University, Xi’an, China
- Yue-Chao Li
- School of Information Engineering, Xijing University, Xi’an, China
- Zhong-Hao Ren
- School of Information Engineering, Xijing University, Xi’an, China
- Yong-Jian Guan
- School of Information Engineering, Xijing University, Xi’an, China
|
13
|
He J, Xiao P, Chen C, Zhu Z, Zhang J, Deng L. GCNCMI: A Graph Convolutional Neural Network Approach for Predicting circRNA-miRNA Interactions. Front Genet 2022; 13:959701. [PMID: 35991563 PMCID: PMC9389118 DOI: 10.3389/fgene.2022.959701] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/02/2022] [Accepted: 06/23/2022] [Indexed: 11/18/2022]
Abstract
The interactions between circular RNAs (circRNAs) and microRNAs (miRNAs) have been shown to alter gene expression and regulate disease-related genes. Since traditional experimental methods are time-consuming and labor-intensive, most circRNA-miRNA interactions remain unknown. Developing computational approaches to explore the interactions between circRNAs and miRNAs on a large scale can help bridge this gap. In this paper, we propose a graph convolutional neural network-based approach named GCNCMI to predict the potential interactions between circRNAs and miRNAs. GCNCMI first mines the potential interactions of adjacent nodes in the graph convolutional neural network and then recursively propagates interaction information through the graph convolutional layers. Finally, it combines the embedded representations generated by each layer to make the final prediction. In five-fold cross-validation, GCNCMI achieved the highest AUC of 0.9312 and the highest AUPR of 0.9412. In addition, case studies of two miRNAs, hsa-miR-622 and hsa-miR-149-5p, showed that our model performs well in predicting circRNA-miRNA interactions. The code and data are available at https://github.com/csuhjhjhj/GCNCMI.
Affiliation(s)
- Jie He
- School of Computer Science and Engineering, Central South University, Changsha, China
- Pei Xiao
- School of Computer Science and Engineering, Central South University, Changsha, China
- Chunyu Chen
- School of Computer Science and Engineering, Central South University, Changsha, China
- Zeqin Zhu
- School of Computer Science and Engineering, Central South University, Changsha, China
- Jiaxuan Zhang
- Department of Electrical Engineering, University of California, San Diego, San Diego, CA, United States
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China
- *Correspondence: Lei Deng
|
14
|
Zheng J, Qian Y, He J, Kang Z, Deng L. Graph Neural Network with Self-Supervised Learning for Noncoding RNA-Drug Resistance Association Prediction. J Chem Inf Model 2022; 62:3676-3684. [PMID: 35838124 DOI: 10.1021/acs.jcim.2c00367] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Abstract
Noncoding RNA (ncRNA) is closely related to drug resistance. Identifying the association between ncRNA and drug resistance is of great significance for drug development. Methods based on biological experiments are often time-consuming and small-scale. Therefore, developing computational methods to identify the association between ncRNA and drug resistance is urgent. In this work, we develop a computational framework called GSLRDA to predict the association between ncRNA and drug resistance. First, the known ncRNA-drug resistance associations are modeled as a bipartite graph of ncRNAs and drugs. Then, GSLRDA uses the light graph convolutional network (lightGCN) to learn vector representations of ncRNAs and drugs from the ncRNA-drug bipartite graph. In addition, GSLRDA uses different data augmentation methods to generate different views of ncRNA and drug nodes and performs self-supervised learning, further improving the quality of the learned ncRNA and drug representations through contrastive learning between nodes. Finally, GSLRDA uses the inner product to predict the association between ncRNA and drug resistance. To the best of our knowledge, GSLRDA is the first to apply self-supervised learning to association prediction tasks in bioinformatics. The experimental results show that GSLRDA achieves an AUC value of 0.9101, higher than the other eight state-of-the-art models. In addition, case studies of two drugs further illustrate the effectiveness of GSLRDA in predicting the association between ncRNA and drug resistance. The code and data sets of GSLRDA are available at https://github.com/JJZ-code/GSLRDA.
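The propagation and scoring steps described in this abstract can be sketched as follows. This is a minimal illustration of LightGCN-style propagation on a bipartite graph with inner-product scoring, not the GSLRDA implementation: the function name, the toy association matrix, the random (untrained) embeddings, and the layer count are assumptions, and the self-supervised contrastive objective and training loop are omitted.

```python
import numpy as np

def lightgcn_propagate(R, E_rna, E_drug, num_layers=2):
    """LightGCN-style propagation on a bipartite ncRNA-drug graph.

    R       : (n_rna, n_drug) binary association matrix
    E_rna   : (n_rna, d) initial ncRNA embeddings
    E_drug  : (n_drug, d) initial drug embeddings
    Returns the layer-averaged embeddings of both node types.
    """
    R = np.asarray(R, dtype=float)
    deg_r = R.sum(axis=1)
    deg_d = R.sum(axis=0)
    # Symmetric degree normalisation; isolated nodes get weight 0.
    inv_sqrt_r = np.where(deg_r > 0, 1.0 / np.sqrt(np.maximum(deg_r, 1.0)), 0.0)
    inv_sqrt_d = np.where(deg_d > 0, 1.0 / np.sqrt(np.maximum(deg_d, 1.0)), 0.0)
    R_norm = inv_sqrt_r[:, None] * R * inv_sqrt_d[None, :]

    layers_r, layers_d = [E_rna], [E_drug]
    for _ in range(num_layers):
        # No feature transform and no nonlinearity: neighbours are only averaged.
        new_r = R_norm @ layers_d[-1]
        new_d = R_norm.T @ layers_r[-1]
        layers_r.append(new_r)
        layers_d.append(new_d)
    return np.mean(layers_r, axis=0), np.mean(layers_d, axis=0)

# Hypothetical toy data: 3 ncRNAs, 2 drugs, embedding size 4.
rng = np.random.default_rng(0)
R = np.array([[1, 0],
              [1, 1],
              [0, 1]])
E_rna = rng.normal(size=(3, 4))
E_drug = rng.normal(size=(2, 4))

final_rna, final_drug = lightgcn_propagate(R, E_rna, E_drug, num_layers=2)
scores = final_rna @ final_drug.T  # inner-product association scores
```

In a trained model the initial embeddings would be learned parameters; here they are random, so the scores only demonstrate the data flow.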
Affiliation(s)
- Jingjing Zheng
- School of Software, Xinjiang University, Urumqi 830091, China
- Yurong Qian
- School of Software, Xinjiang University, Urumqi 830091, China
- Jie He
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zerui Kang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Lei Deng
- School of Software, Xinjiang University, Urumqi 830091, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
15
|
PDSM-LGCN: Prediction of drug sensitivity associated microRNAs via Light Graph Convolution Neural Network. Methods 2022; 205:106-113. [PMID: 35753591 DOI: 10.1016/j.ymeth.2022.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2022] [Revised: 06/12/2022] [Accepted: 06/14/2022] [Indexed: 11/24/2022]
Abstract
Cancer has become one of the critical diseases threatening human life and health. Differences in sensitivity to cancer drugs have always been a critical cause of treatment failure. Once drug resistance occurs, it can render the anticancer treatment, or even several drugs, ineffective. As cancer research deepens, a growing body of evidence shows that microRNA has a regulatory effect on the sensitivity of cancer drugs, which provides new research ideas. However, using traditional biological experiments to verify and discover microRNA-drug sensitivity relations is cumbersome and time-consuming, significantly slowing research progress on cancer drug sensitivity. Therefore, this paper proposes a computational method (PDSM-LGCN) that spreads information through the high-order connections between cancer drug sensitivity and microRNA. The model constructs an optimized GCN as an embedding propagation layer to obtain the embeddings of microRNAs and drugs. Finally, based on a collaborative filtering algorithm, the model produces the prediction score between microRNA and drug sensitivity. The results of five-fold cross-validation show that the AUC of PDSM-LGCN is 0.8872 and the AUPR is 0.9026. We also reproduced the five latest models for similar problems and compared the results; our model has the best overall performance among them. In addition, the reliability of PDSM-LGCN was further confirmed through case studies of Cisplatin and Doxorubicin, suggesting that it can be used as a powerful tool for clinical and biological research. The source code and datasets can be obtained from https://github.com/19990915fzy/PDSM-LGCN/.
|
16
|
Tan H, Qiu S, Wang J, Yu G, Guo W, Guo M. Weighted deep factorizing heterogeneous molecular network for genome-phenome association prediction. Methods 2022; 205:18-28. [PMID: 35690250 DOI: 10.1016/j.ymeth.2022.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/20/2022] [Revised: 05/14/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022]
Abstract
Genome-phenome association (GPA) prediction can promote the understanding of the biological mechanisms behind the complex pathology of phenotypes (i.e., traits and diseases). Traditional heterogeneous network-based GPA approaches overwhelmingly need to project heterogeneous data onto a homogeneous network for data fusion and prediction, and such projections result in the loss of heterogeneous network structure information. Matrix factorization-based data fusion can avoid such projection by integrating multi-type data in a coherent way, but these methods typically perform linear factorization and cannot mine the nonlinear relationships between molecules, which compromises the accuracy of GPA analysis. Furthermore, most of them cannot selectively combine network topology and node attribute information in a principled way. In this paper, we propose a weighted deep matrix factorization-based solution (WDGPA) to predict GPAs by selectively and differentially fusing a heterogeneous molecular network and diverse node attributes. WDGPA first assigns weights to inter/intra-relational data matrices and attribute data matrices, and performs deep matrix factorization on these matrices of the heterogeneous network in a cooperative manner to obtain the nonlinear representations of different nodes. In addition, it performs low-rank representation learning on the attribute data with the shared nonlinear representations. In this way, both the network topology and node attributes are jointly mined to explore the representations of molecules and the complex interplay between molecules and phenotypes. WDGPA then uses the representation vectors of gene and phenotype nodes to predict GPAs. Experimental results on maize and human datasets confirm that WDGPA outperforms competitive methods by a large margin under different evaluation protocols.
Affiliation(s)
- Haojiang Tan
- School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
- Sichao Qiu
- School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
- Jun Wang
- Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
- Guoxian Yu
- Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
- Wei Guo
- Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
- Maozu Guo
- College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China.
|
17
|
Xu H, Hu X, Yan X, Zhong W, Yin D, Gai Y. Exploring noncoding RNAs in thyroid cancer using a graph convolutional network approach. Comput Biol Med 2022; 145:105447. [DOI: 10.1016/j.compbiomed.2022.105447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 03/20/2022] [Accepted: 03/21/2022] [Indexed: 12/01/2022]
|
18
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022]
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in predicting potential associations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to identifying potential relationships between microbes, drugs and diseases. Then, we divide a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
|
19
|
Li Y, Wang R, Zhang S, Xu H, Deng L. LRGCPND: Predicting Associations between ncRNA and Drug Resistance via Linear Residual Graph Convolution. Int J Mol Sci 2021; 22:10508. [PMID: 34638849 PMCID: PMC8508984 DOI: 10.3390/ijms221910508] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/24/2021] [Revised: 09/25/2021] [Accepted: 09/27/2021] [Indexed: 01/08/2023]
Abstract
Accurate inference of the relationship between non-coding RNAs (ncRNAs) and drug resistance is essential for understanding the complicated mechanisms of drug action and clinical treatment. Traditional biological experiments are time-consuming, laborious, and small in scale. Although several databases provide relevant resources, no computational method for predicting this type of association has yet been developed. In this paper, we leverage verified association data on ncRNA and drug resistance to construct a bipartite graph and then develop a linear residual graph convolution approach for predicting associations between non-coding RNA and drug resistance (LRGCPND) without introducing or defining additional data. LRGCPND first aggregates the potential features of neighboring nodes in each graph convolutional layer. Next, it transforms the information between layers through a linear function. Finally, LRGCPND combines the embedding representations of each layer to complete the prediction. Comparison experiments demonstrate that LRGCPND performs more reliably than seven other state-of-the-art approaches, with an average AUC value of 0.8987. Case studies illustrate that LRGCPND is an effective tool for inferring the associations between ncRNA and drug resistance.
Affiliation(s)
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China; (Y.L.); (R.W.); (S.Z.); (H.X.)
|
20
|
Yang H, Tong F, Qi C, Wang P, Li J, Cheng L. Prioritizing Disease-Related Microbes Based on the Topological Properties of a Comprehensive Network. Front Microbiol 2021; 12:685549. [PMID: 34326821 PMCID: PMC8315281 DOI: 10.3389/fmicb.2021.685549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2021] [Accepted: 05/10/2021] [Indexed: 01/09/2023]
Abstract
Many microbes are parasitic within the human body, engaging in various physiological processes and playing an important role in human diseases. The discovery of new microbe-disease associations aids our understanding of disease pathogenesis. Computational methods can be applied in such investigations, thereby avoiding the time-consuming and laborious nature of experimental methods. In this study, we constructed a comprehensive microbe-disease network by integrating known microbe-disease associations from three large-scale databases (Peryton, Disbiome, and gutMDisorder), and extended the random walk with restart to the network for prioritizing unknown microbe-disease associations. The area under the curve values of the leave-one-out cross-validation and the fivefold cross-validation exceeded 0.9370 and 0.9366, respectively, indicating the high performance of this method. In case studies of inflammatory bowel disease, asthma, and obesity, which are all widely studied diseases, some prioritized disease-related microbes were validated by recent literature. This suggests that our method is effective at prioritizing novel disease-related microbes and may offer further insight into disease pathogenesis.
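The random walk with restart named in this abstract can be sketched in its basic single-network form. The sketch below is not the authors' extended version for the combined microbe-disease network: the function name, the restart probability, the column normalisation, and the toy path graph are assumptions chosen for illustration.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Basic random walk with restart on an undirected graph.

    adj     : (n, n) symmetric adjacency matrix
    seeds   : indices of the seed nodes (for example, known associated nodes)
    restart : probability of jumping back to the seed set at each step
    Returns the steady-state visiting probability of every node.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0   # avoid division by zero for isolated nodes
    W = adj / col_sums              # column-normalised transition matrix

    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        converged = np.abs(p_next - p).sum() < tol
        p = p_next
        if converged:
            break
    return p

# Hypothetical toy network: a path 0 - 1 - 2 - 3, walking from seed node 0.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
p = random_walk_with_restart(adj, seeds=[0])
ranking = np.argsort(-p)  # candidate nodes ordered by proximity to the seed
```

Nodes closer to the seed receive higher steady-state probability, which is what allows the scores to be used for prioritizing unknown associations.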
Affiliation(s)
- Haixiu Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Fan Tong
- Academy of Military Medical Science, Beijing, China
- Changlu Qi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ping Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Jiangyu Li
- Academy of Military Medical Science, Beijing, China
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, China
|
21
|
Tang X, Cai L, Meng Y, Xu J, Lu C, Yang J. Indicator Regularized Non-Negative Matrix Factorization Method-Based Drug Repurposing for COVID-19. Front Immunol 2021; 11:603615. [PMID: 33584672 PMCID: PMC7878370 DOI: 10.3389/fimmu.2020.603615] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 10/02/2020] [Accepted: 12/22/2020] [Indexed: 12/24/2022]
Abstract
A novel coronavirus, named COVID-19, has become one of the most prevalent and severe infectious diseases in human history. Currently, there are only very few vaccines and therapeutic drugs against COVID-19, and their efficacies are yet to be tested. Drug repurposing aims to explore new applications of approved drugs, which can significantly reduce time and cost compared with de novo drug discovery. In this study, we built a virus-drug dataset, which included 34 viruses, 210 drugs, and 437 confirmed related virus-drug pairs from existing literature. Besides, we developed an Indicator Regularized non-negative Matrix Factorization (IRNMF) method, which introduced the indicator matrix and Karush-Kuhn-Tucker condition into the non-negative matrix factorization algorithm. According to the 5-fold cross-validation on the virus-drug dataset, the performance of IRNMF was better than other methods, and its Area Under receiver operating characteristic Curve (AUC) value was 0.8127. Additionally, we analyzed the case on COVID-19 infection, and our results suggested that the IRNMF algorithm could prioritize unknown virus-drug associations.
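To make the idea of combining an indicator matrix with non-negative matrix factorization concrete, here is a generic masked NMF sketch using standard multiplicative updates. It is not the IRNMF algorithm from the paper: the exact objective, regularization terms, and role of the indicator matrix in IRNMF are not given in the abstract, so the assumption here is simply that the indicator marks which entries count toward the reconstruction error. All names and the toy data are hypothetical.

```python
import numpy as np

def masked_nmf(X, mask, rank=2, n_iter=200, eps=1e-9, seed=0):
    """Masked NMF: minimise || mask * (X - W @ H) ||_F^2 with W, H >= 0.

    X    : (n, m) non-negative matrix (for example, virus-drug associations)
    mask : (n, m) indicator matrix, 1 where an entry is used for fitting
    Returns W, H and the masked loss recorded after each iteration.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    mask = np.asarray(mask, dtype=float)
    n, m = X.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    MX = mask * X
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        W *= (MX @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ MX) / (W.T @ (mask * (W @ H)) + eps)
        history.append(float(((mask * (X - W @ H)) ** 2).sum()))
    return W, H, history

# Hypothetical toy data: 4 viruses x 5 drugs, with one entry hidden from fitting.
X = np.array([[1, 1, 0, 0, 1],
              [1, 1, 0, 0, 1],
              [0, 0, 1, 1, 0],
              [0, 0, 1, 1, 0]], dtype=float)
mask = np.ones_like(X)
mask[1, 4] = 0.0                 # treat this pair as unknown
W, H, history = masked_nmf(X, mask, rank=2)
predicted = W @ H                # reconstructed scores, including the hidden pair
```

Entries excluded by the mask do not influence the fit, so their reconstructed values in `predicted` act as association scores for unknown pairs.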
Affiliation(s)
- Xianfang Tang
- College of Information Science and Engineering, Hunan University, Changsha, China
- Lijun Cai
- College of Information Science and Engineering, Hunan University, Changsha, China
- Yajie Meng
- College of Information Science and Engineering, Hunan University, Changsha, China
- JunLin Xu
- College of Information Science and Engineering, Hunan University, Changsha, China
- Changcheng Lu
- College of Information Science and Engineering, Hunan University, Changsha, China
- Jialiang Yang
- Department of Science, Geneis Beijing Co., Ltd., Beijing, China
- Academician Workstation, Changsha Medical University, Changsha, China
|