1
Li X, Liang Z. Causal effect of gut microbiota on pancreatic cancer: A Mendelian randomization and colocalization study. J Cell Mol Med 2024; 28:e18255. [PMID: 38526030 PMCID: PMC10962122 DOI: 10.1111/jcmm.18255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 03/04/2024] [Accepted: 03/07/2024] [Indexed: 03/26/2024] Open
Abstract
The causal relationship between gut microbiota (GM) and pancreatic cancer (PC) remains unclear. This study aimed to investigate the potential genes underlying this mechanism. GM Genome-wide association study (GWAS) summary data were from the MiBioGen consortium. PC GWAS data were from the National Human Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI) GWAS Catalogue. To detect the causal relationship between GM and PC, we implemented three complementary Mendelian randomization (MR) methods: Inverse Variance Weighting (IVW), MR-Egger and Weighted Median, followed by sensitivity analyses. Furthermore, we integrated GM GWAS data with blood cis-expression quantitative trait loci (eQTLs) and blood cis-DNA methylation QTL (mQTLs) using Summary data-based Mendelian Randomization (SMR) methods. This integration aimed to prioritize potential GM-affecting genes through SMR analysis of two molecular traits. PC cis-eQTLs and cis-mQTLs were summarized from The Cancer Genome Atlas (TCGA) data. Through colocalization analysis of GM cis-QTLs and PC cis-QTLs data, we identified common genes that influence both GM and PC. Our study found a causal association between GM and PC, including four protective and five risk-associated GM [Inverse Variance Weighted (IVW), p < 0.05]. No significant heterogeneity of instrumental variables (IVs) or horizontal pleiotropy was found. The gene SVBP was identified as a GM-affecting gene using SMR analysis of two molecular traits (FDR<0.05, P_HEIDI>0.05). Additionally, two genes, MCM6 and RPS26, were implicated in the interaction between GM and PC based on colocalization analysis (PPH4>0.5). In summary, this study provides evidence for future research aimed at developing suitable therapeutic interventions and disease prevention.
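The inverse variance weighted (IVW) estimate named in the abstract above can be illustrated with a minimal sketch. It assumes the common fixed-effect form for independent instruments, computed from per-SNP summary statistics; the function name `ivw_estimate` and its arguments are hypothetical and are not taken from the study, which does not state its implementation.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-SNP summary statistics.

    beta_exposure: SNP effects on the exposure (e.g. a gut microbiota trait)
    beta_outcome:  SNP effects on the outcome (e.g. pancreatic cancer)
    se_outcome:    standard errors of the outcome effects
    Assumes independent instruments (an assumption of this sketch).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    # Weighted regression of outcome effects on exposure effects through the origin.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    if denominator == 0:
        raise ValueError("exposure effects are all zero")
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Toy numbers, invented for illustration only.
estimate, standard_error = ivw_estimate(
    [0.1, 0.2, 0.4], [0.05, 0.10, 0.20], [0.01, 0.02, 0.02]
)
```

Each SNP contributes a ratio estimate (outcome effect divided by exposure effect), and the IVW estimate is their precision-weighted average, so instruments with tighter outcome standard errors count for more.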
Affiliation(s)
- Xin Li
- Department of Gastroenterology, The First Affiliated Hospital, Guangxi Medical University, Nanning, China
- Zhihai Liang
- Department of Gastroenterology, The First Affiliated Hospital, Guangxi Medical University, Nanning, China
2
Chen R, Xie G, Lin Z, Gu G, Yu Y, Yu J, Liu Z. Predicting Microbe-Disease Associations Based on a Linear Neighborhood Label Propagation Method with Multi-order Similarity Fusion Learning. Interdiscip Sci 2024:10.1007/s12539-024-00607-0. [PMID: 38436840 DOI: 10.1007/s12539-024-00607-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/04/2024] [Accepted: 01/05/2024] [Indexed: 03/05/2024]
Abstract
Computational approaches employed for predicting potential microbe-disease associations often rely on similarity information between microbes and diseases. Therefore, it is important to obtain reliable similarity information by integrating multiple types of similarity information. However, existing similarity fusion methods do not consider multi-order fusion of similarity networks. To address this problem, a novel method of linear neighborhood label propagation with multi-order similarity fusion learning (MOSFL-LNP) is proposed to predict potential microbe-disease associations. Multi-order fusion learning comprises two parts: low-order global learning and high-order feature learning. Low-order global learning is used to obtain common latent features from multiple similarity sources. High-order feature learning relies on the interactions between neighboring nodes to identify high-order similarities and learn deeper interactive network structures. Coefficients are assigned to different high-order feature learning modules to balance the similarities learned from different orders and enhance the robustness of the fusion network. Overall, by combining low-order global learning with high-order feature learning, multi-order fusion learning can capture both the shared and unique features of different similarity networks, leading to more accurate predictions of microbe-disease associations. In comparison to six other advanced methods, MOSFL-LNP exhibits superior prediction performance in the leave-one-out cross-validation and 5-fold validation frameworks. In the case study, the predicted 10 microbes associated with asthma and type 1 diabetes have an accuracy rate of up to 90% and 100%, respectively.
Affiliation(s)
- Ruibin Chen
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Guobo Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhiyi Lin
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Guosheng Gu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Yi Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Junrui Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhenguo Liu
- Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510080, China
3
Chen Z, Zhang L, Li J, Fu M. MLFLHMDA: predicting human microbe-disease association based on multi-view latent feature learning. Front Microbiol 2024; 15:1353278. [PMID: 38371933 PMCID: PMC10869561 DOI: 10.3389/fmicb.2024.1353278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024] Open
Abstract
Introduction A growing body of research indicates that microorganisms play a crucial role in human health. Imbalances in microbial communities are closely linked to human diseases, and identifying potential relationships between microbes and diseases can help elucidate the pathogenesis of diseases. However, traditional methods based on biological or clinical experiments are costly, so the use of computational models to predict potential microbe-disease associations is of great importance. Methods In this paper, we present a novel computational model called MLFLHMDA, which is based on a Multi-View Latent Feature Learning approach to predict Human potential Microbe-Disease Associations. Specifically, we compute Gaussian interaction profile kernel similarity between diseases and microbes based on the known microbe-disease associations from the Human Microbe-Disease Association Database and perform a preprocessing step on the resulting microbe-disease association matrix, namely, weighting K nearest known neighbors (WKNKN) to reduce the sparsity of the microbe-disease association matrix. To obtain unobserved associations in the microbe and disease views, we extract different latent features based on the geometrical structure of microbes and diseases, and project multi-modal latent features into a common subspace. Next, we introduce graph regularization to preserve the local manifold structure of Gaussian interaction profile kernel similarity and add $L_{p,q}$-norms to the projection matrix to ensure the interpretability and sparsity of the model. Results The AUC values for global leave-one-out cross-validation and 5-fold cross-validation implemented by MLFLHMDA are 0.9165 and 0.8942 ± 0.0041, respectively, outperforming other existing methods. In addition, case studies of different diseases have demonstrated the superiority of the predictive power of MLFLHMDA.
The source code of our model and the data are available on https://github.com/LiangzheZhang/MLFLHMDA_master.
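The Gaussian interaction profile (GIP) kernel similarity mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the commonly used definition, in which the bandwidth is normalized by the mean squared norm of the interaction profiles; it is not the code from the linked repository, and the function name `gip_kernel_similarity` and the `bandwidth` parameter are hypothetical.

```python
import math


def gip_kernel_similarity(assoc, bandwidth=1.0):
    """GIP kernel similarity between the rows of a binary association matrix.

    assoc: list of rows; row i is the interaction profile of entity i
           (a microbe's associations across diseases, or the reverse).
    bandwidth: unnormalized kernel bandwidth (1.0 is a common default,
               assumed here rather than taken from the paper).
    """
    n = len(assoc)
    mean_sq_norm = sum(sum(v * v for v in row) for row in assoc) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = bandwidth / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(assoc[i], assoc[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Toy matrix: 3 microbes (rows) by 3 diseases (columns), invented for illustration.
microbe_similarity = gip_kernel_similarity([[1, 0, 1], [1, 0, 1], [0, 1, 0]])
```

Passing the transposed matrix gives the corresponding similarity between diseases.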
4
Lu S, Liang Y, Li L, Miao R, Liao S, Zou Y, Yang C, Ouyang D. Predicting potential microbe-disease associations based on auto-encoder and graph convolution network. BMC Bioinformatics 2023; 24:476. [PMID: 38097930 PMCID: PMC10722760 DOI: 10.1186/s12859-023-05611-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/11/2023] [Indexed: 12/17/2023] Open
Abstract
A growing body of research has consistently demonstrated the intricate correlation between the human microbiome and human well-being. Microbes can impact the efficacy and toxicity of drugs through various pathways, as well as influence the occurrence and metastasis of tumors. In clinical practice, it is crucial to elucidate the association between microbes and diseases. Although traditional biological experiments accurately identify this association, they are time-consuming, expensive, and susceptible to experimental conditions. Consequently, conducting extensive biological experiments to screen potential microbe-disease associations becomes challenging. Computational methods can address these problems, but previous methods still make poor use of node features, and their prediction accuracy needs to be improved. To address this issue, we propose the DAEGCNDF model to predict potential associations between microbes and diseases. Our model calculates four similarity features for each microbe and disease. These features are fused to obtain a comprehensive feature matrix representing microbes and diseases. Our model first uses the graph convolutional network module to extract low-rank features with graph information of microbes and diseases, and then uses a deep sparse Auto-Encoder to extract high-rank features of microbe-disease pairs, after which the low-rank and high-rank features are spliced to improve the utilization of node features. Finally, Deep Forest is used to predict potential microbe-disease relationships. The experimental results show that combining low-rank and high-rank features helps to improve the model performance and that Deep Forest has better classification performance than the baseline model.
Affiliation(s)
- Shanghui Lu
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- School of Mathematics and Physics, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
- Yong Liang
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China
- Le Li
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- Rui Miao
- Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhuhai, 519041, Guangdong, China
- Shuilin Liao
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- Yongfu Zou
- School of Mathematics and Physics, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
- Chengjun Yang
- School of Artificial Intelligence and Manufacturing, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
- Dong Ouyang
- School of Biomedical Engineering, Guangdong Medical University, No. 1, Xincheng, Zhanjiang, 523808, Guangdong, China
5
Xiang H, Guo R, Liu L, Guo T, Huang Q. MSIF-LNP: microbial and human health association prediction based on matrix factorization noise reduction for similarity fusion and bidirectional linear neighborhood label propagation. Front Microbiol 2023; 14:1216811. [PMID: 37389340 PMCID: PMC10303805 DOI: 10.3389/fmicb.2023.1216811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 05/25/2023] [Indexed: 07/01/2023] Open
Abstract
Studies have shown that microbes are closely related to human health. Clarifying the relationship between microbes and diseases that cause health problems can provide new solutions for the treatment, diagnosis, and prevention of diseases, and provide strong protection for human health. Currently, more and more similarity fusion methods are available to predict potential microbe-disease associations. However, existing methods have noise problems in the process of similarity fusion. To address this issue, we propose a method called MSIF-LNP that can efficiently and accurately identify potential connections between microbes and diseases, and thus clarify the relationship between microbes and human health. This method is based on matrix factorization denoising similarity fusion (MSIF) and bidirectional linear neighborhood propagation (LNP) techniques. First, we use non-linear iterative fusion to obtain a similarity network for microbes and diseases by fusing the initial microbe and disease similarities, and then reduce noise by using matrix factorization. Next, we use the initial microbe-disease association pairs as label information to perform linear neighborhood label propagation on the denoised similarity network of microbes and diseases. This enables us to obtain a score matrix for predicting microbe-disease relationships. We evaluate the predictive performance of MSIF-LNP and seven other advanced methods through 10-fold cross-validation, and the experimental results show that MSIF-LNP outperformed the other seven methods in terms of AUC. In addition, the analysis of Cystic fibrosis and Obesity cases further demonstrate the predictive ability of this method in practical applications.
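The label propagation step described in the abstract above can be illustrated with a simplified sketch. It is not the MSIF-LNP implementation: the paper derives its propagation weights from linear neighborhood reconstruction on a denoised fused similarity network, whereas this sketch assumes a plain row-normalized similarity matrix. The function names, the `alpha` retention parameter, and the averaging of the two directions are all assumptions made for illustration.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    normalized = []
    for row in sim:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def propagate(sim, labels, alpha=0.5, iterations=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y, with W the row-normalized similarity."""
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    scores = [[float(v) for v in row] for row in labels]
    for _ in range(iterations):
        scores = [
            [alpha * sum(w[i][k] * scores[k][j] for k in range(n))
             + (1 - alpha) * labels[i][j]
             for j in range(m)]
            for i in range(n)
        ]
    return scores


def bidirectional_scores(microbe_sim, disease_sim, assoc, alpha=0.5):
    """Average the scores propagated over the microbe side and the disease side.

    assoc has one row per microbe and one column per disease.
    """
    from_microbes = propagate(microbe_sim, assoc, alpha)
    transposed = [list(col) for col in zip(*assoc)]
    from_diseases = propagate(disease_sim, transposed, alpha)
    return [
        [(from_microbes[i][j] + from_diseases[j][i]) / 2 for j in range(len(assoc[0]))]
        for i in range(len(assoc))
    ]
```

In this toy form, a microbe with no known associations inherits a nonzero score for a disease when a similar microbe is already linked to it, which is the intuition behind using known pairs as label information.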
Affiliation(s)
- Hui Xiang
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
- Rong Guo
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
- Li Liu
- College of Physical Education, Suzhou University, Suzhou, Anhui, China
- Tengjie Guo
- College of Physical Education, Yunnan Normal University, Kunming, Yunnan, China
- Quan Huang
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
6
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023] Open
Abstract
Microorganisms have been linked to a variety of critical human diseases, thanks to advances in sequencing technology and microbiology. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis in terms of diseases and related drug discovery can predict new connections/mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review aims to elaborate on the computational works conducted on the microbe-disease and microbe-drug topics, discuss the computational model approaches used for predicting associations and provide comprehensive information on the related databases. Finally, we discuss potential prospects and obstacles in this field of study, while also outlining some recommendations for further enhancing predictive capabilities.
Affiliation(s)
- Hassan Shokri Garjan
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Yadollah Omidi
- Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
- Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
7
Liu JX, Yin MM, Gao YL, Shang J, Zheng CH. MSF-LRR: Multi-Similarity Information Fusion Through Low-Rank Representation to Predict Disease-Associated Microbes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:534-543. [PMID: 35085090 DOI: 10.1109/tcbb.2022.3146176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
An increase in microbial activity has been shown to be intimately connected with the pathogenesis of diseases. Considering the expense of traditional verification methods, researchers are working to develop high-efficiency methods for detecting potential disease-related microbes. In this article, a new prediction method, MSF-LRR, is established, which uses Low-Rank Representation (LRR) to perform multi-similarity information fusion to predict disease-related microbes. Considering that most existing methods only use one class of similarity, three classes of microbe and disease similarity are added. Then, LRR is used to obtain low-rank structural similarity information. Additionally, the method adaptively extracts the local low-rank structure of the data from a global perspective, to make the information used for the prediction more effective. Finally, a neighbor-based prediction method that utilizes the concept of collaborative filtering is applied to predict unknown microbe-disease pairs. As a result, the AUC value of MSF-LRR is superior to that of other existing algorithms under 5-fold cross-validation. Furthermore, in case studies, excluding originally known associations, 16 and 19 of the top 20 microbes associated with Bacterial Vaginosis and Irritable Bowel Syndrome, respectively, have been confirmed by the recent literature. In summary, MSF-LRR is a good predictor of potential microbe-disease associations and can contribute to drug discovery and biological research.
8
Gong H, You X, Jin M, Meng Y, Zhang H, Yang S, Xu J. Graph neural network and multi-data heterogeneous networks for microbe-disease prediction. Front Microbiol 2022; 13:1077111. [PMID: 36620040 PMCID: PMC9814480 DOI: 10.3389/fmicb.2022.1077111] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/22/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
The research on microbe association networks is greatly significant for understanding the pathogenic mechanism of microbes and promoting the application of microbes in precision medicine. In this paper, we studied the prediction of microbe-disease associations based on multi-data biological network and graph neural network algorithm. The HMDAD database provided a dataset that included 39 diseases, 292 microbes, and 450 known microbe-disease associations. We proposed a Microbe-Disease Heterogeneous Network according to the microbe similarity network, disease similarity network, and known microbe-disease associations. Furthermore, we integrated the network into the graph convolutional neural network algorithm and developed the GCNN4Micro-Dis model to predict microbe-disease associations. Finally, the performance of the GCNN4Micro-Dis model was evaluated via 5-fold cross-validation. We randomly divided all known microbe-disease association data into five groups. The results showed that the average AUC value and standard deviation were 0.8954 ± 0.0030. Our model had good predictive power and can help identify new microbe-disease associations. In addition, we compared GCNN4Micro-Dis with three advanced methods to predict microbe-disease associations, KATZHMDA, BiRWHMDA, and LRLSHMDA. The results showed that our method had better prediction performance than the other three methods. Furthermore, we selected breast cancer as a case study and found the top 12 microbes related to breast cancer from the intestinal flora of patients, which further verified the model's accuracy.
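The 5-fold cross-validation protocol summarized in the abstract above (split the known associations into five groups, hold one out per round, and report the mean AUC with its standard deviation) can be sketched as follows. This is a generic illustration and not the GCNN4Micro-Dis evaluation code. The `score_fn(train_pairs, pair)` interface is a hypothetical stand-in for any predictor, and treating unknown pairs as negatives is an assumption of this sketch.

```python
import random
import statistics


def k_fold_splits(items, k=5, seed=0):
    """Shuffle the items and deal them into k disjoint folds."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def cross_validated_auc(known_pairs, unknown_pairs, score_fn, k=5, seed=0):
    """Return the mean and standard deviation of AUC over k folds.

    score_fn(train_pairs, pair) -> float is a hypothetical model interface:
    it scores one (microbe, disease) pair given the training associations.
    """
    fold_aucs = []
    for held_out in k_fold_splits(known_pairs, k, seed):
        held = set(held_out)
        train = [p for p in known_pairs if p not in held]
        positives = [score_fn(train, p) for p in held_out]
        negatives = [score_fn(train, p) for p in unknown_pairs]
        fold_aucs.append(auc(positives, negatives))
    return statistics.mean(fold_aucs), statistics.stdev(fold_aucs)
```

Pairs are assumed to be hashable tuples such as `(microbe_id, disease_id)`, so that held-out pairs can be excluded from the training set.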
Affiliation(s)
- Houwu Gong
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Academy of Military Sciences, Beijing, China
- Xiong You
- Center of Rehabilitation Diagnosis and Treatment, Hunan Provincial Rehabilitation Hospital, Changsha, China
- Min Jin
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China (corresponding author)
- Yajie Meng
- School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Hanxue Zhang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Shuaishuai Yang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Junlin Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China (corresponding author)
9
Liu D, Liu J, Luo Y, He Q, Deng L. MGATMDA: Predicting Microbe-Disease Associations via Multi-Component Graph Attention Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3578-3585. [PMID: 34587092 DOI: 10.1109/tcbb.2021.3116318] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/13/2023]
Abstract
Microbes are parasitic in various human body organs and play significant roles in a wide range of diseases. Identifying microbe-disease associations is conducive to the identification of potential drug targets. Considering the high cost and risk of biological experiments, developing computational approaches to explore the relationship between microbes and diseases is an alternative choice. However, most existing methods are based on unreliable or noisy similarity, and the prediction accuracy could be affected. Besides, it is still a great challenge for most previous methods to make predictions for the large-scale dataset. In this work, we develop a multi-component Graph Attention Network (GAT) based framework, termed MGATMDA, for predicting microbe-disease associations. MGATMDA is built on a bipartite graph of microbes and diseases. It contains three essential parts: decomposer, combiner, and predictor. The decomposer first decomposes the edges in the bipartite graph to identify the latent components by node-level attention mechanism. The combiner then recombines these latent components automatically to obtain unified embedding for prediction by component-level attention mechanism. Finally, a fully connected network is used to predict unknown microbes-disease associations. Experimental results showed that our proposed method outperformed eight state-of-the-art methods. Case studies for two common diseases further demonstrated the effectiveness of MGATMDA in predicting potential microbe-disease associations. The codes are available at Github https://github.com/dayunliu/MGATMDA.
10
Hua M, Yu S, Liu T, Yang X, Wang H. MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes. Interdiscip Sci 2022; 14:669-682. [PMID: 35428964 DOI: 10.1007/s12539-022-00514-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Revised: 03/06/2022] [Accepted: 03/13/2022] [Indexed: 06/14/2023]
Abstract
MOTIVATION Exploring the interrelationships between microbes and disease can help microbiologists make decisions and plan treatments. Predicting new microbe-disease associations currently relies on biological experiments and domain knowledge, which is time-consuming and inefficient. Automated algorithms are used to uncover the intrinsic link between microbes and disease. However, due to data noise and inadequate understanding of relevant biology, the efficient prediction of microbe-disease associations is still crucial. This study develops a multi-view graph augmentation convolutional network (MVGCNMDA) to predict potential disease-associated microbes. METHODS First, we use two data augmentation methods, edge perturbation and node dropping, to remove the data noise in the preprocessing stage. Second, we calculate Gaussian interaction profile kernel similarity and cosine similarity. Therefore, the Graph Convolutional Network(GCN) can fully use multi-view features. Then, the multi-view features are fed into the multi-attention block to learn the weights of different features adaptively. Finally, the embedding results are obtained using a Convolutional Neural Network (CNN) combiner, and the matrix completion is used to predict the relationship between potential microbes and diseases. RESULTS We test our model on the Human microbe-disease Association Database (HMDAD), Disbiome, and the Combined Dataset (Peryton and MicroPhenoDB). The area under PR curve (AUPR), area under ROC curve (AUC), F1 score, and RECALL value are calculated to evaluate the performance of the developed MVGCNMDA. The AUPR is 0.9440, AUC is 0.9428, F1 score is 0.9383, and RECALL value is 0.8858. The experiments show that our model can accurately predict potential microbe-disease associations compared with the state-of-the-art works on the global Leave-One-Out-Cross-Validation (LOOCV) and the fivefold Cross-Validation (fivefold CV). 
To further verify the effectiveness of the proposed graph data augmentation, we designed five different settings in the ablation study. Furthermore, we present two case studies that validate the prediction of the potential association between microbes and diseases by MVGCNMDA.
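The abstract above computes cosine similarity alongside Gaussian interaction profile kernel similarity. A minimal sketch of the cosine part is given here, assuming it is taken between the rows of the association matrix. That choice of input, the zero-row convention, and the function name `cosine_similarity_matrix` are assumptions of this sketch rather than details stated in the abstract.

```python
import math


def cosine_similarity_matrix(assoc):
    """Pairwise cosine similarity between the rows of an association matrix.

    Rows with no associations have zero norm; their similarity to every row
    (including themselves) is set to 0.0 by convention in this sketch.
    """
    norms = [math.sqrt(sum(v * v for v in row)) for row in assoc]
    n = len(assoc)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] == 0 or norms[j] == 0:
                continue
            dot = sum(a * b for a, b in zip(assoc[i], assoc[j]))
            sim[i][j] = dot / (norms[i] * norms[j])
    return sim


# Toy profiles invented for illustration: the last row has no known associations.
similarity = cosine_similarity_matrix([[1, 0, 1], [2, 0, 2], [0, 1, 0], [0, 0, 0]])
```

Unlike the Gaussian kernel, cosine similarity depends only on the direction of a profile, so two rows that differ by a constant factor are treated as identical.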
Affiliation(s)
- Meifang Hua
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Shengpeng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Tianyu Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Xue Yang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Hong Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
11
Yang M, Huang ZA, Gu W, Han K, Pan W, Yang X, Zhu Z. Prediction of biomarker-disease associations based on graph attention network and text representation. Brief Bioinform 2022; 23:6651308. [PMID: 35901464 DOI: 10.1093/bib/bbac298] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/03/2022] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION The associations between biomarkers and human diseases play a key role in understanding complex pathology and developing targeted therapies. Wet lab experiments for biomarker discovery are costly, laborious and time-consuming. Computational prediction methods can be used to greatly expedite the identification of candidate biomarkers. RESULTS Here, we present a novel computational model named GTGenie for predicting the biomarker-disease associations based on graph and text features. In GTGenie, a graph attention network is utilized to characterize diverse similarities of biomarkers and diseases from heterogeneous information resources. Meanwhile, a pretrained BERT-based model is applied to learn the text-based representation of biomarker-disease relation from biomedical literature. The captured graph and text features are then integrated in a bimodal fusion network to model the hybrid entity representation. Finally, inductive matrix completion is adopted to infer the missing entries for reconstructing relation matrix, with which the unknown biomarker-disease associations are predicted. Experimental results on HMDD, HMDAD and LncRNADisease data sets showed that GTGenie can obtain competitive prediction performance with other state-of-the-art methods. AVAILABILITY The source code of GTGenie and the test data are available at: https://github.com/Wolverinerine/GTGenie.
Affiliation(s)
- Minghao Yang
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
- Zhi-An Huang
- Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
- Wenhao Gu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China; GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Kun Han
- GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Wenying Pan
- GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Xiao Yang
- GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
- Zexuan Zhu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
12
Chen Y, Lei X. Metapath Aggregated Graph Neural Network and Tripartite Heterogeneous Networks for Microbe-Disease Prediction. Front Microbiol 2022; 13:919380. [PMID: 35711758 PMCID: PMC9194683 DOI: 10.3389/fmicb.2022.919380] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/13/2022] [Accepted: 04/29/2022] [Indexed: 11/25/2022] Open
Abstract
More and more studies have shown that understanding microbe-disease associations can not only reveal the pathogenesis of diseases, but also promote the diagnosis and prognosis of diseases. Because traditional medical experiments are time-consuming and expensive, many computational methods have been proposed in recent years to identify potential microbe-disease associations. In this study, we propose a method based on heterogeneous network and metapath aggregated graph neural network (MAGNN) to predict microbe-disease associations, called MATHNMDA. First, we introduce microbe-drug interactions, drug-disease associations, and microbe-disease associations to construct a microbe-drug-disease heterogeneous network. Then we take the heterogeneous network as input to MAGNN. Second, for each layer of MAGNN, we carry out intra-metapath aggregation with a multi-head attention mechanism to learn the structural and semantic information embedded in the target node context, the metapath-based neighbor nodes, and the context between them, by encoding the metapath instances under the metapath definition mode. We then use inter-metapath aggregation with an attention mechanism to combine the semantic information of all different metapaths. Third, we can get the final embedding of microbe nodes and disease nodes based on the output of the last layer in the MAGNN. Finally, we predict potential microbe-disease associations by reconstructing the microbe-disease association matrix. In addition, we evaluated the performance of MATHNMDA by comparing it with that of its variants, some state-of-the-art methods, and different datasets. The results suggest that MATHNMDA is an effective prediction method. The case studies on asthma, inflammatory bowel disease (IBD), and coronavirus disease 2019 (COVID-19) further validate the effectiveness of MATHNMDA.
Affiliation(s)
- Yali Chen
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
|
13
|
Yin MM, Liu JX, Gao YL, Kong XZ, Zheng CH. NCPLP: A Novel Approach for Predicting Microbe-Associated Diseases With Network Consistency Projection and Label Propagation. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5079-5087. [PMID: 33119529 DOI: 10.1109/tcyb.2020.3026652] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 06/11/2023]
Abstract
A growing number of clinical studies have provided substantial evidence of a close relationship between microbes and disease. Thus, it is necessary to infer potential microbe-disease associations. However, traditional approaches validate these associations experimentally, which often consumes considerable materials and time. Hence, more reliable computational methods are expected to be applied to predict disease-associated microbes. In this article, an innovative means of predicting microbe-disease associations is proposed, which is based on network consistency projection and label propagation (NCPLP). Given that most existing algorithms use the Gaussian interaction profile (GIP) kernel similarity as the similarity criterion between microbe pairs and disease pairs, in this model, Medical Subject Headings descriptors are considered to calculate disease semantic similarity. In addition, 16S rRNA gene sequences are used for the calculation of microbe functional similarity. In view of the gene-based sequence information, we use two conventional methods (BLAST+ and MEGA7) to assess the similarity between each pair of microbes from different perspectives. In particular, network consistency projection is added to obtain network projection scores from the microbe space and the disease space. Ultimately, label propagation is utilized to reliably predict microbes related to diseases. NCPLP achieves better performance in various evaluation indicators and discovers a greater number of potential associations between microbes and diseases. Also, case studies further confirm the reliable prediction performance of NCPLP. To conclude, our algorithm NCPLP has the ability to discover these underlying microbe-disease associations and can provide help for biological study.
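The label propagation step named in this abstract can be illustrated with a generic sketch. The abstract does not give NCPLP's exact update rule, so the iteration below (scores = alpha * S * scores + (1 - alpha) * labels, with a row-normalised similarity matrix S) is the textbook form and an assumption here; the function name, the `alpha` value, and the toy data are all hypothetical.

```python
def label_propagation(similarity, initial_labels, alpha=0.5, iterations=100):
    """Generic label propagation over a similarity matrix (illustrative only)."""
    n = len(similarity)
    # Row-normalise the similarity matrix so each row sums to 1.
    normalised = []
    for row in similarity:
        total = sum(row)
        normalised.append([v / total if total else 0.0 for v in row])
    scores = list(initial_labels)
    for _ in range(iterations):
        # Mix the neighbours' current scores with the known labels.
        scores = [
            alpha * sum(normalised[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * initial_labels[i]
            for i in range(n)
        ]
    return scores


# Toy example: microbe 0 is known to be associated with a disease;
# microbe 1 is similar to microbe 0, microbe 2 is not.
toy_similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
toy_labels = [1.0, 0.0, 0.0]
propagated = label_propagation(toy_similarity, toy_labels)
```

In this toy case the unlabeled microbe that resembles the known one ends up with the higher score, which is the behaviour label propagation relies on when ranking candidate associations.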
|
14
|
Ma Y, Liu Q. Generalized matrix factorization based on weighted hypergraph learning for microbe-drug association prediction. Comput Biol Med 2022; 145:105503. [DOI: 10.1016/j.compbiomed.2022.105503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Revised: 03/28/2022] [Accepted: 04/04/2022] [Indexed: 11/03/2022]
|
15
|
Zhou L, Tang Y, Yan G. A New Estimation Method for the Biological Interaction Predicting Problems. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:1415-1423. [PMID: 33406043 DOI: 10.1109/tcbb.2021.3049642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
For the past decades, computational methods have been developed to predict various interactions in biological problems. Usually these methods treated the prediction problems as semi-supervised or positive-unlabeled (PU) learning problems. Researchers focused on the prediction of unlabeled samples and hoped to find novel interactions in the datasets they collected. However, most of the computational methods could only predict a small proportion of undiscovered interactions and the total number was unknown. In this paper, we developed an estimation method with deep learning to calculate the number of undiscovered interactions in the unlabeled samples, derived its asymptotic interval estimation, and applied it successfully to the compound synergism dataset, the drug-target interaction (DTI) dataset and the microRNA-disease interaction dataset. Moreover, this method could reveal which dataset contained more undiscovered interactions and would serve as guidance for experimental validation. Furthermore, we compared our method with some mixture proportion estimators and demonstrated the efficacy of our method. Finally, we proved that AUC and AUPR were related to the number of undiscovered interactions, which was regarded as another evaluation indicator for the computational methods.
|
16
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in the prediction of potential correlations between microbes, drugs and diseases, from biological data to computational models. Firstly, we introduce in detail the widely used datasets relevant to the identification of potential relationships between microbes, drugs and diseases. We then divide a series of representative computing models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improvement of predictive performance.
Affiliation(s)
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
|
17
|
Huang YA, Huang ZA, Li JQ, You ZH, Wang L, Yi HC, Yu CQ. GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences. BMC Genomics 2022; 22:916. [PMID: 35296232 PMCID: PMC8925046 DOI: 10.1186/s12864-022-08423-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Accepted: 02/25/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Recent evidence has suggested that human microorganisms participate in important biological activities in the human body. The dysfunction of host-microbiota interactions could lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify the disorder-specific microbes from the biological "haystack" merely by routine wet-lab experiments. With the developments in next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale. RESULTS Based on the known microbe-disease associations derived from the Human Microbe-Disease Association Database (HMDAD), the proposed model shows reliable performance, with areas under the ROC curve (AUC) of 0.9456 and 0.8866 in leave-one-out cross validation and five-fold cross validation, respectively. In case studies of colorectal carcinoma, 80% of the top-20 predicted microbes have been experimentally confirmed in the published literature. CONCLUSION Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we here propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the most likely microbes associated with various human diseases. Based on the sequence information of genes, two computational approaches (BLAST+ and MEGA 7) are leveraged to measure the microbe-microbe similarity from different perspectives. The disease-disease similarity is calculated by capturing the hierarchy information from the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers.
Affiliation(s)
- Yu-An Huang
  - Department of Information Engineering, Xijing University, Xi'an, 710123, China
- Zhi-An Huang
  - Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
- Jian-Qiang Li
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
- Zhu-Hong You
  - Department of Information Engineering, Xijing University, Xi'an, 710123, China
- Lei Wang
  - Guangxi Academy of Science, Nanning, 530000, China
- Hai-Cheng Yi
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830000, China
- Chang-Qing Yu
  - Department of Information Engineering, Xijing University, Xi'an, 710123, China
|
18
|
Zha Y, Ning K. Ontology-aware neural network: a general framework for pattern mining from microbiome data. Brief Bioinform 2022; 23:6517031. [PMID: 35091743 PMCID: PMC8921649 DOI: 10.1093/bib/bbac005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Revised: 12/30/2021] [Accepted: 01/04/2022] [Indexed: 11/23/2022] Open
Abstract
With the rapid accumulation of microbiome data around the world, numerous computational bioinformatics methods have been developed for pattern mining from such paramount microbiome data. Current microbiome data mining methods, such as gene and species mining, rely heavily on sequence comparison. Most of these methods, however, have a clear trade-off, particularly, when it comes to big-data analytical efficiency and accuracy. Microbiome entities are usually organized in ontology structures, and pattern mining methods that have considered ontology structures could offer advantages in mining efficiency and accuracy. Here, we have summarized the ontology-aware neural network (ONN) as a novel framework for microbiome data mining. We have discussed the applications of ONN in multiple contexts, including gene mining, species mining and microbial community dynamic pattern mining. We have then highlighted one of the most important characteristics of ONN, namely, novel knowledge discovery, which makes ONN a standout among all microbiome data mining methods. Finally, we have provided several applications to showcase the advantage of ONN over other methods in microbiome data mining. In summary, ONN represents a paradigm shift for pattern mining from microbiome data: from traditional machine learning approach to ontology-aware and model-based approach, which has found its broad application scenarios in microbiome data mining.
Affiliation(s)
- Yuguo Zha
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, Center of AI Biology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road Wuhan, Hubei, Wuhan 430074, China
- Kang Ning
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, Center of AI Biology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road Wuhan, Hubei, Wuhan 430074, China
|
19
|
Li H, Wang Y, Zhang Z, Tan Y, Chen Z, Wang X, Pei T, Wang L. Identifying Microbe-Disease Association Based on a Novel Back-Propagation Neural Network Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2502-2513. [PMID: 32305935 DOI: 10.1109/tcbb.2020.2986459] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 06/11/2023]
Abstract
Over the years, considerable evidence has demonstrated that microbes living in the human body are closely related to human life activities and human diseases. However, traditional biological experiments are time-consuming and expensive, so predicting potential microbe-disease associations with computational methods has become a research topic in bioinformatics. In this study, a novel computational method called BPNNHMDA is proposed to identify potential microbe-disease associations. In BPNNHMDA, a novel neural network model is first designed to infer potential microbe-disease associations; its input signal is a matrix of known microbe-disease associations, and its output signal is a matrix of potential microbe-disease association probabilities. Moreover, in this neural network model, a new activation function based on the hyperbolic tangent function is designed to activate the hidden layer and the output layer, and the initial connection weights are optimized by adopting Gaussian Interaction Profile (GIP) kernel similarity for microbes, which can improve the training speed of BPNNHMDA efficiently. Finally, in order to verify the performance of our prediction model, different frameworks such as Leave-One-Out Cross Validation (LOOCV) and k-Fold Cross Validation (k-Fold CV) are implemented on BPNNHMDA. Simulation results illustrate that BPNNHMDA can achieve reliable AUCs of 0.9242, 0.9127 ± 0.0009 and 0.8955 ± 0.0018 in LOOCV, 5-Fold CV and 2-Fold CV, respectively, which are superior to previous state-of-the-art methods. Furthermore, case studies of inflammatory bowel disease (IBD), asthma and obesity demonstrate that BPNNHMDA has excellent prediction ability in practical applications as well.
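AUC figures such as those quoted in this abstract summarise how often a known association is scored above an unknown one. As a minimal sketch of that metric only (not of the BPNNHMDA evaluation pipeline, which the abstract does not detail), the AUC can be computed from held-out scores by pairwise comparison; the function name and toy data below are hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy held-out predictions: 1 = known association, 0 = unknown pair.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)  # 3 of 4 pairs ordered correctly
```

The pairwise form is quadratic in the number of samples; it is used here for clarity, and a rank-based implementation would be preferable on large association matrices.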
|
20
|
Liu Y, Wang SL, Zhang JF, Zhang W, Zhou S, Li W. DMFMDA: Prediction of Microbe-Disease Associations Based on Deep Matrix Factorization Using Bayesian Personalized Ranking. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1763-1772. [PMID: 32816678 DOI: 10.1109/tcbb.2020.3018138] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 06/11/2023]
Abstract
Identifying microbe-disease associations is conducive to understanding the pathogenesis of disease from the perspective of microbes. In this paper, we propose a deep matrix factorization prediction model (DMFMDA) based on a deep neural network. First, the one-hot encoding of each disease is fed into the neural network and transformed into a low-dimensional dense vector in an implicit semantic space via an embedding layer; microbes are handled in the same way. Then, matrix factorization is realized by the neural network with the embedding layer. Furthermore, our model combines the non-linear modeling advantages of the multi-layer perceptron with the linear modeling advantages of matrix factorization. Finally, unlike other methods that use a squared error loss function, Bayesian Personalized Ranking optimizes the model from a ranking perspective to obtain the optimal model parameters, which makes full use of the unobserved data. Experiments show that DMFMDA reaches average AUCs of 0.9091 and 0.9103 under 5-fold cross validation and leave-one-out cross validation, respectively, which is superior to three state-of-the-art methods. In case studies, 10, 9 and 9 of the top-10 candidate microbes are verified by recently published literature for asthma, inflammatory bowel disease and colon cancer, respectively. In conclusion, DMFMDA is a successful application of deep learning to the prediction of microbe-disease associations.
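The ranking objective mentioned in this abstract can be made concrete with the standard Bayesian Personalized Ranking loss, -ln σ(s_pos - s_neg), where s_pos is the score of an observed pair and s_neg the score of an unobserved one. The sketch below shows only that generic loss; how DMFMDA produces the scores and samples the pairs is not stated in the abstract, so the function names and numbers are hypothetical.

```python
import math


def bpr_loss(pos_score, neg_score):
    """Standard BPR loss for one (observed, unobserved) pair: -ln sigmoid(pos - neg)."""
    diff = pos_score - neg_score
    # Numerically stable softplus(-diff) = ln(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_bpr_loss(pairs):
    """Average BPR loss over a list of (pos_score, neg_score) tuples."""
    return sum(bpr_loss(p, n) for p, n in pairs) / len(pairs)


# Toy scores: the loss is small when observed pairs outrank unobserved ones.
well_ranked = mean_bpr_loss([(2.0, 0.0), (1.5, -0.5)])
badly_ranked = mean_bpr_loss([(0.0, 2.0), (-0.5, 1.5)])
```

Unlike a squared-error loss, this objective depends only on the score difference within each pair, which is why unobserved pairs can be used as weak negatives rather than as hard zeros.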
|
21
|
Huang ZA, Zhu Z, Yau CH, Tan KC. Identifying Autism Spectrum Disorder From Resting-State fMRI Using Deep Belief Network. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:2847-2861. [PMID: 32692687 DOI: 10.1109/tnnls.2020.3007943] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Indexed: 06/11/2023]
Abstract
With the increasing prevalence of autism spectrum disorder (ASD), it is important to identify ASD patients for effective treatment and intervention, especially in early childhood. Neuroimaging techniques have been used to characterize the complex biomarkers based on the functional connectivity anomalies in the ASD. However, the diagnosis of ASD still adopts the symptom-based criteria by clinical observation. The existing computational models tend to achieve unreliable diagnostic classification on the large-scale aggregated data sets. In this work, we propose a novel graph-based classification model using the deep belief network (DBN) and the Autism Brain Imaging Data Exchange (ABIDE) database, which is a worldwide multisite functional and structural brain imaging data aggregation. The remarkable connectivity features are selected through a graph extension of K -nearest neighbors and then refined by a restricted path-based depth-first search algorithm. Thanks to the feature reduction, lower computational complexity could contribute to the shortening of the training time. The automatic hyperparameter-tuning technique is introduced to optimize the hyperparameters of the DBN by exploring the potential parameter space. The simulation experiments demonstrate the superior performance of our model, which is 6.4% higher than the best result reported on the ABIDE database. We also propose to use the data augmentation and the oversampling technique to identify further the possible subtypes within the ASD. The interpretability of our model enables the identification of the most remarkable autistic neural correlation patterns from the data-driven outcomes.
|
22
|
Li HY, Chen HY, Wang L, Song SJ, You ZH, Yan X, Yu JQ. A structural deep network embedding model for predicting associations between miRNA and disease based on molecular association network. Sci Rep 2021; 11:12640. [PMID: 34135401 PMCID: PMC8209151 DOI: 10.1038/s41598-021-91991-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/29/2020] [Accepted: 04/30/2021] [Indexed: 02/05/2023] Open
Abstract
Previous studies indicated that miRNA plays an important role in human biological processes, especially in the field of diseases. However, constrained by biotechnology, only a small part of the miRNA-disease associations has been verified by biological experiment. This has impelled more and more researchers to develop efficient and high-precision computational methods for predicting potential miRNA-disease associations. Based on the assumption that molecules are related to each other in human physiological processes, we developed a novel structural deep network embedding model (SDNE-MDA) for predicting miRNA-disease associations using a molecular associations network. Specifically, the SDNE-MDA model first integrates miRNA attribute information obtained by the Chaos Game Representation (CGR) algorithm and disease attribute information obtained from disease semantic similarity. Secondly, we extract features by structural deep network embedding from the heterogeneous molecular associations network. Then, a comprehensive feature descriptor is constructed by combining attribute information and behavior information. Finally, a Convolutional Neural Network (CNN) is adopted to train on and classify these feature descriptors. In the five-fold cross validation experiment, SDNE-MDA achieved an AUC of 0.9447 with a prediction accuracy of 87.38% on the HMDD v3.0 dataset. To further verify the performance of SDNE-MDA, we compared it with different feature extraction models and classifier models. Moreover, case studies of three important human diseases (Breast Neoplasms, Kidney Neoplasms and Lymphoma) were carried out with the proposed model. As a result, 47, 46 and 46 of the top-50 predicted disease-related miRNAs have been confirmed by independent databases. These results suggest that SDNE-MDA would be a reliable computational tool for predicting potential miRNA-disease associations.
Affiliation(s)
- Hao-Yuan Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116 China
- Hai-Yan Chen
  - Xinjiang Autonomous Region tax Service, State Taxation Administration, Urumqi, 830011 China
- Lei Wang
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011 China
- Shen-Jian Song
  - Science & Technology Department of Xinjiang Uygur Autonomous Region, Urumqi, 830011 China
- Zhu-Hong You
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011 China
- Xin Yan
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116 China
- Jin-Qian Yu
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116 China
|
23
|
Yan C, Duan G, Wu FX, Pan Y, Wang J. MCHMDA: Predicting Microbe-Disease Associations Based on Similarities and Low-Rank Matrix Completion. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:611-620. [PMID: 31295117 DOI: 10.1109/tcbb.2019.2926716] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 06/09/2023]
Abstract
With the development of high-throughput sequencing technology and microbiology, many studies have shown that microbes are associated with human diseases, such as obesity and liver cancer. Therefore, identifying the associations between microbes and diseases has become an important study topic in current bioinformatics. The emergence of microbe-disease association databases has provided an unprecedented opportunity to develop computational methods for predicting microbe-disease associations. In this study, we propose a low-rank matrix completion method (called MCHMDA) to predict microbe-disease associations by integrating similarities of microbes and diseases and known microbe-disease associations into a heterogeneous network. The microbe similarity is computed from Gaussian Interaction Profile (GIP) kernel similarity based on the known microbe-disease associations. Then, we further improve the microbe similarity by taking into account the organs these microbes inhabit in the human body. The disease similarity is computed as the average of disease GIP similarity, disease symptom-based similarity, and disease functional similarity. Then, we construct a heterogeneous microbe-disease association network by integrating the microbe similarity network, disease similarity network, and known microbe-disease association network. Finally, a matrix completion method is used to calculate the association scores of unknown microbe-disease pairs by the fast Singular Value Thresholding (SVT) algorithm. Via 5-fold Cross Validation (5CV) and Leave-One-Out Cross Validation (LOOCV), we evaluate the prediction performance of MCHMDA and other state-of-the-art methods, which include BRWMDA, NGRHMDA, LRLSHMDA, and KATZHMDA. On the benchmark dataset HMDAD, the experimental results show that MCHMDA outperforms other methods in terms of area under the receiver operating characteristic curve (AUC). MCHMDA achieves AUC values of 0.9251 and 0.9495 in 5CV and LOOCV, respectively, which are the highest values among the competing methods. In addition, we further demonstrate the prediction generality of MCHMDA on an expanded microbe-disease associations dataset (HMDAD-SUP). Finally, case studies demonstrate the prediction ability in practical applications.
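The GIP kernel similarity that several of these abstracts rely on is commonly defined as K(i, j) = exp(-γ ‖a_i - a_j‖²), where a_i is the binary interaction profile of entity i and γ is a bandwidth γ' divided by the mean squared norm of all profiles. The sketch below follows that common definition; the function name, the default γ' = 1, and the toy matrix are assumptions for illustration and are not taken from the MCHMDA paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix."""
    n = len(profiles)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy microbe-by-disease association matrix (rows: microbes, columns: diseases).
toy_associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
microbe_similarity = gip_kernel(toy_associations)
```

Applying the same function to the transposed matrix would give the disease-side GIP similarity; microbes 0 and 1 share a disease here, so they come out more similar than microbes 0 and 2.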
|
24
|
Discovering microbe-disease associations from the literature using a hierarchical long short-term memory network and an ensemble parser model. Sci Rep 2021; 11:4490. [PMID: 33627732 PMCID: PMC7904816 DOI: 10.1038/s41598-021-83966-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/09/2020] [Accepted: 02/08/2021] [Indexed: 02/07/2023] Open
Abstract
With recent advances in biotechnology and sequencing technology, the microbial community has been intensively studied and discovered to be associated with many chronic as well as acute diseases. Even though a tremendous number of studies describing the association between microbes and diseases have been published, text mining methods that focus on such associations have been rarely studied. We propose a framework that combines machine learning and natural language processing methods to analyze the association between microbes and diseases. A hierarchical long short-term memory network was used to detect sentences that describe the association. For the sentences determined, two different parse tree-based search methods were combined to find the relation-describing word. The ensemble model of constituency parsing for structural pattern matching and dependency-based relation extraction improved the prediction accuracy. By combining deep learning and parse tree-based extractions, our proposed framework could extract the microbe-disease association with higher accuracy. The evaluation results showed that our system achieved an F-score of 0.8764 and 0.8524 in binary decisions and extracting relation words, respectively. As a case study, we performed a large-scale analysis of the association between microbes and diseases. Additionally, a set of common microbes shared by multiple diseases were also identified in this study. This study could provide valuable information for the major microbes that were studied for a specific disease. The code and data are available at https://github.com/DMnBI/mdi_predictor .
|
25
|
Xu D, Xu H, Zhang Y, Wang M, Chen W, Gao R. MDAKRLS: Predicting human microbe-disease association based on Kronecker regularized least squares and similarities. J Transl Med 2021; 19:66. [PMID: 33579301 PMCID: PMC7881563 DOI: 10.1186/s12967-021-02732-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/09/2020] [Accepted: 02/01/2021] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Microbes are closely related to human health and diseases. Identification of disease-related microbes is of great significance for revealing the pathological mechanism of human diseases and understanding the interaction mechanisms between microbes and humans, which is also useful for the prevention, diagnosis and treatment of human diseases. Considering that the known disease-related microbes are still insufficient, it is necessary to develop effective computational methods and reduce the time and cost of biological experiments. METHODS In this work, we developed a novel computational method called MDAKRLS to discover potential microbe-disease associations (MDAs) based on the Kronecker regularized least squares. Specifically, we introduced the Hamming interaction profile similarity to measure the similarities of microbes and diseases in addition to the Gaussian interaction profile kernel similarity. In addition, we introduced the Kronecker product to construct two kinds of Kronecker similarities between microbe-disease pairs. Then, we applied the Kronecker regularized least squares with each Kronecker similarity to obtain prediction scores, and calculated the final prediction scores by integrating the contributions of the different similarities. RESULTS The AUC values achieved by MDAKRLS in global leave-one-out cross-validation and 5-fold cross-validation were 0.9327 and 0.9023 ± 0.0015, which were significantly higher than those of the five state-of-the-art methods used for comparison. Comparison results demonstrate that MDAKRLS has faster computing speed under both frameworks. In addition, case studies of inflammatory bowel disease (IBD) and asthma further showed that 19 (IBD) and 19 (asthma) of the top 20 predicted disease-related microbes could be verified by previously published biological or medical literature. CONCLUSIONS All the evaluation results demonstrated that MDAKRLS has effective and reliable prediction performance. It may be a useful tool for finding new disease-related microbes and helping biomedical researchers carry out follow-up studies.
Affiliation(s)
- Da Xu
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Hanxiao Xu
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Mingyi Wang
  - Department of Central Lab, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Wei Chen
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Rui Gao
  - School of Control Science and Engineering, Shandong University, Jinan, 250061, China
|
26
|
Human Microbe-Disease Association Prediction by a Novel Double-Ended Random Walk with Restart. BIOMED RESEARCH INTERNATIONAL 2020; 2020:3978702. [PMID: 32851068 PMCID: PMC7439206 DOI: 10.1155/2020/3978702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2020] [Revised: 07/07/2020] [Accepted: 07/20/2020] [Indexed: 11/17/2022]
Abstract
Microorganisms in the human body play a vital role in metabolism, immune defense, nutrient absorption, cancer control, and prevention of pathogen colonization. More and more biological and clinical studies have shown that the imbalance of microbial communities is closely related to the occurrence and development of various complex human diseases. Finding potential microbe-disease associations is critical for understanding the pathology of these diseases and thus further improving disease diagnosis and prognosis. In this study, we proposed a novel computational model to predict disease-associated microbes. Specifically, we first constructed a heterogeneous interconnection network based on known microbe-disease associations deposited in a few databases, the similarity between diseases, and the similarity between microorganisms. We then predicted novel microbe-disease associations by a new method called the double-ended restart random walk model (DRWHMDA) implemented on the interconnection network. In addition, we performed case studies of colon cancer and asthma for further evaluation. The results indicate that 10 and 9 of the top 10 microorganisms predicted to be associated with colorectal cancer and asthma, respectively, were validated by the relevant literature. Our method is expected to be effective in identifying disease-related microorganisms and will help to reveal the relationship between microorganisms and complex human diseases.
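For orientation, the ordinary (single-ended) random walk with restart that this family of methods builds on iterates p ← (1 - r) W p + r p₀, where W is the column-normalised adjacency matrix, p₀ is the seed distribution and r is the restart probability. The sketch below implements only that standard form; the abstract does not describe the double-ended variant used by DRWHMDA, and the function name, parameter values and toy graph are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Standard random walk with restart on a weighted graph (illustrative only)."""
    n = len(adjacency)
    # Column-normalise so each column sums to 1 (isolated nodes keep a zero column).
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Start from a uniform distribution over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the seed.
toy_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
walk_scores = random_walk_with_restart(toy_graph, seeds=[0])
```

The stationary scores fall off with distance from the seed, so in a heterogeneous microbe-disease network the nodes with the highest scores are those best connected to the query node.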
27
Zhao Y, Wang CC, Chen X. Microbes and complex diseases: from experimental results to computational models. Brief Bioinform 2020; 22:5882184. [PMID: 32766753 DOI: 10.1093/bib/bbaa158] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 04/09/2020] [Revised: 06/19/2020] [Accepted: 06/22/2020] [Indexed: 12/13/2022] Open
Abstract
Studies have shown that the number of microbes in humans is almost 10 times that of cells. These microbes have been proven to play an important role in a variety of physiological processes, such as enhancing immunity, improving the digestion of gastrointestinal tract and strengthening metabolic function. In addition, in recent years, more and more research results have indicated that there are close relationships between the emergence of the human noncommunicable diseases and microbes, which provides a novel insight for us to further understand the pathogenesis of the diseases. An in-depth study about the relationships between diseases and microbes will not only contribute to exploring new strategies for the diagnosis and treatment of diseases but also significantly heighten the efficiency of new drugs development. However, applying the methods of biological experimentation to reveal the microbe-disease associations is costly and inefficient. In recent years, more and more researchers have constructed multiple computational models to predict microbes that are potentially associated with diseases. Here, we start with a brief introduction of microbes and databases as well as web servers related to them. Then, we mainly introduce four kinds of computational models, including score function-based models, network algorithm-based models, machine learning-based models and experimental analysis-based models. Finally, we summarize the advantages as well as disadvantages of them and set the direction for the future work of revealing microbe-disease associations based on computational models. We firmly believe that computational models are expected to be important tools in large-scale predictions of disease-related microbes.
Affiliation(s)
- Yan Zhao
- School of Information and Control Engineering, China University of Mining
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining
- Xing Chen
- School of Information and Control Engineering, China University of Mining
28
Wen Z, Yan C, Duan G, Li S, Wu FX, Wang J. A survey on predicting microbe-disease associations: biological data and computational methods. Brief Bioinform 2020; 22:5881365. [PMID: 34020541 DOI: 10.1093/bib/bbaa157] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/28/2020] [Revised: 06/18/2020] [Accepted: 06/22/2020] [Indexed: 02/06/2023] Open
Abstract
Various microbes have proved to be closely related to the pathogenesis of human diseases. While many computational methods for predicting human microbe-disease associations (MDAs) have been developed, few systematic reviews on these methods have been reported. In this study, we provide a comprehensive overview of the existing methods. Firstly, we introduce the data used in existing MDA prediction methods. Secondly, we classify those methods into different categories by their nature and describe their algorithms and strategies in detail. Next, experimental evaluations are conducted on representative methods using different similarity data and calculation methods to compare their prediction performances. Based on the principles of computational methods and experimental results, we discuss the advantages and disadvantages of those methods and propose suggestions for the improvement of prediction performances. Considering the problems of the MDA prediction at present stage, we discuss future work from three perspectives including data, methods and formulations at the end.
Affiliation(s)
- Zhongqi Wen
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
- Cheng Yan
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Guihua Duan
- School of Computer Science and Engineering, Central South University
- Suning Li
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Fang-Xiang Wu
- College of Engineering and the Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
29
Fan Y, Chen M, Zhu Q, Wang W. Inferring Disease-Associated Microbes Based on Multi-Data Integration and Network Consistency Projection. Front Bioeng Biotechnol 2020; 8:831. [PMID: 32850711 PMCID: PMC7418576 DOI: 10.3389/fbioe.2020.00831] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/19/2020] [Accepted: 06/29/2020] [Indexed: 12/18/2022] Open
Abstract
Plenty of microbes in our human body play a vital role in the process of cell physiology. In recent years, there is accumulating evidence indicating that microbes are closely related to many complex human diseases. In-depth investigation of disease-associated microbes can contribute to understanding the pathogenesis of diseases and thus provide novel strategies for the treatment, diagnosis, and prevention of diseases. To date, many computational models have been proposed for predicting microbe-disease associations using available similarity networks. However, these similarity networks are not effectively fused. In this study, we proposed a novel computational model based on multi-data integration and network consistency projection for Human Microbe-Disease Associations Prediction (HMDA-Pred), which fuses multiple similarity networks by a linear network fusion method. HMDA-Pred yielded AUC values of 0.9589 and 0.9361 ± 0.0037 in the experiments of leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. Furthermore, in case studies, 10, 8, and 10 out of the top 10 predicted microbes of asthma, colon cancer, and inflammatory bowel disease were confirmed by the literatures, respectively.
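The AUC values quoted above summarize how often a model ranks a known association above an unknown one. A minimal Python sketch of that ranking interpretation follows; it is not the evaluation code of the cited study. The helper name `roc_auc` is hypothetical, the scores and labels are made-up toy values, and the cross-validation loop that would produce held-out scores is omitted.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Equals the probability that a randomly chosen positive (label 1) receives a
    higher score than a randomly chosen negative (label 0); ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy held-out predictions: 1 marks a known association, 0 an unknown pair.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

In leave-one-out cross validation each known association would be hidden in turn, scored together with the unlabelled candidates, and the resulting rankings pooled before computing a value like this one.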
Affiliation(s)
- Yongxian Fan
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
30
Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe-disease associations via graph attention networks with inductive matrix completion. Brief Bioinform 2020; 22:5876591. [PMID: 32725163 DOI: 10.1093/bib/bbaa146] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 04/23/2020] [Revised: 06/07/2020] [Accepted: 06/11/2020] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION: Human microbes play a critical role in an extensive range of complex human diseases and have become a new target in precision medicine. In silico methods of identifying microbe-disease associations can not only provide deep insight into the pathogenic mechanisms of complex human diseases but also assist pharmacologists in screening candidate targets for drug development. However, the majority of existing approaches are based on linear models or label propagation, which limits their ability to capture nonlinear associations between microbes and diseases. Besides, it is still a great challenge for most previous methods to make predictions for new diseases (or new microbes) with few or no observed associations. RESULTS: In this work, we construct features for microbes and diseases by fully exploiting multiple sources of biomedical data, and then propose a novel deep learning framework of graph attention networks with inductive matrix completion for human microbe-disease association prediction, named GATMDA. To our knowledge, this is the first attempt to leverage graph attention networks for this important task. In particular, we develop an optimized graph attention network with talking-heads to learn representations for nodes (i.e. microbes and diseases). To focus on more important neighbours and filter out noise, we further design a bi-interaction aggregator to enforce representation aggregation of similar neighbours. In addition, we combine inductive matrix completion to reconstruct microbe-disease associations and capture the complicated associations between diseases and microbes. Comprehensive experiments on two data sets (i.e. HMDAD and Disbiome) demonstrated that our proposed model consistently outperformed baseline methods. Case studies on two diseases, i.e. asthma and inflammatory bowel disease, further confirmed the effectiveness of GATMDA.
AVAILABILITY: Python code and data sets are available at: https://github.com/yahuilong/GATMDA. CONTACT: luojiawei@hnu.edu.cn.
Affiliation(s)
- Yahui Long
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China
- Yu Zhang
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Yan Xia
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China
31
Luo J, Long Y. NTSHMDA: Prediction of Human Microbe-Disease Association Based on Random Walk by Integrating Network Topological Similarity. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1341-1351. [PMID: 30489271 DOI: 10.1109/tcbb.2018.2883041] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Indexed: 05/23/2023]
Abstract
Accumulating clinical evidence has demonstrated that the microbes residing in human bodies play a significantly important role in the formation, development, and progression of various complex human diseases. Identifying latent related microbes for disease could provide insight into human disease mechanisms and promote disease prevention, diagnosis, and treatment. In this paper, we first construct a heterogeneous network by connecting the disease similarity network and the microbe similarity network through the known microbe-disease association network, and then develop a novel computational model to predict human microbe-disease associations based on random walk by integrating network topological similarity (NTSHMDA). Specifically, each microbe-disease association pair is regarded as a distinct relationship level and, thus, assigned a different weight based on network topological similarity. The experimental results show that NTSHMDA outperforms some state-of-the-art methods with average AUCs of 0.9070 and 0.8896 ± 0.0038 in the frameworks of leave-one-out cross validation and 5-fold cross validation, respectively. In case studies, 9, 18, 38 and 9, 18, 45 out of the top-10, 20, 50 candidate microbes are verified by recently published literature for asthma and inflammatory bowel disease, respectively. In conclusion, NTSHMDA has the potential to identify novel disease-microbe associations and can also provide valuable information for drug discovery and biological research.
32
Wang L, You ZH, Huang DS, Zhou F. Combining High Speed ELM Learning with a Deep Convolutional Neural Network Feature Encoding for Predicting Protein-RNA Interactions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:972-980. [PMID: 30296240 DOI: 10.1109/tcbb.2018.2874267] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 05/25/2023]
Abstract
Emerging evidence has shown that RNA plays a crucial role in many cellular processes, and their biological functions are primarily achieved by binding with a variety of proteins. High-throughput biological experiments provide a lot of valuable information for the initial identification of RNA-protein interactions (RPIs), but with the increasing complexity of RPIs networks, this method gradually falls into expensive and time-consuming situations. Therefore, there is an urgent need for high speed and reliable methods to predict RNA-protein interactions. In this study, we propose a computational method for predicting the RNA-protein interactions using sequence information. The deep learning convolution neural network (CNN) algorithm is utilized to mine the hidden high-level discriminative features from the RNA and protein sequences and feed it into the extreme learning machine (ELM) classifier. The experimental results with 5-fold cross-validation indicate that the proposed method achieves superior performance on benchmark datasets (RPI1807, RPI2241, and RPI369) with the accuracy of 98.83, 90.83, and 85.63 percent, respectively. We further evaluate the performance of the proposed model by comparing it with the state-of-the-art SVM classifier and other existing methods on the same benchmark data set. In addition, we predicted the independent NPInter v2.0 data set using the model trained on RPI369. The experimental results show that our model can serve as a useful tool for predicting RNA-protein interactions.
33
Liu D, Ma Y, Jiang X, He T. Predicting virus-host association by Kernelized logistic matrix factorization and similarity network fusion. BMC Bioinformatics 2019; 20:594. [PMID: 31787095 PMCID: PMC6886165 DOI: 10.1186/s12859-019-3082-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023] Open
Abstract
Background Viruses are closely related to bacteria and human diseases. It is of great significance to predict associations between viruses and hosts for understanding the dynamics and complex functional networks in microbial community. With the rapid development of the metagenomics sequencing, some methods based on sequence similarity and genomic homology have been used to predict associations between viruses and hosts. However, the known virus-host association network was ignored in these methods. Results We proposed a kernelized logistic matrix factorization with integrating different information to predict potential virus-host associations on the heterogeneous network (ILMF-VH) which is constructed by connecting a virus network with a host network based on known virus-host associations. The virus network is constructed based on oligonucleotide frequency measurement, and the host network is constructed by integrating oligonucleotide frequency similarity and Gaussian interaction profile kernel similarity through similarity network fusion. The host prediction accuracy of our method is better than other methods. In addition, case studies show that the host of crAssphage predicted by ILMF-VH is consistent with presumed host in previous studies, and another potential host Escherichia coli is also predicted. Conclusions The proposed model is an effective computational tool for predicting interactions between viruses and hosts effectively, and it has great potential for discovering novel hosts of viruses.
Affiliation(s)
- Dan Liu
- School of Computer, Central China Normal University, Wuhan, Hubei, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei, China
- Yingjun Ma
- School of Computer, Central China Normal University, Wuhan, Hubei, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei, China
- Xingpeng Jiang
- School of Computer, Central China Normal University, Wuhan, Hubei, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei, China
- Tingting He
- School of Computer, Central China Normal University, Wuhan, Hubei, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei, China
34
Long Y, Luo J. WMGHMDA: a novel weighted meta-graph-based model for predicting human microbe-disease association on heterogeneous information network. BMC Bioinformatics 2019; 20:541. [PMID: 31675979 PMCID: PMC6824056 DOI: 10.1186/s12859-019-3066-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/04/2019] [Accepted: 09/02/2019] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND An increasing number of biological and clinical evidences have indicated that the microorganisms significantly get involved in the pathological mechanism of extensive varieties of complex human diseases. Inferring potential related microbes for diseases can not only promote disease prevention, diagnosis and treatment, but also provide valuable information for drug development. Considering that experimental methods are expensive and time-consuming, developing computational methods is an alternative choice. However, most of existing methods are biased towards well-characterized diseases and microbes. Furthermore, existing computational methods are limited in predicting potential microbes for new diseases. RESULTS Here, we developed a novel computational model to predict potential human microbe-disease associations (MDAs) based on Weighted Meta-Graph (WMGHMDA). We first constructed a heterogeneous information network (HIN) by combining the integrated microbe similarity network, the integrated disease similarity network and the known microbe-disease bipartite network. And then, we implemented iteratively pre-designed Weighted Meta-Graph search algorithm on the HIN to uncover possible microbe-disease pairs by cumulating the contribution values of weighted meta-graphs to the pairs as their probability scores. Depending on contribution potential, we described the contribution degree of different types of meta-graphs to a microbe-disease pair with bias rating. Meta-graph with higher bias rating will be assigned greater weight value when calculating probability scores. CONCLUSIONS The experimental results showed that WMGHMDA outperformed some state-of-the-art methods with average AUCs of 0.9288, 0.9068 ±0.0031 in global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. 
In the case studies, 9, 19, 37 and 10, 20, 45 out of top-10, 20, 50 candidate microbes were manually verified by previous reports for asthma and inflammatory bowel disease (IBD), respectively. Furthermore, three common human diseases (Crohn's disease, Liver cirrhosis, Type 1 diabetes) were adopted to demonstrate that WMGHMDA could be efficiently applied to make predictions for new diseases. In summary, WMGHMDA has a high potential in predicting microbe-disease associations.
Affiliation(s)
- Yahui Long
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
35
Srivastava D, Baksi KD, Kuntal BK, Mande SS. "EviMass": A Literature Evidence-Based Miner for Human Microbial Associations. Front Genet 2019; 10:849. [PMID: 31616466 PMCID: PMC6763948 DOI: 10.3389/fgene.2019.00849] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/24/2019] [Accepted: 08/14/2019] [Indexed: 12/12/2022] Open
Abstract
The importance of understanding microbe–microbe as well as microbe–disease associations is one of the key thrust areas in human microbiome research. High-throughput metagenomic and transcriptomic projects have fueled discovery of a number of new microbial associations. Consequently, a plethora of information is being added routinely to biomedical literature, thereby contributing toward enhancing our knowledge on microbial associations. In this communication, we present a tool called “EviMass” (Evidence based mining of human Microbial Associations), which can assist biologists to validate their predicted hypotheses from new microbiome studies. Users can interactively query the processed back-end database for microbe–microbe and disease–microbe associations. The EviMass tool can also be used to upload microbial association networks generated from a human “disease–control” microbiome study and validate the associations from biomedical literature. Additionally, a list of differentially abundant microbes for the corresponding disease can be queried in the tool for reported evidences. The results are presented as graphical plots, tabulated summary, and other evidence statistics. EviMass is a comprehensive platform and is expected to enable microbiome researchers not only in mining microbial associations, but also enriching a new research hypothesis. The tool is available free for academic use at https://web.rniapps.net/evimass.
Affiliation(s)
- Divyanshu Srivastava
- Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., Pune, India
- Krishanu D Baksi
- Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., Pune, India
- School of Information Technology, Indian Institute of Technology Delhi, Delhi, India
- Bhusan K Kuntal
- Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., Pune, India
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Sharmila S Mande
- Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., Pune, India
36
Badal VD, Wright D, Katsis Y, Kim HC, Swafford AD, Knight R, Hsu CN. Challenges in the construction of knowledge bases for human microbiome-disease associations. Microbiome 2019; 7:129. [PMID: 31488215 PMCID: PMC6728997 DOI: 10.1186/s40168-019-0742-2] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 01/03/2019] [Accepted: 08/20/2019] [Indexed: 05/05/2023]
Abstract
The last few years have seen tremendous growth in human microbiome research, with a particular focus on the links to both mental and physical health and disease. Medical and experimental settings provide initial sources of information about these links, but individual studies produce disconnected pieces of knowledge bounded in context by the perspective of expert researchers reading full-text publications. Building a knowledge base (KB) consolidating these disconnected pieces is an essential first step to democratize and accelerate the process of accessing the collective discoveries of human disease connections to the human microbiome. In this article, we survey the existing tools and development efforts that have been produced to capture portions of the information needed to construct a KB of all known human microbiome-disease associations and highlight the need for additional innovations in natural language processing (NLP), text mining, taxonomic representations, and field-wide vocabulary standardization in human microbiome research. Addressing these challenges will enable the construction of KBs that help identify new insights amenable to experimental validation and potentially clinical decision support.
Affiliation(s)
- Varsha Dave Badal
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Dustin Wright
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Yannis Katsis
- Scalable Knowledge Intelligence, IBM Research-Almaden, 650 Harry Road, San Jose, CA 95120 USA
- Ho-Cheol Kim
- Scalable Knowledge Intelligence, IBM Research-Almaden, 650 Harry Road, San Jose, CA 95120 USA
- Austin D. Swafford
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Rob Knight
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- UCSD Health Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Chun-Nan Hsu
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
- Department of Neurosciences and Center for Research in Biological Systems, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 USA
37
Niu YW, Qu CQ, Wang GH, Yan GY. RWHMDA: Random Walk on Hypergraph for Microbe-Disease Association Prediction. Front Microbiol 2019; 10:1578. [PMID: 31354672 PMCID: PMC6635699 DOI: 10.3389/fmicb.2019.01578] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/16/2019] [Accepted: 06/25/2019] [Indexed: 12/12/2022] Open
Abstract
Based on advancements in deep sequencing technology and microbiology, increasing evidence indicates that microbes inhabiting humans modulate various host physiological phenomena, thus participating in various disease pathogeneses. Owing to increasing availability of biological data, further studies on the establishment of efficient computational models for predicting potential associations are required. In particular, computational approaches can also reduce the discovery cycle of novel microbe-disease associations and further facilitate disease treatment, drug design, and other scientific activities. This study aimed to develop a model based on the random walk on hypergraph for microbe-disease association prediction (RWHMDA). As a class of higher-order data representation, hypergraph could effectively recover information loss occurring in the normal graph methodology, thus exclusively illustrating multiple pair-wise associations. Integrating known microbe-disease associations in the Human Microbe-Disease Association Database (HMDAD) and the Gaussian interaction profile kernel similarity for microbes, random walk was then implemented for the constructed hypergraph. Consequently, RWHMDA performed optimally in predicting the underlying disease-associated microbes. More specifically, our model displayed AUC values of 0.8898 and 0.8524 in global and local leave-one-out cross-validation (LOOCV), respectively. Furthermore, three human diseases (asthma, Crohn's disease, and type 2 diabetes) were studied to further illustrate prediction performance. Moreover, 8, 10, and 8 of the 10 highest ranked microbes were confirmed through recent experimental or clinical studies. In conclusion, RWHMDA is expected to display promising potential to predict disease-microbe associations for follow-up experimental studies and facilitate the prevention, diagnosis, treatment, and prognosis of complex human diseases.
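The Gaussian interaction profile kernel similarity mentioned above is commonly computed as exp(-γ‖IP(mᵢ) − IP(mⱼ)‖²), where IP(m) is a microbe's binary row in the association matrix and γ is a bandwidth divided by the mean squared norm of all profiles. The Python sketch below follows that common definition; the function name, the bandwidth of 1 and the toy association matrix are assumptions for illustration and are not taken from the RWHMDA paper.

```python
import math


def gip_kernel_similarity(profiles, bandwidth=1.0):
    """Gaussian interaction profile kernel similarity between rows of `profiles`.

    `profiles[i]` is the interaction profile of entity i, for example a 0/1
    vector marking which diseases a microbe is known to be associated with.
    """
    n = len(profiles)
    # Normalise the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are zero; similarity is undefined")
    gamma = bandwidth / mean_sq_norm

    similarity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            similarity[i][j] = math.exp(-gamma * sq_dist)
    return similarity


# Toy association matrix: 3 microbes (rows) x 3 diseases (columns).
toy_profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
toy_similarity = gip_kernel_similarity(toy_profiles)
```

Microbes with identical association patterns get similarity 1, and the value decays as their profiles diverge, which is why the measure can be derived from the known association data alone.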
Affiliation(s)
- Ya-Wei Niu
- School of Mathematics, Shandong University, Jinan, China
- Cun-Quan Qu
- School of Mathematics, Shandong University, Jinan, China
- Data Science Institute, Shandong University, Jinan, China
- Guang-Hui Wang
- School of Mathematics, Shandong University, Jinan, China
- Data Science Institute, Shandong University, Jinan, China
- Gui-Ying Yan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
38
Ai D, Pan H, Li X, Gao Y, Liu G, Xia LC. Identifying Gut Microbiota Associated With Colorectal Cancer Using a Zero-Inflated Lognormal Model. Front Microbiol 2019; 10:826. [PMID: 31068913 PMCID: PMC6491826 DOI: 10.3389/fmicb.2019.00826] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Received: 01/15/2019] [Accepted: 04/01/2019] [Indexed: 12/26/2022] Open
Abstract
Colorectal cancer (CRC) is the third most common cancer worldwide. Its incidence is still increasing, and the mortality rate is high. New therapeutic and prognostic strategies are urgently needed. It became increasingly recognized that the gut microbiota composition differs significantly between healthy people and CRC patients. Thus, identifying the difference between gut microbiota of the healthy people and CRC patients is fundamental to understand these microbes' functional roles in the development of CRC. We studied the microbial community structure of a CRC metagenomic dataset of 156 patients and healthy controls, and analyzed the diversity, differentially abundant bacteria, and co-occurrence networks. We applied a modified zero-inflated lognormal (ZIL) model for estimating the relative abundance. We found that the abundance of genera: Anaerostipes, Bilophila, Catenibacterium, Coprococcus, Desulfovibrio, Flavonifractor, Porphyromonas, Pseudoflavonifractor, and Weissella was significantly different between the healthy and CRC groups. We also found that bacteria such as Streptococcus, Parvimonas, Collinsella, and Citrobacter were uniquely co-occurring within the CRC patients. In addition, we found that the microbial diversity of healthy controls is significantly higher than that of the CRC patients, which indicated a significant negative correlation between gut microbiota diversity and the stage of CRC. Collectively, our results strengthened the view that individual microbes as well as the overall structure of gut microbiota were co-evolving with CRC.
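To make the zero-inflated lognormal idea concrete, the Python sketch below fits a plain textbook version of the model to one taxon's abundances: a point mass at zero plus a lognormal for the positive values. It is not the modified ZIL estimator used in the study above; the function name and the toy abundance values are assumptions for illustration.

```python
import math


def fit_zero_inflated_lognormal(values):
    """Estimate (zero_fraction, mu, sigma) for a zero-inflated lognormal.

    zero_fraction is the share of exact zeros; mu and sigma are the mean and
    (maximum-likelihood) standard deviation of the log of the positive values.
    """
    if not values:
        raise ValueError("no observations")
    if any(v < 0 for v in values):
        raise ValueError("abundances must be non-negative")

    logs = [math.log(v) for v in values if v > 0]
    zero_fraction = (len(values) - len(logs)) / len(values)
    if not logs:
        raise ValueError("no positive observations to fit the lognormal part")

    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return zero_fraction, mu, sigma


def zil_mean(zero_fraction, mu, sigma):
    """Expected abundance implied by the fitted parameters."""
    return (1.0 - zero_fraction) * math.exp(mu + sigma ** 2 / 2.0)


# Toy abundances for one genus across four samples (two samples have none).
toy_abundance = [0.0, 0.0, 1.0, math.exp(2.0)]
toy_fit = fit_zero_inflated_lognormal(toy_abundance)
```

Comparing such per-group fits between healthy and CRC samples is one way differential abundance could be assessed; the statistical test used in the study is not specified in the abstract.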
Affiliation(s)
- Dongmei Ai
- Basic Experimental of Natural Science, University of Science and Technology Beijing, Beijing, China
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, China
- Hongfei Pan
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, China
- Xiaoxin Li
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, China
- Yingxin Gao
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, China
- Gang Liu
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, China
- Li C Xia
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, United States
| |
|
39
|
Qu K, Guo F, Liu X, Lin Y, Zou Q. Application of Machine Learning in Microbiology. Front Microbiol 2019; 10:827. [PMID: 31057526 PMCID: PMC6482238 DOI: 10.3389/fmicb.2019.00827] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Received: 01/31/2019] [Accepted: 04/01/2019] [Indexed: 02/01/2023]
Abstract
Microorganisms are ubiquitous and closely related to people's daily lives. Since they were first discovered in the 19th century, researchers have shown great interest in microorganisms. Microorganisms have traditionally been studied through cultivation, but this method is expensive and time consuming, and it cannot keep pace with the development of high-throughput sequencing technology. To deal with this problem, machine learning (ML) methods have been widely applied to the field of microbiology. Literature reviews have shown that ML can be used in many aspects of microbiology research, especially classification problems, and for exploring the interaction between microorganisms and the surrounding environment. In this study, we summarize the application of ML in microbiology.
Affiliation(s)
- Kaiyang Qu
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Fei Guo
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Xiangrong Liu
- School of Information Science and Technology, Xiamen University, Xiamen, China
| | - Yuan Lin
- School of Information Science and Technology, Xiamen University, Xiamen, China
- Department of System Integration, Sparebanken Vest, Bergen, Norway
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
|
40
|
Li H, Wang Y, Jiang J, Zhao H, Feng X, Zhao B, Wang L. A Novel Human Microbe-Disease Association Prediction Method Based on the Bidirectional Weighted Network. Front Microbiol 2019; 10:676. [PMID: 31024478 PMCID: PMC6465552 DOI: 10.3389/fmicb.2019.00676] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 12/19/2018] [Accepted: 03/18/2019] [Indexed: 12/12/2022]
Abstract
The survival of human beings is inseparable from microbes. More and more studies have shown that microbes can affect human physiological processes in various ways and are closely related to some human diseases. In this paper, based on known microbe-disease associations, a bidirectional weighted network was first constructed by integrating the schemes of normalized Gaussian interactions and bidirectional recommendations. Then, based on the newly constructed bidirectional network, a computational model called BWNMHMDA was developed to predict potential relationships between microbes and diseases. Finally, to evaluate the new prediction model BWNMHMDA, the frameworks of LOOCV and 5-fold cross validation were implemented, and simulation results indicated that BWNMHMDA could achieve reliable AUCs of 0.9127 and 0.8967 ± 0.0027 in these two frameworks respectively, outperforming some state-of-the-art methods. Moreover, case studies of asthma, colorectal carcinoma, and chronic obstructive pulmonary disease were implemented to further estimate the performance of BWNMHMDA. Experimental results showed that 10, 9, and 8 out of the top 10 predicted microbes have been confirmed by related literature in these three case studies respectively, which also demonstrated that the new model BWNMHMDA could achieve satisfactory prediction performance.
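The AUC figures quoted in this abstract summarize how well held-out associations are ranked above unconfirmed pairs. As a minimal sketch of that metric only (not the BWNMHMDA model or its cross-validation pipeline), the AUC can be computed from scores and binary labels with the pairwise rank identity. The function name `auc_from_scores` and the toy scores below are hypothetical illustrations, not data from the paper.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    AUC equals the probability that a randomly chosen positive is scored
    above a randomly chosen negative, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: two held-out known associations (label 1)
# ranked against two unconfirmed pairs (label 0).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)  # 3 of 4 pairs ordered correctly: 0.75
```

In a leave-one-out setting, each known association would be hidden in turn, the model re-scored, and the hidden pair ranked against the candidates; the quadratic pairwise loop here is chosen for clarity rather than speed.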
Affiliation(s)
- Hao Li
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Yuqi Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Jingwu Jiang
- Clinical Lab, Yongcheng People's Hospital, Shangqiu, China
| | - Haochen Zhao
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Xiang Feng
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Bihai Zhao
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Lei Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| |
|
41
|
Wang L, Wang Y, Li H, Feng X, Yuan D, Yang J. A Bidirectional Label Propagation Based Computational Model for Potential Microbe-Disease Association Prediction. Front Microbiol 2019; 10:684. [PMID: 31024481 PMCID: PMC6465563 DOI: 10.3389/fmicb.2019.00684] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/21/2019] [Accepted: 03/19/2019] [Indexed: 12/12/2022]
Abstract
A growing number of clinical observations have indicated that microbes are involved in a variety of important human diseases. In-depth investigation of correlations between microbes and diseases will greatly benefit the prevention, early diagnosis, and prognosis of diseases. Hence, in this paper, based on known microbe-disease associations, a prediction model called NBLPIHMDA was proposed to infer potential microbe-disease associations. Specifically, two kinds of networks, the disease similarity network and the microbe similarity network, were first constructed based on the Gaussian interaction profile kernel similarity. Bidirectional label propagation was then applied on these two networks to predict potential microbe-disease associations. We applied NBLPIHMDA to the Human Microbe-Disease Association database (HMDAD), and compared it with three other recently published methods, LRLSHMDA, BiRWMP, and KATZHMDA, based on leave-one-out cross validation and 5-fold cross validation, respectively. As a result, the areas under the receiver operating characteristic curve (AUCs) achieved by NBLPIHMDA were 0.8777 and 0.8958 ± 0.0027, respectively, outperforming the compared methods. In addition, in case studies of asthma, colorectal carcinoma, and chronic obstructive pulmonary disease, simulation results illustrated that 10, 10, and 8 out of the top 10 predicted microbes have been confirmed by published evidence, which further demonstrated that NBLPIHMDA is promising for predicting novel associations between diseases and microbes.
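The Gaussian interaction profile (GIP) kernel similarity mentioned here, and in several neighbouring abstracts, can be sketched as follows. This assumes the commonly used definition, in which the bandwidth is normalized by the mean squared norm of the interaction profiles; the abstract does not state the exact formula or bandwidth used by NBLPIHMDA, so `gamma_prime=1.0` and the toy matrix are illustrative assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a binary matrix.

    profiles[i] is the interaction profile of entity i (for example, one
    microbe's 0/1 associations with every disease). The bandwidth
    normalization by the mean squared profile norm is an assumed,
    commonly used convention.
    """
    norms = [sum(v * v for v in row) for row in profiles]
    mean_norm = sum(norms) / len(norms)
    if mean_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_norm
    n = len(profiles)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist2)
    return kernel


# Hypothetical association matrix: 3 microbes (rows) x 3 diseases (columns).
toy_profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
toy_kernel = gip_kernel(toy_profiles)
```

Microbes with identical profiles get similarity 1, and similarity decays with the number of differing associations. Applying the same function to the transposed matrix would give the disease-side similarity.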
Affiliation(s)
- Lei Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Yuqi Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Hao Li
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Xiang Feng
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Dawei Yuan
- Geneis Beijing Co., Ltd., Beijing, China
| | | |
|
42
|
Qu J, Zhao Y, Yin J. Identification and Analysis of Human Microbe-Disease Associations by Matrix Decomposition and Label Propagation. Front Microbiol 2019; 10:291. [PMID: 30863376 PMCID: PMC6399478 DOI: 10.3389/fmicb.2019.00291] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/30/2018] [Accepted: 02/04/2019] [Indexed: 12/12/2022]
Abstract
Studies have shown that microbes exist widely in the human body and are closely related to complex human diseases. Predicting potential associations between microbes and diseases is conducive to understanding the mechanisms of complex diseases and can also facilitate the diagnosis and prevention of human diseases. In this paper, we put forward Matrix Decomposition and Label Propagation for Human Microbe-Disease Association prediction (MDLPHMDA), based on the dataset of known microbe-disease associations collected from the HMDAD database, together with the Gaussian interaction profile kernel similarity for diseases and microbes and the disease symptom similarity. Moreover, the performance of our model was evaluated by means of leave-one-out cross validation and five-fold cross validation, and the corresponding AUCs of 0.9034 and 0.8954 ± 0.0030 were obtained, respectively. In case studies, 10, 9, 9, and 8 out of the top 10 predicted microbes for asthma, colorectal carcinoma, liver cirrhosis, and type 1 diabetes were confirmed by the literature, respectively. Overall, evaluation results showed that MDLPHMDA has good performance in potential microbe-disease association prediction.
Affiliation(s)
- Jia Qu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| |
|
43
|
Predicting the associations between microbes and diseases by integrating multiple data sources and path-based HeteSim scores. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.09.054] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/19/2022]
|
44
|
Zhou S, Ren X, Yang J, Jin Q. Evaluating the Value of Defensins for Diagnosing Secondary Bacterial Infections in Influenza-Infected Patients. Front Microbiol 2018; 9:2762. [PMID: 30524393 PMCID: PMC6256186 DOI: 10.3389/fmicb.2018.02762] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/30/2018] [Accepted: 10/29/2018] [Indexed: 11/13/2022]
Abstract
Acute respiratory infections by influenza viruses are common causes of severe pneumonia, which can further deteriorate if secondary bacterial infections occur. Although the viral and bacterial agents are quite diverse, defensins, a set of antimicrobial peptides expressed by the host, may provide promising biomarkers that would greatly improve diagnosis and treatment. We examined the correlations between the gene expression levels of defensins and the viral and bacterial loads in the blood in a longitudinal, precision-medicine study of a severe pneumonia patient infected by influenza A H7N9 virus. We found that DEFA5 is positively correlated with the blood load of influenza A H7N9 virus (r = 0.735, p < 0.05, Spearman correlation). DEFB116 and DEFB127 are positively correlated, and DEFB108B and DEFB114 negatively correlated, with the bacterial load. The diagnostic potential of defensins to discriminate bacterial and viral infections was then evaluated on an independent dataset with 61 bacterial pneumonia patients and 39 viral pneumonia patients infected by influenza A viruses, and reached 93% accuracy. Expression levels of defensins in the blood may be of important diagnostic value in the clinic for indicating viral and bacterial infections.
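The Spearman correlation reported here is the Pearson correlation computed on ranks. A minimal sketch, assuming tied values receive their average rank, is given below; the function names and the toy series are hypothetical and are not the study's measurements.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical longitudinal series: gene expression vs. pathogen load.
toy_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_load = [2.0, 4.0, 8.0, 16.0, 32.0]
toy_rho = spearman(toy_expression, toy_load)  # monotone increasing, so rho is 1
```

Because only the ordering matters, any monotone relationship gives a coefficient of magnitude 1 even when it is far from linear, which suits expression and load measurements on very different scales.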
Affiliation(s)
- Siyu Zhou
- MOH Key Laboratory of Systems Biology of Pathogens, Peking Union Medical College, Institute of Pathogen Biology, Chinese Academy of Medical Sciences, Beijing, China
| | - Xianwen Ren
- BIOPIC, School of Life Sciences, Peking University, Beijing, China
| | - Jian Yang
- MOH Key Laboratory of Systems Biology of Pathogens, Peking Union Medical College, Institute of Pathogen Biology, Chinese Academy of Medical Sciences, Beijing, China
| | - Qi Jin
- MOH Key Laboratory of Systems Biology of Pathogens, Peking Union Medical College, Institute of Pathogen Biology, Chinese Academy of Medical Sciences, Beijing, China
| |
|
45
|
He BS, Peng LH, Li Z. Human Microbe-Disease Association Prediction With Graph Regularized Non-Negative Matrix Factorization. Front Microbiol 2018; 9:2560. [PMID: 30443240 PMCID: PMC6223245 DOI: 10.3389/fmicb.2018.02560] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 08/16/2018] [Accepted: 10/08/2018] [Indexed: 12/12/2022]
Abstract
A microbe is a microscopic organism which may exist in its single-celled form or in a colony of cells. In recent years, a growing number of researchers have worked on uncovering microbe-disease associations, since microbes are found to be closely related to the prevention, diagnosis, and treatment of many complex human diseases. As an effective supplement to traditional experiments, more and more computational models based on various algorithms have been proposed for microbe-disease association prediction to improve efficiency and reduce cost. In this work, we developed a novel predictive model of Graph Regularized Non-negative Matrix Factorization for Human Microbe-Disease Association prediction (GRNMFHMDA). Initially, microbe similarity and disease similarity were constructed on the basis of the symptom-based disease similarity and Gaussian interaction profile kernel similarity for microbes and diseases. Subsequently, we utilized a preprocessing step in which unknown microbe-disease pairs were assigned association likelihood scores to avoid a possible negative impact on the prediction performance. Finally, we implemented a graph regularized non-negative matrix factorization framework to identify potential associations for all diseases simultaneously. To assess the performance of our model, cross validations including global leave-one-out cross validation (LOOCV) and local LOOCV were implemented. The AUCs of 0.8715 (global LOOCV) and 0.7898 (local LOOCV) indicated the reliable performance of our computational model. In addition, we carried out two types of case studies on three different human diseases to further analyze the prediction performance of GRNMFHMDA, in which most of the top 10 predicted disease-related microbes were verified by the HMDAD database or experimental literature.
Affiliation(s)
- Bin-Sheng He
- The First Affiliated Hospital, Changsha Medical University, Changsha, China
| | - Li-Hong Peng
- School of Information Engineering, Changsha Medical University, Changsha, China
| | - Zejun Li
- College of Information Science and Engineering, Hunan University, Changsha, China
- School of Computer and Information Science, Hunan Institute of Technology, Hengyang, China
| |
|
46
|
Li W, Yuan Y, Xia Y, Sun Y, Miao Y, Ma S. A Cross-Scale Neutral Theory Approach to the Influence of Obesity on Community Assembly of Human Gut Microbiome. Front Microbiol 2018; 9:2320. [PMID: 30420838 PMCID: PMC6215851 DOI: 10.3389/fmicb.2018.02320] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/31/2018] [Accepted: 09/11/2018] [Indexed: 12/23/2022]
Abstract
Background: The implications of the gut microbiome for obesity have been extensively investigated in recent years, although the exact mechanism is still unclear. The question of whether obesity influences gut microbiome assembly has not been addressed. The question is significant because it is fundamental for investigating the diversity maintenance and stability of the gut microbiome, and the latter should hold a key to understanding the etiological implications of the gut microbiome for obesity. Methods: In this study, we adopt a dual neutral theory modeling strategy to address this question from both species and community perspectives, with both discrete and continuous neutral theory models. The first neutral theory model we apply is Hubbell's neutral theory of biodiversity, which has been extensively tested in the macro-ecology of plants and animals, and the second is Sloan's neutral theory model, which was developed particularly for microbial communities based on metagenomic sequencing data. The two neutral models are complementary and, integrated together, offer a comprehensive approach to more accurately revealing the possible influence of obesity on gut microbiome assembly. This is not only because the focus of the two models is different (community vs. species), but also because they adopt two different modeling strategies (discrete vs. continuous). Results: We test both neutral theory models with datasets from Turnbaugh et al. (2009). Our tests showed that the species abundance distributions of more than half of the species (59-69%) in the gut microbiome satisfied the prediction of Sloan's neutral theory, although at the community level, the number of communities that satisfied Hubbell's neutral theory was negligible (2 out of 278).
Conclusion: The apparently contradictory findings above suggest that both stochastic neutral effects and deterministic environmental (host) factors play important roles in shaping the assembly and diversity of the gut microbiome. Furthermore, obesity may just be one of the host factors, but its influence may not be strong enough to tip the balance between stochastic and deterministic forces that shape the community assembly. Finally, the apparent contradiction between the two neutral theories should not be surprising given that nearly 30-40% of species still do not obey the neutral law.
Affiliation(s)
- Wendy Li
- Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Yali Yuan
- Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- College of Clinical Medicine, Lanzhou University, Lanzhou, China
| | - Yao Xia
- Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China
| | - Yang Sun
- Department of Gastroenterology, The First Affiliated Hospital of Kunming Medical University, Yunnan Institute of Digestive Disease, Kunming, China
| | - Yinglei Miao
- Department of Gastroenterology, The First Affiliated Hospital of Kunming Medical University, Yunnan Institute of Digestive Disease, Kunming, China
| | - Sam Ma
- Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| |
|
47
|
Peng LH, Yin J, Zhou L, Liu MX, Zhao Y. Human Microbe-Disease Association Prediction Based on Adaptive Boosting. Front Microbiol 2018; 9:2440. [PMID: 30356751 PMCID: PMC6189371 DOI: 10.3389/fmicb.2018.02440] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 08/19/2018] [Accepted: 09/24/2018] [Indexed: 12/13/2022]
Abstract
There are countless microbes in the human body, and they play various roles in physiological processes. There is growing evidence that microbes are closely associated with human diseases. Researching disease-related microbes helps us understand the mechanisms of diseases and provides new strategies for disease diagnosis and treatment. Many computational models have been proposed to predict disease-related microbes. In this paper, we developed a model of Adaptive Boosting for Human Microbe-Disease Association prediction (ABHMDA) to reveal the associations between diseases and microbes by calculating the relation probability of each disease-microbe pair using a strong classifier. Our model could be applied to new diseases without any known related microbes. To assess the prediction power of the model, global and local leave-one-out cross validation (LOOCV) were implemented. As shown in the results, the global and local LOOCV AUC values reached 0.8869 and 0.7910, respectively. Moreover, 10, 10, and 8 out of the top 10 microbes predicted to be most likely associated with asthma, colorectal carcinoma, and type 1 diabetes, respectively, were verified by relevant literature or the HMDAD database. The above results verify the superior predictive performance of ABHMDA.
Affiliation(s)
- Li-Hong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Ming-Xi Liu
- Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
| | - Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| |
|
48
|
Novel human microbe-disease associations inference based on network consistency projection. Sci Rep 2018; 8:8034. [PMID: 29795313 PMCID: PMC5966389 DOI: 10.1038/s41598-018-26448-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/27/2017] [Accepted: 01/09/2018] [Indexed: 02/06/2023]
Abstract
Increasing evidence shows that microbes are closely related to various human diseases. Obtaining a comprehensive and detailed understanding of the relationships between microbes and diseases would not only be beneficial to disease prevention, diagnosis and prognosis, but also would lead to the discovery of new drugs. However, because of a lack of data, little effort has been made to predict novel microbe-disease associations. To date, few methods have been proposed to solve the problem. In this study, we developed a new computational model based on network consistency projection to infer novel human microbe-disease associations (NCPHMDA) by integrating Gaussian interaction profile kernel similarity of microbes and diseases, and symptom-based disease similarity. NCPHMDA is a non-parametric and global network based model that combines microbe space projection and disease space projection to achieve the final prediction. Experimental results demonstrated that the integrated space projection of microbes and diseases, and symptom-based disease similarity played roles in the model performance. Cross validation frameworks and case studies further illustrated the superior predictive performance over other methods.
|
49
|
Wu C, Gao R, Zhang D, Han S, Zhang Y. PRWHMDA: Human Microbe-Disease Association Prediction by Random Walk on the Heterogeneous Network with PSO. Int J Biol Sci 2018; 14:849-857. [PMID: 29989079 PMCID: PMC6036753 DOI: 10.7150/ijbs.24539] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 12/24/2017] [Accepted: 02/28/2018] [Indexed: 12/24/2022]
Abstract
Microorganisms residing in the human body play a vital role in metabolism, immune defense, nutrient absorption, cancer control and protection against pathogen colonization. Changes in microbial communities can cause human diseases. Based on the known microbe-disease associations, we presented a novel computational model employing Random Walk with Restart optimized by Particle Swarm Optimization (PSO) on the heterogeneous interlinked network of Human Microbe-Disease Associations (PRWHMDA) (see Figure 1). Based on the known human microbe-disease associations, we constructed the heterogeneous interlinked network with cosine similarity. The extended random walk with restart (RWR) method was derived to obtain the potential microbe-disease associations. PSO was utilized to obtain the optimal parameters of RWR. To evaluate the prediction effectiveness, we performed leave-one-out cross validation (LOOCV) and 5-fold cross validation (CV), which yielded an AUC (area under the ROC curve) of 0.915 (LOOCV) and an average AUC of 0.8875 ± 0.0046 (5-fold CV). Moreover, we carried out three case studies of asthma, inflammatory bowel disease (IBD) and type 1 diabetes (T1D) for further evaluation. The results showed that 10, 10 and 9 of the top 10 predicted microbes, respectively, were verified by previously published experimental results. It is anticipated that PRWHMDA can be effective in identifying disease-related microbes and may be helpful in disclosing the relationship between microorganisms and their human host.
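The random walk with restart (RWR) step named in this abstract can be illustrated on a single small network. This is a generic RWR sketch, not the extended heterogeneous-network variant of PRWHMDA: the restart probability of 0.7 is an arbitrary placeholder (the paper tunes its parameters with PSO, which is not reproduced here), and the three-node graph is hypothetical.

```python
def random_walk_with_restart(adj, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol.

    adj is a symmetric non-negative adjacency matrix (list of lists) and
    W is its column-normalized transition matrix. restart (r) is an
    assumed value, not one taken from the paper.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    p0[seed] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical path graph 0 - 1 - 2, with the walk restarting at node 0.
toy_adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
toy_steady = random_walk_with_restart(toy_adj, seed=0)
```

The stationary probabilities decrease with distance from the seed node, so ranking the non-seed nodes by their score gives a proximity-based ordering of candidates.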
Affiliation(s)
- Chuanyan Wu
- School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| | - Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| | - Daoliang Zhang
- School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| | - Shiyun Han
- General Clinic, The No. 2 People's Hospital of Tianqiao, Jinan, 250032, China
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
| |
|
50
|
Abstract
The genome-scale cellular network has become a necessary tool in the systematic analysis of microbes. In a cell, there are several layers (i.e., types) of molecular networks, for example, the genome-scale metabolic network (GMN), transcriptional regulatory network (TRN), and signal transduction network (STN). It has been recognized that predictions based on only a single-layer network are limited and inaccurate. Therefore, the integrated network constructed from the networks of the three types has attracted increasing interest. The function of a biological process in living cells is usually performed by the interaction of biological components. Therefore, it is necessary to integrate and analyze all the related components at the systems level to comprehensively and correctly understand physiological function in living organisms. In this review, we discussed three representative genome-scale cellular networks: GMN, TRN, and STN, representing different levels (i.e., metabolism, gene regulation, and cellular signaling) of a cell's activities. Furthermore, we discussed the integration of the networks of the three types. With a growing understanding of the complexity of microbial cells, the development of integrated networks has become an inevitable trend in analyzing genome-scale cellular networks of microorganisms.
Affiliation(s)
- Tong Hao
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
| | - Dan Wu
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
| | - Lingxuan Zhao
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
| | - Qian Wang
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
| | - Edwin Wang
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Jinsheng Sun
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Tianjin Bohai Fisheries Research Institute, Tianjin, China
| |
|