1
Li X, Liang Z. Causal effect of gut microbiota on pancreatic cancer: A Mendelian randomization and colocalization study. J Cell Mol Med 2024; 28:e18255. [PMID: 38526030 PMCID: PMC10962122 DOI: 10.1111/jcmm.18255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 03/04/2024] [Accepted: 03/07/2024] [Indexed: 03/26/2024] Open
Abstract
The causal relationship between gut microbiota (GM) and pancreatic cancer (PC) remains unclear. This study aimed to investigate the potential genes underlying this mechanism. GM genome-wide association study (GWAS) summary data were from the MiBioGen consortium. PC GWAS data were from the National Human Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI) GWAS Catalogue. To detect the causal relationship between GM and PC, we implemented three complementary Mendelian randomization (MR) methods: inverse variance weighting (IVW), MR-Egger and weighted median, followed by sensitivity analyses. Furthermore, we integrated GM GWAS data with blood cis-expression quantitative trait loci (eQTLs) and blood cis-DNA methylation QTLs (mQTLs) using summary data-based Mendelian randomization (SMR) methods. This integration aimed to prioritize potential GM-affecting genes through SMR analysis of two molecular traits. PC cis-eQTLs and cis-mQTLs were summarized from The Cancer Genome Atlas (TCGA) data. Through colocalization analysis of GM cis-QTLs and PC cis-QTLs data, we identified common genes that influence both GM and PC. Our study found a causal association between GM and PC, including four protective and five risk-associated GM (IVW, p < 0.05). No significant heterogeneity of instrumental variables (IVs) or horizontal pleiotropy was found. The gene SVBP was identified as a GM-affecting gene using SMR analysis of two molecular traits (FDR < 0.05, P_HEIDI > 0.05). Additionally, two genes, MCM6 and RPS26, were implicated in the interaction between GM and PC based on colocalization analysis (PPH4 > 0.5). In summary, this study provides evidence for future research aimed at developing suitable therapeutic interventions and disease prevention.
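The IVW method named in this abstract can be illustrated with a minimal sketch. It assumes the standard fixed-effect IVW formula applied to per-SNP summary statistics; the function name `ivw_estimate` and the toy numbers are hypothetical and are not taken from the study, which does not publish its code here.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate.

    Each SNP j contributes a Wald ratio beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2.
    Returns (causal estimate, standard error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per SNP")
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / (se * se)          # inverse variance of the outcome effect
        numerator += weight * bx * by
        denominator += weight * bx * bx
    if denominator == 0.0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy data (hypothetical): both SNPs imply a ratio of 0.5.
beta, se = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.02])
```

MR-Egger and the weighted median, the other two methods listed, relax the assumption that every instrument is valid and are not covered by this sketch.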
Affiliation(s)
- Xin Li
- Department of Gastroenterology, The First Affiliated Hospital, Guangxi Medical University, Nanning, China
- Zhihai Liang
- Department of Gastroenterology, The First Affiliated Hospital, Guangxi Medical University, Nanning, China
2
Chen R, Xie G, Lin Z, Gu G, Yu Y, Yu J, Liu Z. Predicting Microbe-Disease Associations Based on a Linear Neighborhood Label Propagation Method with Multi-order Similarity Fusion Learning. Interdiscip Sci 2024:10.1007/s12539-024-00607-0. [PMID: 38436840 DOI: 10.1007/s12539-024-00607-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/04/2024] [Accepted: 01/05/2024] [Indexed: 03/05/2024]
Abstract
Computational approaches employed for predicting potential microbe-disease associations often rely on similarity information between microbes and diseases. Therefore, it is important to obtain reliable similarity information by integrating multiple types of similarity information. However, existing similarity fusion methods do not consider multi-order fusion of similarity networks. To address this problem, a novel method of linear neighborhood label propagation with multi-order similarity fusion learning (MOSFL-LNP) is proposed to predict potential microbe-disease associations. Multi-order fusion learning comprises two parts: low-order global learning and high-order feature learning. Low-order global learning is used to obtain common latent features from multiple similarity sources. High-order feature learning relies on the interactions between neighboring nodes to identify high-order similarities and learn deeper interactive network structures. Coefficients are assigned to different high-order feature learning modules to balance the similarities learned from different orders and enhance the robustness of the fusion network. Overall, by combining low-order global learning with high-order feature learning, multi-order fusion learning can capture both the shared and unique features of different similarity networks, leading to more accurate predictions of microbe-disease associations. In comparison to six other advanced methods, MOSFL-LNP exhibits superior prediction performance in the leave-one-out cross-validation and 5-fold validation frameworks. In the case study, the predicted 10 microbes associated with asthma and type 1 diabetes have an accuracy rate of up to 90% and 100%, respectively.
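The label propagation step mentioned in this abstract can be sketched as follows. This is not the MOSFL-LNP implementation: it assumes a plain row-normalized similarity matrix in place of the linear-neighborhood reconstruction weights and the multi-order fusion the authors describe, and the names `row_normalize` and `propagate_labels`, the value of `alpha`, and the toy matrices are all hypothetical.

```python
def row_normalize(similarity):
    """Scale each row of a similarity matrix to sum to 1 (rows of zeros stay zero)."""
    normalized = []
    for row in similarity:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def propagate_labels(weights, labels, alpha=0.5, tol=1e-10, max_iter=1000):
    """Iterate F <- alpha * W F + (1 - alpha) * Y until the scores stop changing.

    weights: n x n row-normalized similarity matrix W (e.g. microbe-microbe).
    labels:  n x m known association matrix Y (e.g. microbes x diseases).
    """
    n_nodes = len(weights)
    n_labels = len(labels[0])
    scores = [list(row) for row in labels]
    for _ in range(max_iter):
        updated = [
            [
                alpha * sum(weights[i][k] * scores[k][j] for k in range(n_nodes))
                + (1 - alpha) * labels[i][j]
                for j in range(n_labels)
            ]
            for i in range(n_nodes)
        ]
        change = max(
            abs(updated[i][j] - scores[i][j])
            for i in range(n_nodes)
            for j in range(n_labels)
        )
        scores = updated
        if change < tol:
            break
    return scores


# Toy example (hypothetical): microbes 0 and 1 are similar; only microbe 0
# has a known association with the single disease.
similarity = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.0]]
known = [[1.0], [0.0], [0.0]]
scores = propagate_labels(row_normalize(similarity), known)
```

In this toy case the unlabeled microbe that is most similar to the labeled one receives the higher propagated score, which is the behavior similarity-based predictors of this kind rely on.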
Affiliation(s)
- Ruibin Chen
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Guobo Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhiyi Lin
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Guosheng Gu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Yi Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Junrui Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhenguo Liu
- Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510080, China
3
Chen Z, Zhang L, Li J, Fu M. MLFLHMDA: predicting human microbe-disease association based on multi-view latent feature learning. Front Microbiol 2024; 15:1353278. [PMID: 38371933 PMCID: PMC10869561 DOI: 10.3389/fmicb.2024.1353278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024] Open
Abstract
Introduction: A growing body of research indicates that microorganisms play a crucial role in human health. Imbalances in microbial communities are closely linked to human diseases, and identifying potential relationships between microbes and diseases can help elucidate the pathogenesis of diseases. However, traditional methods based on biological or clinical experiments are costly, so the use of computational models to predict potential microbe-disease associations is of great importance.
Methods: In this paper, we present a novel computational model called MLFLHMDA, which is based on a Multi-View Latent Feature Learning approach to predict Human potential Microbe-Disease Associations. Specifically, we compute Gaussian interaction profile kernel similarity between diseases and microbes based on the known microbe-disease associations from the Human Microbe-Disease Association Database and perform a preprocessing step on the resulting microbe-disease association matrix, namely, weighting K nearest known neighbors (WKNKN) to reduce the sparsity of the microbe-disease association matrix. To obtain unobserved associations in the microbe and disease views, we extract different latent features based on the geometrical structure of microbes and diseases, and project multi-modal latent features into a common subspace. Next, we introduce graph regularization to preserve the local manifold structure of Gaussian interaction profile kernel similarity and add L_{p,q}-norms to the projection matrix to ensure the interpretability and sparsity of the model.
Results: The AUC values for global leave-one-out cross-validation and 5-fold cross-validation implemented by MLFLHMDA are 0.9165 and 0.8942 ± 0.0041, respectively, which perform better than other existing methods. In addition, case studies of different diseases have demonstrated the superiority of the predictive power of MLFLHMDA.
The source code of our model and the data are available on https://github.com/LiangzheZhang/MLFLHMDA_master.
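The Gaussian interaction profile (GIP) kernel similarity used in the Methods can be sketched as below. This follows the commonly used definition, in which the bandwidth is a parameter divided by the mean squared norm of the interaction profiles; the function name `gip_kernel`, the default `gamma_prime=1.0`, and the toy association matrix are assumptions, and the authors' repository should be consulted for their actual settings.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity.

    profiles: one row per entity (e.g. a microbe), holding its 0/1 association
    profile over the other entity type (e.g. diseases).
    K[i][j] = exp(-gamma * ||profile_i - profile_j||^2), where
    gamma = gamma_prime / mean(||profile_i||^2).
    """
    squared_norms = [sum(v * v for v in row) for row in profiles]
    mean_norm = sum(squared_norms) / len(squared_norms)
    if mean_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_norm
    n = len(profiles)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy association matrix (hypothetical): 3 microbes x 3 diseases.
associations = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
microbe_similarity = gip_kernel(associations)
```

Applying the same function to the transposed association matrix would give the disease-side similarity.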
4
Zhao Y, Yin J, Zhang L, Zhang Y, Chen X. Drug-drug interaction prediction: databases, web servers and computational models. Brief Bioinform 2023; 25:bbad445. [PMID: 38113076 PMCID: PMC10782925 DOI: 10.1093/bib/bbad445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 10/26/2023] [Accepted: 11/14/2023] [Indexed: 12/21/2023] Open
Abstract
In clinical treatment, two or more drugs (i.e. drug combination) are simultaneously or successively used for therapy with the purpose of primarily enhancing the therapeutic efficacy or reducing drug side effects. However, inappropriate drug combination may not only fail to improve efficacy, but even lead to adverse reactions. Therefore, according to the basic principle of improving the efficacy and/or reducing adverse reactions, we should study drug-drug interactions (DDIs) comprehensively and thoroughly so as to reasonably use drug combination. In this review, we first introduced the basic conception and classification of DDIs. Further, some important publicly available databases and web servers about experimentally verified or predicted DDIs were briefly described. As an effective auxiliary tool, computational models for predicting DDIs can not only save the cost of biological experiments, but also provide relevant guidance for combination therapy to some extent. Therefore, we summarized three types of prediction models (including traditional machine learning-based models, deep learning-based models and score function-based models) proposed during recent years and discussed the advantages as well as limitations of them. Besides, we pointed out the problems that need to be solved in the future research of DDIs prediction and provided corresponding suggestions.
Affiliation(s)
- Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Yong Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Xing Chen
- School of Science, Jiangnan University, Wuxi 214122, China
5
Xiang H, Guo R, Liu L, Guo T, Huang Q. MSIF-LNP: microbial and human health association prediction based on matrix factorization noise reduction for similarity fusion and bidirectional linear neighborhood label propagation. Front Microbiol 2023; 14:1216811. [PMID: 37389340 PMCID: PMC10303805 DOI: 10.3389/fmicb.2023.1216811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 05/25/2023] [Indexed: 07/01/2023] Open
Abstract
Studies have shown that microbes are closely related to human health. Clarifying the relationship between microbes and diseases that cause health problems can provide new solutions for the treatment, diagnosis, and prevention of diseases, and provide strong protection for human health. Currently, more and more similarity fusion methods are available to predict potential microbe-disease associations. However, existing methods have noise problems in the process of similarity fusion. To address this issue, we propose a method called MSIF-LNP that can efficiently and accurately identify potential connections between microbes and diseases, and thus clarify the relationship between microbes and human health. This method is based on matrix factorization denoising similarity fusion (MSIF) and bidirectional linear neighborhood propagation (LNP) techniques. First, we use non-linear iterative fusion to obtain a similarity network for microbes and diseases by fusing the initial microbe and disease similarities, and then reduce noise by using matrix factorization. Next, we use the initial microbe-disease association pairs as label information to perform linear neighborhood label propagation on the denoised similarity network of microbes and diseases. This enables us to obtain a score matrix for predicting microbe-disease relationships. We evaluate the predictive performance of MSIF-LNP and seven other advanced methods through 10-fold cross-validation, and the experimental results show that MSIF-LNP outperformed the other seven methods in terms of AUC. In addition, case analyses of cystic fibrosis and obesity further demonstrate the predictive ability of this method in practical applications.
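Several abstracts in this list, including this one, compare methods by AUC. As a minimal sketch of what that metric measures, the function below computes the area under the ROC curve as the fraction of positive-negative pairs ranked correctly, with ties counted as half. The name `roc_auc` and the toy scores are hypothetical; this pairwise form is chosen for clarity rather than speed and says nothing about how the authors computed their values.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted association scores.
    labels: 1 for a known (held-out) association, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0      # positive ranked above negative
            elif p == q:
                correct += 0.5      # tie counts as half
    return correct / (len(positives) * len(negatives))


# Toy example (hypothetical): three of the four positive-negative pairs
# are ordered correctly.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In a k-fold setting, a value like this would be computed on each held-out fold and then averaged.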
Affiliation(s)
- Hui Xiang
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
- Rong Guo
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
- Li Liu
- College of Physical Education, Suzhou University, Suzhou, Anhui, China
- Tengjie Guo
- College of Physical Education, Yunnan Normal University, Kunming, Yunnan, China
- Quan Huang
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
6
Shen K, Din AU, Sinha B, Zhou Y, Qian F, Shen B. Translational informatics for human microbiota: data resources, models and applications. Brief Bioinform 2023; 24:7152256. [PMID: 37141135 DOI: 10.1093/bib/bbad168] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/26/2022] [Revised: 04/07/2023] [Accepted: 04/11/2023] [Indexed: 05/05/2023] Open
Abstract
With the rapid development of human intestinal microbiology and diverse microbiome-related studies and investigations, a large amount of data have been generated and accumulated. Meanwhile, different computational and bioinformatics models have been developed for pattern recognition and knowledge discovery using these data. Given the heterogeneity of these resources and models, we aimed to provide a landscape of the data resources, a comparison of the computational models and a summary of the translational informatics applied to microbiota data. We first review the existing databases, knowledge bases, knowledge graphs and standardizations of microbiome data. Then, the high-throughput sequencing techniques for the microbiome and the informatics tools for their analyses are compared. Finally, translational informatics for the microbiome, including biomarker discovery, personalized treatment and smart healthcare for complex diseases, are discussed.
Affiliation(s)
- Ke Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
- Ahmad Ud Din
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
- Baivab Sinha
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
- Yi Zhou
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
- Fuliang Qian
- Center for Systems Biology, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Suzhou 215123, China
- Bairong Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
7
Gu C, Li X. Prediction of disease-related miRNAs by voting with multiple classifiers. BMC Bioinformatics 2023; 24:177. [PMID: 37122001 PMCID: PMC10150488 DOI: 10.1186/s12859-023-05308-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Accepted: 04/26/2023] [Indexed: 05/02/2023] Open
Abstract
There is strong evidence to support that mutations and dysregulation of miRNAs are associated with a variety of diseases, including cancer. However, the experimental methods used to identify disease-related miRNAs are expensive and time-consuming. Effective computational approaches to identify disease-related miRNAs are in high demand and would aid in the detection of miRNA biomarkers for disease diagnosis, treatment, and prevention. In this study, we develop an ensemble learning framework to reveal the potential associations between miRNAs and diseases (ELMDA). The ELMDA framework does not rely on the known associations when calculating miRNA and disease similarities and uses multi-classifier voting to predict disease-related miRNAs. As a result, the average AUC of the ELMDA framework was 0.9229 for the HMDD v2.0 database in a fivefold cross-validation. All potential associations in the HMDD v2.0 database were predicted, and 90% of the top 50 results were verified with the updated HMDD v3.2 database. The ELMDA framework was implemented to investigate gastric neoplasms, prostate neoplasms and colon neoplasms, and 100%, 94%, and 90%, respectively, of the top 50 potential miRNAs were validated by the HMDD v3.2 database. Moreover, the ELMDA framework can predict isolated disease-related miRNAs. In conclusion, ELMDA appears to be a reliable method to uncover disease-associated miRNAs.
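The idea of voting across multiple classifiers can be sketched with a simple hard majority vote over binary predictions. The abstract does not say whether ELMDA uses hard or soft voting or how ties are handled, so this is an illustrative assumption: the function name `majority_vote`, the tie rule (ties resolve to 0), and the toy predictions are all hypothetical.

```python
def majority_vote(predictions_per_classifier):
    """Combine binary predictions from several classifiers by hard majority vote.

    predictions_per_classifier: one list of 0/1 predictions per classifier,
    all of the same length (one entry per miRNA-disease pair).
    A pair is predicted positive only if more than half of the classifiers
    vote 1; ties therefore resolve to 0 (an assumption of this sketch).
    """
    n_classifiers = len(predictions_per_classifier)
    n_samples = len(predictions_per_classifier[0])
    if any(len(p) != n_samples for p in predictions_per_classifier):
        raise ValueError("every classifier must predict the same samples")
    return [
        1 if 2 * sum(p[i] for p in predictions_per_classifier) > n_classifiers else 0
        for i in range(n_samples)
    ]


# Toy example (hypothetical): three classifiers, three candidate pairs.
combined = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

Using an odd number of classifiers avoids ties altogether for binary labels.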
Affiliation(s)
- Changlong Gu
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiaoying Li
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
8
Liu JX, Yin MM, Gao YL, Shang J, Zheng CH. MSF-LRR: Multi-Similarity Information Fusion Through Low-Rank Representation to Predict Disease-Associated Microbes. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:534-543. [PMID: 35085090 DOI: 10.1109/tcbb.2022.3146176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
An increase in microbial activity has been shown to be intimately connected with the pathogenesis of diseases. Considering the expense of traditional verification methods, researchers are working to develop high-efficiency methods for detecting potential disease-related microbes. In this article, a new prediction method, MSF-LRR, is established, which uses Low-Rank Representation (LRR) to perform multi-similarity information fusion to predict disease-related microbes. Considering that most existing methods only use one class of similarity, three classes of microbe and disease similarity are added. Then, LRR is used to obtain low-rank structural similarity information. Additionally, the method adaptively extracts the local low-rank structure of the data from a global perspective, to make the information used for the prediction more effective. Finally, a neighbor-based prediction method that utilizes the concept of collaborative filtering is applied to predict unknown microbe-disease pairs. As a result, the AUC value of MSF-LRR is superior to other existing algorithms under 5-fold cross-validation. Furthermore, in case studies, excluding originally known associations, 16 and 19 of the top 20 microbes associated with Bacterial Vaginosis and Irritable Bowel Syndrome, respectively, have been confirmed by the recent literature. In summary, MSF-LRR is a good predictor of potential microbe-disease associations and can contribute to drug discovery and biological research.
9
Wu S, Yang S, Wang M, Song N, Feng J, Wu H, Yang A, Liu C, Li Y, Guo F, Qiao J. Quorum sensing-based interactions among drugs, microbes, and diseases. Sci China Life Sci 2023; 66:137-151. [PMID: 35933489 DOI: 10.1007/s11427-021-2121-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 03/20/2022] [Accepted: 05/02/2022] [Indexed: 02/04/2023]
Abstract
Many diseases and health conditions are closely related to various microbes, which participate in complex interactions with diverse drugs; nonetheless, the detailed targets of such drugs remain to be elucidated. Many existing studies have reported causal associations among drugs, gut microbes, or diseases, calling for a workflow to reveal their intricate interactions. In this study, we developed a systematic workflow comprising three modules to construct a Quorum Sensing-based Drug-Microbe-Disease (QS-DMD) database ( http://www.qsdmd.lbci.net/ ), which includes diverse interactions for more than 8,000 drugs, 163 microbes, and 42 common diseases. Potential interactions between microbes and more than 8,000 drugs have been systematically studied by targeting microbial QS receptors combined with a docking-based virtual screening technique and in vitro experimental validations. Furthermore, we have constructed a QS-based drug-receptor interaction network, proposed a systematic framework including various drug-receptor-microbe-disease connections, and mapped a paradigmatic circular interaction network based on the QS-DMD, which can provide the underlying QS-based mechanisms for the reported causal associations. The QS-DMD will promote an understanding of personalized medicine and the development of potential therapies for diverse diseases. This work contributes to a paradigm for the construction of a molecule-receptor-microbe-disease interaction network for human health that may form one of the key knowledge maps of precision medicine in the future.
Affiliation(s)
- Shengbo Wu
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China; State Key Laboratory of Chemical Engineering, Tianjin University, Tianjin, 300072, China; Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, 300072, China
- Shujuan Yang
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Manman Wang
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Nan Song
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Jie Feng
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Hao Wu
- Institute of Shaoxing, Tianjin University, Shaoxing, 312300, China
- Aidong Yang
- Department of Engineering Science, University of Oxford, Oxford, OX1 3PJ, UK
- Chunjiang Liu
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China; State Key Laboratory of Chemical Engineering, Tianjin University, Tianjin, 300072, China; Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, 300072, China
- Yanni Li
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China; Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, 300072, China; Key Laboratory of Systems Bioengineering, Ministry of Education (Tianjin University), Tianjin, 300072, China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Jianjun Qiao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China; Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, 300072, China; Key Laboratory of Systems Bioengineering, Ministry of Education (Tianjin University), Tianjin, 300072, China; Institute of Shaoxing, Tianjin University, Shaoxing, 312300, China
10
Guan J, Zhang ZG, Liu Y, Wang M. A novel bi-directional heterogeneous network selection method for disease and microbial association prediction. BMC Bioinformatics 2022; 23:483. [PMID: 36376802 PMCID: PMC9664813 DOI: 10.1186/s12859-022-04961-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Accepted: 09/21/2022] [Indexed: 11/16/2022] Open
Abstract
Microorganisms in the human body have a great impact on human health. Therefore, understanding the potential relationship between microorganisms and diseases helps to elucidate the pathogenesis of diseases and is of great significance to the prevention, diagnosis, and treatment of diseases. In order to predict potential microbe-disease relationships, we propose a new computational model. Firstly, a bi-directional heterogeneous microbe-disease network is constructed by integrating multiple similarities, including Gaussian kernel similarity, microbial function similarity, disease semantic similarity, and disease symptom similarity. Secondly, the neighbor information of the network is learned by random walk. Finally, the selection model is used for information aggregation, and the microbe-disease node pairs are analyzed. Our method is superior to the existing methods in leave-one-out cross-validation and five-fold cross-validation. Moreover, in case studies of different diseases, our method was proven to be effective.
11
Liu D, Liu J, Luo Y, He Q, Deng L. MGATMDA: Predicting Microbe-Disease Associations via Multi-Component Graph Attention Network. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3578-3585. [PMID: 34587092 DOI: 10.1109/tcbb.2021.3116318] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/13/2023]
Abstract
Microbes are parasitic in various human body organs and play significant roles in a wide range of diseases. Identifying microbe-disease associations is conducive to the identification of potential drug targets. Considering the high cost and risk of biological experiments, developing computational approaches to explore the relationship between microbes and diseases is an alternative choice. However, most existing methods are based on unreliable or noisy similarity, and the prediction accuracy could be affected. Besides, it is still a great challenge for most previous methods to make predictions for large-scale datasets. In this work, we develop a multi-component Graph Attention Network (GAT) based framework, termed MGATMDA, for predicting microbe-disease associations. MGATMDA is built on a bipartite graph of microbes and diseases. It contains three essential parts: decomposer, combiner, and predictor. The decomposer first decomposes the edges in the bipartite graph to identify the latent components by a node-level attention mechanism. The combiner then recombines these latent components automatically to obtain a unified embedding for prediction by a component-level attention mechanism. Finally, a fully connected network is used to predict unknown microbe-disease associations. Experimental results showed that our proposed method outperformed eight state-of-the-art methods. Case studies for two common diseases further demonstrated the effectiveness of MGATMDA in predicting potential microbe-disease associations. The code is available on GitHub at https://github.com/dayunliu/MGATMDA.
12
Qi C, Cai Y, Qian K, Li X, Ren J, Wang P, Fu T, Zhao T, Cheng L, Shi L, Zhang X. gutMDisorder v2.0: a comprehensive database for dysbiosis of gut microbiota in phenotypes and interventions. Nucleic Acids Res 2022; 51:D717-D722. [PMID: 36215029 PMCID: PMC9825589 DOI: 10.1093/nar/gkac871] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/27/2022] [Revised: 09/16/2022] [Accepted: 09/28/2022] [Indexed: 01/30/2023] Open
Abstract
Gut microbiota plays a significant role in maintaining host health, and conversely, disorders potentially lead to dysbiosis, an imbalance in the composition of the gut microbial community. Intervention approaches, such as medications, diets, and several others, also alter the gut microbiota in either a beneficial or harmful direction. In 2020, gutMDisorder was developed to help researchers investigate the dysbiosis of gut microbes that occurs in various disorders as well as with therapeutic interventions. The database has been updated this year, following revision of previous publications and newly published reports to manually integrate confirmed associations under multitudinous conditions. Additionally, the microbial contents of downloaded gut microbial raw sequencing data were annotated, the metadata of the corresponding hosts were manually curated, and interactive charts were developed to enhance visualization. These improvements have been assembled into gutMDisorder v2.0, which offers a more advanced search engine and an upgraded web interface and can be freely accessed via http://bio-annotation.cn/gutMDisorder/.
Affiliation(s)
- Xuefeng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, Heilongjiang, China
- Jialiang Ren
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, Heilongjiang, China
- Ping Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, Heilongjiang, China
- Tongze Fu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, Heilongjiang, China
- Tianyi Zhao
- School of Medicine and Health, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
- Liang Cheng
- To whom correspondence should be addressed. Tel: +86 153 0361 4540
- Lei Shi
- Correspondence may also be addressed to Lei Shi.
- Xue Zhang
- Correspondence may also be addressed to Xue Zhang.
13
Chen P, Zhong J, Yang K, Zhang X, Chen Y, Liu R. TPD: a web tool for tipping-point detection based on dynamic network biomarker. Brief Bioinform 2022; 23:6693599. [DOI: 10.1093/bib/bbac399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Revised: 08/04/2022] [Accepted: 08/16/2022] [Indexed: 11/12/2022] Open
Abstract
Tipping points or critical transitions widely exist during the progression of many biological processes. It is of great importance to detect the tipping point with the measured omics data, which may be a key to achieving predictive or preventive medicine. We present the tipping point detector (TPD), a web tool for the detection of the tipping point during the dynamic process of biological systems, and further its leading molecules or network, based on the input high-dimensional time series or stage course data. With the solid theoretical background of dynamic network biomarker (DNB) and a series of computational methods for DNB detection, TPD detects the potential tipping point/critical state from the input omics data and outputs multifarious visualized results, including a suggested tipping point with a statistically significant P value, the identified key genes and their functional biological information, the dynamic change in the DNB/leading network that may drive the critical transition and the survival analysis based on DNB scores that may help to identify ‘dark’ genes (nondifferential in terms of expression but differential in terms of DNB scores). TPD fits all current browsers, such as Chrome, Firefox, Edge, Opera, Safari and Internet Explorer. TPD is freely accessible at http://www.rpcomputationalbiology.cn/TPD.
Affiliation(s)
- Pei Chen
- School of Mathematics, South China University of Technology , Guangzhou 510640, China
| | - Jiayuan Zhong
- School of Mathematics and Big Data, Foshan University , Foshan 528000, China
| | - Kun Yang
- School of Computer Science and Engineering, South China University of Technology , Guangzhou 510640, China
| | - Xuhang Zhang
- School of Computer Science and Engineering, South China University of Technology , Guangzhou 510640, China
| | - Yingqi Chen
- School of Computer Science and Engineering, South China University of Technology , Guangzhou 510640, China
| | - Rui Liu
- School of Mathematics, South China University of Technology , Guangzhou 510640, China
|
14
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022] Open
Abstract
Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources provides a more comprehensive research perspective, but it challenges algorithm design to generate accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce the taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China.,The Future Laboratory, Tsinghua University, Beijing, 10084, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.,Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
15
|
Wang Y, Lei X, Lu C, Pan Y. Predicting Microbe-Disease Association Based on Multiple Similarities and LINE Algorithm. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2399-2408. [PMID: 34014827 DOI: 10.1109/tcbb.2021.3082183] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/12/2023]
Abstract
Numerous microbes have been found to have vital impacts on human health by affecting biological processes. Therefore, exploring potential associations between microbes and diseases will promote the understanding and diagnosis of diseases. In this study, we present a novel computational model, named MSLINE, to infer potential microbe-disease associations by integrating Multiple Similarities and Large-scale Information Network Embedding (LINE) based on known associations. Specifically, starting from the known microbe-disease associations in the Human Microbe-Disease Association Database, we first expand the set of known associations by collecting proven associations from the existing literature. We then construct a microbe-disease heterogeneous network (MDHN) by integrating known associations and multiple similarities (including Gaussian interaction profile kernel similarity, microbe function similarity, disease semantic similarity and disease-symptom similarity). After that, we apply random walk and the LINE algorithm to the MDHN to learn its structure information. Finally, we score the microbe-disease associations according to the structure information of every node. In leave-one-out cross-validation and 5-fold cross-validation, MSLINE performs better than other existing methods. Moreover, case studies of different diseases showed that MSLINE could predict potential microbe-disease associations efficiently.
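Gaussian interaction profile (GIP) kernel similarity is named here but not defined. A minimal sketch of the commonly used definition follows, assuming each microbe (or disease) is represented by its row of the binary association matrix and that the bandwidth is normalised by the mean squared norm of the profiles. The function name `gip_kernel` and the default `gamma_prime=1.0` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a binary association matrix.

    profiles    : entities x partners 0/1 matrix (one interaction profile per row)
    gamma_prime : bandwidth scale before normalisation (assumed default of 1.0)
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("association matrix has no known associations")

    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq

    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # guard against tiny negative round-off

    return np.exp(-gamma * sq_dists)
```

Calling `gip_kernel(A)` gives the microbe-side similarity for a microbes x diseases matrix `A`, and `gip_kernel(A.T)` gives the disease-side similarity.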
|
16
|
Jeon YJ, Hasan MM, Park HW, Lee KW, Manavalan B. TACOS: a novel approach for accurate prediction of cell-specific long noncoding RNAs subcellular localization. Brief Bioinform 2022; 23:6618237. [PMID: 35753698 PMCID: PMC9294414 DOI: 10.1093/bib/bbac243] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/01/2022] [Revised: 05/23/2022] [Accepted: 05/24/2022] [Indexed: 11/14/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) are primarily regulated by their cellular localization, which is responsible for their molecular functions, including cell cycle regulation and genome rearrangements. Accurately identifying the subcellular location of lncRNAs from sequence information is crucial for a better understanding of their biological functions and mechanisms. In contrast to traditional experimental methods, bioinformatics or computational methods can be applied for the annotation of lncRNA subcellular locations in humans more effectively. In the past, several machine learning-based methods have been developed to identify lncRNA subcellular localization, but relevant work for identifying cell-specific localization of human lncRNA remains limited. In this study, we present the first application of the tree-based stacking approach, TACOS, which allows users to identify the subcellular localization of human lncRNA in 10 different cell types. Specifically, we conducted comprehensive evaluations of six tree-based classifiers with 10 different feature descriptors, using a newly constructed balanced training dataset for each cell type. Subsequently, the strengths of the AdaBoost baseline models were integrated via a stacking approach, with an appropriate tree-based classifier for the final prediction. TACOS displayed consistent performance in both the cross-validation and independent assessments compared with the other two approaches employed in this study. The user-friendly online TACOS web server can be accessed at https://balalab-skku.org/TACOS.
Affiliation(s)
- Young-Jun Jeon
- Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| | - Md Mehedi Hasan
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Hyun Woo Park
- Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| | - Ki Wook Lee
- Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| | - Balachandran Manavalan
- Computational Biology and Bioinformatics laboratory, Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
|
17
|
Peng H, Zhong J, Chen P, Liu R. Identifying the critical states of complex diseases by the dynamic change of multivariate distribution. Brief Bioinform 2022; 23:6590435. [DOI: 10.1093/bib/bbac177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 04/10/2022] [Accepted: 04/18/2022] [Indexed: 11/13/2022] Open
Abstract
The dynamics of complex diseases are not always smooth; they are occasionally abrupt, i.e. there is a critical state transition or tipping point at which the disease undergoes a sudden qualitative shift. There are generally few significant differences in the critical state in terms of gene expression or other static measurements, which may cause traditional differential expression-based biomarkers to fail to identify such a tipping point. In this study, we propose a computational method, the direct interaction network-based divergence, to detect the critical state of complex diseases by exploiting the dynamic changes in multivariable distributions inferred from observable samples and local biomolecular direct interaction networks. Such a method is model-free and applicable to both bulk and single-cell expression data. Our approach was validated by successfully identifying the tipping point just before the occurrence of a critical transition for both a simulated data set and seven real data sets, including those from The Cancer Genome Atlas and two single-cell RNA-sequencing data sets of cell differentiation. Functional and pathway enrichment analyses also validated the computational results from the perspectives of both molecules and networks.
Affiliation(s)
- Hao Peng
- School of Mathematics, South China University of Technology, Guangzhou 510640, China
| | - Jiayuan Zhong
- School of Mathematics, South China University of Technology, Guangzhou 510640, China
- School of mathematics and big data, Foshan University, Foshan 528225, China
| | - Pei Chen
- School of Mathematics, South China University of Technology, Guangzhou 510640, China
| | - Rui Liu
- School of Mathematics, South China University of Technology, Guangzhou 510640, China
- Pazhou Lab, Guangzhou 510330, China
|
18
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in the prediction of potential correlations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to the identification of potential relationships between microbes, drugs and diseases. Then, we divide a series of representative computing models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Yaqin Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Xiaoyu Yang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Pengyao Ping
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
|
19
|
Zhang HY, Wang L, You ZH, Hu L, Zhao BW, Li ZW, Li YM. iGRLCDA: identifying circRNA-disease association based on graph representation learning. Brief Bioinform 2022; 23:6552271. [PMID: 35323894 DOI: 10.1093/bib/bbac083] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/19/2022] [Revised: 02/16/2022] [Accepted: 02/17/2022] [Indexed: 12/18/2022] Open
Abstract
As the technologies of ribonucleic acid sequencing (RNA-seq) and transcript assembly analysis have continued to improve, a novel topology of RNA transcript, called circular RNA (circRNA), was uncovered in the last decade. Recently, researchers have revealed that circRNAs compete with messenger RNA (mRNA) and long noncoding RNA for binding with microRNA in gene regulation. Therefore, circRNA is assumed to be associated with complex disease, and discovering the relationship between them would contribute to medical research. However, identifying the association between circRNA and disease in vitro takes a long time and usually proceeds without direction. In recent years, more and more associations have been verified by experiments. Hence, we propose a computational method named identifying circRNA-disease association based on graph representation learning (iGRLCDA) for the prediction of potential circRNA-disease associations, which utilizes a deep learning model of graph convolution network (GCN) and graph factorization (GF). In detail, iGRLCDA first derives the hidden feature of known associations between circRNA and disease using the Gaussian interaction profile (GIP) kernel combined with disease semantic information to form a numeric descriptor. After that, it uses the deep learning model of GCN and GF to extract hidden features from the descriptor. Finally, a random forest classifier is introduced to identify potential circRNA-disease associations. In five-fold cross-validation on the gold standard data, iGRLCDA is strongly competitive with other prediction models, achieving an average area under the receiver operating characteristic curve of 0.9289 and an area under the precision-recall curve of 0.9377. On checking the prediction results against the relevant literature, 22 of the top 30 predicted circRNA-disease associations were reported in recently published papers. These results lead us to believe that iGRLCDA can provide reliable circRNA-disease associations for medical research and reduce the blindness of wet-lab experiments.
Affiliation(s)
- Han-Yuan Zhang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| | - Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China.,College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
| | - Zhu-Hong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
| | - Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Bo-Wei Zhao
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Zheng-Wei Li
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
| | - Yang-Ming Li
- College of Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
|
20
|
Xu D, Xu H, Zhang Y, Gao R. Novel Collaborative Weighted Non-negative Matrix Factorization Improves Prediction of Disease-Associated Human Microbes. Front Microbiol 2022; 13:834982. [PMID: 35369503 PMCID: PMC8965656 DOI: 10.3389/fmicb.2022.834982] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/14/2021] [Accepted: 01/19/2022] [Indexed: 12/14/2022] Open
Abstract
Extensive clinical and biomedical studies have shown that microbiome plays a prominent role in human health. Identifying potential microbe–disease associations (MDAs) can help reveal the pathological mechanism of human diseases and be useful for the prevention, diagnosis, and treatment of human diseases. Therefore, it is necessary to develop effective computational models and reduce the cost and time of biological experiments. Here, we developed a novel machine learning-based joint framework called CWNMF-GLapRLS for human MDA prediction using the proposed collaborative weighted non-negative matrix factorization (CWNMF) technique and graph Laplacian regularized least squares. Especially, to fuse more similarity information, we calculated the functional similarity of microbes. To deal with missing values and effectively overcome the data sparsity problem, we proposed a collaborative weighted NMF technique to reconstruct the original association matrix. In addition, we developed a graph Laplacian regularized least-squares method for prediction. The experimental results of fivefold and leave-one-out cross-validation demonstrated that our method achieved the best performance by comparing it with 5 state-of-the-art methods on the benchmark dataset. Case studies further showed that the proposed method is an effective tool to predict potential MDAs and can provide more help for biomedical researchers.
Affiliation(s)
- Da Xu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Hanxiao Xu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, China
- *Correspondence: Yusen Zhang,
| | - Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, China
- Rui Gao,
|
21
|
Wang L, Li H, Wang Y, Tan Y, Chen Z, Pei T, Zou Q. MDADP: A webserver integrating database and prediction tools for microbe-disease associations. IEEE J Biomed Health Inform 2022; 26:3427-3434. [PMID: 35254998 DOI: 10.1109/jbhi.2022.3156166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
More and more evidence has demonstrated that microbiota play important roles in the life processes of the human body. In recent years, various computational methods have been proposed for identifying potentially disease-associated microbes to save the costs of traditional biological experiments. However, the prediction performance of these methods is generally limited by outdated and incomplete datasets. Moreover, few studies to date provide visual predictive tools for inferring possible microbe-disease associations (MDAs). Hence, in this manuscript, a novel webserver called MDADP is proposed to identify latent MDAs, in which a new MDA database and interactive prediction tools for MDA studies are designed together. In the newly constructed MDA database, 2019 known MDAs between 58 diseases and 703 microbes were first manually collected. Then, by adopting the average ranking method and the co-confidence method respectively, eight representative computational models were integrated to identify potential disease-related microbes. As a result, MDADP provides not only interactive features for users to access and capture MDA entities, but also effective tools for users to identify candidate microbes for different diseases. To our knowledge, MDADP is the first online platform that incorporates a new MDA database with comprehensive MDA prediction tools. Therefore, we believe that it will be a valuable source of information for research in microbiology and disease-related fields. MDADP can be accessed at http://mdadp.leelab2997.cn.
|
22
|
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022] Open
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, a lot of computational methods and tools/platforms have been developed to reveal intrinsic relationship between diseases, which can provide useful insights to the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatment of diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of the usages of biomedical data, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview for current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, China
| | - Jiashuai Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Yichao Zhao
- School of Computer Science and Engineering, Central South University, China
| | - Fang-Xiang Wu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Min Li
- Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
|
23
|
Zha Y, Ning K. Ontology-aware neural network: a general framework for pattern mining from microbiome data. Brief Bioinform 2022; 23:6517031. [PMID: 35091743 PMCID: PMC8921649 DOI: 10.1093/bib/bbac005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Revised: 12/30/2021] [Accepted: 01/04/2022] [Indexed: 11/23/2022] Open
Abstract
With the rapid accumulation of microbiome data around the world, numerous computational bioinformatics methods have been developed for pattern mining from such paramount microbiome data. Current microbiome data mining methods, such as gene and species mining, rely heavily on sequence comparison. Most of these methods, however, have a clear trade-off, particularly, when it comes to big-data analytical efficiency and accuracy. Microbiome entities are usually organized in ontology structures, and pattern mining methods that have considered ontology structures could offer advantages in mining efficiency and accuracy. Here, we have summarized the ontology-aware neural network (ONN) as a novel framework for microbiome data mining. We have discussed the applications of ONN in multiple contexts, including gene mining, species mining and microbial community dynamic pattern mining. We have then highlighted one of the most important characteristics of ONN, namely, novel knowledge discovery, which makes ONN a standout among all microbiome data mining methods. Finally, we have provided several applications to showcase the advantage of ONN over other methods in microbiome data mining. In summary, ONN represents a paradigm shift for pattern mining from microbiome data: from traditional machine learning approach to ontology-aware and model-based approach, which has found its broad application scenarios in microbiome data mining.
Affiliation(s)
- Yuguo Zha
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, Center of AI Biology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road Wuhan, Hubei, Wuhan 430074, China
| | - Kang Ning
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, Center of AI Biology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road Wuhan, Hubei, Wuhan 430074, China
|
24
|
Wang CC, Han CD, Zhao Q, Chen X. Circular RNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2021; 22:bbab286. [PMID: 34329377 PMCID: PMC8575014 DOI: 10.1093/bib/bbab286] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Received: 06/01/2021] [Revised: 06/23/2021] [Accepted: 07/03/2021] [Indexed: 12/13/2022] Open
Abstract
Circular RNAs (circRNAs) are a class of single-stranded, covalently closed RNA molecules with a variety of biological functions. Studies have shown that circRNAs are involved in a variety of biological processes and play an important role in the development of various complex diseases, so the identification of circRNA-disease associations would contribute to the diagnosis and treatment of diseases. In this review, we summarize the discovery, classifications and functions of circRNAs and introduce four important diseases associated with circRNAs. Then, we list some significant and publicly accessible databases containing comprehensive annotation resources of circRNAs and experimentally validated circRNA-disease associations. Next, we introduce some state-of-the-art computational models for predicting novel circRNA-disease associations and divide them into two categories, namely network algorithm-based and machine learning-based models. Subsequently, several evaluation methods of prediction performance of these computational models are summarized. Finally, we analyze the advantages and disadvantages of different types of computational models and provide some suggestions to promote the development of circRNA-disease association identification from the perspective of the construction of new computational models and the accumulation of circRNA-related data.
Affiliation(s)
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology
| | - Chen-Di Han
- School of Information and Control Engineering, China University of Mining and Technology
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning
| | - Xing Chen
- China University of Mining and Technology
|
25
|
Discovering microbe-disease associations from the literature using a hierarchical long short-term memory network and an ensemble parser model. Sci Rep 2021; 11:4490. [PMID: 33627732 PMCID: PMC7904816 DOI: 10.1038/s41598-021-83966-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/09/2020] [Accepted: 02/08/2021] [Indexed: 02/07/2023] Open
Abstract
With recent advances in biotechnology and sequencing technology, the microbial community has been intensively studied and discovered to be associated with many chronic as well as acute diseases. Even though a tremendous number of studies describing the association between microbes and diseases have been published, text mining methods that focus on such associations have been rarely studied. We propose a framework that combines machine learning and natural language processing methods to analyze the association between microbes and diseases. A hierarchical long short-term memory network was used to detect sentences that describe the association. For the sentences determined, two different parse tree-based search methods were combined to find the relation-describing word. The ensemble model of constituency parsing for structural pattern matching and dependency-based relation extraction improved the prediction accuracy. By combining deep learning and parse tree-based extractions, our proposed framework could extract the microbe-disease association with higher accuracy. The evaluation results showed that our system achieved an F-score of 0.8764 and 0.8524 in binary decisions and extracting relation words, respectively. As a case study, we performed a large-scale analysis of the association between microbes and diseases. Additionally, a set of common microbes shared by multiple diseases were also identified in this study. This study could provide valuable information for the major microbes that were studied for a specific disease. The code and data are available at https://github.com/DMnBI/mdi_predictor .
|
26
|
Xu D, Xu H, Zhang Y, Wang M, Chen W, Gao R. MDAKRLS: Predicting human microbe-disease association based on Kronecker regularized least squares and similarities. J Transl Med 2021; 19:66. [PMID: 33579301 PMCID: PMC7881563 DOI: 10.1186/s12967-021-02732-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/09/2020] [Accepted: 02/01/2021] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Microbes are closely related to human health and diseases. Identification of disease-related microbes is of great significance for revealing the pathological mechanism of human diseases and understanding the interaction mechanisms between microbes and humans, which is also useful for the prevention, diagnosis and treatment of human diseases. Because the known disease-related microbes are still insufficient, it is necessary to develop effective computational methods that reduce the time and cost of biological experiments. METHODS In this work, we developed a novel computational method called MDAKRLS to discover potential microbe-disease associations (MDAs) based on the Kronecker regularized least squares. Specifically, we introduced the Hamming interaction profile similarity to measure the similarities of microbes and diseases, in addition to the Gaussian interaction profile kernel similarity. We also introduced the Kronecker product to construct two kinds of Kronecker similarities between microbe-disease pairs. Then, we designed the Kronecker regularized least squares with each of the Kronecker similarities to obtain separate prediction scores, and calculated the final prediction scores by integrating the contributions of the different similarities. RESULTS The AUC values achieved by MDAKRLS under global leave-one-out cross-validation and 5-fold cross-validation were 0.9327 and 0.9023 ± 0.0015, respectively, which were significantly higher than those of the five state-of-the-art methods used for comparison. Comparison results demonstrate that MDAKRLS has a faster computing speed under two kinds of frameworks. In addition, case studies of inflammatory bowel disease (IBD) and asthma showed that 19 (IBD) and 19 (asthma) of the top 20 predicted disease-related microbes could be verified by previously published biological or medical literature. CONCLUSIONS All the evaluation results demonstrated that MDAKRLS has an effective and reliable prediction performance. It may be a useful tool for finding new disease-related microbes and for helping biomedical researchers carry out follow-up studies.
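The pipeline described in METHODS can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`gip_kernel`, `hamming_similarity`, `kron_rls`, `predict_scores`), the bandwidth normalization of the Gaussian kernel, the form of the Hamming similarity (one minus the normalized Hamming distance), the regularization value `sigma`, and the equal-weight average of the two score matrices are all assumptions made for illustration. The Kronecker regularized least squares step uses the standard eigendecomposition shortcut, so the full Kronecker product is never built.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    sq_norms = (profiles ** 2).sum(axis=1)
    # Assumed bandwidth rule: normalize by the mean squared profile norm.
    gamma = gamma_prime / sq_norms.mean()
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def hamming_similarity(profiles):
    """Assumed form: 1 minus the fraction of positions where two rows differ."""
    differ = (profiles[:, None, :] != profiles[None, :, :]).mean(axis=2)
    return 1.0 - differ

def kron_rls(K_d, K_m, Y, sigma=1.0):
    """Kronecker RLS scores for Y (diseases x microbes).

    Equivalent to K (K + sigma I)^-1 vec(Y) with K = kron(K_d, K_m),
    computed from the eigendecompositions of the two small kernels.
    """
    lam_d, U_d = np.linalg.eigh(K_d)
    lam_m, U_m = np.linalg.eigh(K_m)
    lam = np.outer(lam_d, lam_m)          # eigenvalues of the Kronecker kernel
    Z = U_d.T @ Y @ U_m                   # Y expressed in the eigenbasis
    return U_d @ (lam / (lam + sigma) * Z) @ U_m.T

def predict_scores(Y, sigma=1.0):
    """Average the scores from the two similarity types (assumed weighting)."""
    Yf = Y.astype(float)
    s_gip = kron_rls(gip_kernel(Yf), gip_kernel(Yf.T), Yf, sigma)
    s_ham = kron_rls(hamming_similarity(Yf), hamming_similarity(Yf.T), Yf, sigma)
    return 0.5 * (s_gip + s_ham)

# Toy association matrix: 3 diseases x 4 microbes (made-up values).
Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 1]])
scores = predict_scores(Y)
```

In this sketch, unobserved pairs (zeros in `Y`) with the highest entries in `scores` would be the candidate associations to rank; how the published method weights the two similarity types may differ from the plain average used here.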
Affiliation(s)
- Da Xu
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Hanxiao Xu
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Mingyi Wang
- Department of Central Lab, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Wei Chen
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, 250061, China
|