1
Park JH, Cho YR. Draw+: network-based computational drug repositioning with attention walking and noise filtering. Health Inf Sci Syst 2025; 13:14. [PMID: 39764174 PMCID: PMC11700073 DOI: 10.1007/s13755-024-00326-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2024] [Accepted: 12/11/2024] [Indexed: 02/02/2025] Open
Abstract
Purpose Drug repositioning, a strategy that repurposes already-approved drugs for novel therapeutic applications, provides a faster and more cost-effective alternative to traditional drug discovery. Network-based models have been adopted by many computational methodologies, especially those that use graph neural networks to predict drug-disease associations. However, these techniques frequently overlook the quality of the input network, which is a critical factor for achieving accurate predictions. Methods We present a novel network-based framework for drug repositioning, named DRAW+, which incorporates noise filtering and feature extraction using graph neural networks and attention mechanisms. The proposed model first constructs a heterogeneous network that integrates the drug-disease association network with the similarity networks of drugs and diseases, which are upgraded through reduced-rank singular value decomposition. Next, a subgraph surrounding the targeted drug-disease node pair is extracted, allowing the model to focus on local structures. Graph neural networks are then applied to extract structural representation, followed by attention walking to capture key features of the subgraph. Finally, a multi-layer perceptron classifies the subgraph as positive or negative, which indicates the presence of the link between the target node pair. Results Experimental validation across three benchmark datasets showed that DRAW+ outperformed seven state-of-the-art methods, achieving the highest average AUROC and AUPRC, 0.963 and 0.564, respectively. Moreover, DRAW+ demonstrated its robustness by achieving the best performance across two additional datasets, further confirming its generalizability and effectiveness in diverse settings. Conclusions The proposed network-based computational approach, DRAW+, demonstrates exceptional accuracy and robustness, confirming its effectiveness in drug repositioning tasks.
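The reduced-rank SVD step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a symmetric similarity matrix with values in [0, 1]; the function names, the toy data, and the symmetrize-and-clip post-processing are hypothetical choices for illustration, not the authors' published implementation.

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Best rank-`rank` approximation of `matrix` in the least-squares sense."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def denoise_similarity(sim, rank):
    """Low-rank smoothing of a symmetric similarity matrix.

    Symmetrizing and clipping to [0, 1] are assumptions added for this sketch.
    """
    approx = truncated_svd(sim, rank)
    approx = (approx + approx.T) / 2.0
    return np.clip(approx, 0.0, 1.0)


# Hypothetical toy data: two groups of three mutually similar drugs, plus noise.
rng = np.random.default_rng(0)
clean = np.kron(np.eye(2), np.ones((3, 3)))
noise = rng.normal(scale=0.05, size=clean.shape)
noisy = clean + (noise + noise.T) / 2.0

low_rank = truncated_svd(noisy, rank=2)
denoised = denoise_similarity(noisy, rank=2)
```

The rank is the main tuning knob here: a small rank discards more of the matrix as noise, while a large rank keeps the input nearly unchanged.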
Affiliation(s)
- Jong-Hoon Park
  - Division of Software, Yonsei University, Mirae Campus, Yeonsedae-gil 1, Wonju-si, 26493 Gangwon-do Korea
- Young-Rae Cho
  - Division of Software, Yonsei University, Mirae Campus, Yeonsedae-gil 1, Wonju-si, 26493 Gangwon-do Korea
  - Division of Digital Healthcare, Yonsei University, Mirae Campus, Yeonsedae-gil 1, Wonju-si, Gangwon-do 26493 Korea

2
Zhu E, Li X, Liu C, Pal NR. Boosting Drug-Disease Association Prediction for Drug Repositioning via Dual-Feature Extraction and Cross-Dual-Domain Decoding. J Chem Inf Model 2025. [PMID: 40278791 DOI: 10.1021/acs.jcim.5c00070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2025]
Abstract
The extraction of biomedical data has significant academic and practical value in contemporary biomedical sciences. In recent years, drug repositioning, a cost-effective strategy for drug development by discovering new indications for approved drugs, has gained increasing attention. However, many existing drug repositioning methods focus on mining information from adjacent nodes in biomedical networks without considering the potential inter-relationships between the feature spaces of drugs and diseases. This can lead to inaccurate encoding, resulting in biased mined drug-disease association information. To address this limitation, we propose a new model called Dual-Feature Drug Repurposing Neural Network (DFDRNN). DFDRNN allows the mining of two features (similarity and association) from the drug-disease biomedical networks to encode drugs and diseases. A self-attention mechanism is utilized to extract neighbor feature information. It incorporates two dual-feature extraction modules: the single-domain dual-feature extraction (SDDFE) module for extracting features within a single domain (drugs or diseases) and the cross-domain dual-feature extraction (CDDFE) module for extracting features across domains. By utilizing these modules, we ensure more appropriate encoding of drugs and diseases. A cross-dual-domain decoder is also designed to predict drug-disease associations in both domains. Our proposed DFDRNN model outperforms six state-of-the-art methods on four benchmark data sets, achieving an average AUROC of 0.946 and an average AUPR of 0.597. Case studies on three diseases show that the proposed DFDRNN model can be applied in real-world scenarios, demonstrating its significant potential in drug repositioning.
Affiliation(s)
- Enqiang Zhu
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Xiang Li
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Chanjuan Liu
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Nikhil R Pal
  - Electronics and Communication Sciences Unit, Indian Statistical Institute, Calcutta 700108, India

3
Tang X, Zhou C, Lu C, Meng Y, Xu J, Hu X, Tian G, Yang J. Enhancing Drug Repositioning Through Local Interactive Learning With Bilinear Attention Networks. IEEE J Biomed Health Inform 2025; 29:1644-1655. [PMID: 37988217 DOI: 10.1109/jbhi.2023.3335275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2023]
Abstract
Drug repositioning has emerged as a promising strategy for identifying new therapeutic applications for existing drugs. In this study, we present DRGBCN, a novel computational method that integrates heterogeneous information through a deep bilinear attention network to infer potential drugs for specific diseases. DRGBCN involves constructing a comprehensive drug-disease network by incorporating multiple similarity networks for drugs and diseases. Firstly, we introduce a layer attention mechanism to effectively learn the embeddings of graph convolutional layers from these networks. Subsequently, a bilinear attention network is constructed to capture pairwise local interactions between drugs and diseases. This combined approach enhances the accuracy and reliability of predictions. Finally, a multi-layer perceptron module is employed to evaluate potential drugs. Through extensive experiments on three publicly available datasets, DRGBCN demonstrates better performance over baseline methods in 10-fold cross-validation, achieving an average area under the receiver operating characteristic curve (AUROC) of 0.9399. Furthermore, case studies on bladder cancer and acute lymphoblastic leukemia confirm the practical application of DRGBCN in real-world drug repositioning scenarios. Importantly, our experimental results from the drug-disease network analysis reveal the successful clustering of similar drugs within the same community, providing valuable insights into drug-disease interactions. In conclusion, DRGBCN holds significant promise for uncovering new therapeutic applications of existing drugs, thereby contributing to the advancement of precision medicine.
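The AUROC figure reported above can be read as the probability that a randomly chosen known drug-disease pair is scored higher than a randomly chosen unknown pair. A minimal sketch of that pairwise definition follows; the function name and the toy labels and scores are hypothetical, and real evaluations would typically use a library routine inside each cross-validation fold.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions: three of the four positive/negative pairs
# are ordered correctly, so the result is 0.75.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The quadratic pair loop is chosen for readability; a rank-based formulation gives the same value in O(n log n).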

4
Bhatia T, Sharma S. Drug Repurposing: Insights into Current Advances and Future Applications. Curr Med Chem 2025; 32:468-510. [PMID: 37946344 DOI: 10.2174/0109298673266470231023110841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 09/04/2023] [Accepted: 09/11/2023] [Indexed: 11/12/2023]
Abstract
Drug development is a complex and expensive process that involves extensive research and testing before a new drug can be approved for use. This has led to a limited availability of potential therapeutics for many diseases. Despite significant advances in biomedical science, the process of drug development remains a bottleneck, as all hypotheses must be tested through experiments and observations, which can be time-consuming and costly. To address this challenge, drug repurposing has emerged as an innovative strategy for finding new uses for existing medications that go beyond their original intended use. This approach has the potential to speed up the drug development process and reduce costs, making it an attractive option for pharmaceutical companies and researchers alike. It involves the identification of existing drugs or compounds that have the potential to be used for the treatment of a different disease or condition. This can be done through a variety of approaches, including screening existing drugs against new disease targets, investigating the biological mechanisms of existing drugs, and analyzing data from clinical trials and electronic health records. Additionally, repurposing drugs can lead to the identification of new therapeutic targets and mechanisms of action, which can enhance our understanding of disease biology and lead to the development of more effective treatments. Overall, drug repurposing is an exciting and promising area of research that has the potential to revolutionize the drug development process and improve the lives of millions of people around the world. The present review provides insights on types of interaction, approaches, availability of databases, applications and limitations of drug repurposing.
Affiliation(s)
- Trisha Bhatia
  - School of Pharmacy, National Forensic Sciences University, Gandhinagar, Gujarat, 382007, India
- Shweta Sharma
  - School of Pharmacy, National Forensic Sciences University, Gandhinagar, Gujarat, 382007, India

5
Zhang S, Strayer N, Vessels T, Choi K, Wang GW, Li Y, Bejan CA, Hsi RS, Bick AG, Velez Edwards DR, Savona MR, Phillips EJ, Pulley JM, Self WH, Hopkins WC, Roden DM, Smoller JW, Ruderfer DM, Xu Y. PheMIME: an interactive web app and knowledge base for phenome-wide, multi-institutional multimorbidity analysis. J Am Med Inform Assoc 2024; 31:2440-2446. [PMID: 39127052 PMCID: PMC11491640 DOI: 10.1093/jamia/ocae182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 06/03/2024] [Accepted: 07/18/2024] [Indexed: 08/12/2024] Open
Abstract
OBJECTIVES To address the need for interactive visualization tools and databases in characterizing multimorbidity patterns across different populations, we developed the Phenome-wide Multi-Institutional Multimorbidity Explorer (PheMIME). This tool leverages three large-scale EHR systems to facilitate efficient analysis and visualization of disease multimorbidity, aiming to reveal both robust and novel disease associations that are consistent across different systems and to provide insight for enhancing personalized healthcare strategies. MATERIALS AND METHODS PheMIME integrates summary statistics from phenome-wide analyses of disease multimorbidities, utilizing data from Vanderbilt University Medical Center, Mass General Brigham, and the UK Biobank. It offers interactive and multifaceted visualizations for exploring multimorbidity. Incorporating an enhanced version of associationSubgraphs, PheMIME also enables dynamic analysis and inference of disease clusters, promoting the discovery of complex multimorbidity patterns. A case study on schizophrenia demonstrates its capability for generating interactive visualizations of multimorbidity networks within and across multiple systems. Additionally, PheMIME supports diverse multimorbidity-based discoveries, detailed further in online case studies. RESULTS The PheMIME is accessible at https://prod.tbilab.org/PheMIME/. A comprehensive tutorial and multiple case studies for demonstration are available at https://prod.tbilab.org/PheMIME_supplementary_materials/. The source code can be downloaded from https://github.com/tbilab/PheMIME. DISCUSSION PheMIME represents a significant advancement in medical informatics, offering an efficient solution for accessing, analyzing, and interpreting the complex and noisy real-world patient data in electronic health records. 
CONCLUSION PheMIME provides an extensive multimorbidity knowledge base that consolidates data from three EHR systems, and it is a novel interactive tool designed to analyze and visualize multimorbidities across multiple EHR datasets. It stands out as the first of its kind to offer extensive multimorbidity knowledge integration with substantial support for efficient online analysis and interactive visualization.
Affiliation(s)
- Siwei Zhang
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Tess Vessels
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Karmel Choi
  - Psychiatric & Neuro Developmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, United States
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA 02114, United States
- Geoffrey W Wang
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, United States
- Yajing Li
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Cosmin A Bejan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Ryan S Hsi
  - Department of Urology, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Alexander G Bick
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Digna R Velez Edwards
  - Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Michael R Savona
  - Division of Hematology and Oncology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Elizabeth J Phillips
  - Center for Drug Safety and Immunology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
  - Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, WA 6150, Australia
- Jill M Pulley
  - Vanderbilt Institute for Clinical and Translational Science, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Wesley H Self
  - Vanderbilt Institute for Clinical and Translational Science, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Wilkins Consuelo Hopkins
  - Vanderbilt Institute for Clinical and Translational Science, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Dan M Roden
  - Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Jordan W Smoller
  - Psychiatric & Neuro Developmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, United States
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA 02114, United States
  - Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA 02142, United States
- Douglas M Ruderfer
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Yaomin Xu
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States

6
Guo X, Song Y, Xu D, Jin X, Shang X. Genotype and Phenotype Association Analysis Based on Multi-omics Statistical Data. Curr Bioinform 2024; 19:933-942. [DOI: 10.2174/0115748936276861240109045208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/29/2023] [Accepted: 12/07/2023] [Indexed: 01/03/2025]
Abstract
Background: When using clinical data for multi-omics analysis, there are issues such as the insufficient number of omics data types and relatively small sample size due to the protection of patients' privacy, the requirements of data management by various institutions, and the relatively large number of features of each omics data type. This paper describes the analysis of multi-omics pathway relationships using statistical data in the absence of clinical data.
Methods: We proposed a novel approach to exploit easily accessible statistics in public databases. This approach introduces phenotypic associations that are not included in the clinical data and uses these data to build a three-layer heterogeneous network. To simplify the analysis, we decomposed the three-layer network into two two-layer networks to predict the weights of the inter-layer associations. By adding a hyperparameter β, the weights of the two layers of the network were merged, and then k-fold cross-validation was used to evaluate the accuracy of this method. In calculating the weights of the two-layer networks, the RWR with fixed restart probability was combined with PBMDA and CIPHER to generate the PCRWR with biased weights and improved accuracy.
Results: The area under the receiver operating characteristic curve was increased by approximately 7% in the case of the RWR with initial weights.
Conclusion: Multi-omics statistical data were used to establish genotype and phenotype correlation networks for analysis, which was similar to the effect of clinical multi-omics analysis.
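The RWR with fixed restart probability that the abstract builds on can be sketched as follows. This is plain random walk with restart on a single undirected network, not the PCRWR variant; the function name, the default restart value, and the toy path graph are assumptions made for illustration.

```python
import numpy as np


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `seed`.

    Iterates p <- (1 - restart) * W @ p + restart * p0, where W is the
    column-normalized adjacency matrix and p0 is one-hot at the seed node.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("every node needs at least one edge")
    w = adj / col_sums  # each column now sums to 1
    p0 = np.zeros(adj.shape[0])
    p0[seed] = 1.0
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network: a path 0 - 1 - 2 - 3, with the walk seeded at node 0.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(path, seed=0)
```

Nodes closer to the seed receive higher scores, which is what makes the steady-state vector usable as an association weight between the seed and every other node.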
Affiliation(s)
- Xinpeng Guo
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
  - School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, People’s Republic of China
- Yafei Song
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Dongyan Xu
  - Department of Basic Sciences, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Xueping Jin
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Xuequn Shang
  - School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, People’s Republic of China

7
Li G, Li S, Liang C, Xiao Q, Luo J. Drug repositioning based on residual attention network and free multiscale adversarial training. BMC Bioinformatics 2024; 25:261. [PMID: 39118000 PMCID: PMC11308596 DOI: 10.1186/s12859-024-05893-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 08/06/2024] [Indexed: 08/10/2024] Open
Abstract
BACKGROUND Conducting traditional wet experiments to guide drug development is an expensive, time-consuming and risky process. Analyzing drug function and repositioning plays a key role in identifying new therapeutic potential of approved drugs and discovering therapeutic approaches for untreated diseases. Exploring drug-disease associations has far-reaching implications for identifying disease pathogenesis and treatment. However, reliable detection of drug-disease relationships via traditional methods is costly and slow. Therefore, investigations into computational methods for predicting drug-disease associations are currently needed. RESULTS This paper presents a novel drug-disease association prediction method, RAFGAE. First, RAFGAE integrates known associations between diseases and drugs into a bipartite network. Second, RAFGAE designs the Re_GAT framework, which includes multilayer graph attention networks (GATs) and two residual networks. The multilayer GATs are utilized for learning the node embeddings, which is achieved by aggregating information from multihop neighbors. The two residual networks are used to alleviate the deep network oversmoothing problem, and an attention mechanism is introduced to combine the node embeddings from different attention layers. Third, two graph autoencoders (GAEs) with collaborative training are constructed to simulate label propagation to predict potential associations. On this basis, free multiscale adversarial training (FMAT) is introduced. FMAT enhances node feature quality through small gradient adversarial perturbation iterations, improving the prediction performance. Finally, tenfold cross-validations on two benchmark datasets show that RAFGAE outperforms current methods. In addition, case studies have confirmed that RAFGAE can detect novel drug-disease associations. CONCLUSIONS The comprehensive experimental results validate the utility and accuracy of RAFGAE. 
We believe that this method may serve as an excellent predictor for identifying unobserved disease-drug associations.
Affiliation(s)
- Guanghui Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China.
- Shuwen Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Qiu Xiao
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.

8
Li J, Wang B, Ma X. Non-Coding RNAs Extended Omnigenic Module of Cancers. Entropy (Basel, Switzerland) 2024; 26:640. [PMID: 39202109 PMCID: PMC11353529 DOI: 10.3390/e26080640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 07/24/2024] [Accepted: 07/25/2024] [Indexed: 09/03/2024]
Abstract
The emergence of cancers involves numerous coding and non-coding genes. Understanding the contribution of non-coding RNAs (ncRNAs) to the cancer neighborhood is crucial for interpreting the interaction between molecular markers of cancer. However, there is a lack of systematic studies on the involvement of ncRNAs in the cancer neighborhood. In this paper, we construct an interaction network which encompasses multiple genes. We focus on the fundamental topological indicator, namely connectivity, and evaluate its performance when applied to cancer-affected genes using statistical indices. Our findings reveal that ncRNAs significantly enhance the connectivity of affected genes and mediate the inclusion of more genes in the cancer module. To further explore the role of ncRNAs in the network, we propose a connectivity-based method which leverages the bridging function of ncRNAs across cancer-affected genes and reveals the non-coding RNAs extended omnigenic module (NeOModule). Topologically, this module promotes the formation of cancer patterns involving ncRNAs. Biologically, it is enriched with cancer pathways and treatment targets, providing valuable insights into disease relationships.
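The connectivity idea in this abstract can be illustrated by comparing the largest connected component among cancer-affected genes with and without ncRNA nodes acting as bridges. The sketch below uses a hypothetical five-node network and hypothetical node names; it is not the NeOModule method itself, only the underlying connectivity measurement.

```python
from collections import deque


def largest_component(nodes, edges):
    """Return the largest connected component of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        # Only keep edges whose endpoints are both in the induced subgraph.
        if a in nodes and b in nodes:
            neighbors[a].add(b)
            neighbors[b].add(a)
    best, visited = set(), set()
    for start in nodes:
        if start in visited:
            continue
        # Breadth-first search from each unvisited node.
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbors[current] - component:
                component.add(nxt)
                queue.append(nxt)
        visited |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy network: two gene pairs that only connect through ncRNA R1.
genes = {"G1", "G2", "G3", "G4"}
ncrnas = {"R1"}
edges = [("G1", "G2"), ("G3", "G4"), ("G2", "R1"), ("R1", "G3")]

genes_only = largest_component(genes, edges)
with_ncrnas = largest_component(genes | ncrnas, edges)
```

In this toy case the gene-only module holds two genes, while adding the ncRNA node joins all four genes into a single module.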
Affiliation(s)
- Bingbo Wang
  - School of Computer Science and Technology, Xidian University, Xi’an 710119, China; (J.L.); (X.M.)

9
Park JH, Cho YR. Computational drug repositioning with attention walking. Sci Rep 2024; 14:10072. [PMID: 38698208 PMCID: PMC11066070 DOI: 10.1038/s41598-024-60756-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 04/26/2024] [Indexed: 05/05/2024] Open
Abstract
Drug repositioning aims to identify new therapeutic indications for approved medications. Recently, the importance of computational drug repositioning has been highlighted because it can reduce the costs, development time, and risks compared to traditional drug discovery. Most approaches in this area use networks for systematic analysis. Inferring drug-disease associations is then defined as a link prediction problem in a heterogeneous network composed of drugs and diseases. In this article, we present a novel method of computational drug repositioning, named drug repositioning with attention walking (DRAW). DRAW proceeds as follows: first, a subgraph enclosing the target link for prediction is extracted. Second, a graph convolutional network captures the structural features of the labeled nodes in the subgraph. Third, the transition probabilities are computed using attention mechanisms and converted into random walk profiles. Finally, a multi-layer perceptron takes random walk profiles and predicts whether a target link exists. As an experiment, we constructed two heterogeneous networks with drug-drug similarities based on chemical structures and anatomical therapeutic chemical classification (ATC) codes. Using 10-fold cross-validation, DRAW achieved an area under the receiver operating characteristic (ROC) curve of 0.903 and outperformed state-of-the-art methods. Moreover, we demonstrated the results of case studies for selected drugs and diseases to further confirm the capability of DRAW to predict drug-disease associations.
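The third and fourth steps above (attention-derived transition probabilities turned into random walk profiles) can be sketched in a simplified form. The sketch assumes node embeddings are already available, uses scaled dot-product attention restricted to graph neighbors, and records the probability of reaching a target node after each step; these choices, the function names, and the toy path graph are assumptions for illustration and may differ from the DRAW implementation.

```python
import numpy as np


def attention_transition_matrix(embeddings, adj):
    """Row-stochastic transition matrix from neighbor-masked attention scores.

    Assumes every node has at least one neighbor.
    """
    dim = embeddings.shape[1]
    scores = (embeddings @ embeddings.T) / np.sqrt(dim)
    # Non-neighbors get -inf so that their softmax weight is exactly zero.
    scores = np.where(adj > 0, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def random_walk_profile(transition, start, target, steps):
    """Probability of being at `target` after 1..steps steps from `start`."""
    dist = np.zeros(transition.shape[0])
    dist[start] = 1.0
    profile = []
    for _ in range(steps):
        dist = dist @ transition
        profile.append(dist[target])
    return np.array(profile)


# Hypothetical toy subgraph: a path 0 - 1 - 2 - 3 with random 8-dim embeddings.
rng = np.random.default_rng(0)
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
embeddings = rng.normal(size=(4, 8))

transition = attention_transition_matrix(embeddings, adj)
profile = random_walk_profile(transition, start=0, target=3, steps=4)
```

A profile like this is a fixed-length vector, so it can be passed to a multi-layer perceptron regardless of the size of the extracted subgraph.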
Affiliation(s)
- Jong-Hoon Park
  - Division of Software, Yonsei University Mirae Campus, Wonju-si, 26493, Gangwon-do, Korea
- Young-Rae Cho
  - Division of Software, Yonsei University Mirae Campus, Wonju-si, 26493, Gangwon-do, Korea.
  - Division of Digital Healthcare, Yonsei University Mirae Campus, Wonju-si, 26493, Gangwon-do, Korea.

10
Li Y, Yang Y, Tong Z, Wang Y, Mi Q, Bai M, Liang G, Li B, Shu K. A comparative benchmarking and evaluation framework for heterogeneous network-based drug repositioning methods. Brief Bioinform 2024; 25:bbae172. [PMID: 38647153 PMCID: PMC11033846 DOI: 10.1093/bib/bbae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 02/25/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
Computational drug repositioning, which involves identifying new indications for existing drugs, is an increasingly attractive research area due to its advantages in reducing both overall cost and development time. As a result, a growing number of computational drug repositioning methods have emerged. Heterogeneous network-based drug repositioning methods have been shown to outperform other approaches. However, there is a dearth of systematic evaluation studies of these methods, encompassing performance, scalability and usability, as well as a standardized process for evaluating new methods. Additionally, previous studies have only compared several methods, with conflicting results. In this context, we conducted a systematic benchmarking study of 28 heterogeneous network-based drug repositioning methods on 11 existing datasets. We developed a comprehensive framework to evaluate their performance, scalability and usability. Our study revealed that methods such as HGIMC, ITRPCA and BNNR exhibit the best overall performance, as they rely on matrix completion or factorization. HINGRL, MLMC, ITRPCA and HGIMC demonstrate the best performance, while NMFDR, GROBMC and SCPMF display superior scalability. For usability, HGIMC, DRHGCN and BNNR are the top performers. Building on these findings, we developed an online tool called HN-DREP (http://hn-drep.lyhbio.com/) to facilitate researchers in viewing all the detailed evaluation results and selecting the appropriate method. HN-DREP also provides an external drug repositioning prediction service for a specific disease or drug by integrating predictions from all methods. Furthermore, we have released a Snakemake workflow named HN-DRES (https://github.com/lyhbio/HN-DRES) to facilitate benchmarking and support the extension of new methods into the field.
Affiliation(s)
- Yinghong Li
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yinqi Yang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Zhuohao Tong
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yu Wang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Qin Mi
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Mingze Bai
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Guizhao Liang
  - Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, P. R. China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, P. R. China
- Kunxian Shu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China

11
Zhang P, Zhang D, Zhou W, Wang L, Wang B, Zhang T, Li S. Network pharmacology: towards the artificial intelligence-based precision traditional Chinese medicine. Brief Bioinform 2023; 25:bbad518. [PMID: 38197310 PMCID: PMC10777171 DOI: 10.1093/bib/bbad518] [Citation(s) in RCA: 144] [Impact Index Per Article: 72.0] [Received: 09/02/2023] [Revised: 11/03/2023] [Accepted: 11/30/2023] [Indexed: 01/11/2024] Open
Abstract
Network pharmacology (NP) provides a new methodological perspective for understanding traditional medicine from a holistic perspective, giving rise to frontiers such as traditional Chinese medicine network pharmacology (TCM-NP). With the development of artificial intelligence (AI) technology, it is key for NP to develop network-based AI methods to reveal the treatment mechanism of complex diseases from massive omics data. In this review, focusing on the TCM-NP, we summarize involved AI methods into three categories: network relationship mining, network target positioning and network target navigating, and present the typical application of TCM-NP in uncovering biological basis and clinical value of Cold/Hot syndromes. Collectively, our review provides researchers with an innovative overview of the methodological progress of NP and its application in TCM from the AI perspective.
Affiliation(s)
- Peng Zhang
- Institute for TCM-X, MOE Key Laboratory of Bioinformatics/Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Dingfan Zhang
- Institute for TCM-X, MOE Key Laboratory of Bioinformatics/Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Wuai Zhou
- China Mobile Information System Integration Co., Ltd, Beijing 100032, China
- Lan Wang
- Institute for TCM-X, MOE Key Laboratory of Bioinformatics/Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Boyang Wang
- Institute for TCM-X, MOE Key Laboratory of Bioinformatics/Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Tingyu Zhang
- Institute for TCM-X, MOE Key Laboratory of Bioinformatics/Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Shao Li
- Institute for TCM-X, MOE Key Laboratory of Bioinformatics/Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
|
12
|
Muniyappan S, Rayan AXA, Varrieth GT. EGeRepDR: An enhanced genetic-based representation learning for drug repurposing using multiple biomedical sources. J Biomed Inform 2023; 147:104528. [PMID: 37858852 DOI: 10.1016/j.jbi.2023.104528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 09/11/2023] [Accepted: 10/16/2023] [Indexed: 10/21/2023]
Abstract
MOTIVATION Drug repurposing (DR) is an imminent approach for identifying novel therapeutic indications for available drugs and discovering novel drugs for previously untreatable diseases. DR has attracted major attention in the pharmaceutical industry because of the high cost and long time needed to bring new drugs to market through traditional drug development. The DR task depends largely on genetic information, since drugs revert the altered gene expression (GE) of diseases to normal. Many existing studies have not considered genetic information when predicting potential candidates. METHOD We propose a novel multimodal framework that uses genetic aspects of drugs and diseases, such as genes, pathways, and gene signatures or expression, to enhance the performance of DR using various data sources. First, a heterogeneous biological network (HBN) is constructed with three types of nodes (drug, disease, and gene) and four types of edges: similarities (drug, gene, and disease), drug-gene, gene-disease, and drug-disease. Next, a modified graph auto-encoder (GAE*) model is applied to learn the representation of drug and disease nodes using the topological structure and edge information. Second, the HBN is enhanced with information extracted from biomedical literature and ontology, using a novel semi-supervised pattern embedding-based bootstrapping model and a novel DR perspective representation learning, respectively, to improve prediction performance. Finally, the proposed system uses a neural network model to generate the probability score of drug-disease pairs. RESULTS We demonstrate the efficiency of the proposed model on various datasets, achieving outstanding performance in 5-fold cross-validation (AUC = 0.99, AUPR = 0.98). Further, we validated the top-ranked potential candidates using pathway analysis and showed that the known and predicted candidates share common genes in the pathways.
Affiliation(s)
- Saranya Muniyappan
- Computer Science and Engineering, CEG Campus, Anna University, Chennai, Tamil Nadu, India.
|
13
|
Yang M, Yang B, Duan G, Wang J. ITRPCA: a new model for computational drug repositioning based on improved tensor robust principal component analysis. Front Genet 2023; 14:1271311. [PMID: 37795241 PMCID: PMC10545866 DOI: 10.3389/fgene.2023.1271311] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/02/2023] [Accepted: 08/23/2023] [Indexed: 10/06/2023] Open
Abstract
Background: Drug repositioning is considered a promising drug development strategy with the goal of discovering new uses for existing drugs. Compared with experimental screening for drug discovery, computational drug repositioning offers lower cost and higher efficiency and has therefore become a hot topic in bioinformatics. However, sparse samples, multi-source information, and noise make it difficult to accurately identify potential drug-associated indications. Methods: In this article, we propose a new scheme with improved tensor robust principal component analysis (ITRPCA) in multi-source data to predict promising drug-disease associations. First, we use a weighted k-nearest neighbor (WKNN) approach to increase the overall density of the drug-disease association matrix, which assists in prediction. Second, a drug tensor with five frontal slices and a disease tensor with two frontal slices are constructed using multi-similarity matrices and an updated association matrix. The two target tensors naturally integrate multiple sources of data from the drug-side aspect and the disease-side aspect, respectively. Third, ITRPCA is employed to isolate the low-rank tensor and noise information in the tensor. In this step, an additional range constraint is incorporated to ensure that all the predicted entry values of a low-rank tensor are within the specific interval. Finally, we focus on identifying promising drug indications by analyzing drug-disease association pairs derived from the low-rank drug and low-rank disease tensors. Results: We evaluate the effectiveness of the ITRPCA method by comparing it with five prominent existing drug repositioning methods. This evaluation is carried out using 10-fold cross-validation and independent testing experiments. Our numerical results show that ITRPCA not only yields higher prediction accuracy but also exhibits remarkable computational efficiency. Furthermore, case studies demonstrate the practical effectiveness of our method.
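The WKNN densification step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the formula, so the weighting used here (neighbour similarity multiplied by a rank-based decay, drug side only) is an assumption borrowed from common weighted-neighbour preprocessing, and the names `wknn_fill`, `k`, and `decay` are hypothetical.

```python
def wknn_fill(assoc, sim, k=2, decay=0.5):
    """Densify a drug-disease matrix from each drug's k most similar drugs.

    assoc: list of rows (drugs) x columns (diseases), entries in [0, 1].
    sim:   square drug-drug similarity matrix.
    Assumption: weight = decay**rank * similarity; known entries are kept.
    """
    n_drugs, n_diseases = len(assoc), len(assoc[0])
    filled = [row[:] for row in assoc]
    for i in range(n_drugs):
        # Rank the other drugs by similarity to drug i, most similar first.
        neighbours = sorted(
            (j for j in range(n_drugs) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        norm = sum(sim[i][j] for j in neighbours)
        if norm == 0:
            continue  # no informative neighbours for this drug
        for c in range(n_diseases):
            estimate = sum(
                (decay ** rank) * sim[i][j] * assoc[j][c]
                for rank, j in enumerate(neighbours)
            ) / norm
            # Never overwrite a known association with a smaller estimate.
            filled[i][c] = max(assoc[i][c], estimate)
    return filled


assoc = [[1, 0], [0, 0], [0, 1]]
sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
dense = wknn_fill(assoc, sim, k=2, decay=0.5)
```

In this toy input the second drug has no known associations, so its row is filled entirely from its two neighbours; a full pipeline would typically repeat the same step on the disease side and combine the two estimates.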
Affiliation(s)
- Mengyun Yang
- School of Mechanical and Energy Engineering, Shaoyang University, Shaoyang, China
- School of Computer Science, Hunan First Normal University, Changsha, China
- Bin Yang
- School of Mechanical and Energy Engineering, Shaoyang University, Shaoyang, China
- Guihua Duan
- School of Computer Science and Engineering, Central South University, Changsha, China
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, China
|
14
|
Huang Y, Wu Z, Lan W, Zhong C. Predicting Disease-Associated N7-Methylguanosine (m7G) Sites via Random Walk on Heterogeneous Network. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:3173-3181. [PMID: 37294648 DOI: 10.1109/tcbb.2023.3284505] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
Recent studies revealed that the modification of N7-methylguanosine (m7G) has associations with many human diseases. Effectively identifying disease-associated m7G methylation sites would provide crucial clues for disease diagnosis and treatment. Previous studies have developed computational methods to predict disease-associated m7G sites based on similarities among m7G sites and diseases. However, few have focused on how the known m7G-disease association information influences the similarity measures of m7G sites and diseases, which could improve the identification of disease-associated m7G sites. In this work, we propose a computational method called m7GDP-RW to predict m7G-disease associations using a random walk algorithm. m7GDP-RW first incorporates the feature information of m7G sites and diseases with the known m7G-disease associations to compute m7G site similarity and disease similarity. Then m7GDP-RW combines the known m7G-disease associations with the computed similarities of m7G sites and diseases to construct an m7G-disease heterogeneous network. Finally, m7GDP-RW utilizes a two-pass random walk with restart algorithm to find novel m7G-disease associations on the heterogeneous network. The experimental results show that our method achieves higher prediction accuracy compared to the existing methods. The case study also demonstrates the effectiveness of m7GDP-RW in discovering potential m7G-disease associations.
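Random walk with restart, the core procedure named in the abstract above, can be sketched as follows. This is a plain single-pass version on a small toy network, not the two-pass variant of m7GDP-RW; the function name, the column normalisation, and the example weights are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker restarting at `seeds`.

    adj:   square, non-negative weighted adjacency matrix (list of lists).
    seeds: indices of the nodes the walker jumps back to.
    """
    n = len(adj)
    # Column-normalise so each column is a probability distribution.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy heterogeneous network: nodes 0-1 are m7G sites, nodes 2-3 are diseases.
# Site-site and disease-disease edges are similarities; cross edges are known associations.
adj = [
    [0.0, 0.5, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.4],
    [0.0, 1.0, 0.4, 0.0],
]
scores = random_walk_with_restart(adj, seeds=[0], restart=0.5)
```

Seeding the walk at a site and ranking the disease nodes by their scores gives candidate associations for that site; a larger `restart` keeps the walker closer to the seed.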
|
15
|
Ai C, Yang H, Ding Y, Tang J, Guo F. Low Rank Matrix Factorization Algorithm Based on Multi-Graph Regularization for Detecting Drug-Disease Association. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:3033-3043. [PMID: 37159322 DOI: 10.1109/tcbb.2023.3274587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Detecting potential associations between drugs and diseases plays an indispensable role in drug development and has become a research hotspot in recent years. Compared with traditional methods, computational approaches are fast and inexpensive, which greatly accelerates the prediction of drug-disease associations. In this study, we propose a novel similarity-based method of low-rank matrix decomposition based on multi-graph regularization. On the basis of low-rank matrix factorization with L2 regularization, the multi-graph regularization constraint is constructed by combining a variety of similarity matrices from drugs and diseases, respectively. In the experiments, we analyze how different combinations of similarities affect performance and find that combining all the similarity information in the drug space is unnecessary; a subset of the similarity information is enough to achieve the desired performance. Our method is then compared with other existing models on three datasets (Fdataset, Cdataset and LRSSLdataset) and shows a clear advantage in AUPR. In addition, a case study demonstrates the model's superior ability to predict potential disease-related drugs. Finally, we compare our model with several methods on six real-world datasets, where it also performs well.
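The abstract above does not state its objective function. A generic form of low-rank factorization with L2 and multi-graph regularization, given only as a sketch, is shown below. Here $Y$ is the drug-disease association matrix, $U$ and $V$ are assumed low-rank drug and disease factors, $L^{(r)}_i$ and $L^{(d)}_j$ are graph Laplacians built from the $i$-th drug and $j$-th disease similarity matrices, and $\lambda$, $\alpha_i$, $\beta_j$ are hypothetical trade-off weights.

```latex
\min_{U,V}\; \lVert Y - U V^{\top} \rVert_F^2
  + \lambda \left( \lVert U \rVert_F^2 + \lVert V \rVert_F^2 \right)
  + \sum_{i} \alpha_i \,\operatorname{tr}\!\left( U^{\top} L^{(r)}_i U \right)
  + \sum_{j} \beta_j \,\operatorname{tr}\!\left( V^{\top} L^{(d)}_j V \right)
```

Under this form, setting some $\alpha_i$ to zero drops the corresponding drug similarity, which is one way to read the abstract's finding that not every drug-side similarity is needed.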
|
16
|
Zhu X, Lu W. Multi-Label Classification With Dual Tail-Node Augmentation for Drug Repositioning. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:3068-3079. [PMID: 37418410 DOI: 10.1109/tcbb.2023.3292883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
Due to the lengthy and costly process of new drug discovery, increasing attention has been paid to drug repositioning, i.e., identifying new drug-disease associations. Current machine learning methods for drug repositioning mainly leverage matrix factorization or graph neural networks, and have achieved impressive performance. However, they often suffer from insufficient training labels for inter-domain associations while ignoring the intra-domain associations. Moreover, they often neglect the importance of tail nodes that have few known associations, which limits their effectiveness in drug repositioning. In this paper, we propose a novel multi-label classification model with dual Tail-Node Augmentation for Drug Repositioning (TNA-DR). We incorporate disease-disease similarity and drug-drug similarity information into a k-nearest neighbor (kNN) augmentation module and a contrastive augmentation module, respectively, which effectively complements the weak supervision of drug-disease associations. Furthermore, before employing the two augmentation modules, we filter the nodes by their degrees, so that the two modules are only applied to tail nodes. We conduct 10-fold cross-validation experiments on four different real-world datasets, and our model achieves state-of-the-art performance on all four datasets. We also demonstrate our model's capability of identifying drug candidates for new diseases and discovering potential new links between existing drugs and diseases.
|
17
|
Zhang S, Strayer N, Vessels T, Choi K, Wang GW, Li Y, Bejan CA, Hsi RS, Bick AG, Velez Edwards DR, Savona MR, Philips EJ, Pulley J, Self WH, Hopkins WC, Roden DM, Smoller JW, Ruderfer DM, Xu Y. PheMIME: An Interactive Web App and Knowledge Base for Phenome-Wide, Multi-Institutional Multimorbidity Analysis. medRxiv 2023:2023.07.23.23293047. [PMID: 37547012 PMCID: PMC10402210 DOI: 10.1101/2023.07.23.23293047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/08/2023]
Abstract
Motivation Multimorbidity, characterized by the simultaneous occurrence of multiple diseases in an individual, is an increasing global health concern, posing substantial challenges to healthcare systems. Comprehensive understanding of disease-disease interactions and intrinsic mechanisms behind multimorbidity can offer opportunities for innovative prevention strategies, targeted interventions, and personalized treatments. Yet, there exist limited tools and datasets that characterize multimorbidity patterns across different populations. To bridge this gap, we used large-scale electronic health record (EHR) systems to develop the Phenome-wide Multi-Institutional Multimorbidity Explorer (PheMIME), which facilitates research in exploring and comparing multimorbidity patterns among multiple institutions, potentially leading to the discovery of novel and robust disease associations and patterns that are interoperable across different systems and organizations. Results PheMIME integrates summary statistics from phenome-wide analyses of disease multimorbidities. These are currently derived from three major institutions: Vanderbilt University Medical Center, Mass General Brigham, and the UK Biobank. PheMIME offers interactive exploration of multimorbidity through multi-faceted visualization. Incorporating an enhanced version of associationSubgraphs, PheMIME enables dynamic analysis and inference of disease clusters, promoting the discovery of multimorbidity patterns. Once a disease of interest is selected, the tool generates interactive visualizations and tables through which users can delve into multimorbidities or multimorbidity networks within a single system or compare them across multiple systems. The utility of PheMIME is demonstrated through a case study on schizophrenia. Availability and implementation The PheMIME knowledge base and web application are accessible at https://prod.tbilab.org/PheMIME/. A comprehensive tutorial, including a use-case example, is available at https://prod.tbilab.org/PheMIME_supplementary_materials/. Furthermore, the source code for PheMIME can be freely downloaded from https://github.com/tbilab/PheMIME. Data availability statement The data underlying this article are available in the article and in its online web application or supplementary material.
Affiliation(s)
- Siwei Zhang
- Department of Biostatistics, Vanderbilt University, Nashville, TN, USA
- Tess Vessels
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Karmel Choi
- Psychiatric & Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston MA
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston MA
- Yajing Li
- Department of Biostatistics, Vanderbilt University, Nashville, TN, USA
- Cosmin A Bejan
- Department of Biomedical informatics, Vanderbilt University, Nashville, TN, USA
- Ryan S Hsi
- Department of Urology, Vanderbilt University Medical Center, Nashville, TN, USA
- Alexander G Bick
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Digna R Velez Edwards
- Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN, USA
- Michael R Savona
- Division of Hematology and Oncology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Elizabeth J Philips
- Center for Drug Safety and Immunology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, Western Australia, Australia
- Jill Pulley
- Vanderbilt Institute for Clinical and Translational Science, Vanderbilt University Medical Center, Nashville, TN, USA
- Wesley H Self
- Vanderbilt Institute for Clinical and Translational Science, Vanderbilt University Medical Center, Nashville, TN, USA
- Wilkins Consuelo Hopkins
- Vanderbilt Institute for Clinical and Translational Science, Vanderbilt University Medical Center, Nashville, TN, USA
- Dan M Roden
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
- Jordan W Smoller
- Psychiatric & Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston MA
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston MA
- Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA
- Douglas M Ruderfer
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biomedical informatics, Vanderbilt University, Nashville, TN, USA
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Yaomin Xu
- Department of Biostatistics, Vanderbilt University, Nashville, TN, USA
- Department of Biomedical informatics, Vanderbilt University, Nashville, TN, USA
|
18
|
Liu X, Gao L, Peng Y, Fang Z, Wang J. PheSom: a term frequency-based method for measuring human phenotype similarity on the basis of MeSH vocabulary. Front Genet 2023; 14:1185790. [PMID: 37496714 PMCID: PMC10366691 DOI: 10.3389/fgene.2023.1185790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 06/21/2023] [Indexed: 07/28/2023] Open
Abstract
Background: Phenotype similarity calculation should be used to help improve drug repurposing. In this study, based on the MeSH terms describing the phenotypes deposited in OMIM, we proposed a method, namely, PheSom (Phenotype Similarity On MeSH), to measure the similarity between phenotypes. PheSom counted the number of overlapping MeSH terms between two phenotypes and then took the weight of every MeSH term within each phenotype into account according to the term frequency-inverse document frequency (FIDC). Phenotype-related genes were used for the evaluation of our method. Results: A 7,739 × 7,739 similarity score matrix was finally obtained and the number of phenotype pairs was dramatically decreased with the increase of similarity score. Besides, the overlapping rates of phenotype-related genes were remarkably increased with the increase of similarity score between phenotypes, which supports the reliability of our method. Conclusion: We anticipate our method can be applied to identifying novel therapeutic methods for complex diseases.
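The idea of weighting shared MeSH terms by term frequency-inverse document frequency can be sketched as follows. The abstract above does not give PheSom's scoring formula, so the cosine similarity used here is an assumption, and the phenotype names, term lists, and function names are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each phenotype to a {term: tf-idf weight} dictionary.

    docs: {phenotype_name: list of MeSH terms (repeats allowed)}.
    """
    n_docs = len(docs)
    # Document frequency: in how many phenotypes does each term occur?
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            term: (count / len(terms)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical phenotypes annotated with MeSH terms.
phenotypes = {
    "A": ["Seizures", "Ataxia"],
    "B": ["Seizures", "Hypotonia"],
    "C": ["Anemia"],
}
vecs = tfidf_vectors(phenotypes)
sim_ab = cosine(vecs["A"], vecs["B"])
sim_ac = cosine(vecs["A"], vecs["C"])
```

Phenotypes A and B share one term and therefore get a positive score, while A and C share none and score zero; terms that occur in every phenotype receive zero weight and do not contribute.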
Affiliation(s)
- Xinhua Liu
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China
- Ling Gao
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
- Yonglin Peng
- Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
- Zhonghai Fang
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China
- Ju Wang
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China
|
19
|
Wang Y, Gao YL, Wang J, Li F, Liu JX. MSGCA: Drug-Disease Associations Prediction Based on Multi-Similarities Graph Convolutional Autoencoder. IEEE J Biomed Health Inform 2023; 27:3686-3694. [PMID: 37163398 DOI: 10.1109/jbhi.2023.3272154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Identifying drug-disease associations (DDAs) is critical to the development of drugs. Traditional methods to determine DDAs are expensive and inefficient. Therefore, it is imperative to develop more accurate and effective methods for DDA prediction. Most current DDA prediction methods use the original DDA matrix directly. However, the original DDA matrix is sparse, which greatly affects the prediction results. Hence, a prediction method based on a multi-similarities graph convolutional autoencoder (MSGCA) is proposed for DDA prediction. First, MSGCA integrates multiple drug similarities and disease similarities using the centered kernel alignment-based multiple kernel learning (CKA-MKL) algorithm to form a new drug similarity and a new disease similarity, respectively. Second, the new drug and disease similarities are improved by linear neighborhood, and the DDA matrix is reconstructed by weighted K nearest neighbor profiles. Next, the reconstructed DDAs and the improved drug and disease similarities are integrated into a heterogeneous network. Finally, the graph convolutional autoencoder with attention mechanism is utilized to predict DDAs. Compared with extant methods, MSGCA shows superior results on three datasets. Furthermore, case studies demonstrate the reliability of MSGCA.
|
20
|
Xu Z, Marchionni L, Wang S. MultiNEP: a multi-omics network enhancement framework for prioritizing disease genes and metabolites simultaneously. Bioinformatics 2023; 39:btad333. [PMID: 37216914 PMCID: PMC10250081 DOI: 10.1093/bioinformatics/btad333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 04/28/2023] [Accepted: 05/19/2023] [Indexed: 05/24/2023] Open
Abstract
MOTIVATION Many studies have successfully used network information to prioritize candidate omics profiles associated with diseases. The metabolome, as the link between genotypes and phenotypes, has accumulated growing attention. Using a "multi-omics" network constructed with a gene-gene network, a metabolite-metabolite network, and a gene-metabolite network to simultaneously prioritize candidate disease-associated metabolites and gene expressions could further utilize gene-metabolite interactions that are not used when prioritizing them separately. However, the number of metabolites is usually 100 times fewer than that of genes. Without accounting for this imbalance issue, we cannot effectively use gene-metabolite interactions when simultaneously prioritizing disease-associated metabolites and genes. RESULTS Here, we developed a Multi-omics Network Enhancement Prioritization (MultiNEP) framework with a weighting scheme to reweight contributions of different sub-networks in a multi-omics network to effectively prioritize candidate disease-associated metabolites and genes simultaneously. In simulation studies, MultiNEP outperforms competing methods that do not address network imbalances and identifies more true signal genes and metabolites simultaneously when we down-weight relative contributions of the gene-gene network and up-weight that of the metabolite-metabolite network to the gene-metabolite network. Applications to two human cancer cohorts show that MultiNEP prioritizes more cancer-related genes by effectively using both within- and between-omics interactions after handling network imbalance. AVAILABILITY AND IMPLEMENTATION The developed MultiNEP framework is implemented in an R package and available at: https://github.com/Karenxzr/MultiNep.
Affiliation(s)
- Zhuoran Xu
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10065, United States
- Luigi Marchionni
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10065, United States
- Shuang Wang
- Department of Biostatistics, Columbia University, New York, NY 10032, United States
|
21
|
Wang Y, Song J, Wei M, Duan X. Predicting Potential Drug-Disease Associations Based on Hypergraph Learning with Subgraph Matching. Interdiscip Sci 2023; 15:249-261. [PMID: 36906712 DOI: 10.1007/s12539-023-00556-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 02/06/2023] [Accepted: 02/06/2023] [Indexed: 03/13/2023]
Abstract
The search for potential drug-disease associations (DDA) can speed up drug development cycles, reduce costly wasted resources, and accelerate disease treatment by repurposing existing drugs that can control further disease progression. As technologies such as deep learning continue to mature, many researchers use these emerging technologies to predict potential DDA. DDA prediction remains challenging, and there is room for improvement, owing to issues such as the small number of known associations and possible noise in the data. To better predict DDA, we propose a computational approach based on hypergraph learning with subgraph matching (HGDDA). In particular, HGDDA first extracts feature subgraph information in the validated drug-disease association network and proposes a negative sampling strategy based on the similarity network to reduce the data imbalance. Second, the hypergraph Unet module is used by extracting Finally, the potential DDA is predicted by designing a hypergraph combination module to convolve and pool the two constructed hypergraphs separately, and by calculating the difference information between the subgraphs using cosine similarity for node matching. The performance of HGDDA is verified on two standard datasets by 10-fold cross-validation (10-CV), and the results outperform existing drug-disease prediction methods. In addition, to validate the overall utility of the model, the top 10 drugs for a specific disease are predicted in a case study and validated using the CTD database.
Affiliation(s)
- Yuanxu Wang
- Key Laboratory of Big Data Applied Technology State Ethnic Affairs Commission, Dalian Minzu University, Dalian, 116650, China
- School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116650, China
- Jinmiao Song
- Key Laboratory of Big Data Applied Technology State Ethnic Affairs Commission, Dalian Minzu University, Dalian, 116650, China
- School of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China
- Mingjie Wei
- Key Laboratory of Big Data Applied Technology State Ethnic Affairs Commission, Dalian Minzu University, Dalian, 116650, China
- School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116650, China
- Xiaodong Duan
- Key Laboratory of Big Data Applied Technology State Ethnic Affairs Commission, Dalian Minzu University, Dalian, 116650, China
- School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116650, China
|
22
|
Wang Z, Gu Y, Zheng S, Yang L, Li J. MGREL: A multi-graph representation learning-based ensemble learning method for gene-disease association prediction. Comput Biol Med 2023; 155:106642. [PMID: 36805231 DOI: 10.1016/j.compbiomed.2023.106642] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/24/2022] [Revised: 01/15/2023] [Accepted: 02/05/2023] [Indexed: 02/12/2023]
Abstract
The identification of gene-disease associations plays an important role in the exploration of pathogenic mechanisms and therapeutic targets. Computational methods have been regarded as an effective way to discover potential gene-disease associations in recent years. However, most of them ignored the combination of abundant genetic and therapeutic information with gene-disease network topology. To this end, we re-organized the current gene-disease association benchmark dataset by extracting the newest gene-disease associations from the OMIM database. Then, we developed a multi-graph representation learning-based ensemble model, named MGREL, to predict gene-disease associations. MGREL integrated two feature generation channels to extract gene and disease features: a knowledge extraction channel, which learned high-order representations from genetic and therapeutic information, and a graph learning channel, which acquired network topological representations through multiple advanced graph representation learning methods. Then, an ensemble learning method with 5 machine learning models was used as the classifier to predict the gene-disease association. Comprehensive experiments have demonstrated the significant performance achieved by MGREL compared to 5 state-of-the-art methods. For the major measurements (AUC = 0.925, AUPR = 0.935), the relative improvements of MGREL compared to the suboptimal methods are 3.24% and 2.75%, respectively. MGREL also achieved impressive improvements in the challenging tasks of predicting potential associations for unknown genes/diseases. In addition, case studies implied potential applications for MGREL in the discovery of potential therapeutic targets.
Affiliation(s)
- Ziyang Wang
- Institute of Medical Information IMI, Chinese Academy of Medical Sciences and Peking Union Medical College CAMS & PUMC, Beijing, 100020, China
- Yaowen Gu
- Institute of Medical Information IMI, Chinese Academy of Medical Sciences and Peking Union Medical College CAMS & PUMC, Beijing, 100020, China
- Si Zheng
- Institute of Medical Information IMI, Chinese Academy of Medical Sciences and Peking Union Medical College CAMS & PUMC, Beijing, 100020, China; Institute for Artificial Intelligence, Department of Computer Science and Technology, BNRist, Tsinghua University, Beijing, 100084, China
- Lin Yang
- Institute of Medical Information IMI, Chinese Academy of Medical Sciences and Peking Union Medical College CAMS & PUMC, Beijing, 100020, China
- Jiao Li
- Institute of Medical Information IMI, Chinese Academy of Medical Sciences and Peking Union Medical College CAMS & PUMC, Beijing, 100020, China.
|
23
|
Zhang Y, Xiang J, Tang L, Yang J, Li J. PGAGP: Predicting pathogenic genes based on adaptive network embedding algorithm. Front Genet 2023; 13:1087784. [PMID: 36744177 PMCID: PMC9895109 DOI: 10.3389/fgene.2022.1087784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 12/09/2022] [Indexed: 01/21/2023] Open
Abstract
The study of disease-gene associations is an important topic in the field of computational biology. The accumulation of massive amounts of biomedical data provides new possibilities for exploring potential relations between diseases and genes through computational strategies, but extracting valuable information from the data to predict pathogenic genes accurately and rapidly remains a challenging and meaningful task. Therefore, we present a novel computational method called PGAGP for inferring potential pathogenic genes based on an adaptive network embedding algorithm. PGAGP first extracts initial node features from a heterogeneous network of diseases and genes by Gaussian random projection and then optimizes the node features through an adaptive refining process. These low-dimensional features are used to improve the disease-gene heterogeneous network, and we apply network propagation to the improved heterogeneous network to predict pathogenic genes more effectively. Through a series of experiments, we study the effect of PGAGP's parameters and integration strategies on predictive performance and confirm that PGAGP performs better than state-of-the-art algorithms. Case studies show that many of the predicted candidate genes for specific diseases are supported as disease-related by literature verification and enrichment analysis, which further verifies the effectiveness of PGAGP. Overall, this work provides a useful solution for mining disease-gene heterogeneous networks to predict pathogenic genes more effectively.
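As a rough illustration of the Gaussian random projection step mentioned in this abstract, the sketch below projects each node's adjacency row into a low-dimensional space. It is not the PGAGP implementation: the function name `gaussian_random_projection`, the toy network, and the choice of variance `1/dim` are assumptions made for the example, and the adaptive refining process is not shown.

```python
import numpy as np

def gaussian_random_projection(adjacency, dim, seed=0):
    """Project each node's adjacency row into `dim` dimensions (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = adjacency.shape[1]
    # Entries drawn i.i.d. from N(0, 1/dim), a common scaling for random projections.
    projection = rng.normal(0.0, 1.0 / np.sqrt(dim), size=(n, dim))
    return adjacency @ projection

# Toy heterogeneous network (assumed data): nodes 0-1 are diseases, nodes 2-4 are genes.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0],
], dtype=float)

features = gaussian_random_projection(A, dim=3)  # one 3-dimensional vector per node
```

The projection matrix is fixed by the seed, so repeated calls give the same features; a refinement stage would start from these vectors.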
Affiliation(s)
- Yan Zhang
- School of Computer Science and Engineering, Central South University, Changsha, China
- School of Information Science and Engineering, Changsha Medical University, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
- Ju Xiang
- School of Computer Science and Engineering, Central South University, Changsha, China
- School of Information Science and Engineering, Changsha Medical University, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, China
- Department of Basic Medical Sciences and Neuroscience Research Center, Changsha Medical University, Changsha, China
- Liang Tang
- Academician Workstation, Changsha Medical University, Changsha, China
- Department of Basic Medical Sciences and Neuroscience Research Center, Changsha Medical University, Changsha, China
- Jialiang Yang
- Academician Workstation, Changsha Medical University, Changsha, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Geneis Beijing Co., Ltd, Beijing, China
- Jianming Li
- Academician Workstation, Changsha Medical University, Changsha, China
- Department of Basic Medical Sciences and Neuroscience Research Center, Changsha Medical University, Changsha, China
|
24
|
Wang MN, Xie XJ, You ZH, Ding DW, Wong L. A weighted non-negative matrix factorization approach to predict potential associations between drug and disease. J Transl Med 2022; 20:552. [PMID: 36463215 PMCID: PMC9719187 DOI: 10.1186/s12967-022-03757-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/18/2022] [Accepted: 11/06/2022] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND Associations of drugs with diseases provide important information for expediting drug development. Because the number of known drug-disease associations is still insufficient, and inferring associations through traditional in vitro experiments is time-consuming and costly, more accurate and reliable computational methods urgently need to be developed to predict potential associations of drugs with diseases. METHODS In this study, we present a model called weighted graph regularized collaborative non-negative matrix factorization for drug-disease association prediction (WNMFDDA). More specifically, we first calculated the drug similarity and disease similarity based on the chemical structures of drugs and medical description information of diseases, respectively. Then, to extend the model to work for new drugs and diseases, weighted [Formula: see text] nearest neighbor was used as a preprocessing step to reconstruct the interaction score profiles of drugs with diseases. Finally, a graph regularized non-negative matrix factorization model was used to identify potential associations between drugs and diseases. RESULTS Under ten-fold cross-validation, WNMFDDA achieved AUC values of 0.939 and 0.952 on Fdataset and Cdataset, respectively, outperforming other competing prediction methods. Moreover, case studies for several drugs and diseases were carried out to further verify the predictive performance of WNMFDDA. As a result, 13 (Doxorubicin), 13 (Amiodarone), 12 (Obesity) and 12 (Asthma) of the top 15 corresponding candidate diseases or drugs were confirmed by existing databases. CONCLUSIONS The experimental results demonstrated that WNMFDDA is an effective method for drug-disease association prediction. We believe that WNMFDDA is helpful for relevant biomedical researchers in follow-up studies.
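To make the factorization step in this abstract more concrete, here is a minimal sketch of a generic graph regularized non-negative matrix factorization with multiplicative updates. It is an assumption-laden illustration, not the WNMFDDA code: the objective `||Y - A B^T||^2 + lam * (Tr(A^T L_drug A) + Tr(B^T L_dis B))`, the function name, the single shared weight `lam`, and the toy matrices are all chosen for the example, and the weighted nearest-neighbor preprocessing is omitted.

```python
import numpy as np

def graph_regularized_nmf(Y, S_drug, S_dis, rank=2, lam=0.1, iters=200, seed=0):
    """Factor Y ~= A @ B.T with non-negative A, B and similarity-graph smoothing."""
    rng = np.random.default_rng(seed)
    A = rng.random((Y.shape[0], rank))   # drug latent factors
    B = rng.random((Y.shape[1], rank))   # disease latent factors
    D_drug = np.diag(S_drug.sum(axis=1))  # degree matrices of the similarity graphs
    D_dis = np.diag(S_dis.sum(axis=1))
    eps = 1e-12  # avoids division by zero
    for _ in range(iters):
        # Multiplicative updates keep A and B non-negative at every step.
        A *= (Y @ B + lam * S_drug @ A) / (A @ (B.T @ B) + lam * D_drug @ A + eps)
        B *= (Y.T @ A + lam * S_dis @ B) / (B @ (A.T @ A) + lam * D_dis @ B + eps)
    return A, B

# Toy data (assumed): 3 drugs x 4 diseases, with symmetric similarity matrices.
Y = np.array([[1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
S_drug = np.array([[1.0, 0.6, 0.1],
                   [0.6, 1.0, 0.2],
                   [0.1, 0.2, 1.0]])
S_dis = np.array([[1.0, 0.3, 0.1, 0.5],
                  [0.3, 1.0, 0.2, 0.1],
                  [0.1, 0.2, 1.0, 0.1],
                  [0.5, 0.1, 0.1, 1.0]])

A, B = graph_regularized_nmf(Y, S_drug, S_dis)
scores = A @ B.T  # predicted association scores for every drug-disease pair
```

Entries of `scores` at positions where `Y` is zero can then be ranked as candidate associations.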
Affiliation(s)
- Mei-Neng Wang
- School of Mathematics and Computer Science, Yichun University, Yichun, 336000 Jiangxi China
- Xue-Jun Xie
- School of Mathematics and Computer Science, Yichun University, Yichun, 336000 Jiangxi China
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- De-Wu Ding
- School of Mathematics and Computer Science, Yichun University, Yichun, 336000 Jiangxi China
- Leon Wong
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011 China; University of Chinese Academy of Sciences, Beijing, 100049 China
|
25
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: experimental results, databases, webservers and data fusion. Brief Bioinform 2022; 23:6696143. [PMID: 36094095 DOI: 10.1093/bib/bbac397] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 06/20/2022] [Revised: 07/19/2022] [Accepted: 08/15/2022] [Indexed: 12/14/2022] Open
Abstract
MicroRNAs (miRNAs) are gene regulators involved in the pathogenesis of complex diseases such as cancers, and thus serve as potential diagnostic markers and therapeutic targets. The prerequisite for designing effective miRNA therapies is accurate discovery of miRNA-disease associations (MDAs), which has attracted substantial research interests during the last 15 years, as reflected by more than 55 000 related entries available on PubMed. Abundant experimental data gathered from the wealth of literature could effectively support the development of computational models for predicting novel associations. In 2017, Chen et al. published the first-ever comprehensive review on MDA prediction, presenting various relevant databases, 20 representative computational models, and suggestions for building more powerful ones. In the current review, as the continuation of the previous study, we revisit miRNA biogenesis, detection techniques and functions; summarize recent experimental findings related to common miRNA-associated diseases; introduce recent updates of miRNA-relevant databases and novel database releases since 2017, present mainstream webservers and new webserver releases since 2017 and finally elaborate on how fusion of diverse data sources has contributed to accurate MDA prediction.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
26
|
He B, Wang K, Xiang J, Bing P, Tang M, Tian G, Guo C, Xu M, Yang J. DGHNE: network enhancement-based method in identifying disease-causing genes through a heterogeneous biomedical network. Brief Bioinform 2022; 23:6712302. [PMID: 36151744 DOI: 10.1093/bib/bbac405] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/16/2022] [Revised: 08/01/2022] [Accepted: 08/21/2022] [Indexed: 12/14/2022] Open
Abstract
The identification of disease-causing genes is critical for mechanistic understanding of disease etiology and for clinical intervention in disease prevention and treatment. Yet the existing approaches to this question are inadequate in accuracy and efficiency, demanding computational methods with higher identification power. Here, we proposed a new method called DGHNE to identify disease-causing genes through a heterogeneous biomedical network empowered by network enhancement. First, a disease-disease association network was constructed from the cosine similarity scores between phenotype annotation vectors of diseases, and a new heterogeneous biomedical network was constructed by using disease-gene associations to connect the disease-disease network and gene-gene network. Then, the heterogeneous biomedical network was further enhanced by using network embedding based on Gaussian random projection. Finally, network propagation was used to identify candidate genes in the enhanced network. We applied DGHNE together with five other methods to the most up-to-date disease-gene association database, DisGeNet. Compared with all other methods, DGHNE displayed the highest area under the receiver operating characteristic curve and the precision-recall curve, as well as the highest precision and recall, in both global 5-fold cross-validation and the prediction of new disease-gene associations. We further applied DGHNE to identify the candidate causal genes of Parkinson's disease and diabetes mellitus, and the genes connecting hyperglycemia and diabetes mellitus. In all cases, the predicted causal genes were enriched in disease-associated gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, and the gene-disease associations were strongly supported by independent experimental studies.
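The first construction step described in this abstract, scoring disease pairs by the cosine similarity of their phenotype annotation vectors, can be sketched as follows. The annotation matrix is invented toy data and the function name is hypothetical; how DGHNE thresholds or weights the resulting network is not stated here and is not modeled.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    unit = X / norms
    return unit @ unit.T

# Toy phenotype annotations (assumed): rows are diseases, columns are phenotype terms.
annotations = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)

disease_sim = cosine_similarity_matrix(annotations)
# Diseases 0 and 1 share one of their two phenotypes, so their similarity is 0.5;
# disease 2 shares none with disease 0, so that similarity is 0.
```

The resulting matrix can serve as the weighted adjacency of a disease-disease network.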
Affiliation(s)
- Binsheng He
- Academician Workstation, Changsha Medical University, Changsha 410219, China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China; School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
- Kun Wang
- School of Mathematical Sciences, Ocean University of China, Qingdao 266100, China
- Ju Xiang
- Academician Workstation, Changsha Medical University, Changsha 410219, China
- Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha 410219, China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China; School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
- Min Tang
- School of Life Sciences, Jiangsu University, Zhenjiang 212001, Jiangsu, China
- Geng Tian
- Geneis (Beijing) Co., Ltd., Beijing 100102, China
- Cheng Guo
- Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
- Miao Xu
- Broad institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
- Jialiang Yang
- Academician Workstation, Changsha Medical University, Changsha 410219, China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China; School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China; Geneis (Beijing) Co., Ltd., Beijing 100102, China
|
27
|
Zhao BW, Su XR, Hu PW, Ma YP, Zhou X, Hu L. A geometric deep learning framework for drug repositioning over heterogeneous information networks. Brief Bioinform 2022; 23:6692552. [PMID: 36125202 DOI: 10.1093/bib/bbac384] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 06/05/2022] [Revised: 08/01/2022] [Accepted: 08/09/2022] [Indexed: 12/14/2022] Open
Abstract
Drug repositioning (DR) is a promising strategy to discover new indications of approved drugs with artificial intelligence techniques, thus improving traditional drug discovery and development. However, most DR computational methods fall short of taking into account the non-Euclidean nature of biomedical network data. To overcome this problem, a deep learning framework, namely DDAGDL, is proposed to predict drug-disease associations (DDAs) by using geometric deep learning (GDL) over a heterogeneous information network (HIN). Incorporating complex biological information into the topological structure of the HIN, DDAGDL effectively learns the smoothed representations of drugs and diseases with an attention mechanism. Experimental results demonstrate the superior performance of DDAGDL on three real-world datasets under 10-fold cross-validation when compared with state-of-the-art DR methods in terms of several evaluation metrics. Our case studies and molecular docking experiments indicate that DDAGDL is a promising DR tool that gains new insights into exploiting geometric prior knowledge for improved efficacy.
Affiliation(s)
- Bo-Wei Zhao
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Xiao-Rui Su
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Peng-Wei Hu
- Merck China Innovation Hub, Shanghai 200000, China
- Yu-Peng Ma
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Xi Zhou
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Lun Hu
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
|
28
|
Liu BM, Gao YL, Zhang DJ, Zhou F, Wang J, Zheng CH, Liu JX. A new framework for drug-disease association prediction combing light-gated message passing neural network and gated fusion mechanism. Brief Bioinform 2022; 23:6775584. [PMID: 36305457 DOI: 10.1093/bib/bbac457] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/26/2022] [Revised: 09/07/2022] [Accepted: 09/23/2022] [Indexed: 12/14/2022] Open
Abstract
With the development of research on the complex aetiology of many diseases, computational drug repositioning methodology has proven to be a shortcut compared with costly and inefficient traditional methods. Therefore, developing more promising computational methods is indispensable for finding new candidate diseases to treat with existing drugs. In this paper, a model called GLGMPNN, which integrates a new variant of message passing neural network and a novel gated fusion mechanism, is proposed for drug-disease association prediction. First, a light-gated message passing neural network (LGMPNN), including message passing, aggregation and updating, is proposed to separately extract multiple pieces of information from the similarity networks and the association network. Then, a gated fusion mechanism consisting of a forget gate and an output gate is applied to integrate the multiple pieces of information. The forget gate, calculated from the multiple embeddings, is built to integrate the association information into the similarity information. Furthermore, the final node representations are controlled by the output gate, which fuses the topology information of the networks and the initial similarity information. Finally, a bilinear decoder is adopted to reconstruct an adjacency matrix for drug-disease associations. Evaluated by 10-fold cross-validation, GLGMPNN achieves excellent performance compared with current models. Follow-up studies show that our model can effectively discover novel drug-disease associations.
Affiliation(s)
- Bao-Min Liu
- School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Ying-Lian Gao
- Qufu Normal University Library, Qufu Normal University, Rizhao, 276826, Shandong, China
- Dai-Jun Zhang
- School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Feng Zhou
- School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Juan Wang
- School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Chun-Hou Zheng
- School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
|
29
|
Mongia A, Chouzenoux E, Majumdar A. Computational Prediction of Drug-Disease Association Based on Graph-Regularized One Bit Matrix Completion. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3332-3339. [PMID: 35816539 DOI: 10.1109/tcbb.2022.3189879] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/15/2023]
Abstract
Investigation of existing drugs is an effective alternative to the discovery of new drugs for treating diseases. This task of drug repositioning can be assisted by various kinds of computational methods that predict the best indication for a drug given open-source biological datasets. Because similar drugs tend to have common pathways and disease indications, the association matrix is assumed to have a low-rank structure. Hence, the problem of drug-disease association prediction can be modeled as a low-rank matrix completion problem. In this work, we propose a novel matrix completion framework for the prediction of drug-disease indications that makes use of the side-information associated with drugs/diseases, modeled as a neighborhood graph: graph regularized 1-bit matrix completion (GR1BMC). The algorithm is specially designed for binary data and uses a parallel proximal algorithm to solve the resulting minimization problem, taking into account all the constraints, including the neighborhood graph incorporation and the restriction of predicted scores to the specified range. The results have been validated on two standard databases by evaluating the AUC across 10-fold cross-validation splits. The usage of the method is also evaluated through a case study in which the top 5 indications are predicted for novel drugs and then verified against the CTD database.
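The low-rank completion idea in this abstract can be illustrated with a much simpler stand-in than GR1BMC itself: iterative singular value soft-thresholding over the observed entries, with scores clipped to [0, 1]. This sketch does not include the 1-bit likelihood, the graph regularizer, or the parallel proximal solver; the function name, the threshold `tau`, and the toy matrix are assumptions for the example.

```python
import numpy as np

def soft_impute(Y, mask, tau=0.5, iters=100):
    """Fill entries where mask is False with a low-rank estimate (simplified stand-in)."""
    X = np.where(mask, Y, 0.0)
    for _ in range(iters):
        # Keep observed entries; use the current estimate for the unobserved ones.
        filled = np.where(mask, Y, X)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - tau, 0.0)           # shrink singular values (nuclear-norm proximal step)
        X = np.clip((U * s) @ Vt, 0.0, 1.0)    # keep predicted scores inside [0, 1]
    return X

# Toy drug-disease matrix (assumed); mask marks which entries are treated as observed.
Y = np.array([[1, 0, 1, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 1, 0, 0]], dtype=float)
mask = np.ones_like(Y, dtype=bool)
mask[3, 3] = False  # pretend this entry is unknown

completed = soft_impute(Y, mask)
candidate_score = completed[3, 3]  # score assigned to the held-out pair
```

Shrinking the singular values favors low-rank estimates, which is the assumption the abstract motivates with the observation that similar drugs share indications.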
|
30
|
Wang L, Peng J, Kuang L, Tan Y, Chen Z. Identification of Essential Proteins Based on Local Random Walk and Adaptive Multi-View Multi-Label Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3507-3516. [PMID: 34788220 DOI: 10.1109/tcbb.2021.3128638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Accumulating evidence indicates that essential proteins play vital roles in human physiological processes. Although research on the prediction of essential proteins has developed rapidly in recent years, existing methods still have various limitations, such as unsatisfactory data suitability and low prediction accuracy. In this manuscript, a novel method called RWAMVL is proposed to predict essential proteins based on Random Walk and Adaptive Multi-View multi-label Learning. In RWAMVL, considering that inherent noise is ubiquitous in existing datasets of known protein-protein interactions (PPIs), a variety of features, including biological features of proteins and topological features of PPI networks, were first obtained by adaptive multi-view multi-label learning. Then, an improved random walk method was designed to detect essential proteins based on these features. Finally, to verify the predictive performance of RWAMVL, intensive experiments were conducted to compare it with multiple state-of-the-art predictive methods under different experimental frameworks. The results show that RWAMVL achieves better prediction accuracy than all the competing methods, which suggests that RWAMVL may be a potential tool for the prediction of key proteins in the future.
|
31
|
Early illustrations of the importance of systematic phenotyping. Eur J Hum Genet 2022; 30:1102. [PMID: 36221027 PMCID: PMC9554047 DOI: 10.1038/s41431-022-01165-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 07/21/2022] [Indexed: 11/05/2022] Open
|
32
|
Xie G, Xu H, Li J, Gu G, Sun Y, Lin Z, Zhu Y, Wang W, Wang Y, Shao J. DRPADC: A novel drug repositioning algorithm predicting adaptive drugs for COVID-19. Comput Chem Eng 2022; 166:107947. [PMID: 35942213 PMCID: PMC9349049 DOI: 10.1016/j.compchemeng.2022.107947] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/27/2021] [Revised: 04/13/2022] [Accepted: 07/27/2022] [Indexed: 12/25/2022]
Abstract
Given that the usual process of developing a new vaccine or drug for COVID-19 demands significant time and funds, drug repositioning has emerged as a promising therapeutic strategy. We propose a method named DRPADC to predict novel drug-disease associations effectively from the original sparse drug-disease association adjacency matrix. Specifically, DRPADC processes the original association matrix with the WKNKN algorithm to reduce its sparsity. Furthermore, multiple types of similarity information are fused by a CKA-MKL algorithm. Finally, a compressed sensing algorithm is used to predict the potential drug-disease (virus) association scores. Experimental results show that DRPADC outperforms several competitive methods in terms of AUC values and case studies. DRPADC achieved AUC values of 0.941, 0.955 and 0.876 on Fdataset, Cdataset and the HDVD dataset, respectively. In addition, the case studies conducted for COVID-19 show that DRPADC can predict drug candidates accurately.
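The sparsity-reduction step named in this abstract, WKNKN (weighted K nearest known neighbors), can be sketched roughly as below. This follows the form in which the preprocessing is commonly described (decayed, similarity-weighted averages of neighbor profiles, merged with the original matrix by an element-wise maximum); whether DRPADC uses exactly this variant is not stated here, and the function name, `K`, the decay `eta`, and the toy matrices are assumptions.

```python
import numpy as np

def wknkn(Y, S_row, S_col, K=2, eta=0.5):
    """Fill a sparse association matrix from the profiles of similar known rows/columns."""
    def neighbor_profile(Y, S):
        out = np.zeros_like(Y, dtype=float)
        known = np.flatnonzero(Y.sum(axis=1) > 0)  # rows with at least one known association
        for i in range(Y.shape[0]):
            cand = known[known != i]
            nearest = cand[np.argsort(-S[i, cand])][:K]  # K most similar known rows
            sims = S[i, nearest]
            z = sims.sum()
            if z > 0:
                w = eta ** np.arange(len(nearest)) * sims  # decay weight by neighbor rank
                out[i] = (w[:, None] * Y[nearest]).sum(axis=0) / z
        return out

    Y_row = neighbor_profile(Y, S_row)      # estimate from similar drugs
    Y_col = neighbor_profile(Y.T, S_col).T  # estimate from similar diseases
    return np.maximum(Y, (Y_row + Y_col) / 2)

# Toy data (assumed): drug 2 has no known associations.
Y = np.array([[1, 0, 1],
              [0, 1, 0],
              [0, 0, 0]], dtype=float)
S_drug = np.array([[1.0, 0.2, 0.8],
                   [0.2, 1.0, 0.3],
                   [0.8, 0.3, 1.0]])
S_dis = np.array([[1.0, 0.5, 0.1],
                  [0.5, 1.0, 0.4],
                  [0.1, 0.4, 1.0]])

filled = wknkn(Y, S_drug, S_dis)
```

After this step the all-zero row of drug 2 carries non-zero likelihood scores borrowed mainly from its most similar drug, while known associations are left unchanged.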
Affiliation(s)
- Guobo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Haojie Xu
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Jianming Li
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Guosheng Gu
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China (corresponding author)
- Yuping Sun
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Zhiyi Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Yinting Zhu
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Weiming Wang
- School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Youfu Wang
- Huaneng Qinghai Power Generation Co., Ltd. New Energy Branch, Xining 810000, China
- Jiang Shao
- School of Architecture & Design, China University of Mining and Technology, Xuzhou 221116, China
|
33
|
Network-Based Approaches for Disease-Gene Association Prediction Using Protein-Protein Interaction Networks. Int J Mol Sci 2022; 23:ijms23137411. [PMID: 35806415 PMCID: PMC9266751 DOI: 10.3390/ijms23137411] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/09/2022] [Revised: 06/25/2022] [Accepted: 06/30/2022] [Indexed: 01/02/2023] Open
Abstract
Genome-wide association studies (GWAS) can be used to infer genome intervals that are involved in genetic diseases. However, investigating a large number of putative mutations for GWAS is resource- and time-intensive. Network-based computational approaches are being used for efficient disease-gene association prediction. Network-based methods are based on the underlying assumption that the genes causing the same diseases are located close to each other in a molecular network, such as a protein-protein interaction (PPI) network. In this survey, we provide an overview of network-based disease-gene association prediction methods based on three categories: graph-theoretic algorithms, machine learning algorithms, and an integration of these two. We experimented with six selected methods to compare their prediction performance using a heterogeneous network constructed by combining a genome-wide weighted PPI network, an ontology-based disease network, and disease-gene associations. The experiment was conducted in two different settings according to the presence and absence of known disease-associated genes. The results revealed that HerGePred, an integrative method, outperformed in the presence of known disease-associated genes, whereas PRINCE, which adopted a network propagation algorithm, was the most competitive in the absence of known disease-associated genes. Overall, the results demonstrated that the integrative methods performed better than the methods using graph-theory only, and the methods using a heterogeneous network performed better than those using a homogeneous PPI network only.
|
34
|
Network-Based Methods for Approaching Human Pathologies from a Phenotypic Point of View. Genes (Basel) 2022; 13:genes13061081. [PMID: 35741843 PMCID: PMC9222217 DOI: 10.3390/genes13061081] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/13/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 01/27/2023] Open
Abstract
Network and systemic approaches to studying human pathologies are helping us to gain insight into the molecular mechanisms of and potential therapeutic interventions for human diseases, especially for complex diseases where large numbers of genes are involved. The complex human pathological landscape is traditionally partitioned into discrete “diseases”; however, that partition is sometimes problematic, as diseases are highly heterogeneous and can differ greatly from one patient to another. Moreover, for many pathological states, the set of symptoms (phenotypes) manifested by the patient is not enough to diagnose a particular disease. On the contrary, phenotypes, by definition, are directly observable and can be closer to the molecular basis of the pathology. These clinical phenotypes are also important for personalised medicine, as they can help stratify patients and design personalised interventions. For these reasons, network and systemic approaches to pathologies are gradually incorporating phenotypic information. This review covers the current landscape of phenotype-centred network approaches to study different aspects of human diseases.
|
35
|
Rintala TJ, Ghosh A, Fortino V. Network approaches for modeling the effect of drugs and diseases. Brief Bioinform 2022; 23:6608969. [PMID: 35704883 PMCID: PMC9294412 DOI: 10.1093/bib/bbac229] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/22/2022] [Revised: 04/29/2022] [Accepted: 05/17/2021] [Indexed: 12/12/2022] Open
Abstract
The network approach is quickly becoming a fundamental building block of computational methods aiming at elucidating the mechanism of action (MoA) and therapeutic effect of drugs. By modeling the effect of drugs and diseases on different biological networks, it is possible to better explain the interplay between disease perturbations and drug targets as well as how drug compounds induce favorable biological responses and/or adverse effects. Omics technologies have been extensively used to generate the data needed to study the mechanisms of action of drugs and diseases. These data are often exploited to define condition-specific networks and to study whether drugs can reverse disease perturbations. In this review, we describe network data mining algorithms that are commonly used to study drug’s MoA and to improve our understanding of the basis of chronic diseases. These methods can support fundamental stages of the drug development process, including the identification of putative drug targets, the in silico screening of drug compounds and drug combinations for the treatment of diseases. We also discuss recent studies using biological and omics-driven networks to search for possible repurposed FDA-approved drug treatments for SARS-CoV-2 infections (COVID-19).
Affiliation(s)
- T J Rintala
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
- Arindam Ghosh
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
- V Fortino
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
|
36
|
Xie X, Chen X. Deciphering the Core Metabolites of Fanconi Anemia by Using a Multi-Omics Composite Network. J Microbiol Biotechnol 2022; 32:387-395. [PMID: 34954697 PMCID: PMC9628788 DOI: 10.4014/jmb.2106.06027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2021] [Revised: 12/16/2021] [Accepted: 12/20/2021] [Indexed: 12/15/2022]
Abstract
Deciphering the metabolites of human diseases is an important objective of biomedical research. Here, we aimed to capture the core metabolites of Fanconi anemia (FA) using the bioinformatics method of a multi-omics composite network. Based on the assumption that metabolite levels can directly mirror the physiological state of the human body, we used a multi-omics composite network that integrates six types of interactions in humans (gene-gene, disease phenotype-phenotype, disease-related metabolite-metabolite, gene-phenotype, gene-metabolite, and metabolite-phenotype) to identify the core metabolites of FA. This method is applicable to predicting and prioritizing disease candidate metabolites and is effective in a network without known disease metabolites. In this report, we first identified the FA-related genes that were differentially expressed between groups and then constructed the multi-omics composite network of FA by integrating the aforementioned six networks. Finally, we used random walk with restart (RWR) to prioritize the candidate metabolites of FA and also obtained the co-expression gene network of FA. As a result, the top 5 metabolites of FA were tenormin (TN), guanosine 5'-triphosphate, guanosine 5'-diphosphate, triphosadenine (DCF) and adenosine 5'-diphosphate, all of which were reported to have a direct or indirect relationship with FA. Furthermore, the top 5 co-expressed genes were CASP3, BCL2, HSPD1, RAF1 and MMP9. By prioritizing the metabolites, the multi-omics composite network may provide us with additional indicators closely linked to FA.
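As an illustration of the random walk with restart (RWR) step mentioned in this abstract, the following is a minimal sketch of the generic algorithm on a toy graph. It is not the authors' implementation: the function name, the restart probability of 0.7, the convergence tolerance and the four-node path graph are all assumptions made for the example.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker that restarts at the seed nodes.

    `restart` (hypothetical default 0.7) is the probability of jumping back to a seed.
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = A / col_sums                       # column-normalized transition matrix
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform restart distribution over the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol: # L1 change below tolerance: converged
            return p_next
        p = p_next
    return p

# Toy example (assumed data): a path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
toy_adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(toy_adjacency, seeds=[0])
ranking = np.argsort(-scores)              # candidate nodes ordered by proximity to the seed
```

In a prioritization setting, non-seed nodes would be ranked by their score; here the score decreases with distance from the seed node.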
Affiliation(s)
- Xiaobin Xie
- Department of Pathology, School of Basic Medical Science, Guangzhou Medical University, Guangzhou, Guangdong 511436, P.R. China
- Xiaowei Chen
- Department of Hematology, Guangzhou First People's Hospital, South China University of Technology, Guangzhou 510080, P.R. China. Corresponding author. Phone: +86-020-81048386. E-mail:
|
37
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in the prediction of potential correlations between microbes, drugs and diseases, from biological data to computational models. Firstly, we introduce in detail the widely used datasets relevant to the identification of potential relationships between microbes, drugs and diseases. We then divide a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
|
38
|
Yan Y, Yang M, Zhao H, Duan G, Peng X, Wang J. Drug repositioning based on multi-view learning with matrix completion. Brief Bioinform 2022; 23:6548374. [PMID: 35289352 DOI: 10.1093/bib/bbac054] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/08/2021] [Revised: 01/14/2022] [Accepted: 01/31/2022] [Indexed: 12/21/2022] Open
Abstract
Determining drug indications is a critical part of the drug development process. However, traditional drug discovery is expensive and time-consuming. Drug repositioning aims to find potential indications for existing drugs, which is considered as an important alternative to the traditional drug discovery. In this article, we propose a multi-view learning with matrix completion (MLMC) method to predict the potential associations between drugs and diseases. Specifically, MLMC first learns the comprehensive similarity matrices from five drug similarity matrices and two disease similarity matrices based on the multi-view learning (ML) with Laplacian graph regularization, and updates the drug-disease association matrix simultaneously. Then, we introduce matrix completion (MC) to add some positive entries in original association matrix based on low-rank structure, and re-execute the multi-view learning algorithm for association prediction. At last, the prediction results of the above two operations are integrated as the final output. Evaluated by 10-fold cross-validation and de novo tests, MLMC achieves higher prediction accuracy than the current state-of-the-art methods. Moreover, case studies confirm the ability of our method in novel drug-disease association discovery. The codes of MLMC are available at https://github.com/BioinformaticsCSU/MLMC. Contact: jxwang@mail.csu.edu.cn.
Affiliation(s)
- Yixin Yan
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Mengyun Yang
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, Hunan 422000, China
- Haochen Zhao
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Guihua Duan
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Xiaoqing Peng
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410038, China
- Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
|
39
|
Zhang Y, Chen L, Li S. CIPHER-SC: Disease-Gene Association Inference Using Graph Convolution on a Context-Aware Network With Single-Cell Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:819-829. [PMID: 32809944 DOI: 10.1109/tcbb.2020.3017547] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/11/2023]
Abstract
Inference of disease-gene associations helps unravel the pathogenesis of diseases and contributes to their treatment. Although many machine learning-based methods have been developed to predict causative genes, accurate association inference remains challenging. One major reason is the inaccurate feature selection and accumulation of error introduced by the commonly used multi-stage training architecture. In addition, the existing methods do not incorporate cell-type-specific information and thus fail to study gene functions at a higher resolution. Therefore, we introduce single-cell transcriptome data and construct a context-aware network to unbiasedly integrate all data sources. Then we develop a graph convolution-based approach named CIPHER-SC to realize a complete end-to-end learning architecture. Our approach outperforms four state-of-the-art approaches in five-fold cross-validations on three distinct test sets with the best AUC of 0.9501, demonstrating its stable ability either to predict novel genes or to predict with a genetic basis. The ablation study shows that our complete end-to-end design and unbiased data integration boost the performance from 0.8727 to 0.9443 in AUC. The addition of single-cell data further improves the prediction accuracy and enriches our results for cell-type-specific genes. These results confirm the ability of CIPHER-SC to discover reliable disease genes. Our implementation is available at http://github.com/YidingZhang117/CIPHER-SC.
|
40
|
Wang W, Zhang X, Dai DQ. springD2A: capturing uncertainty in disease-drug association prediction with model integration. Bioinformatics 2022; 38:1353-1360. [PMID: 34864881 DOI: 10.1093/bioinformatics/btab820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Revised: 11/23/2021] [Accepted: 11/30/2021] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION Drug repositioning that aims to find new indications for existing drugs has been an efficient strategy for drug discovery. In the scenario where we only have confirmed disease-drug associations as positive pairs, a negative set of disease-drug pairs is usually constructed from the unknown disease-drug pairs in previous studies, where we do not know whether drugs and diseases can be associated, to train a model for disease-drug association prediction (drug repositioning). Drugs and diseases in these negative pairs can potentially be associated, but most studies have ignored them. RESULTS We present a method, springD2A, to capture the uncertainty in the negative pairs, and to discriminate between positive and unknown pairs because the former are more reliable. In springD2A, we introduce a spring-like penalty for the loss of negative pairs, which is strong if they are too close in a unit sphere, but mild if they are at a moderate distance. We also design a sequential sampling in which the probability of an unknown disease-drug pair sampled as negative is proportional to its score predicted as positive. Multiple models are learned during sequential sampling, and we adopt parameter- and feature-based ensemble schemes to boost performance. Experiments show springD2A is an effective tool for drug-repositioning. AVAILABILITY AND IMPLEMENTATION A python implementation of springD2A and datasets used in this study are available at https://github.com/wangyuanhao/springD2A. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Weiwen Wang
- Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou 510000, China
- Xiwen Zhang
- Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou 510000, China
- Dao-Qing Dai
- Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou 510000, China
|
41
|
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022] Open
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, a lot of computational methods and tools/platforms have been developed to reveal intrinsic relationship between diseases, which can provide useful insights to the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatment of diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of the usages of biomedical data, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview for current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, China
- Jiashuai Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yichao Zhao
- School of Computer Science and Engineering, Central South University, China
- Fang-Xiang Wu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Min Li
- Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
|
42
|
Gao CQ, Zhou YK, Xin XH, Min H, Du PF. DDA-SKF: Predicting Drug-Disease Associations Using Similarity Kernel Fusion. Front Pharmacol 2022; 12:784171. [PMID: 35095495 PMCID: PMC8792612 DOI: 10.3389/fphar.2021.784171] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/27/2021] [Accepted: 12/20/2021] [Indexed: 12/13/2022] Open
Abstract
Drug repositioning provides a promising and efficient strategy to discover potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, which are based on various similarities of drugs and diseases. In this work, we proposed a new computational model, DDA-SKF (drug-disease associations prediction using similarity kernels fusion), which can predict novel drug indications by utilizing similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrated multiple similarities of drugs and diseases. The prediction performances of DDA-SKF are better, or at least comparable, to all state-of-the-art methods. The DDA-SKF can work without sufficient similarity information between drug indications. This allows us to predict new purpose for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).
Affiliation(s)
- Pu-Feng Du
- College of Intelligence and Computing, Tianjin University, Tianjin, China
|
43
|
Maudsley S, Leysen H, van Gastel J, Martin B. Systems Pharmacology: Enabling Multidimensional Therapeutics. Comprehensive Pharmacology 2022:725-769. [DOI: 10.1016/b978-0-12-820472-6.00017-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/03/2025]
|
44
|
Leysen H, Walter D, Christiaenssen B, Vandoren R, Harputluoğlu İ, Van Loon N, Maudsley S. GPCRs Are Optimal Regulators of Complex Biological Systems and Orchestrate the Interface between Health and Disease. Int J Mol Sci 2021; 22:ijms222413387. [PMID: 34948182 PMCID: PMC8708147 DOI: 10.3390/ijms222413387] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/24/2021] [Revised: 12/08/2021] [Accepted: 12/09/2021] [Indexed: 02/06/2023] Open
Abstract
GPCRs arguably represent the most effective current therapeutic targets for a plethora of diseases. GPCRs also possess a pivotal role in the regulation of the physiological balance between healthy and pathological conditions; thus, their importance in systems biology cannot be underestimated. The molecular diversity of GPCR signaling systems is likely to be closely associated with disease-associated changes in organismal tissue complexity and compartmentalization, thus enabling a nuanced GPCR-based capacity to interdict multiple disease pathomechanisms at a systemic level. GPCRs have been long considered as controllers of communication between tissues and cells. This communication involves the ligand-mediated control of cell surface receptors that then direct their stimuli to impact cell physiology. Given the tremendous success of GPCRs as therapeutic targets, considerable focus has been placed on the ability of these therapeutics to modulate diseases by acting at cell surface receptors. In the past decade, however, attention has focused upon how stable multiprotein GPCR superstructures, termed receptorsomes, both at the cell surface membrane and in the intracellular domain dictate and condition long-term GPCR activities associated with the regulation of protein expression patterns, cellular stress responses and DNA integrity management. The ability of these receptorsomes (often in the absence of typical cell surface ligands) to control complex cellular activities implicates them as key controllers of the functional balance between health and disease. A greater understanding of this function of GPCRs is likely to significantly augment our ability to further employ these proteins in a multitude of diseases.
Affiliation(s)
- Hanne Leysen
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- Deborah Walter
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- Bregje Christiaenssen
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- Romi Vandoren
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- İrem Harputluoğlu
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- Department of Chemistry, Middle East Technical University, Çankaya, Ankara 06800, Turkey
- Nore Van Loon
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- Stuart Maudsley
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium
- Correspondence:
|
45
|
Xie G, Li J, Gu G, Sun Y, Lin Z, Zhu Y, Wang W. BGMSDDA: a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction. Mol Omics 2021; 17:997-1011. [PMID: 34610633 DOI: 10.1039/d1mo00237f] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Drug repositioning, a method that relies on the information from the original drug-disease association matrix, aims to identify new indications for existing drugs and is expected to greatly reduce the cost and time of drug development. However, most current drug repositioning methods make use of the original drug-disease association matrix directly without preconditioning. As only relatively few associations between drugs and diseases have been determined from actual observations, the original drug-disease association matrix used in the prediction is sparse, which affects the performance of the prediction method. A method for mining similar features of drugs and diseases is still lacking. To solve these problems, we developed a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction (BGMSDDA). First, the weighted K nearest known neighbors (WKNKN) algorithm was used to reconstruct the drug-disease association matrix. Secondly, an effective method was designed to extract similar characteristics of drugs and diseases based on integrating linear neighborhood similarity and Gaussian kernel similarity. Finally, bipartite graph diffusion was used to infer undiscovered drug-disease associations. In 10-fold cross-validation experiments, BGMSDDA showed excellent performance on two datasets, specifically with AUC values of 0.939 (Fdataset) and 0.954 (Cdataset), and AUPR values of 0.466 (Fdataset) and 0.565 (Cdataset). Furthermore, to evaluate the accuracy of the results of BGMSDDA, we conducted case studies on three medically used drugs selected from Fdataset and Cdataset and validated the predicted associated diseases of each drug against several databases. Based on the results obtained, BGMSDDA was demonstrated to be useful for predicting drug-disease associations.
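The abstract does not spell out how its Gaussian kernel similarity is computed. The sketch below assumes the Gaussian interaction profile form that is common in this literature, where each drug is represented by its row of the binary association matrix and the bandwidth is normalized by the mean squared row norm. The function name, the `bandwidth` parameter and the toy matrix are hypothetical.

```python
import numpy as np

def gaussian_profile_similarity(assoc, bandwidth=1.0):
    """Gaussian kernel similarity between the rows (association profiles) of `assoc`.

    Assumed form: K[i, j] = exp(-gamma * ||y_i - y_j||^2),
    with gamma = bandwidth / mean_i ||y_i||^2.
    """
    Y = np.asarray(assoc, dtype=float)
    sq_norms = (Y ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    gamma = bandwidth / mean_sq if mean_sq > 0 else bandwidth
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    sq_dists = np.maximum(sq_dists, 0.0)   # clip tiny negative values from round-off
    return np.exp(-gamma * sq_dists)

# Toy drug-disease association matrix (assumed data): 3 drugs x 3 diseases.
toy_assoc = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])
drug_sim = gaussian_profile_similarity(toy_assoc)
disease_sim = gaussian_profile_similarity(toy_assoc.T)  # same kernel on the columns
```

Drugs 0 and 1 share an identical profile and therefore receive similarity 1, while drug 2, which shares no disease with them, receives a lower value.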
Affiliation(s)
- Guobo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
- Jianming Li
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
- Guosheng Gu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
- Yuping Sun
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
- Zhiyi Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
- Yinting Zhu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
- Weiming Wang
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
|
46
|
Wang SH, Wang CC, Huang L, Miao LY, Chen X. Dual-Network Collaborative Matrix Factorization for predicting small molecule-miRNA associations. Brief Bioinform 2021; 23:6447431. [PMID: 34864865 DOI: 10.1093/bib/bbab500] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/31/2021] [Revised: 10/25/2021] [Accepted: 11/02/2021] [Indexed: 01/01/2023] Open
Abstract
MicroRNAs (miRNAs) play crucial roles in multiple biological processes and human diseases and can be considered as therapeutic targets of small molecules (SMs). Because biological experiments used to verify SM-miRNA associations are time-consuming and expensive, it is urgent to propose new computational models to predict new SM-miRNA associations. Here, we proposed a novel method called Dual-network Collaborative Matrix Factorization (DCMF) for predicting the potential SM-miRNA associations. Firstly, we utilized the Weighted K Nearest Known Neighbors (WKNKN) method to preprocess SM-miRNA association matrix. Then, we constructed matrix factorization model to obtain two feature matrices containing latent features of SM and miRNA, respectively. Finally, the predicted SM-miRNA association score matrix was obtained by calculating the inner product of two feature matrices. The main innovations of this method were that the use of WKNKN method can preprocess the missing values of association matrix and the introduction of dual network can integrate more diverse similarity information into DCMF. For evaluating the validity of DCMF, we implemented four different cross validations (CVs) based on two distinct datasets and two different case studies. Finally, based on dataset 1 (dataset 2), DCMF achieved Area Under receiver operating characteristic Curves (AUC) of 0.9868 (0.8770), 0.9833 (0.8836), 0.8377 (0.7591) and 0.9836 ± 0.0030 (0.8632 ± 0.0042) in global Leave-One-Out Cross Validation (LOOCV), miRNA-fixed local LOOCV, SM-fixed local LOOCV and 5-fold CV, respectively. For case studies, plenty of predicted associations have been confirmed by published experimental literature. Therefore, DCMF is an effective tool to predict potential SM-miRNA associations.
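The final scoring step described above, taking the inner product of two latent feature matrices, can be sketched with a plain L2-regularized matrix factorization fitted by gradient descent. This is a simplified stand-in and not the DCMF objective, which additionally integrates similarity networks. The function name, all hyperparameters and the toy association matrix are assumptions.

```python
import numpy as np

def factorize(Y, rank=2, lam=0.1, lr=0.05, epochs=2000, seed=0):
    """Fit Y ~ A @ B.T by gradient descent on a squared-error loss with L2 penalty."""
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    A = rng.normal(scale=0.1, size=(Y.shape[0], rank))  # latent features of the rows (e.g. SMs)
    B = rng.normal(scale=0.1, size=(Y.shape[1], rank))  # latent features of the columns (e.g. miRNAs)
    for _ in range(epochs):
        R = A @ B.T - Y                  # residual of the current reconstruction
        grad_A = R @ B + lam * A
        grad_B = R.T @ A + lam * B
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B

# Toy binary association matrix (assumed data): 4 small molecules x 3 miRNAs.
Y = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])
A, B = factorize(Y)
scores = A @ B.T   # predicted association score matrix (inner products of latent features)
```

Unknown pairs would then be ranked by their entry in `scores`; the regularization weight `lam` trades reconstruction accuracy against the size of the latent features.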
Affiliation(s)
- Shu-Hao Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Lian-Ying Miao
- School of Mathematics, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
47
|
Graph convolutional network approach to discovering disease-related circRNA-miRNA-mRNA axes. Methods 2021; 198:45-55. [PMID: 34758394 DOI: 10.1016/j.ymeth.2021.10.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/30/2021] [Revised: 10/07/2021] [Accepted: 10/19/2021] [Indexed: 02/05/2023] Open
Abstract
Non-coding RNAs are gaining prominence in biology and medicine, as they play major roles in cellular homeostasis among which the circRNA-miRNA-mRNA axes are involved in a series of disease-related pathways, such as apoptosis, cell invasion and metastasis. Recently, many computational methods have been developed for the prediction of the relationship between ncRNAs and diseases, which can alleviate the time-consuming and labor-intensive exploration involved with biological experiments. However, these methods handle ncRNAs separately, ignoring the impact of the interactions among ncRNAs on the diseases. In this paper we present a novel approach to discovering disease-related circRNA-miRNA-mRNA axes from the disease-RNA information network. Our method, using graph convolutional network, learns the characteristic representation of each biological entity by propagating and aggregating local neighbor information based on the global structure of the network. The approach is evaluated using the real-world datasets and the results show that it outperforms other state-of-the-art baselines on most of the metrics.
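To make the phrase "propagating and aggregating local neighbor information" concrete, here is a minimal sketch of a single graph convolution layer. It assumes the widely used symmetric normalization with self-loops and a ReLU activation; the paper's actual architecture is not given in the abstract. The function name, the random weights and the three-node toy graph are hypothetical.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 @ H @ W)."""
    A = np.asarray(adjacency, dtype=float)
    A_hat = A + np.eye(A.shape[0])             # add self-loops so each node keeps its own signal
    deg = A_hat.sum(axis=1)                    # degrees are >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ features @ weights, 0.0)

# Toy graph (assumed data): a circRNA - miRNA - mRNA chain with one-hot node features.
toy_adjacency = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])
node_features = np.eye(3)
rng = np.random.default_rng(0)
layer_weights = rng.normal(size=(3, 2))        # untrained, randomly initialized weights
hidden = gcn_layer(toy_adjacency, node_features, layer_weights)
```

Stacking several such layers lets each node's representation depend on progressively larger neighborhoods; in practice the weights would be learned rather than drawn at random.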
|
48
|
Hu P, Huang YA, Mei J, Leung H, Chen ZH, Kuang ZM, You ZH, Hu L. Learning from low-rank multimodal representations for predicting disease-drug associations. BMC Med Inform Decis Mak 2021; 21:308. [PMID: 34736437 PMCID: PMC8567544 DOI: 10.1186/s12911-021-01648-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/20/2021] [Accepted: 10/06/2021] [Indexed: 12/15/2022] Open
Abstract
Background Disease-drug associations provide essential information for drug discovery and disease treatment. Many disease-drug associations remain unobserved or unknown, and trials to confirm these associations are time-consuming and expensive. To better understand and explore these valuable associations, it would be useful to develop computational methods for predicting unobserved disease-drug associations. With the advent of various datasets describing diseases and drugs, it has become more feasible to build a model describing the potential correlation between disease and drugs.
Results In this work, we propose a new prediction method, called LMFDA, which works in several stages. First, it studies the drug chemical structure, disease MeSH descriptors, disease-related phenotypic terms, and drug-drug interactions. On this basis, similarity networks from different sources are constructed to enrich the representation of drugs and diseases. Based on the fused disease similarity network and drug similarity network, LMFDA calculates the association score of each pair of diseases and drugs in the database. This method achieves good performance on Fdataset and Cdataset, with AUROCs of 91.6% and 92.1%, respectively, higher than many of the existing computational models. Conclusions The novelty of LMFDA lies in the introduction of multimodal fusion using low-rank tensors to fuse multiple similarity networks, combined with matrix completion to predict potential associations. We have demonstrated that LMFDA displays excellent network integration ability for accurate disease-drug association inference and achieves substantial improvement over advanced approaches. Overall, experimental results on two real-world network datasets demonstrate that LMFDA is able to deliver excellent detection performance. Results also suggest that perfecting similarity networks with as much domain knowledge as possible is a promising direction for drug repositioning.
Affiliation(s)
- Pengwei Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Yu-An Huang
- The Hong Kong Polytechnic University, Hong Kong SAR, China
- Henry Leung
- Electrical and Computer Engineering, University of Calgary, Calgary, Canada
- Zhan-Heng Chen
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Ze-Min Kuang
- Beijing Anzhen Hospital of Capital Medical University, Beijing, China
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China.
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China.
|
49
|
Drug–disease associations prediction via Multiple Kernel-based Dual Graph Regularized Least Squares. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107811] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/26/2023]
|
50
|
Petti M, Farina L, Francone F, Lucidi S, Macali A, Palagi L, De Santis M. MOSES: A New Approach to Integrate Interactome Topology and Functional Features for Disease Gene Prediction. Genes (Basel) 2021; 12:1713. [PMID: 34828319 PMCID: PMC8624742 DOI: 10.3390/genes12111713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 10/16/2021] [Accepted: 10/25/2021] [Indexed: 11/17/2022]
Abstract
Disease gene prediction is to date one of the main computational challenges of precision medicine. It is still uncertain whether disease genes have unique functional properties that distinguish them from non-disease genes or, from a network perspective, whether they are located randomly in the interactome or show specific patterns in the network topology. In this study, we propose a new method for disease gene prediction based on the use of biological knowledge bases (gene-disease associations, gene functional annotations, etc.) and interactome network topology. The proposed algorithm, called MOSES, is based on the definition of two somewhat opposing sets of genes, both disease-specific but from different perspectives: warm seeds (i.e., disease genes obtained from databases) and cold seeds (genes far from the disease genes on the interactome and not involved in their biological functions). The application of MOSES to a set of 40 diseases showed that the suggested putative disease genes are significantly enriched in their reference disease. Reassuringly, known and predicted disease genes together tend to form a connected network module on the human interactome, mitigating the scattered distribution of disease genes, which is probably due to both the paucity of disease-gene associations and the incompleteness of the interactome.
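As a rough illustration of the warm-seed and cold-seed definitions in the abstract, the following Python sketch selects cold seeds on a toy interactome. It is not the MOSES algorithm itself: the function names, the `min_hops` threshold, the use of hop distance as the notion of "far", the treatment of unreachable genes as far, and the annotation-overlap test are all assumptions made for this example.

```python
from collections import deque


def distances_from_seeds(adjacency, seeds):
    """Multi-source BFS: hop distance from each reachable gene to its nearest seed."""
    dist = {s: 0 for s in seeds if s in adjacency}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def pick_cold_seeds(adjacency, warm_seeds, annotations, min_hops=3):
    """Return genes that are far from all warm seeds and share no annotation with them.

    Assumptions (hypothetical, not taken from the paper):
      - "far" means at least `min_hops` edges from the nearest warm seed;
      - genes unreachable from every warm seed are treated as far;
      - "not involved in their biological functions" means no shared annotation term.
    """
    dist = distances_from_seeds(adjacency, warm_seeds)
    warm_terms = set()
    for gene in warm_seeds:
        warm_terms |= annotations.get(gene, set())

    cold = []
    for gene in adjacency:
        if gene in warm_seeds:
            continue
        is_far = dist.get(gene, float("inf")) >= min_hops
        is_unrelated = not (annotations.get(gene, set()) & warm_terms)
        if is_far and is_unrelated:
            cold.append(gene)
    return sorted(cold)


# Toy interactome: a path A - B - C - D - E with one warm seed (A).
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
toy_annotations = {"A": {"t1"}, "D": {"t2"}, "E": {"t1"}}
print(pick_cold_seeds(toy_network, {"A"}, toy_annotations))
```

In this toy case, gene D qualifies because it lies three hops from the warm seed and shares no annotation term with it, while gene E is excluded despite being farther away because it carries the same term as A.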
Affiliation(s)
- Manuela Petti
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, 00185 Rome, Italy; (L.F.); (F.F.); (S.L.); (A.M.); (L.P.); (M.D.S.)
|