1
Nam Y, Lucas A, Yun JS, Lee SM, Park JW, Chen Z, Lee B, Ning X, Shen L, Verma A, Kim D. Development of complemented comprehensive networks for rapid screening of repurposable drugs applicable to new emerging disease outbreaks. J Transl Med 2023; 21:415. [PMID: 37365631 DOI: 10.1186/s12967-023-04223-2] [Received: 02/20/2023] [Accepted: 05/24/2023] [Indexed: 06/28/2023]
Abstract
BACKGROUND Computational drug repurposing is crucial for identifying candidate therapeutic medications to address the urgent need for treatments for newly emerging infectious diseases. The recent COVID-19 pandemic has taught us the importance of rapidly discovering candidate drugs and providing them to medical and pharmaceutical experts for further investigation. Network-based approaches can identify repurposable drugs quickly by leveraging comprehensive relationships among biological components. However, for a newly emerging disease, applying repurposing methods with only pre-existing knowledge networks may prove inadequate, because the novelty of the disease leaves too little information flowing into the network. METHODS We propose a network-based complementary linkage method for drug repurposing that compensates for the lack of new disease-specific information in knowledge networks. We simulated our method under the controlled repurposing scenario that we faced in the early stage of the COVID-19 pandemic. First, a disease-gene-drug multi-layered network was constructed as the backbone network by fusing comprehensive knowledge databases. Then, complementary information for COVID-19, containing data on 18 comorbid diseases and 17 relevant proteins, was collected from publications or preprint servers as of May 2020. We estimated connections between the novel COVID-19 node and the backbone network to construct a complemented network. Network-based drug scoring for COVID-19 was performed by applying graph-based semi-supervised learning, and the resulting scores were used to prioritize drugs for validation in population-scale electronic health records-based medication analyses. RESULTS The backbone network consisted of 591 disease, 26,681 protein, and 2,173 drug nodes based on pre-pandemic knowledge. After incorporating the 35 entities comprising the complementary information into the backbone network, drug scoring identified the top 30 potential repurposable drugs for COVID-19. The prioritized drugs were subsequently analyzed in electronic health records obtained from patients in the Penn Medicine COVID-19 Registry as of October 2021, and 8 of these were found to be statistically associated with a COVID-19 phenotype. CONCLUSION We found that 8 of the 30 drugs identified by graph-based scoring on complemented networks as potential candidates for COVID-19 repurposing were additionally supported by real-world patient data in follow-up analyses. These results show that our network-based complementary linkage method and drug scoring algorithm are promising strategies for identifying candidate repurposable drugs during outbreaks of newly emerging diseases.
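The scoring step in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a standard iterative label-propagation scheme on a symmetrically normalized graph; the node names, edge weights, `alpha`, and the function `propagate_scores` are all hypothetical and chosen only for illustration.

```python
from math import sqrt

def propagate_scores(edges, seeds, alpha=0.8, iterations=200):
    """Iterative label propagation on an undirected weighted graph.

    edges: dict mapping (u, v) -> weight; seeds: dict mapping node -> label.
    Assumed update rule: f <- alpha * S f + (1 - alpha) * y,
    with S = D^(-1/2) W D^(-1/2).
    """
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    y = {n: seeds.get(n, 0.0) for n in adj}
    f = dict(y)
    for _ in range(iterations):
        f = {
            n: alpha * sum(w / sqrt(degree[n] * degree[m]) * f[m]
                           for m, w in adj[n].items())
            + (1 - alpha) * y[n]
            for n in adj
        }
    return f

# Hypothetical backbone network built from pre-existing knowledge.
backbone = {
    ("diseaseA", "drugX"): 1.0,
    ("proteinP", "drugX"): 1.0,
    ("proteinP", "drugY"): 1.0,
    ("diseaseA", "diseaseB"): 1.0,
    ("diseaseB", "drugZ"): 1.0,
}
# Hypothetical complementary links attaching the new disease node.
complement = {
    ("COVID-19", "diseaseA"): 1.0,
    ("COVID-19", "proteinP"): 1.0,
}

scores = propagate_scores({**backbone, **complement}, seeds={"COVID-19": 1.0})
drugs = ["drugX", "drugY", "drugZ"]
ranked = sorted(drugs, key=lambda d: scores[d], reverse=True)
print(ranked)
```

In this toy graph the drug reachable from the new node through both a comorbid disease and a relevant protein ranks first, which is the intuition behind adding complementary links before scoring.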
Affiliation(s)
- Yonghyun Nam
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
- Anastasia Lucas
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Jae-Seung Yun
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
  - Division of Endocrinology and Metabolism, Department of Internal Medicine, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, South Korea
- Seung Mi Lee
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
  - Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, South Korea
- Ji Won Park
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
  - Department of Surgery, Seoul National University College of Medicine, Seoul, South Korea
- Ziqi Chen
  - Computer Science and Engineering Department, College of Engineering, The Ohio State University, Columbus, USA
- Brian Lee
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
- Xia Ning
  - Computer Science and Engineering Department, College of Engineering, The Ohio State University, Columbus, USA
  - Biomedical Informatics Department, College of Medicine, The Ohio State University, Columbus, USA
  - Translational Data Analytics Institute, The Ohio State University, Columbus, USA
- Li Shen
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, USA
- Anurag Verma
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology & Informatics, The Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, USA
2
Abstract
Background Drug repurposing is motivated by the low probability of success in drug discovery. Over the past decade, in silico approaches have received primary attention as a first step toward alleviating the high cost and long duration of drug development. Such studies benefit from the abundance, variety, and easy accessibility of pharmaceutical and biomedical data. Taking advantage of this research-friendly environment, we propose a network-based machine learning algorithm for drug repurposing. In particular, we present a framework for constructing a drug network and for strengthening it by employing multiple, heterogeneous types of data. Results The proposed method consists of three steps. First, we construct a drug network from drug-target protein information. Then, the drug network is reinforced by utilizing drug-drug interaction knowledge on bioactivity and/or medication from literature databases. Through this enhancement, the number of connected nodes and the number of edges between them increase and the network becomes more informative, which can lead to a higher probability of success for in silico drug repurposing. Finally, the enhanced network recommends candidate drugs for repurposing through drug scoring. The scoring process utilizes graph-based semi-supervised learning to determine the priority of recommendations. Conclusions The drug network is reinforced in terms of the coverage and connections of drugs: the drug coverage increases from 4738 to 5442, and the number of drug-drug associations increases from 808,752 to 982,361. Along with the network enhancement, drug recommendation becomes more reliable: the AUC rose from 0.79 to 0.89. As a typical case, 11 recommended drugs were shown for vascular dementia, including amantadine, conotoxin GV, tenocyclidine, and cycloleucine. Electronic supplementary material The online version of this article (10.1186/s12859-019-2858-6) contains supplementary material, which is available to authorized users.
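The first two steps described in this abstract (build a drug network from drug-target information, then reinforce it with literature-derived drug-drug associations) can be sketched as follows. The abstract does not state how edges are defined, so linking two drugs when they share at least one target is an assumption made here, and the drug names, targets, and helper functions are hypothetical.

```python
from itertools import combinations

def target_based_network(drug_targets):
    """Assumed rule: connect two drugs if they share at least one target protein."""
    edges = set()
    for a, b in combinations(sorted(drug_targets), 2):
        if drug_targets[a] & drug_targets[b]:
            edges.add((a, b))
    return edges

def reinforce(edges, literature_pairs):
    """Add literature-derived drug-drug associations as undirected edges."""
    return edges | {tuple(sorted(pair)) for pair in literature_pairs}

def connected_drugs(edges):
    """Drugs that appear in at least one edge (the network's coverage)."""
    return {drug for edge in edges for drug in edge}

# Hypothetical toy data.
drug_targets = {
    "drugA": {"P1", "P2"},
    "drugB": {"P2"},
    "drugC": {"P3"},
    "drugD": {"P4"},
}
literature_pairs = [("drugC", "drugA"), ("drugB", "drugA")]

base = target_based_network(drug_targets)
enhanced = reinforce(base, literature_pairs)
print(len(connected_drugs(base)), len(base))          # coverage and edges before
print(len(connected_drugs(enhanced)), len(enhanced))  # coverage and edges after
```

Comparing coverage and edge counts before and after reinforcement mirrors the kind of figures the abstract reports, although on a toy scale.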
Affiliation(s)
- Yonghyun Nam
  - Department of Industrial Engineering, Ajou University, 206, World cup-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16499, Republic of Korea
- Myungjun Kim
  - Department of Industrial Engineering, Ajou University, 206, World cup-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16499, Republic of Korea
- Hang-Seok Chang
  - Department of Surgery, Thyroid Cancer Center, Gangnam Severance Hospital, Institute of Refractory Thyroid Cancer, Yonsei University College of Medicine, 211 Eonjuro, Gangnam-gu, Seoul, 06273, Republic of Korea
- Hyunjung Shin
  - Department of Industrial Engineering, Ajou University, 206, World cup-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16499, Republic of Korea
3
Abstract
Background Although drug discovery can provide meaningful insights and significant advances in the pharmaceutical field, it is lengthy and costly, and its success rate is low. To circumvent this problem, there has been increased interest in ‘Drug Repositioning’, in which one searches for already approved drugs that have high potential for efficacy when applied to other diseases. To increase the success rate of drug repositioning, one considers stepwise screening and experiments based on biological reactions. Given the number of drugs and diseases, however, this one-by-one procedure may be time-consuming and expensive. Methods In this study, we propose a machine learning-based approach for efficiently selecting candidate diseases and drugs. We assume that if two diseases are similar, then a drug for one disease can also be effective against the other. We first construct two disease networks: one from disease-protein associations and the other from disease-drug information. If the two networks are dissimilar, in the sense that the edge distributions of a disease node differ, this indicates high potential for repositioning new candidate drugs for that disease. The Kullback-Leibler divergence is employed to measure the difference in connections between the two constructed disease networks. Lastly, we perform drug repositioning for the top 20% ranked diseases. Results The F-measure of the proposed method was 0.75, outperforming the 0.5 obtained by greedy searching over all diseases. To demonstrate the utility of the proposed method, it was applied to dementia and achieved 75% accuracy for repositioned drugs under the assumption that no drugs are known for dementia. Conclusion The novelty of this research is that it discovers drugs with high repositioning potential based on disease networks using a quantitative measure. The study is expected to produce profound insights into as-yet-undiscovered drug repositioning possibilities. Electronic supplementary material The online version of this article (doi:10.1186/s12911-017-0449-x) contains supplementary material, which is available to authorized users.
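The divergence-based ranking described in this abstract can be sketched as below. The abstract does not specify how edge distributions are normalized or smoothed, so the small `eps` smoothing, the toy edge weights, and the helper names are assumptions for illustration only.

```python
from math import log

def neighbor_weights(edges, node):
    """Collect the weights of all edges incident to `node`."""
    out = {}
    for (u, v), w in edges.items():
        if u == node:
            out[v] = w
        elif v == node:
            out[u] = w
    return out

def edge_distribution(weights, others, eps=1e-9):
    """Normalize a node's edge weights into a probability distribution.

    eps is an assumed smoothing term that keeps every probability positive.
    """
    raw = [weights.get(n, 0.0) + eps for n in others]
    total = sum(raw)
    return [r / total for r in raw]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))

diseases = ["D1", "D2", "D3", "D4"]
# Hypothetical disease networks: one protein-based, one drug-based.
protein_net = {("D1", "D2"): 0.5, ("D1", "D3"): 0.1,
               ("D2", "D4"): 0.5, ("D3", "D4"): 0.5}
drug_net = {("D1", "D2"): 0.5, ("D1", "D3"): 0.9,
            ("D2", "D4"): 0.5, ("D3", "D4"): 0.5}

divergence = {}
for d in diseases:
    others = [n for n in diseases if n != d]
    p = edge_distribution(neighbor_weights(protein_net, d), others)
    q = edge_distribution(neighbor_weights(drug_net, d), others)
    divergence[d] = kl_divergence(p, q)

# Diseases whose connections differ most between the two networks come first.
ranked = sorted(divergence, key=divergence.get, reverse=True)
print(ranked)
```

Only the D1-D3 edge differs between the two toy networks, so D1 and D3 receive the largest divergence while D2 and D4 score zero; under the abstract's scheme, the top-ranked diseases would be the ones passed on for repositioning.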
Affiliation(s)
- Sunghong Park
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea
- Dong-Gi Lee
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea
- Hyunjung Shin
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea
4
Abstract
Background A biological system is a multi-layered structure of omics comprising the genome, epigenome, transcriptome, metabolome, proteome, etc., and can be further extended to clinical/medical layers such as the diseasome, drugs, and symptoms. One advantage of omics is that an unknown component or its trait can be inferred from known omics components. The component can be inferred from components at the same omics level or at different levels. Methods To implement the inference process, an algorithm that can be applied to the multi-layered complex system is required. In this study, we develop such a semi-supervised learning algorithm. To verify the validity of the inference, the algorithm was applied to the problem of predicting disease co-occurrence with a two-layered network composed of a symptom layer and a disease layer. Results The symptom-disease layered network obtained an AUC of 0.74, a noticeable improvement over the 0.59 AUC of the single-layered disease network. If extended further to the whole layered structure of omics, the proposed method is expected to produce more promising results. Conclusion The novelty of this research is that it provides a new integrative algorithm that incorporates the vertical structure of omics data, in contrast to existing methods that integrate the data in a parallel fashion. The results can provide an enhanced guideline for disease co-occurrence prediction and thereby serve as a valuable tool for inference in multi-layered biological systems.
Affiliation(s)
- Myungjun Kim
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea
- Yonghyun Nam
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea
- Hyunjung Shin
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea