1
Mangione W, Falls Z, Samudrala R. Effective holistic characterization of small molecule effects using heterogeneous biological networks. Front Pharmacol 2023; 14:1113007. [PMID: 37180722 PMCID: PMC10169664 DOI: 10.3389/fphar.2023.1113007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 04/11/2023] [Indexed: 05/16/2023] Open
Abstract
The two most common reasons for attrition in therapeutic clinical trials are efficacy and safety. We integrated heterogeneous data to create a human interactome network to comprehensively describe drug behavior in biological systems, with the goal of accurate therapeutic candidate generation. The Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multiscale therapeutic discovery, repurposing, and design was enhanced by integrating drug side effects, protein pathways, protein-protein interactions, protein-disease associations, and the Gene Ontology, and complemented with its existing drug/compound, protein, and indication libraries. These integrated networks were reduced to a "multiscale interactomic signature" for each compound that describes its functional behavior as vectors of real values. These signatures are then used for relating compounds to each other with the hypothesis that similar signatures yield similar behavior. Our results indicated that there is significant biological information captured within our networks (particularly via side effects) which enhances the performance of our platform, as evaluated by performing all-against-all leave-one-out drug-indication association benchmarking as well as generating novel drug candidates for colon cancer and migraine disorders corroborated via literature search. Further, drug impacts on pathways derived from computed compound-protein interaction scores served as the features for a random forest machine learning model trained to predict drug-indication associations, with applications to mental disorders and cancer metastasis highlighted. This interactomic pipeline highlights the ability of Computational Analysis of Novel Drug Opportunities to accurately relate drugs in a multitarget and multiscale context, particularly for generating putative drug candidates using the information gleaned from indirect data such as side effect profiles and protein pathway information.
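The idea of relating compounds through real-valued signature vectors can be illustrated with a small sketch. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the compound names and signature values are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, signatures):
    """Rank every other compound by similarity to the query compound."""
    scores = [
        (name, cosine_similarity(signatures[query], vector))
        for name, vector in signatures.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-feature signatures (illustrative values only).
signatures = {
    "compound_a": [0.9, 0.1, 0.4],
    "compound_b": [0.8, 0.2, 0.5],
    "compound_c": [0.0, 0.9, 0.1],
}
ranked = rank_by_similarity("compound_a", signatures)
```

Under the stated hypothesis, the compounds at the top of `ranked` would be the ones expected to behave most like the query.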
Affiliation(s)
- Ram Samudrala
- Jacobs School of Medicine and Biomedical Sciences, Department of Biomedical Informatics, University at Buffalo, Buffalo, NY, United States
2
Bruggemann L, Falls Z, Mangione W, Schwartz SA, Battaglia S, Aalinkeel R, Mahajan SD, Samudrala R. Multiscale Analysis and Validation of Effective Drug Combinations Targeting Driver KRAS Mutations in Non-Small Cell Lung Cancer. Int J Mol Sci 2023; 24:ijms24020997. [PMID: 36674513 PMCID: PMC9867122 DOI: 10.3390/ijms24020997] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/15/2022] [Revised: 11/04/2022] [Accepted: 11/06/2022] [Indexed: 01/06/2023] Open
Abstract
Pharmacogenomics is a rapidly growing field with the goal of providing personalized care to every patient. Previously, we developed the Computational Analysis of Novel Drug Opportunities (CANDO) platform for multiscale therapeutic discovery to screen optimal compounds for any indication/disease by performing analytics on their interactions using large protein libraries. We implemented a comprehensive precision medicine drug discovery pipeline within the CANDO platform to determine which drugs are most likely to be effective against mutant phenotypes of non-small cell lung cancer (NSCLC) based on the supposition that drugs with similar interaction profiles (or signatures) will have similar behavior and therefore show synergistic effects. CANDO predicted that osimertinib, an EGFR inhibitor, is most likely to synergize with four KRAS inhibitors. Validation studies with cellular toxicity assays confirmed that osimertinib in combination with ARS-1620, a KRAS G12C inhibitor, and BAY-293, a pan-KRAS inhibitor, showed a synergistic effect on decreasing cellular proliferation by acting on mutant KRAS. Gene expression studies revealed that MAPK expression is strongly correlated with decreased cellular proliferation following treatment with KRAS inhibitor BAY-293, but not treatment with ARS-1620 or osimertinib. These results indicate that our precision medicine pipeline may be used to identify compounds capable of synergizing with inhibitors of KRAS G12C, and to assess their likelihood of becoming drugs by understanding their behavior at the proteomic/interactomic scales.
Affiliation(s)
- Liana Bruggemann
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14260, USA
- Zackary Falls
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14260, USA
- William Mangione
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14260, USA
- Supriya D. Mahajan
- Department of Medicine, University at Buffalo, Buffalo, NY 14260, USA
- Correspondence: (S.D.M.); (R.S.)
- Ram Samudrala
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14260, USA
- Correspondence: (S.D.M.); (R.S.)
3
Identifying Protein Features and Pathways Responsible for Toxicity Using Machine Learning and Tox21: Implications for Predictive Toxicology. Molecules 2022; 27:molecules27093021. [PMID: 35566372 PMCID: PMC9099959 DOI: 10.3390/molecules27093021] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/28/2022] [Revised: 04/28/2022] [Accepted: 04/30/2022] [Indexed: 02/01/2023] Open
Abstract
Humans are exposed to numerous compounds daily, some of which have adverse effects on health. Computational approaches for modeling toxicological data in conjunction with machine learning algorithms have gained popularity over the last few years. Machine learning approaches have been used to predict toxicity-related biological activities using chemical structure descriptors. However, toxicity-related proteomic features have not been fully investigated. In this study, we construct a computational pipeline using machine learning models for predicting the most important protein features responsible for the toxicity of compounds taken from the Tox21 dataset that is implemented within the multiscale Computational Analysis of Novel Drug Opportunities (CANDO) therapeutic discovery platform. Tox21 is a highly imbalanced dataset consisting of twelve in vitro assays, seven from the nuclear receptor (NR) signaling pathway and five from the stress response (SR) pathway, for more than 10,000 compounds. For the machine learning model, we employed a random forest with the combination of Synthetic Minority Oversampling Technique (SMOTE) and the Edited Nearest Neighbor (ENN) method (SMOTE+ENN), which is a resampling method to balance the activity class distribution. Within the NR and SR pathways, the activity of the aryl hydrocarbon receptor (NR-AhR) and the mitochondrial membrane potential (SR-MMP) were two of the top-performing twelve toxicity endpoints with AUCROCs of 0.90 and 0.92, respectively. The top extracted features for evaluating compound toxicity were analyzed for enrichment to highlight the implicated biological pathways and proteins. We validated our enrichment results for the activity of the AhR using a thorough literature search. Our case study showed that the selected enriched pathways and proteins from our computational pipeline are not only correlated with AhR toxicity but also form a cascading upstream/downstream arrangement. 
Our work elucidates significant relationships between protein and compound interactions computed using CANDO and the associated biological pathways to which the proteins belong for twelve toxicity endpoints. This novel study uses machine learning not only to predict and understand toxicity but also to elucidate therapeutic mechanisms at a proteomic level for a variety of toxicity endpoints.
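The oversampling half of the SMOTE+ENN resampling mentioned in this abstract can be sketched in a few lines. This is only an illustration of the SMOTE interpolation idea, not the study's pipeline: the Edited Nearest Neighbor cleaning step and the random forest are omitted, and the function name, parameters, and data points are hypothetical.

```python
import random


def smote_samples(minority, k=2, n_new=4, seed=0):
    """Create synthetic minority-class points by interpolating between a
    randomly chosen minority point and one of its k nearest minority
    neighbours (the core idea behind SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point, excluding itself,
        # ordered by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (for example, the "active" compounds).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_samples(minority_points, k=2, n_new=5, seed=1)
```

Each synthetic point lies on a line segment between two existing minority points, which is how the minority class is enlarged without simply duplicating samples.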
4
Mangione W, Falls Z, Samudrala R. Optimal COVID-19 therapeutic candidate discovery using the CANDO platform. Front Pharmacol 2022; 13:970494. [PMID: 36091793 PMCID: PMC9452636 DOI: 10.3389/fphar.2022.970494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/16/2022] [Accepted: 07/07/2022] [Indexed: 01/22/2023] Open
Abstract
The worldwide outbreak of SARS-CoV-2 in early 2020 caused numerous deaths and unprecedented measures to control its spread. We employed our Computational Analysis of Novel Drug Opportunities (CANDO) multiscale therapeutic discovery, repurposing, and design platform to identify small molecule inhibitors of the virus to treat its resulting indication, COVID-19. Initially, few experimental studies existed on SARS-CoV-2, so we optimized our drug candidate prediction pipelines using results from two independent high-throughput screens against prevalent human coronaviruses. Ranked lists of candidate drugs were generated using our open source cando.py software based on viral protein inhibition and proteomic interaction similarity. For the former viral protein inhibition pipeline, we computed interaction scores between all compounds in the corresponding candidate library and eighteen SARS-CoV proteins using an interaction scoring protocol with extensive parameter optimization which was then applied to the SARS-CoV-2 proteome for prediction. For the latter similarity based pipeline, we computed interaction scores between all compounds and human protein structures in our libraries then used a consensus scoring approach to identify candidates with highly similar proteomic interaction signatures to multiple known anti-coronavirus actives. We published our ranked candidate lists at the very beginning of the COVID-19 pandemic. Since then, 51 of our 276 predictions have demonstrated anti-SARS-CoV-2 activity in published clinical and experimental studies. These results illustrate the ability of our platform to rapidly respond to emergent pathogens and provide greater evidence that treating compounds in a multitarget context more accurately describes their behavior in biological systems.
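A consensus scoring step of the kind described for the similarity-based pipeline might look like the sketch below. It is an assumption-laden illustration rather than the cando.py implementation: the function name, the tie-breaking rule (average rank), and the example lists are all hypothetical.

```python
from collections import defaultdict


def consensus_candidates(similar_lists, known_actives, top_n=2):
    """Rank candidates by how many known actives list them among their
    top_n most similar compounds; ties are broken by average rank."""
    votes = defaultdict(int)
    rank_sum = defaultdict(int)
    for active in known_actives:
        for rank, compound in enumerate(similar_lists[active][:top_n], start=1):
            if compound in known_actives:
                continue  # a known active is not a new candidate
            votes[compound] += 1
            rank_sum[compound] += rank
    return sorted(votes, key=lambda c: (-votes[c], rank_sum[c] / votes[c]))


# Hypothetical similarity-ranked lists for three known actives.
similar_lists = {
    "active_1": ["cand_x", "cand_y", "cand_z"],
    "active_2": ["cand_y", "cand_x", "cand_w"],
    "active_3": ["cand_y", "cand_w", "cand_x"],
}
candidates = consensus_candidates(similar_lists, list(similar_lists), top_n=2)
```

Here `cand_y` appears in the top two of all three lists and so ranks first, which mirrors the idea of favoring compounds that resemble multiple known actives.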
Affiliation(s)
- William Mangione
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
- Zackary Falls
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
- Ram Samudrala
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
5
Schuler J, Falls Z, Mangione W, Hudson ML, Bruggemann L, Samudrala R. Evaluating the performance of drug-repurposing technologies. Drug Discov Today 2022; 27:49-64. [PMID: 34400352 PMCID: PMC10014214 DOI: 10.1016/j.drudis.2021.08.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/17/2021] [Revised: 06/20/2021] [Accepted: 08/08/2021] [Indexed: 01/22/2023]
Abstract
Drug-repurposing technologies are growing in number and maturing. However, comparisons to each other and to reality are hindered because of a lack of consensus with respect to performance evaluation. Such comparability is necessary to determine scientific merit and to ensure that only meaningful predictions from repurposing technologies carry through to further validation and eventual patient use. Here, we review and compare performance evaluation measures for these technologies using version 2 of our shotgun repurposing Computational Analysis of Novel Drug Opportunities (CANDO) platform to illustrate their benefits, drawbacks, and limitations. Understanding and using different performance evaluation metrics ensures robust cross-platform comparability, enabling us to continue to strive toward optimal repurposing by decreasing the time and cost of drug discovery and development.
Affiliation(s)
- James Schuler
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, USA.
- Zackary Falls
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, USA
- William Mangione
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, USA
- Matthew L Hudson
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, USA
- Liana Bruggemann
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, USA
- Ram Samudrala
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, USA.
6
Overhoff B, Falls Z, Mangione W, Samudrala R. A Deep-Learning Proteomic-Scale Approach for Drug Design. Pharmaceuticals (Basel) 2021; 14:1277. [PMID: 34959678 PMCID: PMC8709297 DOI: 10.3390/ph14121277] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/24/2021] [Revised: 11/27/2021] [Accepted: 11/29/2021] [Indexed: 12/26/2022] Open
Abstract
Computational approaches have accelerated novel therapeutic discovery in recent decades. The Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multitarget therapeutic discovery, repurposing, and design aims to improve their efficacy and safety by employing a holistic approach that computes interaction signatures between every drug/compound and a large library of non-redundant protein structures corresponding to the human proteome fold space. These signatures are compared and analyzed to determine if a given drug/compound is efficacious and safe for a given indication/disease. In this study, we used a deep learning-based autoencoder to first reduce the dimensionality of CANDO-computed drug-proteome interaction signatures. We then employed a reduced conditional variational autoencoder to generate novel drug-like compounds when given a target encoded "objective" signature. Using this approach, we designed compounds to recreate the interaction signatures for twenty approved and experimental drugs and showed that 16/20 designed compounds were predicted to be significantly (p-value ≤ 0.05) more behaviorally similar relative to all corresponding controls, and 20/20 were predicted to be more behaviorally similar relative to a random control. We further observed that redesigns of objectives developed via rational drug design performed significantly better than those derived from natural sources (p-value ≤ 0.05), suggesting that the model learned an abstraction of rational drug design. We also show that the designed compounds are structurally diverse and synthetically feasible when compared to their respective objective drugs despite consistently high predicted behavioral similarity. Finally, we generated new designs that enhanced thirteen drugs/compounds associated with non-small cell lung cancer and anti-aging properties using their predicted proteomic interaction signatures. 
This study represents a significant step forward in automating holistic therapeutic design with machine learning, enabling the rapid generation of novel, effective, and safe drug leads for any indication.
Affiliation(s)
- Ram Samudrala
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY 14203, USA; (B.O.); (Z.F.); (W.M.)
7
Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022] Open
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In the last years, the approach used in this search presents an important component in computer science with the skyrocketing of machine learning techniques due to its democratization. With the objectives set by the Precision Medicine initiative and the new challenges generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve the objectives set. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage manages to drastically reduce costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies are being used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will be developed in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemical Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fingerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analysis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Nereida Rodríguez-Fernández
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco Cedrón
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco J. Novoa
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Adrian Carballal
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Victor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
- Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
8
GPCR_LigandClassify.py; a rigorous machine learning classifier for GPCR targeting compounds. Sci Rep 2021; 11:9510. [PMID: 33947911 PMCID: PMC8097070 DOI: 10.1038/s41598-021-88939-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/05/2020] [Accepted: 04/12/2021] [Indexed: 02/02/2023] Open
Abstract
The current study describes the construction of various ligand-based machine learning models to be used for drug-repurposing against the family of G-Protein Coupled Receptors (GPCRs). In building these models, we collected > 500,000 data points, encompassing experimentally measured molecular association data of > 160,000 unique ligands against > 250 GPCRs. These data points were retrieved from the GPCR-Ligand Association (GLASS) database. We have used diverse molecular featurization methods to describe the input molecules. Multiple supervised ML algorithms were developed, tested and compared for their accuracy, F scores, as well as for their Matthews' correlation coefficient scores (MCC). Our data suggest that combined with molecular fingerprinting, ensemble decision trees and gradient boosted trees ML algorithms are on the accuracy border of the rather sophisticated deep neural nets (DNNs)-based algorithms. On a test dataset, these models displayed an excellent performance, reaching a ~ 90% classification accuracy. Additionally, we showcase a few examples where our models were able to identify interesting connections between known drugs from the Drug-Bank database and members of the GPCR family of receptors. Our findings are in excellent agreement with previously reported experimental observations in the literature. We hope the models presented in this paper synergize with the currently ongoing interest of applying machine learning modeling in the field of drug repurposing and computational drug discovery in general.
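The three evaluation measures named in this abstract (accuracy, F score, and Matthews correlation coefficient) follow directly from a binary confusion matrix. The sketch below uses the standard definitions, with F1 standing in for "F score"; the counts are invented and are not results from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, F1 score, and Matthews correlation coefficient (MCC)
    computed from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical counts for a balanced test set of 100 ligands.
metrics = classification_metrics(tp=45, fp=5, tn=45, fn=5)
```

With these counts the accuracy and F1 are both 0.9 while the MCC is 0.8, a reminder that MCC is the stricter of the three on the same predictions.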
9
Hudson ML, Samudrala R. Multiscale Virtual Screening Optimization for Shotgun Drug Repurposing Using the CANDO Platform. Molecules 2021; 26:2581. [PMID: 33925237 PMCID: PMC8125683 DOI: 10.3390/molecules26092581] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/25/2021] [Revised: 04/16/2021] [Accepted: 04/19/2021] [Indexed: 12/02/2022] Open
Abstract
Drug repurposing, the practice of utilizing existing drugs for novel clinical indications, has tremendous potential for improving human health outcomes and increasing therapeutic development efficiency. The goal of multi-disease multitarget drug repurposing, also known as shotgun drug repurposing, is to develop platforms that assess the therapeutic potential of each existing drug for every clinical indication. Our Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multitarget repurposing implements several pipelines for the large-scale modeling and simulation of interactions between comprehensive libraries of drugs/compounds and protein structures. In these pipelines, each drug is described by an interaction signature that is compared to all other signatures that are subsequently sorted and ranked based on similarity. Pipelines within the platform are benchmarked based on their ability to recover known drugs for all indications in our library, and predictions are generated based on the hypothesis that (novel) drugs with similar signatures may be repurposed for the same indication(s). The drug-protein interactions used to create the drug-proteome signatures may be determined by any screening or docking method, but the primary approach used thus far has been BANDOCK, our in-house bioanalytical or similarity docking protocol. In this study, we calculated drug-proteome interaction signatures using the publicly available molecular docking method Autodock Vina and created hybrid decision tree pipelines that combined our original bio- and chem-informatic approach with the goal of assessing and benchmarking their drug repurposing capabilities and performance. 
The hybrid decision tree pipeline outperformed the two docking-based pipelines from which it was synthesized, yielding an average indication accuracy of 13.3% at the top10 cutoff (the most stringent), relative to 10.9% and 7.1% for its constituent pipelines, and a random control accuracy of 2.2%. We demonstrate that docking-based virtual screening pipelines have unique performance characteristics and that the CANDO shotgun repurposing paradigm is not dependent on a specific docking method. Our results also provide further evidence that multiple CANDO pipelines can be synthesized to enhance drug repurposing predictive capability relative to their constituent pipelines. Overall, this study indicates that pipelines consisting of varied docking-based signature generation methods can capture unique and useful signals for accurate comparison of drug-proteome interaction signatures, leading to improvements in the benchmarking and predictive performance of the CANDO shotgun drug repurposing platform.
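One plausible reading of the "average indication accuracy at a top-N cutoff" benchmark is sketched below: for each indication with at least two approved drugs, count how often another drug for that indication appears within the cutoff of a drug's similarity-ranked list, then average across indications. This interpretation, the function name, and the toy data are assumptions, not the platform's code.

```python
def average_indication_accuracy(ranked_lists, indications, cutoff=10):
    """Mean, over indications, of the fraction of drugs that recover at
    least one other drug for the same indication within the cutoff."""
    accuracies = []
    for drugs in indications.values():
        if len(drugs) < 2:
            continue  # nothing to recover when an indication has one drug
        hits = 0
        for drug in drugs:
            top = set(ranked_lists[drug][:cutoff])
            others = set(drugs) - {drug}
            if top & others:
                hits += 1
        accuracies.append(hits / len(drugs))
    return sum(accuracies) / len(accuracies) if accuracies else 0.0


# Hypothetical similarity-ranked lists (most similar first) and indications.
ranked_lists = {
    "d1": ["d2", "d3", "d4"],
    "d2": ["d3", "d1", "d4"],
    "d3": ["d4", "d1", "d2"],
    "d4": ["d3", "d2", "d1"],
}
indications = {"ind_a": ["d1", "d2"], "ind_b": ["d3", "d4"], "ind_c": ["d1"]}
score = average_indication_accuracy(ranked_lists, indications, cutoff=1)
```

In this toy case `ind_a` scores 0.5 and `ind_b` scores 1.0 at a cutoff of 1, giving an average of 0.75; a random control would be computed the same way on shuffled lists.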
Affiliation(s)
- Ram Samudrala
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY 14203, USA
10
Mangione W, Falls Z, Chopra G, Samudrala R. cando.py: Open Source Software for Predictive Bioanalytics of Large Scale Drug-Protein-Disease Data. J Chem Inf Model 2020; 60:4131-4136. [PMID: 32515949 PMCID: PMC8098009 DOI: 10.1021/acs.jcim.0c00110] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/28/2022]
Abstract
Traditional drug discovery methods focus on optimizing the efficacy of a drug against a single biological target of interest for a specific disease. However, evidence supports the multitarget theory, i.e., drugs work by exerting their therapeutic effects via interaction with multiple biological targets, which have multiple phenotypic effects. Analytics of drug-protein interactions on a large proteomic scale provides insight into disease systems while also allowing for prediction of putative therapeutics against specific indications. We present a Python package for analysis of drug-proteome and drug-disease relationships implementing the Computational Analysis of Novel Drug Opportunities (CANDO) platform. The CANDO package allows for rapid drug similarity assessment, most notably via an in-house interaction scoring protocol where billions of drug-protein interactions are rapidly scored and the similarity of drug-proteome interaction signatures is calculated. The package also implements a variety of benchmarking protocols for shotgun drug discovery and repurposing, i.e., to determine how every known drug is related to every other in the context of the indications/diseases for which they are approved. Drug predictions are generated through consensus scoring of the most similar compounds to drugs known to treat a particular indication. Support for comparing and ranking novel chemical entities, as well as machine learning modules for both benchmarking and putative drug candidate prediction is also available. The CANDO Python package is available on GitHub at https://github.com/ram-compbio/CANDO, through the Conda Python package installer, and at http://compbio.org/software/.
Affiliation(s)
- William Mangione
- Department of Biomedical Informatics, University at Buffalo, Buffalo, New York 14120, United States
- Zackary Falls
- Department of Biomedical Informatics, University at Buffalo, Buffalo, New York 14120, United States
- Gaurav Chopra
- Department of Chemistry, Purdue Institute for Drug Discovery, Integrated Data Science Institute, Purdue University, West Lafayette, Indiana 47907, United States
- Ram Samudrala
- Department of Biomedical Informatics, University at Buffalo, Buffalo, New York 14120, United States
11
Jarada TN, Rokne JG, Alhajj R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J Cheminform 2020; 12:46. [PMID: 33431024 PMCID: PMC7374666 DOI: 10.1186/s13321-020-00450-7] [Citation(s) in RCA: 130] [Impact Index Per Article: 32.5] [Received: 02/19/2020] [Accepted: 07/13/2020] [Indexed: 01/13/2023] Open
Abstract
Drug repositioning is the process of identifying novel therapeutic potentials for existing drugs and discovering therapies for untreated diseases. Drug repositioning, therefore, plays an important role in optimizing the pre-clinical process of developing novel drugs by saving time and cost compared to the traditional de novo drug discovery processes. Since drug repositioning relies on data for existing drugs and diseases, the enormous growth of publicly available large-scale biological, biomedical, and electronic health-related data along with the high-performance computing capabilities have accelerated the development of computational drug repositioning approaches. Multidisciplinary researchers and scientists have carried out numerous attempts, with different degrees of efficiency and success, to computationally study the potential of repositioning drugs to identify alternative drug indications. This study reviews recent advancements in the field of computational drug repositioning. First, we highlight different drug repositioning strategies and provide an overview of frequently used resources. Second, we summarize computational approaches that are extensively used in drug repositioning studies. Third, we present different computing and experimental models to validate computational methods. Fourth, we address prospective opportunities, including a few target areas. Finally, we discuss challenges and limitations encountered in computational drug repositioning and conclude with an outline of further research directions.
Affiliation(s)
- Tamer N Jarada
  - Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Jon G Rokne
  - Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Reda Alhajj
  - Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
  - Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey
|
12
|
Schuler J, Samudrala R. Fingerprinting CANDO: Increased Accuracy with Structure- and Ligand-Based Shotgun Drug Repurposing. ACS Omega 2019; 4:17393-17403. [PMID: 31656912 PMCID: PMC6812124 DOI: 10.1021/acsomega.9b02160] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/12/2019] [Accepted: 08/30/2019] [Indexed: 05/08/2023]
Abstract
We have upgraded our Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun drug repurposing by including ligand-based, data fusion, and decision tree pipelines. The goal of shotgun drug repurposing is to screen and rank every existing human use drug or compound for every disease/indication. The first version of CANDO implemented a structure-based pipeline that modeled interactions between compounds and proteins on a large scale, generating compound-proteome interaction signatures used to infer the similarity of drug behavior; the new pipelines accomplish this by incorporating molecular fingerprints and the Tanimoto coefficient. We obtain improved benchmarking performance with the new pipelines across all three evaluation metrics used: average indication accuracy, pairwise accuracy, and coverage. The best performing pipeline achieves an average indication accuracy of 19.0% at the top10 cutoff, compared to 11.7% for v1, and 2.2% for a random control. Our results demonstrate that the CANDO drug recovery accuracy is substantially improved by integrating multiple pipelines, thereby enhancing our ability to generate putative therapeutic repurposing candidates, and increasing drug discovery efficiency.
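The abstract above mentions relating compounds through molecular fingerprints and the Tanimoto coefficient. The following is a minimal sketch of that general idea, not the CANDO implementation: fingerprints are assumed to be sets of on-bit indices, and the function names, compound names, and bit values are hypothetical illustrations.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices.

    Defined as |A and B| / (|A| + |B| - |A and B|).
    """
    if not a and not b:
        # Assumption made for this sketch: two empty fingerprints score 0.0.
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: str, library: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Rank every other compound in the library by similarity to the query."""
    query_fp = library[query]
    scores = [
        (name, tanimoto(query_fp, fp))
        for name, fp in library.items()
        if name != query
    ]
    # Most similar compounds first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy fingerprints (bit indices are made up for illustration).
fingerprints = {
    "compound_a": {1, 4, 7, 9},
    "compound_b": {1, 4, 7, 12},
    "compound_c": {2, 3, 5},
}

ranking = rank_by_similarity("compound_a", fingerprints)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a real pipeline the fingerprints would come from a cheminformatics toolkit rather than hand-written sets, and the resulting ranked list would be what a top10-style cutoff is applied to.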
Affiliation(s)
- James Schuler
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences at the University at Buffalo, Buffalo, New York 14203, United States
- Ram Samudrala
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences at the University at Buffalo, Buffalo, New York 14203, United States
|
13
|
Pulley JM, Rhoads JP, Jerome RN, Challa AP, Erreger KB, Joly MM, Lavieri RR, Perry KE, Zaleski NM, Shirey-Rice JK, Aronoff DM. Using What We Already Have: Uncovering New Drug Repurposing Strategies in Existing Omics Data. Annu Rev Pharmacol Toxicol 2019; 60:333-352. [PMID: 31337270 DOI: 10.1146/annurev-pharmtox-010919-023537] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 11/09/2022]
Abstract
The promise of drug repurposing is to accelerate the translation of knowledge to treatment of human disease, bypassing common challenges associated with drug development to be more time- and cost-efficient. Repurposing has an increased chance of success due to the previous validation of drug safety and allows for the incorporation of omics. Hypothesis-generating omics processes inform drug repurposing decision-making methods on drug efficacy and toxicity. This review summarizes drug repurposing strategies and methodologies in the context of the following omics fields: genomics, epigenomics, transcriptomics, proteomics, metabolomics, microbiomics, phenomics, pregomics, and personomics. While each omics field has specific strengths and limitations, incorporating omics into the drug repurposing landscape is integral to its success.
Affiliation(s)
- Jill M Pulley
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Jillian P Rhoads
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Rebecca N Jerome
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Anup P Challa
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Kevin B Erreger
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Meghan M Joly
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Robert R Lavieri
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Kelly E Perry
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Nicole M Zaleski
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- Jana K Shirey-Rice
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
- David M Aronoff
  - Department of Medicine, Division of Infectious Diseases, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA
  - Departments of Obstetrics and Gynecology, and Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
|
14
|
Falls Z, Mangione W, Schuler J, Samudrala R. Exploration of interaction scoring criteria in the CANDO platform. BMC Res Notes 2019; 12:318. [PMID: 31174591 PMCID: PMC6555930 DOI: 10.1186/s13104-019-4356-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/22/2019] [Accepted: 05/31/2019] [Indexed: 01/18/2023] Open
Abstract
OBJECTIVE: Ascertain the optimal interaction scoring criteria for the Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun drug repurposing to improve benchmarking performance, thereby enabling more accurate prediction of novel therapeutic drug-indication pairs. RESULTS: We have investigated and enhanced the interaction scoring criteria in the bioinformatic docking protocol in the newest version of our platform (v1.5), with the best performing interaction scoring criterion yielding increased benchmarking accuracies from 11.7% in v1 to 12.8% in v1.5 at the top10 cutoff (the most stringent one) and correspondingly from 24.9% to 31.2% at the top100 cutoff.
Affiliation(s)
- Zackary Falls
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, 77 Goodell St., Suite 540, Buffalo, NY, 14203, USA
- William Mangione
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, 77 Goodell St., Suite 540, Buffalo, NY, 14203, USA
- James Schuler
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, 77 Goodell St., Suite 540, Buffalo, NY, 14203, USA
- Ram Samudrala
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, 77 Goodell St., Suite 540, Buffalo, NY, 14203, USA
|