51
Hao Y, Moore JH. TargetTox: A Feature Selection Pipeline for Identifying Predictive Targets Associated with Drug Toxicity. J Chem Inf Model 2021; 61:5386-5394. [PMID: 34757743 DOI: 10.1021/acs.jcim.1c00733] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
In silico assessment of drug toxicity is becoming a critical step in drug development. Conventional ligand-based models are limited by low accuracy and lack of interpretability. Further, they often fail to explain cellular mechanisms underlying structure-toxicity associations. We addressed these limitations by incorporating target profile as an intermediate connecting structure to toxicity. To accommodate the high-dimensional feature space, we developed a pipeline named TargetTox that can identify a subset of predictive features. We implemented TargetTox to study 569 targets and 815 adverse events. The features identified by TargetTox comprise less than 10% of the original feature space; nevertheless, they accurately predicted binding outcomes for 377 targets and toxicity outcomes for 36 adverse events. We demonstrated that predictive targets tend to be differentially expressed in the tissue of toxicity. We also rediscovered key cellular functions associated with cardiotoxicity from the predictive targets, as well as markers of skin and liver diseases. Furthermore, we found evidence supporting diagnostic and therapeutic applications of some predictive targets in hepatotoxicity and nephrotoxicity. Our findings highlighted the critical role of predictive targets in cellular mechanisms leading to toxicity. In general, our study improved the interpretability of toxicity prediction without sacrificing accuracy. Our novel pipeline may benefit future studies of high-dimensional data sets.
Affiliation(s)
- Yun Hao
- Genomics and Computational Biology (GCB) Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Jason H Moore
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
52
From serendipity to rational drug design in brain disorders: in silico, in vitro, and in vivo approaches. Curr Opin Pharmacol 2021; 60:177-182. [PMID: 34461562 DOI: 10.1016/j.coph.2021.07.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2021] [Revised: 07/16/2021] [Accepted: 07/19/2021] [Indexed: 11/23/2022]
Abstract
Prolonged life expectancy and stressful lifestyles have increased the risk of developing neurological disorders, including neurodegenerative and psychiatric illnesses. Despite obvious and immediate needs for effective treatment, drug discovery for neurological disorders has been largely serendipitous, whereas hypothesis-driven drug development programs have been remarkably poor. This may be partly due to insufficient knowledge of molecular mechanisms underlying disease pathophysiology, complex genetic and environmental risk factors, and oversimplified diagnostic criteria. Here, we review recent progress in cell type-specific investigations, bioinformatics analyses, and large reference databases, the integration of which, when combined with effective use of animal models, provides novel insights into disease mechanisms, suggests innovative drug development, and ultimately promises superior treatments for patients suffering from neurological disorders.
53
Interpreting machine learning models to investigate circadian regulation and facilitate exploration of clock function. Proc Natl Acad Sci U S A 2021; 118:2103070118. [PMID: 34353905 PMCID: PMC8364196 DOI: 10.1073/pnas.2103070118] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023] Open
Abstract
The circadian clock is an internal molecular 24-h timer that is critical to life on Earth. We describe a series of artificial intelligence (AI)– and machine learning (ML)–based approaches that enable more cost-effective analysis and insight into circadian regulation and function. Throughout the manuscript, we illuminate what is inside the ML “black box” via explanation or interpretation of predictive ML models. Using this interpretation of our models, we derive biological insights into why a prediction was made, alongside accurate predictions. Most innovatively, we use only DNA sequence features for accurate circadian gene expression prediction. Using explainable AI, we define possible, responsible regulatory elements as we make these predictions; this critically requires no prior knowledge of regulatory elements.
The circadian clock is an important adaptation to life on Earth. Here, we use machine learning to predict complex, temporal, and circadian gene expression patterns in Arabidopsis. Most significantly, we classify circadian genes using DNA sequence features generated de novo from public, genomic resources, facilitating downstream application of our methods with no experimental work or prior knowledge needed. We use local model explanation that is transcript specific to rank DNA sequence features, providing a detailed profile of the potential circadian regulatory mechanisms for each transcript. Furthermore, we can discriminate the temporal phase of transcript expression using the local, explanation-derived, and ranked DNA sequence features, revealing hidden subclasses within the circadian class. Model interpretation/explanation provides the backbone of our methodological advances, giving insight into biological processes and experimental design. Next, we use model interpretation to optimize sampling strategies when we predict circadian transcripts using reduced numbers of transcriptomic timepoints. Finally, we predict the circadian time from a single, transcriptomic timepoint, deriving marker transcripts that are most impactful for accurate prediction; this could facilitate the identification of altered clock function from existing datasets.
54
Lim JJ, Li X, Lehmler HJ, Wang D, Gu H, Cui JY. Gut Microbiome Critically Impacts PCB-induced Changes in Metabolic Fingerprints and the Hepatic Transcriptome in Mice. Toxicol Sci 2021; 177:168-187. [PMID: 32544245 DOI: 10.1093/toxsci/kfaa090] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 12/17/2022] Open
Abstract
Polychlorinated biphenyls (PCBs) are ubiquitously detected and have been linked to metabolic diseases. The gut microbiome is recognized as a critical regulator of disease susceptibility; however, little is known about how PCBs and the gut microbiome interact to modulate hepatic xenobiotic and intermediary metabolism. We hypothesized that the gut microbiome regulates PCB-mediated changes in the metabolic fingerprints and hepatic transcriptome. Ninety-day-old female conventional and germ-free mice were orally exposed to the Fox River Mixture (synthetic PCB mixture, 6 or 30 mg/kg) or corn oil (vehicle control, 10 ml/kg), once daily for 3 consecutive days. RNA-seq was conducted in liver, and endogenous metabolites were measured in liver and serum by LC-MS. Prototypical target genes of aryl hydrocarbon receptor, pregnane X receptor, and constitutive androstane receptor were more readily upregulated by PCBs in conventional conditions, indicating that the effects of PCBs on the hepatic transcriptome act partly through the gut microbiome. In a gut microbiome-dependent manner, xenobiotic and steroid metabolism pathways were upregulated, whereas pathways related to the response to misfolded proteins were downregulated by PCBs. At the high PCB dose, NADP and arginine appear to interact with drug-metabolizing enzymes (ie, Cyp1-3 family), which are highly correlated with Ruminiclostridium and Roseburia, providing a novel explanation of gut-liver interaction following PCB exposure. Utilizing the Library of Integrated Network-based Cellular Signatures L1000 database, therapeutics targeting anti-inflammatory and endoplasmic reticulum stress pathways are predicted to be remedies that can mitigate PCB toxicity. Our findings demonstrate that habitation of the gut microbiota drives PCB-mediated hepatic responses. Our study adds knowledge of physiological response differences from PCB exposure and considerations for further investigations for gut microbiome-dependent therapeutics.
Affiliation(s)
- Joe Jongpyo Lim
- Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, Washington 98195
- Xueshu Li
- Department of Occupational and Environmental Health, University of Iowa, Iowa City, Iowa 52242; and
- Hans-Joachim Lehmler
- Department of Occupational and Environmental Health, University of Iowa, Iowa City, Iowa 52242; and
- Dongfang Wang
- Arizona Metabolomics Laboratory, School of Nutrition and Health Promotion, College of Health Solutions, Arizona State University, Scottsdale, Arizona 85259
- Haiwei Gu
- Arizona Metabolomics Laboratory, School of Nutrition and Health Promotion, College of Health Solutions, Arizona State University, Scottsdale, Arizona 85259
- Julia Yue Cui
- Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, Washington 98195
55
Joshi P, Vedhanayagam M, Ramesh R. An Ensembled SVM Based Approach for Predicting Adverse Drug Reactions. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200707141420] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
Background:
Preventing adverse drug reactions (ADRs) is imperative for public safety. Under-reporting of ADRs is prevalent across the world, making it difficult to develop unbiased prediction models. As a result, most models are skewed toward the negative samples, leading to high accuracy but poor performance on other metrics such as precision, recall, F1 score, and AUROC.
Objective:
In this work, we propose a novel way of predicting ADRs by balancing the dataset.
Method:
The whole data set was partitioned into smaller balanced data sets. An SVM with an optimal kernel was trained on each balanced data set, and the prediction of a given ADR for a given drug was obtained by voting among the ensemble of trained SVMs.
Results:
The results are encouraging and comparable with competing methods in the literature, with an average sensitivity of 0.97 across all ADRs. The model was interpreted and explained with SHAP values using various plots.
Conclusion:
A novel way of predicting ADRs by balancing the dataset has been proposed, thereby reducing the effect of unbalanced datasets.
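The partition-and-vote scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`balanced_partitions`, `fit`, `predict`, `majority_vote`) and the toy data are hypothetical, and a nearest-centroid classifier stands in for the SVMs with optimal kernels so that the example needs only the Python standard library.

```python
import random


def balanced_partitions(positives, negatives, seed=0):
    """Pair all minority-class rows with equally sized chunks of the majority class."""
    if not positives:
        raise ValueError("need at least one positive sample")
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    size = len(positives)
    # Leftover negatives that do not fill a whole chunk are dropped in this sketch.
    return [
        (positives, shuffled[start:start + size])
        for start in range(0, len(shuffled) - size + 1, size)
    ]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit(pos, neg):
    """Toy base learner (nearest centroid); a real pipeline would fit an SVM here."""
    return centroid(pos), centroid(neg)


def predict(model, x):
    """Return 1 if x is closer to the positive centroid, else 0."""
    c_pos, c_neg = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
    return 1 if d_pos < d_neg else 0


def majority_vote(models, x):
    """Predict 1 only if more than half of the base learners vote 1."""
    votes = sum(predict(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


# Imbalanced toy data: 3 positive vs 9 negative drugs (made-up feature vectors).
positives = [[2.0, 2.0], [2.2, 1.8], [1.9, 2.1]]
negatives = [[0.0, 0.0], [0.1, -0.1], [-0.2, 0.1], [0.2, 0.2], [-0.1, -0.2],
             [0.0, 0.3], [0.3, 0.0], [-0.3, -0.1], [0.1, 0.1]]

parts = balanced_partitions(positives, negatives)
models = [fit(pos, neg) for pos, neg in parts]
print(len(parts), majority_vote(models, [2.1, 1.9]), majority_vote(models, [0.0, 0.1]))
```

Each base learner sees the same positives but a different slice of negatives, so no single model is trained on a skewed class ratio; the vote then combines them.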
Affiliation(s)
- Pratik Joshi
- Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai, India
56
Schotland P, Racz R, Jackson DB, Soldatos TG, Levin R, Strauss DG, Burkhart K. Target Adverse Event Profiles for Predictive Safety in the Postmarket Setting. Clin Pharmacol Ther 2021; 109:1232-1243. [PMID: 33090463 PMCID: PMC8246740 DOI: 10.1002/cpt.2074] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/20/2019] [Accepted: 08/31/2020] [Indexed: 12/21/2022]
Abstract
We improved a previous pharmacological target adverse-event (TAE) profile model to predict adverse events (AEs) on US Food and Drug Administration (FDA) drug labels at the time of approval. The new model uses more drugs and features for learning as well as a new algorithm. Comparator drugs sharing similar target activities to a drug of interest were evaluated by aggregating AEs from the FDA Adverse Event Reporting System (FAERS), FDA drug labels, and medical literature. An ensemble machine learning model was used to evaluate FAERS case count, disproportionality scores, percent of comparator drug labels with a specific AE, and percent of comparator drugs with the reports of the event in the literature. Overall classifier performance was F1 of 0.71, area under the precision-recall curve of 0.78, and area under the receiver operating characteristic curve of 0.87. TAE analysis continues to show promise as a method to predict adverse events at the time of approval.
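Two of the feature types named in this abstract can be illustrated with a short sketch. The abstract does not state which disproportionality score was used; the proportional reporting ratio (PRR) is assumed here purely for illustration. The function names, report counts, and label sets are all hypothetical.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of spontaneous reports.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    if a + b == 0 or c == 0:
        raise ValueError("PRR is undefined for this table")
    # Same as (a / (a + b)) / (c / (c + d)), rearranged into a single division.
    return (a * (c + d)) / (c * (a + b))


def percent_with_event(comparator_labels, event):
    """Percent of comparator drug labels that list the event."""
    if not comparator_labels:
        return 0.0
    hits = sum(1 for label in comparator_labels if event in label)
    return 100.0 * hits / len(comparator_labels)


# Made-up counts: the event is reported in 20% of the drug's reports
# but in only 1% of reports for all other drugs.
prr = proportional_reporting_ratio(a=20, b=80, c=100, d=9900)

# Made-up label contents for four comparator drugs.
labels = [{"nausea", "rash"}, {"nausea"}, {"headache"}, {"rash", "nausea"}]
share = percent_with_event(labels, "nausea")
print(prr, share)
```

A PRR well above 1 means the event is reported disproportionately often for the drug relative to the background of other drugs.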
Affiliation(s)
- Peter Schotland
- Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Present address: Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Rebecca Racz
- Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Robert Levin
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- David G. Strauss
- Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Keith Burkhart
- Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
57
Kropiwnicki E, Evangelista JE, Stein DJ, Clarke DJB, Lachmann A, Kuleshov MV, Jeon M, Jagodnik KM, Ma’ayan A. Drugmonizome and Drugmonizome-ML: integration and abstraction of small molecule attributes for drug enrichment analysis and machine learning. Database (Oxford) 2021; 2021:baab017. [PMID: 33787872 PMCID: PMC8011435 DOI: 10.1093/database/baab017] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/25/2020] [Revised: 03/11/2021] [Accepted: 03/19/2021] [Indexed: 12/15/2022]
Abstract
Understanding the underlying molecular and structural similarities between seemingly heterogeneous sets of drugs can aid in identifying drug repurposing opportunities and assist in the discovery of novel properties of preclinical small molecules. A wealth of information about drug and small molecule structure, targets, indications and side effects; induced gene expression signatures; and other attributes are publicly available through web-based tools, databases and repositories. By processing, abstracting and aggregating information from these resources into drug set libraries, knowledge about novel properties of drugs and small molecules can be systematically imputed with machine learning. In addition, drug set libraries can be used as the underlying database for drug set enrichment analysis. Here, we present Drugmonizome, a database with a search engine for querying annotated sets of drugs and small molecules for performing drug set enrichment analysis. Utilizing the data within Drugmonizome, we also developed Drugmonizome-ML. Drugmonizome-ML enables users to construct customized machine learning pipelines using the drug set libraries from Drugmonizome. To demonstrate the utility of Drugmonizome, drug sets from 12 independent SARS-CoV-2 in vitro screens were subjected to consensus enrichment analysis. Despite the low overlap among these 12 independent in vitro screens, we identified common biological processes critical for blocking viral replication. To demonstrate Drugmonizome-ML, we constructed a machine learning pipeline to predict whether approved and preclinical drugs may induce peripheral neuropathy as a potential side effect. Overall, the Drugmonizome and Drugmonizome-ML resources provide rich and diverse knowledge about drugs and small molecules for direct systems pharmacology applications. Database URL: https://maayanlab.cloud/drugmonizome/.
Affiliation(s)
- Eryk Kropiwnicki
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- John E Evangelista
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Daniel J Stein
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Daniel J B Clarke
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Alexander Lachmann
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Maxim V Kuleshov
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Minji Jeon
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Kathleen M Jagodnik
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Avi Ma’ayan
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
58
Naveed H, Reglin C, Schubert T, Gao X, Arold ST, Maitland ML. Identifying Novel Drug Targets by iDTPnd: A Case Study of Kinase Inhibitors. Genomics Proteomics & Bioinformatics 2021; 19:986-997. [PMID: 33794377 PMCID: PMC9403029 DOI: 10.1016/j.gpb.2020.05.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/01/2018] [Revised: 01/08/2020] [Accepted: 05/11/2020] [Indexed: 11/16/2022]
Abstract
Current FDA-approved kinase inhibitors cause diverse adverse effects, some of which are due to the mechanism-independent effects of these drugs. Identifying these mechanism-independent interactions could improve drug safety and support drug repurposing. Here, we develop iDTPnd (integrated Drug Target Predictor with negative dataset), a computational approach for large-scale discovery of novel targets for known drugs. For a given drug, we construct a positive structural signature as well as a negative structural signature that captures the weakly conserved structural features of drug-binding sites. To facilitate assessment of unintended targets, iDTPnd also provides a docking-based interaction score and its statistical significance. We confirm the interactions of sorafenib, imatinib, dasatinib, sunitinib, and pazopanib with their known targets at a sensitivity of 52% and a specificity of 55%. We also validate 10 predicted novel targets by using in vitro experiments. Our results suggest that proteins other than kinases, such as nuclear receptors, cytochrome P450, and MHC class I molecules, can also be physiologically relevant targets of kinase inhibitors. Our method is general and broadly applicable for the identification of protein–small molecule interactions, when sufficient drug–target 3D data are available. The code for constructing the structural signatures is available at https://sfb.kaust.edu.sa/Documents/iDTP.zip.
Affiliation(s)
- Hammad Naveed
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA; Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad 44000, Pakistan.
- Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955, Saudi Arabia
- Stefan T Arold
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Biological and Environmental Sciences and Engineering (BESE) Division, Thuwal 23955, Saudi Arabia
- Michael L Maitland
- Inova Center for Personalized Health and Schar Cancer Institute, Falls Church, VA 22042, USA; University of Virginia Cancer Center, Annandale, Virginia 22003, USA
59
Towards the routine use of in silico screenings for drug discovery using metabolic modelling. Biochem Soc Trans 2021; 48:955-969. [PMID: 32369553 PMCID: PMC7329353 DOI: 10.1042/bst20190867] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/15/2020] [Revised: 04/01/2020] [Accepted: 04/06/2020] [Indexed: 12/12/2022]
Abstract
Currently, the development of new effective drugs for cancer therapy is not only hindered by development costs, drug efficacy, and drug safety but also by the rapid occurrence of drug resistance in cancer. Hence, new tools are needed to study the underlying mechanisms in cancer. Here, we discuss the current use of metabolic modelling approaches to identify cancer-specific metabolism and find possible new drug targets and drugs for repurposing. Furthermore, we list valuable resources that are needed for the reconstruction of cancer-specific models by integrating various available datasets with genome-scale metabolic reconstructions using model-building algorithms. We also discuss how new drug targets can be determined by using gene essentiality analysis, an in silico method to predict essential genes in a given condition such as cancer and how synthetic lethality studies could greatly benefit cancer patients by suggesting drug combinations with reduced side effects.
60
Zhang L, Wang Z, Liu R, Li Z, Lin J, Wojciechowicz ML, Huang J, Lee K, Ma'ayan A, He JC. Connectivity Mapping Identifies BI-2536 as a Potential Drug to Treat Diabetic Kidney Disease. Diabetes 2021; 70:589-602. [PMID: 33067313 PMCID: PMC7881868 DOI: 10.2337/db20-0580] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/28/2020] [Accepted: 10/05/2020] [Indexed: 12/11/2022]
Abstract
Diabetic kidney disease (DKD) remains the most common cause of kidney failure, and the treatment options are insufficient. Here, we used a connectivity mapping approach to first collect 15 gene expression signatures from 11 DKD-related published independent studies. Then, by querying the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 data set, we identified drugs and other bioactive small molecules that are predicted to reverse these gene signatures in the diabetic kidney. Among the top consensus candidates, we selected a PLK1 inhibitor (BI-2536) for further experimental validation. We found that PLK1 expression was increased in the glomeruli of both human and mouse diabetic kidneys and localized largely in mesangial cells. We also found that BI-2536 inhibited mesangial cell proliferation and extracellular matrix in vitro and ameliorated proteinuria and kidney injury in DKD mice. Further pathway analysis showed that the genes predicted to be reversed by the PLK1 inhibitor were members of the TNF-α/NF-κB, JAK/STAT, and TGF-β/Smad3 pathways. In vitro, either BI-2536 treatment or knockdown of PLK1 dampened the NF-κB and Smad3 signal transduction and transcriptional activation. Together, these results suggest that the PLK1 inhibitor BI-2536 should be further investigated as a novel therapy for DKD.
Affiliation(s)
- Lu Zhang
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
- Department of Nephrology, The First Affiliated Hospital of Xiamen University, Xiamen, China
- Zichen Wang
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Ruijie Liu
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
- Zhengzhe Li
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
- Jennifer Lin
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Megan L Wojciechowicz
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Jiyi Huang
- Department of Nephrology, The First Affiliated Hospital of Xiamen University, Xiamen, China
- Kyung Lee
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
- Avi Ma'ayan
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- John Cijiang He
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
- Renal Section, James J. Peters Veterans Affairs Medical Center, Bronx, NY
61
Liu A, Walter M, Wright P, Bartosik A, Dolciami D, Elbasir A, Yang H, Bender A. Prediction and mechanistic analysis of drug-induced liver injury (DILI) based on chemical structure. Biol Direct 2021; 16:6. [PMID: 33461600 PMCID: PMC7814730 DOI: 10.1186/s13062-020-00285-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 02/27/2020] [Accepted: 12/01/2020] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard is the use of in silico models, which aim to predict the risk of DILI based on the compound structure. However, these models do not yet show sufficient predictive performance or interpretability to be useful for decision making by themselves, the former partially stemming from the underlying problem of labeling the in vivo DILI risk of compounds in a meaningful way for generating machine learning models. RESULTS As part of the Critical Assessment of Massive Data Analysis (CAMDA) "CMap Drug Safety Challenge" 2019 ( http://camda2019.bioinf.jku.at ), chemical structure-based models were generated using the binarized DILIrank annotations. Support Vector Machine (SVM) and Random Forest (RF) classifiers showed comparable performance to previously published models with a mean balanced accuracy over models generated using 5-fold LOCO-CV inside a 10-fold training scheme of 0.759 ± 0.027 when predicting an external test set. In the models which used predicted protein targets as compound descriptors, we identified the most information-rich proteins which agreed with the mechanisms of action and toxicity of nonsteroidal anti-inflammatory drugs (NSAIDs), one of the most important drug classes causing DILI, stress response via TP53 and biotransformation. In addition, we identified multiple proteins involved in xenobiotic metabolism which could be novel DILI-related off-targets, such as CLK1 and DYRK2. Moreover, we derived potential structural alerts for DILI with high precision, including furan and hydrazine derivatives; however, all derived alerts were present in approved drugs and were overly specific, indicating the need to consider quantitative variables such as dose.
CONCLUSION Using chemical structure-based descriptors such as structural fingerprints and predicted protein targets, DILI prediction models were built with a predictive performance comparable to previous literature. In addition, we derived insights on proteins and pathways statistically (and potentially causally) linked to DILI from these models and inferred new structural alerts related to this adverse endpoint.
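The balanced accuracy reported in this abstract is, for a binary task, the mean of sensitivity and specificity. A minimal sketch of that metric follows; the function name and the toy labels are made up for illustration and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return (tp / pos + tn / neg) / 2


# Toy imbalanced labels: 2 positive compounds, 8 negative compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)
```

On imbalanced labels like these, plain accuracy rewards predicting the majority class, whereas balanced accuracy weights both classes equally, which is why it suits endpoints with skewed class ratios.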
Collapse
Affiliation(s)
- Anika Liu
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Moritz Walter
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Peter Wright
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Aleksandra Bartosik
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Daniela Dolciami
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Department of Pharmaceutical Sciences, University of Perugia, Via del Liceo 1, 06123, Perugia, Italy
| | - Abdurrahman Elbasir
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- ICT Department, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
| | - Hongbin Yang
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Andreas Bender
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| |
|
62
|
Wang M, Ma X, Si J, Tang H, Wang H, Li T, Ouyang W, Gong L, Tang Y, He X, Huang W, Liu X. Adverse Drug Reaction Discovery Using a Tumor-Biomarker Knowledge Graph. Front Genet 2021; 11:625659. [PMID: 33584816 PMCID: PMC7873847 DOI: 10.3389/fgene.2020.625659] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/03/2020] [Accepted: 12/09/2020] [Indexed: 12/14/2022] Open
Abstract
Adverse drug reactions (ADRs) are a major public health concern, and early detection is crucial for drug development and patient safety. Together with the increasing availability of large-scale literature data, machine learning has the potential to predict unknown ADRs from current knowledge. Using machine learning methods, we constructed a Tumor-Biomarker Knowledge Graph (TBKG) from the biomedical literature; it contains four types of nodes: Tumor, Biomarker, Drug, and ADR. Based on this knowledge graph, we not only discovered potential ADRs of antitumor drugs but also provided explanations for them. Experiments on real-world data show that our model achieves an accuracy of 0.81 across three cross-validation runs, and ADR discovery for Osimertinib was chosen for clinical validation. The ADRs of Osimertinib calculated by our model consisted of known ADRs, which were in line with the official manual, and some unreported rare ADRs observed in clinical cases. The results also showed that our model outperformed traditional co-occurrence methods. Moreover, each calculated ADR was attached to its corresponding “tumor-biomarker-drug” paths in the knowledge graph, which could help to obtain in-depth insights into the underlying mechanisms. In conclusion, the tumor-biomarker knowledge graph-based approach is an explainable, biomarker-based method for discovering potential ADRs; it might be valuable to the community working in the emerging field of biomedical literature mining and might provide impetus for research into the mechanisms of ADRs.
Affiliation(s)
- Meng Wang
- School of Computer Science and Engineering, Southeast University, Nanjing, China
| | - Xinyu Ma
- School of Computer Science and Engineering, Southeast University, Nanjing, China
| | - Jingwen Si
- Department of Pharmaceutical Sciences, Tsinghua University, Beijing, China
| | - Hongjia Tang
- Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, China
| | - Haofen Wang
- College of Design and Innovation, Tongji University, Shanghai, China
| | - Tunliang Li
- Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, China
| | - Wen Ouyang
- Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, China
| | - Liying Gong
- Department of Intensive Care Unit, Third Xiangya Hospital, Central South University, Changsha, China
| | - Yongzhong Tang
- Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, China
| | - Xi He
- Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, China
| | - Wei Huang
- Department of Cardiology, Third Xiangya Hospital, Central South University, Changsha, China
| | - Xing Liu
- Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, China
| |
|
63
|
Gao S, Han L, Luo D, Liu G, Xiao Z, Shan G, Zhang Y, Zhou W. Modeling drug mechanism of action with large scale gene-expression profiles using GPAR, an artificial intelligence platform. BMC Bioinformatics 2021; 22:17. [PMID: 33413089 PMCID: PMC7788535 DOI: 10.1186/s12859-020-03915-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/02/2020] [Accepted: 11/30/2020] [Indexed: 01/03/2023] Open
Abstract
Background Querying drug-induced gene expression profiles with machine learning methods is an effective way to reveal drug mechanisms of action (MOAs), and it is strongly supported by the growth of large-scale, high-throughput gene expression databases. However, because code-free and user-friendly applications are lacking, it is not easy for biologists and pharmacologists to model MOAs with state-of-the-art deep learning approaches. Results In this work, a newly developed online collaborative tool, Genetic profile-activity relationship (GPAR), was built to make modeling and predicting MOAs via deep learning easier. Users can use GPAR to customize their training sets, train self-defined MOA prediction models, evaluate model performance, and make further predictions automatically. Cross-validation tests show that GPAR outperforms Gene Set Enrichment Analysis in predicting MOAs. Conclusion GPAR can serve as a better approach to MOA prediction, which may help researchers generate more reliable MOA hypotheses.
Affiliation(s)
- Shengqiao Gao
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China
| | - Lu Han
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China
| | - Dan Luo
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China
| | - Gang Liu
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China
| | - Zhiyong Xiao
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China
| | - Guangcun Shan
- School of Instrumentation Science and Opto-Electronics Engineering and Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, 100083, China
| | - Yongxiang Zhang
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China.
| | - Wenxia Zhou
- Beijing Institute of Pharmacology and Toxicology, State Key Laboratory of Toxicology and Medical Countermeasures, Beijing, 100850, China.
| |
|
64
|
Shukla R, Henkel ND, Alganem K, Hamoud AR, Reigle J, Alnafisah RS, Eby HM, Imami AS, Creeden JF, Miruzzi SA, Meller J, Mccullumsmith RE. Signature-based approaches for informed drug repurposing: targeting CNS disorders. Neuropsychopharmacology 2021; 46:116-130. [PMID: 32604402 PMCID: PMC7688959 DOI: 10.1038/s41386-020-0752-6] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 04/08/2020] [Revised: 05/30/2020] [Accepted: 06/22/2020] [Indexed: 12/15/2022]
Abstract
CNS disorders, and in particular psychiatric illnesses, lack definitive disease-altering therapeutics. The limited understanding of the mechanisms driving these illnesses, together with the slow pace and high cost of drug development, exacerbates this issue. For these reasons, drug repurposing, a less expensive and more time-efficient practice than de novo drug development, has been a promising strategy to overcome the paucity of treatments available for these debilitating disorders. While empirical drug repurposing has been a routine practice in clinical psychiatry, innovative, informed, and cost-effective repurposing efforts using big data ("omics") have been designed to characterize drugs by structural and transcriptomic signatures. These strategies, in conjunction with ontological integration, provide an important opportunity to address knowledge-based challenges associated with drug development for CNS disorders. In this review, we discuss various signature-based in silico approaches to drug repurposing, their integration with multiple omics platforms, and how these data can be used for clinically relevant, evidence-based drug repurposing. These tools provide an exciting translational avenue to merge omics-based drug discovery platforms with patient-specific disease signatures, ultimately facilitating the identification of new therapies for numerous psychiatric disorders.
Affiliation(s)
- Rammohan Shukla
- Department of Neurosciences, University of Toledo, Toledo, OH, USA.
| | | | - Khaled Alganem
- Department of Neurosciences, University of Toledo, Toledo, OH, USA
| | | | - James Reigle
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | | | - Hunter M Eby
- Department of Neurosciences, University of Toledo, Toledo, OH, USA
| | - Ali S Imami
- Department of Neurosciences, University of Toledo, Toledo, OH, USA
| | - Justin F Creeden
- Department of Neurosciences, University of Toledo, Toledo, OH, USA
| | - Scott A Miruzzi
- Department of Neurosciences, University of Toledo, Toledo, OH, USA
| | - Jaroslaw Meller
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Department of Cancer Biology, University of Cincinnati College of Medicine, Cincinnati, OH, 45267, USA
- Department of Environmental Health, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Department of Electrical Engineering and Computing Systems, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Department of Informatics, Nicolaus Copernicus University, Torun, Poland
| | - Robert E Mccullumsmith
- Department of Neurosciences, University of Toledo, Toledo, OH, USA
- Neurosciences Institute, ProMedica, Toledo, OH, USA
| |
|
65
|
Synthesis, characterization, toxic substructure prediction, hepatotoxicity evaluation, marine pathogenic bacteria inhibition, and DFT calculations of a new hydrazone derived from isoniazid. J Mol Struct 2020. [DOI: 10.1016/j.molstruc.2020.128817] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/25/2022]
|
66
|
Ito A, Zhao Q, Tanaka Y, Yasui M, Katayama R, Sun S, Tanimoto Y, Nishikawa Y, Kage-Nakadai E. Metolazone upregulates mitochondrial chaperones and extends lifespan in Caenorhabditis elegans. Biogerontology 2020; 22:119-131. [PMID: 33216250 DOI: 10.1007/s10522-020-09907-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/28/2020] [Accepted: 11/11/2020] [Indexed: 01/01/2023]
Abstract
Accumulating studies have argued that the mitochondrial unfolded protein response (UPRmt) is a mitochondrial stress response that promotes longevity in model organisms. In the present study, we screened an off-patent drug library to identify compounds that activate UPRmt using a mitochondrial chaperone hsp-6::GFP reporter system in Caenorhabditis elegans. Metolazone, a diuretic primarily used to treat congestive heart failure and high blood pressure, was identified as a prominent hit because it upregulated hsp-6::GFP but not the endoplasmic reticulum chaperone hsp-4::GFP. Furthermore, metolazone specifically induced the expression of mitochondrial chaperones in the HeLa cell line. Metolazone also extended the lifespan of worms in an atfs-1- and ubl-5-dependent manner. Notably, metolazone failed to increase lifespan in worms with knocked-down nkcc-1. These results suggest that metolazone activates the UPRmt across species and prolongs the lifespan of C. elegans.
Affiliation(s)
- Ai Ito
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Quichi Zhao
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Yoichiro Tanaka
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Masumi Yasui
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Rina Katayama
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Simo Sun
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Yoshihiko Tanimoto
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Yoshikazu Nishikawa
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan
| | - Eriko Kage-Nakadai
- Faculty of Human Life Science, Department of Food and Nutrition, Osaka City University, Sugimoto 3-3-138 Sumiyoshi-ku, Osaka, 558-8585, Japan.
- The OCU Advanced Research Institute for Natural Science and Technology, Osaka City University, Osaka, 558-8585, Japan.
| |
|
67
|
Xue Y, Ding MQ, Lu X. Learning to encode cellular responses to systematic perturbations with deep generative models. NPJ Syst Biol Appl 2020; 6:35. [PMID: 33159077 PMCID: PMC7648057 DOI: 10.1038/s41540-020-00158-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/16/2020] [Accepted: 10/07/2020] [Indexed: 11/09/2022] Open
Abstract
Cellular signaling systems play a vital role in maintaining homeostasis when a cell is exposed to different perturbations. Components of the systems are organized as hierarchical networks, and perturbing different components often leads to transcriptomic profiles that exhibit compositional statistical patterns. Mining such patterns to investigate how cellular signals are encoded is an important problem in systems biology, where artificial intelligence techniques can be of great assistance. Here, we investigated the capability of deep generative models (DGMs) to model signaling systems and learn representations of cellular states underlying transcriptomic responses to diverse perturbations. Specifically, we show that the variational autoencoder and the supervised vector-quantized variational autoencoder can accurately regenerate gene expression data in response to perturbagen treatments. The models can learn representations that reveal the relationships between different classes of perturbagens and enable mappings between drugs and their target genes. In summary, DGMs can adequately learn and depict how cellular signals are encoded. The resulting representations have broad applications, demonstrating the power of artificial intelligence in systems biology and precision medicine.
Affiliation(s)
- Yifan Xue
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15206, USA
| | - Michael Q Ding
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15206, USA
| | - Xinghua Lu
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15206, USA.
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA, 15206, USA.
| |
|
68
|
McDermott MBA, Wang J, Zhao WN, Sheridan SD, Szolovits P, Kohane I, Haggarty SJ, Perlis RH. Deep Learning Benchmarks on L1000 Gene Expression Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1846-1857. [PMID: 30990190 PMCID: PMC6980363 DOI: 10.1109/tcbb.2019.2910061] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/09/2023]
Abstract
Gene expression data can offer deep, physiological insights beyond the static coding of the genome alone. We believe that realizing this potential requires specialized, high-capacity machine learning methods capable of using underlying biological structure, but the development of such models is hampered by the lack of published benchmark tasks and well-characterized baselines. In this work, we establish such benchmarks and baselines by profiling many classifiers against biologically motivated tasks on two curated views of a large, public gene expression dataset (the LINCS corpus) and one privately produced dataset. We provide these two curated views of the public LINCS dataset and our benchmark tasks to enable direct comparisons to future methodological work and help spur deep learning method development on this modality. In addition to profiling a battery of traditional classifiers, including linear models, random forests, decision trees, K nearest neighbor (KNN) classifiers, and feed-forward artificial neural networks (FF-ANNs), we also test a method novel to this data modality: graph convolutional neural networks (GCNNs), which allow us to incorporate prior biological domain knowledge. We find that GCNNs can be highly performant with large datasets, whereas FF-ANNs consistently perform well. Non-neural classifiers are dominated by linear models and KNN classifiers.
|
69
|
He S, Wen Y, Yang X, Liu Z, Song X, Huang X, Bo X. PIMD: An Integrative Approach for Drug Repositioning Using Multiple Characterization Fusion. Genomics Proteomics & Bioinformatics 2020; 18:565-581. [PMID: 33075523 PMCID: PMC8377380 DOI: 10.1016/j.gpb.2018.10.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/09/2018] [Revised: 09/21/2018] [Accepted: 10/10/2018] [Indexed: 11/28/2022]
Abstract
The accumulation of various types of drug informatics data and computational approaches for drug repositioning can accelerate pharmaceutical research and development. However, the integration of multi-dimensional drug data for precision repositioning remains a pressing challenge. Here, we propose a systematic framework named PIMD to predict drug therapeutic properties by integrating multi-dimensional data for drug repositioning. In PIMD, drug similarity networks (DSNs) based on chemical, pharmacological, and clinical data are fused into an integrated DSN (iDSN) composed of many clusters. Rather than simple fusion, PIMD offers a systematic way to annotate clusters. Unexpected drugs within clusters and drug pairs with a high iDSN similarity score are therefore identified to predict novel therapeutic uses. PIMD provides new insights into the universality, individuality, and complementarity of different drug properties by evaluating the contribution of each property's data. To test the performance of PIMD, we use chemical, pharmacological, and clinical properties to generate an iDSN. Analyses of the contributions of each drug property indicate that this iDSN is driven by all data types and performs better than other DSNs. Within the top 20 recommended drug pairs, 7 drugs have been reported to be repurposed. The source code for PIMD is available at https://github.com/Sepstar/PIMD/.
Affiliation(s)
- Song He
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Yuqi Wen
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Xiaoxi Yang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Zhen Liu
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Xinyu Song
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Xin Huang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Xiaochen Bo
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China.
| |
|
70
|
Shankar S, Bhandari I, Okou DT, Srinivasa G, Athri P. Predicting adverse drug reactions of two-drug combinations using structural and transcriptomic drug representations to train an artificial neural network. Chem Biol Drug Des 2020; 97:665-673. [PMID: 33006799 DOI: 10.1111/cbdd.13802] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/18/2020] [Accepted: 09/20/2020] [Indexed: 12/16/2022]
Abstract
Adverse drug reactions (ADRs) are pharmacological events triggered by drug interactions with various sources of origin including drug-drug interactions. While there are many computational studies that explore models to predict ADRs originating from single drugs, only a few of them explore models that predict ADRs from drug combinations. Further, as far as we know, none of them have developed models using transcriptomic data, specifically the LINCS L1000 drug-induced gene expression data to predict ADRs for drug combinations. In this study, we use the TWOSIDES database as a source of ADRs originating from two-drug combinations. 34,549 common drug pairs between these two databases were used to train an artificial neural network (ANN), to predict 243 ADRs that were induced by at least 10% of the drug pairs. Our model predicts the occurrence of these ADRs with an average accuracy of 82% across a multifold cross-validation.
Affiliation(s)
- Susmitha Shankar
- Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
| | - Ishita Bhandari
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, India
| | - David T Okou
- Division of Gastroenterology, Hepatology and Nutrition, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA, USA
| | - Gowri Srinivasa
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, India
| | - Prashanth Athri
- Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
| |
|
71
|
Saurabh R, Nandi S, Sinha N, Shukla M, Sarkar RR. Prediction of survival rate and effect of drugs on cancer patients with somatic mutations of genes: An AI‐based approach. Chem Biol Drug Des 2020; 96:1005-1019. [DOI: 10.1111/cbdd.13668] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/31/2019] [Revised: 01/24/2020] [Accepted: 02/02/2020] [Indexed: 01/03/2023]
Affiliation(s)
- Rochi Saurabh
- Chemical Engineering and Process Development Division CSIR‐National Chemical Laboratory Pune India
| | - Sutanu Nandi
- Chemical Engineering and Process Development Division CSIR‐National Chemical Laboratory Pune India
- Academy of Scientific & Innovative Research (AcSIR) Ghaziabad India
| | - Noopur Sinha
- Chemical Engineering and Process Development Division CSIR‐National Chemical Laboratory Pune India
- Academy of Scientific & Innovative Research (AcSIR) Ghaziabad India
| | - Mudita Shukla
- Chemical Engineering and Process Development Division CSIR‐National Chemical Laboratory Pune India
| | - Ram Rup Sarkar
- Chemical Engineering and Process Development Division CSIR‐National Chemical Laboratory Pune India
- Academy of Scientific & Innovative Research (AcSIR) Ghaziabad India
| |
|
72
|
Qiu Y, Lu T, Lim H, Xie L. A Bayesian approach to accurate and robust signature detection on LINCS L1000 data. Bioinformatics 2020; 36:2787-2795. [PMID: 32003771 PMCID: PMC7203754 DOI: 10.1093/bioinformatics/btaa064] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/22/2019] [Revised: 12/13/2019] [Accepted: 01/24/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION The LINCS L1000 dataset contains numerous cellular expression data induced by large sets of perturbagens. Although it provides an invaluable resource for drug discovery and for understanding disease mechanisms, the existing peak deconvolution algorithms cannot recover accurate gene expression levels in many cases, which introduces severe noise into the dataset and limits its applications in biomedical studies. RESULTS Here, we present a novel Bayesian-based peak deconvolution algorithm that gives unbiased likelihood estimations for peak locations and characterizes the peaks with probability-based z-scores. Based on this algorithm, we build a pipeline to process raw data from the L1000 assay into signatures that represent the features of perturbagens. The performance of the proposed pipeline is evaluated using the similarity between the signatures of bio-replicates and of drugs with shared targets, and the results show that signatures derived from our pipeline give a substantially more reliable and informative representation of perturbagens than existing methods. Thus, the new pipeline may significantly boost the performance of L1000 data in downstream applications such as drug repurposing, disease modeling and gene function prediction. AVAILABILITY AND IMPLEMENTATION The code and the precomputed data for LINCS L1000 Phase II (GSE 70138) are available at https://github.com/njpipeorgan/L1000-bayesian. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yue Qiu
- Ph.D. Program in Biology, The Graduate Center, The City University of New York, New York, NY 10016, USA
| | - Tianhuan Lu
- Department of Astronomy, Columbia University, New York, NY 10027, USA
| | | | - Lei Xie
- Ph.D. Program in Biology, The Graduate Center, The City University of New York, New York, NY 10016, USA
- Ph.D. Program in Biochemistry
- Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY 10016, USA
- Department of Computer Science, Hunter College, The City University of New York, New York, NY 10016, USA
- Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
| |
|
73
|
Kuleshov MV, Stein DJ, Clarke DJ, Kropiwnicki E, Jagodnik KM, Bartal A, Evangelista JE, Hom J, Cheng M, Bailey A, Zhou A, Ferguson LB, Lachmann A, Ma'ayan A. The COVID-19 Drug and Gene Set Library. Patterns (New York, N.Y.) 2020; 1:100090. [PMID: 32838343 PMCID: PMC7381899 DOI: 10.1016/j.patter.2020.100090] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 05/13/2020] [Revised: 07/05/2020] [Accepted: 07/22/2020] [Indexed: 12/27/2022]
Abstract
In a short period, many research publications that report sets of experimentally validated drugs as potential COVID-19 therapies have emerged. To organize this accumulating knowledge, we developed the COVID-19 Drug and Gene Set Library (https://amp.pharm.mssm.edu/covid19/), a collection of drug and gene sets related to COVID-19 research from multiple sources. The platform enables users to view, download, analyze, visualize, and contribute drug and gene sets related to COVID-19 research. To evaluate the content of the library, we compared the results from six in vitro drug screens for COVID-19 repurposing candidates. Surprisingly, we observe low overlap across screens while highlighting overlapping candidates that should receive more attention as potential therapeutics for COVID-19. Overall, the COVID-19 Drug and Gene Set Library can be used to identify community consensus, make researchers and clinicians aware of new potential therapies, enable machine-learning applications, and facilitate the research community to work together toward a cure.
Affiliation(s)
- Maxim V. Kuleshov
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Daniel J. Stein
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Daniel J.B. Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Eryk Kropiwnicki
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Kathleen M. Jagodnik
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Alon Bartal
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - John E. Evangelista
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Jason Hom
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Minxuan Cheng
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Allison Bailey
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Abigail Zhou
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Laura B. Ferguson
- Department of Neurology, Dell Medical School, University of Texas at Austin, 1601 Trinity Street, Bldg B, Austin, TX 78712, USA
| | - Alexander Lachmann
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Avi Ma'ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC), Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
|
74
|
Janssens GE, Houtkooper RH. Identification of longevity compounds with minimized probabilities of side effects. Biogerontology 2020; 21:709-719. [PMID: 32562114 PMCID: PMC7541369 DOI: 10.1007/s10522-020-09887-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/27/2020] [Accepted: 06/16/2020] [Indexed: 12/15/2022]
Abstract
It is hypothesized that treating the general aging population with compounds that slow aging, geroprotectors, could provide many benefits to society, including a reduction of age-related diseases. It is intuitive that such compounds should cause minimal side effects, since they would be distributed to otherwise healthy individuals for extended periods of time. The question therefore emerges of how we should prioritize geroprotectors discovered in model organisms for clinical testing in humans. In other words, which compounds are least likely to cause harm, while still potentially providing benefit? To systematically answer this question we queried the DrugAge database—containing hundreds of known geroprotectors—and cross-referenced this with a recently published repository of compound side effect predictions. In total, 124 geroprotectors were associated to 800 unique side effects. Geroprotectors with high risks of side effects, some even with risk for death, included lamotrigine and minocycline, while compounds with low side effect risks included spermidine and d-glucosamine. Despite their popularity as top geroprotector candidates for humans, sirolimus and metformin harbored greater risks of side effects than many other candidate geroprotectors, sirolimus being the more severe of the two. Furthermore, we found that a correlation existed between maximum lifespan extension in worms and the likelihood of causing a side effect, suggesting that extreme lifespan extension in model organisms should not necessarily be the priority when screening for novel geroprotectors. We discuss the implications of our findings for prioritizing geroprotectors, suggesting spermidine and d-glucosamine for clinical trials in humans.
Affiliation(s)
- Georges E Janssens
- Laboratory Genetic Metabolic Diseases, Amsterdam UMC, University of Amsterdam, Amsterdam Gastroenterology and Metabolism, Amsterdam Cardiovascular Sciences, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
| | - Riekelt H Houtkooper
- Laboratory Genetic Metabolic Diseases, Amsterdam UMC, University of Amsterdam, Amsterdam Gastroenterology and Metabolism, Amsterdam Cardiovascular Sciences, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
|
75
|
Using human in vitro transcriptome analysis to build trustworthy machine learning models for prediction of animal drug toxicity. Sci Rep 2020; 10:9522. [PMID: 32533004 PMCID: PMC7293302 DOI: 10.1038/s41598-020-66481-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/12/2020] [Accepted: 05/21/2020] [Indexed: 12/03/2022] Open
Abstract
During the development of new drugs or compounds there is a requirement for preclinical trials, commonly involving animal tests, to ascertain the safety of the compound prior to human trials. Machine learning techniques could provide an in-silico alternative to animal models for assessing drug toxicity, thus reducing expensive and invasive animal testing during clinical trials, for drugs that are most likely to fail safety tests. Here we present a machine learning model to predict kidney dysfunction, as a proxy for drug induced renal toxicity, in rats. To achieve this, we use inexpensive transcriptomic profiles derived from human cell lines after chemical compound treatment to train our models combined with compound chemical structure information. Genomics data due to its sparse, high-dimensional and noisy nature presents significant challenges in building trustworthy and transparent machine learning models. Here we address these issues by judiciously building feature sets from heterogenous sources and coupling them with measures of model uncertainty achieved through Gaussian Process based Bayesian models. We combine the use of insight into the feature-wise contributions to our predictions with the use of predictive uncertainties recovered from the Gaussian Process to improve the transparency and trustworthiness of the model.
|
76
|
Machine-Learning Prediction of Oral Drug-Induced Liver Injury (DILI) via Multiple Features and Endpoints. BioMed Research International 2020; 2020:4795140. [PMID: 32509859 PMCID: PMC7254069 DOI: 10.1155/2020/4795140] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/21/2020] [Accepted: 04/17/2020] [Indexed: 12/17/2022]
Abstract
Drug discovery is a costly process which usually takes more than 10 years and billions of dollars for one successful drug to enter the market. Despite all the safety tests, drugs may still cause adverse reactions and be restricted in use or even withdrawn from the market. Drug-induced liver injury (DILI) is one of the major adverse drug reactions, and computational models may be used to predict and reduce it. To assess the computational prediction performance of DILI, we curated DILI endpoints from three databases and prepared drug features including chemical descriptors, therapeutic classifications, gene expressions, and binding proteins. We trained machine-learning models to predict the various DILI endpoints using different drug features. Using the optimal feature sets, the top-performing models obtained areas under the receiver operating characteristic curve (AUC) around 0.8 for some DILI endpoints. We found that some features, including therapeutic classifications and proteins, have good prediction performance towards DILI. We also discovered that the severity of DILI endpoints as well as the selection of negative samples may significantly affect the prediction results. Overall, our study provided a comprehensive collection, curation, and prediction of DILI endpoints using various drug features, which may help the drug researchers to better understand and prevent DILI during the drug discovery process.
|
77
|
Mamoshina P, Bueno-Orovio A, Rodriguez B. Dual Transcriptomic and Molecular Machine Learning Predicts all Major Clinical Forms of Drug Cardiotoxicity. Front Pharmacol 2020; 11:639. [PMID: 32508633 PMCID: PMC7253645 DOI: 10.3389/fphar.2020.00639] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/17/2019] [Accepted: 04/21/2020] [Indexed: 11/13/2022] Open
Abstract
Computational methods can increase productivity of drug discovery pipelines, through overcoming challenges such as cardiotoxicity identification. We demonstrate prediction and preservation of cardiotoxic relationships for six drug-induced cardiotoxicity types using a machine learning approach on a large collected and curated dataset of transcriptional and molecular profiles (1,131 drugs, 35% with known cardiotoxicities, and 9,933 samples). The algorithm generality is demonstrated through validation in an independent drug dataset, in addition to cross-validation. The best prediction attains an average accuracy of 79% in area under the curve (AUC) for safe versus risky drugs, across all six cardiotoxicity types on validation and 66% on the unseen set of drugs. Individual cardiotoxicities for specific drug types are also predicted with high accuracy, including cardiac disorder signs and symptoms for a previously unseen set of anti-inflammatory agents (AUC = 80%) and heart failures for an unseen set of anti-neoplastic agents (AUC = 76%). Besides, independent testing on transcriptional data from the Drug Toxicity Signature Generation Center (DToxS) produces similar results in terms of accuracy and shows an average AUC of 72% for previously seen drugs and 60% for unseen respectively. Given the ubiquitous manifestation of multiple drug adverse effects in every human organ, the methodology is expected to be applicable to additional tissue-specific side effects beyond cardiotoxicity.
Affiliation(s)
- Polina Mamoshina
- Department of Computer Science, University of Oxford, Oxford, United Kingdom; Insilico Medicine Hong Kong Ltd, Hong Kong, Hong Kong
| | | | - Blanca Rodriguez
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
|
78
|
Wang Z, Guo K, Gao P, Pu Q, Wu M, Li C, Hur J. Identification of Repurposable Drugs and Adverse Drug Reactions for Various Courses of COVID-19 Based on Single-Cell RNA Sequencing Data. arXiv 2020; arXiv:2005.07856v2. [PMID: 33299905 PMCID: PMC7724679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 12/04/2020] [Indexed: 10/26/2022]
Abstract
Coronavirus disease 2019 (COVID-19) has impacted almost every part of human life worldwide, posing a massive threat to human health. There is no specific drug for COVID-19, highlighting the urgent need for the development of effective therapeutics. To identify potentially repurposable drugs, we employed a systematic approach to mine candidates from U.S. FDA-approved drugs and preclinical small-molecule compounds by integrating the gene expression perturbation data for chemicals from the Library of Integrated Network-Based Cellular Signatures project with a publicly available single-cell RNA sequencing dataset from mild and severe COVID-19 patients. We identified 281 FDA-approved drugs that have the potential to be effective against SARS-CoV-2 infection, 16 of which are currently undergoing clinical trials to evaluate their efficacy against COVID-19. We experimentally tested the inhibitory effects of tyrphostin-AG-1478 and brefeldin-a on the replication of the single-stranded ribonucleic acid (ssRNA) virus influenza A virus. In conclusion, we have identified a list of repurposable anti-SARS-CoV-2 drugs using a systems biology approach.
Affiliation(s)
- Zhihan Wang
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan 610041, China
| | - Kai Guo
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
| | - Pan Gao
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, China
| | - Qinqin Pu
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
| | - Min Wu
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
| | - Changlong Li
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan 610041, China
| | - Junguk Hur
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
|
79
|
Xie L, He S, Zhang Z, Lin K, Bo X, Yang S, Feng B, Wan K, Yang K, Yang J, Ding Y. Domain-adversarial multi-task framework for novel therapeutic property prediction of compounds. Bioinformatics 2020; 36:2848-2855. [PMID: 31999334 DOI: 10.1093/bioinformatics/btaa063] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/15/2019] [Revised: 12/23/2019] [Accepted: 01/23/2020] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION With the rapid development of high-throughput technologies, parallel acquisition of large-scale drug-informatics data provides significant opportunities to improve pharmaceutical research and development. One important application is the purpose prediction of small-molecule compounds with the objective of specifying the therapeutic properties of extensive purpose-unknown compounds and repurposing the novel therapeutic properties of FDA-approved drugs. Such a problem is extremely challenging because compound attributes include heterogeneous data with various feature patterns, such as drug fingerprints, drug physicochemical properties and drug perturbation gene expressions. Moreover, there is a complex non-linear dependency among heterogeneous data. In this study, we propose a novel domain-adversarial multi-task framework for integrating shared knowledge from multiple domains. The framework first uses an adversarial strategy to learn target representations and then models non-linear dependency among several domains. RESULTS Experiments on two real-world datasets illustrate that our approach achieves an obvious improvement over competitive baselines. The novel therapeutic properties of purpose-unknown compounds that we predicted have been widely reported or brought to clinics. Furthermore, our framework can integrate various attributes beyond the three domains examined herein and can be applied in industry for screening significant numbers of small-molecule drug candidates. AVAILABILITY AND IMPLEMENTATION The source code and datasets are available at https://github.com/JohnnyY8/DAMT-Model. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lingwei Xie
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Song He
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Zhongnan Zhang
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Kunhui Lin
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Xiaochen Bo
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Shu Yang
- Department of Computer Science, UC Santa Barbara, Santa Barbara, CA 93106, USA
| | - Boyuan Feng
- Department of Computer Science, UC Santa Barbara, Santa Barbara, CA 93106, USA
| | - Kun Wan
- Department of Computer Science, UC Santa Barbara, Santa Barbara, CA 93106, USA
| | - Kang Yang
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Jie Yang
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Yufei Ding
- Department of Computer Science, UC Santa Barbara, Santa Barbara, CA 93106, USA
|
80
|
A compound attributes-based predictive model for drug induced liver injury in humans. PLoS One 2020; 15:e0231252. [PMID: 32294131 PMCID: PMC7159228 DOI: 10.1371/journal.pone.0231252] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/13/2019] [Accepted: 02/29/2020] [Indexed: 11/19/2022] Open
Abstract
Drug induced liver injury (DILI) is one of the key safety concerns in drug development. To assess the likelihood of drug candidates with potential adverse reactions of the liver, we propose a compound attributes-based approach to predicting hepatobiliary disorders that are routinely reported to the US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). Specifically, we developed a support vector machine (SVM) model with recursive feature extraction, based on physicochemical and structural properties of compounds as model input. Cross validation demonstrates that the predictive model has a robust performance, averaging 70% for both sensitivity and specificity over 500 trials. An independent validation was performed on public benchmark drugs and the results suggest potential utility of our model for identifying safety alerts. This in silico approach, upon further validation, would ultimately be implemented, together with other in vitro safety assays, for screening compounds early in drug development.
|
81
|
Kozawa S, Sagawa F, Endo S, De Almeida GM, Mitsuishi Y, Sato TN. Predicting Human Clinical Outcomes Using Mouse Multi-Organ Transcriptome. iScience 2020; 23:100791. [PMID: 31928967 PMCID: PMC7033637 DOI: 10.1016/j.isci.2019.100791] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/31/2019] [Revised: 10/17/2019] [Accepted: 12/16/2019] [Indexed: 12/27/2022] Open
Abstract
Approximately 90% of pre-clinically validated drugs fail in clinical trials owing to unanticipated clinical outcomes, costing over several hundred million US dollars per drug. Despite such critical importance, translating pre-clinical data to clinical outcomes remains a major challenge. Herein, we designed a modality-independent and unbiased approach to predict clinical outcomes of drugs. The approach exploits their multi-organ transcriptome patterns induced in mice and a unique mouse-transcriptome database "humanized" by machine learning algorithms and human clinical outcome datasets. The cross-validation with small-molecule, antibody, and peptide drugs shows effective and efficient identification of the previously known outcomes of 5,519 adverse events and 11,312 therapeutic indications. In addition, the approach is adaptable to deducing potential molecular mechanisms underlying these outcomes. Furthermore, the approach identifies previously unsuspected repositioning targets. These results, together with the fact that it requires no prior structural or mechanistic information of drugs, illustrate its versatile applications to the drug development process.
Affiliation(s)
- Satoshi Kozawa
- Karydo TherapeutiX, Inc., Kyoto, Japan; ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan; The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
| | - Fumihiko Sagawa
- Karydo TherapeutiX, Inc., Kyoto, Japan; ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan; The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
| | - Satsuki Endo
- Karydo TherapeutiX, Inc., Kyoto, Japan; ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan; The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
| | | | | | - Thomas N Sato
- Karydo TherapeutiX, Inc., Kyoto, Japan; ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan; The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan; Biomedical Engineering, Cornell University, Ithaca, NY, USA; Centenary Institute, Sydney, Australia; V-iClinix Laboratory, Nara Medical University, Nara, Japan.
|
82
|
Chierici M, Francescatto M, Bussola N, Jurman G, Furlanello C. Predictability of drug-induced liver injury by machine learning. Biol Direct 2020; 15:3. [PMID: 32054490 PMCID: PMC7020573 DOI: 10.1186/s13062-020-0259-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/25/2018] [Accepted: 01/30/2020] [Indexed: 12/13/2022] Open
Abstract
Background Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment Massive Data Analysis group proposed the CMap Drug Safety challenge focusing on DILI prediction. Methods and results The challenge data included Affymetrix GeneChip expression profiles for the two cancer cell lines MCF7 and PC3 treated with 276 drug compounds and empty vehicles. Binary DILI labeling and a recommended train/test split for the development of predictive classification approaches were also provided. We devised three deep learning architectures for DILI prediction on the challenge data and compared them to random forest and multi-layer perceptron classifiers. On a subset of the data and for some of the models we additionally tested several strategies for balancing the two DILI classes and to identify alternative informative train/test splits. All the models were trained with the MAQC data analysis protocol (DAP), i.e., 10x5 cross-validation over the training set. In all the experiments, the classification performance in both cross-validation and external validation gave Matthews correlation coefficient (MCC) values below 0.2. We observed minimal differences between the two cell lines. Notably, deep learning approaches did not give an advantage on the classification performance. Discussion We extensively tested multiple machine learning approaches for the DILI classification task obtaining poor to mediocre performance. The results suggest that the CMap expression data on the two cell lines MCF7 and PC3 are not sufficient for accurate DILI label prediction. Reviewers This article was reviewed by Maciej Kandula and Paweł P. Labaj.
Affiliation(s)
- Marco Chierici
- Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy.
| | | | - Nicole Bussola
- Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy; Department CIBIO, University of Trento, Via Sommarive 9, Trento, 38123, Italy
| | - Giuseppe Jurman
- Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy
|
83
|
Liang X, Zhang P, Li J, Fu Y, Qu L, Chen Y, Chen Z. Learning important features from multi-view data to predict drug side effects. J Cheminform 2019; 11:79. [PMID: 33430979 PMCID: PMC6916463 DOI: 10.1186/s13321-019-0402-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/12/2019] [Accepted: 12/05/2019] [Indexed: 02/06/2023] Open
Abstract
The problem of drug side effects is one of the most crucial issues in pharmacological development. As there are many limitations in current experimental and clinical methods for detecting side effects, many computational algorithms have been developed to predict side effects with different types of drug information. However, there is still a lack of methods which could integrate heterogeneous data to predict side effects and select important features at the same time. Here, we propose a novel computational framework based on multi-view and multi-label learning for side effect prediction. Four different types of drug features are collected and a graph model is constructed from each feature profile. After that, all the single view graphs are combined to regularize the linear regression functions which describe the relationships between drug features and side effect labels. L1 penalties are imposed on the regression coefficient matrices in order to select features relevant to side effects. Additionally, the correlations between side effect labels are also incorporated into the model by graph Laplacian regularization. The experimental results show that the proposed method could not only provide more accurate prediction for side effects but also select drug features related to side effects from heterogeneous data. Some case studies are also supplied to illustrate the utility of our method for prediction of drug side effects.
Affiliation(s)
- Xujun Liang
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China.
| | - Pengfei Zhang
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China
| | - Jun Li
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China
| | - Ying Fu
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China
| | - Lingzhi Qu
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China
| | - Yongheng Chen
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China
| | - Zhuchu Chen
- NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, XiangYa Road, Changsha, China
|
84
|
Yang G, Ma A, Qin ZS. An Integrated System Biology Approach Yields Drug Repositioning Candidates for the Treatment of Heart Failure. Front Genet 2019; 10:916. [PMID: 31608126 PMCID: PMC6773955 DOI: 10.3389/fgene.2019.00916] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/29/2019] [Accepted: 08/29/2019] [Indexed: 12/20/2022] Open
Abstract
Identifying effective pharmacological treatments for heart failure (HF) patients remains critically important. Given that the development of drugs de novo is expensive and time consuming, drug repositioning has become an increasingly important branch. In the present study, we propose a two-step drug repositioning pipeline and investigate the novel therapeutic effects of existing drugs approved by the US Food and Drug Administration to discover potential therapeutic drugs for HF. In the first step, we compared the gene expression pattern of HF patients with drug-induced gene expression profiles to obtain preliminary candidates. In the second step, we performed a systems biology approach based on the known protein–protein interaction information and targets of drugs to narrow down preliminary candidates to obtain final candidates. Drug set enrichment analysis and literature search were applied to assess the performance of our repositioning approach. We also constructed a mode of action network for each candidate and performed pathway analysis for each candidate using genes contained in their mode of action network to uncover pathways that potentially reflect the mechanisms of candidates’ therapeutic efficacy to HF. We discovered numerous preliminary candidates, some of which are used in clinical practice and supported by the literature. The final candidates contained nearly all of the preliminary candidates supported by previous studies. Drug set enrichment analysis and literature search support the validity of our repositioning approach. The mode of action network for each candidate not only displayed the underlying mechanisms of drug efficacy but also uncovered potential biomarkers and therapeutic targets for HF. Our two-step drug repositioning approach is efficient to find candidates with potential therapeutic efficiency to HF supported by the literature and might be of particular use in the discovery of novel effective pharmacological therapies for HF.
Affiliation(s)
- Guodong Yang
- Department of Cardiovascular Medicine, First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China; Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
| | - Aiqun Ma
- Department of Cardiovascular Medicine, First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
| | - Zhaohui S Qin
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
|
85
|
Pérez-Parras Toledano J, García-Pedrajas N, Cerruela-García G. Multilabel and Missing Label Methods for Binary Quantitative Structure-Activity Relationship Models: An Application for the Prediction of Adverse Drug Reactions. J Chem Inf Model 2019; 59:4120-4130. [PMID: 31514503 DOI: 10.1021/acs.jcim.9b00611] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
Abstract
The prediction of adverse drug reactions in the discovery of new medicines is highly challenging. In the task of predicting the adverse reactions of chemical compounds, information about different targets is often available. Although we can focus on every adverse drug reaction prediction separately, multilabel approaches have been proven useful in many research areas for taking advantage of the relationship among the targets. However, when approaching the prediction problem from a multilabel point of view, we have to deal with the lack of information for some labels. This missing labels problem is a relevant issue in the field of cheminformatics approaches. This paper aims to predict the adverse drug reaction of commercial drugs using a multilabel approach where the possible presence of missing labels is also taken into consideration. We propose the use of multilabel methods to deal with the prediction of a large set of 27 different adverse reaction targets. We also propose the use of multilabel methods specifically designed to deal with the missing labels problem to test their ability to solve this difficult problem. The results show the validity of the proposed approach, demonstrating a superior performance of the multilabel method compared with the single-label approach in addressing the problem of adverse drug reaction prediction.
Affiliation(s)
- José Pérez-Parras Toledano
- University of Córdoba , Department of Computing and Numerical Analysis, Campus de Rabanales , Albert Einstein Building , E-14071 Córdoba , Spain
| | - Nicolás García-Pedrajas
- University of Córdoba , Department of Computing and Numerical Analysis, Campus de Rabanales , Albert Einstein Building , E-14071 Córdoba , Spain
| | - Gonzalo Cerruela-García
- University of Córdoba , Department of Computing and Numerical Analysis, Campus de Rabanales , Albert Einstein Building , E-14071 Córdoba , Spain
|
86
|
Nassiri I, McCall MN. Systematic exploration of cell morphological phenotypes associated with a transcriptomic query. Nucleic Acids Res 2019; 46:e116. [PMID: 30011038 PMCID: PMC6212779 DOI: 10.1093/nar/gky626] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 06/13/2017] [Accepted: 07/10/2018] [Indexed: 12/23/2022] Open
Abstract
Cell morphological phenotypes, including shape, size, intensity, and texture of cellular compartments have been shown to change in response to perturbation with small molecule compounds. Image-based cell profiling or cell morphological profiling has been used to associate changes of cell morphological features with alterations in cellular function and to infer molecular mechanisms of action. Recently, the Library of Integrated Network-based Cellular Signatures (LINCS) Project has measured gene expression and performed image-based cell profiling on cell lines treated with 9515 unique compounds. These data provide an opportunity to study the interdependence between transcription and cell morphology. Previous methods to investigate cell phenotypes have focused on targeting candidate genes as components of known pathways, RNAi morphological profiling, and cataloging morphological defects; however, these methods do not provide an explicit model to link transcriptomic changes with corresponding alterations in morphology. To address this, we propose a cell morphology enrichment analysis to assess the association between transcriptomic alterations and changes in cell morphology. Additionally, for a new transcriptomic query, our approach can be used to predict associated changes in cellular morphology. We demonstrate the utility of our method by applying it to cell morphological changes in a human bone osteosarcoma cell line.
Affiliation(s)
- Isar Nassiri
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA; Department of Oncology, Weatherall Institute for Molecular Medicine, University of Oxford, UK
| | - Matthew N McCall
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA; Department of Biomedical Genetics, University of Rochester Medical Center, Rochester, NY, USA
|
87
|
Wang Z, Lachmann A, Keenan AB, Ma'ayan A. L1000FWD: fireworks visualization of drug-induced transcriptomic signatures. Bioinformatics 2019; 34:2150-2152. [PMID: 29420694 DOI: 10.1093/bioinformatics/bty060] [Citation(s) in RCA: 110] [Impact Index Per Article: 18.3] [Received: 06/13/2017] [Accepted: 02/05/2018] [Indexed: 11/12/2022] Open
Abstract
Motivation: As part of the NIH Library of Integrated Network-based Cellular Signatures program, hundreds of thousands of transcriptomic signatures were generated with the L1000 technology, profiling the response of human cell lines to over 20 000 small molecule compounds. This effort is a promising approach toward revealing the mechanisms-of-action (MOA) for marketed drugs and other less studied potential therapeutic compounds. Results: L1000 fireworks display (L1000FWD) is a web application that provides interactive visualization of over 16 000 drug and small-molecule induced gene expression signatures. L1000FWD enables coloring of signatures by different attributes such as cell type, time point, concentration, as well as drug attributes such as MOA and clinical phase. Signature similarity search is implemented to enable the search for mimicking or opposing signatures given as input of up and down gene sets. Each point on the L1000FWD interactive map is linked to a signature landing page, which provides multifaceted knowledge from various sources about the signature and the drug. Notably such information includes most frequent diagnoses, co-prescribed drugs and age distribution of prescriptions as extracted from the Mount Sinai Health System electronic medical records. Overall, L1000FWD serves as a platform for identifying functions for novel small molecules using unsupervised clustering, as well as for exploring drug MOA. Availability and implementation: L1000FWD is freely accessible at: http://amp.pharm.mssm.edu/L1000FWD. Supplementary information: Supplementary data are available at Bioinformatics online.
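The idea of searching for mimicking or opposing signatures from up and down gene sets can be sketched with a simple signed overlap score. This is a minimal illustration under stated assumptions and is not the scoring used by L1000FWD: the functions `signature_similarity` and `rank_signatures`, the normalization, and the toy library are all hypothetical.

```python
def signature_similarity(query_up, query_down, sig_up, sig_down):
    """Signed overlap between a query and one library signature.

    Positive scores suggest a mimicking signature (up matches up, down
    matches down); negative scores suggest an opposing one. The score is
    normalized by the query size, so it lies in [-1, 1]. This scoring rule
    is an assumption made for illustration.
    """
    query_up, query_down = set(query_up), set(query_down)
    sig_up, sig_down = set(sig_up), set(sig_down)
    concordant = len(query_up & sig_up) + len(query_down & sig_down)
    discordant = len(query_up & sig_down) + len(query_down & sig_up)
    total = len(query_up) + len(query_down)
    return (concordant - discordant) / total if total else 0.0


def rank_signatures(query_up, query_down, library):
    """Rank library signatures from most mimicking to most opposing.

    library maps a signature id to a (up_genes, down_genes) pair.
    """
    scores = {
        sig_id: signature_similarity(query_up, query_down, up, down)
        for sig_id, (up, down) in library.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy library with placeholder gene names.
library = {
    "mimic": ({"A", "B"}, {"C"}),
    "opposite": ({"C"}, {"A", "B"}),
    "unrelated": ({"X"}, {"Y"}),
}
ranked = rank_signatures({"A", "B"}, {"C"}, library)
```

The top of the ranking holds candidate mimickers and the bottom holds candidate reversers of the query signature.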
Affiliation(s)
- Zichen Wang
- Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Alexander Lachmann
- Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Alexandra B Keenan
- Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Avi Ma'ayan
- Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
88
|
Keenan AB, Wojciechowicz ML, Wang Z, Jagodnik KM, Jenkins SL, Lachmann A, Ma'ayan A. Connectivity Mapping: Methods and Applications. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-072018-021211] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 11/09/2022]
Abstract
Connectivity mapping resources consist of signatures representing changes in cellular state following systematic small-molecule, disease, gene, or other form of perturbations. Such resources enable the characterization of signatures from novel perturbations based on similarity; provide a global view of the space of many themed perturbations; and allow the ability to predict cellular, tissue, and organismal phenotypes for perturbagens. A signature search engine enables hypothesis generation by finding connections between query signatures and the database of signatures. This framework has been used to identify connections between small molecules and their targets, to discover cell-specific responses to perturbations and ways to reverse disease expression states with small molecules, and to predict small-molecule mimickers for existing drugs. This review provides a historical perspective and the current state of connectivity mapping resources with a focus on both methodology and community implementations.
Affiliation(s)
- Alexandra B. Keenan
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Megan L. Wojciechowicz
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Zichen Wang
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Kathleen M. Jagodnik
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Sherry L. Jenkins
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Alexander Lachmann
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Avi Ma'ayan
- Department of Pharmacological Sciences and Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
|
89
|
Gal J, Milano G, Ferrero JM, Saâda-Bouzid E, Viotti J, Chabaud S, Gougis P, Le Tourneau C, Schiappa R, Paquet A, Chamorey E. Optimizing drug development in oncology by clinical trial simulation: Why and how? Brief Bioinform 2019; 19:1203-1217. [PMID: 28575140 DOI: 10.1093/bib/bbx055] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/07/2017] [Indexed: 12/11/2022] Open
Abstract
In therapeutic research, the safety and efficacy of pharmaceutical products are necessarily tested on humans via clinical trials after an extensive and expensive preclinical development period. Methodologies such as computer modeling and clinical trial simulation (CTS) might represent a valuable option to reduce animal and human assays. The relevance of these methods is well recognized in pharmacokinetics and pharmacodynamics from the preclinical phase to postmarketing. However, they are barely used and are poorly regarded for drug approval, despite Food and Drug Administration and European Medicines Agency recommendations. The generalization of CTS could be greatly facilitated by the availability of software for modeling biological systems, by clinical trial studies and hospital databases. Data sharing and data merging raise legal, policy and technical issues that will need to be addressed. Development of future molecules will have to use CTS for faster development and thus enable better patient management. Drug activity modeling coupled with disease modeling, optimal use of medical data and increased computing speed should allow this leap forward. The realization of CTS requires not only bioinformatics tools to allow interconnection and global integration of all clinical data but also a universal legal framework to protect the privacy of every patient. While recognizing that CTS can never replace 'real-life' trials, they should be implemented in future drug development schemes to provide quantitative support for decision-making. This in silico medicine opens the way to the P4 medicine: predictive, preventive, personalized and participatory.
Affiliation(s)
- Jocelyn Gal
- Epidemiology and Biostatistics Unit at the Antoine Lacassagne Center, Nice, France
| | | | | | | | | | | | - Paul Gougis
- Pitié-Salpêtrière Hospital in Paris, France
| | | | | | - Agnes Paquet
- Molecular and Cellular Pharmacology Institute of Sophia Antipolis, Valbonne, France
|
90
|
Ben Guebila M, Thiele I. Predicting gastrointestinal drug effects using contextualized metabolic models. PLoS Comput Biol 2019; 15:e1007100. [PMID: 31242176 PMCID: PMC6594586 DOI: 10.1371/journal.pcbi.1007100] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/07/2018] [Accepted: 05/13/2019] [Indexed: 12/28/2022] Open
Abstract
Gastrointestinal side effects are among the most common classes of adverse reactions associated with orally absorbed drugs. These effects decrease patient compliance with the treatment and induce undesirable physiological effects. The prediction of drug action on the gut wall based on in vitro data solely can improve the safety of marketed drugs and first-in-human trials of new chemical entities. We used publicly available data of drug-induced gene expression changes to build drug-specific small intestine epithelial cell metabolic models. The combination of measured in vitro gene expression and in silico predicted metabolic rates in the gut wall was used as features for a multilabel support vector machine to predict the occurrence of side effects. We showed that combining local gut wall-specific metabolism with gene expression performs better than gene expression alone, which indicates the role of small intestine metabolism in the development of adverse reactions. Furthermore, we reclassified FDA-labeled drugs with respect to their genetic and metabolic profiles to show hidden similarities between seemingly different drugs. The linkage of xenobiotics to their transcriptomic and metabolic profiles could take pharmacology far beyond the usual indication-based classifications.
Affiliation(s)
- Marouen Ben Guebila
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Ines Thiele
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- School of Medicine, National University of Ireland, Galway, University Road, Galway, Ireland
- Discipline of Microbiology, School of Natural Sciences, National University of Ireland, Galway, University Road, Galway, Ireland
|
91
|
Musa A, Tripathi S, Dehmer M, Yli-Harja O, Kauffman SA, Emmert-Streib F. Systems Pharmacogenomic Landscape of Drug Similarities from LINCS data: Drug Association Networks. Sci Rep 2019; 9:7849. [PMID: 31127155 PMCID: PMC6534546 DOI: 10.1038/s41598-019-44291-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/12/2018] [Accepted: 05/08/2019] [Indexed: 02/01/2023] Open
Abstract
Modern research in the biomedical sciences is data-driven, utilizing high-throughput technologies to generate big genomic data. The Library of Integrated Network-based Cellular Signatures (LINCS) is an example of a large-scale genomic data repository providing hundreds of thousands of high-dimensional gene expression measurements for thousands of drugs and dozens of cell lines. However, the remaining challenge is how to use these data effectively for pharmacogenomics. In this paper, we use LINCS data to construct drug association networks (DANs) representing the relationships between drugs. By using the Anatomical Therapeutic Chemical (ATC) classification of drugs we demonstrate that the DANs represent a systems pharmacogenomic landscape of drugs summarizing the entire LINCS repository on a genomic scale meaningfully. Here we identify the modules of the DANs as therapeutic attractors of the ATC drug classes.
Affiliation(s)
- Aliyu Musa
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
| | - Shailesh Tripathi
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Wehrgrabengasse 1-3, 4400, Steyr, Austria
| | - Matthias Dehmer
- Department for Biomedical Computer Science and Mechatronics, UMIT - The Health and Lifesciences University, Eduard Wallnoefer Zentrum 1, 6060, Hall in Tyrol, Austria
- College of Computer and Control Engineering, Nankai University, Tianjin, 300350, P.R. China
- Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Wehrgrabengasse 1-3, 4400, Steyr, Austria
| | - Olli Yli-Harja
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Computational Systems Biology Lab, Tampere University of Technology, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute for Systems Biology, Seattle, WA, 98109, USA
| | | | - Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
|
92
|
Niu C, Jiang M, Li N, Cao J, Hou M, Ni DA, Chu Z. Integrated bioinformatics analysis of As, Au, Cd, Pb and Cu heavy metal responsive marker genes through Arabidopsis thaliana GEO datasets. PeerJ 2019; 7:e6495. [PMID: 30918749 PMCID: PMC6428040 DOI: 10.7717/peerj.6495] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/26/2018] [Accepted: 01/19/2019] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND Current environmental pollution factors, particularly the distribution and diffusion of heavy metals in soil and water, are a high risk to local environments and humans. Despite striking advances in methods to detect contaminants by a variety of chemical and physical solutions, these methods have inherent limitations such as small dimensions and very low coverage. Therefore, identifying novel contaminant biomarkers is urgently needed. METHODS To better track heavy metal contaminations in soil and water, integrated bioinformatics analysis to identify biomarkers of relevant heavy metals, such as As, Cd, Pb and Cu, is a suitable method for long-term and large-scale surveys of such heavy metal pollutants. Subsequently, the accuracy and stability of the results screened were experimentally validated by quantitative PCR experiment. RESULTS We obtained 168 differentially expressed genes (DEGs) which contained 59 up-regulated genes and 109 down-regulated genes through comparative bioinformatics analyses. Subsequently, the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichments of these DEGs were performed, respectively. GO analyses found that these DEGs were mainly related to responses to chemicals, responses to stimulus, responses to stress, responses to abiotic stimulus, and so on. KEGG pathway analyses of DEGs were mainly involved in the protein degradation process and other biological processes, such as the phenylpropanoid biosynthesis pathways and nitrogen metabolism. Moreover, we also speculated that nine candidate core biomarker genes (namely, NILR1, PGPS1, WRKY33, BCS1, AR781, CYP81D8, NR1, EAP1 and MYB15) might be tightly correlated with the response or transport of heavy metals. Finally, experimental results displayed that these genes had the same expression trend in response to the different stresses mentioned above (Cd, Pb and Cu) and those not mentioned above (Zn and Cr).
CONCLUSION In general, the identified biomarker genes could help us understand the potential molecular mechanisms or signaling pathways responsive to heavy metal stress in plants, and could be applied as marker genes to track heavy metal pollution in soil and water through detecting their expression in plants growing in those environments.
Affiliation(s)
- Chao Niu
- School of Ecological Technology and Engineering, Shanghai Institute of Technology, Shanghai, Shanghai, China
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, Shanghai, China
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, Shanghai, China
| | - Min Jiang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, Shanghai, China
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, Shanghai, China
| | - Na Li
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, Shanghai, China
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, Shanghai, China
- College of Life Sciences, Shanghai Normal University, Shanghai, Shanghai, China
| | - Jianguo Cao
- College of Life Sciences, Shanghai Normal University, Shanghai, Shanghai, China
| | - Meifang Hou
- School of Ecological Technology and Engineering, Shanghai Institute of Technology, Shanghai, Shanghai, China
| | - Di-an Ni
- School of Ecological Technology and Engineering, Shanghai Institute of Technology, Shanghai, Shanghai, China
| | - Zhaoqing Chu
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, Shanghai, China
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, Shanghai, China
|
93
|
Regan-Fendt KE, Xu J, DiVincenzo M, Duggan MC, Shakya R, Na R, Carson WE, Payne PRO, Li F. Synergy from gene expression and network mining (SynGeNet) method predicts synergistic drug combinations for diverse melanoma genomic subtypes. NPJ Syst Biol Appl 2019; 5:6. [PMID: 30820351 PMCID: PMC6391384 DOI: 10.1038/s41540-019-0085-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 03/23/2018] [Accepted: 01/23/2019] [Indexed: 12/31/2022] Open
Abstract
Systems biology perspectives are crucial for understanding the pathophysiology of complex diseases, and therefore hold great promise for the discovery of novel treatment strategies. Drug combinations have been shown to improve durability and reduce resistance to available first-line therapies in a variety of cancers; however, traditional drug discovery approaches are prohibitively cost- and labor-intensive to evaluate large-scale matrices of potential drug combinations. Computational methods are needed to efficiently model complex interactions of drug target pathways and identify mechanisms underlying drug combination synergy. In this study, we employ a computational approach, SynGeNet (Synergy from Gene expression and Network mining), which integrates transcriptomics-based connectivity mapping and network centrality analysis to analyze disease networks and predict drug combinations. As an exemplar of a disease in which combination therapies demonstrate efficacy in genomic-specific contexts, we investigate malignant melanoma. We employed SynGeNet to generate drug combination predictions for each of the four major genomic subtypes of melanoma (BRAF, NRAS, NF1, and triple wild type) using publicly available gene expression and mutation data. We validated synergistic drug combinations predicted by our method across all genomic subtypes using results from a high-throughput drug screening study. Finally, we prospectively validated the drug combination for BRAF-mutant melanoma that was top ranked by our approach, vemurafenib (BRAF inhibitor) + tretinoin (retinoic acid receptor agonist), using both in vitro and in vivo models of BRAF-mutant melanoma and RNA-sequencing analysis of drug-treated melanoma cells to validate the predicted mechanisms. Our approach is applicable to a wide range of disease domains, and, importantly, can model disease-relevant protein subnetworks in precision medicine contexts.
Affiliation(s)
- Kelly E Regan-Fendt
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Jielin Xu
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Mallory DiVincenzo
- Comprehensive Cancer Center, The Ohio State University, Columbus, OH, USA
| | - Megan C Duggan
- Comprehensive Cancer Center, The Ohio State University, Columbus, OH, USA
| | - Reena Shakya
- Target Validation Shared Resource, The Ohio State University, Columbus, OH, USA
| | - Ryejung Na
- Target Validation Shared Resource, The Ohio State University, Columbus, OH, USA
| | - William E Carson
- Comprehensive Cancer Center, The Ohio State University, Columbus, OH, USA
| | - Philip R O Payne
- Institute for Informatics, Washington University in St. Louis, St. Louis, MO, USA
| | - Fuhai Li
- Institute for Informatics, Washington University in St. Louis, St. Louis, MO, USA.
- Department of Pediatrics, Washington University in St. Louis, St. Louis, MO, USA.
|
94
|
Taguchi YH. Drug candidate identification based on gene expression of treated cells using tensor decomposition-based unsupervised feature extraction for large-scale data. BMC Bioinformatics 2019; 19:388. [PMID: 30717646 PMCID: PMC7394334 DOI: 10.1186/s12859-018-2395-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/23/2018] [Accepted: 09/25/2018] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Although in silico drug discovery is necessary for drug development, two major strategies, a structure-based and ligand-based approach, have not been completely successful. Currently, the third approach, inference of drug candidates from gene expression profiles obtained from the cells treated with the compounds under study requires the use of a training dataset. Here, the purpose was to develop a new approach that does not require any pre-existing knowledge about the drug-protein interactions, but these interactions can be inferred by means of an integrated approach using gene expression profiles obtained from the cells treated with the analysed compounds and the existing data describing gene-gene interactions. RESULTS In the present study, using tensor decomposition-based unsupervised feature extraction, which represents an extension of the recently proposed principal-component analysis-based feature extraction, gene sets and compounds with a significant dose-dependent activity were screened without any training datasets. Next, after these results were combined with the data showing perturbations in single-gene expression profiles, genes targeted by the analysed compounds were inferred. The set of target genes thus identified was shown to significantly overlap with known target genes of the compounds under study. CONCLUSIONS The method is specifically designed for large-scale datasets (including hundreds of treatments with compounds), not for conventional small-scale datasets. The obtained results indicate that two compounds that have not been extensively studied, WZ-3105 and CGP-60474, represent promising drug candidates targeting multiple cancers, including melanoma, adenocarcinoma, liver carcinoma, and breast, colon, and prostate cancers, which were analysed in this in silico study.
Affiliation(s)
- Y-H Taguchi
- Department of Physics, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo, 112-8551, Japan.
|
95
|
Network-Based Assessment of Adverse Drug Reaction Risk in Polypharmacy Using High-Throughput Screening Data. Int J Mol Sci 2019; 20:ijms20020386. [PMID: 30658437 PMCID: PMC6358820 DOI: 10.3390/ijms20020386] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/25/2018] [Revised: 01/12/2019] [Accepted: 01/15/2019] [Indexed: 12/19/2022] Open
Abstract
The risk of adverse drug reactions increases in a polypharmacology setting. High-throughput drug screening with transcriptomics applied to human cells has shown that drugs have effects on several molecular pathways, and these affected pathways may be a predictive proxy for adverse drug reactions. Depending on the way that different drugs may contribute to adverse drug reactions, different options may exist in the clinical setting. Here, we formulate a network framework to integrate the relationships between drugs, biological functions, and adverse drug reactions based on the high-throughput drug perturbation data from the Library of Integrated Network-Based Cellular Signatures (LINCS) project. We present network-based parameters that indicate whether a given reaction may be related to the effect of a single drug or to the combination of several drugs, as well as the relative risk of adverse drug reaction manifestation given a certain drug combination.
|
96
|
Dey S, Luo H, Fokoue A, Hu J, Zhang P. Predicting adverse drug reactions through interpretable deep learning framework. BMC Bioinformatics 2018; 19:476. [PMID: 30591036 PMCID: PMC6300887 DOI: 10.1186/s12859-018-2544-0] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Adverse drug reactions (ADRs) are unintended and harmful reactions caused by normal uses of drugs. Predicting and preventing ADRs in the early stage of the drug development pipeline can help to enhance drug safety and reduce financial costs. METHODS In this paper, we developed machine learning models including a deep learning framework which can simultaneously predict ADRs and identify the molecular substructures associated with those ADRs without defining the substructures a priori. RESULTS We evaluated the performance of our model with ten different state-of-the-art fingerprint models and found that neural fingerprints from the deep learning model outperformed all other methods in predicting ADRs. Via feature analysis on drug structures, we identified important molecular substructures that are associated with specific ADRs and assessed their associations via statistical analysis. CONCLUSIONS The deep learning model with feature analysis, substructure identification, and statistical assessment provides a promising solution for identifying risky components within molecular structures and can potentially help to improve drug safety evaluation.
Affiliation(s)
- Sanjoy Dey
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
| | - Heng Luo
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
| | - Achille Fokoue
- Cognitive Computing, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
| | - Jianying Hu
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
| | - Ping Zhang
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
|
97
|
Lysenko A, Sharma A, Boroevich KA, Tsunoda T. An integrative machine learning approach for prediction of toxicity-related drug safety. Life Sci Alliance 2018; 1:e201800098. [PMID: 30515477 PMCID: PMC6262234 DOI: 10.26508/lsa.201800098] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 05/30/2018] [Revised: 11/20/2018] [Accepted: 11/20/2018] [Indexed: 01/28/2023] Open
Abstract
Recent trends in drug development have been marked by diminishing returns caused by the escalating costs and falling rates of new drug approval. Unacceptable drug toxicity is a substantial cause of drug failure during clinical trials and the leading cause of drug withdrawals after release to the market. Computational methods capable of predicting these failures can reduce the waste of resources and time devoted to the investigation of compounds that ultimately fail. We propose an original machine learning method that leverages the identity of drug targets and off-targets, functional impact score computed from Gene Ontology annotations, and biological network data to predict drug toxicity. We demonstrate that our method (TargeTox) can distinguish potentially idiosyncratically toxic drugs from safe drugs and is also suitable for speculative evaluation of different target sets to support the design of optimal low-toxicity combinations.
Affiliation(s)
- Artem Lysenko
  - Laboratory for Medical Science Mathematics, Rikagaku Kenkyūjyo Center for Integrative Medical Sciences, Tsurumi, Japan
- Alok Sharma
  - Laboratory for Medical Science Mathematics, Rikagaku Kenkyūjyo Center for Integrative Medical Sciences, Tsurumi, Japan
  - School of Engineering and Physics, University of the South Pacific, Suva, Fiji
- Keith A Boroevich
  - Laboratory for Medical Science Mathematics, Rikagaku Kenkyūjyo Center for Integrative Medical Sciences, Tsurumi, Japan
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, Rikagaku Kenkyūjyo Center for Integrative Medical Sciences, Tsurumi, Japan
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
  - Core Research for Evolutionary Science and Technology Program, Japan Science and Technology Agency, Tokyo, Japan
|
98
|
Soldatos TG, Taglang G, Jackson DB. In Silico Profiling of Clinical Phenotypes for Human Targets Using Adverse Event Data. High Throughput 2018; 7:ht7040037. [PMID: 30477159 PMCID: PMC6306940 DOI: 10.3390/ht7040037] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/23/2018] [Revised: 11/09/2018] [Accepted: 11/19/2018] [Indexed: 12/19/2022]
Abstract
We present a novel approach for the molecular transformation and analysis of patient clinical phenotypes. Building on the fact that drugs perturb the function of targets/genes, we integrated data from 8.2 million clinical reports detailing drug-induced side effects with the molecular world of drug-target information. Using this dataset, we extracted 1.8 million associations of clinical phenotypes to 770 human drug targets. This collection is perhaps the largest phenotypic profiling reference of human targets to date, and unique in that it enables rapid development of testable molecular hypotheses directly from human-specific information. We also present validation results demonstrating analytical utilities of the approach, including drug safety prediction and the design of novel combination therapies. Challenging the long-standing notion that molecular perturbation studies cannot be performed in humans, our data allow researchers to capitalize on the vast tomes of clinical information available throughout the healthcare system.
Affiliation(s)
- Guillaume Taglang
  - Molecular Health GmbH, Kurfuersten Anlage 21, 69115 Heidelberg, Germany.
- David B Jackson
  - Molecular Health GmbH, Kurfuersten Anlage 21, 69115 Heidelberg, Germany.
|
99
|
Schotland P, Racz R, Jackson D, Levin R, Strauss DG, Burkhart K. Target-Adverse Event Profiles to Augment Pharmacovigilance: A Pilot Study With Six New Molecular Entities. CPT: Pharmacometrics & Systems Pharmacology 2018; 7:809-817. [PMID: 30354029 PMCID: PMC6310867 DOI: 10.1002/psp4.12356] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/11/2018] [Accepted: 09/06/2018] [Indexed: 12/20/2022]
Abstract
Clinical trials can fail to detect rare adverse events (AEs). We assessed the ability of pharmacological target adverse‐event (TAE) profiles to predict AEs on US Food and Drug Administration (FDA) drug labels at least 4 years after approval. TAE profiles were generated by aggregating AEs from the FDA adverse event reporting system (FAERS) reports and the FDA drug labels for drugs that hit a common target. A genetic algorithm (GA) was used to choose the adverse event (AE) case count (N), disproportionality score in FAERS (proportional reporting ratio (PRR)), and percent of comparator drug labels with an AE to maximize F‐measure. With FAERS data alone, precision, recall, and specificity were 0.57, 0.78, and 0.61, respectively. After including FDA drug label data, precision, recall, and specificity improved to 0.67, 0.81, and 0.71, respectively. Eighteen of 23 (78%) postmarket label changes were identified correctly. TAE analysis shows promise as a method to predict AEs at the time of drug approval.
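The two quantities named in this abstract, the proportional reporting ratio (PRR) and the F-measure, can be illustrated with a minimal Python sketch. It assumes the standard 2x2 contingency-table definition of PRR and the balanced F1 form of the F-measure; the abstract does not state which F-measure weighting was used, and it reports no F-measure values, so the numbers computed below are derived only from the reported precision and recall. The function names and the report counts passed to `prr` are hypothetical.

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from a 2x2 table of report counts.

    a: reports with the drug of interest and the adverse event
    b: reports with the drug of interest and any other event
    c: reports with all other drugs and the adverse event
    d: reports with all other drugs and any other event
    """
    return (a / (a + b)) / (c / (c + d))


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not from the study).
print(prr(20, 80, 100, 900))

# F1 implied by the precision and recall reported in the abstract.
print(f"{f_measure(0.57, 0.78):.3f}")  # FAERS data alone -> 0.659
print(f"{f_measure(0.67, 0.81):.3f}")  # FAERS plus drug labels -> 0.733
```

Under the F1 assumption, adding drug label data raises the implied F-measure from about 0.66 to about 0.73, consistent with the improvement in precision and recall that the abstract describes.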
Affiliation(s)
- Peter Schotland
  - Division of Applied Regulatory Science, Office of Clinical Pharmacology, Office of Translational Science, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
- Rebecca Racz
  - Division of Applied Regulatory Science, Office of Clinical Pharmacology, Office of Translational Science, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
- Robert Levin
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
- David G Strauss
  - Division of Applied Regulatory Science, Office of Clinical Pharmacology, Office of Translational Science, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
- Keith Burkhart
  - Division of Applied Regulatory Science, Office of Clinical Pharmacology, Office of Translational Science, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
|
100
|
Lu S, Fan X, Chen L, Lu X. A novel method of using Deep Belief Networks and genetic perturbation data to search for yeast signaling pathways. PLoS One 2018; 13:e0203871. [PMID: 30208101 PMCID: PMC6135403 DOI: 10.1371/journal.pone.0203871] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/24/2018] [Accepted: 08/29/2018] [Indexed: 01/25/2023]
Abstract
Perturbing a signaling system with a series of single gene deletions and then observing the corresponding expression changes in model organisms, such as yeast, is an important and widely used experimental technique for studying signaling pathways. Different computational methods have been developed to analyze the perturbation data from gene deletion experiments for exploring the signaling pathways. The most popular methods include K-means clustering and hierarchical clustering, or combining the expression data with prior knowledge, such as protein-protein interactions (PPIs) or gene ontology (GO), to search for new pathways. However, these methods neither consider nor fully utilize the intrinsic relation between the perturbation of a pathway and expression changes of genes regulated by the pathway, which served as the main motivation for developing a new computational method in this study. In our new model, we first find gene transcriptomic modules such that genes in each module are highly likely to be regulated by a common signal. We then use the expression status of those modules as readouts of pathway perturbations to search for upstream pathways. Systematic evaluation, such as through gene ontology enrichment analysis, has provided evidence that genes in each transcriptomic module are highly likely to be regulated by a common signal. The PPI density analysis and literature search revealed that our new perturbation modules are functionally coherent. For example, the literature search revealed that 9 genes in one of our perturbation modules are related to the cell cycle and all 10 genes in another perturbation module are related to DNA damage, with much evidence from the literature coming from in vitro and/or in vivo verifications.
Hence, utilizing the intrinsic relation between the perturbation of a pathway and the expression changes of genes regulated by the pathway is a useful method of searching for signaling pathways using genetic perturbation data. This model would also be suitable for analyzing drug experiment data, such as the CMap data, for finding drugs that perturb the same pathways.
Affiliation(s)
- Songjian Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Xiaonan Fan
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
  - Department of Automation, Northwestern Polytechnical University, Shanxi, People’s Republic of China
- Lujia Chen
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Xinghua Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
|