1
Lee D, Yoo S. hERGAT: predicting hERG blockers using graph attention mechanism through atom- and molecule-level interaction analyses. J Cheminform 2025; 17:11. [PMID: 39875959 PMCID: PMC11776176 DOI: 10.1186/s13321-025-00957-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Accepted: 01/11/2025] [Indexed: 01/30/2025] Open
Abstract
The human ether-a-go-go-related gene (hERG) channel plays a critical role in the electrical activity of the heart, and its blockers can cause serious cardiotoxic effects. Thus, screening for hERG channel blockers is a crucial step in the drug development process. Many in silico models have been developed to predict hERG blockers, which can efficiently save time and resources. However, previous methods have found it hard to achieve high performance and to interpret the predictive results. To overcome these challenges, we have proposed hERGAT, a graph neural network model with an attention mechanism, to consider compound interactions on atomic and molecular levels. In the atom-level interaction analysis, we applied a graph attention mechanism (GAT) that integrates information from neighboring nodes and their extended connections. The hERGAT employs a gated recurrent unit (GRU) with the GAT to learn information between more distant atoms. To confirm this, we performed clustering analysis and visualized a correlation heatmap, verifying the interactions between distant atoms were considered during the training process. In the molecule-level interaction analysis, the attention mechanism enables the target node to focus on the most relevant information, highlighting the molecular substructures that play crucial roles in predicting hERG blockers. Through a literature review, we confirmed that highlighted substructures have a significant role in determining the chemical and biological characteristics related to hERG activity. Furthermore, we integrated physicochemical properties into our hERGAT model to improve the performance. 
Our model achieved an area under the receiver operating characteristic curve of 0.907 and an area under the precision-recall curve of 0.904, demonstrating its effectiveness in modeling hERG activity and offering a reliable framework for optimizing drug safety in early development stages.
Scientific contribution: hERGAT is a deep learning model for predicting hERG blockers by combining GAT and GRU, enabling it to capture complex interactions at atomic and molecular levels. We improve the model's interpretability by analyzing the highlighted molecular substructures, providing valuable insights into their roles in determining hERG activity. The model achieves high predictive performance, confirming its potential as a preliminary tool for early cardiotoxicity assessment and enhancing the reliability of the results.
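The two metrics reported in this abstract can be computed directly from predicted blocker scores. The block below is a minimal sketch in plain Python, not the authors' evaluation code. The function names `auroc` and `average_precision` and the toy labels and scores are assumptions made for illustration, and average precision is used here as one common estimator of the area under the precision-recall curve (it assumes no tied scores).

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one blocker and one non-blocker")
    # Count blocker/non-blocker pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one blocker")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each true blocker
    return ap / total_pos


# Hypothetical toy data: 1 = hERG blocker, 0 = non-blocker.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auroc(toy_labels, toy_scores), average_precision(toy_labels, toy_scores))
```

The pairwise form of AUROC is quadratic in the number of compounds, which is fine for a sketch but would normally be replaced by a rank-based implementation on large test sets.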
Affiliation(s)
- Dohyeon Lee
  - Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju, Republic of Korea
- Sunyong Yoo
  - Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju, Republic of Korea
2
Kim H, Park M, Lee I, Nam H. BayeshERG: a robust, reliable and interpretable deep learning model for predicting hERG channel blockers. Brief Bioinform 2022; 23:6609519. [PMID: 35709752 DOI: 10.1093/bib/bbac211] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/19/2022] [Revised: 04/19/2022] [Accepted: 05/06/2022] [Indexed: 11/13/2022] Open
Abstract
Unintended inhibition of the human ether-à-go-go-related gene (hERG) ion channel by small molecules leads to severe cardiotoxicity. Thus, hERG channel blockage is a significant concern in the development of new drugs. Several computational models have been developed to predict hERG channel blockage, including deep learning models; however, they lack robustness, reliability and interpretability. Here, we developed a graph-based Bayesian deep learning model for hERG channel blocker prediction, named BayeshERG, which has robust predictive power, high reliability and high resolution of interpretability. First, we applied transfer learning with a large set of 300 000 data points in initial pre-training to increase the predictive performance. Second, we implemented a Bayesian neural network with Monte Carlo dropout to calibrate the uncertainty of the prediction. Third, we utilized global multihead attentive pooling to augment the high resolution of structural interpretability for the hERG channel blockers and nonblockers. We conducted both internal and external validations for stringent evaluation; in particular, we benchmarked most of the publicly available hERG channel blocker prediction models. We showed that our proposed model outperformed them in both predictive performance and uncertainty calibration. Furthermore, we found that our model learned to focus on the essential substructures of hERG channel blockers via an attention mechanism. Finally, we validated the prediction results of our model by conducting in vitro experiments and confirmed its high validity. In summary, BayeshERG could serve as a versatile tool for discovering hERG channel blockers and helping maximize the possibility of successful drug discovery. The data and source code are available at our GitHub repository (https://github.com/GIST-CSBL/BayeshERG).
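The Monte Carlo dropout idea mentioned in this abstract can be illustrated on a toy model. The sketch below is not BayeshERG and uses no graph network: it assumes a single logistic unit with hypothetical weights, keeps dropout active at prediction time, and summarizes many stochastic forward passes by their mean (the prediction) and standard deviation (a rough uncertainty estimate). The name `mc_dropout_predict` and all parameter values are illustrative assumptions.

```python
import math
import random
import statistics


def mc_dropout_predict(features, weights, bias, drop_rate=0.2, passes=200, seed=0):
    """Mean and standard deviation of predictions over stochastic dropout passes."""
    if passes < 2:
        raise ValueError("need at least two passes to estimate a spread")
    rng = random.Random(seed)
    keep = 1.0 - drop_rate
    probs = []
    for _ in range(passes):
        z = bias
        for x, w in zip(features, weights):
            # A fresh dropout mask is sampled on every pass.
            if rng.random() < keep:
                z += (x / keep) * w  # inverted-dropout rescaling
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return statistics.fmean(probs), statistics.stdev(probs)


# Hypothetical descriptor values and weights for one compound.
mean_prob, spread = mc_dropout_predict([1.0, 2.0, 0.5], [0.8, -0.3, 1.2], bias=-0.4)
print(f"blocker probability ~ {mean_prob:.3f}, spread ~ {spread:.3f}")
```

A larger spread across passes would flag a prediction as less certain; how such a spread is calibrated against observed error is specific to each published model and is not shown here.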
Affiliation(s)
- Hyunho Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
- Minsu Park
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
- Ingoo Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
- Hojung Nam
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
3
Kim H, Nam H. hERG-Att: Self-attention-based deep neural network for predicting hERG blockers. Comput Biol Chem 2020; 87:107286. [PMID: 32531518 DOI: 10.1016/j.compbiolchem.2020.107286] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/13/2020] [Accepted: 05/09/2020] [Indexed: 02/05/2023]
Abstract
A voltage-gated potassium channel encoded by the human ether-à-go-go-related gene (hERG) regulates cardiac action potential, and it is involved in cardiotoxicity with compounds that inhibit its activity. Therefore, the screening of hERG channel blockers is a mandatory step in the drug discovery process. The screening of hERG blockers by using conventional methods is inefficient in terms of cost and effort. This has led to the development of many in silico hERG blocker prediction models. However, constructing a high-performance predictive model with interpretability on hERG blockage by certain compounds is a major obstacle. In this study, we developed the first attention-based, interpretable model that predicts hERG blockers and captures important hERG-related compound substructures. To do that, we first collected various datasets, ranging from public databases to publicly available private datasets, to train and test the model. Then, we developed a precise and interpretable hERG blocker prediction model by using deep learning with a self-attention approach that has an appropriate molecular descriptor, Morgan fingerprint. The proposed prediction model was validated, and the validation result showed that the model was well-optimized and had high performance. The test set performance of the proposed model was significantly higher than that of previous fingerprint-based conventional machine learning models. In particular, the proposed model generally had high accuracy and F1 score, thereby demonstrating the model's predictive reliability. Furthermore, we interpreted the calculated attention score vectors obtained from the proposed prediction model and demonstrated the important structural patterns that are represented in hERG blockers. In summary, we have proposed a powerful and interpretable hERG blocker prediction model that can reduce the overall cost of drug discovery by accurately screening for hERG blockers and suggesting hERG-related substructures.
Affiliation(s)
- Hyunho Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju 61005, Republic of Korea
- Hojung Nam
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju 61005, Republic of Korea
4
Lin SR, Chang CH, Tsai MJ, Cheng H, Chen JC, Leong MK, Weng CF. The perceptions of natural compounds against dipeptidyl peptidase 4 in diabetes: from in silico to in vivo. Ther Adv Chronic Dis 2019; 10:2040622319875305. [PMID: 31555430 PMCID: PMC6753520 DOI: 10.1177/2040622319875305] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/15/2019] [Accepted: 08/12/2019] [Indexed: 12/13/2022] Open
Abstract
Dipeptidyl peptidase IV (DPP-4), an incretin glucagon-like peptide-1 (GLP-1) degrading enzyme, exists in two forms and exerts various physiological functions, particularly in controlling blood glucose through the action of GLP-1. In diabetes treatment, a DPP-4 inhibitor blocks DPP-4 to attenuate GLP-1 degradation, prolong GLP-1 action, and sensitize insulin activity for the purpose of lowering blood glucose. Nonetheless, the adverse effects of DPP-4 inhibitors severely hinder their clinical applications, and notably there is a clinical demand for novel DPP-4 inhibitors with fewer side effects from various sources, including chemical synthesis, herbs, and plants. In this review, we highlight various strategies, namely computational biology (in silico), in vitro enzymatic and cell assays, and in vivo animal tests, for seeking natural DPP-4 inhibitors from botanic sources including herbs and plants. The pros and cons of all approaches for identifying new inhibitor candidates or hits are discussed.
Affiliation(s)
- Shian-Ren Lin
  - Department of Life Science and Institute of Biotechnology, National Dong Hwa University, Hualien
- Chia-Hsiang Chang
  - Department of Life Science and Institute of Biotechnology, National Dong Hwa University, Hualien
- May-Jwan Tsai
  - Neural Regeneration Laboratory, Neurological Institute, Taipei Veterans General Hospital, Beitou, Taipei
- Henrich Cheng
  - Neural Regeneration Laboratory, Neurological Institute, Taipei Veterans General Hospital, Beitou, Taipei
- Jian-Chyi Chen
  - Department of Biotechnology, Southern Taiwan University of Science and Technology, Yungkang, Tainan
- Max K. Leong
  - Department of Chemistry, National Dong Hwa University, No.1, Sec.2, Da-Hsueh Road, Shoufeng, Hualien, 97401, Taiwan
- Ching-Feng Weng
  - Department of Basic Medical Science, Center for Transitional Medicine, Xiamen Medical College, Xiamen, 361023, China
5
Kumar SP. Receptor pharmacophore ensemble (REPHARMBLE): a probabilistic pharmacophore modeling approach using multiple protein-ligand complexes. J Mol Model 2018; 24:282. [PMID: 30220049 DOI: 10.1007/s00894-018-3820-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 04/30/2018] [Accepted: 09/03/2018] [Indexed: 10/28/2022]
Abstract
Ensemble methods are gaining more importance in structure-based approaches as single protein-ligand complexes strongly influence the outcomes of virtual screening. Structure-based pharmacophore modeling based on a single protein-ligand complex with complex feature combinations is often limited to certain chemical classes. The REPHARMBLE (receptor pharmacophore ensemble) approach presented here examines the ability of an ensemble of selected protein-ligand complexes to populate pharmacophore space in the ligand binding site, rigorously assesses the importance of pharmacophore features using Poisson statistic and information theory-based entropy calculations, and generates pharmacophore models with high probabilities. In addition, an ensemble scoring function that combines all the resultant high-scoring pharmacophore models to score molecules is derived. The REPHARMBLE approach was evaluated on ten DUD-E benchmark datasets and afforded good screening performance, as measured by receiver operating characteristic, enrichment factor and Güner-Henry score. Although one of the high-scoring models achieved superior statistical results in each dataset, the ensemble scoring function balanced the shortcomings of each model and passed with close performance measures. This approach offers a reliable way of choosing the best-scoring features to build four-feature pharmacophore queries and customize a target-biased 'pharmacophore ensemble' scoring function for subsequent virtual screening.
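One of the screening metrics named in this abstract, the enrichment factor, compares the rate of actives among the top-ranked molecules with the rate in the whole library. The following is a minimal sketch under that standard definition, not the REPHARMBLE scoring code. The function name `enrichment_factor`, the choice to round the cutoff up to a whole molecule, and the toy library are assumptions for illustration.

```python
import math


def enrichment_factor(labels, scores, fraction=0.01):
    """EF at a given top fraction: active rate in the top slice / overall active rate."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(labels)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, math.ceil(n * fraction))  # size of the top-ranked slice
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    top_actives = sum(y for _, y in ranked[:n_top])
    return (top_actives / n_top) / (total_actives / n)


# Hypothetical library of ten molecules, two actives ranked first by the model.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(enrichment_factor(toy_labels, toy_scores, fraction=0.2))
```

An enrichment factor of 1 corresponds to random ranking, and the maximum attainable value depends on the fraction chosen and the proportion of actives in the library.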
6
Hou TY, Weng CF, Leong MK. Insight Analysis of Promiscuous Estrogen Receptor α-Ligand Binding by a Novel Machine Learning Scheme. Chem Res Toxicol 2018; 31:799-813. [PMID: 30019586 DOI: 10.1021/acs.chemrestox.8b00130] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023]
Abstract
Estrogen receptor α (ERα) plays a significant role in the occurrence of breast cancer and may cause various adverse side effects when ERα is an off-target protein. A theoretical model was derived to predict the binding affinity of ERα using the pharmacophore ensemble/support vector machine (PhE/SVM) scheme to consider the promiscuous characteristic of ERα. The estimations by PhE/SVM were found to be in good agreement with the observed values for the training molecules (n = 31, r2 = 0.80, qCV2 = 0.77, RMSE = 0.57, s = 0.58), test molecules (n = 179, q2 = 0.91-0.96, RMSE = 0.33, s = 0.26) and outliers (n = 15, q2 = 0.80-0.86, RMSE = 0.56, s = 0.49). When subjected to various statistical validations, the PhE/SVM model consistently fulfilled the strictest criteria. A mock test also asserted its predictivity. When compared with crystal structures, the calculated results are consistent with the reported ERα-ligand co-complex structure, and the plasticity of ERα is also disclosed. Consequently, this precise, fast, and robust model can be adopted to predict ERα-ligand binding affinities and to design safer non-ERα-targeted pharmaceuticals in the process of drug discovery and development.
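Fit statistics such as r2 and RMSE compare predicted with observed binding affinities. The sketch below shows the textbook definitions only. The abstract does not state how its r2, q2, qCV2 or s values were calculated, so treating r2 as the coefficient of determination is an assumption, as are the function names and the toy affinity values.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted affinities (e.g. on a log scale).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

Applied to held-out molecules rather than training molecules, the same 1 - SS_res / SS_tot form is often reported as q2, although conventions for the reference mean differ between authors.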
7
Siramshetty VB, Chen Q, Devarakonda P, Preissner R. The Catch-22 of Predicting hERG Blockade Using Publicly Accessible Bioactivity Data. J Chem Inf Model 2018; 58:1224-1233. [PMID: 29772901 DOI: 10.1021/acs.jcim.8b00150] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 12/30/2022]
Abstract
Drug-induced inhibition of the human ether-à-go-go-related gene (hERG)-encoded potassium ion channels can lead to fatal cardiotoxicity. Several marketed drugs and promising drug candidates were recalled because of this concern. Diverse modeling methods ranging from molecular similarity assessment to quantitative structure-activity relationship analysis employing machine learning techniques have been applied to data sets of varying size and composition (number of blockers and nonblockers). In this study, we highlight the challenges involved in the development of a robust classifier for predicting the hERG end point using bioactivity data extracted from the public domain. To this end, three different modeling methods, nearest neighbors, random forests, and support vector machines, were employed to develop predictive models using different molecular descriptors, activity thresholds, and training set compositions. Our models demonstrated superior performance in external validations in comparison with those reported in the previous studies from which the data sets were extracted. The choice of descriptors had little influence on the model performance, with minor exceptions. The criteria used to filter bioactivity data, the activity threshold settings used to separate blockers from nonblockers, and the structural diversity of blockers in training data set were found to be the crucial indicators of model performance. Training sets based on a binary threshold of 1 μM/10 μM to separate blockers (IC50/Ki ≤ 1 μM) from nonblockers (IC50/Ki > 10 μM) provided superior performance in comparison with those defined using a single threshold (1 μM or 10 μM). A major limitation in using the public domain hERG activity data is the abundance of blockers in comparison with nonblockers at usual activity thresholds, since not many studies report the latter.
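The dual-threshold labeling rule described in this abstract (blockers at IC50/Ki ≤ 1 μM, nonblockers at IC50/Ki > 10 μM) can be written as a small filter. This is an illustrative sketch, not the authors' data pipeline: it assumes that compounds falling between the two thresholds are simply left out of the training set, and the function names, record layout and compound identifiers are hypothetical.

```python
def label_by_dual_threshold(activity_um, blocker_max=1.0, nonblocker_min=10.0):
    """Return 1 (blocker), 0 (nonblocker) or None (between thresholds)."""
    if activity_um <= blocker_max:
        return 1
    if activity_um > nonblocker_min:
        return 0
    return None  # ambiguous range, assumed to be excluded from training


def build_training_set(records, blocker_max=1.0, nonblocker_min=10.0):
    """Keep only compounds that receive an unambiguous label."""
    labeled = []
    for compound_id, activity_um in records:
        label = label_by_dual_threshold(activity_um, blocker_max, nonblocker_min)
        if label is not None:
            labeled.append((compound_id, label))
    return labeled


# Hypothetical (compound id, IC50 or Ki in micromolar) records.
records = [("cmpd_a", 0.2), ("cmpd_b", 3.0), ("cmpd_c", 50.0)]
print(build_training_set(records))
```

Setting both arguments to the same value reproduces a single-threshold split, which is the alternative the abstract compares against.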
Affiliation(s)
- Vishal B Siramshetty
  - Structural Bioinformatics Group, Charité - University Medicine Berlin, 10115 Berlin, Germany
  - BB3R - Berlin Brandenburg 3R Graduate School, Freie Universität Berlin, 14195 Berlin, Germany
- Qiaofeng Chen
  - Structural Bioinformatics Group, Charité - University Medicine Berlin, 10115 Berlin, Germany
  - China Scholarship Council (CSC), Beijing 100044, China
- Prashanth Devarakonda
  - Structural Bioinformatics Group, Charité - University Medicine Berlin, 10115 Berlin, Germany
- Robert Preissner
  - Structural Bioinformatics Group, Charité - University Medicine Berlin, 10115 Berlin, Germany
  - BB3R - Berlin Brandenburg 3R Graduate School, Freie Universität Berlin, 14195 Berlin, Germany
8
Zhang MM, Wang Y, Liu JQ, Wang XS. An efficient green synthesis of 5H-spiro[benzo[4,5]imidazo[1,2-c]quinazoline-6,3′-indolin]-2′-ones catalyzed by iodine in ionic liquids. Heterocycl Commun 2017. [DOI: 10.1515/hc-2017-0046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022] Open
Abstract
Benzene-1,2-diamine was treated with 2-nitrobenzaldehyde in EtOH, and the product was reduced with hydrazine hydrate in the presence of Fe(C) without separation to give 2-(1
9
Ekins S. The Next Era: Deep Learning in Pharmaceutical Research. Pharm Res 2016; 33:2594-603. [PMID: 27599991 DOI: 10.1007/s11095-016-2029-7] [Citation(s) in RCA: 127] [Impact Index Per Article: 14.1] [Received: 07/10/2016] [Accepted: 08/23/2016] [Indexed: 01/22/2023]
Abstract
Over the past decade we have witnessed the increasing sophistication of machine learning algorithms applied in daily use from internet searches, voice recognition, social network software to machine vision software in cameras, phones, robots and self-driving cars. Pharmaceutical research has also seen its fair share of machine learning developments. For example, applying such methods to mine the growing datasets that are created in drug discovery not only enables us to learn from the past but to predict a molecule's properties and behavior in future. The latest machine learning algorithm garnering significant attention is deep learning, which is an artificial neural network with multiple hidden layers. Publications over the last 3 years suggest that this algorithm may have advantages over previous machine learning methods and offer a slight but discernable edge in predictive performance. The time has come for a balanced review of this technique but also to apply machine learning methods such as deep learning across a wider array of endpoints relevant to pharmaceutical research for which the datasets are growing such as physicochemical property prediction, formulation prediction, absorption, distribution, metabolism, excretion and toxicity (ADME/Tox), target prediction and skin permeation, etc. We also show that there are many potential applications of deep learning beyond cheminformatics. It will be important to perform prospective testing (which has been carried out rarely to date) in order to convince skeptics that there will be benefits from investing in this technique.
Affiliation(s)
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina, 27526, USA
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California, 94010, USA
10
Wang S, Sun H, Liu H, Li D, Li Y, Hou T. ADMET Evaluation in Drug Discovery. 16. Predicting hERG Blockers by Combining Multiple Pharmacophores and Machine Learning Approaches. Mol Pharm 2016; 13:2855-66. [PMID: 27379394 DOI: 10.1021/acs.molpharmaceut.6b00471] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Indexed: 11/29/2022]
Abstract
Blockade of human ether-à-go-go related gene (hERG) channel by compounds may lead to drug-induced QT prolongation, arrhythmia, and Torsades de Pointes (TdP), and therefore reliable prediction of hERG liability in the early stages of drug design is quite important to reduce the risk of cardiotoxicity-related attritions in the later development stages. In this study, pharmacophore modeling and machine learning approaches were combined to construct classification models to distinguish hERG active from inactive compounds based on a diverse data set. First, an optimal ensemble of pharmacophore hypotheses that had good capability to differentiate hERG active from inactive compounds was identified by the recursive partitioning (RP) approach. Then, the naive Bayesian classification (NBC) and support vector machine (SVM) approaches were employed to construct classification models by integrating multiple important pharmacophore hypotheses. The integrated classification models showed improved predictive capability over any single pharmacophore hypothesis, suggesting that the broad binding polyspecificity of hERG can only be well characterized by multiple pharmacophores. The best SVM model achieved the prediction accuracies of 84.7% for the training set and 82.1% for the external test set. Notably, the accuracies for the hERG blockers and nonblockers in the test set reached 83.6% and 78.2%, respectively. Analysis of significant pharmacophores helps to understand the multimechanisms of action of hERG blockers. We believe that the combination of pharmacophore modeling and SVM is a powerful strategy to develop reliable theoretical models for the prediction of potential hERG liability.
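The abstract reports an overall accuracy alongside separate accuracies for blockers and nonblockers. A minimal sketch of how such per-class figures are typically derived from a confusion matrix follows. The function name `class_accuracies` and the toy predictions are assumptions, and the paper's own evaluation procedure is not reproduced here.

```python
def class_accuracies(true_labels, predicted_labels):
    """Overall, blocker-class and nonblocker-class accuracy (1 = blocker)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in true_labels")
    return {
        "overall": (tp + tn) / len(pairs),
        "blockers": tp / (tp + fn),     # also called sensitivity
        "nonblockers": tn / (tn + fp),  # also called specificity
    }


# Hypothetical external test set of five compounds.
print(class_accuracies([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

Reporting the two class-wise values alongside the overall figure matters when the classes are imbalanced, because overall accuracy alone can hide poor performance on the smaller class.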
Affiliation(s)
- Shuangquan Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Huiyong Sun
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Hui Liu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Dan Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Youyong Li
  - Institute of Functional Nano & Soft Materials (FUNSOM), Soochow University, Suzhou, Jiangsu 215123, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
11
Computational investigations of hERG channel blockers: New insights and current predictive models. Adv Drug Deliv Rev 2015; 86:72-82. [PMID: 25770776 DOI: 10.1016/j.addr.2015.03.003] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Received: 09/15/2014] [Revised: 01/13/2015] [Accepted: 03/04/2015] [Indexed: 01/08/2023]
Abstract
Identification of potential human Ether-a-go-go Related-Gene (hERG) potassium channel blockers is an essential part of the drug development and drug safety process in pharmaceutical industries or academic drug discovery centers, as they may lead to drug-induced QT prolongation, arrhythmia and Torsade de Pointes. Recent reports also suggest starting to address such issues at the hit selection stage. In order to prioritize molecules during the early drug discovery phase and to reduce the risk of drug attrition due to cardiotoxicity during pre-clinical and clinical stages, computational approaches have been developed to predict the potential hERG blockage of new drug candidates. In this review, we will describe the current in silico methods developed and applied to predict and to understand the mechanism of actions of hERG blockers, including ligand-based and structure-based approaches. We then discuss ongoing research on other ion channels and hERG polymorphism susceptible to be involved in LQTS and how systemic approaches can help in the drug safety decision.
12
Ai N, Fan X, Ekins S. In silico methods for predicting drug-drug interactions with cytochrome P-450s, transporters and beyond. Adv Drug Deliv Rev 2015; 86:46-60. [PMID: 25796619 DOI: 10.1016/j.addr.2015.03.006] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 09/19/2014] [Revised: 01/05/2015] [Accepted: 03/11/2015] [Indexed: 12/13/2022]
Abstract
Drug-drug interactions (DDIs) are associated with severe adverse effects that may lead to the patient requiring alternative therapeutics and could ultimately lead to drug withdrawal from the market if they are severe. To prevent the occurrence of DDI in the clinic, experimental systems to evaluate drug interaction have been integrated into the various stages of the drug discovery and development process. A large body of knowledge about DDI has also accumulated through these studies and pharmacovigilance systems. Much of this work to date has focused on the drug metabolizing enzymes such as cytochrome P-450s as well as drug transporters, ion channels and occasionally other proteins. This combined knowledge provides a foundation for a hypothesis-driven in silico approach, using either cheminformatics or physiologically based pharmacokinetics (PK) modeling methods to assess DDI potential. Here we review recent advances in these approaches with emphasis on hypothesis-driven mechanistic models for important protein targets involved in PK-based DDI. Recent efforts with other informatics approaches to detect DDI are highlighted. Besides DDI, we also briefly introduce drug interactions with other substances, such as Traditional Chinese Medicines, to illustrate how in silico modeling can be useful in this domain. We also summarize valuable data sources and web-based tools that are available for DDI prediction. We finally explore the challenges we see faced by in silico approaches for predicting DDI and propose future directions to make these computational models more reliable, accurate, and publicly accessible.
Affiliation(s)
- Ni Ai
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, PR China
- Xiaohui Fan
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, PR China
- Sean Ekins
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA
13
Balfer J, Bajorath J. Systematic artifacts in support vector regression-based compound potency prediction revealed by statistical and activity landscape analysis. PLoS One 2015; 10:e0119301. [PMID: 25742011 PMCID: PMC4350943 DOI: 10.1371/journal.pone.0119301] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 12/18/2014] [Accepted: 01/29/2015] [Indexed: 02/01/2023] Open
Abstract
Support vector machines are a popular machine learning method for many classification tasks in biology and chemistry. In addition, the support vector regression (SVR) variant is widely used for numerical property predictions. In chemoinformatics and pharmaceutical research, SVR has probably become the most popular approach for modeling non-linear structure-activity relationships (SARs) and predicting compound potency values. Herein, we have systematically generated and analyzed SVR prediction models for a variety of compound data sets with different SAR characteristics. Although these SVR models were accurate on the basis of global prediction statistics and not prone to overfitting, they were found to consistently mispredict highly potent compounds. Hence, in regions of local SAR discontinuity, SVR prediction models displayed clear limitations. Compared to observed activity landscapes of compound data sets, landscapes generated on the basis of SVR potency predictions were partly flattened and activity cliff information was lost. Taken together, these findings have implications for practical SVR applications. In particular, prospective SVR-based potency predictions should be considered with caution because artificially low predictions are very likely for highly potent candidate compounds, the most important prediction targets.
Affiliation(s)
- Jenny Balfer
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113, Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113, Bonn, Germany
14
Kratz JM, Schuster D, Edtbauer M, Saxena P, Mair CE, Kirchebner J, Matuszczak B, Baburin I, Hering S, Rollinger JM. Experimentally validated HERG pharmacophore models as cardiotoxicity prediction tools. J Chem Inf Model 2014; 54:2887-901. [PMID: 25148533 DOI: 10.1021/ci5001955] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 01/04/2023]
Abstract
The goal of this study was to design, experimentally validate, and apply a virtual screening workflow to identify novel hERG channel blockers. The hERG channel is an important antitarget in drug development since cardiotoxic risks remain as a major cause of attrition. A ligand-based pharmacophore model collection was developed and theoretically validated. The seven most complementary and suitable models were used for virtual screening of in-house and commercially available compound libraries. From the hit lists, 50 compounds were selected for experimental validation through bioactivity assessment using patch clamp techniques. Twenty compounds inhibited hERG channels expressed in HEK 293 cells with IC50 values ranging from 0.13 to 2.77 μM, attesting to the suitability of the models as cardiotoxicity prediction tools in a preclinical stage.
Affiliation(s)
- Jadel M Kratz
- Departamento de Ciências Farmacêuticas, Universidade Federal de Santa Catarina, 88.040-900 Florianópolis, Santa Catarina, Brazil
|
15
|
de la Vega de León A, Bajorath J. Prediction of Compound Potency Changes in Matched Molecular Pairs Using Support Vector Regression. J Chem Inf Model 2014; 54:2654-63. [DOI: 10.1021/ci5003944] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/21/2023]
Affiliation(s)
- Antonio de la Vega de León
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
|
16
|
Ding YL, Shih YH, Tsai FY, Leong MK. In silico prediction of inhibition of promiscuous breast cancer resistance protein (BCRP/ABCG2). PLoS One 2014; 9:e90689. [PMID: 24614353 PMCID: PMC3948701 DOI: 10.1371/journal.pone.0090689] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 09/08/2013] [Accepted: 02/03/2014] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Breast cancer resistance protein (BCRP) has an essential role in active transport of endogenous substances and xenobiotics across extracellular and intracellular membranes along with P-glycoprotein (P-gp). It also plays a major role in multiple drug resistance and permeation of the blood-brain barrier. Therefore, it is of great importance to derive theoretical models to predict the inhibition of both transporters in the process of drug discovery and development. Hitherto, very few BCRP inhibition predictive models have been proposed as compared with its P-gp counterpart. METHODOLOGY/PRINCIPAL FINDINGS: An in silico BCRP inhibition model was developed in this study using the pharmacophore ensemble/support vector machine (PhE/SVM) scheme to take into account the promiscuous nature of BCRP. The predictions by the PhE/SVM model were found to be in good agreement with the observed values for those molecules in the training set (n = 22, r² = 0.82, q²(CV) = 0.73, RMSE = 0.40, s = 0.24), test set (n = 97, q² = 0.75-0.89, RMSE = 0.31, s = 0.21), and outlier set (n = 16, q² = 0.72-0.91, RMSE = 0.29, s = 0.17). When subjected to a variety of statistical validations, the developed PhE/SVM model consistently met the most stringent criteria. A mock test by HIV protease inhibitors also asserted its predictivity. CONCLUSIONS/SIGNIFICANCE: It was found that this accurate, fast, and robust PhE/SVM model can be employed to predict the BCRP inhibition of structurally diverse molecules that otherwise cannot be carried out by any other methods in a high-throughput fashion to design therapeutic agents with insignificant drug toxicity and unfavorable drug-drug interactions mediated by BCRP to enhance clinical efficacy and/or circumvent drug resistance.
Affiliation(s)
- Yi-Lung Ding
- Department of Chemistry, National Dong Hwa University, Shoufeng, Hualien, Taiwan
- Yu-Hsuan Shih
- Department of Chemistry, National Dong Hwa University, Shoufeng, Hualien, Taiwan
- Fu-Yuan Tsai
- Center for General Education, Chang Gung University, Taoyuan, Taiwan
- Max K Leong
- Department of Chemistry, National Dong Hwa University, Shoufeng, Hualien, Taiwan; Department of Life Science and Institute of Biotechnology, National Dong Hwa University, Shoufeng, Hualien, Taiwan; Department of Medical Research and Teaching, Mennonite Christian Hospital, Hualien, Taiwan
|
17
|
|
18
|
Affiliation(s)
- Paul Czodrowski
- Merck KGaA, Small Molecule Platform, Global Computational Chemistry, Frankfurter Strasse 250, 64293 Darmstadt, Germany
|
19
|
Abstract
Frequent failure of drug candidates during development remains the major deterrent to the early introduction of new drug molecules. Drug toxicity is the major cause of expensive late-stage development failures. Early identification and optimization of the most favorable molecule will naturally save considerable cost, time, and human effort and minimize animal sacrifice. (Quantitative) Structure Activity Relationships [(Q)SARs] represent statistically derived predictive models correlating biological activity (including desirable therapeutic effect and undesirable side effects) of chemicals (drugs/toxicants/environmental pollutants) with molecular descriptors and/or properties. (Q)SAR models which categorize the available data into two or more groups/classes are known as classification models. Numerous techniques of diverse nature are presently being employed for development of classification models. Though there is an increasing use of classification models for prediction of either biological activity or toxicity, the future trend will naturally be towards the development of classification models capable of simultaneous prediction of biological activity, toxicity, and pharmacokinetic parameters so as to accelerate development of bioavailable safe drug molecules.
|
20
|
QSPR studies for predicting gas to acetone and gas to acetonitrile solvation enthalpies using support vector machine. J Mol Liq 2012. [DOI: 10.1016/j.molliq.2012.08.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
|
21
|
Abstract
Structure-based drug design has become an essential tool for rapid lead discovery and optimization. As available structural information has increased, researchers have become increasingly aware of the importance of protein flexibility for accurate description of the native state. Typical protein-ligand docking efforts still rely on a single rigid receptor, which is an incomplete representation of potential binding conformations of the protein. These rigid docking efforts typically show the best performance rates between 50 and 75%, while fully flexible docking methods can enhance pose prediction up to 80-95%. This review examines the current toolbox for flexible protein-ligand docking and receptor surface mapping. Present limitations and possibilities for future development are discussed.
Affiliation(s)
- Katrina W. Lexa
- Department of Medicinal Chemistry, University of Michigan, 428 Church Street, Ann Arbor, MI 48109-1065, USA
- Heather A. Carlson
- Department of Medicinal Chemistry, University of Michigan, 428 Church Street, Ann Arbor, MI 48109-1065, USA
|
22
|
Su BH, Tu YS, Esposito EX, Tseng YJ. Predictive Toxicology Modeling: Protocols for Exploring hERG Classification and Tetrahymena pyriformis End Point Predictions. J Chem Inf Model 2012; 52:1660-73. [DOI: 10.1021/ci300060b] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 01/24/2023]
Affiliation(s)
- Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
- Yi-shu Tu
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
- Yufeng J. Tseng
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
|
23
|
Leong MK, Chen HB, Shih YH. Prediction of promiscuous p-glycoprotein inhibition using a novel machine learning scheme. PLoS One 2012; 7:e33829. [PMID: 22439003 PMCID: PMC3306300 DOI: 10.1371/journal.pone.0033829] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 05/19/2011] [Accepted: 02/20/2012] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND: P-glycoprotein (P-gp) is an ATP-dependent membrane transporter that plays a pivotal role in eliminating xenobiotics by active extrusion of xenobiotics from the cell. Multidrug resistance (MDR) is highly associated with the over-expression of P-gp by cells, resulting in increased efflux of chemotherapeutical agents and reduction of intracellular drug accumulation. It is of clinical importance to develop a P-gp inhibition predictive model in the process of drug discovery and development. METHODOLOGY/PRINCIPAL FINDINGS: An in silico model was derived to predict the inhibition of P-gp using the newly invented pharmacophore ensemble/support vector machine (PhE/SVM) scheme based on the data compiled from the literature. The predictions by the PhE/SVM model were found to be in good agreement with the observed values for those structurally diverse molecules in the training set (n = 31, r² = 0.89, q² = 0.86, RMSE = 0.40, s = 0.28), the test set (n = 88, r² = 0.87, RMSE = 0.39, s = 0.25) and the outlier set (n = 11, r² = 0.96, RMSE = 0.10, s = 0.05). The generated PhE/SVM model also showed high accuracy when subjected to those validation criteria generally adopted to gauge the predictivity of a theoretical model. CONCLUSIONS/SIGNIFICANCE: This accurate, fast and robust PhE/SVM model that can take into account the promiscuous nature of P-gp can be applied to predict the P-gp inhibition of structurally diverse compounds that otherwise cannot be done by any other methods in a high-throughput fashion to facilitate drug discovery and development by designing drug candidates with a better metabolism profile.
Affiliation(s)
- Max K Leong
- Department of Chemistry, National Dong Hwa University, Shoufeng, Hualien, Taiwan.
|
24
|
Predicting Activation of the Promiscuous Human Pregnane X Receptor by Pharmacophore Ensemble/Support Vector Machine Approach. Chem Res Toxicol 2011; 24:1765-78. [DOI: 10.1021/tx200310j] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
|
25
|
Shen MY, Su BH, Esposito EX, Hopfinger AJ, Tseng YJ. A Comprehensive Support Vector Machine Binary hERG Classification Model Based on Extensive but Biased End Point hERG Data Sets. Chem Res Toxicol 2011; 24:934-49. [DOI: 10.1021/tx200099j] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Indexed: 02/04/2023]
|
26
|
Tan Y, Chen Y, You Q, Sun H, Li M. Predicting the potency of hERG K+ channel inhibition by combining 3D-QSAR pharmacophore and 2D-QSAR models. J Mol Model 2011; 18:1023-36. [DOI: 10.1007/s00894-011-1136-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 02/05/2011] [Accepted: 05/23/2011] [Indexed: 02/06/2023]
|
27
|
Kim JH, Chae CH, Kang SM, Lee JY, Lee GN, Hwang SH, Kang NS. The Predictive QSAR Model for hERG Inhibitors Using Bayesian and Random Forest Classification Method. Bull Korean Chem Soc 2011. [DOI: 10.5012/bkcs.2011.32.4.1237] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
|
28
|
Klon AE. Machine learning algorithms for the prediction of hERG and CYP450 binding in drug development. Expert Opin Drug Metab Toxicol 2011; 6:821-33. [PMID: 20465523 DOI: 10.1517/17425255.2010.489550] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 01/29/2023]
Abstract
IMPORTANCE OF THE FIELD: The cost of developing new drugs is estimated at approximately $1 billion; the withdrawal of a marketed compound due to toxicity can result in serious financial loss for a pharmaceutical company. There has been a greater interest in the development of in silico tools that can identify compounds with metabolic liabilities before they are brought to market. AREAS COVERED IN THIS REVIEW: The two largest classes of machine learning (ML) models, which will be discussed in this review, have been developed to predict binding to the human ether-a-go-go related gene (hERG) ion channel protein and the various CYP isoforms. Being able to identify potentially toxic compounds before they are made would greatly reduce the number of compound failures and the costs associated with drug development. WHAT THE READER WILL GAIN: This review summarizes the state of modeling hERG and CYP binding towards this goal since 2003 using ML algorithms. TAKE HOME MESSAGE: A wide variety of ML algorithms that are comparable in their overall performance are available. These ML methods may be applied regularly in discovery projects to flag compounds with potential metabolic liabilities.
Affiliation(s)
- Anthony E Klon
- Ansaris, Computational Chemistry, Four Valley Square, 512 East Township Line Road, Blue Bell, PA 19422, USA.
|
29
|
Nasonov AF. Computational methods and software in computer-aided combinatorial library design. Russ J Gen Chem 2011. [DOI: 10.1134/s1070363210120248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
30
|
Durdagi S, Duff HJ, Noskov SY. Combined Receptor and Ligand-Based Approach to the Universal Pharmacophore Model Development for Studies of Drug Blockade to the hERG1 Pore Domain. J Chem Inf Model 2011; 51:463-74. [DOI: 10.1021/ci100409y] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 02/02/2023]
Affiliation(s)
- Serdar Durdagi
- Department of Biological Sciences, University of Calgary, Institute for Biocomplexity and Informatics Calgary, Alberta, Canada
- Henry J. Duff
- Faculty of Medicine, University of Calgary, Libin Cardiovascular Institute of Alberta, Calgary, Alberta, Canada
- Sergei Yu. Noskov
- Department of Biological Sciences, University of Calgary, Institute for Biocomplexity and Informatics Calgary, Alberta, Canada
|
31
|
Hecht D. Applications of machine learning and computational intelligence to drug discovery and development. Drug Dev Res 2010. [DOI: 10.1002/ddr.20402] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
Affiliation(s)
- David Hecht
- Southwestern College, Chula Vista, California
|
32
|
Leong MK, Lin SW, Chen HB, Tsai FY. Predicting Mutagenicity of Aromatic Amines by Various Machine Learning Approaches. Toxicol Sci 2010; 116:498-513. [DOI: 10.1093/toxsci/kfq159] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 02/04/2023] Open
|
33
|
Raschi E, Ceccarini L, De Ponti F, Recanatini M. hERG-related drug toxicity and models for predicting hERG liability and QT prolongation. Expert Opin Drug Metab Toxicol 2009; 5:1005-1021. [PMID: 19572824 DOI: 10.1517/17425250903055070] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Indexed: 01/02/2023]
Abstract
BACKGROUND: hERG K+ channels have been recognized as a primary antitarget in safety pharmacology. Their blockade, caused by several drugs with different therapeutic indications, may lead to QT prolongation and, eventually, to potentially fatal arrhythmia, namely torsade de pointes. Therefore, a number of preclinical models have been developed to predict hERG liability early in the drug development process. OBJECTIVE: The aim of this review is to outline the present state of the art on drug-induced hERG blockade, providing insights on the predictive value of in vitro and in silico models for hERG liability. METHODS: On the basis of latest reports, high-throughput preclinical models have been discussed outlining advantages and limitations. CONCLUSION: Although no single model has an absolute value, an integrated risk assessment is recommended to predict the pro-arrhythmic risk of a given drug. This prediction requires expertise from different areas and should encompass emerging issues such as interference with hERG trafficking and QT shortening.
Affiliation(s)
- Emanuel Raschi
- University of Bologna, Department of Pharmacology, Italy
|
34
|
Leong MK, Chen YM, Chen TH. Prediction of human cytochrome P450 2B6-substrate interactions using hierarchical support vector regression approach. J Comput Chem 2009; 30:1899-909. [DOI: 10.1002/jcc.21190] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 02/04/2023]
|
35
|
Polak S, Wiśniowska B, Brandys J. Collation, assessment and analysis of literature in vitro data on hERG receptor blocking potency for subsequent modeling of drugs' cardiotoxic properties. J Appl Toxicol 2009; 29:183-206. [PMID: 18988205 DOI: 10.1002/jat.1395] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Indexed: 12/15/2022]
Abstract
The assessment of the torsadogenic potency of a new chemical entity is a crucial issue during lead optimization and the drug development process. It is required by the regulatory agencies during the registration process. In recent years, there has been a considerable interest in developing in silico models, which allow prediction of drug-hERG channel interaction at the early stage of a drug development process. The main mechanism underlying an acquired QT syndrome and a potentially fatal arrhythmia called torsades de pointes is the inhibition of potassium channel encoded by hERG (the human ether-a-go-go-related gene). The concentration producing half-maximal block of the hERG potassium current (IC50) is a surrogate marker for proarrhythmic properties of compounds and is considered a test for cardiac safety of drugs or drug candidates. The IC50 values, obtained from data collected during electrophysiological studies, are highly dependent on experimental conditions (i.e. model, temperature, voltage protocol). For the in silico models' quality and performance, the data quality and consistency is a crucial issue. Therefore the main objective of our work was to collect and assess the hERG IC50 data available in accessible scientific literature to provide a high-quality data set for further studies.
Affiliation(s)
- Sebastian Polak
- Toxicology Department, Faculty of Pharmacy, Medical College, Jagiellonian University, Poland.
|
36
|
Predicting inhibitors of acetylcholinesterase by regression and classification machine learning approaches with combinations of molecular descriptors. Pharm Res 2009; 26:2216-24. [PMID: 19603258 DOI: 10.1007/s11095-009-9937-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 02/15/2009] [Accepted: 07/02/2009] [Indexed: 10/20/2022]
Abstract
PURPOSE: Acetylcholinesterase (AChE) is both a therapeutic target for Alzheimer's disease and a target for organophosphorus, carbamates and chemical warfare agents. Prediction of the likelihood of compounds interacting with this enzyme is therefore important from both therapeutic and toxicological perspectives. MATERIALS AND METHODS: Support vector machine classification and regression models with molecular descriptors derived from Shape Signatures and the Molecular Operating Environment (MOE) application software were built and tested using a set of piperidine AChE inhibitors (N = 110). RESULTS: The combination of the alignment free Shape Signatures and 2D MOE descriptors with the Support Vector Regression method outperforms the models based solely on 2D and internal 3D (i3D) MOE descriptors, and is comparable with the best previously reported PLS model based on CoMFA molecular descriptors (r²(test, SVR) = 0.48 vs. r²(test, PLS) = 0.47 from Sutherland et al. J Med Chem 47:5541-5554, 2004). Support Vector Classification algorithms proved superior to a classifier based on scores from the molecular docking program GOLD, with the overall prediction accuracies being Q(SVC(10CV)) = 74% and Q(SVC(LNO)) = 67% vs. Q(GOLD) = 56%. CONCLUSIONS: These new machine learning models with combined descriptor schemes may find utility for predicting novel AChE inhibitors.
|
37
|
Hybrid scoring and classification approaches to predict human pregnane X receptor activators. Pharm Res 2008; 26:1001-11. [PMID: 19115096 DOI: 10.1007/s11095-008-9809-7] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Received: 11/06/2008] [Accepted: 12/10/2008] [Indexed: 02/07/2023]
Abstract
PURPOSE: The human pregnane X receptor (PXR) is a transcriptional regulator of many genes involved in xenobiotic metabolism and excretion. Reliable prediction of high affinity binders with this receptor would be valuable for pharmaceutical drug discovery to predict potential toxicological responses. MATERIALS AND METHODS: Computational models were developed and validated for a dataset consisting of human PXR (PXR) activators and non-activators. We used support vector machine (SVM) algorithms with molecular descriptors derived from two sources, Shape Signatures and the Molecular Operating Environment (MOE) application software. We also employed the molecular docking program GOLD in which the GoldScore method was supplemented with other scoring functions to improve docking results. RESULTS: The overall test set prediction accuracy for PXR activators with SVM was 72% to 81%. This indicates that molecular shape descriptors are useful in classification of compounds binding to this receptor. The best docking prediction accuracy (61%) was obtained using 1D Shape Signature descriptors as a weighting factor to the GoldScore. By pooling the available human PXR data sets we revealed those molecular features that are associated with human PXR activators. CONCLUSIONS: These combined computational approaches using molecular shape information may assist scientists to more confidently identify PXR activators.
|
38
|
Development of a New Predictive Model for Interactions with Human Cytochrome P450 2A6 Using Pharmacophore Ensemble/Support Vector Machine (PhE/SVM) Approach. Pharm Res 2008; 26:987-1000. [DOI: 10.1007/s11095-008-9807-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 10/19/2008] [Accepted: 12/08/2008] [Indexed: 02/06/2023]
|
39
|
Wang M, Yang XG, Xue Y. Identifying hERG Potassium Channel Inhibitors by Machine Learning Methods. QSAR Comb Sci 2008. [DOI: 10.1002/qsar.200810015] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
|
40
|
Abstract
hERG blockade is one of the major toxicological problems in lead structure optimization. Reliable ligand-based in silico models for predicting hERG blockade therefore have considerable potential for saving time and money, as patch-clamp measurements are very expensive and no crystal structures of the hERG-encoded channel are available. Herein we present a predictive QSAR model for hERG blockade that differentiates between specific and nonspecific binding. Specific binders are identified by preliminary pharmacophore scanning. In addition to descriptor-based models for the compounds selected as hitting one of two different pharmacophores, we also use a model for nonspecific binding that reproduces blocking properties of molecules that do not fit either of the two pharmacophores. PLS and SVR models based on interpretable quantum mechanically derived descriptors on a literature dataset of 113 molecules reach overall R² values between 0.60 and 0.70 for independent validation sets and R² values between 0.39 and 0.76 after partitioning according to the pharmacophore search for the test sets. Our findings suggest that hERG blockade may occur through different types of binding, so that several different models may be necessary to assess hERG toxicity.
Affiliation(s)
- Christian Kramer
- Department of Lead Discovery, Boehringer-Ingelheim Pharma GmbH & Co. KG, 88397 Biberach, Germany
|
41
|
Support vector machines classification of hERG liabilities based on atom types. Bioorg Med Chem 2008; 16:6252-60. [DOI: 10.1016/j.bmc.2008.04.028] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 03/04/2008] [Revised: 04/08/2008] [Accepted: 04/14/2008] [Indexed: 01/29/2023]
|
42
|
Yu W, Clyne M, Dolan SM, Yesupriya A, Wulf A, Liu T, Khoury MJ, Gwinn M. GAPscreener: an automatic tool for screening human genetic association literature in PubMed using the support vector machine technique. BMC Bioinformatics 2008; 9:205. [PMID: 18430222 PMCID: PMC2387176 DOI: 10.1186/1471-2105-9-205] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Received: 12/07/2007] [Accepted: 04/22/2008] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND: Synthesis of data from published human genetic association studies is a critical step in the translation of human genome discoveries into health applications. Although genetic association studies account for a substantial proportion of the abstracts in PubMed, identifying them with standard queries is not always accurate or efficient. Further automating the literature-screening process can reduce the burden of a labor-intensive and time-consuming traditional literature search. The Support Vector Machine (SVM), a well-established machine learning technique, has been successful in classifying text, including biomedical literature. The GAPscreener, a free SVM-based software tool, can be used to assist in screening PubMed abstracts for human genetic association studies. RESULTS: The data source for this research was the HuGE Navigator, formerly known as the HuGE Pub Lit database. Weighted SVM feature selection based on a keyword list obtained by the two-way z score method demonstrated the best screening performance, achieving 97.5% recall, 98.3% specificity and 31.9% precision in performance testing. Compared with the traditional screening process based on a complex PubMed query, the SVM tool reduced by about 90% the number of abstracts requiring individual review by the database curator. The tool also ascertained 47 articles that were missed by the traditional literature screening process during the 4-week test period. We examined the literature on genetic associations with preterm birth as an example. Compared with the traditional, manual process, the GAPscreener both reduced effort and improved accuracy. CONCLUSION: GAPscreener is the first free SVM-based application available for screening the human genetic association literature in PubMed with high recall and specificity. The user-friendly graphical user interface makes this a practical, stand-alone application. The software can be downloaded at no charge.
Affiliation(s)
- Wei Yu
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Melinda Clyne
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Siobhan M Dolan
- Albert Einstein College of Medicine/Montefiore Medical Center, Bronx, NY, USA
- Ajay Yesupriya
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Anja Wulf
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Tiebin Liu
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Muin J Khoury
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Marta Gwinn
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
|