1. Daina A, Zoete V. Testing the predictive power of reverse screening to infer drug targets, with the help of machine learning. Commun Chem 2024; 7:105. [PMID: 38724725 PMCID: PMC11082207 DOI: 10.1038/s42004-024-01179-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 04/16/2024] [Indexed: 05/12/2024]
Abstract
Estimating protein targets of compounds based on the similarity principle (similar molecules are likely to show comparable bioactivity) is a long-standing strategy in drug research. Having previously quantified this principle, we present here a large-scale evaluation of its predictive power for inferring macromolecular targets by reverse screening an unprecedentedly large external test set of more than 300,000 active small molecules against another bioactivity set of more than 500,000 compounds. We show that machine learning can predict the correct targets, with the highest probability among 2069 proteins, for more than 51% of the external molecules. The strong enrichment thus obtained demonstrates its usefulness in supporting phenotypic screens, polypharmacology, or repurposing. Moreover, we quantified the impact of the bioactivity knowledge available for proteins in terms of number and diversity of actives. Finally, we advise that developers of such approaches follow an application-oriented benchmarking strategy and use large, high-quality, non-overlapping datasets as provided here.
Affiliation(s)
- Antoine Daina
- Molecular Modeling Group, SIB Swiss Institute of Bioinformatics, CH-1015, Lausanne, Switzerland
- Vincent Zoete
- Molecular Modeling Group, SIB Swiss Institute of Bioinformatics, CH-1015, Lausanne, Switzerland.
- Computer-Aided Molecular Engineering, Department of Oncology UNIL-CHUV, Ludwig Institute for Cancer Research Lausanne Branch, University of Lausanne, Lausanne, Switzerland.
2. Ji KY, Liu C, Liu ZQ, Deng YF, Hou TJ, Cao DS. Comprehensive assessment of nine target prediction web services: which should we choose for target fishing? Brief Bioinform 2023; 24:6995377. [PMID: 36681902 DOI: 10.1093/bib/bbad014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/29/2022] [Accepted: 01/03/2023] [Indexed: 01/23/2023]
Abstract
Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, these TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting the same query molecules. This can be confusing for practitioners in practical applications. Therefore, this study systematically evaluated nine popular ligand-based TF methods based on target and ligand-target pair statistical strategies, which will help practitioners make choices among multiple TF methods. The evaluation results showed that SwissTargetPrediction was the best method to produce the most reliable predictions while enriching more targets. High-recall similarity ensemble approach (SEA) was able to find real targets for more compounds compared with other TF methods. Therefore, SwissTargetPrediction and SEA can be considered as primary selection methods in future studies. In addition, the results showed that k = 5 was the optimal number of experimental candidate targets. Finally, a novel ensemble TF method based on consensus voting is proposed to improve the prediction performance. The precision of the ensemble TF method outperforms the individual TF method, indicating that the ensemble TF method can more effectively identify real targets within a given top-k threshold. The results of this study can be used as a reference to guide practitioners in selecting the most effective methods in computational drug discovery.
Affiliation(s)
- Kai-Yue Ji
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Chong Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Zhao-Qian Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Ya-Feng Deng
- CarbonSilicon AI Technology Co., Ltd, Hangzhou, Zhejiang 310018, China
- Ting-Jun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
3. Mathai N, Chen Y, Kirchmair J. Validation strategies for target prediction methods. Brief Bioinform 2021; 21:791-802. [PMID: 31220208 PMCID: PMC7299289 DOI: 10.1093/bib/bbz026] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 11/26/2018] [Revised: 01/14/2019] [Accepted: 02/17/2019] [Indexed: 12/11/2022]
Abstract
Computational methods for target prediction, based on molecular similarity and network-based approaches, machine learning, docking and others, have evolved as valuable and powerful tools to aid the challenging task of mode of action identification for bioactive small molecules such as drugs and drug-like compounds. Critical to discerning the scope and limitations of a target prediction method is understanding how its performance was evaluated and reported. Ideally, large-scale prospective experiments are conducted to validate the performance of a model; however, this expensive and time-consuming endeavor is often not feasible. Therefore, to estimate the predictive power of a method, statistical validation based on retrospective knowledge is commonly used. There are multiple statistical validation techniques that vary in rigor. In this review we discuss the validation strategies employed, highlighting the usefulness and constraints of the validation schemes and metrics that are employed to measure and describe performance. We address the limitations of measuring only generalized performance, given that the underlying bioactivity and structural data are biased towards certain small-molecule scaffolds and target families, and suggest additional aspects of performance to consider in order to produce more detailed and realistic estimates of predictive power. Finally, we describe the validation strategies that were employed by some of the most thoroughly validated and accessible target prediction methods.
Affiliation(s)
- Neann Mathai
- Department of Chemistry, University of Bergen, Bergen, Norway; Computational Biology Unit (CBU), University of Bergen, Bergen, Norway; Center for Bioinformatics (ZBH), Department of Computer Science, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
- Ya Chen
- Center for Bioinformatics (ZBH), Department of Computer Science, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
- Johannes Kirchmair
- Department of Chemistry, University of Bergen, Bergen, Norway; Computational Biology Unit (CBU), University of Bergen, Bergen, Norway; Center for Bioinformatics (ZBH), Department of Computer Science, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
4. GPCR_LigandClassify.py; a rigorous machine learning classifier for GPCR targeting compounds. Sci Rep 2021; 11:9510. [PMID: 33947911 PMCID: PMC8097070 DOI: 10.1038/s41598-021-88939-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/05/2020] [Accepted: 04/12/2021] [Indexed: 02/02/2023]
Abstract
The current study describes the construction of various ligand-based machine learning models to be used for drug repurposing against the family of G-Protein Coupled Receptors (GPCRs). In building these models, we collected >500,000 data points, encompassing experimentally measured molecular association data of >160,000 unique ligands against >250 GPCRs. These data points were retrieved from the GPCR-Ligand Association (GLASS) database. We have used diverse molecular featurization methods to describe the input molecules. Multiple supervised ML algorithms were developed, tested and compared for their accuracy, F scores, as well as for their Matthews' correlation coefficient (MCC) scores. Our data suggest that, combined with molecular fingerprinting, ensemble decision trees and gradient boosted trees ML algorithms are on the accuracy border of the rather sophisticated deep neural net (DNN)-based algorithms. On a test dataset, these models displayed an excellent performance, reaching a ~90% classification accuracy. Additionally, we showcase a few examples where our models were able to identify interesting connections between known drugs from the DrugBank database and members of the GPCR family of receptors. Our findings are in excellent agreement with previously reported experimental observations in the literature. We hope the models presented in this paper synergize with the currently ongoing interest of applying machine learning modeling in the field of drug repurposing and computational drug discovery in general.
5. Ghislat G, Rahman T, Ballester PJ. Identification and Validation of Carbonic Anhydrase II as the First Target of the Anti-Inflammatory Drug Actarit. Biomolecules 2020; 10:biom10111570. [PMID: 33227945 PMCID: PMC7699199 DOI: 10.3390/biom10111570] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/26/2020] [Revised: 11/13/2020] [Accepted: 11/16/2020] [Indexed: 12/31/2022]
Abstract
Background and purpose: Identifying the macromolecular targets of drug molecules is a fundamental aspect of drug discovery and pharmacology. Several drugs remain without known targets (orphan) despite large-scale in silico and in vitro target prediction efforts. Ligand-centric chemical-similarity-based methods for in silico target prediction have been found to be particularly powerful, but the question remains of whether they are able to discover targets for target-orphan drugs. Experimental Approach: We used one of these in silico methods to carry out a target prediction analysis for two orphan drugs: actarit and malotilate. The top target predicted for each drug was carbonic anhydrase II (CAII). Each drug was therefore quantitatively evaluated for CAII inhibition to validate these two prospective predictions. Key Results: Actarit showed in vitro concentration-dependent inhibition of CAII activity with submicromolar potency (IC50 = 422 nM) whilst no consistent inhibition was observed for malotilate. Among the other 25 targets predicted for actarit, RORγ (RAR-related orphan receptor-gamma) is promising in that it is strongly related to actarit’s indication, rheumatoid arthritis (RA). Conclusion and Implications: This study is a proof-of-concept of the utility of MolTarPred for the fast and cost-effective identification of targets of orphan drugs. Furthermore, the mechanism of action of actarit as an anti-RA agent can now be re-examined from a CAII-inhibitor perspective, given existing relationships between this target and RA. Moreover, the confirmed CAII-actarit association supports investigating the repositioning of actarit on other CAII-linked indications (e.g., hypertension, epilepsy, migraine, anemia and bone, eye and cardiac disorders).
Affiliation(s)
- Ghita Ghislat
- Centre d’Immunologie de Marseille-Luminy, Inserm, U1104, CNRS UMR7280, F-13288 Marseille, France
- Correspondence: (G.G.); (P.J.B.)
- Taufiq Rahman
- Department of Pharmacology, University of Cambridge, Cambridge CB2 1PD, UK
- Pedro J. Ballester
- Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm, U1068, F-13009 Marseille, France
- CNRS, UMR7258, F-13009 Marseille, France
- Institut Paoli-Calmettes, F-13009 Marseille, France
- Aix-Marseille University, UM 105, F-13284 Marseille, France
- Correspondence: (G.G.); (P.J.B.)
6. Yang S, Ye Q, Ding J, Yin, Lu A, Chen X, Hou T, Cao D. Current advances in ligand‐based target prediction. WIREs Computational Molecular Science 2020. [DOI: 10.1002/wcms.1504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022]
Affiliation(s)
- Su‐Qing Yang
- Xiangya School of Pharmaceutical Sciences Central South University Changsha Hunan China
- Qing Ye
- College of Pharmaceutical Sciences Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University Hangzhou, Zhejiang China
- Jun‐Jie Ding
- Beijing Institute of Pharmaceutical Chemistry Beijing China
- Yin
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital Central South University Changsha Hunan China
- Ai‐Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong China
- Xiang Chen
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital Central South University Changsha Hunan China
- Ting‐Jun Hou
- College of Pharmaceutical Sciences Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University Hangzhou, Zhejiang China
- Dong‐Sheng Cao
- Xiangya School of Pharmaceutical Sciences Central South University Changsha Hunan China
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong China
7. Selecting machine-learning scoring functions for structure-based virtual screening. Drug Discovery Today. Technologies 2020; 32-33:81-87. [PMID: 33386098 DOI: 10.1016/j.ddtec.2020.09.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 04/01/2020] [Revised: 09/02/2020] [Accepted: 09/07/2020] [Indexed: 12/27/2022]
Abstract
Interest in docking technologies has grown parallel to the ever increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to select an existing scoring function for the target along with a third approach consisting in generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternatives to benchmark scoring functions for SBVS and how to generate them or use them using freely-available software.
8. Jenkinson S, Schmidt F, Rosenbrier Ribeiro L, Delaunois A, Valentin JP. A practical guide to secondary pharmacology in drug discovery. J Pharmacol Toxicol Methods 2020; 105:106869. [PMID: 32302774 DOI: 10.1016/j.vascn.2020.106869] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/21/2020] [Revised: 03/21/2020] [Accepted: 04/03/2020] [Indexed: 01/29/2023]
Abstract
Secondary pharmacological profiling is increasingly applied in pharmaceutical drug discovery to address unwanted pharmacological side effects of drug candidates before entering the clinic. Regulators, drug makers and patients share a demand for deep characterization of secondary pharmacology effects of novel drugs and their metabolites. The scope of such profiling has therefore expanded substantially in the past two decades, leading to the implementation of broad in silico profiling methods and focused in vitro off-target screening panels, to identify liabilities, but also opportunities, as early as possible. The pharmaceutical industry routinely applies such panels at all stages of drug discovery up to early development. Nevertheless, target composition, screening technologies, assay formats, interpretation and scheduling of panels can vary significantly between companies in the absence of dedicated guidelines. To contribute towards best practices in secondary pharmacology profiling, this review aims to summarize the state of the art in this field. Considerations are discussed with respect to panel design, screening strategy, implementation and interpretation of the data, including regulatory perspectives. The cascaded, or integrated, use of in silico and off-target profiling allows synergies to be exploited for comprehensive safety assessment of drug candidates.
Affiliation(s)
- Stephen Jenkinson
- Drug Safety Research and Development, Pfizer Inc., La Jolla, CA 92121, United States of America.
- Friedemann Schmidt
- Sanofi, R&D Preclinical Safety, Industriepark Höchst, 65926 Frankfurt/Main, Germany
- Lyn Rosenbrier Ribeiro
- Medicines Discovery Catapult, Block 35, Mereside, Alderley Park, Alderley Edge, SK10 4TG, United Kingdom
- Annie Delaunois
- UCB BioPharma SRL, Early Solutions, Development Science, Non-Clinical Safety, 1420 Braine L'Alleud, Walloon Region, Belgium
- Jean-Pierre Valentin
- UCB BioPharma SRL, Early Solutions, Development Science, Non-Clinical Safety, 1420 Braine L'Alleud, Walloon Region, Belgium
9. Li H, Sze K, Lu G, Ballester PJ. Machine‐learning scoring functions for structure‐based drug lead optimization. Wiley Interdisciplinary Reviews-Computational Molecular Science 2020. [DOI: 10.1002/wcms.1465] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Indexed: 12/17/2022]
Affiliation(s)
- Hongjian Li
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
- Kam‐Heung Sze
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
- Gang Lu
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
- Pedro J. Ballester
- Cancer Research Center of Marseille (INSERM U1068, Institut Paoli‐Calmettes, Aix‐Marseille Université UM105, CNRS UMR7258) Marseille France
10. Hassanzadeh P, Atyabi F, Dinarvand R. The significance of artificial intelligence in drug delivery system design. Adv Drug Deliv Rev 2019; 151-152:169-190. [PMID: 31071378 DOI: 10.1016/j.addr.2019.05.001] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 01/14/2019] [Revised: 04/14/2019] [Accepted: 05/02/2019] [Indexed: 02/07/2023]
Abstract
Over the last decade, there has been increasing interest in applying artificial intelligence (AI) technology to the analysis and interpretation of biological or genetic information, accelerated drug discovery, the identification of selective small-molecule modulators or rare molecules, and the prediction of their behavior. Applying automated workflows and databases to the rapid analysis of huge amounts of data, and artificial neural networks (ANNs) to the development of novel hypotheses and treatment strategies, the prediction of disease progression, and the evaluation of the pharmacological profiles of drug candidates, may significantly improve treatment outcomes. Target fishing (TF) by rapid prediction or identification of biological targets might be of great help for linking targets to novel compounds. AI and TF methods in association with human expertise may indeed revolutionize current theranostic strategies; meanwhile, validation approaches are necessary to overcome the potential challenges and ensure higher accuracy. In this review, the significance of AI and TF in the development of drugs and delivery systems and the potential challenges are highlighted.
Affiliation(s)
- Parichehr Hassanzadeh
- Nanotechnology Research Center, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran 13169-43551, Iran.
- Fatemeh Atyabi
- Nanotechnology Research Center, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran 13169-43551, Iran.
- Rassoul Dinarvand
- Nanotechnology Research Center, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran 13169-43551, Iran.
11. Öztürk H, Ozkirimli E, Özgür A. A novel methodology on distributed representations of proteins using their interacting ligands. Bioinformatics 2019; 34:i295-i303. [PMID: 29949957 PMCID: PMC6022674 DOI: 10.1093/bioinformatics/bty287] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 11/23/2022]
Abstract
Motivation: The effective representation of proteins is a crucial task that directly affects the performance of many bioinformatics problems. Related proteins usually bind to similar ligands. Chemical characteristics of ligands are known to capture the functional and mechanistic properties of proteins suggesting that a ligand-based approach can be utilized in protein representation. In this study, we propose SMILESVec, a Simplified molecular input line entry system (SMILES)-based method to represent ligands and a novel method to compute similarity of proteins by describing them based on their ligands. The proteins are defined utilizing the word-embeddings of the SMILES strings of their ligands. The performance of the proposed protein description method is evaluated in protein clustering task using TransClust and MCL algorithms. Two other protein representation methods that utilize protein sequence, Basic local alignment tool and ProtVec, and two compound fingerprint-based protein representation methods are compared. Results: We showed that ligand-based protein representation, which uses only SMILES strings of the ligands that proteins bind to, performs as well as protein sequence-based representation methods in protein clustering. The results suggest that ligand-based protein description can be an alternative to the traditional sequence or structure-based representation of proteins and this novel approach can be applied to different bioinformatics problems such as prediction of new protein–ligand interactions and protein function annotation. Availability and implementation: https://github.com/hkmztrk/SMILESVecProteinRepresentation Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hakime Öztürk
- Department of Computer Engineering, Bogazici University, Istanbul, Turkey
- Elif Ozkirimli
- Department of Chemical Engineering, Bogazici University, Istanbul, Turkey
- Arzucan Özgür
- Department of Computer Engineering, Bogazici University, Istanbul, Turkey
12. Tejera E, Carrera I, Jimenes-Vargas K, Armijos-Jaramillo V, Sánchez-Rodríguez A, Cruz-Monteagudo M, Perez-Castillo Y. Cell fishing: A similarity based approach and machine learning strategy for multiple cell lines-compound sensitivity prediction. PLoS One 2019; 14:e0223276. [PMID: 31589649 PMCID: PMC6779297 DOI: 10.1371/journal.pone.0223276] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/22/2019] [Accepted: 09/17/2019] [Indexed: 12/21/2022]
Abstract
The prediction of cell-line sensitivity to a given set of compounds is a very important factor in the optimization of in vitro assays. To date, the most common prediction strategies are based upon machine learning or other quantitative structure-activity relationship (QSAR) based approaches. In the present research, we propose and discuss a straightforward strategy not based on any learning model but relying exclusively upon the chemical similarity of a query compound to reference compounds with annotated activity against cell lines. We also compare the performance of the proposed method to machine learning predictions on the same problem. A curated database of compound-cell line associations derived from ChEMBL version 22 was created for algorithm construction and cross-validation. Validation was done using 10-fold cross-validation and by testing the models on new data obtained from ChEMBL version 25. In terms of accuracy, both methods perform similarly, with values around 0.65 across 750 cell lines in 10-fold cross-validation experiments. By combining both methods it is possible to achieve a correct classification rate of 66% in more than 26,000 newly reported interactions comprising 11,000 new compounds. A Web Service implementing the described approaches (both similarity and machine learning based models) is freely available at: http://bioquimio.udla.edu.ec/cellfishing.
Affiliation(s)
- E. Tejera
- Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
- I. Carrera
- Departamento de Informática y Ciencias de la Computación, Escuela Politécnica Nacional, Quito, Ecuador
- Departamento de Ciências de Computadores, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Karina Jimenes-Vargas
- Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
- V. Armijos-Jaramillo
- Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
- A. Sánchez-Rodríguez
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
- Universidad Técnica Particular de Loja, Loja, Ecuador
- M. Cruz-Monteagudo
- Center for Computational Science (CCS), University of Miami (UM), Miami, FL, United States of America
- West Coast University, Miami, Florida, United States of America
- Y. Perez-Castillo
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
- Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Quito, Ecuador
13. Peón A, Li H, Ghislat G, Leung KS, Wong MH, Lu G, Ballester PJ. MolTarPred: A web tool for comprehensive target prediction with reliability estimation. Chem Biol Drug Des 2019; 94:1390-1401. [PMID: 30916462 DOI: 10.1111/cbdd.13516] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/05/2018] [Revised: 02/07/2019] [Accepted: 03/03/2019] [Indexed: 12/17/2022]
Abstract
Molecular target prediction can provide a starting point to understand the efficacy and side effects of phenotypic screening hits. Unfortunately, the vast majority of in silico target prediction methods are not available as web tools. Furthermore, these are limited in the number of targets that can be predicted, do not estimate which target predictions are more reliable and/or lack comprehensive retrospective validations. We present MolTarPred (http://moltarpred.marseille.inserm.fr/), a user-friendly web tool for predicting protein targets of small organic compounds. It is powered by a large knowledge base comprising 607,659 compounds and 4,553 macromolecular targets collected from the ChEMBL database. In about 1 min, the predicted targets for the supplied molecule will be listed in a table. The chemical structures of the query molecule and the most similar compounds annotated with the predicted target will also be shown to permit visual inspection and comparison. Practical examples of the use of MolTarPred are showcased. MolTarPred is a new resource for scientists that require a more complete knowledge of the polypharmacology of a molecule. The introduction of a reliability score constitutes an attractive functionality of MolTarPred, as it permits focusing experimental confirmatory tests on the most reliable predictions, which leads to higher prospective hit rates.
Affiliation(s)
- Antonio Peón
- Centre de Recherche en Cancérologie de Marseille (CRCM), U1068, Inserm, Marseille, France; UMR7258, CNRS, Marseille, France; Institut Paoli-Calmettes, Marseille, France; UM 105, Aix-Marseille University, Marseille, France
- Hongjian Li
- SDIVF R&D Centre, Hong Kong Science Park, Sha Tin, New Territories, Hong Kong; CUHK-SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences, The Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong
- Ghita Ghislat
- U1104, CNRS UMR7280, Centre d'Immunologie de Marseille-Luminy, Inserm, Marseille, France
- Kwong-Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong
- Man-Hon Wong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong
- Gang Lu
- CUHK-SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences, The Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong
- Pedro J Ballester
- Centre de Recherche en Cancérologie de Marseille (CRCM), U1068, Inserm, Marseille, France; UMR7258, CNRS, Marseille, France; Institut Paoli-Calmettes, Marseille, France; UM 105, Aix-Marseille University, Marseille, France
14.
Abstract
Drugs modulate disease states through their actions on targets in the body. Determining these targets aids the focused development of new treatments, and helps to better characterize those already employed. One means of accomplishing this is through the deployment of in silico methodologies, harnessing computational analytical and predictive power to produce educated hypotheses for experimental verification. Here, we provide an overview of the current state of the art, describe some of the well-established methods in detail, and reflect on how they, and emerging technologies promoting the incorporation of complex and heterogeneous data-sets, can be employed to improve our understanding of (poly)pharmacology.
Affiliation(s)
- Ryan Byrne
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
| | - Gisbert Schneider
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland.
| |
|
15
|
Li B, Ma C, Zhao X, Hu Z, Du T, Xu X, Wang Z, Lin J. YaTCM: Yet another Traditional Chinese Medicine Database for Drug Discovery. Comput Struct Biotechnol J 2018; 16:600-610. [PMID: 30546860 PMCID: PMC6280608 DOI: 10.1016/j.csbj.2018.11.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 08/08/2018] [Revised: 11/04/2018] [Accepted: 11/06/2018] [Indexed: 12/13/2022]
Abstract
Traditional Chinese Medicine (TCM) has a long history of widespread clinical applications, especially in East Asia, and is increasingly used in Western countries. However, owing to the extreme complexity of both its chemical ingredients and its mechanisms of action, a deep understanding of TCM is still difficult. To accelerate the modernization and popularization of TCM, a single comprehensive database is required, containing a wealth of TCM-related information and equipped with complete analytical tools. Here we present YaTCM (Yet another Traditional Chinese Medicine database), a free web-based toolkit, which provides comprehensive TCM information and is furnished with analysis tools. YaTCM allows a user to (1) identify the potential ingredients that are crucial to TCM herbs through similarity search and substructure search, (2) investigate the mechanism of action of a TCM or prescription through pathway analysis and network pharmacology analysis, (3) predict potential targets for TCM molecules with a multi-voting chemical similarity ensemble approach, and (4) explore functionally similar herb pairs. All these functions can lead to one systematic network for visualization of TCM recipes, herbs, ingredients, definite or putative protein targets, pathways, and diseases. This web service would help in uncovering the mechanism of action of TCM, revealing the essence of TCM theory and then promoting the drug discovery process. YaTCM is freely available at http://cadd.pharmacy.nankai.edu.cn/yatcm/home.
Affiliation(s)
- Baiqing Li
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
| | - Chunfeng Ma
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China.,Platform of Pharmaceutical Intelligence, Tianjin International Joint Academy of Biomedicine, Tianjin 300457, China
| | - Xiaoyong Zhao
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
| | - Zhigang Hu
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
| | - Tengfei Du
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
| | - Xuanming Xu
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
| | - Zhonghua Wang
- Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West 7th Avenue, Tianjin Airport Economic Area, Tianjin 300308, China
| | - Jianping Lin
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China.,Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West 7th Avenue, Tianjin Airport Economic Area, Tianjin 300308, China.,Platform of Pharmaceutical Intelligence, Tianjin International Joint Academy of Biomedicine, Tianjin 300457, China
| |
|
16
|
Liang TT, Zhao Q, He S, Mu FZ, Deng W, Han BN. Modeling Analysis of Potential Target of Dolastatin 16 by Computational Virtual Screening. Chem Pharm Bull (Tokyo) 2018; 66:602-607. [DOI: 10.1248/cpb.c17-00966] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Affiliation(s)
- Ting-Ting Liang
- School of Chemical and Environmental Engineering, Shanghai Institute of Technology
- Department of Development Technology of Marine Resources, College of Life Sciences, Zhejiang Sci-Tech University
| | - Qi Zhao
- Faculty of Health Sciences, University of Macau
| | - Shan He
- Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Ningbo University
| | - Fang-Zhou Mu
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison
| | - Wei Deng
- School of Chemical and Environmental Engineering, Shanghai Institute of Technology
| | - Bing-Nan Han
- Department of Development Technology of Marine Resources, College of Life Sciences, Zhejiang Sci-Tech University
| |
|
17
|
Bolgár B, Antal P. VB-MK-LMF: fusion of drugs, targets and interactions using variational Bayesian multiple kernel logistic matrix factorization. BMC Bioinformatics 2017; 18:440. [PMID: 28978313 PMCID: PMC5628496 DOI: 10.1186/s12859-017-1845-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 06/30/2017] [Accepted: 09/21/2017] [Indexed: 12/20/2022]
Abstract
BACKGROUND Computational fusion approaches to drug-target interaction (DTI) prediction, capable of utilizing multiple sources of background knowledge, were reported to achieve superior predictive performance in multiple studies. Other studies showed that specificities of the DTI task, such as weighting the observations and focusing the side information, are also vital for reaching top performance. METHOD We present Variational Bayesian Multiple Kernel Logistic Matrix Factorization (VB-MK-LMF), which unifies the advantages of (1) multiple kernel learning, (2) weighted observations, (3) graph Laplacian regularization, and (4) explicit modeling of probabilities of binary drug-target interactions. RESULTS VB-MK-LMF achieves significantly better predictive performance in standard benchmarks compared to state-of-the-art methods, which can be traced back to multiple factors. The systematic evaluation of the effect of multiple kernels confirms their benefits, but also highlights the limitations of linear kernel combinations, already recognized in other fields. The analysis of the effect of prior kernels using varying sample sizes sheds light on the balance of data and knowledge in DTI tasks and on the rate at which the effect of priors vanishes. This also shows the existence of "small sample size" regions where using side information offers significant gains. Alongside favorable predictive performance, a notable property of MF methods is that they provide a unified space for drugs and targets using latent representations. Compared to earlier studies, the dimensionality of this space proved to be surprisingly low, which makes the latent representations constructed by VB-MK-LMF especially well-suited for visual analytics. The probabilistic nature of the predictions allows the calculation of the expected values of hits in functionally relevant sets, which we demonstrate by predicting drug promiscuity. The variational Bayesian approximation is also implemented for general purpose graphics processing units, yielding significantly improved computational time. CONCLUSION In standard benchmarks, VB-MK-LMF shows significantly improved predictive performance in a wide range of settings. Beyond these benchmarks, another contribution of our work is highlighting and providing estimates for further pharmaceutically relevant quantities, such as promiscuity, druggability and total number of interactions.
Affiliation(s)
- Bence Bolgár
- Department of Measurement and Information Systems, Budapest University of Technology and Economics, Magyar tudósok krt. 2., Budapest, 1117 Hungary
| | - Péter Antal
- Department of Measurement and Information Systems, Budapest University of Technology and Economics, Magyar tudósok krt. 2., Budapest, 1117 Hungary
| |
|
18
|
Peón A, Naulaerts S, Ballester PJ. Predicting the Reliability of Drug-target Interaction Predictions with Maximum Coverage of Target Space. Sci Rep 2017. [PMID: 28630414 PMCID: PMC5476590 DOI: 10.1038/s41598-017-04264-w] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 02/05/2023]
Abstract
Many computational methods to predict the macromolecular targets of small organic molecules have been presented to date. Despite progress, target prediction methods still have important limitations. For example, the most accurate methods implicitly restrict their predictions to a relatively small number of targets, are not systematically validated on drugs (whose targets are harder to predict than those of non-drug molecules) and often lack a reliability score associated with each predicted target. Here we present a systematic validation of ligand-centric target prediction methods on a set of clinical drugs. These methods exploit a knowledge-base covering 887,435 known ligand-target associations between 504,755 molecules and 4,167 targets. Based on this dataset, we provide a new estimate of the polypharmacology of drugs, which on average have 11.5 targets below IC50 10 µM. The average performance achieved across clinical drugs is remarkable (0.348 precision and 0.423 recall, with large drug-dependent variability), especially given the unusually large coverage of the target space. Furthermore, we show how using a sparse ligand-target bioactivity matrix to retrospectively validate target prediction methods could underestimate prospective performance. Lastly, we present and validate a first-in-kind score capable of accurately predicting the reliability of target predictions.
Affiliation(s)
- Antonio Peón
- Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm, U1068, Marseille, F-13009, France.,CNRS, UMR7258, Marseille, F-13009, France.,Institut Paoli-Calmettes, Marseille, F-13009, France.,Aix-Marseille University, UM 105, F-13284, Marseille, France
| | - Stefan Naulaerts
- Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm, U1068, Marseille, F-13009, France.,CNRS, UMR7258, Marseille, F-13009, France.,Institut Paoli-Calmettes, Marseille, F-13009, France.,Aix-Marseille University, UM 105, F-13284, Marseille, France
| | - Pedro J Ballester
- Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm, U1068, Marseille, F-13009, France. .,CNRS, UMR7258, Marseille, F-13009, France. .,Institut Paoli-Calmettes, Marseille, F-13009, France. .,Aix-Marseille University, UM 105, F-13284, Marseille, France.
| |
|
19
|
Chu YY, Cheng HJ, Tian ZH, Zhao JC, Li G, Chu YY, Sun CJ, Li WB. Rational drug design of indazole-based diarylurea derivatives as anticancer agents. Chem Biol Drug Des 2017; 90:609-617. [DOI: 10.1111/cbdd.12984] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/19/2016] [Revised: 01/19/2017] [Accepted: 03/09/2017] [Indexed: 11/27/2022]
Affiliation(s)
- Yan-yan Chu
- School of Medicine and Pharmacy; Ocean University of China; Qingdao China
- Marine Biomedical Research Institute of Qingdao; Qingdao China
| | - He-juan Cheng
- School of Medicine and Pharmacy; Ocean University of China; Qingdao China
| | - Zhen-hua Tian
- School of Medicine and Pharmacy; Ocean University of China; Qingdao China
| | - Jian-chun Zhao
- School of Medicine and Pharmacy; Ocean University of China; Qingdao China
- Marine Biomedical Research Institute of Qingdao; Qingdao China
| | - Gang Li
- Haile PharmaTech Ltd; Jinan China
| | | | | | - Wen-bao Li
- School of Medicine and Pharmacy; Ocean University of China; Qingdao China
- Marine Biomedical Research Institute of Qingdao; Qingdao China
- Haile PharmaTech Ltd; Jinan China
| |
|
20
|
Waters S, Svensson P, Kullingsjö J, Pontén H, Andreasson T, Sunesson Y, Ljung E, Sonesson C, Waters N. In Vivo Systems Response Profiling and Multivariate Classification of CNS Active Compounds: A Structured Tool for CNS Drug Discovery. ACS Chem Neurosci 2017; 8:785-797. [PMID: 27997108 DOI: 10.1021/acschemneuro.6b00371] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
This paper describes the application of in vivo systems response profiling in CNS drug discovery by a process referred to as the Integrative Screening Process. The biological response profile, treated as an array, is used as the major outcome for selection of candidate drugs. Dose-response data, including ex vivo brain monoaminergic biomarkers and behavioral descriptors, are systematically collected and analyzed by principal component analysis (PCA) and partial least-squares (PLS) regression, yielding multivariate characterization across compounds. The approach is exemplified by assessing a new class of CNS active compounds, the dopidines, compared to other monoamine modulating compounds including antipsychotics, antidepressants, and procognitive agents. Dopidines display a distinct phenotypic profile which has prompted extensive further preclinical and clinical investigations. In summary, in vivo profiles of CNS compounds are mapped, based on dose-response studies in the rat. Applying a systematic and standardized workflow, a database of in vivo systems response profiles is compiled, enabling comparisons and classification. This creates a framework for translational mapping, a crucial component in CNS drug discovery.
Affiliation(s)
- Susanna Waters
- Department of Pharmacology, Gothenburg University, SE-405 30 Gothenburg, Sweden
- Integrative Research Laboratories Sweden AB, Gothenburg SE-413 46, Sweden
| | - Peder Svensson
- Integrative Research Laboratories Sweden AB, Gothenburg SE-413 46, Sweden
| | - Johan Kullingsjö
- Integrative Research Laboratories Sweden AB, Gothenburg SE-413 46, Sweden
| | - Henrik Pontén
- Department of Pharmacology, Gothenburg University, SE-405 30 Gothenburg, Sweden
| | | | | | - Elisabeth Ljung
- Integrative Research Laboratories Sweden AB, Gothenburg SE-413 46, Sweden
| | - Clas Sonesson
- Integrative Research Laboratories Sweden AB, Gothenburg SE-413 46, Sweden
| | - Nicholas Waters
- Integrative Research Laboratories Sweden AB, Gothenburg SE-413 46, Sweden
| |
|