1
Orsi M, Reymond JL. One chiral fingerprint to find them all. J Cheminform 2024; 16:53. [PMID: 38741153 DOI: 10.1186/s13321-024-00849-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/28/2024] [Indexed: 05/16/2024] Open
Abstract
Molecular fingerprints are indispensable tools in cheminformatics. However, stereochemistry is generally not considered, which is problematic for large molecules which are almost all chiral. Herein we report MAP4C, a chiral version of our previously reported fingerprint MAP4, which lists MinHashes computed from character strings containing the SMILES of all pairs of circular substructures up to a diameter of four bonds and the shortest topological distance between their central atoms. MAP4C includes the Cahn-Ingold-Prelog (CIP) annotation (R, S, r or s) whenever the chiral atom is the center of a circular substructure, a question mark for undefined stereocenters, and double bond cis-trans information if specified. MAP4C performs slightly better than the achiral MAP4, ECFP and AP fingerprints in non-stereoselective virtual screening benchmarks. Furthermore, MAP4C distinguishes between stereoisomers in chiral molecules from small molecule drugs to large natural products and peptides comprising thousands of diastereomers, with a degree of distinction smaller than between structural isomers and proportional to the number of chirality changes. Due to its excellent performance across diverse molecular classes and its ability to handle stereochemistry, MAP4C is recommended as a generally applicable chiral molecular fingerprint. SCIENTIFIC CONTRIBUTION: The ability of our chiral fingerprint MAP4C to handle stereoisomers from small molecules to large natural products and peptides is unprecedented and opens the way for cheminformatics to include stereochemistry as an important molecular parameter across all fields of molecular design.
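The MinHash step described in the abstract above can be illustrated with a small, self-contained sketch. This is not the MAP4C implementation: the shingle strings below are made-up placeholders standing in for "substructure | topological distance | substructure" strings (with a stereo label appended to one of them), and the function names `shingle_hash`, `minhash`, and `estimated_jaccard` are hypothetical. The sketch only shows how a set of strings is reduced to a fixed-length MinHash signature and how two signatures are compared.

```python
import hashlib


def shingle_hash(shingle: str, seed: int) -> int:
    """Deterministic 64-bit hash of a shingle string under a given seed."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash(shingles: set, n_perm: int = 128) -> list:
    """Keep the minimum hash value per seed; one seed plays the role of one permutation."""
    if not shingles:
        raise ValueError("cannot MinHash an empty shingle set")
    return [min(shingle_hash(s, seed) for s in shingles) for seed in range(n_perm)]


def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Placeholder shingles (assumptions, not real MAP4C output): two stereoisomers
# share every shingle except the one carrying the stereo label.
isomer_1 = {"subA$R|1|subB", "subB|2|subC", "subA|3|subC"}
isomer_2 = {"subA$S|1|subB", "subB|2|subC", "subA|3|subC"}
unrelated = {"subX|1|subY", "subY|4|subZ"}

sim_isomers = estimated_jaccard(minhash(isomer_1), minhash(isomer_2))
sim_unrelated = estimated_jaccard(minhash(isomer_1), minhash(unrelated))
```

In this toy setup the two "stereoisomers" share two of four distinct shingles, so their true Jaccard similarity is 0.5 and the estimate should land near that value, while the unrelated set shares none.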
Affiliation(s)
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
2
Vogt M. Chemoinformatic approaches for navigating large chemical spaces. Expert Opin Drug Discov 2024; 19:403-414. [PMID: 38300511 DOI: 10.1080/17460441.2024.2313475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/30/2024] [Indexed: 02/02/2024]
Abstract
INTRODUCTION: Large chemical spaces (CSs) include traditional large compound collections, combinatorial libraries covering billions to trillions of molecules, DNA-encoded chemical libraries comprising complete combinatorial CSs in a single mixture, and virtual CSs explored by generative models. The diverse nature of these types of CSs requires different chemoinformatic approaches for navigation. AREAS COVERED: An overview of different types of large CSs is provided. Molecular representations and similarity metrics suitable for large CS exploration are discussed. A summary of navigation of CSs in generative models is provided. Methods for characterizing and comparing CSs are discussed. EXPERT OPINION: The size of large CSs might restrict navigation to specialized algorithms and limit it to considering neighborhoods of structurally similar molecules. Efficient navigation of large CSs not only requires methods that scale with size but also requires smart approaches that focus on better but not necessarily larger molecule selections. Deep generative models aim to provide such approaches by implicitly learning features relevant for targeted biological properties. It is unclear whether these models can fulfill this ideal as validation is difficult as long as the covered CSs remain mainly virtual without experimental verification.
Affiliation(s)
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
3
Qian W, Wang X, Kang Y, Pan P, Hou T, Hsieh CY. A general model for predicting enzyme functions based on enzymatic reactions. J Cheminform 2024; 16:38. [PMID: 38556873 PMCID: PMC10983695 DOI: 10.1186/s13321-024-00827-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 03/16/2024] [Indexed: 04/02/2024] Open
Abstract
Accurate prediction of the Enzyme Commission (EC) numbers for chemical reactions is essential for the understanding and manipulation of enzyme functions, biocatalytic processes and biosynthetic planning. A number of machine learning (ML)-based models have been developed to classify enzymatic reactions, showing great advantages over costly and long-winded experimental verifications. However, the prediction accuracy for most available models trained on the records of chemical reactions without specifying the enzymatic catalysts is rather limited. In this study, we introduced BEC-Pred, a BERT-based multiclassification model, for predicting EC numbers associated with reactions. Leveraging transfer learning, our approach achieves precise forecasting across a wide variety of EC numbers solely through analysis of the SMILES sequences of substrates and products. The BEC-Pred model outperformed other sequence and graph-based ML methods, attaining a higher accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool to accelerate the cutting-edge research in synthetic biology and drug metabolism. Moreover, we discussed a few examples of how BEC-Pred could accurately predict the enzymatic classification for the Novozym 435-induced hydrolysis and lipase efficient catalytic synthesis. We anticipate that BEC-Pred will have a positive impact on the progression of enzymatic research.
Affiliation(s)
- Wenjia Qian
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Xiaorui Wang
  - Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018, Zhejiang, China
- Yu Kang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Peichen Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Chang-Yu Hsieh
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
4
Srisongkram T. Ensemble Quantitative Read-Across Structure-Activity Relationship Algorithm for Predicting Skin Cytotoxicity. Chem Res Toxicol 2023; 36:1961-1972. [PMID: 38047785 DOI: 10.1021/acs.chemrestox.3c00238] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 12/05/2023]
Abstract
Read-across (RA) and quantitative structure-activity relationship (QSAR) are two alternative methods commonly used to fill data gaps in chemical registrations. These approaches use physicochemical properties or molecular fingerprints of source substances to predict the properties of unknown substances that have similar chemical structures or physicochemical properties. Research on RA and QSAR is essential to minimize the time, money, and animal testing needed to determine biological properties that are not currently known. This study developed a stacked ensemble quantitative read-across structure-activity relationship algorithm (enQRASAR) for predicting skin irritation toxicity based on negative log cell viability inhibition concentration at 50% (pIC50) against skin keratinocytes as the end point. The goodness-of-fit and predictability of this algorithm were validated using leave-one-out cross-validation and external test data sets. The results obtained were statistically reliable in terms of goodness-of-fit, robustness, and predictability metrics. Additionally, the developed model demonstrated a low prediction error when predicting FDA-approved drugs. These results confirm that the enQRASAR algorithm can be used to predict skin cytotoxicity of chemicals. Therefore, this model was publicly available to further facilitate toxicity predictions of unknown compounds in chemical registrations.
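The end point named in the abstract above, pIC50, is the negative base-10 logarithm of the IC50 expressed in mol/L. A minimal sketch of that conversion follows; the assumption that input concentrations are given in nanomolar, and the function name `pic50_from_ic50_nm`, are illustrative choices and are not taken from the paper.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 mol/L, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


# A lower IC50 (more potent or more cytotoxic compound) gives a higher pIC50.
example_weak = pic50_from_ic50_nm(10_000.0)  # 10 µM
example_strong = pic50_from_ic50_nm(10.0)    # 10 nM
```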
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen 40000, Thailand
5
Ali S, Shaikh S, Ahmad K, Choi I. Identification of active compounds as novel dipeptidyl peptidase-4 inhibitors through machine learning and structure-based molecular docking simulations. J Biomol Struct Dyn 2023:1-10. [PMID: 38100571 DOI: 10.1080/07391102.2023.2292299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 11/23/2023] [Indexed: 12/17/2023]
Abstract
The enzyme dipeptidyl peptidase 4 (DPP4) is a potential therapeutic target for type 2 diabetes (T2DM). Many synthetic anti-DPP4 medications are available to treat T2DM. The need for secure and efficient medicines has been unmet due to the adverse side effects of existing DPP4 medications. The present study implemented a combined approach to machine learning and structure-based virtual screening to identify DPP4 inhibitors. Two ML models were trained based on DPP4 IC50 datasets. The ML models random forest (RF) and multilayer perceptron (MLP) neural network showed good accuracy, with the area under the curve being 0.93 and 0.91, respectively. The natural compound library was screened through ML models, and 1% (217) of compounds were selected for further screening. Structure-based virtual screening was performed along with positive control sitagliptin to obtain more specific and selective leads for DPP4. Based on binding affinity, drug-likeness properties, and interaction with DPP4, Z-614 and Z-997 compounds showed high binding affinity and specificity in the catalytic pocket of DPP4. Finally, the conformational stability of the DPP4 enzyme complex was checked by a molecular dynamics (MD) simulation. The MD simulation showed that both compounds bind better in the catalytic pocket, but the Z-614 compound altered the DPP4 native conformation. Therefore, Z-614 showed a high deviation in the backbone. This combined approach (ML and structure-based) study reported that Z-997 binds most stably to DPP4 in its catalytic pocket with a binding free energy of -70.3 kJ/mol, suggesting its therapeutic potential as a treatment option for T2DM disease. Communicated by Ramaswamy H. Sarma.
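The abstract above reports model quality as area under the curve (AUC). As a reminder of what that number measures, here is a minimal sketch of the rank-based ROC AUC: the probability that a randomly chosen active compound is scored higher than a randomly chosen inactive one, with ties counted as one half. The labels and scores are invented for illustration and have nothing to do with the DPP4 dataset; `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels: list, scores: list) -> float:
    """ROC AUC computed as the fraction of (active, inactive) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one active and one inactive example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # active ranked above inactive
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Invented toy data: 1 = active, 0 = inactive, with model scores.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of compounds, which is fine for an illustration; a sort-based rank computation would be the usual choice for large screening sets.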
Affiliation(s)
- Shahid Ali
  - Department of Medical Biotechnology, Yeungnam University, Gyeongsan, South Korea
  - Research Institute of Cell Culture, Yeungnam University, Gyeongsan, South Korea
- Sibhghatulla Shaikh
  - Department of Medical Biotechnology, Yeungnam University, Gyeongsan, South Korea
  - Research Institute of Cell Culture, Yeungnam University, Gyeongsan, South Korea
- Khurshid Ahmad
  - Department of Medical Biotechnology, Yeungnam University, Gyeongsan, South Korea
  - Research Institute of Cell Culture, Yeungnam University, Gyeongsan, South Korea
- Inho Choi
  - Department of Medical Biotechnology, Yeungnam University, Gyeongsan, South Korea
  - Research Institute of Cell Culture, Yeungnam University, Gyeongsan, South Korea
6
Srisongkram T, Syahid NF, Tookkane D, Weerapreeyakul N, Puthongking P. Stacked ensemble learning on HaCaT cytotoxicity for skin irritation prediction: A case study on dipterocarpol. Food Chem Toxicol 2023; 181:114115. [PMID: 37863382 DOI: 10.1016/j.fct.2023.114115] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/08/2023] [Revised: 09/26/2023] [Accepted: 10/17/2023] [Indexed: 10/22/2023]
Abstract
Skin irritation is an adverse effect associated with various substances, including chemicals, drugs, or natural products. Dipterocarpol, extracted from Dipterocarpus alatus, contains several skin benefits notably anticancer, wound healing, and antibacterial properties. However, the skin irritation of dipterocarpol remains unassessed. Quantitative structure-activity relationship (QSAR) is a recommended tool for toxicity assessment involving less time, money, and animal testing to access unavailable acute toxicity data. Therefore, our study aimed to develop a highly accurate machine learning-based QSAR model for predicting skin irritation. We utilized a stacked ensemble learning model with 1064 chemicals. We also adhered to the recommendations from the OECD for QSAR validation. Subsequently, we used the proposed model to explore the cytotoxicity of dipterocarpol on keratinocytes. Our findings indicate that the model displayed promising statistical quality in terms of accuracy, precision, and recall in both 10-fold cross-validation and test datasets. Moreover, the model predicted that dipterocarpol does not have skin irritation, which was confirmed by the cell-based assay. In conclusion, our proposed model can be applied for the risk assessment of skin irritation in untested compounds that fall within its applicability domain. The web application of this model is available at https://qsarlabs.com/#stackhacat.
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand; Human High Performance and Health Promotion Research Institute, Khon Kaen University, Khon Kaen, 40002, Thailand
- Nur Fadhilah Syahid
  - Graduate School in the Program of Pharmaceutical Chemistry and Natural Products, Pharmaceutical Sciences, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
- Dheerapat Tookkane
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
- Natthida Weerapreeyakul
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand; Human High Performance and Health Promotion Research Institute, Khon Kaen University, Khon Kaen, 40002, Thailand
- Ploenthip Puthongking
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
7
Kırboğa KK, Abbasi S, Küçüksille EU. Explainability and white box in drug discovery. Chem Biol Drug Des 2023; 102:217-233. [PMID: 37105727 DOI: 10.1111/cbdd.14262] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/12/2022] [Revised: 03/24/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023]
Abstract
Recently, artificial intelligence (AI) techniques have been increasingly used to overcome the challenges in drug discovery. Although traditional AI techniques generally have high accuracy rates, there may be difficulties in explaining the decision process and patterns. This can create difficulties in understanding and making sense of the outputs of algorithms used in drug discovery. Therefore, using explainable AI (XAI) techniques, the causes and consequences of the decision process are better understood. This can help further improve the drug discovery process and make the right decisions. To address this issue, XAI emerged as a process and method that securely captures the results and outputs of machine learning (ML) and deep learning (DL) algorithms. Using techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) has made the drug targeting phase clearer and more understandable. XAI methods are expected to reduce time and cost in future computational drug discovery studies. This review provides a comprehensive overview of XAI-based drug discovery and development prediction, including XAI mechanisms to increase confidence in AI and modeling methods. The limitations and future directions of XAI in drug discovery are also discussed.
Affiliation(s)
- Kevser Kübra Kırboğa
  - Bioengineering Department, Bilecik Seyh Edebali University, Bilecik, Turkey
  - Informatics Institute, Istanbul Technical University, Maslak, Turkey
- Sumra Abbasi
  - Department of Biological Sciences, National of Medical Sciences, Rawalpindi, Pakistan
- Ecir Uğur Küçüksille
  - Department of Computer Engineering, Süleyman Demirel University, Isparta, Turkey
8
Syahid NF, Weerapreeyakul N, Srisongkram T. StackBRAF: A Large-Scale Stacking Ensemble Learning for BRAF Affinity Prediction. ACS Omega 2023; 8:20881-20891. [PMID: 37332807 PMCID: PMC10268632 DOI: 10.1021/acsomega.3c01641] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/10/2023] [Accepted: 05/22/2023] [Indexed: 06/20/2023]
Abstract
The B-rapidly accelerated fibrosarcoma (BRAF) is a proto-oncogene that plays a vital role in cell signaling and growth regulation. Identifying a potent BRAF inhibitor can enhance therapeutic success in high-stage cancers, particularly metastatic melanoma. In this study, we proposed a stacking ensemble learning framework for the accurate prediction of BRAF inhibitors. We obtained 3857 curated molecules with BRAF inhibitory activity expressed as a predicted half-maximal inhibitory concentration value (pIC50) from the ChEMBL database. Twelve molecular fingerprints from PaDeL-Descriptor were calculated for model training. Three machine learning algorithms including extreme gradient boosting, support vector regression, and multilayer perceptron were utilized for constructing new predictive features (PFs). The meta-ensemble random forest regression, called StackBRAF, was created based on the 36 PFs. The StackBRAF model achieves lower mean absolute error (MAE) and higher coefficient of determination (R2 and Q2) than the individual baseline models. The stacking ensemble learning model provides good y-randomization results, indicating a strong correlation between molecular features and pIC50. An applicability domain of the model with an acceptable Tanimoto similarity score was also defined. Moreover, a large-scale high-throughput screening of 2123 FDA-approved drugs against the BRAF protein was successfully demonstrated using the StackBRAF algorithm. Thus, the StackBRAF model proved beneficial as a drug design algorithm for BRAF inhibitor drug discovery and drug development.
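The abstract above mentions an applicability domain defined through a Tanimoto similarity score. One common way to do this, sketched here under assumptions, is to accept a query compound only if its highest Tanimoto similarity to any training compound reaches a cutoff. The fingerprints are represented as sets of "on" bit indices, the 0.5 cutoff is an arbitrary placeholder (the abstract does not state the value used), and the function names are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def in_applicability_domain(query: frozenset, training: list, cutoff: float = 0.5) -> bool:
    """True if the query's nearest training neighbour is at least `cutoff` similar."""
    if not training:
        raise ValueError("training set must not be empty")
    return max(tanimoto(query, t) for t in training) >= cutoff


# Invented bit sets standing in for real molecular fingerprints.
training_set = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
similar_query = frozenset({1, 2, 3, 5})   # shares 3 of 5 distinct bits with the first entry
distant_query = frozenset({20, 21, 22})   # shares nothing with the training set
```

Predictions for compounds like `distant_query`, which fall outside the domain, would then be flagged as unreliable rather than reported alongside in-domain predictions.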
Affiliation(s)
- Nur Fadhilah Syahid
  - Graduate School in the Program of Pharmaceutical Chemistry and Natural Products, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
- Natthida Weerapreeyakul
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
  - Human High Performance and Health Promotion Research Institute, Khon Kaen University, Khon Kaen 40002, Thailand
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
  - Human High Performance and Health Promotion Research Institute, Khon Kaen University, Khon Kaen 40002, Thailand
9
Sellner MS, Mahmoud AH, Lill MA. Efficient virtual high-content screening using a distance-aware transformer model. J Cheminform 2023; 15:18. [PMID: 36755346 PMCID: PMC9906956 DOI: 10.1186/s13321-023-00686-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/19/2022] [Accepted: 01/22/2023] [Indexed: 02/10/2023] Open
Abstract
Molecular similarity search is an often-used method in drug discovery, especially in virtual screening studies. While simple one- or two-dimensional similarity metrics can be applied to search databases containing billions of molecules in a reasonable amount of time, this is not the case for complex three-dimensional methods. In this work, we trained a transformer model to autoencode tokenized SMILES strings using a custom loss function developed to conserve similarities in latent space. This allows the direct sampling of molecules in the generated latent space based on their Euclidean distance. Reducing the similarity between molecules to their Euclidean distance in latent space allows the model to perform independent of the similarity metric it was trained on. While we test the method here using 2D similarity as a proof-of-concept study, the algorithm will also enable high-content screening with time-consuming 3D similarity metrics. We show that the presence of a specific loss function for similarity conservation greatly improved the model's ability to predict highly similar molecules. When applying the model to a database containing 1.5 billion molecules, our model managed to reduce the relevant search space by 5 orders of magnitude. We also show that our model was able to generalize adequately when trained on a relatively small dataset of representative structures. The herein presented method thereby provides new means of substantially reducing the relevant search space in virtual screening approaches, thus highly increasing their throughput. Additionally, the distance awareness of the model causes the efficiency of this method to be independent of the underlying similarity metric.
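Once molecules are embedded so that similarity corresponds to Euclidean distance, as the abstract above describes, retrieving similar molecules reduces to a nearest-neighbour query in latent space. The sketch below shows only that retrieval step with a brute-force scan; the two-dimensional vectors and molecule names are invented, the encoder itself is not modelled, and `nearest_neighbors` is a hypothetical helper. A search over billions of molecules would need an indexed or approximate method instead.

```python
import heapq
import math


def nearest_neighbors(query: tuple, library: dict, k: int = 2) -> list:
    """Return the names of the k library entries closest to the query (Euclidean)."""
    closest = heapq.nsmallest(
        k,
        library.items(),
        key=lambda item: math.dist(query, item[1]),
    )
    return [name for name, _vector in closest]


# Invented latent vectors; a real model would produce much higher-dimensional ones.
latent_library = {
    "mol_a": (0.0, 0.0),
    "mol_b": (1.0, 1.0),
    "mol_c": (5.0, 5.0),
}
query_vector = (0.9, 1.2)
hits = nearest_neighbors(query_vector, latent_library, k=2)
```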
Affiliation(s)
- Manuel S. Sellner
  - Department of Pharmaceutical Sciences, University of Basel, Basel, Switzerland (GRID: grid.6612.3; ISNI: 0000 0004 1937 0642)
- Amr H. Mahmoud
  - Department of Pharmaceutical Sciences, University of Basel, Basel, Switzerland (GRID: grid.6612.3; ISNI: 0000 0004 1937 0642)
- Markus A. Lill
  - Department of Pharmaceutical Sciences, University of Basel, Basel, Switzerland (GRID: grid.6612.3; ISNI: 0000 0004 1937 0642)
10
Cai X, Orsi M, Capecchi A, Köhler T, van Delden C, Javor S, Reymond JL. An intrinsically disordered antimicrobial peptide dendrimer from stereorandomized virtual screening. Cell Reports Physical Science 2022; 3:101161. [PMID: 36632208 PMCID: PMC9780108 DOI: 10.1016/j.xcrp.2022.101161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2022] [Revised: 10/21/2022] [Accepted: 11/02/2022] [Indexed: 06/17/2023]
Abstract
Membrane-disruptive amphiphilic antimicrobial peptides behave as intrinsically disordered proteins by being unordered in water and becoming α-helical in contact with biological membranes. We recently discovered that synthesizing the α-helical antimicrobial peptide dendrimer L-T25 ((KL)8(KKL)4(KLL)2 KKLL) using racemic amino acids to form stereorandomized sr-T25, an analytically pure mixture of all possible diastereoisomers of L-T25, preserved antibacterial activity but abolished hemolysis and cytotoxicity, pointing to an intrinsically disordered antibacterial conformation and an α-helical cytotoxic conformation. In this study, to identify non-toxic intrinsically disordered homochiral antimicrobial peptide dendrimers (AMPDs), we surveyed sixty-three sr-analogs of sr-T25 selected by virtual screening. One of the analogs, sr-X18 ((KL)8(KLK)4(KLL)2 KLLL), lost antibacterial activity as L-enantiomer and became hemolytic due to α-helical folding. By contrast, the L- and D-enantiomers of sr-X22 ((KL)8(KL)4(KKLL)2 KLKK) were equally antibacterial, non-hemolytic, and non-toxic, implying an intrinsically disordered bioactive conformation. Screening stereorandomized libraries may be generally useful to identify or optimize intrinsically disordered bioactive peptides.
Affiliation(s)
- Xingguang Cai
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Alice Capecchi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Thilo Köhler
  - Department of Microbiology and Molecular Medicine, University of Geneva, Service of Infectious Diseases, University Hospital of Geneva, Geneva, Switzerland
- Christian van Delden
  - Department of Microbiology and Molecular Medicine, University of Geneva, Service of Infectious Diseases, University Hospital of Geneva, Geneva, Switzerland
- Sacha Javor
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
11
Sajjan M, Li J, Selvarajan R, Sureshbabu SH, Kale SS, Gupta R, Singh V, Kais S. Quantum machine learning for chemistry and physics. Chem Soc Rev 2022; 51:6475-6573. [PMID: 35849066 DOI: 10.1039/d2cs00203e] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 12/12/2022]
Abstract
Machine learning (ML) has emerged as a formidable force for identifying hidden but pertinent patterns within a given data set with the objective of subsequent generation of automated predictive behavior. In recent years, it is safe to conclude that ML and its close cousin, deep learning (DL), have ushered in unprecedented developments in all areas of physical sciences, especially chemistry. Not only classical variants of ML but also those trainable on near-term quantum hardware have been developed with promising outcomes. Such algorithms have revolutionized materials design and performance of photovoltaics, electronic structure calculations of ground and excited states of correlated matter, computation of force-fields and potential energy surfaces informing chemical reaction dynamics, reactivity inspired rational strategies of drug designing and even classification of phases of matter with accurate identification of emergent criticality. In this review we shall explicate a subset of such topics and delineate the contributions made by both classical and quantum computing enhanced machine learning algorithms over the past few years. We shall not only present a brief overview of the well-known techniques but also highlight their learning strategies using statistical physical insight. The objective of the review is not only to foster exposition of the aforesaid techniques but also to empower and promote cross-pollination among future research in all areas of chemistry which can benefit from ML and in turn can potentially accelerate the growth of such algorithms.
Affiliation(s)
- Manas Sajjan
  - Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
- Junxu Li
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
  - Department of Physics and Astronomy, Purdue University, West Lafayette, IN-47907, USA
- Raja Selvarajan
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
  - Department of Physics and Astronomy, Purdue University, West Lafayette, IN-47907, USA
- Shree Hari Sureshbabu
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
  - Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN-47907, USA
- Sumit Suresh Kale
  - Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
- Rishabh Gupta
  - Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
- Vinit Singh
  - Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
- Sabre Kais
  - Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA
  - Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
  - Department of Physics and Astronomy, Purdue University, West Lafayette, IN-47907, USA
  - Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN-47907, USA
12
Janela T, Takeuchi K, Bajorath J. Introducing a Chemically Intuitive Core-Substituent Fingerprint Designed to Explore Structural Requirements for Effective Similarity Searching and Machine Learning. Molecules (Basel, Switzerland) 2022; 27:2331. [PMID: 35408730 PMCID: PMC9000322 DOI: 10.3390/molecules27072331] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/14/2022] [Revised: 03/29/2022] [Accepted: 04/01/2022] [Indexed: 11/16/2022]
Abstract
Fingerprint (FP) representations of chemical structure continue to be one of the most widely used types of molecular descriptors in chemoinformatics and computational medicinal chemistry. One often distinguishes between two- and three-dimensional (2D and 3D) FPs depending on whether they are derived from molecular graphs or conformations, respectively. Primary application areas for FPs include similarity searching and compound classification via machine learning, especially for hit identification. For these applications, 2D FPs are particularly popular, given their robustness and for the most part comparable (or better) performance to 3D FPs. While a variety of FP prototypes has been designed and evaluated during earlier times of chemoinformatics research, new developments have been rare over the past decade. At least in part, this has been due to the situation that topological (atom environment) FPs derived from molecular graphs have evolved as a gold standard in the field. We were interested in exploring the question of whether the amount of structural information captured by state-of-the-art 2D FPs is indeed required for effective similarity searching and compound classification or whether accounting for fewer structural features might be sufficient. Therefore, pursuing a "structural minimalist" approach, we designed and implemented a new 2D FP based upon ring and substituent fragments obtained by systematically decomposing large numbers of compounds from medicinal chemistry. The resulting FP termed core-substituent FP (CSFP) captures much smaller numbers of structural features than state-of-the-art 2D FPs. However, CSFP achieves high performance in similarity searching and machine learning, demonstrating that less structural information is required for establishing molecular similarity relationships than is often believed. Given its high performance and chemical tangibility, CSFP is also relevant for practical applications in medicinal chemistry.
|
13
|
Zhu D, Johannsen S, Masini T, Simonin C, Haupenthal J, Illarionov B, Andreas A, Awale M, Gierse RM, van der Laan T, van der Vlag R, Nasti R, Poizat M, Buhler E, Reiling N, Müller R, Fischer M, Reymond JL, Hirsch AKH. Discovery of novel drug-like antitubercular hits targeting the MEP pathway enzyme DXPS by strategic application of ligand-based virtual screening. Chem Sci 2022; 13:10686-10698. [PMID: 36320685 PMCID: PMC9491098 DOI: 10.1039/d2sc02371g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 08/07/2022] [Indexed: 12/04/2022] [Open Access]
Abstract
In the present manuscript, we describe how we successfully used ligand-based virtual screening (LBVS) to identify two small-molecule, drug-like hit classes with excellent ADMET profiles against the difficult to address microbial enzyme 1-deoxy-d-xylulose-5-phosphate synthase (DXPS). In the fight against antimicrobial resistance (AMR), it has become increasingly important to address novel targets such as DXPS, the first enzyme of the 2-C-methyl-d-erythritol-4-phosphate (MEP) pathway, which affords the universal isoprenoid precursors. This pathway is absent in humans but essential for pathogens such as Mycobacterium tuberculosis, making it a rich source of drug targets for the development of novel anti-infectives. Standard computer-aided drug-design tools, frequently applied in other areas of drug development, often fail for targets with large, hydrophilic binding sites such as DXPS. Therefore, we introduce the concept of pseudo-inhibitors, combining the benefits of pseudo-ligands (defining a pharmacophore) and pseudo-receptors (defining anchor points in the binding site), for providing the basis to perform a LBVS against M. tuberculosis DXPS. Starting from a diverse set of reference ligands showing weak inhibition of the orthologue from Deinococcus radiodurans DXPS, we identified three structurally unrelated classes with promising in vitro (against M. tuberculosis DXPS) and whole-cell activity including extensively drug-resistant strains of M. tuberculosis. The hits were validated to be specific inhibitors of DXPS and to have a unique mechanism of inhibition. Furthermore, two of the hits have a balanced profile in terms of metabolic and plasma stability and display a low frequency of resistance development, making them ideal starting points for hit-to-lead optimization of antibiotics with an unprecedented mode of action. 
We identified two drug-like antitubercular hits with submicromolar inhibition constants against the target 1-deoxy-d-xylulose-5-phosphate synthase (DXPS) with a new mode of action and promising activity against drug-resistant tuberculosis.
Affiliation(s)
- Di Zhu
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
  - Department of Pharmacy, Saarland University, Campus Building E8.1, 66123 Saarbrücken, Germany
  - Stratingh Institute for Chemistry, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Sandra Johannsen
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
  - Department of Pharmacy, Saarland University, Campus Building E8.1, 66123 Saarbrücken, Germany
- Tiziana Masini
  - Stratingh Institute for Chemistry, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Céline Simonin
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jörg Haupenthal
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
- Boris Illarionov
  - Hamburg School of Food Science, Institute of Food Chemistry, Grindelallee 117, 20146 Hamburg, Germany
- Anastasia Andreas
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
  - Department of Pharmacy, Saarland University, Campus Building E8.1, 66123 Saarbrücken, Germany
- Mahendra Awale
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Robin M Gierse
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
  - Department of Pharmacy, Saarland University, Campus Building E8.1, 66123 Saarbrücken, Germany
  - Stratingh Institute for Chemistry, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Tridia van der Laan
  - Department of Mycobacteria, National Institute of Public Health and the Environment (RIVM), Diagnostics and Laboratory Surveillance (IDS) Infectious Diseases Research, Antonie van Leeuwenhoeklaan 9, 3721 MA Bilthoven, The Netherlands
- Ramon van der Vlag
  - Stratingh Institute for Chemistry, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Rita Nasti
  - Stratingh Institute for Chemistry, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Mael Poizat
  - Symeres, Kadijk 3, 9747 AT Groningen, The Netherlands
- Eric Buhler
  - Laboratoire Matière et Systèmes Complexes (MSC), UMR CNRS 7057, Université Paris Cité, Bâtiment Condorcet, 75205 Paris Cedex 13, France
- Norbert Reiling
  - RG Microbial Interface Biology, Research Center Borstel, Leibniz Lung Center, Borstel, Germany
  - German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Borstel, Germany
- Rolf Müller
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
  - Department of Pharmacy, Saarland University, Campus Building E8.1, 66123 Saarbrücken, Germany
  - Helmholtz International Lab for Anti-infectives, Campus Building E8.1, 66123 Saarbrücken, Germany
- Markus Fischer
  - Hamburg School of Food Science, Institute of Food Chemistry, Grindelallee 117, 20146 Hamburg, Germany
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Anna K H Hirsch
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123 Saarbrücken, Germany
  - Department of Pharmacy, Saarland University, Campus Building E8.1, 66123 Saarbrücken, Germany
  - Stratingh Institute for Chemistry, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
  - Helmholtz International Lab for Anti-infectives, Campus Building E8.1, 66123 Saarbrücken, Germany
|
14
|
Sun H, Wang Y, Chen CZ, Xu M, Guo H, Itkin M, Zheng W, Shen M. Identification of SARS-CoV-2 viral entry inhibitors using machine learning and cell-based pseudotyped particle assay. Bioorg Med Chem 2021; 38:116119. [PMID: 33831697 PMCID: PMC7997310 DOI: 10.1016/j.bmc.2021.116119] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/08/2021] [Revised: 03/17/2021] [Accepted: 03/19/2021] [Indexed: 11/26/2022]
Abstract
In response to the pandemic caused by SARS-CoV-2, we constructed a hybrid support vector machine (SVM) classification model using publicly posted repurposing screen data from a SARS-CoV-2 pseudotyped particle (PP) entry assay, with the aim of identifying novel potent compounds as starting points for drug development to treat COVID-19 patients. Two different molecular descriptor systems, atom typing descriptors and 3D fingerprints (FPs), were employed to construct the SVM classification models. Both models achieved reasonable performance, with areas under the receiver operating characteristic curve (AUC-ROC) of 0.84 and 0.82, respectively. The consensus prediction, from which compounds with inconsistent classifications were excluded, outperformed the two individual models with a significantly improved AUC-ROC of 0.91. The consensus model was then used to screen the 173,898 compounds in the NCATS annotated and diverse chemical libraries. Of the 255 compounds selected for experimental confirmation, 116 exhibited inhibitory activity in the SARS-CoV-2 PP entry assay, with IC50 values ranging between 0.17 µM and 62.2 µM, representing an enrichment factor of 3.2. These 116 active compounds with diverse and novel structures could potentially serve as starting points for chemistry optimization in COVID-19 drug discovery.
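The hit rate behind the reported enrichment factor can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the definition of the enrichment factor as the ratio of the hit rate among selected compounds to a baseline hit rate is an assumption, since the abstract does not state which baseline was used, and the back-calculated baseline is therefore only indicative.

```python
# Figures quoted in the abstract.
n_tested = 255      # compounds selected for experimental confirmation
n_active = 116      # compounds confirmed active in the PP entry assay
reported_ef = 3.2   # enrichment factor stated in the abstract

# Hit rate among the compounds selected by the consensus model.
hit_rate = n_active / n_tested

# Assumption: EF = hit_rate / baseline_hit_rate, so the baseline
# can be back-calculated from the two reported quantities.
implied_baseline = hit_rate / reported_ef

print(f"hit rate among selected compounds: {hit_rate:.1%}")   # 45.5%
print(f"implied baseline hit rate: {implied_baseline:.1%}")   # 14.2%
```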
Affiliation(s)
- Hongmao Sun
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Yuhong Wang
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Catherine Z Chen
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Miao Xu
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Hui Guo
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Misha Itkin
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Wei Zheng
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
- Min Shen
  - National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Dr., Rockville, MD 20850, USA
|
15
|
Abstract
Molecular descriptors encode a variety of molecular representations for computer-assisted drug discovery. Here, we focus on the Weighted Holistic Atom Localization and Entity Shape (WHALES) descriptors, which were originally designed for scaffold hopping from natural products to synthetic molecules. WHALES descriptors capture molecular shape and partial charges simultaneously. We introduce the key aspects of the WHALES concept and provide a step-by-step guide on how to use these descriptors for virtual compound screening and scaffold hopping. The results presented can be reproduced using the code freely available at github.com/ETHmodlab/scaffold_hopping_whales.
Affiliation(s)
- Francesca Grisoni
  - Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Zurich, Switzerland
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Zurich, Switzerland
|
16
|
Jiménez-Luna J, Grisoni F, Schneider G. Drug discovery with explainable artificial intelligence. Nat Mach Intell 2020. [DOI: 10.1038/s42256-020-00236-4] [Citation(s) in RCA: 152] [Impact Index Per Article: 38.0] [Indexed: 12/17/2022]
|
17
|
Wang Y, Hu J, Lai J, Li Y, Jin H, Zhang L, Zhang LR, Liu ZM. TF3P: Three-Dimensional Force Fields Fingerprint Learned by Deep Capsular Network. J Chem Inf Model 2020; 60:2754-2765. [PMID: 32392062 DOI: 10.1021/acs.jcim.0c00005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/19/2023]
Abstract
Molecular fingerprints are the workhorse of ligand-based drug discovery. In recent years, an increasing number of research papers have reported fascinating results on using deep neural networks to learn 2D molecular representations as fingerprints. It is anticipated that the integration of deep learning will also advance 3D fingerprints. Here, we introduce deep learning into 3D small-molecule fingerprints for the first time, presenting a new fingerprint termed the three-dimensional force fields fingerprint (TF3P). TF3P is learned by a deep capsular network whose training requires no labeled data sets for specific predictive tasks. TF3P encodes the 3D force field information of molecules and demonstrates a stronger ability to capture 3D structural changes, to recognize molecules that are alike in 3D but not in 2D, and to identify similar targets that are inaccessible to other 2D or 3D fingerprints when only ligand similarity is used. Furthermore, TF3P is compatible with both statistical models (e.g., similarity ensemble approach) and machine learning models. Altogether, we report TF3P as a new 3D small-molecule fingerprint with a promising future in ligand-based drug discovery. All code is written in Python and available at https://github.com/canisw/tf3p.
Affiliation(s)
- Yanxing Wang
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Jianxing Hu
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Junyong Lai
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Yibo Li
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100191, P. R. China
- Hongwei Jin
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Lihe Zhang
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Liang-Ren Zhang
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Zhen-Ming Liu
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
|
18
|
One molecular fingerprint to rule them all: drugs, biomolecules, and the metabolome. J Cheminform 2020; 12:43. [PMID: 33431010 PMCID: PMC7291580 DOI: 10.1186/s13321-020-00445-4] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 03/18/2020] [Accepted: 06/04/2020] [Indexed: 02/08/2023] [Open Access]
Abstract
Background: Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules. Results: Here we set out to design a new fingerprint suitable for both small and large molecules by combining substructure and atom-pair concepts. Our quest resulted in a new fingerprint called MinHashed atom-pair fingerprint up to a diameter of four bonds (MAP4). In this fingerprint the circular substructures with radii of r = 1 and r = 2 bonds around each atom in an atom-pair are written as two pairs of SMILES, each pair being combined with the topological distance separating the two central atoms. These so-called atom-pair molecular shingles are hashed, and the resulting set of hashes is MinHashed to form the MAP4 fingerprint. MAP4 significantly outperforms all other fingerprints on an extended benchmark that combines the Riniker and Landrum small molecule benchmark with a peptide benchmark recovering BLAST analogs from either scrambled or point mutation analogs. MAP4 furthermore produces well-organized chemical space tree-maps (TMAPs) for databases as diverse as DrugBank, ChEMBL, SwissProt and the Human Metabolome Database (HMDB), and differentiates between all metabolites in HMDB, over 70% of which are indistinguishable from their nearest neighbor using substructure fingerprints. Conclusion: MAP4 is a new molecular fingerprint suitable for drugs, biomolecules, and the metabolome and can be adopted as a universal fingerprint to describe and search chemical space.
The source code is available at https://github.com/reymond-group/map4 and interactive MAP4 similarity search tools and TMAPs for various databases are accessible at http://map-search.gdb.tools/ and http://tm.gdb.tools/map4/.
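The hashing and MinHashing of atom-pair shingles described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (see their repository for that): the shingle strings below are hand-written, hypothetical stand-ins in a `SMILES|distance|SMILES` layout rather than output of the MAP4 code, and the salted SHA-1 hash family and the function names `shingle_hash`, `minhash` and `estimated_jaccard` are assumptions made for illustration.

```python
import hashlib
from typing import List, Set


def shingle_hash(shingle: str, seed: int) -> int:
    """Hash one shingle with a seed-specific salt (a stand-in hash family)."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash(shingles: Set[str], n_hashes: int = 128) -> List[int]:
    """Keep, for each of n_hashes salted hash functions, the minimum hash value."""
    if not shingles:
        raise ValueError("cannot MinHash an empty shingle set")
    return [min(shingle_hash(s, seed) for s in shingles) for seed in range(n_hashes)]


def estimated_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching signature positions, which estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical shingle sets; mol_a and mol_b share 3 of 5 distinct shingles.
mol_a = {"CC|1|CO", "CC|2|C=O", "CO|1|C=O", "c1ccccc1|3|CO"}
mol_b = {"CC|1|CO", "CC|2|C=O", "CO|1|C=O", "c1ccccc1|4|CN"}
mol_c = {"CN|1|CS", "CS|2|CCl"}  # shares nothing with mol_a

sig_a, sig_b, sig_c = minhash(mol_a), minhash(mol_b), minhash(mol_c)
print(estimated_jaccard(sig_a, sig_b))  # an estimate of the exact value 3/5
print(estimated_jaccard(sig_a, sig_c))
```

The signature has a fixed length regardless of how many shingles a molecule produces, which is what lets MinHash-based fingerprints compare small molecules and large peptides on the same footing.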
|
19
|
Probst D, Reymond JL. Visualization of very large high-dimensional data sets as minimum spanning trees. J Cheminform 2020; 12:12. [PMID: 33431043 PMCID: PMC7015965 DOI: 10.1186/s13321-020-0416-x] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 01/09/2020] [Accepted: 02/04/2020] [Indexed: 01/10/2023] [Open Access]
Abstract
The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree (http://tmap.gdb.tools). Visualizations based on TMAP are better suited than t-SNE or UMAP for the exploration and interpretation of large data sets due to their tree-like nature, increased local and global neighborhood and structure preservation, and the transparency of the methods the algorithm is based on. We apply TMAP to the most used chemistry data sets including databases of molecules such as ChEMBL, FDB17, the Natural Products Atlas, DSSTox, as well as to the MoleculeNet benchmark collection of data sets. We also show its broad applicability with further examples from biology, particle physics, and literature.
Affiliation(s)
- Daniel Probst
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
20
|
Bühlmann S, Reymond JL. ChEMBL-Likeness Score and Database GDBChEMBL. Front Chem 2020; 8:46. [PMID: 32117874 PMCID: PMC7010641 DOI: 10.3389/fchem.2020.00046] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/25/2019] [Accepted: 01/15/2020] [Indexed: 01/02/2023] [Open Access]
Abstract
The generated database GDB17 enumerates 166.4 billion molecules up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility. However, most molecules in GDB17 are too complex to be considered for chemical synthesis. To address this limitation, we report GDBChEMBL as a subset of GDB17 featuring 10 million molecules selected according to a ChEMBL-likeness score (CLscore) calculated from the frequency of occurrence of circular substructures in ChEMBL, followed by uniform sampling across molecular size, stereocenters and heteroatoms. Compared to the previously reported subsets FDB17 and GDBMedChem, selected from GDB17 by fragment-likeness and medicinal chemistry criteria, respectively, our new subset features molecules with higher synthetic accessibility and possibly bioactivity yet retains a broad and continuous coverage of chemical space typical of the entire GDB17. GDBChEMBL is accessible at http://gdb.unibe.ch for download and for browsing using an interactive chemical space map at http://faerun.gdb.tools.
Affiliation(s)
- Sven Bühlmann
  - Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
|
21
|
Capecchi A, Zhang A, Reymond JL. Populating Chemical Space with Peptides Using a Genetic Algorithm. J Chem Inf Model 2020; 60:121-132. [PMID: 31868369 DOI: 10.1021/acs.jcim.9b01014] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/13/2022]
Abstract
In drug discovery, one uses chemical space as a concept to organize molecules according to their structures and properties. One often would like to generate new possible molecules at a specific location in the chemical space marked by a molecule of interest. Herein, we report the peptide design genetic algorithm (PDGA, code available at https://github.com/reymond-group/PeptideDesignGA), a computational tool capable of producing peptide sequences of various topologies (linear, cyclic/polycyclic, or dendritic) in proximity of any molecule of interest in a chemical space defined by macromolecule extended atom-pair fingerprint (MXFP), an atom-pair fingerprint describing molecular shape and pharmacophores. We show that the PDGA generates high-similarity analogues of bioactive peptides with diverse peptide chain topologies and of nonpeptide target molecules. We illustrate the chemical space accessible by the PDGA with an interactive 3D map of the MXFP property space available at http://faerun.gdb.tools/. The PDGA should be generally useful to generate peptides at any location in the chemical space.
Affiliation(s)
- Alice Capecchi
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Alain Zhang
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
22
|
Maltarollo VG. Classification of Staphylococcus Aureus FabI Inhibitors by Machine Learning Techniques. International Journal of Quantitative Structure-Property Relationships 2019. [DOI: 10.4018/ijqspr.2019100101] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/09/2022]
Abstract
Enoyl-acyl carrier protein reductase (FabI) is a key enzyme in the fatty acid metabolism of gram-positive bacteria and is considered a potential target for the development of new antibacterial drugs. Indeed, triclosan is a widely employed antibacterial and AFN-1252 is currently in phase II clinical trials; both are known FabI inhibitors. There is an urgent need for new drug discovery due to increasing antibacterial resistance. In the present study, classification models using machine learning techniques were generated to successfully distinguish SaFabI inhibitors from non-inhibitors (e.g., Matthews correlation coefficient values of 0.837 and 0.789 calculated with internal and external validations). The interpretation of a selected model indicates that larger compound size, the number of N atoms and the distance between the central amide and the naphthyridinone ring are important to biological activity, corroborating previous studies. Therefore, the information obtained and the models generated can be useful for the design and discovery of novel bioactive ligands as potential antibacterial agents.
|
23
|
Toxicity Prediction Method Based on Multi-Channel Convolutional Neural Network. Molecules 2019; 24:3383. [PMID: 31533341 PMCID: PMC6766985 DOI: 10.3390/molecules24183383] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/15/2019] [Revised: 09/03/2019] [Accepted: 09/13/2019] [Indexed: 02/08/2023] [Open Access]
Abstract
Molecular toxicity prediction is one of the key tasks in drug design. In this paper, a deep learning network based on a two-dimensional grid of molecules is proposed to predict toxicity. First, van der Waals forces and hydrogen bonds were calculated according to different molecular descriptors, and multi-channel grids were generated, which capture more detailed and useful molecular information for toxicity prediction. The generated grids were fed into a convolutional neural network to obtain the result. The Tox21 dataset, which contains more than 12,000 molecules, was used for the evaluation. The experiments show that the proposed method performs better than other traditional deep learning and machine learning methods.
|
24
|
Awale M, Sirockin F, Stiefl N, Reymond JL. Drug Analogs from Fragment-Based Long Short-Term Memory Generative Neural Networks. J Chem Inf Model 2019; 59:1347-1356. [DOI: 10.1021/acs.jcim.8b00902] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 01/10/2023]
Affiliation(s)
- Mahendra Awale
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Finton Sirockin
  - Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
25
|
Delalande C, Awale M, Rubin M, Probst D, Ozhathil LC, Gertsch J, Abriel H, Reymond JL. Optimizing TRPM4 inhibitors in the MHFP6 chemical space. Eur J Med Chem 2019; 166:167-177. [DOI: 10.1016/j.ejmech.2019.01.048] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/24/2018] [Revised: 12/18/2018] [Accepted: 01/19/2019] [Indexed: 12/12/2022]
|
26
|
Abstract
Drug promiscuity or polypharmacology is the ability of small molecules to interact with multiple protein targets simultaneously. In drug discovery, understanding the polypharmacology of potential drug molecules is crucial to improve their efficacy and safety, and to discover the new therapeutic potentials of existing drugs. Over the past decade, several computational methods have been developed to study the polypharmacology of small molecules, many of which are available as Web services. In this chapter, we review some of these Web tools focusing on ligand based approaches. We highlight in particular our recently developed polypharmacology browser (PPB) and its application for finding the side targets of a new inhibitor of the TRPV6 calcium channel.
Affiliation(s)
- Mahendra Awale
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Berne, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Berne, Switzerland
|
27
|
Manganese coordination compounds of mefenamic acid: In vitro screening and in silico prediction of biological activity. J Inorg Biochem 2019; 190:1-14. [DOI: 10.1016/j.jinorgbio.2018.09.017] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/29/2018] [Revised: 09/14/2018] [Accepted: 09/26/2018] [Indexed: 02/07/2023]
|
28
|
Awale M, Reymond JL. Polypharmacology Browser PPB2: Target Prediction Combining Nearest Neighbors with Machine Learning. J Chem Inf Model 2018; 59:10-17. [PMID: 30558418 DOI: 10.1021/acs.jcim.8b00524] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Indexed: 02/06/2023]
Abstract
Here we report PPB2 as a target prediction tool assigning targets to a query molecule based on ChEMBL data. PPB2 computes ligand similarities using molecular fingerprints encoding composition (MQN), molecular shape and pharmacophores (Xfp), and substructures (ECfp4) and features an unprecedented combination of nearest neighbor (NN) searches and Naïve Bayes (NB) machine learning, together with simple NN searches, NB and Deep Neural Network (DNN) machine learning models as further options. Although NN(ECfp4) gives the best results in terms of recall in a 10-fold cross-validation study, combining NN searches with NB machine learning provides superior precision statistics, as well as better results in a case study predicting off-targets of a recently reported TRPV6 calcium channel inhibitor, illustrating the value of this combined approach. PPB2 is available to assess possible off-targets of small molecule drug-like compounds by public access at http://gdb.unibe.ch.
Affiliation(s)
- Mahendra Awale
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
29
|
Probst D, Reymond JL. A probabilistic molecular fingerprint for big data settings. J Cheminform 2018; 10:66. [PMID: 30564943 PMCID: PMC6755601 DOI: 10.1186/s13321-018-0321-8] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 10/06/2018] [Accepted: 12/13/2018] [Indexed: 11/10/2022] [Open Access]
Abstract
Background: Among the various molecular fingerprints available to describe small organic molecules, the extended connectivity fingerprint up to four bonds (ECFP4) performs best in benchmarking drug analog recovery studies, as it encodes substructures with a high level of detail. Unfortunately, ECFP4 requires high-dimensional representations (≥ 1024D) to perform well, which causes ECFP4 nearest neighbor searches in very large databases such as GDB, PubChem or ZINC to run very slowly due to the curse of dimensionality. Results: Herein we report a new fingerprint, called MinHash fingerprint up to six bonds (MHFP6), which encodes detailed substructures using the extended connectivity principle of ECFP in a fundamentally different manner, increasing the performance of exact nearest neighbor searches in benchmarking studies and enabling the application of locality sensitive hashing (LSH) approximate nearest neighbor search algorithms. To describe a molecule, MHFP6 extracts the SMILES of all circular substructures around each atom up to a diameter of six bonds and applies the MinHash method to the resulting set. MHFP6 outperforms ECFP4 in benchmarking analog recovery studies. By leveraging locality sensitive hashing, LSH approximate nearest neighbor search methods perform as well on unfolded MHFP6 as comparable methods do on folded ECFP4 fingerprints in terms of speed and relative recovery rate, while operating in very sparse and high-dimensional binary chemical space. Conclusion: MHFP6 is a new molecular fingerprint, encoding circular substructures, which outperforms ECFP4 for analog searches while allowing the direct application of locality sensitive hashing algorithms. It should be well suited for the analysis of large databases.
The source code for MHFP6 is available on GitHub (https://github.com/reymond-group/mhfp).![]() Electronic supplementary material The online version of this article (10.1186/s13321-018-0321-8) contains supplementary material, which is available to authorized users.
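The MinHash step described in this abstract can be illustrated with a short sketch. This is not the MHFP6 implementation: the substructure SMILES below are hand-written placeholders (a real pipeline would extract them with a cheminformatics toolkit), and the seeded BLAKE2b hashing and the signature length of 128 are assumptions chosen to keep the example self-contained.

```python
import hashlib


def minhash(shingles, n_perm=128):
    """Return a MinHash signature for a non-empty collection of strings.

    Each of the n_perm positions holds the minimum 64-bit hash of any
    string under a differently salted hash function (an assumed scheme).
    """
    signature = []
    for seed in range(n_perm):
        salt = seed.to_bytes(4, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for s in shingles
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity of the sets."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical substructure SMILES sets for two molecules (placeholders only).
mol_a = {"CC", "CCO", "C(C)O", "CO"}
mol_b = {"CC", "CCN", "C(C)N", "CN"}

sig_a = minhash(mol_a)
sig_b = minhash(mol_b)
# The two sets share 1 of 7 distinct strings, so the true Jaccard value is 1/7;
# the printed number is a noisy estimate of it.
print(round(estimated_jaccard(sig_a, sig_b), 2))
```

Because signatures have a fixed length regardless of how many substructures a molecule contains, they can be compared position by position, which is the property that locality sensitive hashing schemes build on.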
Affiliation(s)
- Daniel Probst
- Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012, Bern, Switzerland.
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012, Bern, Switzerland
|
30
|
Poirier M, Awale M, Roelli MA, Giuffredi GT, Ruddigkeit L, Evensen L, Stooss A, Calarco S, Lorens JB, Charles RP, Reymond JL. Identifying Lysophosphatidic Acid Acyltransferase β (LPAAT-β) as the Target of a Nanomolar Angiogenesis Inhibitor from a Phenotypic Screen Using the Polypharmacology Browser PPB2. ChemMedChem 2018; 14:224-236. [PMID: 30520265 DOI: 10.1002/cmdc.201800554] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/16/2018] [Indexed: 12/11/2022]
Abstract
By screening a focused library of kinase inhibitor analogues in a phenotypic co-culture assay for angiogenesis inhibition, we identified an aminotriazine that acts as a cytostatic nanomolar inhibitor. However, this aminotriazine was found to be completely inactive in a whole-kinome profiling assay. To decipher its mechanism of action, we used the online target prediction tool PPB2 (http://ppb2.gdb.tools), which suggested lysophosphatidic acid acyltransferase β (LPAAT-β) as a possible target for this aminotriazine as well as several analogues identified by structure-activity relationship profiling. LPAAT-β inhibition (IC50 ≈ 15 nM) was confirmed in a biochemical assay and by its effects on cell proliferation in comparison with a known LPAAT-β inhibitor. These experiments illustrate the value of target-prediction tools to guide target identification for phenotypic screening hits and significantly expand the rather limited pharmacology of LPAAT-β inhibitors.
Affiliation(s)
- Marion Poirier
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Matthias A Roelli
- Institute of Biochemistry and Molecular Medicine, National Center of Competence in Research NCCR TransCure, University of Bern, Bühlstrasse 28, 3000, Bern 9, Switzerland
| | - Guy T Giuffredi
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Lars Ruddigkeit
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Lasse Evensen
- Department of Biomedicine, Centre for Cancer Biomarkers (CCBIO), University of Bergen, Jonas Lies vei 91, 5009, Bergen, Norway
| | - Amandine Stooss
- Institute of Biochemistry and Molecular Medicine, National Center of Competence in Research NCCR TransCure, University of Bern, Bühlstrasse 28, 3000, Bern 9, Switzerland
| | - Serafina Calarco
- Institute of Biochemistry and Molecular Medicine, National Center of Competence in Research NCCR TransCure, University of Bern, Bühlstrasse 28, 3000, Bern 9, Switzerland
| | - James B Lorens
- Department of Biomedicine, Centre for Cancer Biomarkers (CCBIO), University of Bergen, Jonas Lies vei 91, 5009, Bergen, Norway
| | - Roch-Philippe Charles
- Institute of Biochemistry and Molecular Medicine, National Center of Competence in Research NCCR TransCure, University of Bern, Bühlstrasse 28, 3000, Bern 9, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
|
31
|
Abstract
The recent general availability of low-cost virtual reality headsets and accompanying three-dimensional (3D) engine support presents an opportunity to bring the concept of chemical space into virtual environments. While virtual reality applications represent a category of widespread tools in other fields, their use in the visualization and exploration of abstract data such as chemical spaces has been experimental. In our previous work, we established the concept of interactive two-dimensional (2D) maps of chemical spaces followed by interactive web-based 3D visualizations, culminating in the interactive web-based 3D visualization of extremely large chemical spaces. Virtual reality chemical spaces are a natural extension of these concepts. As 2D and 3D embeddings and projections of high-dimensional chemical fingerprint spaces have been shown to be valuable tools in chemical space visualization and exploration, existing pipelines of data mining and preparation can be extended to be used in virtual reality applications. Here we present an application based on the Unity engine and the Virtual Reality Toolkit, allowing for the interactive exploration of chemical space populated by DrugBank compounds in virtual reality. The source code of the application as well as the most recent build are available on GitHub ( https://github.com/reymond-group/virtual-reality-chemical-space ).
Affiliation(s)
- Daniel Probst
- Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure , University of Berne , Freiestrasse 3 , 3012 Berne , Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure , University of Berne , Freiestrasse 3 , 3012 Berne , Switzerland
|
32
|
Siriwardena TN, Capecchi A, Gan B, Jin X, He R, Wei D, Ma L, Köhler T, van Delden C, Javor S, Reymond J. Optimizing Antimicrobial Peptide Dendrimers in Chemical Space. Angew Chem Int Ed Engl 2018. [DOI: 10.1002/ange.201802837] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/07/2022]
Affiliation(s)
- Thissa N. Siriwardena
- Department of Chemistry and Biochemistry University of Bern Freiestrasse 3 3012 Bern Switzerland
| | - Alice Capecchi
- Department of Chemistry and Biochemistry University of Bern Freiestrasse 3 3012 Bern Switzerland
| | - Bee‐Ha Gan
- Department of Chemistry and Biochemistry University of Bern Freiestrasse 3 3012 Bern Switzerland
| | - Xian Jin
- Department of Chemistry and Biochemistry University of Bern Freiestrasse 3 3012 Bern Switzerland
| | - Runze He
- Shanghai Space Peptides Pharmaceutical Co., Ltd. Shanghai 201210 China
| | - Dengwen Wei
- Department of General Surgery Lanzhou General Hospital of Lanzhou Military Region, PLA 333 South Binhe Road, Qilihe District Lanzhou Gansu Province 730046 China
| | - Lan Ma
- Lanzhou Ruibei Pharmaceutical R&D Co., Ltd. Lanzhou Gansu Province 730000 China
| | - Thilo Köhler
- Department of Microbiology and Molecular Medicine University of Geneva
- Service of Infectious Diseases University Hospital of Geneva Geneva Switzerland
| | - Christian van Delden
- Department of Microbiology and Molecular Medicine University of Geneva
- Service of Infectious Diseases University Hospital of Geneva Geneva Switzerland
| | - Sacha Javor
- Department of Chemistry and Biochemistry University of Bern Freiestrasse 3 3012 Bern Switzerland
| | - Jean‐Louis Reymond
- Department of Chemistry and Biochemistry University of Bern Freiestrasse 3 3012 Bern Switzerland
|
33
|
Siriwardena TN, Capecchi A, Gan BH, Jin X, He R, Wei D, Ma L, Köhler T, van Delden C, Javor S, Reymond JL. Optimizing Antimicrobial Peptide Dendrimers in Chemical Space. Angew Chem Int Ed Engl 2018; 57:8483-8487. [PMID: 29767453 DOI: 10.1002/anie.201802837] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 03/07/2018] [Revised: 05/08/2018] [Indexed: 12/13/2022]
Abstract
We used nearest-neighbor searches in chemical space to improve the activity of the antimicrobial peptide dendrimer (AMPD) G3KL and identified dendrimer T7, which has an expanded activity range against Gram-negative pathogenic bacteria including Klebsiella pneumoniae, increased serum stability, and promising activity in an in vivo infection model against a multidrug-resistant strain of Acinetobacter baumannii. Imaging, spectroscopic studies, and a structural model from molecular dynamics simulations suggest that T7 acts through membrane disruption. These experiments provide the first example of using virtual screening in the field of dendrimers and show that dendrimer size does not limit the activity of AMPDs.
Affiliation(s)
- Thissa N Siriwardena
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Alice Capecchi
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Bee-Ha Gan
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Xian Jin
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Runze He
- Shanghai Space Peptides Pharmaceutical Co., Ltd., Shanghai, 201210, China
| | - Dengwen Wei
- Department of General Surgery, Lanzhou General Hospital of Lanzhou Military Region, PLA, 333 South Binhe Road, Qilihe District, Lanzhou, Gansu Province, 730046, China
| | - Lan Ma
- Lanzhou Ruibei Pharmaceutical R&D Co., Ltd., Lanzhou, Gansu Province, 730000, China
| | - Thilo Köhler
- Department of Microbiology and Molecular Medicine, University of Geneva.,Service of Infectious Diseases, University Hospital of Geneva, Geneva, Switzerland
| | - Christian van Delden
- Department of Microbiology and Molecular Medicine, University of Geneva.,Service of Infectious Diseases, University Hospital of Geneva, Geneva, Switzerland
| | - Sacha Javor
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
|
34
|
Di Bonaventura I, Baeriswyl S, Capecchi A, Gan BH, Jin X, Siriwardena TN, He R, Köhler T, Pompilio A, Di Bonaventura G, van Delden C, Javor S, Reymond JL. An antimicrobial bicyclic peptide from chemical space against multidrug resistant Gram-negative bacteria. Chem Commun (Camb) 2018; 54:5130-5133. [PMID: 29717727 DOI: 10.1039/c8cc02412j] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 02/01/2023]
Abstract
We used the concept of chemical space to explore a virtual library of bicyclic peptides formed by double thioether cyclization of a precursor linear peptide, and identified an antimicrobial bicyclic peptide (AMBP) with remarkable activity against several MDR strains of Acinetobacter baumannii and Pseudomonas aeruginosa.
Affiliation(s)
- Ivan Di Bonaventura
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland.
|
35
|
Visini R, Arús-Pous J, Awale M, Reymond JL. Virtual Exploration of the Ring Systems Chemical Universe. J Chem Inf Model 2017; 57:2707-2718. [PMID: 29019686 DOI: 10.1021/acs.jcim.7b00457] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022]
Abstract
Here, we explore the chemical space of all virtually possible organic molecules focusing on ring systems, which represent the cyclic cores of organic molecules obtained by removing all acyclic bonds and converting all remaining atoms to carbon. This approach circumvents the combinatorial explosion encountered when enumerating the molecules themselves. We report the chemical universe database GDB4c containing 916 130 ring systems up to four saturated or aromatic rings and maximum ring size of 14 atoms and GDB4c3D containing the corresponding 6 555 929 stereoisomers. Almost all (98.6%) of these ring systems are unknown and represent chiral 3D-shaped macrocycles containing small rings and quaternary centers reminiscent of polycyclic natural products. We envision that GDB4c can serve to select new ring systems from which to design analogs of such natural products. The database is available for download at www.gdb.unibe.ch together with interactive visualization and search tools as a resource for molecular design.
Affiliation(s)
- Ricardo Visini
- Department of Chemistry and Biochemistry, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Josep Arús-Pous
- Department of Chemistry and Biochemistry, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
|
36
|
Di Bonaventura I, Jin X, Visini R, Probst D, Javor S, Gan BH, Michaud G, Natalello A, Doglia SM, Köhler T, van Delden C, Stocker A, Darbre T, Reymond JL. Chemical space guided discovery of antimicrobial bridged bicyclic peptides against Pseudomonas aeruginosa and its biofilms. Chem Sci 2017; 8:6784-6798. [PMID: 29147502 PMCID: PMC5643981 DOI: 10.1039/c7sc01314k] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 03/23/2017] [Accepted: 07/12/2017] [Indexed: 12/15/2022] Open
Abstract
Herein we report the discovery of antimicrobial bridged bicyclic peptides (AMBPs) active against Pseudomonas aeruginosa, a highly problematic Gram-negative bacterium in the hospital environment. Two of these AMBPs show strong biofilm inhibition and dispersal activity and enhance the activity of polymyxin, currently a last-resort antibiotic against which resistance is emerging. To discover our AMBPs we used the concept of chemical space, which is well known in the area of small molecule drug discovery, to define a small number of test compounds for synthesis and experimental evaluation. Our chemical space was calculated using 2DP, a new topological shape and pharmacophore fingerprint for peptides. This method provides a general strategy to search for bioactive peptides with unusual topologies and expand the structural diversity of peptide-based drugs.
Affiliation(s)
- Ivan Di Bonaventura
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Xian Jin
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Ricardo Visini
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Daniel Probst
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Sacha Javor
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Bee-Ha Gan
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Gaëlle Michaud
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Antonino Natalello
- Department of Biotechnology and Biosciences , University of Milano-Bicocca , Piazza della Scienza 2 , 20126 Milan , Italy
| | - Silvia Maria Doglia
- Department of Biotechnology and Biosciences , University of Milano-Bicocca , Piazza della Scienza 2 , 20126 Milan , Italy
| | - Thilo Köhler
- Department of Microbiology and Molecular Medicine , University of Geneva, and Service of Infectious Diseases , University Hospital of Geneva , Geneva , Switzerland
| | - Christian van Delden
- Department of Microbiology and Molecular Medicine , University of Geneva, and Service of Infectious Diseases , University Hospital of Geneva , Geneva , Switzerland
| | - Achim Stocker
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Tamis Darbre
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry , University of Bern , Freiestrasse 3 , 3012 Bern , Switzerland .
|
37
|
Peng H, Liu Z, Yan X, Ren J, Xu J. A de novo substructure generation algorithm for identifying the privileged chemical fragments of liver X receptorβ agonists. Sci Rep 2017; 7:11121. [PMID: 28894088 PMCID: PMC5593923 DOI: 10.1038/s41598-017-08848-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 11/04/2016] [Accepted: 07/17/2017] [Indexed: 12/29/2022] Open
Abstract
Liver X receptorβ (LXRβ) is a promising therapeutic target for lipid disorders, atherosclerosis, chronic inflammation, autoimmunity, cancer and neurodegenerative diseases. Druggable LXRβ agonists have been explored over the past decades. However, the pocket of the LXRβ ligand-binding domain (LBD) is too large to predict LXRβ agonists with novel scaffolds based on either receptor or agonist structures. In this paper, we report a de novo algorithm which derives privileged LXRβ agonist fragments by starting with individual chemical bonds (de novo) from every molecule in an LXRβ agonist library, growing the bonds into substructures based on the agonist structures with isomorphic and homomorphic restrictions, and electing the privileged fragments from the substructures with a popularity threshold and background chemical and biological knowledge. Using these privileged fragments as queries, we were able to figure out the rules to reconstruct LXRβ agonist molecules from the fragments. The privileged fragments were validated by using them as descriptors to build regularized logistic regression (RLR) and support vector machine (SVM) models that predict LXRβ agonist activities.
Affiliation(s)
- He Peng
- Research Center for Drug Discovery, School of Pharmaceutical Sciences and School of Life Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
| | - Zhihong Liu
- Research Center for Drug Discovery, School of Pharmaceutical Sciences and School of Life Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
| | - Xin Yan
- Research Center for Drug Discovery, School of Pharmaceutical Sciences and School of Life Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
| | - Jian Ren
- Research Center for Drug Discovery, School of Pharmaceutical Sciences and School of Life Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China.
| | - Jun Xu
- Research Center for Drug Discovery, School of Pharmaceutical Sciences and School of Life Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China.
|
38
|
Abstract
To better understand chemical space we recently enumerated the database GDB-17 containing 166.4 billion possible molecules up to 17 atoms of C, N, O, S and halogen following the simple rules of chemical stability and synthetic feasibility. However, due to the combinatorial explosion caused by systematic enumeration GDB-17 is strongly biased toward the largest, functionally and stereochemically most complex molecules and far too large for most virtual screening tools. Herein we selected a much smaller subset of GDB-17, called the fragment database FDB-17, which contains 10 million fragmentlike molecules evenly covering a broad value range for molecular size, polarity, and stereochemical complexity. The database is available at www.gdb.unibe.ch for download and free use, together with an interactive visualization application and a Web-based nearest neighbor search tool to facilitate the selection of new fragment-sized molecules for chemical synthesis.
Affiliation(s)
- Ricardo Visini
- Department of Chemistry and Biochemistry, University of Bern , Freiestrasse 3, 3012 Berne, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, University of Bern , Freiestrasse 3, 3012 Berne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern , Freiestrasse 3, 3012 Berne, Switzerland
|
39
|
Awale M, Reymond JL. The polypharmacology browser: a web-based multi-fingerprint target prediction tool using ChEMBL bioactivity data. J Cheminform 2017; 9:11. [PMID: 28270862 PMCID: PMC5319934 DOI: 10.1186/s13321-017-0199-x] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 08/29/2016] [Accepted: 02/10/2017] [Indexed: 12/31/2022] Open
Abstract
Background: Several web-based tools have been reported recently which predict the possible targets of a small molecule by similarity to compounds of known bioactivity using molecular fingerprints (fps); however, predictions in each case rely on similarities computed from only one or two fps. Considering that structural similarity and therefore the predicted targets strongly depend on the method used for comparison, it would be highly desirable to predict targets using a broader set of fps simultaneously.
Results: Herein, we present the polypharmacology browser (PPB), a web-based platform which predicts possible targets for small molecules by searching for nearest neighbors using ten different fps describing composition, substructures, molecular shape and pharmacophores. PPB searches through 4613 groups of at least 10 bioactive molecules from ChEMBL annotated with the same target and returns a list of predicted targets ranked by a consensus voting scheme and p value. A validation study across 670 drugs with up to 20 targets showed that combining the predictions from all 10 fps gives the best results, with on average 50% of the known targets of a drug being correctly predicted with a hit rate of 25%. Furthermore, when profiling a new inhibitor of the calcium channel TRPV6 against 24 targets taken from a safety screen panel, we observed inhibition in 5 out of 5 targets predicted by PPB and in 7 out of 18 targets not predicted by PPB. The rate of correct (5/12) and incorrect (0/12) predictions for this compound by PPB was comparable to that of other web-based prediction tools.
Conclusion: PPB offers a versatile platform for target prediction based on multi-fingerprint comparisons, and is freely accessible at www.gdb.unibe.ch as a valuable support for drug discovery.
Electronic supplementary material: The online version of this article (doi:10.1186/s13321-017-0199-x) contains supplementary material, which is available to authorized users.
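The idea of combining per-fingerprint predictions by voting can be sketched as follows. The abstract does not spell out the PPB voting rule or how the p value enters the ranking, so the rule used here (one vote per fingerprint per target, ties broken alphabetically) is an assumption, and the fingerprint and target names are hypothetical placeholders.

```python
from collections import Counter


def consensus_rank(predictions_by_fp):
    """Rank targets by the number of fingerprints that predicted them.

    predictions_by_fp maps a fingerprint name to the targets suggested by
    that fingerprint's nearest neighbors. Ties are broken by target name.
    """
    votes = Counter()
    for targets in predictions_by_fp.values():
        votes.update(set(targets))  # each fingerprint votes at most once per target
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical nearest-neighbor predictions from three fingerprints.
predictions = {
    "fp_1": ["target_X", "target_Y"],
    "fp_2": ["target_X", "target_Z"],
    "fp_3": ["target_X", "target_Y", "target_Y"],
}

for target, n_votes in consensus_rank(predictions):
    print(target, n_votes)
```

A significance estimate such as the p value mentioned in the abstract would need a background model of how often a target is predicted by chance, which this sketch does not attempt.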
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
40
|
Kilchmann F, Marcaida MJ, Kotak S, Schick T, Boss SD, Awale M, Gönczy P, Reymond JL. Discovery of a Selective Aurora A Kinase Inhibitor by Virtual Screening. J Med Chem 2016; 59:7188-211. [PMID: 27391133 DOI: 10.1021/acs.jmedchem.6b00709] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Indexed: 12/22/2022]
Abstract
Here we report the discovery of a selective inhibitor of Aurora A, a key regulator of cell division and potential anticancer target. We used the atom category extended ligand overlap score (xLOS), a 3D ligand-based virtual screening method recently developed in our group, to select 437 shape and pharmacophore analogs of reference kinase inhibitors. Biochemical screening uncovered two inhibitor series with scaffolds unprecedented among kinase inhibitors. One of them was successfully optimized by structure-based design to a potent Aurora A inhibitor (IC50 = 2 nM) with very high kinome selectivity for Aurora kinases. This inhibitor locks Aurora A in an inactive conformation and disrupts binding to its activator protein TPX2, which impairs Aurora A localization at the mitotic spindle and induces cell division defects. This phenotype can be rescued by inhibitor-resistant Aurora A mutants. The inhibitor furthermore does not induce Aurora B specific effects in cells.
Affiliation(s)
- Falco Kilchmann
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Maria J Marcaida
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Sachin Kotak
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, National Center of Competence in Research NCCR Chemical Biology, Swiss Federal Institute of Technology (EPFL) , CH-1015 Lausanne, Switzerland
| | - Thomas Schick
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Silvan D Boss
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Pierre Gönczy
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, National Center of Competence in Research NCCR Chemical Biology, Swiss Federal Institute of Technology (EPFL) , CH-1015 Lausanne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR Chemical Biology and NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
|
41
|
Awale M, Reymond JL. Web-based 3D-visualization of the DrugBank chemical space. J Cheminform 2016; 8:25. [PMID: 27148409 PMCID: PMC4855437 DOI: 10.1186/s13321-016-0138-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/18/2016] [Accepted: 04/27/2016] [Indexed: 12/14/2022] Open
Abstract
Background: Similarly to the periodic table for elements, chemical space offers an organizing principle for representing the diversity of organic molecules, usually in the form of multi-dimensional property spaces that are subjected to dimensionality reduction methods to obtain 3D-spaces or 2D-maps suitable for visual inspection. Unfortunately, tools to look at chemical space on the internet are currently very limited.
Results: Herein we present webDrugCS, a web application freely available at www.gdb.unibe.ch to visualize DrugBank (www.drugbank.ca, containing over 6000 investigational and approved drugs) in five different property spaces. WebDrugCS displays 3D-clouds of color-coded grid points representing molecules, whose structural formula is displayed on mouse over with an option to link to the corresponding molecule page at the DrugBank website. The 3D-clouds are obtained by principal component analysis of high dimensional property spaces describing constitution and topology (42D molecular quantum numbers MQN), structural features (34D SMILES fingerprint SMIfp), molecular shape (20D atom pair fingerprint APfp), pharmacophores (55D atom category extended atom pair fingerprint Xfp) and substructures (1024D binary substructure fingerprint Sfp). User defined molecules can be uploaded as SMILES lists and displayed together with DrugBank. In contrast to 2D-maps where many compounds fold onto each other, these 3D-spaces have a comparable resolution to their parent high-dimensional chemical space.
Conclusion: To the best of our knowledge webDrugCS is the first publicly available web tool for interactive visualization and exploration of the DrugBank chemical space in 3D. WebDrugCS works on computers, tablets and phones, and facilitates the visual exploration of DrugBank to rapidly learn about the structural diversity of small molecule drugs.
Graphical abstract: webDrugCS visualization of DrugBank projected in 3D MQN space color-coded by ring count, with pointer showing the drug 5-fluorouracil.
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
43
|
Muegge I, Mukherjee P. An overview of molecular fingerprint similarity search in virtual screening. Expert Opin Drug Discov 2015; 11:137-48. [PMID: 26558489 DOI: 10.1517/17460441.2016.1117070] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Indexed: 12/20/2022]
Abstract
INTRODUCTION: A central premise of medicinal chemistry is that structurally similar molecules exhibit similar biological activities. Molecular fingerprints encode properties of small molecules and assess their similarities computationally through bit string comparisons. Based on the similarity to a biologically active template, molecular fingerprint methods allow for identifying additional compounds with a higher chance of displaying similar biological activities against the same target - a process commonly referred to as virtual screening (VS).
AREAS COVERED: This article focuses on fingerprint similarity searches in the context of compound selection for enhancing hit sets, comparing compound decks, and VS. In addition, the authors discuss the application of fingerprints in predictive modeling.
EXPERT OPINION: Fingerprint similarity search methods are especially useful in VS if only a few unrelated ligands are known for a given target and therefore more complex and information rich methods such as pharmacophore searches or structure-based design are not applicable. In addition, fingerprint methods are used in characterizing properties of compound collections such as chemical diversity, density in chemical space, and content of biologically active molecules (biodiversity). Such assessments are important for deciding what compounds to experimentally screen, to purchase, or to assemble in a virtual compound deck for in silico screening or de novo design.
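The bit string comparison mentioned in this abstract can be made concrete with a small sketch. The abstract does not name a similarity metric; the Tanimoto coefficient used below is a common choice and is an assumption here, as are the toy fingerprints and compound names.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / union


def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy fingerprints; the bit indices carry no chemical meaning.
query = {1, 4, 7, 9}
library = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3, 5},
}

for name, score in rank_by_similarity(query, library):
    print(f"{name}: {score:.2f}")
```

In a virtual screening setting the query would be the fingerprint of a known active template, and the top-ranked library compounds would be the candidates selected for testing.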
Affiliation(s)
- Ingo Muegge
- a Boehringer Ingelheim Pharmaceuticals , Department of Small Molecule Discovery Research , Ridgefield , CT , USA
| | - Prasenjit Mukherjee
- a Boehringer Ingelheim Pharmaceuticals , Department of Small Molecule Discovery Research , Ridgefield , CT , USA
| |
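The bit string comparison described in the abstract above can be illustrated with a minimal sketch. The abstract does not name a specific similarity measure; the Tanimoto coefficient used here is a common choice and is an assumption, as are the function name `tanimoto`, the 8-bit toy fingerprints, and the candidate names.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient between two fingerprints stored as Python ints.

    Each int is read as a bit string; the measure itself is an assumed,
    commonly used choice, not one prescribed by the abstract.
    """
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in at least one fingerprint
    return common / total if total else 1.0


# Hypothetical 8-bit fingerprints: one active template and three candidates.
template = 0b10110010
candidates = {
    "cand_1": 0b10110000,
    "cand_2": 0b01001101,
    "cand_3": 0b10110010,
}

# Rank candidates by decreasing similarity to the template (a toy virtual screen).
ranked = sorted(
    candidates,
    key=lambda name: tanimoto(template, candidates[name]),
    reverse=True,
)
print(ranked)
```

In a real screen the fingerprints would come from a cheminformatics toolkit and span hundreds or thousands of bits; the ranking step stays the same.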
|
44
|
Jin X, Awale M, Zasso M, Kostro D, Patiny L, Reymond JL. PDB-Explorer: a web-based interactive map of the protein data bank in shape space. BMC Bioinformatics 2015; 16:339. [PMID: 26493835 PMCID: PMC4619230 DOI: 10.1186/s12859-015-0776-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 07/29/2015] [Accepted: 10/14/2015] [Indexed: 11/17/2022] Open
Abstract
Background: The RCSB Protein Data Bank (PDB) provides public access to experimentally determined 3D-structures of biological macromolecules (proteins, peptides and nucleic acids). While various tools are available to explore the PDB, options to access the global structural diversity of the entire PDB and to perceive relationships between PDB structures remain very limited. Methods: A 136-dimensional atom pair 3D-fingerprint for proteins (3DP) counting categorized atom pairs at increasing through-space distances was designed to represent the molecular shape of PDB-entries. Nearest neighbor search examples were reported, exemplifying the ability of 3DP-similarity to identify closely related biomolecules from small peptides to enzymes and large multiprotein complexes such as virus particles. Principal component analysis was used to visualize the PDB in 3DP-space. Results: The 3DP property space groups proteins and protein assemblies according to their 3D-shape similarity, yet shows exquisite ability to distinguish between closely related structures. An interactive website called PDB-Explorer is presented featuring a color-coded interactive map of PDB in 3DP-space. Each pixel of the map contains one or more PDB-entries which are directly visualized as ribbon diagrams when the pixel is selected. The PDB-Explorer website allows performing 3DP-nearest neighbor searches of any PDB-entry or of any structure uploaded as protein-type PDB file. All functionalities on the website are implemented in JavaScript in a platform-independent manner and draw data from a server that is updated daily with the latest PDB additions, ensuring complete and up-to-date coverage. The essentially instantaneous 3DP-similarity search with the PDB-Explorer provides results comparable to those of much slower 3D-alignment algorithms, and automatically clusters proteins from the same superfamilies in tight groups.
Conclusion: A chemical space classification of PDB based on molecular shape was obtained using a new atom-pair 3D-fingerprint for proteins and implemented in a web-based database exploration tool comprising an interactive color-coded map of the PDB chemical space and a nearest neighbor search tool. The PDB-Explorer website is freely available at www.cheminfo.org/pdbexplorer and represents an unprecedented opportunity to interactively visualize and explore the structural diversity of the PDB. [Figure: Maps of PDB in 3DP-space color-coded by heavy atom count and shape.]
Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0776-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xian Jin
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012, Berne, Switzerland.
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012, Berne, Switzerland.
| | - Michaël Zasso
- Ecole Polytechnique Fédérale de Lausanne (EPFL), Institute of Chemical Sciences and Engineering (ISIC), Lausanne, 1015, Switzerland.
| | - Daniel Kostro
- Ecole Polytechnique Fédérale de Lausanne (EPFL), Institute of Chemical Sciences and Engineering (ISIC), Lausanne, 1015, Switzerland.
| | - Luc Patiny
- Ecole Polytechnique Fédérale de Lausanne (EPFL), Institute of Chemical Sciences and Engineering (ISIC), Lausanne, 1015, Switzerland.
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012, Berne, Switzerland.
| |
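The idea of counting atom pairs at increasing through-space distances, as described for 3DP in the abstract above, can be sketched as a simple distance histogram. This is not the published 136-dimensional fingerprint: atom categories are omitted, and the function name `pair_distance_histogram`, the bin edges, and the coordinates are hypothetical values chosen for illustration.

```python
import math
from itertools import combinations


def pair_distance_histogram(coords, bin_edges):
    """Count atom pairs whose Euclidean distance falls in each [lo, hi) bin.

    coords: list of (x, y, z) tuples; bin_edges: increasing distance limits.
    Pairs farther apart than the last edge are ignored in this sketch.
    """
    counts = [0] * (len(bin_edges) - 1)
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)  # through-space (Euclidean) distance
        for i in range(len(counts)):
            if bin_edges[i] <= d < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Hypothetical coordinates (in angstroms) for four atoms.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
edges = [0.0, 2.0, 4.0, 8.0]  # assumed bin limits, not taken from the paper

fingerprint = pair_distance_histogram(coords, edges)
print(fingerprint)
```

Because the vector depends only on interatomic distances, it is independent of how the structure is rotated or translated, which is what makes such a histogram usable as a shape descriptor.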
|
45
|
Awale M, Reymond JL. Similarity Mapplet: Interactive Visualization of the Directory of Useful Decoys and ChEMBL in High Dimensional Chemical Spaces. J Chem Inf Model 2015. [PMID: 26207526 DOI: 10.1021/acs.jcim.5b00182] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 12/31/2022]
Abstract
An Internet portal accessible at www.gdb.unibe.ch has been set up to automatically generate color-coded similarity maps of the ChEMBL database in relation to up to two sets of active compounds taken from the enhanced Directory of Useful Decoys (eDUD), a random set of molecules, or up to two sets of user-defined reference molecules. These maps visualize the relationships between the selected compounds and ChEMBL in six different high dimensional chemical spaces, namely MQN (42-D molecular quantum numbers), SMIfp (34-D SMILES fingerprint), APfp (20-D shape fingerprint), Xfp (55-D pharmacophore fingerprint), Sfp (1024-bit substructure fingerprint), and ECfp4 (1024-bit extended connectivity fingerprint). The maps are supplied in the form of Java-based desktop applications called "similarity mapplets" allowing interactive content browsing and linked to a "Multifingerprint Browser for ChEMBL" (also accessible directly at www.gdb.unibe.ch) to perform nearest neighbor searches. One can obtain six similarity mapplets of ChEMBL relative to random reference compounds, 606 similarity mapplets relative to single eDUD active sets, 30,300 similarity mapplets relative to pairs of eDUD active sets, and any number of similarity mapplets relative to user-defined reference sets to help visualize the structural diversity of compound series in drug optimization projects and their relationship to other known bioactive compounds.
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne , Freiestrasse 3, 3012 Berne, Switzerland
| |
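The mapplet counts quoted in the abstract above (6, 606 and 30,300) are mutually consistent if one mapplet is generated per fingerprint space and per reference set or pair of sets. The short check below works under that reading. The figure of 101 active sets is an assumption inferred from 606 / 6 and is not stated in the abstract.

```python
from math import comb

n_fingerprints = 6   # MQN, SMIfp, APfp, Xfp, Sfp, ECfp4 (listed in the abstract)
n_active_sets = 101  # ASSUMPTION: inferred from 606 / 6, not stated in the abstract

# One mapplet per fingerprint space for each single active set.
single_set_mapplets = n_fingerprints * n_active_sets

# One mapplet per fingerprint space for each unordered pair of active sets.
pair_mapplets = n_fingerprints * comb(n_active_sets, 2)

print(single_set_mapplets, pair_mapplets)
```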
|
46
|
Montalbetti N, Simonin A, Simonin C, Awale M, Reymond JL, Hediger MA. Discovery and characterization of a novel non-competitive inhibitor of the divalent metal transporter DMT1/SLC11A2. Biochem Pharmacol 2015; 96:216-24. [PMID: 26047847 DOI: 10.1016/j.bcp.2015.05.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/08/2015] [Accepted: 05/05/2015] [Indexed: 10/23/2022]
Abstract
Divalent metal transporter-1 (SLC11A2/DMT1) uses the H⁺ electrochemical gradient as the driving force to transport divalent metal ions such as Fe²⁺, Mn²⁺ and other metals into mammalian cells. DMT1 is ubiquitously expressed, most notably in proximal duodenum, immature erythroid cells, brain and kidney. This transporter mediates H⁺-coupled transport of ferrous iron across the apical membrane of enterocytes. In addition, in cells such as erythroid precursors, following transferrin receptor (TfR)-mediated endocytosis, it mediates H⁺-coupled exit of ferrous iron from endocytic vesicles into the cytosol. Dysfunction of human DMT1 is associated with several pathologies such as iron deficiency anemia, hemochromatosis, Parkinson's disease and Alzheimer's disease, as well as colorectal cancer and esophageal adenocarcinoma, making DMT1 an attractive target for drug discovery. In the present study, we performed a ligand-based virtual screening of the Princeton database (700,000 commercially available compounds) to search for pharmacophore shape analogs of recently reported DMT1 inhibitors. We discovered a new compound, named pyrimidinone 8, which mediates a reversible linear non-competitive inhibition of human DMT1 (hDMT1) transport activity with a Ki of ∼20 μM. This compound does not affect hDMT1 cell surface expression and shows no dependence on extracellular pH. To our knowledge, this is the first experimental evidence that hDMT1 can be allosterically modulated by pharmacological agents. Pyrimidinone 8 represents a novel versatile tool compound and it may serve as a lead structure for the development of therapeutic compounds for pre-clinical assessment.
Affiliation(s)
- Nicolas Montalbetti
- Institute of Biochemistry and Molecular Medicine, University of Bern, Switzerland; Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, Switzerland.
| | - Alexandre Simonin
- Institute of Biochemistry and Molecular Medicine, University of Bern, Switzerland; Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, Switzerland.
| | - Céline Simonin
- Department of Chemistry and Biochemistry, University of Bern, Switzerland; Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, University of Bern, Switzerland; Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Switzerland; Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, Switzerland.
| | - Matthias A Hediger
- Institute of Biochemistry and Molecular Medicine, University of Bern, Switzerland; Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, Switzerland.
| |
|
47
|
Abstract
In 1996, a snapshot of the field of synthesis was provided by many of its thought leaders in a Chemical Reviews thematic issue on “Frontiers in Organic Synthesis”. This Accounts of Chemical Research thematic issue on “Synthesis, Design, and Molecular Function” is intended to provide further perspective now from well into the 21st century. Much has happened in the past few decades. The targets, methods, strategies, reagents, procedures, goals, funding, practices, and practitioners of synthesis have changed, some in dramatic ways as documented in impressive contributions to this issue. However, a constant for most synthesis studies continues to be the goal of achieving function with synthetic economy. Whether in the form of new catalysts, reagents, therapeutic leads, diagnostics, drug delivery systems, imaging agents, sensors, materials, energy generation and storage systems, bioremediation strategies, or molecules that challenge old theories or test new ones, the function of a target has been and continues to be a major and compelling justification for its synthesis. While the targets of synthesis have historically been heavily represented by natural products, increasingly design, often inspired by natural structures, is providing a new source of target structures exhibiting new or natural functions and new or natural synthetic challenges. Complementing isolation and screening approaches to new target identification, design enables one to create targets de novo with an emphasis on sought-after function and synthetic innovation with step-economy. Design provides choice. It allows one to determine how close a synthesis will come to the ideal synthesis and how close a structure will come to the ideal function. In this Account, we address studies in our laboratory on function-oriented synthesis (FOS), a strategy to achieve function by design and with synthetic economy. By starting with function rather than structure, FOS places an initial emphasis on target design, thereby harnessing the power of chemists and computers to create new structures with desired functions that could be prepared in a simple, safe, economical, and green, if not ideal, fashion. Reported herein are examples of FOS associated with (a) molecular recognition, leading to the first designed phorbol-inspired protein kinase C regulatory ligands, the first designed bryostatin analogs, the newest bryologs, and a new family of designed kinase inhibitors, (b) target modification, leading to highly simplified but functionally competent photonucleases, molecules that cleave DNA upon photoactivation, (c) drug delivery, leading to cell penetrating molecular transporters, molecules that ferry other attached or complexed molecules across biological barriers, and (d) new reactivity-regenerating reagents in the form of functional equivalents of butatrienes, reagents that allow for back-to-back three-component cycloaddition reactions, thus achieving structural complexity and value with step-economy. While retrosynthetic analysis seeks to identify the best way to make a target, retrofunction analysis seeks to identify the best targets to make. In essence, form (structure) follows function.
Affiliation(s)
- Paul A. Wender
- Departments of Chemistry and Chemical and Systems Biology, Stanford University, Stanford, California 94305-5080, United States
| | - Ryan V. Quiroz
- Departments of Chemistry and Chemical and Systems Biology, Stanford University, Stanford, California 94305-5080, United States
| | - Matthew C. Stevens
- Departments of Chemistry and Chemical and Systems Biology, Stanford University, Stanford, California 94305-5080, United States
| |
|
48
|
Abstract
One of the simplest questions that can be asked about molecular diversity is how many organic molecules are possible in total? To answer this question, my research group has computationally enumerated all possible organic molecules up to a certain size to gain an unbiased insight into the entire chemical space. Our latest database, GDB-17, contains 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens, by far the largest small molecule database reported to date. Molecules allowed by valency rules but unstable or nonsynthesizable due to strained topologies or reactive functional groups were not considered, which reduced the enumeration by at least 10 orders of magnitude and was essential to arrive at a manageable database size. Despite these restrictions, GDB-17 is highly relevant with respect to known molecules. Beyond enumeration, understanding and exploiting GDBs (generated databases) led us to develop methods for virtual screening and visualization of very large databases in the form of a "periodic system of molecules" comprising six different fingerprint spaces, with web-browsers for nearest neighbor searches, and the MQN- and SMIfp-Mapplet application for exploring color-coded principal component maps of GDB and other large databases. Proof-of-concept applications of GDB for drug discovery were realized by combining virtual screening with chemical synthesis and activity testing for neurotransmitter receptor and transporter ligands. One surprising lesson from using GDB for drug analog searches is the incredible depth of chemical space, that is, the fact that millions of very close analogs of any molecule can be readily identified by nearest-neighbor searches in the MQN-space of the various GDBs. The chemical space project has opened an unprecedented door on chemical diversity. Ongoing and yet unmet challenges concern enumerating molecules beyond 17 atoms and synthesizing GDB molecules with innovative scaffolds and pharmacophores.
Affiliation(s)
- Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| |
|
49
|
Awale M, Jin X, Reymond JL. Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints. J Cheminform 2015; 7:3. [PMID: 25750664 PMCID: PMC4352573 DOI: 10.1186/s13321-014-0051-5] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 11/20/2014] [Accepted: 12/19/2014] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND: Tools to explore large compound databases in search for analogs of query molecules provide a strategically important support in drug discovery to help identify available analogs of any given reference or hit compound by ligand based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp) which efficiently encode molecular shape and pharmacophores, but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures). RESULTS: Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results were also different from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints also gave very different results in LBVS of folded molecules where through-space distances between atom pairs are much shorter than topological distances. CONCLUSIONS: 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. Web-browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug discovery projects. Graphical abstract: Atom pair fingerprints based on through-space distances (3DAPfp) provide better shape encoding than atom pair fingerprints based on topological distances (APfp) as measured by the recovery of ROCS shape analogs by fp similarity.
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Xian Jin
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| |
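The abstract above mentions searching fingerprint databases with the city-block distance as similarity measure. A minimal sketch of such a nearest neighbor search follows. It is a brute-force scan rather than the fast pre-formatted search the authors describe, and the names `city_block` and `nearest_neighbors`, the 4-value count fingerprints, and the molecule labels are hypothetical.

```python
def city_block(fp_a, fp_b):
    """City-block (Manhattan) distance between two equal-length count fingerprints."""
    return sum(abs(x - y) for x, y in zip(fp_a, fp_b))


def nearest_neighbors(query, database, k=2):
    """Return the k database keys closest to the query (brute-force scan)."""
    return sorted(database, key=lambda name: city_block(query, database[name]))[:k]


# Hypothetical count fingerprints; real 3DAPfp / 3DXfp vectors have 16 / 80 values.
database = {
    "mol_a": [3, 1, 0, 2],
    "mol_b": [3, 2, 0, 2],
    "mol_c": [0, 5, 4, 0],
}
query = [3, 1, 1, 2]

hits = nearest_neighbors(query, database, k=2)
print(hits)
```

A smaller distance means a closer analog, so the first hit is the most similar database entry. Scaling this to millions of structures requires an indexed or pre-sorted layout, which this sketch does not attempt.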
|