1
Orsi M, Reymond JL. One chiral fingerprint to find them all. J Cheminform 2024; 16:53. [PMID: 38741153 DOI: 10.1186/s13321-024-00849-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/28/2024] [Indexed: 05/16/2024] Open
Abstract
Molecular fingerprints are indispensable tools in cheminformatics. However, stereochemistry is generally not considered, which is problematic for large molecules which are almost all chiral. Herein we report MAP4C, a chiral version of our previously reported fingerprint MAP4, which lists MinHashes computed from character strings containing the SMILES of all pairs of circular substructures up to a diameter of four bonds and the shortest topological distance between their central atoms. MAP4C includes the Cahn-Ingold-Prelog (CIP) annotation (R, S, r or s) whenever the chiral atom is the center of a circular substructure, a question mark for undefined stereocenters, and double bond cis-trans information if specified. MAP4C performs slightly better than the achiral MAP4, ECFP and AP fingerprints in non-stereoselective virtual screening benchmarks. Furthermore, MAP4C distinguishes between stereoisomers in chiral molecules from small molecule drugs to large natural products and peptides comprising thousands of diastereomers, with a degree of distinction smaller than between structural isomers and proportional to the number of chirality changes. Due to its excellent performance across diverse molecular classes and its ability to handle stereochemistry, MAP4C is recommended as a generally applicable chiral molecular fingerprint. SCIENTIFIC CONTRIBUTION: The ability of our chiral fingerprint MAP4C to handle stereoisomers from small molecules to large natural products and peptides is unprecedented and opens the way for cheminformatics to include stereochemistry as an important molecular parameter across all fields of molecular design.
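The MinHash step mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' MAP4C implementation: the function names (`shingle_hash`, `minhash`, `estimated_jaccard`), the use of salted BLAKE2b as the hash family, and the example shingle strings are all assumptions made for illustration. Real MAP4C shingles are built from the SMILES of circular substructure pairs and their topological distance, which this sketch does not compute.

```python
import hashlib


def shingle_hash(shingle: str, seed: int) -> int:
    """Deterministic 64-bit hash of one shingle string under a given seed."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),  # the seed selects one hash function
    ).digest()
    return int.from_bytes(digest, "little")


def minhash(shingles: set, n_hashes: int = 64) -> list:
    """Keep, for each seeded hash function, the minimum hash over all shingles."""
    if not shingles:
        raise ValueError("at least one shingle is required")
    return [min(shingle_hash(s, seed) for s in shingles) for seed in range(n_hashes)]


def estimated_jaccard(fp_a: list, fp_b: list) -> float:
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)


# Hypothetical shingles of the form "substructure|distance|substructure".
# The "$S$" / "$R$" tags stand in for a CIP label on a stereocenter; the
# strings are invented and are not actual MAP4C output.
isomer_s = {"C(C)(N)C$S$|1|C(=O)O", "CN|2|C(=O)O", "CC|1|CN"}
isomer_r = {"C(C)(N)C$R$|1|C(=O)O", "CN|2|C(=O)O", "CC|1|CN"}

if __name__ == "__main__":
    similarity = estimated_jaccard(minhash(isomer_s), minhash(isomer_r))
    print(f"estimated similarity between the two shingle sets: {similarity:.2f}")
```

In this sketch the two shingle sets differ only in the stereo-tagged string, so the estimated similarity is expected to be high but generally below that of identical sets, mirroring the abstract's statement that stereoisomers are distinguished less strongly than structural isomers.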
Affiliation(s)
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
2
Orsi M, Probst D, Schwaller P, Reymond JL. Alchemical analysis of FDA approved drugs. Digital Discovery 2023; 2:1289-1296. [PMID: 38013905 PMCID: PMC10561545 DOI: 10.1039/d3dd00039g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 08/29/2023] [Indexed: 11/29/2023]
Abstract
Chemical space maps help visualize similarities within molecular sets. However, there are many different molecular similarity measures resulting in a confusing number of possible comparisons. To overcome this limitation, we exploit the fact that tools designed for reaction informatics also work for alchemical processes that do not obey Lavoisier's principle, such as the transmutation of lead into gold. We start by using the differential reaction fingerprint (DRFP) to create tree-maps (TMAPs) representing the chemical space of pairs of drugs selected as being similar according to various molecular fingerprints. We then use the Transformer-based RXNMapper model to understand structural relationships between drugs, and its confidence score to distinguish between pairs related by chemically feasible transformations and pairs related by alchemical transmutations. This analysis reveals a diversity of structural similarity relationships that are otherwise difficult to analyze simultaneously. We exemplify this approach by visualizing FDA-approved drugs, EGFR inhibitors, and polymyxin B analogs.
Affiliation(s)
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern Freiestrasse 3 3012 Bern Switzerland
- Daniel Probst
  - Ecole Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern Freiestrasse 3 3012 Bern Switzerland
3
Abstract
DNA-encoded libraries (DELs) are widely used in the discovery of drug candidates, and understanding their design principles is critical for accessing better libraries. Most DELs are combinatorial in nature and are synthesized by assembling sets of building blocks in specific topologies. In this study, different aspects of library topology were explored and their effect on DEL properties and chemical diversity was analyzed. We introduce a descriptor for DEL topological assignment (DELTA) and use it to examine the landscape of possible DEL topologies and their coverage in the literature. A generative topographic mapping analysis revealed that the impact of library topology on chemical space coverage is secondary to building block selection. Furthermore, it became apparent that the descriptor used to analyze chemical space dictates how structures cluster, with the effects of topology being apparent when using three-dimensional descriptors but not with common two-dimensional descriptors. This outcome points to potential challenges of attempts to predict DEL productivity based on chemical space analyses alone. While topology is rather inconsequential for defining the chemical space of encoded compounds, it greatly affects possible interactions with target proteins as illustrated in docking studies using NAD/NADP binding proteins as model receptors.
Affiliation(s)
- William K Weigel
  - Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
- Alba L Montoya
  - Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
- Raphael M Franzini
  - Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
  - Huntsman Cancer Institute, University of Utah, 2000 Circle of Hope Dr., Salt Lake City, Utah 84112, United States
4
Dewey JA, Delalande C, Azizi SA, Lu V, Antonopoulos D, Babnigg G. Molecular Glue Discovery: Current and Future Approaches. J Med Chem 2023; 66:9278-9296. [PMID: 37437222 PMCID: PMC10805529 DOI: 10.1021/acs.jmedchem.3c00449] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 07/14/2023]
Abstract
The intracellular interactions of biomolecules can be maneuvered to redirect signaling, reprogram the cell cycle, or decrease infectivity using only a few dozen atoms. Such "molecular glues," which can drive both novel and known interactions between protein partners, represent an enticing therapeutic strategy. Here, we review the methods and approaches that have led to the identification of small-molecule molecular glues. We first classify current FDA-approved molecular glues to facilitate the selection of discovery methods. We then survey two broad discovery method strategies, where we highlight the importance of factors such as experimental conditions, software packages, and genetic tools for success. We hope that this curation of methodologies for directed discovery will inspire diverse research efforts targeting a multitude of human diseases.
Affiliation(s)
- Jeffrey A Dewey
  - Biosciences Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Clémence Delalande
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Saara-Anne Azizi
  - Pritzker School of Medicine, University of Chicago, Chicago, Illinois 60637, United States
- Vivian Lu
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Dionysios Antonopoulos
  - Biosciences Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Gyorgy Babnigg
  - Biosciences Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
5
Wang Y, Li Y, Chen X, Zhao L. HIV-1/HBV Coinfection Accurate Multitarget Prediction Using a Graph Neural Network-Based Ensemble Predicting Model. Int J Mol Sci 2023; 24:ijms24087139. [PMID: 37108305 PMCID: PMC10139236 DOI: 10.3390/ijms24087139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 04/07/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023] Open
Abstract
HIV and HBV infection are both serious public health challenges. There are more than approximately 4 million patients coinfected with HIV and HBV worldwide, and approximately 5% to 15% of those infected with HIV are coinfected with HBV. Disease progression is more rapid in patients with coinfection, which significantly increases the likelihood of patients progressing from chronic hepatitis to cirrhosis, end-stage liver disease, and hepatocellular carcinoma. HIV treatment is complicated by drug interactions, antiretroviral (ARV) hepatotoxicity, and HBV-related immune reconditioning and inflammatory syndromes. Drug development is a highly costly and time-consuming procedure with traditional experimental methods. With the development of computer-aided drug design techniques, both machine learning and deep learning have been successfully used to facilitate rapid innovations in the virtual screening of candidate drugs. In this study, we proposed a graph neural network-based molecular feature extraction model by integrating one optimal supervised learner to replace the output layer of the GNN to accurately predict the potential multitargets of HIV-1/HBV coinfections. The experimental results strongly suggested that DMPNN + GBDT may greatly improve the accuracy of binary-target predictions and efficiently identify the potential multiple targets of HIV-1 and HBV simultaneously.
Affiliation(s)
- Yishu Wang
  - School of Mathematics and Statistics, University of Science and Technology Beijing, Beijing 100083, China
- Yue Li
  - School of Mathematics and Statistics, University of Science and Technology Beijing, Beijing 100083, China
- Xiaomin Chen
  - School of Mathematics and Statistics, University of Science and Technology Beijing, Beijing 100083, China
- Lutao Zhao
  - School of Mathematics and Statistics, University of Science and Technology Beijing, Beijing 100083, China
  - Center for Energy and Environmental Policy Research, Beijing Institute of Technology, Beijing 100081, China
6
Cavalcanti ABS, Maia MDS, Figueiredo PTRD, Menezes RPBD, Monteiro AFM, Meireles RAR, Rodrigues GCS, Rodrigues de Almeida Silva AR, Lins JDS, Cordeiro LV, Junior VSR, Castelo Branco APOT, Agra MDF, Sessions ZL, Muratov EN, Scotti L, Silva MSD, Costa VCDO, Tavares JF, Scotti MT. Four diterpenes identified in silico were isolated from Hyptidinae and demonstrated in vitro activity against Mycobacterium tuberculosis. Nat Prod Res 2023; 37:903-911. [PMID: 35819986 DOI: 10.1080/14786419.2022.2096604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Plants of the subtribe Hyptidinae (family Lamiaceae), such as Mesosphaerum sidifolium, are a source of bioactive molecules. In the search for new drug candidates, diterpenes isolated from the aerial parts of M. sidifolium were chemically characterized using uni- and bidimensional NMR spectral data and evaluated in silico through the construction of a predictive model, followed by in vitro testing against Mycobacterium tuberculosis and Mycobacterium smegmatis. The work resulted in the isolation of four components: Pomiferin D (1), Salviol (2), Pomiferin E (3) and 2α-hydroxysugiol (4), as well as two phenolic compounds, rosmarinic and caffeic acids. The in silico model identified 48 diterpenes likely to have biological activity against M. tuberculosis. The isolated diterpenes were tested in vitro against M. tuberculosis, giving MIC = 125 µM for 4 and 1 and MIC = 250 µM for 2 and 3. These compounds did not show biological activity against M. smegmatis at these concentrations.
Affiliation(s)
- Andreza Barbosa Silva Cavalcanti
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Mayara Dos Santos Maia
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Pedro Thiago Ramalho de Figueiredo
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Renata Priscila Barros de Menezes
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Alex France Messias Monteiro
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Roseana Araújo Ramos Meireles
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Gabriela Cristina Soares Rodrigues
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Jociano da Silva Lins
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Laísa Vilar Cordeiro
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Valnês S Rodrigues Junior
  - National Institute of Science and Technology in Tuberculosis (INCT-TB), Porto Alegre, Brazil
  - Program of Biotechnology, Center for Biotechnology, Federal University of Paraíba, João Pessoa, Brazil
- Maria de Fátima Agra
  - Program of Biotechnology, Center for Biotechnology, Federal University of Paraíba, João Pessoa, Brazil
- Zoe L Sessions
  - Laboratory of Molecular Modeling, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Eugene N Muratov
  - Laboratory of Molecular Modeling, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Luciana Scotti
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Marcelo Sobral da Silva
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Vicente Carlos de Oliveira Costa
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Josean Fechine Tavares
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
- Marcus Tullius Scotti
  - Program of Natural and Synthetic Bioactive Products (PgPNSB), Health Sciences Center, Federal University of Paraíba, João Pessoa, Brazil
7
Cai X, Orsi M, Capecchi A, Köhler T, van Delden C, Javor S, Reymond JL. An intrinsically disordered antimicrobial peptide dendrimer from stereorandomized virtual screening. Cell Reports Physical Science 2022; 3:101161. [PMID: 36632208 PMCID: PMC9780108 DOI: 10.1016/j.xcrp.2022.101161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2022] [Revised: 10/21/2022] [Accepted: 11/02/2022] [Indexed: 06/17/2023]
Abstract
Membrane-disruptive amphiphilic antimicrobial peptides behave as intrinsically disordered proteins by being unordered in water and becoming α-helical in contact with biological membranes. We recently discovered that synthesizing the α-helical antimicrobial peptide dendrimer L-T25 ((KL)8(KKL)4(KLL)2 KKLL) using racemic amino acids to form stereorandomized sr-T25, an analytically pure mixture of all possible diastereoisomers of L-T25, preserved antibacterial activity but abolished hemolysis and cytotoxicity, pointing to an intrinsically disordered antibacterial conformation and an α-helical cytotoxic conformation. In this study, to identify non-toxic intrinsically disordered homochiral antimicrobial peptide dendrimers (AMPDs), we surveyed sixty-three sr-analogs of sr-T25 selected by virtual screening. One of the analogs, sr-X18 ((KL)8(KLK)4(KLL)2 KLLL), lost antibacterial activity as L-enantiomer and became hemolytic due to α-helical folding. By contrast, the L- and D-enantiomers of sr-X22 ((KL)8(KL)4(KKLL)2 KLKK) were equally antibacterial, non-hemolytic, and non-toxic, implying an intrinsically disordered bioactive conformation. Screening stereorandomized libraries may be generally useful to identify or optimize intrinsically disordered bioactive peptides.
Affiliation(s)
- Xingguang Cai
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Alice Capecchi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Thilo Köhler
  - Department of Microbiology and Molecular Medicine, University of Geneva, Service of Infectious Diseases, University Hospital of Geneva, Geneva, Switzerland
- Christian van Delden
  - Department of Microbiology and Molecular Medicine, University of Geneva, Service of Infectious Diseases, University Hospital of Geneva, Geneva, Switzerland
- Sacha Javor
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
8
Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system. Mol Divers 2022; 27:959-985. [PMID: 35819579 DOI: 10.1007/s11030-022-10489-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/27/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
CNS disorders are indications with very high unmet medical need, a relatively small number of available drugs, and a subpar satisfaction level among patients and caregivers. Discovery of CNS drugs is an extremely expensive affair with its own unique challenges, leading to extremely high attrition rates and low efficiency. With the explosion of data in the information age, there is hardly any aspect of life that has not been touched by data-driven technologies such as artificial intelligence (AI) and machine learning (ML). Drug discovery is no exception: the emergence of big data via genomic, proteomic, biological, and chemical technologies has driven pharmaceutical giants to collaborate with AI-oriented companies to revolutionise drug discovery, with the goal of increasing the efficiency of the process. In recent years many examples of innovative applications of AI and ML techniques in CNS drug discovery have been reported. Research on therapeutics for diseases such as schizophrenia, Alzheimer's and Parkinsonism has been given a new direction and thrust by these developments. AI and ML have been applied to both ligand-based and structure-based drug discovery and design of CNS therapeutics. In this review, we summarise the general aspects of AI and ML from the perspective of drug discovery, followed by comprehensive coverage of recent developments in the applications of AI/ML techniques in CNS drug discovery.
9
Staszak M, Staszak K, Wieszczycka K, Bajek A, Roszkowski K, Tylkowski B. Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1568] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Affiliation(s)
- Maciej Staszak
  - Institute of Technology and Chemical Engineering Poznan University of Technology Poznan Poland
- Katarzyna Staszak
  - Institute of Technology and Chemical Engineering Poznan University of Technology Poznan Poland
- Karolina Wieszczycka
  - Institute of Technology and Chemical Engineering Poznan University of Technology Poznan Poland
- Anna Bajek
  - Department of Tissue Engineering Collegium Medicum, Nicolaus Copernicus University Bydgoszcz Poland
- Krzysztof Roszkowski
  - Department of Oncology Collegium Medicum Nicolaus Copernicus University Bydgoszcz Poland
- Bartosz Tylkowski
  - Department of Chemical Engineering University Rovira i Virgili Tarragona Spain
  - Eurecat, Centre Tecnològic de Catalunya Chemical Technologies Unit Tarragona Spain
10
Baylon JL, Ursu O, Muzdalo A, Wassermann AM, Adams GL, Spale M, Mejzlik P, Gromek A, Pisarenko V, Hancharyk D, Jenkins E, Bednar D, Chang C, Clarova K, Glick M, Bitton DA. PepSeA: Peptide Sequence Alignment and Visualization Tools to Enable Lead Optimization. J Chem Inf Model 2022; 62:1259-1267. [PMID: 35192366 DOI: 10.1021/acs.jcim.1c01360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Therapeutic peptides offer potential advantages over small molecules in terms of selectivity, affinity, and their ability to target "undruggable" proteins that are associated with a wide range of pathologies. Despite their importance, current molecular design capabilities that inform medicinal chemistry decisions on peptide programs are limited. More specifically, there are unmet needs for structure-activity relationship (SAR) analysis and visualization of linear, cyclic, and cross-linked peptides containing non-natural motifs, which are widely used in drug discovery. To bridge this gap, we developed PepSeA (Peptide Sequence Alignment and Visualization), an open-source, freely available package of sequence-based tools (https://github.com/Merck/PepSeA). PepSeA enables multiple sequence alignment of non-natural amino acids and enhanced visualization with the hierarchical editing language for macromolecules (HELM). Via stepwise SAR analysis of a ChEMBL peptide data set, we demonstrate the utility of PepSeA to accelerate decision making in lead optimization campaigns in pharmaceutical setting. PepSeA represents an initial attempt to expand cheminformatics capabilities for therapeutic peptides and to enable rapid and more efficient design-make-test cycles.
Affiliation(s)
- Javier L Baylon
  - Computational and Structural Chemistry, Merck & Co., Inc., Boston, Massachusetts 02115, United States
- Oleg Ursu
  - Computational and Structural Chemistry, Merck & Co., Inc., Boston, Massachusetts 02115, United States
- Anja Muzdalo
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Anne Mai Wassermann
  - Computational and Structural Chemistry, Merck & Co., Inc., Boston, Massachusetts 02115, United States
- Gregory L Adams
  - Computational and Structural Chemistry, Merck & Co., Inc., Boston, Massachusetts 02115, United States
- Martin Spale
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Petr Mejzlik
  - AI & Big Data Analytics, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Anna Gromek
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Viktor Pisarenko
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Dzianis Hancharyk
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Esteban Jenkins
  - Foundational Data and Analytics, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- David Bednar
  - Foundational Data and Analytics, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
- Charlie Chang
  - Discovery Research IT, Merck & Co., Inc., Boston, Massachusetts 02115, United States
- Kamila Clarova
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
  - Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague 166 28, Czech Republic
- Meir Glick
  - Computational and Structural Chemistry, Merck & Co., Inc., Boston, Massachusetts 02115, United States
- Danny A Bitton
  - R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic
11
Deng D, Chen X, Zhang R, Lei Z, Wang X, Zhou F. XGraphBoost: Extracting Graph Neural Network-Based Features for a Better Prediction of Molecular Properties. J Chem Inf Model 2021; 61:2697-2705. [PMID: 34009965 DOI: 10.1021/acs.jcim.0c01489] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/02/2023]
Abstract
Determining the properties of chemical molecules is essential for screening candidates similar to a specific drug. These candidate molecules are further evaluated for their target binding affinities, side effects, target missing probabilities, etc. Conventional machine learning algorithms demonstrated satisfying prediction accuracies of molecular properties. A molecule cannot be directly loaded into a machine learning model, and a set of engineered features needs to be designed and calculated from a molecule. Such hand-crafted features rely heavily on the experiences of the investigating researchers. The concept of graph neural networks (GNNs) was recently introduced to describe the chemical molecules. The features may be automatically and objectively extracted from the molecules through various types of GNNs, e.g., GCN (graph convolution network), GGNN (gated graph neural network), DMPNN (directed message passing neural network), etc. However, the training of a stable GNN model requires a huge number of training samples and a large amount of computing power, compared with the conventional machine learning strategies. This study proposed the integrated framework XGraphBoost to extract the features using a GNN and build an accurate prediction model of molecular properties using the classifier XGBoost. The proposed framework XGraphBoost fully inherits the merits of the GNN-based automatic molecular feature extraction and XGBoost-based accurate prediction performance. Both classification and regression problems were evaluated using the framework XGraphBoost. The experimental results strongly suggest that XGraphBoost may facilitate the efficient and accurate predictions of various molecular properties. The source code is freely available to academic users at https://github.com/chenxiaowei-vincent/XGraphBoost.git.
Affiliation(s)
- Daiguo Deng
  - Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
- Xiaowei Chen
  - Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
- Ruochi Zhang
  - Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
  - College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, P.R. China
- Zengrong Lei
  - Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
- Xiaojian Wang
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
- Fengfeng Zhou
  - College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, P.R. China
12
Capecchi A, Reymond JL. Assigning the Origin of Microbial Natural Products by Chemical Space Map and Machine Learning. Biomolecules 2020; 10:E1385. [PMID: 32998475 PMCID: PMC7600738 DOI: 10.3390/biom10101385] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/02/2020] [Revised: 09/22/2020] [Accepted: 09/25/2020] [Indexed: 12/20/2022] Open
Abstract
Microbial natural products (NPs) are an important source of drugs, however, their structural diversity remains poorly understood. Here we used our recently reported MinHashed Atom Pair fingerprint with diameter of four bonds (MAP4), a fingerprint suitable for molecules across very different sizes, to analyze the Natural Products Atlas (NPAtlas), a database of 25,523 NPs of bacterial or fungal origin. To visualize NPAtlas by MAP4 similarity, we used the dimensionality reduction method tree map (TMAP). The resulting interactive map organizes molecules by physico-chemical properties and compound families such as peptides and glycosides. Remarkably, the map separates bacterial and fungal NPs from one another, revealing that these two compound families are intrinsically different despite their related biosynthetic pathways. We used these differences to train a machine learning model capable of distinguishing between NPs of bacterial or fungal origin.
Affiliation(s)
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
13
Zhao L, Ciallella HL, Aleksunes LM, Zhu H. Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling. Drug Discov Today 2020; 25:1624-1638. [PMID: 32663517 PMCID: PMC7572559 DOI: 10.1016/j.drudis.2020.07.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 02/19/2020] [Revised: 06/26/2020] [Accepted: 07/06/2020] [Indexed: 02/06/2023]
Abstract
Advancing a new drug to market requires substantial investments in time as well as financial resources. Crucial bioactivities for drug candidates, including their efficacy, pharmacokinetics (PK), and adverse effects, need to be investigated during drug development. With advancements in chemical synthesis and biological screening technologies over the past decade, a large amount of biological data points for millions of small molecules have been generated and are stored in various databases. These accumulated data, combined with new machine learning (ML) approaches, such as deep learning, have shown great potential to provide insights into relevant chemical structures to predict in vitro, in vivo, and clinical outcomes, thereby advancing drug discovery and development in the big data era.
Affiliation(s)
- Linlin Zhao
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
- Heather L Ciallella
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
- Lauren M Aleksunes
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ 08854, USA
- Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA; Department of Chemistry, Rutgers University, Camden, NJ 08102, USA
|
14
|
Iwaniak A, Minkiewicz P, Pliszka M, Mogut D, Darewicz M. Characteristics of Biopeptides Released In Silico from Collagens Using Quantitative Parameters. Foods 2020; 9:E965. [PMID: 32708318 PMCID: PMC7404701 DOI: 10.3390/foods9070965] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/19/2020] [Revised: 07/17/2020] [Accepted: 07/20/2020] [Indexed: 02/06/2023] Open
Abstract
The potential of collagens to release biopeptides was evaluated using the BIOPEP-UWM-implemented quantitative criteria including the frequency of the release of fragments with a given activity by selected enzyme(s) (AE), relative frequency of release of fragments with a given activity by selected enzyme(s) (W), and the theoretical degree of hydrolysis (DHt). Cow, pig, sheep, chicken, duck, horse, salmon, rainbow trout, goat, rabbit, and turkey collagens were theoretically hydrolyzed using: stem bromelain, ficin, papain, pepsin, trypsin, chymotrypsin, pepsin+trypsin, and pepsin+trypsin+chymotrypsin. Peptides released from the collagens having comparable AE and W were estimated for their likelihood to be bioactive using PeptideRanker Score. The collagens tested were the best sources of angiotensin I-converting enzyme (ACE) and dipeptidyl peptidase IV (DPP-IV) inhibitors. AE and W values revealed that pepsin and/or trypsin were effective producers of such peptides from the majority of the collagens examined. Then, the SwissTargetPrediction program was used to estimate the possible interactions of such peptides with enzymes and proteins, whereas ADMETlab was applied to evaluate their safety and drug-likeness properties. Target prediction revealed that the collagen-derived peptides might interact with several human proteins, especially proteinases, but with relatively low probability. In turn, their bioactivity may be limited by their short half-life in the body.
Affiliation(s)
- Anna Iwaniak
- University of Warmia and Mazury in Olsztyn, Faculty of Food Science, Chair of Food Biochemistry, Pl. Cieszyński 1, 10-719 Olsztyn-Kortowo, Poland
- Piotr Minkiewicz
- University of Warmia and Mazury in Olsztyn, Faculty of Food Science, Chair of Food Biochemistry, Pl. Cieszyński 1, 10-719 Olsztyn-Kortowo, Poland
- Monika Pliszka
- University of Warmia and Mazury in Olsztyn, Faculty of Food Science, Chair of Food Biochemistry, Pl. Cieszyński 1, 10-719 Olsztyn-Kortowo, Poland
- Damir Mogut
- University of Warmia and Mazury in Olsztyn, Faculty of Food Science, Chair of Food Biochemistry, Pl. Cieszyński 1, 10-719 Olsztyn-Kortowo, Poland
- Małgorzata Darewicz
- University of Warmia and Mazury in Olsztyn, Faculty of Food Science, Chair of Food Biochemistry, Pl. Cieszyński 1, 10-719 Olsztyn-Kortowo, Poland
|
15
|
One molecular fingerprint to rule them all: drugs, biomolecules, and the metabolome. J Cheminform 2020; 12:43. [PMID: 33431010 PMCID: PMC7291580 DOI: 10.1186/s13321-020-00445-4] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 03/18/2020] [Accepted: 06/04/2020] [Indexed: 02/08/2023] Open
Abstract
Background: Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules. Results: Here we set out to design a new fingerprint suitable for both small and large molecules by combining substructure and atom-pair concepts. Our quest resulted in a new fingerprint called MinHashed atom-pair fingerprint up to a diameter of four bonds (MAP4). In this fingerprint the circular substructures with radii of r = 1 and r = 2 bonds around each atom in an atom-pair are written as two pairs of SMILES, each pair being combined with the topological distance separating the two central atoms. These so-called atom-pair molecular shingles are hashed, and the resulting set of hashes is MinHashed to form the MAP4 fingerprint. MAP4 significantly outperforms all other fingerprints on an extended benchmark that combines the Riniker and Landrum small molecule benchmark with a peptide benchmark recovering BLAST analogs from either scrambled or point mutation analogs. MAP4 furthermore produces well-organized chemical space tree-maps (TMAPs) for databases as diverse as DrugBank, ChEMBL, SwissProt and the Human Metabolome Database (HMDB), and differentiates between all metabolites in HMDB, over 70% of which are indistinguishable from their nearest neighbor using substructure fingerprints. Conclusion: MAP4 is a new molecular fingerprint suitable for drugs, biomolecules, and the metabolome and can be adopted as a universal fingerprint to describe and search chemical space.
The source code is available at https://github.com/reymond-group/map4 and interactive MAP4 similarity search tools and TMAPs for various databases are accessible at http://map-search.gdb.tools/ and http://tm.gdb.tools/map4/.
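The shingle-then-MinHash step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the substructure SMILES are hand-written placeholders rather than output of a cheminformatics toolkit, and the function names (`make_shingle`, `minhash`, `estimated_jaccard`), the `a|distance|b` shingle layout, the SHA-1 hashing, and the 128-value signature length are all assumptions chosen for illustration.

```python
import hashlib
import random

PRIME = (1 << 61) - 1  # large prime modulus for the hash family (assumed choice)

def make_shingle(smiles_a: str, smiles_b: str, distance: int) -> str:
    """Build one atom-pair shingle; sorting makes it independent of atom order."""
    first, second = sorted((smiles_a, smiles_b))
    return f"{first}|{distance}|{second}"

def shingle_hash(shingle: str) -> int:
    """Map a shingle to a 32-bit integer (SHA-1 is an illustrative choice)."""
    digest = hashlib.sha1(shingle.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def minhash(shingles, n_perm: int = 128, seed: int = 0):
    """Return a MinHash signature: the minimum of each of n_perm hash functions."""
    hashes = {shingle_hash(s) for s in shingles}
    if not hashes:
        raise ValueError("need at least one shingle")
    rng = random.Random(seed)  # fixed seed so all molecules share one hash family
    params = [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(n_perm)]
    return [min((a * h + b) % PRIME for h in hashes) for a, b in params]

def estimated_jaccard(sig_a, sig_b) -> float:
    """Fraction of matching signature positions estimates the set Jaccard index."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

# Placeholder shingle sets for two hypothetical molecules sharing 2 of 4 shingles.
mol_a = {make_shingle("CC", "CO", 1), make_shingle("CC", "CN", 2), make_shingle("CO", "CN", 3)}
mol_b = {make_shingle("CC", "CO", 1), make_shingle("CC", "CN", 2), make_shingle("CO", "CS", 3)}

print(estimated_jaccard(minhash(mol_a), minhash(mol_b)))
```

The true Jaccard index of the two placeholder sets is 0.5, and the printed estimate should approach it as `n_perm` grows. In a real pipeline the shingles would come from circular substructures at radii 1 and 2 plus topological distances computed by a toolkit.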
|
16
|
Singh N, Chaput L, Villoutreix BO. Virtual screening web servers: designing chemical probes and drug candidates in the cyberspace. Brief Bioinform 2020; 22:1790-1818. [PMID: 32187356 PMCID: PMC7986591 DOI: 10.1093/bib/bbaa034] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Indexed: 12/13/2022] Open
Abstract
The interplay between life sciences and advancing technology drives a continuous cycle of chemical data growth; these data are most often stored in open or partially open databases. In parallel, many different types of algorithms are being developed to manipulate these chemical objects and associated bioactivity data. Virtual screening methods are among the most popular computational approaches in pharmaceutical research. Today, user-friendly web-based tools are available to help scientists perform virtual screening experiments. This article provides an overview of internet resources enabling and supporting chemical biology and early drug discovery with a main emphasis on web servers dedicated to virtual ligand screening and small-molecule docking. This survey first introduces some key concepts and then presents recent and easily accessible virtual screening and related target-fishing tools as well as briefly discusses case studies enabled by some of these web services. Notwithstanding further improvements, already available web-based tools not only contribute to the design of bioactive molecules and assist drug repositioning but also help to generate new ideas and explore different hypotheses in a timely fashion while contributing to teaching in the field of drug development.
Affiliation(s)
- Natesh Singh
- Univ. Lille, Inserm, Institut Pasteur de Lille, U1177 Drugs and Molecules for Living Systems, F-59000 Lille, France
- Ludovic Chaput
- Univ. Lille, Inserm, Institut Pasteur de Lille, U1177 Drugs and Molecules for Living Systems, F-59000 Lille, France
- Bruno O Villoutreix
- Univ. Lille, Inserm, Institut Pasteur de Lille, U1177 Drugs and Molecules for Living Systems, F-59000 Lille, France
|
17
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 148] [Impact Index Per Article: 37.0] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. Novel and efficient prediction approaches are needed so that drug–target interactions (DTIs) do not have to be determined solely through costly, laborious, and not-always-deterministic experiments. These approaches should be capable of identifying potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction, followed by a comprehensive catalog of machine learning methods and databases that have been proposed and used to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in predicting DTIs using machine learning approaches are highlighted, and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
- Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
- Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
|
18
|
Capecchi A, Zhang A, Reymond JL. Populating Chemical Space with Peptides Using a Genetic Algorithm. J Chem Inf Model 2020; 60:121-132. [PMID: 31868369 DOI: 10.1021/acs.jcim.9b01014] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/13/2022]
Abstract
In drug discovery, one uses chemical space as a concept to organize molecules according to their structures and properties. One often would like to generate new possible molecules at a specific location in the chemical space marked by a molecule of interest. Herein, we report the peptide design genetic algorithm (PDGA, code available at https://github.com/reymond-group/PeptideDesignGA), a computational tool capable of producing peptide sequences of various topologies (linear, cyclic/polycyclic, or dendritic) in proximity of any molecule of interest in a chemical space defined by macromolecule extended atom-pair fingerprint (MXFP), an atom-pair fingerprint describing molecular shape and pharmacophores. We show that the PDGA generates high-similarity analogues of bioactive peptides with diverse peptide chain topologies and of nonpeptide target molecules. We illustrate the chemical space accessible by the PDGA with an interactive 3D map of the MXFP property space available at http://faerun.gdb.tools/. The PDGA should be generally useful to generate peptides at any location in the chemical space.
Affiliation(s)
- Alice Capecchi
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Alain Zhang
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
19
|
Chemical-Reactivity Properties, Drug Likeness, and Bioactivity Scores of Seragamides A–F Anticancer Marine Peptides: Conceptual Density Functional Theory Viewpoint. COMPUTATION 2019. [DOI: 10.3390/computation7030052] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 11/16/2022]
Abstract
A methodology based on Conceptual Density Functional Theory (CDFT) was chosen for the calculation of global and local reactivity descriptors of the Seragamide family of marine anticancer peptides. Determination of active sites for the molecules was achieved by resorting to some descriptors within Molecular Electron Density Theory (MEDT) such as Fukui functions. The pKas of the six studied peptides were established using a proposed relationship between this property and calculated chemical hardness. The drug likenesses and bioactivity properties of the peptides considered in this study were obtained by resorting to a homology model by comparison with the bioactivity of related molecules in their interaction with different receptors. To explore the concept of drug repurposing, the potential AGE-inhibition abilities of the Seragamide peptides were studied by comparison with well-known drugs that are already available as pharmaceuticals.
|
20
|
Flores-Holguín N, Frau J, Glossman-Mitnik D. Calculation of the Global and Local Conceptual DFT Indices for the Prediction of the Chemical Reactivity Properties of Papuamides A-F Marine Drugs. Molecules 2019; 24:E3312. [PMID: 31514433 PMCID: PMC6767314 DOI: 10.3390/molecules24183312] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/08/2019] [Revised: 09/05/2019] [Accepted: 09/07/2019] [Indexed: 12/28/2022] Open
Abstract
A well-behaved model chemistry previously validated for the study of the chemical reactivity of peptides was considered for the calculation of the molecular properties and structures of the Papuamide family of marine peptides. A methodology based on Conceptual Density Functional Theory (CDFT) was chosen for the determination of the reactivity descriptors. The molecular active sites were associated with the active regions of the molecules related to the nucleophilic and electrophilic Parr functions. Finally, the drug-likenesses and the bioactivity scores for the Papuamide peptides were predicted through a homology methodology relating them with the calculated reactivity descriptors, while other properties such as the pKas were determined following a methodology developed by our group.
Affiliation(s)
- Norma Flores-Holguín
- Laboratorio Virtual NANOCOSMOS, Departamento de Medio Ambiente y Energía, Centro de Investigación en Materiales Avanzados, Miguel de Cervantes 120, Complejo Industrial Chihuahua, Chihuahua 31136, Mexico.
- Juan Frau
- Departament de Química, Universitat de les Illes Balears, 07122 Palma de Mallorca, Spain.
- Daniel Glossman-Mitnik
- Laboratorio Virtual NANOCOSMOS, Departamento de Medio Ambiente y Energía, Centro de Investigación en Materiales Avanzados, Miguel de Cervantes 120, Complejo Industrial Chihuahua, Chihuahua 31136, Mexico.
- Departament de Química, Universitat de les Illes Balears, 07122 Palma de Mallorca, Spain.
|
21
|
Flores-Holguín N, Frau J, Glossman-Mitnik D. Chemical reactivity and bioactivity properties of the Phallotoxin family of fungal peptides based on Conceptual Peptidology and DFT study. Heliyon 2019; 5:e02335. [PMID: 31463408 PMCID: PMC6710531 DOI: 10.1016/j.heliyon.2019.e02335] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/20/2019] [Revised: 08/12/2019] [Accepted: 08/14/2019] [Indexed: 12/14/2022] Open
Abstract
A methodology based on Conceptual Density Functional Theory (CDFT) was chosen for the calculation of the global and local reactivity descriptors of the Phallotoxin family of fungal peptides. The active sites of the molecules were determined by resorting to some descriptors within Molecular Electron Density Theory (MEDT) such as the Dual Descriptor and the Parr functions. Phallosacin was found to be the most reactive of the peptides on the basis of the calculated Global Reactivity Descriptors. The pKas of the seven studied peptides were established using a proposed relationship between this property and the calculated Global Hardness. The bioactivity properties of the peptides considered in this study were obtained by resorting to a homology model by comparison with the bioactivity of related molecules in their interaction with different receptors.
Affiliation(s)
- Norma Flores-Holguín
- Laboratorio Virtual NANOCOSMOS, Departamento de Medio Ambiente y Energía, Centro de Investigación en Materiales Avanzados, Chihuahua, Chih 31136, Mexico
- Juan Frau
- Departament de Química, Universitat de les Illes Balears, Palma de Mallorca 07122, Spain
- Daniel Glossman-Mitnik
- Departament de Química, Universitat de les Illes Balears, Palma de Mallorca 07122, Spain
- Laboratorio Virtual NANOCOSMOS, Departamento de Medio Ambiente y Energía, Centro de Investigación en Materiales Avanzados, Chihuahua, Chih 31136, Mexico
|
22
|
Awale M, Sirockin F, Stiefl N, Reymond JL. Medicinal Chemistry Aware Database GDBMedChem. Mol Inform 2019; 38:e1900031. [PMID: 31169974 DOI: 10.1002/minf.201900031] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/26/2019] [Accepted: 05/21/2019] [Indexed: 12/17/2022]
Abstract
The generated database GDB17 enumerates 166.4 billion possible molecules up to 17 atoms of C, N, O, S and halogens following simple chemical stability and synthetic feasibility rules; however, medicinal chemistry criteria are not taken into account. Here we applied rules inspired by medicinal chemistry to exclude problematic functional groups and complex molecules from GDB17, and sampled the resulting subset uniformly across molecular size, stereochemistry and polarity to form GDBMedChem as a compact collection of 10 million small molecules. This collection has reduced complexity and better synthetic accessibility than the entire GDB17 but retains higher sp3-carbon fraction and natural product likeness scores compared to known drugs. GDBMedChem molecules are more diverse and very different from known molecules in terms of substructures and represent an unprecedented source of diversity for drug design. GDBMedChem is available for 3D-visualization, similarity searching and for download at http://gdb.unibe.ch.
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Finton Sirockin
- Novartis Institutes for Biomedical Research, Basel, Switzerland
- Nikolaus Stiefl
- Novartis Institutes for Biomedical Research, Basel, Switzerland
- Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
|