1. Tan L, Hirte S, Palmacci V, Stork C, Kirchmair J. Tackling assay interference associated with small molecules. Nat Rev Chem 2024; 8:319-339. [PMID: 38622244 DOI: 10.1038/s41570-024-00593-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/29/2024] [Indexed: 04/17/2024]
Abstract
Biochemical and cell-based assays are essential to discovering and optimizing efficacious and safe drugs, agrochemicals and cosmetics. However, false assay readouts stemming from colloidal aggregation, chemical reactivity, chelation, light signal attenuation and emission, membrane disruption, and other interference mechanisms remain a considerable challenge in screening synthetic compounds and natural products. To address assay interference, a range of powerful experimental approaches are available and in silico methods are now gaining traction. This Review begins with an overview of the scope and limitations of experimental approaches for tackling assay interference. It then focuses on theoretical methods, discusses strategies for their integration with experimental approaches, and provides recommendations for best practices. The Review closes with a summary of the critical facts and an outlook on potential future developments.
Affiliation(s)
- Lu Tan
  - Drug Discovery Sciences, Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
- Steffen Hirte
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Vienna, Austria
- Vincenzo Palmacci
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Vienna, Austria
- Conrad Stork
  - Department of Informatics, Center for Bioinformatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
  - BASF SE, Ludwigshafen am Rhein, Germany
- Johannes Kirchmair
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, Vienna, Austria
2. Mahjour BA, Coley CW. RDCanon: A Python Package for Canonicalizing the Order of Tokens in SMARTS Queries. J Chem Inf Model 2024; 64:2948-2954. [PMID: 38488634 DOI: 10.1021/acs.jcim.4c00138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
SMARTS is a widely used language in cheminformatics for defining substructural queries for database lookups, reaction templates for chemical transformations, and other applications. As an extension to SMILES, many SMARTS patterns can represent the same query. Despite this, no canonicalization algorithm invariant of the line notation sequence or atomic numbering is publicly available. Here, we introduce RDCanon, an open-source Python package that can be used to standardize SMARTS queries. RDCanon is designed to ensure that the sequence of atomic queries remains consistent for all graphs representing the same substructure query and to ensure a canonical sequence of primitives within each individual atom query; furthermore, the algorithm can be applied to canonicalize the order of reactants, agents, and products and their atom map numbers in reaction SMARTS templates. As part of its canonicalization algorithm, RDCanon provides a mechanism in which the canonicalized SMARTS is optimized for speed against specific molecular databases. Several case studies are provided to showcase improved efficiency in substructure matching and retrosynthetic analysis.
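To make the idea of "a canonical sequence of primitives within each individual atom query" concrete, here is a minimal toy sketch. It is not RDCanon's algorithm or API: the function name `canonical_atom_query` is hypothetical, and the sketch only handles a single bracketed atom query whose primitives are joined by one explicit, commutative operator (`;`, `&`, or `,`). Anything more complex is returned unchanged.

```python
def canonical_atom_query(atom_query: str) -> str:
    """Sort the primitives of one bracketed SMARTS atom query (toy version).

    Assumption: the primitives are joined by a single explicit operator
    (';', '&' or ','). Each of these is commutative, so sorting the
    primitives does not change what the query matches.
    """
    if not (atom_query.startswith("[") and atom_query.endswith("]")):
        raise ValueError("expected a bracketed atom query such as '[C;H2;R]'")
    body = atom_query[1:-1]
    # Recursive SMARTS, negation and atom maps are out of scope for this sketch.
    if any(ch in body for ch in "$!:()"):
        return atom_query
    operators = [op for op in ";&," if op in body]
    if len(operators) != 1:
        # No explicit operator, or mixed precedence levels: leave untouched.
        return atom_query
    op = operators[0]
    return "[" + op.join(sorted(body.split(op))) + "]"


# Two spellings of the same atom query map to one canonical string.
print(canonical_atom_query("[R;H2;C]"))
print(canonical_atom_query("[C;H2;R]"))
```

A full canonicalizer also has to order the atoms of the whole query graph and handle implicit operators, which this sketch deliberately ignores.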
Affiliation(s)
- Babak A Mahjour
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Connor W Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
3. Meng W, Pan H, Sha Y, Zhai X, Xing A, Lingampelly SS, Sripathi SR, Wang Y, Li K. Metabolic Connectome and Its Role in the Prediction, Diagnosis, and Treatment of Complex Diseases. Metabolites 2024; 14:93. [PMID: 38392985 PMCID: PMC10890086 DOI: 10.3390/metabo14020093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/17/2024] [Accepted: 01/25/2024] [Indexed: 02/25/2024]
Abstract
The interconnectivity of advanced biological systems is essential for their proper functioning. In modern connectomics, biological entities such as proteins, genes, RNA, DNA, and metabolites are often represented as nodes, while the physical, biochemical, or functional interactions between them are represented as edges. Among these entities, metabolites are particularly significant as they exhibit a closer relationship to an organism's phenotype compared to genes or proteins. Moreover, the metabolome has the ability to amplify small proteomic and transcriptomic changes, even those from minor genomic changes. Metabolic networks, which consist of complex systems comprising hundreds of metabolites and their interactions, play a critical role in biological research by mediating energy conversion and chemical reactions within cells. This review provides an introduction to common metabolic network models and their construction methods. It also explores the diverse applications of metabolic networks in elucidating disease mechanisms, predicting and diagnosing diseases, and facilitating drug development. Additionally, it discusses potential future directions for research in metabolic networks. Ultimately, this review serves as a valuable reference for researchers interested in metabolic network modeling, analysis, and their applications.
Affiliation(s)
- Weiyu Meng
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Hongxin Pan
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Yuyang Sha
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Xiaobing Zhai
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Abao Xing
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Srinivasa R Sripathi
  - Henderson Ocular Stem Cell Laboratory, Retina Foundation of the Southwest, Dallas, TX 75231, USA
- Yuefei Wang
  - National Key Laboratory of Chinese Medicine Modernization, State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin 301617, China
- Kefeng Li
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
4. Olmedo DA, Durant-Archibold AA, López-Pérez JL, Medina-Franco JL. Design and Diversity Analysis of Chemical Libraries in Drug Discovery. Comb Chem High Throughput Screen 2024; 27:502-515. [PMID: 37409545 DOI: 10.2174/1386207326666230705150110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/30/2023] [Accepted: 05/30/2023] [Indexed: 07/07/2023]
Abstract
Chemical libraries and compound data sets are among the main inputs to start the drug discovery process at universities, research institutes, and in the pharmaceutical industry. The approach used to design compound libraries, the chemical information they contain, and the representation of structures play a fundamental role in studies in chemoinformatics, food informatics, in silico pharmacokinetics, computational toxicology, bioinformatics, and molecular modeling that generate computational hits for the subsequent optimization of drug candidates. Growth in drug discovery and development at chemical, biotechnological, and pharmaceutical companies began a few years ago with the integration of computational tools and artificial intelligence methodologies. This integration is anticipated to increase the number of drugs approved by regulatory agencies in the near future.
Affiliation(s)
- Dionisio A Olmedo
  - Centro de Investigaciones Farmacognósticas de la Flora Panameña (CIFLORPAN), Facultad de Farmacia, Universidad de Panamá, Ciudad de Panamá, Apartado, 0824-00178, Panamá
  - Sistema Nacional de Investigación (SNI), Secretaria Nacional de Ciencia, Tecnología e Innovación (SENACYT), Ciudad del Saber, Clayton, Panamá
- Armando A Durant-Archibold
  - Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Apartado, 0843-01103, Panamá
  - Departamento de Bioquímica, Facultad de Ciencias Naturales, Exactas y Tecnología, Universidad de Panamá, Ciudad de Panamá, Panamá
- José Luis López-Pérez
  - CESIFAR, Departamento de Farmacología, Facultad de Medicina, Universidad de Panamá, Ciudad de Panamá, Panamá
  - Departamento de Ciencias Farmacéuticas, Facultad de Farmacia, Universidad de Salamanca, Avda. Campo Charro s/n, 37071 Salamanca, España
- José Luis Medina-Franco
  - DIFACQUIM Grupo de Investigación, Departamento de Farmacia, Escuela de Química, Universidad Nacional Autónoma de México, Ciudad de México, Apartado, 04510, México
5. Vivek-Ananth R, Mohanraj K, Sahoo AK, Samal A. IMPPAT 2.0: An Enhanced and Expanded Phytochemical Atlas of Indian Medicinal Plants. ACS Omega 2023; 8:8827-8845. [PMID: 36910986 PMCID: PMC9996785 DOI: 10.1021/acsomega.3c00156] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 01/09/2023] [Accepted: 02/13/2023] [Indexed: 06/18/2023]
Abstract
Compilation, curation, digitization, and exploration of the phytochemical space of Indian medicinal plants can expedite ongoing efforts toward natural product and traditional knowledge based drug discovery. To this end, we present IMPPAT 2.0, an enhanced and expanded database compiling manually curated information on 4010 Indian medicinal plants, 17,967 phytochemicals, and 1095 therapeutic uses. Notably, IMPPAT 2.0 compiles associations at the level of plant parts and provides a FAIR-compliant nonredundant in silico stereo-aware library of 17,967 phytochemicals from Indian medicinal plants. The phytochemical library has been annotated with several useful properties to enable easier exploration of the chemical space. We have also filtered a subset of 1335 drug-like phytochemicals, of which the majority have no similarity to existing approved drugs. Using cheminformatics, we have characterized the molecular complexity and molecular scaffold based structural diversity of the phytochemical space of Indian medicinal plants and performed a comparative analysis with other chemical libraries. Altogether, IMPPAT 2.0 is a manually curated extensive phytochemical atlas of Indian medicinal plants that is accessible at https://cb.imsc.res.in/imppat/.
Affiliation(s)
- R.P. Vivek-Ananth
  - The Institute of Mathematical Sciences (IMSc), Chennai 600113, India
  - Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Ajaya Kumar Sahoo
  - The Institute of Mathematical Sciences (IMSc), Chennai 600113, India
  - Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Areejit Samal
  - The Institute of Mathematical Sciences (IMSc), Chennai 600113, India
  - Homi Bhabha National Institute (HBNI), Mumbai 400094, India
6. Dolfus U, Briem H, Rarey M. Visualizing Generic Reaction Patterns. J Chem Inf Model 2022; 62:4680-4689. [PMID: 36169383 DOI: 10.1021/acs.jcim.2c00992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Reaction schemes for organic molecules play a crucial role in modern in silico drug design processes. In contrast to the classical drawn reaction diagrams, computational chemists prefer SMARTS based line notations due to a substantially increased expressiveness and precision. They are used to search databases, calculate synthesizability, generate new molecules, or simulate novel reactions. Working with computer-readable representations of reaction schemes can be challenging due to the complexity of the features to be represented. Line representations of reaction schemes can often be cryptic, even to experienced users. To simplify the work with Reaction SMARTS for synthetic, computational, and medicinal chemists, we introduce a visualization technique for reaction schemes and provide a respective tool, called ReactionViewer. ReactionViewer is able to convert reaction schemes encoded as Reaction SMILES, Reaction SMARTS, or SMIRKS into a visual representation. The visualization technique is based on the concept of structure diagrams and follows IUPAC's "Compendium of Chemical Terminology" definition of chemical reaction equations for the reaction symbols. We demonstrate the applicability of the method using two data sets of organic synthesis reaction schemes taken from recent publications. We discuss various properties of the visualization and highlight its readability and interpretability.
Affiliation(s)
- Uschi Dolfus
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Hans Briem
  - Bayer AG, Research and Development, Pharmaceuticals, Computational Molecular Design Berlin, Building S110, 711, 13342 Berlin, Germany
- Matthias Rarey
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
7. Jackson IM, Webb EW, Scott PJ, James ML. In Silico Approaches for Addressing Challenges in CNS Radiopharmaceutical Design. ACS Chem Neurosci 2022; 13:1675-1683. [PMID: 35606334 PMCID: PMC9945852 DOI: 10.1021/acschemneuro.2c00269] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/08/2023]
Abstract
Positron emission tomography (PET) is a highly sensitive and versatile molecular imaging modality that leverages radiolabeled molecules, known as radiotracers, to interrogate biochemical processes such as metabolism, enzymatic activity, and receptor expression. The ability to probe specific molecular and cellular events longitudinally in a noninvasive manner makes PET imaging a particularly powerful technique for studying the central nervous system (CNS) in both health and disease. Unfortunately, developing and translating a single CNS PET tracer for clinical use is typically an extremely resource-intensive endeavor, often requiring synthesis and evaluation of numerous candidate molecules. While existing in vitro methods are beginning to address the challenge of derisking molecules prior to costly in vivo PET studies, most require a significant investment of resources and possess substantial limitations. In the context of CNS drug development, significant time and resources have been invested into the development and optimization of computational methods, particularly involving machine learning, to streamline the design of better CNS therapeutics. However, analogous efforts developed and validated for CNS radiotracer design are conspicuously limited. In this Perspective, we overview the requirements and challenges of CNS PET tracer design, survey the most promising computational methods for in silico CNS drug design, and bridge these two areas by discussing the potential applications and impact of computational design tools in CNS radiotracer design.
Affiliation(s)
- Isaac M. Jackson
  - Department of Radiology, Stanford University, Stanford, CA 94305
- E. William Webb
  - Department of Radiology, University of Michigan, Ann Arbor, MI 48109
- Peter J.H. Scott
  - Department of Radiology, University of Michigan, Ann Arbor, MI 48109
- Michelle L. James
  - Department of Radiology, Stanford University, Stanford, CA 94305
  - Department of Neurology & Neurological Sciences, Stanford University, Stanford, CA 94304
Corresponding Authors: Peter J. H. Scott, Department of Radiology, University of Michigan, Ann Arbor, MI 48109, United States; Michelle L. James, Departments of Radiology, and Neurology & Neurological Sciences, 1201 Welch Rd., P-206, Stanford, CA 94305-5484, United States
8. Penner P, Guba W, Schmidt R, Meyder A, Stahl M, Rarey M. The Torsion Library: Semiautomated Improvement of Torsion Rules with SMARTScompare. J Chem Inf Model 2022; 62:1644-1653. [PMID: 35318851 DOI: 10.1021/acs.jcim.2c00043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The Torsion Library is a collection of torsion motifs associated with angle distributions, derived from crystallographic databases. It is used in strain assessment, conformer generation, and geometry optimization. A hierarchical structure of expert curated SMARTS defines the chemical environments of rotatable bonds and associates these with preferred angles. SMARTS can be very complex and full of implications, which make them difficult to maintain manually. Recent developments in automatically comparing SMARTS patterns can be applied to the Torsion Library to ensure its correctness. We specifically discuss the implementation and the limits of such a procedure in the context of torsion motifs and show several examples of how the Torsion Library benefits from this. All automated changes are validated manually and then shown to have an effect on the angle distributions by correcting matching behavior. The corrected Torsion Library itself is available including both PDB as well as CSD histograms in the Supporting Information and can be used to evaluate rotatable bonds at https://torsions.zbh.uni-hamburg.de.
Affiliation(s)
- Patrick Penner
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Wolfgang Guba
  - Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., CH-4070 Basel, Switzerland
- Robert Schmidt
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Agnes Meyder
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Martin Stahl
  - Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., CH-4070 Basel, Switzerland
- Matthias Rarey
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
9. Heid E, Liu J, Aude A, Green WH. Influence of Template Size, Canonicalization, and Exclusivity for Retrosynthesis and Reaction Prediction Applications. J Chem Inf Model 2021; 62:16-26. [PMID: 34939786 PMCID: PMC8757433 DOI: 10.1021/acs.jcim.1c01192] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
Heuristic and machine learning models for rank-ordering reaction templates comprise an important basis for computer-aided organic synthesis regarding both product prediction and retrosynthetic pathway planning. Their viability relies heavily on the quality and characteristics of the underlying template database. With the advent of automated reaction and template extraction software and consequently the creation of template databases too large for manual curation, a data-driven approach to assess and improve the quality of template sets is needed. We therefore systematically studied the influence of template generality, canonicalization, and exclusivity on the performance of different template ranking models. We find that duplicate and nonexclusive templates, i.e., templates which describe the same chemical transformation on identical or overlapping sets of molecules, decrease both the accuracy of the ranking algorithm and the applicability of the respective top-ranked templates significantly. To remedy the negative effects of nonexclusivity, we developed a general and computationally efficient framework to deduplicate and hierarchically correct templates. As a result, performance improved considerably for both heuristic and machine learning template ranking models, as well as multistep retrosynthetic planning models. The canonicalization and correction code is made freely available.
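The deduplication step mentioned in this abstract can be illustrated with a deliberately simple sketch. It is not the authors' framework: the helper names `template_key` and `deduplicate` are hypothetical, the template strings are made up for illustration, and the key only removes differences in the order of dot-separated components. Differences in atom order or atom-map numbering would need graph-level canonicalization, which is not attempted here.

```python
def template_key(template: str) -> str:
    """Order-insensitive key for a 'reactants>agents>products' template string."""
    parts = template.split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators")
    # Sort the dot-separated components on each side of the arrow.
    return ">".join(".".join(sorted(side.split("."))) for side in parts)


def deduplicate(templates: list[str]) -> list[str]:
    """Keep the first template seen for each key, preserving input order."""
    seen: set[str] = set()
    unique: list[str] = []
    for template in templates:
        key = template_key(template)
        if key not in seen:
            seen.add(key)
            unique.append(template)
    return unique


# Hypothetical templates: the first two differ only in reactant order.
templates = [
    "[C:1][OH].[N:2]>>[C:1][N:2]",
    "[N:2].[C:1][OH]>>[C:1][N:2]",
    "[C:1]Br.[N:2]>>[C:1][N:2]",
]
print(deduplicate(templates))
```

Keeping the first occurrence is an arbitrary choice for this sketch. A real pipeline would more likely merge the usage statistics of duplicates before ranking.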
Affiliation(s)
- Esther Heid
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Jiannan Liu
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Andrea Aude
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- William H Green
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
10. Schmidt R, Klein R, Rarey M. Maximum Common Substructure Searching in Combinatorial Make-on-Demand Compound Spaces. J Chem Inf Model 2021; 62:2133-2150. [PMID: 34478299 DOI: 10.1021/acs.jcim.1c00640] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 01/24/2023]
Abstract
Commercial make-on-demand compound spaces have become increasingly popular within the past few years. Since these libraries are too large for enumeration, they are usually accessed using combinatorial fragment space technologies like FTrees-FS and SpaceLight. Although both search types are of high practical impact, they lack the ability to search for precise structural features on the atomic level. To address this important use case, we developed SpaceMACS enabling efficient and precise maximum common induced substructure (MCIS) similarity and substructure searches within chemical fragment spaces. SpaceMACS enumerates a user-defined number of compounds in a multistep procedure. First, substructures of the query are extracted and matched to all fragments of the space. Then partial results are combined to actual compounds of the space. In this way, SpaceMACS identifies common substructures even if they cross fragment borders. We applied SpaceMACS on three commercial fragment spaces searching for the 150 000 most similar analogs to a glucosyltransferase binder from literature. We were able to find almost all building blocks used for the synthesis of the 90 listed analogs and a plethora of additional results. SpaceMACS is the missing link to enable rational drug discovery on make-on-demand combinatorial catalogs. No matter whether initial compound suggestions come from a de novo design, an AI-based compound generation, or a medicinal chemist's drawing board, the method gives access to the structurally closest chemically available analogs in seconds to at most minutes.
Affiliation(s)
- Robert Schmidt
  - Universität Hamburg, ZBH-Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Raphael Klein
  - BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Matthias Rarey
  - Universität Hamburg, ZBH-Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
11. Keith JA, Vassilev-Galindo V, Cheng B, Chmiela S, Gastegger M, Müller KR, Tkatchenko A. Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Chem Rev 2021; 121:9816-9872. [PMID: 34232033 PMCID: PMC8391798 DOI: 10.1021/acs.chemrev.1c00107] [Citation(s) in RCA: 188] [Impact Index Per Article: 62.7] [Received: 02/04/2021] [Indexed: 12/23/2022]
Abstract
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This Review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Affiliation(s)
- John A. Keith
  - Department of Chemical and Petroleum Engineering Swanson School of Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Valentin Vassilev-Galindo
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Bingqing Cheng
  - Accelerate Programme for Scientific Discovery, Department of Computer Science and Technology, 15 J. J. Thomson Avenue, Cambridge CB3 0FD, United Kingdom
- Stefan Chmiela
  - Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
- Michael Gastegger
  - Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
  - Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany
  - Google Research, Brain Team, 10117 Berlin, Germany
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
12. Bonciarelli S, Desantis J, Goracci L, Siragusa L, Zamora I, Ortega-Carrasco E. Automatic Identification of Lansoprazole Degradants under Stress Conditions by LC-HRMS with MassChemSite and WebChembase. J Chem Inf Model 2021; 61:2706-2719. [PMID: 34061520 DOI: 10.1021/acs.jcim.1c00226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Stress testing is one of the most important parts of the drug development process, helping to foresee stability problems and to identify degradation products. One of the processes involving stress testing is represented by forced degradation studies, which can predict the impact of certain conditions of pH, moisture, heat, or other negative effects due to transportation or packaging issues on drug potency and purity, ensuring patient safety. Regulatory agencies have been working on standardizing laboratory procedures for the past two decades. One result of those years of intensive research is the International Conference on Harmonization (ICH) guidelines, which clearly define which forced degradation studies should be performed on new drugs; these studies have become routine work in pharmaceutical laboratories. Since the techniques used, based on high-performance liquid chromatography coupled with high-resolution mass spectrometry, were developed years ago and are now mastered by pharmaceutical scientists, automation of data analysis, and thus data processing, is becoming a hot topic. In this work, we present MassChemSite and WebChembase as a tandem to automate routine analysis studies without losing information quality, using as a case study the degradation of lansoprazole under acidic, oxidative, basic, and neutral stress conditions.
Affiliation(s)
- Stefano Bonciarelli
  - Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
- Jenny Desantis
  - Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
- Laura Goracci
  - Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
- Lydia Siragusa
  - Molecular Horizon SRL, Via Montelino 30, 06084 Bettona, Italy
- Ismael Zamora
  - Lead Molecular Design, SL, Rambla del Celler 113 local, 08173 Sant Cugat del Vallès, Spain
13. Schmidt R, Krull F, Heinzke AL, Rarey M. Disconnected Maximum Common Substructures under Constraints. J Chem Inf Model 2020; 61:167-178. [PMID: 33325698 DOI: 10.1021/acs.jcim.0c00741] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
The maximum common substructure (MCS) problem is an important, well-studied problem in cheminformatics. It is applied in several application scenarios like molecule superimposition and scaffold detection or as a similarity measure in virtual screening and clustering. In many cases, the connected MCS is preferred since it is faster to calculate and a highly fragmented MCS is not very meaningful from a chemical point of view. Nevertheless, a disconnected MCS (dMCS) can be very instructive if it consists of reasonably sized molecular parts connected by variable groups. We present a new algorithm named RIMACS, which is able to calculate the dMCS under constraints. We can control the maximum number of connected components and their minimal size using a modified local substructure mapping approach. A formal proof of correctness is provided as well as extended runtime evaluations on chemical data. The evaluation of RIMACS shows that a small number of connected components helps us to improve MCS similarity in a meaningful way while keeping the runtime requirements in a reasonable range.
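To show how an MCS can serve as a similarity measure, the sketch below uses one common Tanimoto-style normalization based on atom counts. This is an assumption chosen for illustration and not necessarily the measure used with RIMACS. The function name `mcs_tanimoto` and the atom counts in the example are hypothetical.

```python
def mcs_tanimoto(n_atoms_a: int, n_atoms_b: int, n_atoms_mcs: int) -> float:
    """Tanimoto-style similarity from the sizes of two molecules and their MCS.

    similarity = |MCS| / (|A| + |B| - |MCS|)
    """
    if n_atoms_a <= 0 or n_atoms_b <= 0:
        raise ValueError("both molecules must contain at least one atom")
    if not 0 <= n_atoms_mcs <= min(n_atoms_a, n_atoms_b):
        raise ValueError("the MCS cannot be larger than the smaller molecule")
    return n_atoms_mcs / (n_atoms_a + n_atoms_b - n_atoms_mcs)


# Hypothetical pair of molecules with 20 and 24 heavy atoms.
connected = mcs_tanimoto(20, 24, 12)     # connected MCS of 12 atoms
disconnected = mcs_tanimoto(20, 24, 18)  # two-component dMCS of 18 atoms
print(connected, disconnected)
```

In this made-up case, allowing a second connected component raises the similarity because the common part is no longer cut short at a variable group.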
Collapse
Affiliation(s)
- Robert Schmidt
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| | - Florian Krull
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| | - Anna Lina Heinzke
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| | - Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| |
|
14
|
Schuffenhauer A, Schneider N, Hintermann S, Auld D, Blank J, Cotesta S, Engeloch C, Fechner N, Gaul C, Giovannoni J, Jansen J, Joslin J, Krastel P, Lounkine E, Manchester J, Monovich LG, Pelliccioli AP, Schwarze M, Shultz MD, Stiefl N, Baeschlin DK. Evolution of Novartis' Small Molecule Screening Deck Design. J Med Chem 2020; 63:14425-14447. [PMID: 33140646 DOI: 10.1021/acs.jmedchem.0c01332] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/22/2022]
Abstract
This article summarizes the evolution of the screening deck at the Novartis Institutes for BioMedical Research (NIBR). Historically, the screening deck was an assembly of all available compounds. In 2015, we designed a first deck to facilitate access to diverse subsets with optimized properties. We allocated the compounds as plated subsets on a 2D grid with property based ranking in one dimension and increasing structural redundancy in the other. The learnings from the 2015 screening deck were applied to the design of a next generation in 2019. We found that using traditional leadlikeness criteria (mainly MW, clogP) reduces the hit rates of attractive chemical starting points in subset screening. Consequently, the 2019 deck relies on solubility and permeability to select preferred compounds. The 2019 design also uses NIBR's experimental assay data and inferred biological activity profiles in addition to structural diversity to define redundancy across the compound sets.
Affiliation(s)
- Ansgar Schuffenhauer
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Nadine Schneider
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Samuel Hintermann
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Douglas Auld
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Jutta Blank
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Simona Cotesta
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Caroline Engeloch
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Nikolas Fechner
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Christoph Gaul
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Jerome Giovannoni
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Johanna Jansen
- Novartis Institutes for BioMedical Research-Emeryville, 5300 Chiron Way, Emeryville, California 94608-2916, United States
| | - John Joslin
- Genomics Institute of the Novartis Foundation, San Diego, California 92121, United States
| | - Philipp Krastel
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Eugen Lounkine
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - John Manchester
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Lauren G Monovich
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Anna Paola Pelliccioli
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Manuel Schwarze
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Michael D Shultz
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Nikolaus Stiefl
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Daniel K Baeschlin
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| |
|
15
|
Zhao Y, Zhang LX, Jiang T, Long J, Ma ZY, Lu AP, Cheng Y, Cao DS. The ups and downs of Poly(ADP-ribose) Polymerase-1 inhibitors in cancer therapy–Current progress and future direction. Eur J Med Chem 2020; 203:112570. [DOI: 10.1016/j.ejmech.2020.112570] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/09/2020] [Revised: 06/10/2020] [Accepted: 06/11/2020] [Indexed: 12/13/2022]
|
16
|
Ehrt C, Krause B, Schmidt R, Ehmki ESR, Rarey M. SMARTS.plus - A Toolbox for Chemical Pattern Design. Mol Inform 2020; 39:e2000216. [PMID: 32997890 PMCID: PMC7757167 DOI: 10.1002/minf.202000216] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/18/2020] [Accepted: 09/28/2020] [Indexed: 11/06/2022]
Abstract
The number of publications concerning Pan-Assay Interference Compounds and related problematic structural motifs in screening libraries is constantly growing. In consequence, filter collections are merged, extended but also critically discussed. Due to the complexity of the chemical pattern language SMARTS, an easy-to-use toolbox enabling every chemist to understand, design and modify chemical patterns is urgently needed. Over the past decade, we developed a series of software tools for visualizing, editing, creating, and analysing chemical patterns. Herein, we highlight how most of these tools can now be easily used as part of the novel SMARTS.plus web server (https://smarts.plus/). As a showcase, we demonstrate how researchers can apply the web server tools within minutes to derive novel SMARTS patterns for the filtering of frequent hitters from their screening libraries with only a little experience with the SMARTS language.
Affiliation(s)
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146, Hamburg, Germany
| | - Bennet Krause
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146, Hamburg, Germany
| | - Robert Schmidt
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146, Hamburg, Germany
| | - Emanuel S R Ehmki
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146, Hamburg, Germany
| | - Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146, Hamburg, Germany
| |
|
17
|
Yang X, Liu C, Walker BD, Ren P. Accurate description of molecular dipole surface with charge flux implemented for molecular mechanics. J Chem Phys 2020; 153:064103. [DOI: 10.1063/5.0016376] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Affiliation(s)
- Xudong Yang
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, 78712 Texas, USA
| | - Chengwen Liu
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, 78712 Texas, USA
| | - Brandon D. Walker
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, 78712 Texas, USA
| | - Pengyu Ren
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, 78712 Texas, USA
| |
|
18
|
Cheng CY, Campbell JE, Day GM. Evolutionary chemical space exploration for functional materials: computational organic semiconductor discovery. Chem Sci 2020; 11:4922-4933. [PMID: 34122948 PMCID: PMC8159259 DOI: 10.1039/d0sc00554a] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/29/2020] [Accepted: 04/21/2020] [Indexed: 11/26/2022] Open
Abstract
Computational methods, including crystal structure and property prediction, have the potential to accelerate the materials discovery process by enabling structure prediction and screening of possible molecular building blocks prior to their synthesis. However, the discovery of new functional molecular materials is still limited by the need to identify promising molecules from a vast chemical space. We describe an evolutionary method which explores a user specified region of chemical space to identify promising molecules, which are subsequently evaluated using crystal structure prediction. We demonstrate the methods for the exploration of aza-substituted pentacenes with the aim of finding small molecule organic semiconductors with high charge carrier mobilities, where the space of possible substitution patterns is too large to exhaustively search using a high throughput approach. The method efficiently explores this large space, typically requiring calculations on only ∼1% of molecules during a search. The results reveal two promising structural motifs: aza-substituted naphtho[1,2-a]anthracenes with reorganisation energies as low as pentacene and a series of pyridazine-based molecules having both low reorganisation energies and high electron affinities.
Affiliation(s)
- Chi Y Cheng
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Highfield, Southampton SO17 1NX, UK
| | - Josh E Campbell
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Highfield, Southampton SO17 1NX, UK
| | - Graeme M Day
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Highfield, Southampton SO17 1NX, UK
| |
|
19
|
Wang Z, Walker GW, Muir DCG, Nagatani-Yoshida K. Toward a Global Understanding of Chemical Pollution: A First Comprehensive Analysis of National and Regional Chemical Inventories. Environ Sci Technol 2020; 54:2575-2584. [PMID: 31968937 DOI: 10.1021/acs.est.9b06379] [Citation(s) in RCA: 306] [Impact Index Per Article: 76.5] [Indexed: 05/18/2023]
Abstract
Chemicals, while bringing benefits to society, may be released during their lifecycles and possibly cause harm to humans and ecosystems. Chemical pollution has been mentioned as one of the planetary boundaries within which humanity can safely operate, but is not comprehensively understood. Here, 22 chemical inventories from 19 countries and regions are analyzed to achieve a first comprehensive overview of chemicals on the market as an essential first step toward a global understanding of chemical pollution. Over 350 000 chemicals and mixtures of chemicals have been registered for production and use, up to three times as many as previously estimated and with substantial differences across countries/regions. A noteworthy finding is that the identities of many chemicals remain publicly unknown because they are claimed as confidential (over 50 000) or ambiguously described (up to 70 000). Coordinated efforts by all stakeholders including scientists from different disciplines are urgently needed, with (new) areas of interest and opportunities highlighted here.
Affiliation(s)
- Zhanyun Wang
- Chair of Ecological Systems Design, Institute of Environmental Engineering, ETH Zürich, 8093 Zürich, Switzerland, ORCID: 0000-0001-9914-7659
| | - Glen W Walker
- Department of the Environment and Energy, Australian Government, General Post Office Box 787, Canberra, Australian Capital Territory 2601, Australia
| | - Derek C G Muir
- Environment & Climate Change Canada, Canada Centre for Inland Waters, Burlington, Ontario, Canada, ORCID: 0000-0001-6631-9776
| | | |
|
20
|
Yang H, Lou C, Li W, Liu G, Tang Y. Computational Approaches to Identify Structural Alerts and Their Applications in Environmental Toxicology and Drug Discovery. Chem Res Toxicol 2020; 33:1312-1322. [DOI: 10.1021/acs.chemrestox.0c00006] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/18/2022]
Affiliation(s)
- Hongbin Yang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Chaofeng Lou
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Weihua Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Guixia Liu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| |
|
21
|
Ehmki ESR, Schmidt R, Ohm F, Rarey M. Comparing Molecular Patterns Using the Example of SMARTS: Applications and Filter Collection Analysis. J Chem Inf Model 2019; 59:2572-2586. [DOI: 10.1021/acs.jcim.9b00249] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Affiliation(s)
| | - Robert Schmidt
- ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| | - Farina Ohm
- ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| | - Matthias Rarey
- ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
| |
|