1
Najar Najafi N, Karbassian R, Hajihassani H, Azimzadeh Irani M. Unveiling the influence of fastest Nobel Prize winner discovery: AlphaFold's algorithmic intelligence in medical sciences. J Mol Model 2025; 31:163. [PMID: 40387957 DOI: 10.1007/s00894-025-06392-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2024] [Accepted: 05/06/2025] [Indexed: 05/20/2025]
Abstract
CONTEXT: AlphaFold's advanced AI technology has transformed protein structure interpretation. By predicting three-dimensional protein structures from amino acid sequences, AlphaFold has solved the complex protein-folding problem, which was previously challenging for experimental methods because of the large number of possible conformations. Since its inception, several versions and related tools have been released, including AlphaFold2, AlphaFold DB, AlphaFold-Multimer, AlphaMissense, and AlphaFold3, each further enhancing protein structure prediction. Remarkably, AlphaFold is regarded as the fastest Nobel Prize-winning discovery in the history of science. The technology has extensive applications and could transform treatment and diagnosis in medical sciences by reducing the cost and time of drug design while elucidating structural pathways of human body systems. Numerous studies have demonstrated how AlphaFold aids in understanding health conditions by providing critical information about protein mutations, abnormal protein-protein interactions, and changes in protein dynamics. Researchers have also developed new technologies and pipelines that build on different versions of AlphaFold to amplify its potential. However, addressing existing limitations is crucial to maximizing AlphaFold's capacity to redefine medical research. This article reviews AlphaFold's impact on five key aspects of medical sciences: protein mutation, protein-protein interaction, molecular dynamics, drug design, and immunotherapy.
METHODS: This review examines the contributions of various AlphaFold versions (AlphaFold2, AlphaFold DB, AlphaFold-Multimer, AlphaMissense, and AlphaFold3) to protein structure prediction. The methods include an extensive analysis of computational techniques and software used in interpreting and predicting protein structures, emphasizing advances in AI technology and its applications in medical research.
Affiliation(s)
- Niki Najar Najafi
  - Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Reyhaneh Karbassian
  - Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Helia Hajihassani
  - Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
2
Makkawi A, Beker W, Wołos A, Manna S, Roszak R, Szymkuć S, Moskal M, Koshevarnikov A, Molga K, Żądło-Dobrowolska A, Grzybowski BA. Retro-forward synthesis design and experimental validation of potent structural analogs of known drugs. Chem Sci 2025; 16:8383-8393. [PMID: 40225181 PMCID: PMC11983321 DOI: 10.1039/d5sc00070j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2025] [Accepted: 03/16/2025] [Indexed: 04/15/2025] [Open Access]
Abstract
Generation of structural analogs to "parent" molecule(s) of interest remains one of the important elements of drug development. Ideally, such analogs should be synthesizable by concise and robust synthetic routes. The current work illustrates how this process can be facilitated by a computational pipeline spanning (i) diversification of the parent via substructure replacements aimed at enhancing biological activity, (ii) retrosynthesis of the thus generated "replicas" to identify substrates, (iii) forward syntheses originating from these substrates (and synthetically versatile "auxiliaries") and guided "towards" the parent, and (iv) evaluation of the candidates for target binding and other medicinal-chemical properties. This pipeline proposes syntheses of thousands of readily makeable analogs in a matter of minutes, and is deployed here to validate by experiment seven structural analogs of Ketoprofen and six analogs of Donepezil. The concise, computer-designed syntheses are confirmed in 12 out of 13 cases, offering access to several potent inhibitors. While the synthesis-design component is robust, binding affinities are predicted less accurately although still to the order-of-magnitude, which may be valuable in discerning promising from inadequate binders.
Affiliation(s)
- Ahmad Makkawi
  - Institute of Organic Chemistry, Polish Academy of Sciences, Warsaw, Poland
- Sabyasachi Manna
  - Institute of Organic Chemistry, Polish Academy of Sciences, Warsaw, Poland
- Aleksei Koshevarnikov
  - Institute of Organic Chemistry, Polish Academy of Sciences, Warsaw, Poland
  - Allchemy, Inc., Highland, IN, USA
- Bartosz A Grzybowski
  - Institute of Organic Chemistry, Polish Academy of Sciences, Warsaw, Poland
  - Center for Algorithmic and Robotized Synthesis (CARS), Institute for Basic Science (IBS), Ulsan 44919, Republic of Korea
  - Department of Chemistry, Ulsan Institute of Science and Technology (UNIST), Ulsan 44919, Republic of Korea
3
Guo J, Schwaller P. Directly optimizing for synthesizability in generative molecular design using retrosynthesis models. Chem Sci 2025; 16:6943-6956. [PMID: 40123687 PMCID: PMC11927497 DOI: 10.1039/d5sc01476j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2025] [Accepted: 03/11/2025] [Indexed: 03/25/2025] [Open Access]
Abstract
Synthesizability in generative molecular design remains a pressing challenge. Existing methods to assess synthesizability include heuristics-based metrics or retrosynthesis models which predict a synthetic pathway. By contrast, an explicit approach anchors generation with "synthetically-feasible" chemical transformations, such that all generated molecules already have a predicted synthetic pathway. To date, retrosynthesis models have been mostly used as a post hoc filtering tool as their inference cost remains prohibitive to use directly in an optimization loop. In this work, we show that with a sufficiently sample-efficient generative model, it is straightforward to directly optimize for synthesizability using retrosynthesis models in goal-directed generation. Under a heavily-constrained computational budget, our model can generate molecules satisfying multi-parameter drug discovery optimization tasks while being synthesizable, as deemed by retrosynthesis models. We reaffirm previous findings that common synthesizability heuristics (formulated based on known bio-active molecules) can be well correlated with retrosynthesis models' solvability, such that optimizing for the latter may not be an optimal allocation of computational resources. However, going further, we show that moving to other classes of molecules, such as functional materials, current heuristics' correlations diminish, such that there is an advantage to incorporating retrosynthesis models directly in the optimization loop. Finally, we demonstrate that over-reliance on synthesizability heuristics can overlook promising molecules. The codebase is available at https://github.com/schwallergroup/saturn.
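As a rough illustration of what placing a retrosynthesis model inside the optimization loop can mean, the sketch below gates a multi-parameter reward on whether a route is found. It is not the Saturn implementation: `gated_reward`, `toy_solver`, and `toy_scores` are hypothetical names, and the toy solver is a stand-in for a real retrosynthesis model call.

```python
from typing import Callable, Dict


def gated_reward(
    smiles: str,
    property_scores: Dict[str, Callable[[str], float]],
    is_solved: Callable[[str], bool],
) -> float:
    """Product of property scores, set to zero when no synthesis route is found.

    `is_solved` stands in for a retrosynthesis model call (an assumption);
    each property score is clipped to the range [0, 1] before multiplying.
    """
    if not is_solved(smiles):
        return 0.0
    reward = 1.0
    for score in property_scores.values():
        reward *= max(0.0, min(1.0, score(smiles)))
    return reward


# Hypothetical stand-ins so the sketch runs without any chemistry library.
def toy_solver(smiles: str) -> bool:
    """Pretend that only short SMILES strings have a known route."""
    return len(smiles) <= 10


toy_scores = {
    "size": lambda s: len(s) / 10.0,
    "has_nitrogen": lambda s: 1.0 if "N" in s else 0.5,
}

print(gated_reward("CCN", toy_scores, toy_solver))
print(gated_reward("C" * 20, toy_scores, toy_solver))
```

Because every reward evaluation triggers one solver call, the sample efficiency of the generative model determines how many such calls a fixed budget allows, which is the trade-off the abstract highlights.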
Affiliation(s)
- Jeff Guo
  - École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
  - National Centre of Competence in Research (NCCR) Catalysis, Switzerland
- Philippe Schwaller
  - École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
  - National Centre of Competence in Research (NCCR) Catalysis, Switzerland
4
Hassen AK, Šícho M, van Aalst YJ, Huizenga MCW, Reynolds DNR, Luukkonen S, Bernatavicius A, Clevert DA, Janssen APA, van Westen GJP, Preuss M. Generate what you can make: achieving in-house synthesizability with readily available resources in de novo drug design. J Cheminform 2025; 17:41. [PMID: 40155970 PMCID: PMC11954305 DOI: 10.1186/s13321-024-00910-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 09/28/2024] [Indexed: 04/01/2025] [Open Access]
Abstract
Computer-Aided Synthesis Planning (CASP) and CASP-based approximated synthesizability scores have rarely been used as generation objectives in Computer-Aided Drug Design despite facilitating the in-silico generation of synthesizable molecules. However, these synthesizability approaches are disconnected from the reality of small laboratory drug design, where building block resources are limited, thus making the notion of in-house synthesizability with already available resources highly desirable. In this work, we show a successful in-house de novo drug design workflow generating active and in-house synthesizable ligands of monoglyceride lipase (MGLL). First, we demonstrate the successful transfer of CASP from 17.4 million commercial building blocks to a small laboratory setting of roughly 6000 building blocks, with only a 12% decrease in CASP success when accepting synthesis routes that are two reaction steps longer on average. Next, we present a rapidly retrainable in-house synthesizability score, successfully capturing our in-house synthesizability without relying on external building block resources. We show that including our in-house synthesizability score in a multi-objective de novo drug design workflow, alongside a simple QSAR model, provides thousands of potentially active and easily in-house synthesizable molecules. Finally, we experimentally evaluate the synthesis and biochemical activity of three de novo candidates using their CASP-suggested synthesis routes employing only in-house building blocks. We find one candidate with evident activity, suggesting potential new ligand ideas for MGLL inhibitors while showcasing the usefulness of our in-house synthesizability score for de novo drug design.
Scientific contribution: Our core scientific contribution is the introduction of in-house de novo drug design, which enables the practical application of generative methods in small laboratories by utilizing a limited stock of available building blocks. Our fast-to-adapt workflow for in-house synthesizability scoring requires minimal computational retraining costs while supporting a high diversity of generated structures. We highlight the practicality of our approach through a comprehensive in-vitro case study that relies entirely on in-house resources, including in-silico generation, synthesis planning, and activity evaluation.
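The notion of in-house synthesizability described in this abstract can be sketched as a check that every starting material of a planned route is in the available stock. This is a minimal illustration under stated assumptions, not the authors' workflow: `RouteNode`, `leaf_molecules`, and `is_in_house_synthesizable` are hypothetical names, and the two stock sets are toy examples.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class RouteNode:
    """One molecule in a synthesis route; no precursors means a starting material."""
    smiles: str
    precursors: List["RouteNode"] = field(default_factory=list)


def leaf_molecules(node: RouteNode) -> List[str]:
    """Collect the starting materials (leaves) of a route tree."""
    if not node.precursors:
        return [node.smiles]
    found: List[str] = []
    for child in node.precursors:
        found.extend(leaf_molecules(child))
    return found


def is_in_house_synthesizable(route: RouteNode, stock: Set[str]) -> bool:
    """A route counts as solved only if every starting material is in stock."""
    return all(smiles in stock for smiles in leaf_molecules(route))


# Toy route: nitrobenzene -> aniline, then aniline + acetic anhydride -> acetanilide.
nitrobenzene = RouteNode("O=[N+]([O-])c1ccccc1")
aniline = RouteNode("Nc1ccccc1", [nitrobenzene])
acetic_anhydride = RouteNode("CC(=O)OC(C)=O")
route = RouteNode("CC(=O)Nc1ccccc1", [aniline, acetic_anhydride])

commercial_stock = {"O=[N+]([O-])c1ccccc1", "CC(=O)OC(C)=O"}  # toy large catalogue
in_house_stock = {"CC(=O)OC(C)=O"}  # toy small laboratory shelf

print(is_in_house_synthesizable(route, commercial_stock))
print(is_in_house_synthesizable(route, in_house_stock))
```

A smaller stock makes the same route fail, so a planner restricted to in-house building blocks has to search for alternative and often longer routes, in line with the longer routes reported in the abstract.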
Affiliation(s)
- Alan Kai Hassen
  - Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
  - Machine Learning Research, Pfizer Research and Development, Berlin, Germany
- Martin Šícho
  - Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Prague, Czech Republic
- Yorick J van Aalst
  - Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Darcy N R Reynolds
  - Leiden Institute of Chemistry, Leiden University, Leiden, The Netherlands
- Sohvi Luukkonen
  - Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Andrius Bernatavicius
  - Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
  - Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Djork-Arné Clevert
  - Machine Learning Research, Pfizer Research and Development, Berlin, Germany
- Gerard J P van Westen
  - Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Mike Preuss
  - Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
5
Nakamura S, Yasuo N, Sekijima M. Molecular optimization using a conditional transformer for reaction-aware compound exploration with reinforcement learning. Commun Chem 2025; 8:40. [PMID: 39922979 PMCID: PMC11807120 DOI: 10.1038/s42004-025-01437-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Accepted: 01/28/2025] [Indexed: 02/10/2025] [Open Access]
Abstract
Designing molecules with desirable properties is a critical endeavor in drug discovery. Recent advances in deep learning have led to the development of molecular generative models. However, existing compound exploration models often disregard the important issue of ensuring the feasibility of organic synthesis. To address this issue, we propose TRACER, a framework that integrates molecular property optimization with synthetic pathway generation. The model can predict the product derived from a given reactant via a conditional transformer under the constraints of a reaction type. The molecular optimization results of an activity prediction model targeting DRD2, AKT1, and CXCR4 revealed that TRACER effectively generated compounds with high scores. The transformer model, which recognizes entire structures, captures the complexity of organic synthesis and enables navigation of a vast chemical space while considering real-world reactivity constraints.
Affiliation(s)
- Shogo Nakamura
  - Department of Life Science and Technology, Institute of Science Tokyo, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama, 226-8501, Kanagawa, Japan
- Nobuaki Yasuo
  - Academy for Convergence of Materials and Informatics (TAC-MI), Institute of Science Tokyo, S6-23, Ookayama, Meguro-ku, 152-8550, Tokyo, Japan
- Masakazu Sekijima
  - Department of Computer Science, Institute of Science Tokyo, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama, 226-8501, Kanagawa, Japan
6
Maziarz K, Tripp A, Liu G, Stanley M, Xie S, Gaiński P, Seidl P, Segler MHS. Re-evaluating retrosynthesis algorithms with Syntheseus. Faraday Discuss 2025; 256:568-586. [PMID: 39485491 DOI: 10.1039/d4fd00093e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
Automated synthesis planning has recently re-emerged as a research area at the intersection of chemistry and machine learning. Despite the appearance of steady progress, we argue that imperfect benchmarks and inconsistent comparisons mask systematic shortcomings of existing techniques, and unnecessarily hamper progress. To remedy this, we present a synthesis planning library with an extensive benchmarking framework, called SYNTHESEUS, which promotes best practice by default, enabling consistent meaningful evaluation of single-step and multi-step synthesis planning algorithms. We demonstrate the capabilities of SYNTHESEUS by re-evaluating several previous retrosynthesis algorithms, and find that the ranking of state-of-the-art models changes in controlled evaluation experiments. We end with guidance for future works in this area, and call on the community to engage in the discussion on how to improve benchmarks for synthesis planning.
7
P de Oliveira SH, Pedawi A, Kenyon V, van den Bedem H. NGT: Generative AI with Synthesizability Guarantees Discovers MC2R Inhibitors from a Tera-Scale Virtual Screen. J Med Chem 2024; 67:19417-19427. [PMID: 39471377 DOI: 10.1021/acs.jmedchem.4c01763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2024]
Abstract
Commercially available, synthesis-on-demand virtual libraries contain upward of trillions of readily synthesizable compounds for drug discovery campaigns. These libraries are a critical resource for rapid cycles of in silico discovery, property optimization and in vitro validation. However, as these libraries continue to grow exponentially in size, traditional search strategies encounter significant limitations. Here we present NeuralGenThesis (NGT), an efficient reinforcement learning approach to generate compounds from ultralarge libraries that satisfy user-specified constraints. Our method first trains a generative model over a virtual library and subsequently trains a normalizing flow to learn a distribution over latent space that decodes constraint-satisfying compounds. NGT allows multiple constraints simultaneously without dictating how molecular properties are calculated. Using NGT, we generated potent and selective inhibitors for the melanocortin-2 receptor (MC2R) from a three trillion compound library. NGT offers a powerful and scalable solution for navigating ultralarge virtual libraries, accelerating drug discovery efforts.
Affiliation(s)
- Aryan Pedawi
  - Atomwise Inc, San Francisco, California 94108, United States
- Victor Kenyon
  - Atomwise Inc, San Francisco, California 94108, United States
- Henry van den Bedem
  - Atomwise Inc, San Francisco, California 94108, United States
  - Department of Bioengineering & Therapeutic Sciences, University of California, San Francisco, California 94143, United States
8
Wu K, Xia Y, Deng P, Liu R, Zhang Y, Guo H, Cui Y, Pei Q, Wu L, Xie S, Chen S, Lu X, Hu S, Wu J, Chan CK, Chen S, Zhou L, Yu N, Chen E, Liu H, Guo J, Qin T, Liu TY. TamGen: drug design with target-aware molecule generation through a chemical language model. Nat Commun 2024; 15:9360. [PMID: 39472567 PMCID: PMC11522292 DOI: 10.1038/s41467-024-53632-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 10/14/2024] [Indexed: 11/02/2024] [Open Access]
Abstract
Generative drug design facilitates the creation of compounds effective against pathogenic target proteins. This opens up the potential to discover novel compounds within the vast chemical space and fosters the development of innovative therapeutic strategies. However, the practicality of generated molecules is often limited, as many designs focus on a narrow set of drug-related properties and fail to improve the success rate of the subsequent drug discovery process. To overcome these challenges, we develop TamGen, a method that employs a GPT-like chemical language model and enables target-aware molecule generation and compound refinement. We demonstrate that the compounds generated by TamGen have improved molecular quality and viability. Additionally, we have integrated TamGen into a drug discovery pipeline and identified 14 compounds showing compelling inhibitory activity against the Tuberculosis ClpP protease, with the most effective compound exhibiting a half-maximal inhibitory concentration (IC50) of 1.9 μM. Our findings underscore the practical potential and real-world applicability of generative drug design approaches, paving the way for future advancements in the field.
Affiliation(s)
- Kehan Wu
  - University of Science and Technology of China, Hefei, China
- Yingce Xia
  - Microsoft Research AI for Science, Beijing, China
- Pan Deng
  - Microsoft Research AI for Science, Beijing, China
- Renhe Liu
  - Global Health Drug Discovery Institute, Beijing, China
- Yuan Zhang
  - Global Health Drug Discovery Institute, Beijing, China
- Han Guo
  - Global Health Drug Discovery Institute, Beijing, China
- Yumeng Cui
  - Global Health Drug Discovery Institute, Beijing, China
- Qizhi Pei
  - Renmin University of China, Beijing, China
- Lijun Wu
  - Microsoft Research AI for Science, Beijing, China
- Shufang Xie
  - Microsoft Research AI for Science, Beijing, China
- Si Chen
  - Global Health Drug Discovery Institute, Beijing, China
- Xi Lu
  - Global Health Drug Discovery Institute, Beijing, China
- Song Hu
  - Global Health Drug Discovery Institute, Beijing, China
- Jinzhi Wu
  - Global Health Drug Discovery Institute, Beijing, China
- Chi-Kin Chan
  - Global Health Drug Discovery Institute, Beijing, China
- Shawn Chen
  - Global Health Drug Discovery Institute, Beijing, China
- Nenghai Yu
  - University of Science and Technology of China, Hefei, China
- Enhong Chen
  - University of Science and Technology of China, Hefei, China
- Haiguang Liu
  - Microsoft Research AI for Science, Beijing, China
- Jinjiang Guo
  - Global Health Drug Discovery Institute, Beijing, China
- Tao Qin
  - Microsoft Research AI for Science, Beijing, China
- Tie-Yan Liu
  - Microsoft Research AI for Science, Beijing, China
9
Li B, Tan K, Lao AR, Wang H, Zheng H, Zhang L. A comprehensive review of artificial intelligence for pharmacology research. Front Genet 2024; 15:1450529. [PMID: 39290983 PMCID: PMC11405247 DOI: 10.3389/fgene.2024.1450529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Accepted: 08/26/2024] [Indexed: 09/19/2024] [Open Access]
Abstract
With the advancement of artificial intelligence (AI), a growing number of AI techniques are employed in drug research, frontier biomedical research, and clinical practice, especially in pharmacology research. This review therefore focuses on the applications of AI in drug discovery, compound pharmacokinetic prediction, and clinical pharmacology. We briefly introduce the basic concepts and development of AI, present a comprehensive review, summarize the latest studies, and discuss the strengths and limitations of AI models. Additionally, we highlight several important studies and point out possible research directions.
Affiliation(s)
- Bing Li
  - College of Computer Science, Sichuan University, Chengdu, China
- Kan Tan
  - College of Computer Science, Sichuan University, Chengdu, China
- Angelyn R Lao
  - Department of Mathematics and Statistics, De La Salle University, Manila, Philippines
- Haiying Wang
  - School of Computing, Ulster University, Belfast, United Kingdom
- Huiru Zheng
  - School of Computing, Ulster University, Belfast, United Kingdom
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, China
10
Eguida M, Bret G, Sindt F, Li F, Chau I, Ackloo S, Arrowsmith C, Bolotokova A, Ghiabi P, Gibson E, Halabelian L, Houliston S, Harding RJ, Hutchinson A, Loppnau P, Perveen S, Seitova A, Zeng H, Schapira M, Rognan D. Subpocket Similarity-Based Hit Identification for Challenging Targets: Application to the WDR Domain of LRRK2. J Chem Inf Model 2024; 64:5344-5355. [PMID: 38916159 DOI: 10.1021/acs.jcim.4c00601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
We herewith applied a priori a generic hit identification method (POEM) for difficult targets of known three-dimensional structure, relying on the simple knowledge of physicochemical and topological properties of a user-selected cavity. Searching for local similarity to a set of fragment-bound protein microenvironments of known structure, a point cloud registration algorithm is first applied to align known subpockets to the target cavity. The resulting alignment then permits us to directly pose the corresponding seed fragments in a target cavity space not typically amenable to classical docking approaches. Last, linking potentially connectable atoms by a deep generative linker enables full ligand enumeration. When applied to the WD40 repeat (WDR) central cavity of leucine-rich repeat kinase 2 (LRRK2), an unprecedented binding site, POEM was able to quickly propose 94 potential hits, five of which were subsequently confirmed to bind in vitro to LRRK2-WDR.
Affiliation(s)
- Merveille Eguida
  - Laboratoire d'innovation thérapeutique, UMR7200 CNRS-Université de Strasbourg, F-67400 Illkirch, Strasbourg, France
- Guillaume Bret
  - Laboratoire d'innovation thérapeutique, UMR7200 CNRS-Université de Strasbourg, F-67400 Illkirch, Strasbourg, France
- François Sindt
  - Laboratoire d'innovation thérapeutique, UMR7200 CNRS-Université de Strasbourg, F-67400 Illkirch, Strasbourg, France
- Fengling Li
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Irene Chau
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Suzanne Ackloo
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Cheryl Arrowsmith
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Albina Bolotokova
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Pegah Ghiabi
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Elisa Gibson
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Levon Halabelian
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Scott Houliston
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Rachel J Harding
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5G 1L7, Canada
- Ashley Hutchinson
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Peter Loppnau
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Sumera Perveen
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Almagul Seitova
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Hong Zeng
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Matthieu Schapira
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5G 1L7, Canada
- Didier Rognan
  - Laboratoire d'innovation thérapeutique, UMR7200 CNRS-Université de Strasbourg, F-67400 Illkirch, Strasbourg, France
11
Cebi E, Lee J, Subramani VK, Bak N, Oh C, Kim KK. Cryo-electron microscopy-based drug design. Front Mol Biosci 2024; 11:1342179. [PMID: 38501110 PMCID: PMC10945328 DOI: 10.3389/fmolb.2024.1342179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 01/31/2024] [Indexed: 03/20/2024] [Open Access]
Abstract
Structure-based drug design (SBDD) has gained popularity owing to its ability to develop more potent drugs compared to conventional drug-discovery methods. The success of SBDD relies heavily on obtaining the three-dimensional structures of drug targets. X-ray crystallography is the primary method used for solving structures and aiding the SBDD workflow; however, it is not suitable for all targets. With the resolution revolution, enabling routine high-resolution reconstruction of structures, cryogenic electron microscopy (cryo-EM) has emerged as a promising alternative and has attracted increasing attention in SBDD. Cryo-EM offers various advantages over X-ray crystallography and can potentially replace X-ray crystallography in SBDD. To fully utilize cryo-EM in drug discovery, understanding the strengths and weaknesses of this technique and noting the key advancements in the field are crucial. This review provides an overview of the general workflow of cryo-EM in SBDD and highlights technical innovations that enable its application in drug design. Furthermore, the most recent achievements in the cryo-EM methodology for drug discovery are discussed, demonstrating the potential of this technique for advancing drug development. By understanding the capabilities and advancements of cryo-EM, researchers can leverage the benefits of designing more effective drugs. This review concludes with a discussion of the future perspectives of cryo-EM-based SBDD, emphasizing the role of this technique in driving innovations in drug discovery and development. The integration of cryo-EM into the drug design process holds great promise for accelerating the discovery of new and improved therapeutic agents to combat various diseases.
Affiliation(s)
- Changsuk Oh
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
- Kyeong Kyu Kim
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
12
Loeffler HH, He J, Tibo A, Janet JP, Voronov A, Mervin LH, Engkvist O. Reinvent 4: Modern AI-driven generative molecule design. J Cheminform 2024; 16:20. [PMID: 38383444 PMCID: PMC10882833 DOI: 10.1186/s13321-024-00812-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 11/23/2023] [Accepted: 02/09/2024] [Indexed: 02/23/2024] [Open Access]
Abstract
REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms, transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution. The software provides an open-source reference implementation for generative molecular design where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code and full documentation thereof will increase transparency of AI and foster innovation, collaboration and education.
Affiliation(s)
- Hannes H Loeffler
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jiazhen He
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Alessandro Tibo
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jon Paul Janet
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Alexey Voronov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Lewis H Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
13
|
Garg V. Generative AI for graph-based drug design: Recent advances and the way forward. Curr Opin Struct Biol 2024; 84:102769. [PMID: 38199072 DOI: 10.1016/j.sbi.2023.102769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/17/2023] [Accepted: 12/19/2023] [Indexed: 01/12/2024]
Abstract
Discovering new promising molecule candidates that could translate into effective drugs is a key scientific pursuit. However, factors such as the vastness and discreteness of the molecular search space pose a formidable technical challenge in this quest. AI-driven generative models can learn effectively from data and offer hope of streamlining drug design. In this article, we review the state of the art in generative models that operate on molecular graphs. We also shed light on some limitations of the existing methodology and sketch directions to harness the potential of AI for drug design tasks going forward.
Affiliation(s)
- Vikas Garg
  - Aalto University and YaiYai Ltd, Finland
|
14
|
Bi X, Lin L, Chen Z, Ye J. Artificial Intelligence for Surface-Enhanced Raman Spectroscopy. Small Methods 2024; 8:e2301243. [PMID: 37888799 DOI: 10.1002/smtd.202301243] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 09/15/2023] [Revised: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Surface-enhanced Raman spectroscopy (SERS), widely acknowledged as a sensitive fingerprinting analytical technique, has shown high practical value in a broad range of fields, including biomedicine, environmental protection, and food safety, among others. In the pursuit of ever more sensitive, robust, and comprehensive sensing and imaging, advances keep emerging across the whole SERS pipeline, from the design of SERS substrates and reporter molecules, synthetic route planning, and instrument refinement to data preprocessing and analysis methods. Artificial intelligence (AI), which is created to imitate and eventually exceed human behaviors, has shown its power in learning high-level representations and recognizing complicated patterns with exceptional automaticity. Faced with intertwined influencing factors and rapidly growing data volumes, AI has therefore been increasingly leveraged in all of the above aspects of SERS, accelerating systematic optimization and deepening understanding of the fundamental physics and spectral data with an efficiency that far exceeds human labor and conventional computation. This review summarizes recent progress in SERS through the integration of AI and provides new insights into the challenges and perspectives, with the aim of moving SERS onto the fast track.
Affiliation(s)
- Xinyuan Bi
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Li Lin
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Zhou Chen
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Jian Ye
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
  - Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200127, P. R. China
  - Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, P. R. China
|
15
|
Schrier J, Norquist AJ, Buonassisi T, Brgoch J. In Pursuit of the Exceptional: Research Directions for Machine Learning in Chemical and Materials Science. J Am Chem Soc 2023; 145:21699-21716. [PMID: 37754929 DOI: 10.1021/jacs.3c04783] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 09/28/2023]
Abstract
Exceptional molecules and materials with one or more extraordinary properties are both technologically valuable and fundamentally interesting, because they often involve new physical phenomena or new compositions that defy expectations. Historically, exceptionality has been achieved through serendipity, but recently, machine learning (ML) and automated experimentation have been widely proposed to accelerate target identification and synthesis planning. In this Perspective, we argue that the data-driven methods commonly used today are well-suited for optimization but not for the realization of new exceptional materials or molecules. Finding such outliers should be possible using ML, but only by shifting away from using traditional ML approaches that tweak the composition, crystal structure, or reaction pathway. We highlight case studies of high-Tc oxide superconductors and superhard materials to demonstrate the challenges of ML-guided discovery and discuss the limitations of automation for this task. We then provide six recommendations for the development of ML methods capable of exceptional materials discovery: (i) Avoid the tyranny of the middle and focus on extrema; (ii) When data are limited, qualitative predictions that provide direction are more valuable than interpolative accuracy; (iii) Sample what can be made and how to make it and defer optimization; (iv) Create room (and look) for the unexpected while pursuing your goal; (v) Try to fill-in-the-blanks of input and output space; (vi) Do not confuse human understanding with model interpretability. We conclude with a description of how these recommendations can be integrated into automated discovery workflows, which should enable the discovery of exceptional molecules and materials.
Affiliation(s)
- Joshua Schrier
  - Department of Chemistry, Fordham University, The Bronx, New York 10458, United States
- Alexander J Norquist
  - Department of Chemistry, Haverford College, Haverford, Pennsylvania 19041, United States
- Tonio Buonassisi
  - Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Jakoah Brgoch
  - Department of Chemistry and Texas Center for Superconductivity, University of Houston, Houston, Texas 77204, United States
|