1
|
Nahal Y, Menke J, Martinelli J, Heinonen M, Kabeshov M, Janet JP, Nittinger E, Engkvist O, Kaski S. Human-in-the-loop active learning for goal-oriented molecule generation. J Cheminform 2024; 16:138. [PMID: 39654043 PMCID: PMC11629536 DOI: 10.1186/s13321-024-00924-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2024] [Accepted: 11/02/2024] [Indexed: 12/12/2024] Open
Abstract
Machine learning (ML) systems have enabled the modelling of quantitative structure-property relationships (QSPR) and structure-activity relationships (QSAR) using existing experimental data to predict target properties for new molecules. These property predictors hold significant potential in accelerating drug discovery by guiding generative artificial intelligence (AI) agents to explore desired chemical spaces. However, they often struggle to generalize due to the limited scope of the training data. When optimized by generative agents, this limitation can result in the generation of molecules with artificially high predicted probabilities of satisfying target properties, which subsequently fail experimental validation. To address this challenge, we propose an adaptive approach that integrates active learning (AL) and iterative feedback to refine property predictors, thereby improving the outcomes of their optimization by generative AI agents. Our method leverages the Expected Predictive Information Gain (EPIG) criterion to select additional molecules for evaluation by an oracle. This process aims to provide the greatest reduction in predictive uncertainty, enabling more accurate model evaluations of subsequently generated molecules. Recognizing the impracticality of immediate wet-lab or physics-based experiments due to time and logistical constraints, we propose leveraging human experts for their cost-effectiveness and domain knowledge to effectively augment property predictors, bridging gaps in the limited training data. Empirical evaluations through both simulated and real human-in-the-loop experiments demonstrate that our approach refines property predictors to better align with oracle assessments. Additionally, we observe improved accuracy of predicted properties as well as improved drug-likeness among the top-ranking generated molecules. SCIENTIFIC CONTRIBUTION: We present an adaptable framework that integrates AL and human expertise to refine property predictors for goal-oriented molecule generation. This approach is robust to noise in human feedback and ensures that navigating chemical space with human-refined predictors leverages human insights to identify molecules that not only satisfy predicted property profiles but also score highly on oracle models. Additionally, it prioritizes practical characteristics such as drug-likeness, synthetic accessibility, and a favorable balance between exploring diverse chemical space and exploiting similarity to existing training data.
Collapse
Affiliation(s)
- Yasmine Nahal
- Department of Computer Science, Aalto University, 02150, Espoo, Finland.
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 83, Mölndal, Sweden.
| | - Janosch Menke
- Department of Computer Science and Engineering, Chalmers University of Technology, 412 96, Gothenburg, Sweden
| | - Julien Martinelli
- Inserm Bordeaux Population Health, Vaccine Research Institute, Université de Bordeaux, Inria Bordeaux Sud-ouest, 33405, Talence, France
| | - Markus Heinonen
- Department of Computer Science, Aalto University, 02150, Espoo, Finland
| | - Mikhail Kabeshov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 83, Mölndal, Sweden
| | - Jon Paul Janet
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 83, Mölndal, Sweden
| | - Eva Nittinger
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), R&D, AstraZeneca, 412 96, Gothenburg, Sweden
| | - Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 83, Mölndal, Sweden
- Department of Computer Science and Engineering, Chalmers University of Technology, 412 96, Gothenburg, Sweden
| | - Samuel Kaski
- Department of Computer Science, Aalto University, 02150, Espoo, Finland
- Department of Computer Science, University of Manchester, Manchester, M13 9PL, United Kingdom
| |
Collapse
|
2
|
Chen H, Lu D, Xiao Z, Li S, Zhang W, Luan X, Zhang W, Zheng G. Comprehensive applications of the artificial intelligence technology in new drug research and development. Health Inf Sci Syst 2024; 12:41. [PMID: 39130617 PMCID: PMC11310389 DOI: 10.1007/s13755-024-00300-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2023] [Accepted: 07/27/2024] [Indexed: 08/13/2024] Open
Abstract
Purpose Target-based strategy is a prevalent means of drug research and development (R&D), since targets provide effector molecules of drug action and offer the foundation of pharmacological investigation. Recently, the artificial intelligence (AI) technology has been utilized in various stages of drug R&D, where AI-assisted experimental methods show higher efficiency than sole experimental ones. It is a critical need to give a comprehensive review of AI applications in drug R &D for biopharmaceutical field. Methods Relevant literatures about AI-assisted drug R&D were collected from the public databases (Including Google Scholar, Web of Science, PubMed, IEEE Xplore Digital Library, Springer, and ScienceDirect) through a keyword searching strategy with the following terms [("Artificial Intelligence" OR "Knowledge Graph" OR "Machine Learning") AND ("Drug Target Identification" OR "New Drug Development")]. Results In this review, we first introduced common strategies and novel trends of drug R&D, followed by characteristic description of AI algorithms widely used in drug R&D. Subsequently, we depicted detailed applications of AI algorithms in target identification, lead compound identification and optimization, drug repurposing, and drug analytical platform construction. Finally, we discussed the challenges and prospects of AI-assisted methods for drug discovery. Conclusion Collectively, this review provides comprehensive overview of AI applications in drug R&D and presents future perspectives for biopharmaceutical field, which may promote the development of drug industry.
Collapse
Affiliation(s)
- Hongyu Chen
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Dong Lu
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Ziyi Xiao
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD USA
| | - Shensuo Li
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Wen Zhang
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Xin Luan
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Weidong Zhang
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Guangyong Zheng
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| |
Collapse
|
3
|
Menke J, Nahal Y, Bjerrum EJ, Kabeshov M, Kaski S, Engkvist O. Metis: a python-based user interface to collect expert feedback for generative chemistry models. J Cheminform 2024; 16:100. [PMID: 39143631 PMCID: PMC11323385 DOI: 10.1186/s13321-024-00892-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2024] [Accepted: 08/02/2024] [Indexed: 08/16/2024] Open
Abstract
One challenge that current de novo drug design models face is a disparity between the user's expectations and the actual output of the model in practical applications. Tailoring models to better align with chemists' implicit knowledge, expectation and preferences is key to overcoming this obstacle effectively. While interest in preference-based and human-in-the-loop machine learning in chemistry is continuously increasing, no tool currently exists that enables the collection of standardized and chemistry-specific feedback. Metis is a Python-based open-source graphical user interface (GUI), designed to solve this and enable the collection of chemists' detailed feedback on molecular structures. The GUI enables chemists to explore and evaluate molecules, offering a user-friendly interface for annotating preferences and specifying desired or undesired structural features. By providing chemists the opportunity to give detailed feedback, allows researchers to capture more efficiently the chemist's implicit knowledge and preferences. This knowledge is crucial to align the chemist's idea with the de novo design agents. The GUI aims to enhance this collaboration between the human and the "machine" by providing an intuitive platform where chemists can interactively provide feedback on molecular structures, aiding in preference learning and refining de novo design strategies. Metis integrates with the existing de novo framework REINVENT, creating a closed-loop system where human expertise can continuously inform and refine the generative models.Scientific contributionWe introduce a novel Graphical User Interface, that allows chemists/researchers to give detailed feedback on substructures and properties of small molecules. This tool can be used to learn the preferences of chemists in order to align de novo drug design models with the chemist's ideas. The GUI can be customized to fit different needs and projects and enables direct integration into de novo REINVENT runs. We believe that Metis can facilitate the discussion and development of novel ways to integrate human feedback that goes beyond binary decisions of liking or disliking a molecule.
Collapse
Affiliation(s)
- Janosch Menke
- Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, 41296, Sweden.
| | - Yasmine Nahal
- Department of Computer Science, Aalto University, Espoo, 02150, Finland
| | | | - Mikhail Kabeshov
- Molecular AI, Discovery Sciences AstraZeneca R &D, Mölndal, 43183, Sweden
| | - Samuel Kaski
- Department of Computer Science, Aalto University, Espoo, 02150, Finland
- Department of Computer Science, University of Manchester, Manchester, M13 9PL, UK
| | - Ola Engkvist
- Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, 41296, Sweden
- Molecular AI, Discovery Sciences AstraZeneca R &D, Mölndal, 43183, Sweden
| |
Collapse
|
4
|
Ghiandoni GM, Evertsson E, Riley DJ, Tyrchan C, Rathi PC. Augmenting DMTA using predictive AI modelling at AstraZeneca. Drug Discov Today 2024; 29:103945. [PMID: 38460568 DOI: 10.1016/j.drudis.2024.103945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2023] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 03/11/2024]
Abstract
Design-Make-Test-Analyse (DMTA) is the discovery cycle through which molecules are designed, synthesised, and assayed to produce data that in turn are analysed to inform the next iteration. The process is repeated until viable drug candidates are identified, often requiring many cycles before reaching a sweet spot. The advent of artificial intelligence (AI) and cloud computing presents an opportunity to innovate drug discovery to reduce the number of cycles needed to yield a candidate. Here, we present the Predictive Insight Platform (PIP), a cloud-native modelling platform developed at AstraZeneca. The impact of PIP in each step of DMTA, as well as its architecture, integration, and usage, are discussed and used to provide insights into the future of drug discovery.
Collapse
Affiliation(s)
- Gian Marco Ghiandoni
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK.
| | - Emma Evertsson
- Research and Early Development, Respiratory and Immunology (R&I), Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden, Mölndal, SE 43183, Sweden
| | - David J Riley
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
| | - Christian Tyrchan
- Research and Early Development, Respiratory and Immunology (R&I), Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden, Mölndal, SE 43183, Sweden
| | - Prakash Chandra Rathi
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
| |
Collapse
|
5
|
Çelikok MM, Murena PA, Kaski S. Modeling needs user modeling. Front Artif Intell 2023; 6:1097891. [PMID: 37091302 PMCID: PMC10116056 DOI: 10.3389/frai.2023.1097891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Accepted: 03/24/2023] [Indexed: 04/08/2023] Open
Abstract
Modeling has actively tried to take the human out of the loop, originally for objectivity and recently also for automation. We argue that an unnecessary side effect has been that modeling workflows and machine learning pipelines have become restricted to only well-specified problems. Putting the humans back into the models would enable modeling a broader set of problems, through iterative modeling processes in which AI can offer collaborative assistance. However, this requires advances in how we scope our modeling problems, and in the user models. In this perspective article, we characterize the required user models and the challenges ahead for realizing this vision, which would enable new interactive modeling workflows, and human-centric or human-compatible machine learning pipelines.
Collapse
Affiliation(s)
| | | | - Samuel Kaski
- Department of Computer Science, Aalto University, Espoo, Finland
- Department of Computer Science, University of Manchester, Manchester, United Kingdom
- *Correspondence: Samuel Kaski
| |
Collapse
|