1
Seshadri K, Abad AND, Nagasawa KK, Yost KM, Johnson CW, Dror MJ, Tang Y. Synthetic Biology in Natural Product Biosynthesis. Chem Rev 2025; 125:3814-3931. [PMID: 40116601 DOI: 10.1021/acs.chemrev.4c00567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2025]
Abstract
Synthetic biology has played an important role in the renaissance of natural products research during the post-genomics era. The development and integration of new tools have transformed the workflow of natural product discovery and engineering, generating multidisciplinary interest in the field. In this review, we summarize recent developments in natural product biosynthesis from three different aspects. First, advances in bioinformatics, experimental, and analytical tools to identify natural products associated with predicted biosynthetic gene clusters (BGCs) will be covered. This will be followed by an extensive review on the heterologous expression of natural products in bacterial, fungal and plant organisms. The native host-independent paradigm to natural product identification, pathway characterization, and enzyme discovery is where synthetic biology has played the most prominent role. Lastly, strategies to engineer biosynthetic pathways for structural diversification and complexity generation will be discussed, including recent advances in assembly-line megasynthase engineering, precursor-directed structural modification, and combinatorial biosynthesis.
Affiliation(s)
- Kaushik Seshadri
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
- Abner N D Abad
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
- Kyle K Nagasawa
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
- Karl M Yost
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
- Colin W Johnson
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
- Moriel J Dror
  - Department of Bioengineering, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
- Yi Tang
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
  - Department of Bioengineering, University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, California 90095, United States
2
Flores JE, Prymolenna AV, Lewis LA, Winans NM, Eder EK, Kew W, Young RP, Bramer LM. nmRanalysis: An Open-Source Web Application for Semi-automated NMR Metabolite Profiling. Anal Chem 2025; 97:7037-7046. [PMID: 40129367 DOI: 10.1021/acs.analchem.4c05104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2025]
Abstract
Though data acquisition and initial signal preprocessing of nuclear magnetic resonance (NMR) spectra have achieved high degrees of automation, downstream processing (specifically the profiling of spectra) has bottlenecked the overall NMR analysis workflow. Several efforts have been made to mitigate this bottleneck, but these solutions often trade an increase in automation for limitations elsewhere. In this work, we introduce nmRanalysis, a user-friendly web application that integrates the strengths of existing profiling tools for a more automated profiling workflow. nmRanalysis additionally incorporates novel features, including a machine-learning-driven recommender system for metabolite identification, further increasing the utility of nmRanalysis over the individual tools that it incorporates.
Affiliation(s)
- Javier E Flores
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Anastasiya V Prymolenna
  - Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Logan A Lewis
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Natalie M Winans
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Elizabeth K Eder
  - Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- William Kew
  - Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Robert P Young
  - Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Lisa M Bramer
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
3
Luo Y, Zheng X, Qiu M, Gou Y, Yang Z, Qu X, Chen Z, Lin Y. Deep learning and its applications in nuclear magnetic resonance spectroscopy. Progress in Nuclear Magnetic Resonance Spectroscopy 2025; 146-147:101556. [PMID: 40306798 DOI: 10.1016/j.pnmrs.2024.101556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Revised: 12/26/2024] [Accepted: 12/30/2024] [Indexed: 05/02/2025]
Abstract
Nuclear Magnetic Resonance (NMR), as an advanced technology, has widespread applications in various fields like chemistry, biology, and medicine. However, issues such as long acquisition times for multidimensional spectra and low sensitivity limit the broader application of NMR. Traditional algorithms aim to address these issues but have limitations in speed and accuracy. Deep Learning (DL), a branch of Artificial Intelligence (AI) technology, has shown remarkable success in many fields including NMR. This paper presents an overview of the basics of DL and current applications of DL in NMR, highlights existing challenges, and suggests potential directions for improvement.
Affiliation(s)
- Yao Luo
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Xiaoxu Zheng
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Mengjie Qiu
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Yaoping Gou
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Zhengxian Yang
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Xiaobo Qu
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Zhong Chen
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Yanqin Lin
  - Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
4
Basnet BB, Zhou ZY, Wei B, Wang H. Advances in AI-based strategies and tools to facilitate natural product and drug development. Crit Rev Biotechnol 2025:1-32. [PMID: 40159111 DOI: 10.1080/07388551.2025.2478094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2024] [Revised: 02/11/2025] [Accepted: 02/16/2025] [Indexed: 04/02/2025]
Abstract
Natural products and their derivatives have been important for treating diseases in humans, animals, and plants. However, discovering new structures from natural sources is still challenging. In recent years, artificial intelligence (AI) has greatly aided the discovery and development of natural products and drugs. AI facilitates to: connect genetic data to chemical structures or vice-versa, repurpose known natural products, predict metabolic pathways, and design and optimize metabolites biosynthesis. More recently, the emergence and improvement in neural networks such as deep learning and ensemble automated web based bioinformatics platforms have sped up the discovery process. Meanwhile, AI also improves the identification and structure elucidation of unknown compounds from raw data like mass spectrometry and nuclear magnetic resonance. This article reviews these AI-driven methods and tools, highlighting their practical applications and guide for efficient natural product discovery and drug development.
Affiliation(s)
- Buddha Bahadur Basnet
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
  - Central Department of Biotechnology, Tribhuvan University, Kathmandu, Nepal
- Zhen-Yi Zhou
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
- Bin Wei
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
- Hong Wang
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
  - Key Laboratory of Marine Fishery Resources Exploitment, Utilization of Zhejiang Province, Zhejiang University of Technology, Hangzhou, China
5
Lücken L, Mitschke N, Dittmar T, Blasius B. Network Flow Methods for NMR-Based Compound Identification. Anal Chem 2025; 97:4832-4840. [PMID: 39998390 PMCID: PMC11912116 DOI: 10.1021/acs.analchem.4c01652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 01/28/2025] [Accepted: 02/06/2025] [Indexed: 02/26/2025]
Abstract
In this work, we introduce a novel method for compound identification in mixtures based on nuclear magnetic resonance spectra. Contrary to many other methods, our approach can be used without peak-picking the mixture spectrum and simultaneously optimizes the fit of all individual compound spectra in a given library. At the core of the method, a minimum cost flow problem is solved on a network consisting of nodes that represent spectral peaks of the library compounds and the mixture. We show that our approach can outperform other popular algorithms by applying it to a standard compound identification task for 2D 1H,13C HSQC spectra of artificial mixtures and a natural sample using a library of 501 compounds. Moreover, our method retrieves individual compound concentrations with at least semiquantitative accuracy for artificial mixtures with up to 34 compounds. A software implementation of the minimum cost flow method is available on GitHub (https://github.com/GeoMetabolomics-ICBM/mcfNMR).
Affiliation(s)
- Leonhard Lücken
  - Institute for Chemistry and Biology of the Marine Environment (ICBM), Carl von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany
- Nico Mitschke
  - Institute for Chemistry and Biology of the Marine Environment (ICBM), Carl von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany
- Thorsten Dittmar
  - Institute for Chemistry and Biology of the Marine Environment (ICBM), Carl von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany
  - Helmholtz Institute for Functional Marine Biodiversity, Carl von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany
- Bernd Blasius
  - Institute for Chemistry and Biology of the Marine Environment (ICBM), Carl von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany
  - Helmholtz Institute for Functional Marine Biodiversity, Carl von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany
6
Gangwal A, Lavecchia A. Artificial Intelligence in Natural Product Drug Discovery: Current Applications and Future Perspectives. J Med Chem 2025; 68:3948-3969. [PMID: 39916476 PMCID: PMC11874025 DOI: 10.1021/acs.jmedchem.4c01257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2024] [Revised: 12/01/2024] [Accepted: 01/28/2025] [Indexed: 02/28/2025]
Abstract
Drug discovery, a multifaceted process from compound identification to regulatory approval, historically plagued by inefficiencies and time lags due to limited data utilization, now faces urgent demands for accelerated lead compound identification. Innovations in biological data and computational chemistry have spurred a shift from trial-and-error methods to holistic approaches to medicinal chemistry. Computational techniques, particularly artificial intelligence (AI), notably machine learning (ML) and deep learning (DL), have revolutionized drug development, enhancing data analysis and predictive modeling. Natural products (NPs) have long served as rich sources of biologically active compounds, with many successful drugs originating from them. Advances in information science expanded NP-related databases, enabling deeper exploration with AI. Integrating AI into NP drug discovery promises accelerated discoveries, leveraging AI's analytical prowess, including generative AI for data synthesis. This perspective illuminates AI's current landscape in NP drug discovery, addressing strengths, limitations, and future trajectories to advance this vital research domain.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, 424001 Maharashtra, India
- Antonio Lavecchia
  - “Drug Discovery” Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy
7
Chi J, Shu J, Li M, Mudappathi R, Jin Y, Lewis F, Boon A, Qin X, Liu L, Gu H. Artificial Intelligence in Metabolomics: A Current Review. Trends Analyt Chem 2024; 178:117852. [PMID: 39071116 PMCID: PMC11271759 DOI: 10.1016/j.trac.2024.117852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Metabolomics and artificial intelligence (AI) form a synergistic partnership. Metabolomics generates large datasets comprising hundreds to thousands of metabolites with complex relationships. AI, aiming to mimic human intelligence through computational modeling, possesses extraordinary capabilities for big data analysis. In this review, we provide a recent overview of the methodologies and applications of AI in metabolomics studies in the context of systems biology and human health. We first introduce the AI concept, history, and key algorithms for machine learning and deep learning, summarizing their strengths and weaknesses. We then discuss studies that have successfully used AI across different aspects of metabolomic analysis, including analytical detection, data preprocessing, biomarker discovery, predictive modeling, and multi-omics data integration. Lastly, we discuss the existing challenges and future perspectives in this rapidly evolving field. Despite limitations and challenges, the combination of metabolomics and AI holds great promises for revolutionary advancements in enhancing human health.
Affiliation(s)
- Jinhua Chi
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Jingmin Shu
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Ming Li
  - Phoenix VA Health Care System, Phoenix, AZ 85012, USA
  - University of Arizona College of Medicine, Phoenix, AZ 85004, USA
- Rekha Mudappathi
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Yan Jin
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Freeman Lewis
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Alexandria Boon
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Xiaoyan Qin
  - College of Liberal Arts and Sciences, Arizona State University, Tempe, AZ 85281, USA
- Li Liu
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Haiwei Gu
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
8
McCullagh J, Probert F. New analytical methods focusing on polar metabolite analysis in mass spectrometry and NMR-based metabolomics. Curr Opin Chem Biol 2024; 80:102466. [PMID: 38772215 DOI: 10.1016/j.cbpa.2024.102466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 03/19/2024] [Accepted: 04/26/2024] [Indexed: 05/23/2024]
Abstract
Following in the footsteps of genomics and proteomics, metabolomics has revolutionised the way we investigate and understand biological systems. Rapid development in the last 25 years has been driven largely by technical innovations in mass spectrometry and nuclear magnetic resonance spectroscopy. However, despite the modest size of metabolomes relative to proteomes and genomes, methodological capabilities for robust, comprehensive metabolite analysis remain a major challenge. Therefore, development of new methods and techniques remains vital for progress in the field. Here, we review developments in LC-MS, GC-MS and NMR methods in the last few years that have enhanced quantitative and comprehensive metabolome coverage, highlighting the techniques involved, their technical capabilities, relative performance, and potential impact.
Affiliation(s)
- James McCullagh
  - Department of Chemistry, University of Oxford, Mansfield Road, Oxford, OX1 3TA, UK
- Fay Probert
  - Department of Chemistry, University of Oxford, Mansfield Road, Oxford, OX1 3TA, UK
9
Akyol S, Ashrafi N, Yilmaz A, Turkoglu O, Graham SF. Metabolomics: An Emerging "Omics" Platform for Systems Biology and Its Implications for Huntington Disease Research. Metabolites 2023; 13:1203. [PMID: 38132886 PMCID: PMC10744751 DOI: 10.3390/metabo13121203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 11/29/2023] [Accepted: 12/02/2023] [Indexed: 12/23/2023]
Abstract
Huntington's disease (HD) is a progressive, fatal neurodegenerative disease characterized by motor, cognitive, and psychiatric symptoms. The precise mechanisms of HD progression are poorly understood; however, it is known that there is an expansion of the trinucleotide cytosine-adenine-guanine (CAG) repeat in the Huntingtin gene. Important new strategies are of paramount importance to identify early biomarkers with predictive value for intervening in disease progression at a stage when cellular dysfunction has not progressed irreversibly. Metabolomics is the study of global metabolite profiles in a system (cell, tissue, or organism) under certain conditions and is becoming an essential tool for the systemic characterization of metabolites to provide a snapshot of the functional and pathophysiological states of an organism and support disease diagnosis and biomarker discovery. This review briefly highlights the historical progress of metabolomic methodologies, followed by a more detailed review of the use of metabolomics in HD research to enable a greater understanding of the pathogenesis, its early prediction, and finally the main technical platforms in the field of metabolomics.
Affiliation(s)
- Sumeyya Akyol
  - NX Prenatal Inc., 4350 Brownsboro Road, Louisville KY 40207, USA
- Nadia Ashrafi
  - Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, 318 Meadow Brook Road, Rochester, MI 48309, USA
- Ali Yilmaz
  - Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, 318 Meadow Brook Road, Rochester, MI 48309, USA
  - Metabolomics Division, Beaumont Research Institute, 3811 W. 13 Mile Road, Royal Oak, MI 48073, USA
- Onur Turkoglu
  - Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, 318 Meadow Brook Road, Rochester, MI 48309, USA
- Stewart F. Graham
  - Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, 318 Meadow Brook Road, Rochester, MI 48309, USA
  - Metabolomics Division, Beaumont Research Institute, 3811 W. 13 Mile Road, Royal Oak, MI 48073, USA
10
Hu G, Qiu M. Machine learning-assisted structure annotation of natural products based on MS and NMR data. Nat Prod Rep 2023; 40:1735-1753. [PMID: 37519196 DOI: 10.1039/d3np00025g] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 08/01/2023]
Abstract
Covering: up to March 2023
Machine learning (ML) has emerged as a popular tool for analyzing the structures of natural products (NPs). This review presents a summary of the recent advancements in ML-assisted mass spectrometry (MS) and nuclear magnetic resonance (NMR) data analysis to establish the chemical structures of NPs. First, ML-based MS/MS analyses that rely on library matching are discussed, which involves the utilization of ML algorithms to calculate similarity, predict the MS/MS fragments, and form molecular fingerprint. Then, ML assisted MS/MS structural annotation without library matching is reviewed. Furthermore, the cases of ML algorithms in assisting structural studies of NPs based on NMR are discussed from four perspectives: NMR prediction, functional group identification, structural categorization and quantum chemical calculation. Finally, the review concludes with a discussion of the challenges and the trends associated with the structural establishment of NPs based on ML algorithms.
Affiliation(s)
- Guilin Hu
  - State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
  - University of the Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Minghua Qiu
  - State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
  - University of the Chinese Academy of Sciences, Beijing 100049, People's Republic of China
11
Mullowney MW, Duncan KR, Elsayed SS, Garg N, van der Hooft JJJ, Martin NI, Meijer D, Terlouw BR, Biermann F, Blin K, Durairaj J, Gorostiola González M, Helfrich EJN, Huber F, Leopold-Messer S, Rajan K, de Rond T, van Santen JA, Sorokina M, Balunas MJ, Beniddir MA, van Bergeijk DA, Carroll LM, Clark CM, Clevert DA, Dejong CA, Du C, Ferrinho S, Grisoni F, Hofstetter A, Jespers W, Kalinina OV, Kautsar SA, Kim H, Leao TF, Masschelein J, Rees ER, Reher R, Reker D, Schwaller P, Segler M, Skinnider MA, Walker AS, Willighagen EL, Zdrazil B, Ziemert N, Goss RJM, Guyomard P, Volkamer A, Gerwick WH, Kim HU, Müller R, van Wezel GP, van Westen GJP, Hirsch AKH, Linington RG, Robinson SL, Medema MH. Artificial intelligence for natural product drug discovery. Nat Rev Drug Discov 2023; 22:895-916. [PMID: 37697042 DOI: 10.1038/s41573-023-00774-7] [Citation(s) in RCA: 107] [Impact Index Per Article: 53.5] [Accepted: 07/20/2023] [Indexed: 09/13/2023]
Abstract
Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature. We also discuss how to address key challenges in realizing the potential of these synergies, such as the need for high-quality datasets to train deep learning algorithms and appropriate strategies for algorithm validation.
Affiliation(s)
- Katherine R Duncan
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
- Somayah S Elsayed
  - Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Neha Garg
  - School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA
- Justin J J van der Hooft
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
  - Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
- Nathaniel I Martin
  - Biological Chemistry Group, Institute of Biology, Leiden University, Leiden, The Netherlands
- David Meijer
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Barbara R Terlouw
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Friederike Biermann
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
  - Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
  - LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
- Kai Blin
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- Marina Gorostiola González
  - Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
  - ONCODE institute, Leiden, The Netherlands
- Eric J N Helfrich
  - Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
  - LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
- Florian Huber
  - Center for Digitalization and Digitality, Hochschule Düsseldorf, Düsseldorf, Germany
- Stefan Leopold-Messer
  - Institut für Mikrobiologie, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
- Kohulan Rajan
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
- Tristan de Rond
  - School of Chemical Sciences, University of Auckland, Auckland, New Zealand
- Jeffrey A van Santen
  - Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Maria Sorokina
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University, Jena, Germany
  - Pharmaceuticals R&D, Bayer AG, Berlin, Germany
- Marcy J Balunas
  - Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Mehdi A Beniddir
  - Équipe "Chimie des Substances Naturelles", Université Paris-Saclay, CNRS, BioCIS, Orsay, France
- Doris A van Bergeijk
  - Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Laura M Carroll
  - Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
- Chase M Clark
  - Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
- Chao Du
  - Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Francesca Grisoni
  - Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
  - Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
- Willem Jespers
  - Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Olga V Kalinina
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
  - Drug Bioinformatics, Medical Faculty, Saarland University, Homburg, Germany
  - Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Hyunwoo Kim
  - College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University Seoul, Goyang-si, Republic of Korea
- Tiago F Leao
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
- Joleen Masschelein
  - Center for Microbiology, VIB-KU Leuven, Heverlee, Belgium
  - Department of Biology, KU Leuven, Heverlee, Belgium
- Evan R Rees
  - Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
- Raphael Reher
  - Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, Marburg, Germany
  - Institute of Pharmacy, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany
- Daniel Reker
  - Department of Biomedical Engineering, Duke University, Durham, NC, USA
  - Duke Microbiome Center, Duke University, Durham, NC, USA
- Philippe Schwaller
  - Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Michael A Skinnider
  - Adapsyn Bioscience, Hamilton, Ontario, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Allison S Walker
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
| | - Egon L Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
| | - Barbara Zdrazil
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, UK
| | - Nadine Ziemert
- Interfaculty Institute for Microbiology and Infection Medicine Tuebingen (IMIT), Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen, Germany
| | | | - Pierre Guyomard
- Bonsai team, CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Université de Lille, Villeneuve d'Ascq Cedex, France
| | - Andrea Volkamer
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- In silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
| | - Hyun Uk Kim
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
| | - Rolf Müller
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Department of Pharmacy, Saarland University, Saarbrücken, Germany
- German Center for Infection Research (DZIF), Braunschweig, Germany
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany
| | - Gilles P van Wezel
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Netherlands Institute of Ecology, NIOO-KNAW, Wageningen, The Netherlands
| | - Gerard J P van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands.
| | - Anna K H Hirsch
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany.
- Department of Pharmacy, Saarland University, Saarbrücken, Germany.
- German Center for Infection Research (DZIF), Braunschweig, Germany.
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany.
| | - Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada.
| | - Serina L Robinson
- Department of Environmental Microbiology, Eawag: Swiss Federal Institute for Aquatic Science and Technology, Dübendorf, Switzerland.
| | - Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Institute of Biology, Leiden University, Leiden, The Netherlands.
|
12
|
Gaudêncio SP, Bayram E, Lukić Bilela L, Cueto M, Díaz-Marrero AR, Haznedaroglu BZ, Jimenez C, Mandalakis M, Pereira F, Reyes F, Tasdemir D. Advanced Methods for Natural Products Discovery: Bioactivity Screening, Dereplication, Metabolomics Profiling, Genomic Sequencing, Databases and Informatic Tools, and Structure Elucidation. Mar Drugs 2023; 21:md21050308. [PMID: 37233502 DOI: 10.3390/md21050308] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 03/23/2023] [Revised: 05/11/2023] [Accepted: 05/12/2023] [Indexed: 05/27/2023] Open
Abstract
Natural Products (NP) are essential for the discovery of novel drugs and products for numerous biotechnological applications. The NP discovery process is expensive and time-consuming, having as major hurdles dereplication (early identification of known compounds) and structure elucidation, particularly the determination of the absolute configuration of metabolites with stereogenic centers. This review comprehensively focuses on recent technological and instrumental advances, highlighting the development of methods that alleviate these obstacles, paving the way for accelerating NP discovery towards biotechnological applications. Herein, we emphasize the most innovative high-throughput tools and methods for advancing bioactivity screening, NP chemical analysis, dereplication, metabolite profiling, metabolomics, genome sequencing and/or genomics approaches, databases, bioinformatics, chemoinformatics, and three-dimensional NP structure elucidation.
Affiliation(s)
- Susana P Gaudêncio
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO-Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University of Lisbon, 2819-516 Caparica, Portugal
| | - Engin Bayram
- Institute of Environmental Sciences, Room HKC-202, Hisar Campus, Bogazici University, Bebek, Istanbul 34342, Turkey
| | - Lada Lukić Bilela
- Department of Biology, Faculty of Science, University of Sarajevo, 71000 Sarajevo, Bosnia and Herzegovina
| | - Mercedes Cueto
- Instituto de Productos Naturales y Agrobiología-CSIC, 38206 La Laguna, Spain
| | - Ana R Díaz-Marrero
- Instituto de Productos Naturales y Agrobiología-CSIC, 38206 La Laguna, Spain
- Instituto Universitario de Bio-Orgánica (IUBO), Universidad de La Laguna, 38206 La Laguna, Spain
| | - Berat Z Haznedaroglu
- Institute of Environmental Sciences, Room HKC-202, Hisar Campus, Bogazici University, Bebek, Istanbul 34342, Turkey
| | - Carlos Jimenez
- CICA- Centro Interdisciplinar de Química e Bioloxía, Departamento de Química, Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain
| | - Manolis Mandalakis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, HCMR Thalassocosmos, 71500 Gournes, Crete, Greece
| | - Florbela Pereira
- LAQV, REQUIMTE, Chemistry Department, NOVA School of Science and Technology, NOVA University of Lisbon, 2819-516 Caparica, Portugal
| | - Fernando Reyes
- Fundación MEDINA, Avda. del Conocimiento 34, 18016 Armilla, Spain
| | - Deniz Tasdemir
- GEOMAR Centre for Marine Biotechnology (GEOMAR-Biotech), Research Unit Marine Natural Products Chemistry, GEOMAR Helmholtz Centre for Ocean Research Kiel, Am Kiel-Kanal 44, 24106 Kiel, Germany
- Faculty of Mathematics and Natural Science, Kiel University, Christian-Albrechts-Platz 4, 24118 Kiel, Germany
|
13
|
Affiliation(s)
- G. A. Nagana Gowda
- Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington
- Mitochondria and Metabolism Center, Department of Anesthesiology and Pain Medicine, University of Washington
| | - Daniel Raftery
- Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington
- Mitochondria and Metabolism Center, Department of Anesthesiology and Pain Medicine, University of Washington
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109
|
14
|
Sahayasheela VJ, Lankadasari MB, Dan VM, Dastager SG, Pandian GN, Sugiyama H. Artificial intelligence in microbial natural product drug discovery: current and emerging role. Nat Prod Rep 2022; 39:2215-2230. [PMID: 36017693 PMCID: PMC9931531 DOI: 10.1039/d2np00035k] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/30/2022]
Abstract
Covering: up to the end of 2022
Microorganisms are exceptional sources of a wide array of unique natural products and play a significant role in drug discovery. During the golden era, several life-saving antibiotics and anticancer agents were isolated from microbes; moreover, they are still widely used. However, difficulties in the isolation methods and repeated discoveries of the same molecules have caused a setback in the past. Artificial intelligence (AI) has had a profound impact on various research fields, and its application allows the effective performance of data analyses and predictions. With the advances in omics, it is possible to obtain a wealth of information for the identification, isolation, and target prediction of secondary metabolites. In this review, we discuss drug discovery based on natural products from microorganisms with the help of AI and machine learning.
Affiliation(s)
- Vinodh J Sahayasheela
- Department of Chemistry, Graduate School of Science, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan.
| | - Manendra B Lankadasari
- Thoracic Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Vipin Mohan Dan
- Microbiology Division, Jawaharlal Nehru Tropical Botanic Garden and Research Institute, Thiruvananthapuram, Kerala, India
| | - Syed G Dastager
- NCIM Resource Centre, Division of Biochemical Sciences, CSIR - National Chemical Laboratory, Pune, Maharashtra, India
| | - Ganesh N Pandian
- Institute for Integrated Cell-Material Sciences (WPI-iCeMS), Kyoto University, Yoshida-Ushinomaecho, Sakyo-Ku, Kyoto 606-8501, Japan
| | - Hiroshi Sugiyama
- Department of Chemistry, Graduate School of Science, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan.
- Institute for Integrated Cell-Material Sciences (WPI-iCeMS), Kyoto University, Yoshida-Ushinomaecho, Sakyo-Ku, Kyoto 606-8501, Japan
|
15
|
Convolutional Neural Network-Based Compound Fingerprint Prediction for Metabolite Annotation. Metabolites 2022; 12:metabo12070605. [PMID: 35888729 PMCID: PMC9316655 DOI: 10.3390/metabo12070605] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/20/2022] [Revised: 06/17/2022] [Accepted: 06/21/2022] [Indexed: 11/23/2022] Open
Abstract
Metabolite annotation has been a challenging issue, especially in untargeted metabolomics studies by liquid chromatography coupled with mass spectrometry (LC-MS). This is in part due to the limitations of publicly available spectral libraries, which consist of tandem mass spectrometry (MS/MS) data acquired from just a fraction of known metabolites. Machine learning provides the opportunity to predict molecular fingerprints based on MS/MS data. The predicted molecular fingerprints can then be used to help rank putative metabolite IDs obtained by using either the precursor mass or the formula of the unknown metabolite. This method is particularly useful to help annotate metabolites whose corresponding MS/MS spectra are missing or cannot be matched with those in accessible spectral libraries. We investigated a convolutional neural network (CNN) for molecular fingerprint prediction based on data acquired by MS/MS. We used more than 680,000 MS/MS spectra obtained from the MoNA repository and NIST 20, representing about 36,000 compounds for training and testing our CNN model. The trained CNN model is implemented as a Python package, MetFID. The package is available on GitHub for users to enter their MS/MS spectra and corresponding putative metabolite IDs to obtain ranked lists of metabolites. Better performance is achieved by MetFID in ranking putative metabolite IDs using the CASMI 2016 benchmark dataset compared to two other machine learning-based tools (CSI:FingerID and ChemDistiller).
|
16
|
Kontogianni VG, Gerothanassis IP. Analytical and Structural Tools of Lipid Hydroperoxides: Present State and Future Perspectives. Molecules 2022; 27:2139. [PMID: 35408537 PMCID: PMC9000705 DOI: 10.3390/molecules27072139] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/03/2022] [Revised: 03/20/2022] [Accepted: 03/22/2022] [Indexed: 11/17/2022] Open
Abstract
Mono- and polyunsaturated lipids are particularly susceptible to peroxidation, which results in the formation of lipid hydroperoxides (LOOHs) as primary nonradical-reaction products. LOOHs may undergo degradation to various products that have been implicated in vital biological reactions, and thus in the pathogenesis of various diseases. The structure elucidation and qualitative and quantitative analysis of lipid hydroperoxides are therefore of great importance. The objectives of the present review are to provide a critical analysis of various methods that have been widely applied, and more specifically on volumetric methods, applications of UV-visible, infrared, Raman/surface-enhanced Raman, fluorescence and chemiluminescence spectroscopies, chromatographic methods, hyphenated MS techniques, NMR and chromatographic methods, NMR spectroscopy in mixture analysis, structural investigations based on quantum chemical calculations of NMR parameters, applications in living cells, and metabolomics. Emphasis will be given to analytical and structural methods that can contribute significantly to the molecular basis of the chemical process involved in the formation of lipid hydroperoxides without the need for the isolation of the individual components. Furthermore, future developments in the field will be discussed.
Affiliation(s)
- Vassiliki G. Kontogianni
- Section of Organic Chemistry and Biochemistry, Department of Chemistry, University of Ioannina, GR-45110 Ioannina, Greece
| | - Ioannis P. Gerothanassis
- Section of Organic Chemistry and Biochemistry, Department of Chemistry, University of Ioannina, GR-45110 Ioannina, Greece
- International Center for Chemical and Biological Sciences, H.E.J. Research Institute of Chemistry, University of Karachi, Karachi 75270, Pakistan
|
17
|
NMR in Metabolomics: From Conventional Statistics to Machine Learning and Neural Network Approaches. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12062824] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/17/2022]
Abstract
NMR measurements combined with chemometrics allow achieving a great amount of information for the identification of potential biomarkers responsible for a precise metabolic pathway. These kinds of data are useful in different fields, ranging from food to biomedical fields, including health science. The investigation of the whole set of metabolites in a sample, representing its fingerprint in the considered condition, is known as metabolomics and may take advantage of different statistical tools. The new frontier is to adopt self-learning techniques to enhance clustering or classification actions that can improve the predictive power over large amounts of data. Although machine learning is already employed in metabolomics, deep learning and artificial neural networks approaches were only recently successfully applied. In this work, we give an overview of the statistical approaches underlying the wide range of opportunities that machine learning and neural networks allow to perform with accurate metabolites assignment and quantification. Various actual challenges are discussed, such as proper metabolomics, deep learning architectures and model accuracy.
|