1
Karavaeva V, Sousa FL. Navigating the archaeal frontier: insights and projections from bioinformatic pipelines. Front Microbiol 2024; 15:1433224. [PMID: 39380680 PMCID: PMC11459464 DOI: 10.3389/fmicb.2024.1433224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Accepted: 08/28/2024] [Indexed: 10/10/2024] [Open access]
Abstract
Archaea continues to be one of the least investigated domains of life, and in recent years, the advent of metagenomics has led to the discovery of many new lineages at the phylum level. For the majority, only automatic genomic annotations can provide information regarding their metabolic potential and role in the environment. Here, genomic data from 2,978 archaeal genomes were used to perform automatic annotations using bioinformatics tools, alongside synteny analysis. These automatic classifications were done to assess how well these different tools perform on archaeal data. Our study revealed that even with lowered cutoffs, several functional models do not capture the recently discovered archaeal diversity. Moreover, our investigation revealed that a significant portion of archaeal genomes, approximately 42%, remains uncharacterized. In comparison, within 3,235 bacterial genomes, a diverse range of unclassified proteins is obtained, with well-studied organisms like Escherichia coli having a substantially lower proportion of uncharacterized regions, ranging from <5 to 25%, and less studied lineages being comparable to archaea, with 35-40% of regions unclassified. Leveraging this analysis, we were able to identify metabolic protein markers, thereby providing insights into the metabolism of the archaea in our dataset. Our findings underscore a substantial gap between automatic classification tools and the comprehensive mapping of archaeal metabolism. Despite advances in computational approaches, a significant portion of archaeal genomes remains unexplored, highlighting the need for extensive experimental validation in this domain, as well as more refined annotation methods. This study contributes to a better understanding of archaeal metabolism and underscores the importance of further research in elucidating the functional potential of archaeal genomes.
Affiliation(s)
- Val Karavaeva
- Genome Evolution and Ecology Group, Department of Functional and Evolutionary Ecology, University of Vienna, Vienna, Austria
- Vienna Doctoral School of Ecology and Evolution, University of Vienna, Vienna, Austria
- Filipa L. Sousa
- Genome Evolution and Ecology Group, Department of Functional and Evolutionary Ecology, University of Vienna, Vienna, Austria
2
Luo W, Zhou G, Zhu Z, Yuan Y, Ke G, Wei Z, Gao Z, Zheng H. Bridging Machine Learning and Thermodynamics for Accurate pKa Prediction. JACS Au 2024; 4:3451-3465. [PMID: 39328749 PMCID: PMC11423309 DOI: 10.1021/jacsau.4c00271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 07/07/2024] [Accepted: 07/10/2024] [Indexed: 09/28/2024]
Abstract
Integrating scientific principles into machine learning models to enhance their predictive performance and generalizability is a central challenge in the development of AI for Science. Herein, we introduce Uni-pKa, a novel framework that successfully incorporates thermodynamic principles into machine learning modeling, achieving high-precision predictions of acid dissociation constants (pKa), a crucial task in the rational design of drugs and catalysts, as well as a modeling challenge in computational physical chemistry for small organic molecules. Uni-pKa utilizes a comprehensive free energy model to represent molecular protonation equilibria accurately. It features a structure enumerator that reconstructs molecular configurations from pKa data, coupled with a neural network that functions as a free energy predictor, ensuring high-throughput, data-driven prediction while preserving thermodynamic consistency. Employing a pretraining-finetuning strategy with both predicted and experimental pKa data, Uni-pKa not only achieves state-of-the-art accuracy in chemoinformatics but also shows comparable precision to quantum mechanics-based methods.
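As a reading aid for the link between free energy and pKa mentioned in this abstract, the textbook relation for a single acid/base pair can be written as below. This is only the standard thermodynamic identity, not the paper's full free energy model (which, per the abstract, covers protonation equilibria more comprehensively); the symbol names and the numerical values are illustrative assumptions.

```latex
\Delta G^\circ_{\mathrm{deprot}} = -RT \ln K_a
\quad\Longrightarrow\quad
\mathrm{p}K_a = -\log_{10} K_a = \frac{\Delta G^\circ_{\mathrm{deprot}}}{RT \ln 10}

% Illustrative numbers (not from the paper): T = 298.15 K, \Delta G^\circ_{\mathrm{deprot}} = 27.4 kJ/mol
RT \ln 10 \approx 8.314\ \mathrm{J\,mol^{-1}\,K^{-1}} \times 298.15\ \mathrm{K} \times 2.3026 \approx 5.708\ \mathrm{kJ\,mol^{-1}}

\mathrm{p}K_a \approx \frac{27.4}{5.708} \approx 4.80
```

At room temperature, each pKa unit therefore corresponds to roughly 5.7 kJ/mol of deprotonation free energy, which is why a free energy predictor can be trained against pKa data.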
Affiliation(s)
- Weiliang Luo
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- DP Technology, Beijing 100089, China
- Gengmo Zhou
- DP Technology, Beijing 100089, China
- Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872, China
- Guolin Ke
- DP Technology, Beijing 100089, China
- Zhewei Wei
- Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872, China
3
Tannir S, Pan Y, Josephs N, Cunningham C, Hendrick NR, Beckett A, McNeely J, Beeler A, Jeffries-El M, Kolaczyk ED. Predicting Emission Wavelengths in Benzobisoxazole-Based OLEDs with Gradient Boosted Ensemble Models. J Phys Chem A 2024; 128:6116-6123. [PMID: 39008894 DOI: 10.1021/acs.jpca.4c00077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
We demonstrate the use of gradient-boosted ensemble models that accurately predict emission wavelengths in benzobis[1,2-d:4,5-d']oxazole (BBO) based fluorescent emitters. We have curated a database of 50 molecules from previously published data by the Jeffries-El group using density functional theory (DFT) computed ground and excited state features. We consider two machine learning (ML) models based on (i) whole cruciform molecules and (ii) their constituent fragment molecules. Both ML models provide accurate predictions with root-mean-square errors between 30 and 36 nm, competitive with state-of-the-art deep learning models trained on orders of magnitude more molecules, and this accuracy holds even when tested on four new BBO emitters unseen by the models. We also provide an interpretable feature importance analysis and discuss the relevant relationships between DFT and changes in predicted emission wavelength.
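To make the error metric in this abstract concrete, the following is a minimal sketch of how a root-mean-square error in nanometres is computed. The function name `rmse` and the wavelength values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical emission wavelengths in nm (illustrative only).
observed_nm = [450.0, 480.0, 510.0, 540.0]
predicted_nm = [480.0, 450.0, 540.0, 510.0]

error_nm = rmse(predicted_nm, observed_nm)
print(f"RMSE = {error_nm:.1f} nm")
```

Because every prediction in this toy set is off by exactly 30 nm, the RMSE equals 30 nm, the lower end of the range the abstract reports.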
Affiliation(s)
- Shambhavi Tannir
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Yuning Pan
- Department of Mathematics and Statistics, Boston University, Boston, Massachusetts 02215, United States
- Nathaniel Josephs
- Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, United States
- Nathan R Hendrick
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Annie Beckett
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- James McNeely
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Aaron Beeler
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Malika Jeffries-El
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Division of Material Science and Engineering, Boston University, Boston, Massachusetts 02215, United States
- Eric D Kolaczyk
- Department of Mathematics and Statistics, Boston University, Boston, Massachusetts 02215, United States
- Department of Mathematics and Statistics, McGill University, Montreal, QC H3A 0G4, Canada
4
Wossnig L, Furtmann N, Buchanan A, Kumar S, Greiff V. Best practices for machine learning in antibody discovery and development. Drug Discov Today 2024; 29:104025. [PMID: 38762089 DOI: 10.1016/j.drudis.2024.104025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 04/25/2024] [Accepted: 05/13/2024] [Indexed: 05/20/2024]
Abstract
In the past 40 years, therapeutic antibody discovery and development have advanced considerably, with machine learning (ML) offering a promising way to speed up the process by reducing costs and the number of experiments required. Recent progress in ML-guided antibody design and development (D&D) has been hindered by the diversity of data sets and evaluation methods, which makes it difficult to conduct comparisons and assess utility. Establishing standards and guidelines will be crucial for the wider adoption of ML and the advancement of the field. This perspective critically reviews current practices, highlights common pitfalls and proposes method development and evaluation guidelines for various ML-based techniques in therapeutic antibody D&D. Addressing challenges across the ML process, best practices are recommended for each stage to enhance reproducibility and progress.
Affiliation(s)
- Leonard Wossnig
- LabGenius Ltd, The Biscuit Factory, 100 Drummond Road, London SE16 4DG, UK; Department of Computer Science, University College London, 66-72 Gower St, London WC1E 6EA, UK.
- Norbert Furtmann
- R&D Large Molecules Research Platform, Sanofi Deutschland GmbH, Industriepark Höchst, Frankfurt Am Main, Germany
- Andrew Buchanan
- Biologics Engineering, R&D, AstraZeneca, Cambridge CB2 0AA, UK
- Sandeep Kumar
- Computational Protein Design and Modeling Group, Computational Science, Moderna Therapeutics, 200 Technology Square, Cambridge, MA 02139, USA
- Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, Oslo, Norway
5
Cavasotto CN, Di Filippo JI, Scardino V. Lessons learnt from machine learning in early stages of drug discovery. Expert Opin Drug Discov 2024; 19:631-633. [PMID: 38727031 DOI: 10.1080/17460441.2024.2354279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 05/08/2024] [Indexed: 05/22/2024]
Affiliation(s)
- Claudio N Cavasotto
- Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Pilar, Buenos Aires, Argentina
- Facultad de Ciencias Biomédicas, Universidad Austral, Pilar, Buenos Aires, Argentina
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Argentina
- Juan I Di Filippo
- Facultad de Ciencias Biomédicas, Universidad Austral, Pilar, Buenos Aires, Argentina
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Argentina
- Meton AI, Inc, Wilmington, DE, USA
- Valeria Scardino
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Argentina
- Meton AI, Inc, Wilmington, DE, USA
6
Moussa AY, Alanzi A, Luo J, Chung SK, Xu B. Potential anti-obesity effect of saponin metabolites from adzuki beans: A computational approach. Food Sci Nutr 2024; 12:3612-3627. [PMID: 38726452 PMCID: PMC11077217 DOI: 10.1002/fsn3.4032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 01/25/2024] [Accepted: 01/30/2024] [Indexed: 05/12/2024] [Open access]
Abstract
In contrast to its widespread traditional and popular culinary use to reduce weight, Vigna angularis (adzuki beans) has not been subjected to sufficient scientific scrutiny. In particular, the role of its saponins in the beans' antidiabetic and anti-obesity effects has never been investigated. Four vital pancreatic and intestinal carbohydrate enzymes were selected to assess the potency of the triterpenoidal saponins of V. angularis to bind and activate these proteins through high-precision molecular modeling and dynamics mechanisms with accurate molecular mechanics Generalized Born Surface Area (MMGBSA) energy calculations; thus, recognizing their anti-obesity potential. Our results showed that adzukisaponin VI and adzukisaponin IV were the best compounds in the α-amylase and α-glucosidase enzymatic grooves, respectively. Adzukisaponin VI and angulasaponin C were the best fitting in the N-termini of sucrase-isomaltase (SI) enzyme, and angulasaponin C was the best scoring compound in maltase-glucoamylase C-termini. All of them outperformed the standard drug acarbose. These compounds in their protein complexes were selected to undergo molecular simulations of the drug-bound protein compared to the apo-protein over 100 ns, which confirmed the consistency of binding to the key amino acid residues in the four enzyme pockets with the least propensity of unfolding. Detailed analysis is given of the different polar and hydrophobic binding interactions of docked compounds. While maltase-adzukisaponin VI complex scored the lowest MMGBSA free energy of -67.77 kcal/mol, α-amylase complex with angulasaponin B revealed the free binding energy of -74.18 kcal/mol with a dominance of van der Waals energy (ΔEVDW) and the least change from the start to the end of the simulation time. This study will direct researchers to the significance of isolating the pure adzuki saponin components to conduct future in vitro and in vivo experimental works and even clinical trials.
Affiliation(s)
- Ashaimaa Y. Moussa
- Department of Pharmacognosy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
- Abdullah Alanzi
- Department of Pharmacognosy, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Jinhai Luo
- Department of Life Sciences, Food Science and Technology Program, BNU‐HKBU United International College, Zhuhai, Guangdong, China
- Sookja Kim Chung
- Medical Faculty, Macau University of Science and Technology, Macau, China
- Baojun Xu
- Department of Life Sciences, Food Science and Technology Program, BNU‐HKBU United International College, Zhuhai, Guangdong, China
7
Baran K, Kloskowski A. Graph Neural Networks and Structural Information on Ionic Liquids: A Cheminformatics Study on Molecular Physicochemical Property Prediction. J Phys Chem B 2023; 127:10542-10555. [PMID: 38015981 PMCID: PMC10726349 DOI: 10.1021/acs.jpcb.3c05521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 11/01/2023] [Accepted: 11/16/2023] [Indexed: 11/30/2023]
Abstract
Ionic liquids (ILs) provide a promising solution in many industrial applications, such as solvents, absorbents, electrolytes, catalysts, lubricants, and many others. However, due to the enormous variety of their structures, uncovering or designing those with optimal attributes requires expensive and exhaustive simulations and experiments. For these reasons, searching for an efficient theoretical tool for finding the relationship between the IL structure and properties has been the subject of many research studies. Recently, special attention has been paid to machine learning tools, especially multilayer perceptron and convolutional neural networks, among many other algorithms in the field of artificial neural networks. For the latter, graph neural networks (GNNs) seem to be a powerful cheminformatic tool yet not well enough studied for dual molecular systems such as ILs. In this work, the usage of GNNs in structure-property studies is critically evaluated for predicting the density, viscosity, and surface tension of ILs. The problem of data availability and integrity is discussed to show how well GNNs deal with mislabeled chemical data. Providing more training data is proven to be more important than ensuring that they are immaculate. Great attention is paid to how GNNs process different ions to give graph transformations and electrostatic information. Clues on how GNNs should be applied to predict the properties of ILs are provided. Differences, especially regarding handling mislabeled data, favoring the use of GNNs over classical quantitative structure-property models are discussed.
Affiliation(s)
- Karol Baran
- Department of Physical Chemistry, Faculty of Chemistry, Gdansk University of Technology, Narutowicza Street 11/12, 80-233 Gdansk, Poland
- Adam Kloskowski
- Department of Physical Chemistry, Faculty of Chemistry, Gdansk University of Technology, Narutowicza Street 11/12, 80-233 Gdansk, Poland
8
Dias AL, Bustillo L, Rodrigues T. Limitations of representation learning in small molecule property prediction. Nat Commun 2023; 14:6394. [PMID: 37833279 PMCID: PMC10575963 DOI: 10.1038/s41467-023-41967-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/13/2023] [Accepted: 09/18/2023] [Indexed: 10/15/2023] [Open access]
Abstract
Machine learning is a powerful tool for the study and design of molecules. Here the authors comment on a recent publication in Nature Communications which highlights the challenges of different molecular representations for data-driven property predictions.
Affiliation(s)
- Ana Laura Dias
- Research Institute for Medicines (iMed), Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
- Latimah Bustillo
- Research Institute for Medicines (iMed), Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
- Tiago Rodrigues
- Research Institute for Medicines (iMed), Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
9
Shilpa S, Kashyap G, Sunoj RB. Recent Applications of Machine Learning in Molecular Property and Chemical Reaction Outcome Predictions. J Phys Chem A 2023; 127:8253-8271. [PMID: 37769193 DOI: 10.1021/acs.jpca.3c04779] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 09/30/2023]
Abstract
Burgeoning developments in machine learning (ML) and its rapidly growing adaptations in chemistry are noteworthy. Motivated by the successful deployments of ML in the realm of molecular property prediction (MPP) and chemical reaction prediction (CRP), herein we highlight some of its most recent applications in predictive chemistry. We present a nonmathematical and concise overview of the progression of ML implementations, ranging from an ensemble-based random forest model to advanced graph neural network algorithms. Similarly, the prospects of various feature engineering and feature learning approaches that work in conjunction with ML models are described. Highly accurate predictions reported in MPP tasks (e.g., lipophilicity, solubility, distribution coefficient), using methods such as D-MPNN, MolCLR, SMILES-BERT, and MolBERT, offer promising avenues in molecular design and drug discovery. Whereas MPP pertains to a given molecule, ML applications in chemical reactions present a different level of challenge, primarily arising from the simultaneous involvement of multiple molecules and their diverse roles in a reaction setting. The reported RMSEs in MPP tasks range from 0.287 to 2.20, while those for yield predictions are well over 4.9 in the lower end, reaching thresholds of >10.0 in several examples. Our Review concludes with a set of persisting challenges in dealing with reaction data sets and an overall optimistic outlook on benefits of ML-driven workflows for various MPP as well as CRP tasks.
Affiliation(s)
- Shilpa Shilpa
- Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Gargee Kashyap
- Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Raghavan B Sunoj
- Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Centre for Machine Intelligence and Data Science, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
10
Bustillo L, Laino T, Rodrigues T. The rise of automated curiosity-driven discoveries in chemistry. Chem Sci 2023; 14:10378-10384. [PMID: 37799997 PMCID: PMC10548516 DOI: 10.1039/d3sc03367h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Accepted: 09/07/2023] [Indexed: 10/07/2023] [Open access]
Abstract
The quest for generating novel chemistry knowledge is critical in scientific advancement, and machine learning (ML) has emerged as an asset in this pursuit. Through interpolation among learned patterns, ML can tackle tasks that were previously deemed demanding to machines. This distinctive capacity of ML provides invaluable aid to bench chemists in their daily work. However, current ML tools are typically designed to prioritize experiments with the highest likelihood of success, i.e., higher predictive confidence. In this perspective, we build on current trends that suggest a future in which ML could be just as beneficial in exploring uncharted search spaces through simulated curiosity. We discuss how low and 'negative' data can catalyse one-/few-shot learning, and how the broader use of curious ML and novelty detection algorithms can propel the next wave of chemical discoveries. We anticipate that ML for curiosity-driven research will help the community overcome potentially biased assumptions and uncover unexpected findings in the chemical sciences at an accelerated pace.
Affiliation(s)
- Latimah Bustillo
- Research Institute for Medicines (iMed), Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
- Teodoro Laino
- IBM Research Europe, Säumerstrasse 4, 8803 Rüschlikon, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Zurich, Switzerland
- Tiago Rodrigues
- Research Institute for Medicines (iMed), Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
11
Vidhya KS, Sultana A, M NK, Rangareddy H. Artificial Intelligence's Impact on Drug Discovery and Development From Bench to Bedside. Cureus 2023; 15:e47486. [PMID: 37881323 PMCID: PMC10597591 DOI: 10.7759/cureus.47486] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Accepted: 10/22/2023] [Indexed: 10/27/2023] [Open access]
Abstract
Artificial intelligence (AI) techniques have the potential to revolutionize drug release modeling, optimize therapy for personalized medicine, and minimize side effects. By applying AI algorithms, researchers can predict drug release profiles, incorporate patient-specific factors, and optimize dosage regimens to achieve tailored and effective therapies. This AI-based approach has the potential to improve treatment outcomes, enhance patient satisfaction, and advance the field of pharmaceutical sciences. International collaborations and professional organizations play vital roles in establishing guidelines and best practices for data collection and sharing. Open data initiatives can enhance transparency and scientific progress, facilitating algorithm validation.
Affiliation(s)
- K S Vidhya
- Bioinformatics, University of Visvesvaraya College of Engineering, Bangalore, IND
- Ayesha Sultana
- Pathology, St. George's University School of Medicine, St. George's, GRD
- Naveen Kumar M
- Pharmacology, Haveri Institute of Medical Sciences, Haveri, IND
12
Bustillo L, Rodrigues T. A focus on the use of real-world datasets for yield prediction. Chem Sci 2023; 14:4958-4960. [PMID: 37206402 PMCID: PMC10189867 DOI: 10.1039/d3sc90069j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023] [Open access]
Abstract
The prediction of reaction yields remains a challenging task for machine learning (ML), given the vast search spaces and absence of robust training data. Wiest, Chawla et al. (https://doi.org/10.1039/D2SC06041H) show that a deep learning algorithm performs well on high-throughput experimentation data but surprisingly poorly on real-world, historical data from a pharmaceutical company. The result suggests that there is considerable room for improvement when coupling ML to electronic laboratory notebook data.
Affiliation(s)
- Latimah Bustillo
- Research Institute for Medicines (iMed), Faculty of Pharmacy, University of Lisbon, Av Prof Gama Pinto, 1649-003 Lisbon, Portugal
- Tiago Rodrigues
- Research Institute for Medicines (iMed), Faculty of Pharmacy, University of Lisbon, Av Prof Gama Pinto, 1649-003 Lisbon, Portugal
13
Ferreira AIS, da Silva NFF, Mesquita FN, Rosa TC, Monzón VH, Mesquita-Neto JN. Automatic acoustic recognition of pollinating bee species can be highly improved by Deep Learning models accompanied by pre-training and strong data augmentation. Front Plant Sci 2023; 14:1081050. [PMID: 37123860 PMCID: PMC10140520 DOI: 10.3389/fpls.2023.1081050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 03/20/2023] [Indexed: 05/03/2023]
Abstract
Introduction: Bees capable of performing floral sonication (or buzz-pollination) are among the most effective pollinators of blueberries. However, the quality of pollination provided varies greatly among species visiting the flowers. Consequently, the correct identification of flower visitors becomes indispensable to distinguishing the most efficient pollinators of blueberry. However, taxonomic identification normally depends on microscopic characteristics and the active participation of experts in the decision-making process. Moreover, the many species of bees (20,507 worldwide) and other insects are a challenge for a decreasing number of insect taxonomists. To overcome the limitations of traditional taxonomy, automatic classification systems of insects based on Machine-Learning (ML) have been raised for detecting and distinguishing a wide variety of bioacoustic signals, including bee buzzing sounds. Despite that, classical ML algorithms fed by spectrogram-type data only reached marginal performance for bee ID recognition. On the other hand, emerging systems from Deep Learning (DL), especially Convolutional Neural Networks (CNNs), have provided a substantial boost to classification performance in other audio domains, but have yet to be tested for acoustic bee species recognition tasks. Therefore, we aimed to automatically identify blueberry pollinating bee species based on characteristics of their buzzing sounds using DL algorithms.
Methods: We designed CNN models combined with Log Mel-Spectrogram representations and strong data augmentation and compared their performance at recognizing blueberry pollinating bee species with the current state-of-the-art models for automatic recognition of bee species.
Results and Discussion: We found that CNN models performed better at assigning bee buzzing sounds to their respective taxa than expected by chance. However, CNN models were highly dependent on acoustic data pre-training and data augmentation to outperform classical ML classifiers in recognizing bee buzzing sounds. Under these conditions, the CNN models could lead to automating the taxonomic recognition of flower-visiting bees of blueberry crops. However, there is still room to improve the performance of CNN models by focusing on recording samples for poorly represented bee species. Automatic acoustic recognition associated with the degree of efficiency of a bee species to pollinate a particular crop would result in a comprehensive and powerful tool for recognizing those that best pollinate and increase fruit yields.
Affiliation(s)
- Thierson Couto Rosa
- Instituto de Informática, Universidade Federal de Goiás, Goiânia, Goiás, Brazil
- Victor Hugo Monzón
- Laboratorio Ecología de Abejas, Departamento de Biología y Química, Facultad de Ciencias Básicas, Universidad Católica del Maule, Talca, Chile
- José Neiva Mesquita-Neto
- Laboratorio Ecología de Abejas, Departamento de Biología y Química, Facultad de Ciencias Básicas, Universidad Católica del Maule, Talca, Chile
14
Yang CI, Li YP. Explainable uncertainty quantifications for deep learning-based molecular property prediction. J Cheminform 2023; 15:13. [PMID: 36737786 PMCID: PMC9898940 DOI: 10.1186/s13321-023-00682-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/10/2022] [Accepted: 01/15/2023] [Indexed: 02/05/2023] [Open access]
Abstract
Quantifying uncertainty in machine learning is important in new research areas with scarce high-quality data. In this work, we develop an explainable uncertainty quantification method for deep learning-based molecular property prediction. This method can capture aleatoric and epistemic uncertainties separately and attribute the uncertainties to atoms present in the molecule. The atom-based uncertainty method provides an extra layer of chemical insight to the estimated uncertainties, i.e., one can analyze individual atomic uncertainty values to diagnose the chemical component that introduces uncertainty to the prediction. Our experiments suggest that atomic uncertainty can detect unseen chemical structures and identify chemical species whose data are potentially associated with significant noise. Furthermore, we propose a post-hoc calibration method to refine the uncertainty quantified by ensemble models for better confidence interval estimates. This work improves uncertainty calibration and provides a framework for assessing whether and why a prediction should be considered unreliable.
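The split between aleatoric and epistemic uncertainty described in this abstract can be illustrated with a common ensemble recipe. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each ensemble member outputs a predicted mean and a predicted noise variance for one molecule, takes the average predicted variance as the aleatoric part and the spread of the member means as the epistemic part. The function name and the numbers are hypothetical, and the atom-level attribution from the paper is not reproduced here.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split ensemble predictive uncertainty for one input.

    Assumption: aleatoric = mean of the members' predicted variances,
    epistemic = population variance of the members' predicted means.
    """
    if len(member_means) != len(member_variances) or not member_means:
        raise ValueError("inputs must be non-empty and of equal length")
    aleatoric = fmean(member_variances)
    epistemic = pvariance(member_means)
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical outputs of a four-member ensemble for a single molecule.
means = [1.0, 1.2, 0.8, 1.0]
variances = [0.04, 0.05, 0.03, 0.04]

aleatoric, epistemic, total = decompose_uncertainty(means, variances)
print(f"aleatoric={aleatoric:.3f} epistemic={epistemic:.3f} total={total:.3f}")
```

Under this recipe, disagreement between members (epistemic) tends to grow for structures unlike the training data, while the averaged predicted variance (aleatoric) reflects noise the models attribute to the data itself.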
Affiliation(s)
- Chu-I Yang
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617 Taiwan
- Yi-Pei Li
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617 Taiwan
- Taiwan International Graduate Program (TIGP), Academia Sinica, No. 128, Sec. 2, Academia Road, Taipei, 11529 Taiwan
15
McNair D. Artificial Intelligence and Machine Learning for Lead-to-Candidate Decision-Making and Beyond. Annu Rev Pharmacol Toxicol 2023; 63:77-97. [PMID: 35679624 DOI: 10.1146/annurev-pharmtox-051921-023255] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/25/2023]
Abstract
The use of artificial intelligence (AI) and machine learning (ML) in pharmaceutical research and development has to date focused on research: target identification; docking-, fragment-, and motif-based generation of compound libraries; modeling of synthesis feasibility; rank-ordering likely hits according to structural and chemometric similarity to compounds having known activity and affinity to the target(s); optimizing a smaller library for synthesis and high-throughput screening; and combining evidence from screening to support hit-to-lead decisions. Applying AI/ML methods to lead optimization and lead-to-candidate (L2C) decision-making has shown slower progress, especially regarding predicting absorption, distribution, metabolism, excretion, and toxicology properties. The present review surveys reasons why this is so, reports progress that has occurred in recent years, and summarizes some of the issues that remain. Effective AI/ML tools to derisk L2C and later phases of development are important to accelerate the pharmaceutical development process, ameliorate escalating development costs, and achieve greater success rates.
Affiliation(s)
- Douglas McNair
- Global Health, Integrated Development, Bill & Melinda Gates Foundation, Seattle, Washington, USA
|
16
|
Shirasawa R, Takemura I, Hattori S, Nagata Y. A semi-automated material exploration scheme to predict the solubilities of tetraphenylporphyrin derivatives. Commun Chem 2022; 5:158. [PMID: 36697881 PMCID: PMC9814751 DOI: 10.1038/s42004-022-00770-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 11/04/2022] [Indexed: 11/24/2022] Open
Abstract
Acceleration of material discovery has been tackled by informatics and laboratory automation. Here we show a semi-automated material exploration scheme to model the solubility of tetraphenylporphyrin derivatives. The scheme involved the following steps: definition of a practical chemical search space, prioritization of molecules in the space using an extended algorithm for submodular function maximization without requiring biased variable selection or pre-existing data, synthesis and automated measurement, and machine-learning model estimation. The optimal evaluation order selected using the algorithm covered several similar molecules (32% of all targeted molecules, whereas that obtained by random sampling and uncertainty sampling was ~7% and ~4%, respectively) with a small number of evaluations (10 molecules: 0.13% of all targeted molecules). The derived binary classification models predicted 'good solvents' with an accuracy >0.8. Overall, we confirmed the effectiveness of the proposed semi-automated scheme in early-stage material search projects for accelerating a wider range of material research.
Affiliation(s)
- Raku Shirasawa
- Advanced Research Laboratory, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan
- Ichiro Takemura
- Tokyo Laboratory 26, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan
- Shinnosuke Hattori
- Advanced Research Laboratory, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan
- Yuuya Nagata
- Institute for Chemical Reaction Design and Discovery, Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan
|
17
|
Strieth-Kalthoff F, Sandfort F, Kühnemund M, Schäfer FR, Kuchen H, Glorius F. Machine Learning for Chemical Reactivity: The Importance of Failed Experiments. Angew Chem Int Ed Engl 2022; 61:e202204647. [PMID: 35512117 DOI: 10.1002/anie.202204647] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 03/29/2022] [Indexed: 12/27/2022]
Abstract
Assessing the outcomes of chemical reactions in a quantitative fashion has been a cornerstone across all synthetic disciplines. Classically approached through empirical optimization, data-driven modelling bears an enormous potential to streamline this process. However, such predictive models require significant quantities of high-quality data, the availability of which is limited; the main reasons for this include experimental errors and, importantly, human biases regarding experiment selection and result reporting. In a series of case studies, we investigate the impact of these biases for drawing general conclusions from chemical reaction data, revealing the utmost importance of "negative" examples. Eventually, case studies into data expansion approaches showcase directions to circumvent these limitations and demonstrate perspectives towards a long-term data quality enhancement in chemistry.
Affiliation(s)
- Felix Strieth-Kalthoff
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
- Frederik Sandfort
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
- Marius Kühnemund
- Westfälische Wilhelms-Universität Münster, Department for Information Systems, Leonardo-Campus 3, 48149, Münster, Germany
- Felix R Schäfer
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
- Herbert Kuchen
- Westfälische Wilhelms-Universität Münster, Department for Information Systems, Leonardo-Campus 3, 48149, Münster, Germany
- Frank Glorius
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
|
18
|
Strieth-Kalthoff F, Sandfort F, Kühnemund M, Schäfer FR, Kuchen H, Glorius F. Maschinelles Lernen zur Vorhersage chemischer Reaktivität: Die Bedeutung “gescheiterter” Experimente [Machine Learning for Predicting Chemical Reactivity: The Importance of "Failed" Experiments]. Angew Chem Int Ed Engl 2022. [DOI: 10.1002/ange.202204647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Felix Strieth-Kalthoff
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
- Frederik Sandfort
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
- Marius Kühnemund
- Westfälische Wilhelms-Universität Münster, Department for Information Systems, Leonardo-Campus 3, 48149, Münster, Germany
- Felix R. Schäfer
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
- Herbert Kuchen
- Westfälische Wilhelms-Universität Münster, Department for Information Systems, Leonardo-Campus 3, 48149, Münster, Germany
- Frank Glorius
- Westfälische Wilhelms-Universität Münster, Organisch-Chemisches Institut, Corrensstr. 40, 48149, Münster, Germany
|
19
|
Bender A, Schneider N, Segler M, Patrick Walters W, Engkvist O, Rodrigues T. Evaluation guidelines for machine learning tools in the chemical sciences. Nat Rev Chem 2022; 6:428-442. [PMID: 37117429 DOI: 10.1038/s41570-022-00391-9] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Accepted: 04/13/2022] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (or the impossibility of) comparing and assessing the relevance of new algorithms. Ultimately, this may delay the digitalization of chemistry at scale and confuse method developers, experimentalists, reviewers and journal editors. In this Perspective, we critically discuss a set of method development and evaluation guidelines for different types of ML-based publications, emphasizing supervised learning. We provide a diverse collection of examples from various authors and disciplines in chemistry. While taking into account varying accessibility across research groups, our recommendations focus on reporting completeness and standardizing comparisons between tools. We aim to further contribute to improved ML transparency and credibility by suggesting a checklist of retro-/prospective tests and dissecting their importance. We envisage that the wide adoption and continuous update of best practices will encourage an informed use of ML on real-world problems related to the chemical sciences.
|
20
|
Talebian S, Rodrigues T, das Neves J, Sarmento B, Langer R, Conde J. Facts and Figures on Materials Science and Nanotechnology Progress and Investment. ACS Nano 2021; 15:15940-15952. [PMID: 34320802 DOI: 10.1021/acsnano.1c03992] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 06/13/2023]
Abstract
As the twenty-first century unfolds, nanotechnology is no longer just a buzzword in the field of materials science, but rather a tangible reality. This is evident from the surging number of commercial nanoproducts and their corresponding revenue generated in different industry sectors. However, it is important to recognize that sustainable growth of nanotechnology is heavily dependent on government funding and relevant national incentive programs. Consequently, proper analyses on publicly available nanotechnology data sets comprising information on the past two decades can be illuminating, facilitate development, and amend previous strategies as we move forward. Along these lines, classical statistics and machine learning (ML) allow processing large data sets to scrutinize patterns in materials science and nanotechnology research. Herein, we provide an analysis on nanotechnology progress and investment from an unbiased, computational vantage point and using orthogonal approaches. Our data reveal both well-established and surprising correlations in the nanotechnology field and its actors, including the interplay between the number of research institutes-industry, publications-patents, collaborative research, and top contributors to nanoproducts. Overall, data suggest that, supported by incentive programs set out by stakeholders (researchers, funding agencies, policy makers, and industry), nanotechnology could experience an exponential growth and become a centerpiece for economical welfare. Indeed, the recent success of COVID-19 vaccines is also likely to boost public trust in nanotechnology and its global impact over the coming years.
Affiliation(s)
- Sepehr Talebian
- Intelligent Polymer Research Institute, ARC Centre of Excellence for Electromaterials Science, AIIM Facility, University of Wollongong, Wollongong, NSW 2522, Australia
- Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, NSW 2522, Australia
- Tiago Rodrigues
- Research Institute for Medicines (iMed), Faculdade de Farmácia, Universidade de Lisboa, Avenida Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- José das Neves
- i3S-Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- INEB-Instituto de Engenharia Biomédica, Universidade do Porto, 4200-135 Porto, Portugal
- CESPU, IINFACTS-Institute for Research and Advanced Training in Health Sciences and Technologies, Avenida Central de Gandra, 1317, 4585-116 Gandra, Portugal
- Bruno Sarmento
- i3S-Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- INEB-Instituto de Engenharia Biomédica, Universidade do Porto, 4200-135 Porto, Portugal
- CESPU, IINFACTS-Institute for Research and Advanced Training in Health Sciences and Technologies, Avenida Central de Gandra, 1317, 4585-116 Gandra, Portugal
- Robert Langer
- David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 500 Main Street, Cambridge, Massachusetts 02139, United States
- João Conde
- NOVA Medical School, Faculdade de Ciências Médicas, Universidade Nova de Lisboa, 1169-056 Lisboa, Portugal
- Centre for Toxicogenomics and Human Health, Genetics, Oncology and Human Toxicology, NOVA Medical School, Faculdade de Ciências Médicas, Universidade Nova de Lisboa, 1169-056 Lisboa, Portugal
|
21
|
Newman DJ. Problems that Can Occur when Assaying Extracts to Pure Compounds in Biological Systems. Curr Ther Res Clin Exp 2021; 95:100645. [PMID: 34691294 PMCID: PMC8515388 DOI: 10.1016/j.curtheres.2021.100645] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/11/2021] [Accepted: 09/11/2021] [Indexed: 12/01/2022] Open
Abstract
For a significant number of years, scientists of many persuasions have assayed natural product materials ranging from crude extracts to pure compounds, in a multitude of assays causally related to some biological processes. However, in a very significant number of submitted papers and published articles, what may be considered as canned biological assays were used, and if a positive effect was observed, then the authors would claim that the material assayed was a potential drug lead. This also occurred with pure synthetic compounds and compounds derived from natural products by simple chemical modifications. However, what has now become quite obvious—with all such classes of materials—is that there are many promiscuous players with multiple bioactivities. These can range from relatively crude extracts, pure compounds from natural products, synthetic processes that produce natural product derivatives, and even compounds that are truly synthetic in origin. There is also a potential problem with the data from crude to purified extracts being used to claim some form of beneficial activities for such materials, to sell that particular mixture to the lay public, by very careful descriptions of its possible uses due to legal hurdles. With the advent of artificial intelligence and very large compound databases, some of which may well contain impure materials, scientists from a variety of backgrounds have begun to utilize such listings to obtain compounds for their low to high throughput biological screens, without realizing that there are very significant numbers of active compounds (eg, pan assay interference compounds and invalid metabolic panaceas), that will hit in many different screens for a variety of reasons, thus leading to significant wasted efforts and published scientific articles that have incorrect results. 
This commentary gives some of the history of such materials but is designed to be used as a warning to both researchers and, in particular, journal editors and reviewers, that reports of biological results that are claimed to be the result of the compounds used need to be very carefully screened for results due to such promiscuous compounds, irrespective of their nominal source(s). All literature searches were made by the author and the background knowledge has come from more than 55 years of research in industry and governmental laboratories in both the United Kingdom and the United States, for enzyme inhibitors/activators as well as antimicrobial and antitumor lead compounds mainly from natural product sources. The conclusion that I came up with as a result is this: Caveat emptor. © 2021 Elsevier HS Journals, Inc.
Affiliation(s)
- David J. Newman
- Address correspondence to: 664 Crestwood Rd, Wayne, PA 19087.
|
22
|
Lee K, Yang A, Lin YC, Reker D, Bernardes GJ, Rodrigues T. Combating small-molecule aggregation with machine learning. Cell Reports Physical Science 2021; 2:100573. [DOI: 10.1016/j.xcrp.2021.100573] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/06/2025]
|
23
|
Almeida AF, Ataíde FAP, Loureiro RMS, Moreira R, Rodrigues T. Augmenting Adaptive Machine Learning with Kinetic Modeling for Reaction Optimization. J Org Chem 2021; 86:14192-14198. [PMID: 34235919 DOI: 10.1021/acs.joc.1c01038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/22/2022]
Abstract
We combine random sampling and active machine learning (ML) to optimize the synthesis of isomacroin, executing only 3% of all possible Friedländer reactions. Employing kinetic modeling, we augment machine intuition by extracting mechanistic knowledge and verify that a global optimum was obtained with ML. Our study contributes evidence on the potential of multiscale approaches to expedite the access to chemical matter, further democratizing organic chemistry in a data-motivated fashion.
Affiliation(s)
- A Filipa Almeida
- R&D, Process Chemistry Development, Hovione FarmaCiência S.A, Campus do Lumiar, Building S 1649-038 Lisboa, Portugal; Research Institute for Medicines (iMed.Ulisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
- Filipe A P Ataíde
- R&D, Process Chemistry Development, Hovione FarmaCiência S.A, Campus do Lumiar, Building S 1649-038 Lisboa, Portugal
- Rui M S Loureiro
- R&D, Process Chemistry Development, Hovione FarmaCiência S.A, Campus do Lumiar, Building S 1649-038 Lisboa, Portugal
- Rui Moreira
- Research Institute for Medicines (iMed.Ulisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
- Tiago Rodrigues
- Research Institute for Medicines (iMed.Ulisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
|
24
|
Lawson ADG, MacCoss M, Baeten DL, Macpherson A, Shi J, Henry AJ. Modulating Target Protein Biology Through the Re-mapping of Conformational Distributions Using Small Molecules. Front Chem 2021; 9:668186. [PMID: 34017820 PMCID: PMC8129178 DOI: 10.3389/fchem.2021.668186] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/15/2021] [Accepted: 03/30/2021] [Indexed: 12/13/2022] Open
Abstract
Over the last 10 years considerable progress has been made in the application of small molecules to modulating protein-protein interactions (PPIs), and the navigation from "undruggable" to a host of candidate molecules in clinical trials has been well-charted in recent, comprehensive reviews. Structure-based design has played an important role in this scientific journey, with three dimensional structures guiding medicinal chemistry efforts. However, the importance of two additional dimensions: movement and time is only now being realised, as increasing computing power, closely aligned with wet lab validation, is applied to the challenge. Protein dynamics are fundamental to biology and disease, and application to PPI drug discovery has massively widened the scope for new chemical entities to influence function from allosteric, and previously unreported, sites. In this forward-looking perspective we highlight exciting, new opportunities for small molecules to modulate disease biology, by adjusting the frequency profile of natural conformational sampling, through the stabilisation of clinically desired conformers of target proteins.
Affiliation(s)
- Jiye Shi
- UCB Pharma, Slough, United Kingdom
|
25
|
Taking the leap between analytical chemistry and artificial intelligence: A tutorial review. Anal Chim Acta 2021; 1161:338403. [DOI: 10.1016/j.aca.2021.338403] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 12/04/2020] [Revised: 03/02/2021] [Accepted: 03/03/2021] [Indexed: 01/01/2023]
|
26
|
Abstract
Introduction: Artificial Intelligence (AI) has become a component of our everyday lives, with applications ranging from recommendations on what to buy to the analysis of radiology images. Many of the techniques originally developed for other fields such as language translation and computer vision are now being applied in drug discovery. AI has enabled multiple aspects of drug discovery including the analysis of high content screening data, and the design and synthesis of new molecules. Areas covered: This perspective provides an overview of the application of AI in several areas relevant to drug discovery including property prediction, molecule generation, image analysis, and organic synthesis planning. Expert opinion: While a variety of machine learning methods are now being routinely used to predict biological activity and ADME properties, methods of representing molecules continue to evolve. Molecule generation methods are relatively new and unproven but hold the potential to access new, unexplored areas of chemical space. The application of AI in drug discovery will continue to benefit from dedicated research, as well as AI developments in other fields. With this pairing of algorithmic advancements and high-quality data, the impact of AI in drug discovery will continue to grow in the coming years.
Affiliation(s)
- Regina Barzilay
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
|