1
Venhorst J, Kalkman G. Drug target assessments: classifying target modulation and associated health effects using multi-level BERT-based classification models. BIOINFORMATICS ADVANCES 2025; 5:vbaf043. [PMID: 40110561 PMCID: PMC11919816 DOI: 10.1093/bioadv/vbaf043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2024] [Revised: 01/10/2025] [Accepted: 03/04/2025] [Indexed: 03/22/2025]
Abstract
Motivation: Drug target selection determines the success of the drug development pipeline. Therefore, novel drug targets need to be assessed for their therapeutic benefits/risks at the earliest stage possible. Where manual risk/benefit analyses are often user-biased and time-consuming, Large Language Models can offer a systematic and efficient approach to curating and analysing literature. Currently, publicly available Large Language Models are lacking for this task, while public platforms for target assessments are limited to co-occurrences. Results: BERT models for multi-level classification of drug target-health effect relationships described in PubMed were developed. Relationships were classified based on (i) causality; (ii) direction of target modulation; (iii) direction of the associated health effect. The models showed competitive performances with F1 scores between 0.86 and 0.92, and their applicability was demonstrated using ADAM33 and OSM as case studies. The developed classification pipeline is the first to allow detailed classification of drug target-health effect relationships. The models provide mechanistic insight into how target modulation affects health and disease, both from an efficacy and safety perspective. The models, deployed on the whole of PubMed and available through the TargetTri platform, are expected to offer a significant advancement in artificial intelligence-assisted target identification and evaluation. Availability and implementation: https://www.targettri.com.
Affiliation(s)
- Jennifer Venhorst
  - Biomedical and Digital Health, The Netherlands Organization for Applied Scientific Research (TNO), Utrecht 3584 CB, The Netherlands
- Gino Kalkman
  - Biomedical and Digital Health, The Netherlands Organization for Applied Scientific Research (TNO), Utrecht 3584 CB, The Netherlands
2
Das SK, Mishra R, Samanta A, Shil D, Roy SD. Deep learning: A game changer in drug design and development. ADVANCES IN PHARMACOLOGY (SAN DIEGO, CALIF.) 2025; 103:101-120. [PMID: 40175037 DOI: 10.1016/bs.apha.2025.01.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2025]
Abstract
The lengthy and costly drug discovery process is being transformed by deep learning, a subfield of artificial intelligence. Deep learning technologies expedite the procedure, increasing treatment success rates and speeding life-saving procedures. Deep learning stands out in target identification and lead selection. It greatly accelerates the initial stage by analyzing large datasets of biological data to identify possible therapeutic targets and rank targeted drug molecules with desired features. Predicting possible adverse effects is another significant challenge. Deep learning offers prompt and efficient assistance with toxicology prediction: in a very short time, deep learning algorithms can forecast a new drug's possible harm. This enables researchers to concentrate on safer alternatives and steer clear of late-stage failures brought on by unanticipated toxicity. Deep learning also unlocks the possibility of drug repurposing; by examining currently available medications, it is possible to find entirely new therapeutic uses. This method speeds up the development of treatments for diseases that were previously incurable. De novo drug discovery is made possible by deep learning: when combined with sophisticated computational modeling, it can create completely new medications from the ground up. Deep learning can recommend and direct researchers towards new drug candidates with high binding affinities and intended therapeutic effects by examining the molecular structures of disease targets. This provides focused and personalized medication. Lastly, drug characteristics can be optimized with the aid of deep learning. Researchers can create medications with higher bioavailability and lower toxicity by forecasting drug pharmacokinetics. In conclusion, deep learning promises to accelerate drug development, reduce costs, and ultimately save lives.
Affiliation(s)
- Sushanta Kumar Das
  - Mata Gujri College of Pharmacy, Mata Gujri University, Kishanganj, Bihar, India
- Rahul Mishra
  - Pharmacokinetics Scientist, Phase 1 Clinical Trial, Celerion IMC, Rose Street, Lincoln, NE, United States
- Amit Samanta
  - Mata Gujri College of Pharmacy, Mata Gujri University, Kishanganj, Bihar, India
- Dibyendu Shil
  - Mata Gujri College of Pharmacy, Mata Gujri University, Kishanganj, Bihar, India
- Saumendu Deb Roy
  - Mata Gujri College of Pharmacy, Mata Gujri University, Kishanganj, Bihar, India
3
Zhang O, Huang Y, Cheng S, Yu M, Zhang X, Lin H, Zeng Y, Wang M, Wu Z, Zhao H, Zhang Z, Hua C, Kang Y, Cui S, Pan P, Hsieh CY, Hou T. FragGen: towards 3D geometry reliable fragment-based molecular generation. Chem Sci 2024; 15:19452-19465. [PMID: 39568888 PMCID: PMC11575641 DOI: 10.1039/d4sc04620j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2024] [Accepted: 10/11/2024] [Indexed: 11/22/2024] Open
Abstract
3D structure-based molecular generation is a successful application of generative AI in drug discovery. Most earlier models follow an atom-wise paradigm, generating molecules with good docking scores but poor molecular properties (such as synthesizability and druggability). In contrast, fragment-wise generation offers a promising alternative by assembling chemically viable fragments. However, the co-design of plausible chemical and geometrical structures is still challenging, as evidenced by existing models. To address this, we introduce the Deep Geometry Handling protocol, which decomposes the entire geometry into multiple sets of geometric variables, looking beyond model architecture design. Drawing from a newly defined six-category taxonomy, we propose FragGen, a novel hybrid strategy as the first geometry-reliable, fragment-wise molecular generation method. FragGen significantly enhances both the geometric quality and synthesizability of the generated molecules, overcoming major limitations of previous models. Moreover, FragGen has been successfully applied in real-world scenarios, notably in designing type II kinase inhibitors at the ∼nM level, establishing it as the first validated 3D fragment-based drug design algorithm. We believe that this concept-algorithm-application cycle will not only inspire researchers working on other geometry-centric tasks to move beyond architecture designs but also provide a solid example of how generative AI can be customized for drug design.
Affiliation(s)
- Odin Zhang
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Yufei Huang
  - Zhejiang University Hangzhou 310058 Zhejiang China
- Shichen Cheng
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Mengyao Yu
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Xujun Zhang
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Haitao Lin
  - Zhejiang University Hangzhou 310058 Zhejiang China
- Yundian Zeng
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Mingyang Wang
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Zhenxing Wu
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Huifeng Zhao
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Zaixi Zhang
  - Anhui Province Key Lab of Big Data Analysis and Application, University of Science and Technology of China Hefei Anhui China
- Chenqing Hua
  - Montreal Institute for Learning Algorithms, McGill University Montreal QC Canada
- Yu Kang
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Sunliang Cui
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Peichen Pan
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Chang-Yu Hsieh
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
4
Venhorst J, Hanemaaijer R, Dulos R, Caspers MPM, Toet K, Attema J, de Ruiter C, Kalkman G, Rouhani Rankouhi T, de Jong JCBC, Verschuren L. Integrating text mining with network models for successful target identification: in vitro validation in MASH-induced liver fibrosis. Front Pharmacol 2024; 15:1442752. [PMID: 39399467 PMCID: PMC11466758 DOI: 10.3389/fphar.2024.1442752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2024] [Accepted: 08/28/2024] [Indexed: 10/15/2024] Open
Abstract
An in silico target discovery pipeline was developed by including a directional and weighted molecular disease network for metabolic dysfunction-associated steatohepatitis (MASH)-induced liver fibrosis. This approach integrates text mining, network biology, and artificial intelligence/machine learning with clinical transcriptome data for optimal translational power. At the mechanistic level, the critical components influencing disease progression were identified from the disease network using in silico knockouts. The top-ranked genes were then subjected to a target efficacy analysis, following which the top-5 candidate targets were validated in vitro. Three targets, including EP300, were confirmed for their roles in liver fibrosis. EP300 gene-silencing was found to significantly reduce collagen by 37%; compound intervention studies performed in human primary hepatic stellate cells and the hepatic stellate cell line LX-2 showed significant inhibition of collagen to the extent of 81% compared to the TGFβ-stimulated control (1 μM inobrodib in LX-2 cells). The validated in silico pipeline presents a unique approach for the identification of human-disease-mechanism-relevant drug targets. The directionality of the network ensures adherence to physiologically relevant signaling cascades, while the inclusion of clinical data boosts its translational power and ensures identification of the most relevant disease pathways. In silico knockouts thus provide crucial molecular insights for successful target identification.
Affiliation(s)
- Jennifer Venhorst
  - Biomedical and Digital Health, The Netherlands Organization for Applied Scientific Research (TNO), Utrecht, Netherlands
- Roeland Hanemaaijer
  - Department of Metabolic Health Research, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Remon Dulos
  - Department of Microbiology and Systems Biology, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Martien P. M. Caspers
  - Department of Microbiology and Systems Biology, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Karin Toet
  - Department of Metabolic Health Research, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Joline Attema
  - Department of Metabolic Health Research, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Christa de Ruiter
  - Department of Metabolic Health Research, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Gino Kalkman
  - Biomedical and Digital Health, The Netherlands Organization for Applied Scientific Research (TNO), Utrecht, Netherlands
- Tanja Rouhani Rankouhi
  - Biomedical and Digital Health, The Netherlands Organization for Applied Scientific Research (TNO), Utrecht, Netherlands
- Jelle C. B. C. de Jong
  - Department of Microbiology and Systems Biology, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
- Lars Verschuren
  - Department of Microbiology and Systems Biology, The Netherlands Organization for Applied Scientific Research (TNO), Leiden, Netherlands
5
Rosa LS, Argolo CO, Nascimento CM, Pimentel AS. Identifying Substructures That Facilitate Compounds to Penetrate the Blood-Brain Barrier via Passive Transport Using Machine Learning Explainer Models. ACS Chem Neurosci 2024; 15:2144-2159. [PMID: 38723285 PMCID: PMC11157485 DOI: 10.1021/acschemneuro.3c00840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 06/06/2024] Open
Abstract
The local interpretable model-agnostic explanation (LIME) method was used to interpret two machine learning models of compounds penetrating the blood-brain barrier. The classification models, Random Forest, ExtraTrees, and Deep Residual Network, were trained and validated using the blood-brain barrier penetration dataset, which shows the penetrability of compounds in the blood-brain barrier. LIME was able to create explanations for such penetrability, highlighting the most important substructures of molecules that affect drug penetration in the barrier. The simple and intuitive outputs prove the applicability of this explainable model to interpreting the permeability of compounds across the blood-brain barrier in terms of molecular features. LIME explanations were filtered with a weight equal to or greater than 0.1 to obtain only the most relevant explanations. The results showed several structures that are important for blood-brain barrier penetration. In general, it was found that some compounds with nitrogenous substructures are more likely to permeate the blood-brain barrier. The application of these structural explanations may help the pharmaceutical industry and potential drug synthesis research groups to synthesize active molecules more rationally.
Affiliation(s)
- Lucca Caiaffa Santos Rosa
  - Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
- Caio Oliveira Argolo
  - Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
- Andre Silva Pimentel
  - Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
6
Barakat A, Munro G, Heegaard AM. Finding new analgesics: Computational pharmacology faces drug discovery challenges. Biochem Pharmacol 2024; 222:116091. [PMID: 38412924 DOI: 10.1016/j.bcp.2024.116091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 01/10/2024] [Accepted: 02/23/2024] [Indexed: 02/29/2024]
Abstract
Despite the worldwide prevalence and huge burden of pain, pain is an undertreated phenomenon. Currently used analgesics have several limitations regarding their efficacy and safety. The discovery of analgesics possessing a novel mechanism of action has faced multiple challenges, including a limited understanding of biological processes underpinning pain and analgesia and poor animal-to-human translation. Computational pharmacology is currently employed to face these challenges. In this review, we discuss the theory, methods, and applications of computational pharmacology in pain research. Computational pharmacology encompasses a wide variety of theoretical concepts and practical methodological approaches, with the overall aim of gaining biological insight through data acquisition and analysis. Data are acquired from patients or animal models with pain or analgesic treatment, at different levels of biological organization (molecular, cellular, physiological, and behavioral). Distinct methodological algorithms can then be used to analyze and integrate data. This helps to facilitate the identification of biological molecules and processes associated with pain phenotype, build quantitative models of pain signaling, and extract translatable features between humans and animals. However, computational pharmacology has several limitations, and its predictions can provide false positive and negative findings. Therefore, computational predictions are required to be validated experimentally before drawing solid conclusions. In this review, we discuss several case study examples of combining and integrating computational tools with experimental pain research tools to meet drug discovery challenges.
Affiliation(s)
- Ahmed Barakat
  - Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Pharmacology and Toxicology, Faculty of Pharmacy, Assiut University, Assiut, Egypt
- Anne-Marie Heegaard
  - Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
7
Chen X, Huang L. Computational model for drug research. Brief Bioinform 2024; 25:bbae158. [PMID: 38581423 PMCID: PMC10998638 DOI: 10.1093/bib/bbae158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 03/22/2024] [Indexed: 04/08/2024] Open
Abstract
This special issue focuses on computational model for drug research regarding drug bioactivity prediction, drug-related interaction prediction, modelling for immunotherapy and modelling for treatment of a specific disease, as conveyed by the following six research and four review articles. Notably, these 10 papers described a wide variety of in-depth drug research from the computational perspective and may represent a snapshot of the wide research landscape.
Affiliation(s)
- Xing Chen
  - School of Science, Jiangnan University, Wuxi, 214122, China
- Li Huang
  - The Future Laboratory, Tsinghua University, Beijing, 100084, China
8
Tran TTV, Tayara H, Chong KT. Recent Studies of Artificial Intelligence on In Silico Drug Absorption. J Chem Inf Model 2023; 63:6198-6211. [PMID: 37819031 DOI: 10.1021/acs.jcim.3c00960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
Absorption is an important area of research in pharmacochemistry and drug development, because the drug has to be absorbed before any drug effects can occur. Furthermore, the ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) profile of drugs can be directly and considerably altered by modulating factors affecting absorption. Many drugs in development fail because of poor absorption. The research and continuous efforts of researchers in recent years have brought many successes and promises in drug absorption property prediction, especially in silico, which helps to reduce the time and cost significantly for screening undesirable drug candidates. In this report, we explicitly provide an overview of recent in silico studies on predicting absorption properties, especially from 2019 to the present, using artificial intelligence. Additionally, we have collected and investigated public databases that support absorption prediction research. On those grounds, we also proposed the challenges and development directions of absorption prediction in the future. We hope this review can provide researchers with valuable guidelines on absorption prediction to facilitate the development of newer approaches in drug discovery.
Affiliation(s)
- Thi Tuyet Van Tran
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
  - Faculty of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
  - Vietnam National University, Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Kil To Chong
  - Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
9
Dolivo DM, Rodrigues AE, Galiano RD, Mustoe TA, Hong SJ. Prediction and Demonstration of Retinoic Acid Receptor Agonist Ch55 as an Antifibrotic Agent in the Dermis. J Invest Dermatol 2023; 143:1724-1734.e15. [PMID: 36804965 PMCID: PMC10432574 DOI: 10.1016/j.jid.2023.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 01/27/2023] [Accepted: 01/31/2023] [Indexed: 02/18/2023]
Abstract
The prevalence of fibrotic diseases and the lack of pharmacologic modalities to effectively treat them impart particular importance to the discovery of novel antifibrotic therapies. The repurposing of drugs with existing mechanisms of action and/or clinical data is a promising approach for the treatment of fibrotic diseases. One paradigm that pervades all fibrotic diseases is the pathological myofibroblast, a collagen-secreting, contractile mesenchymal cell that is responsible for the deposition of fibrotic tissue. In this study, we use a gene expression paradigm characteristic of activated myofibroblasts in combination with the Connectivity Map to select compounds that are predicted to reverse the pathological gene expression signature associated with the myofibroblast and thus contain the potential for use as antifibrotic compounds. We tested a small list of these compounds in a first-pass screen, applying them to fibroblasts, and identified the retinoic acid receptor agonist Ch55 as a potential hit. Further investigation exhibited and elucidated the antifibrotic effects of Ch55 in vitro as well as showing antiscarring activity upon intradermal application in a preclinical rabbit ear hypertrophic scar model. We hope that similar predictions to uncover antiscarring compounds may yield further preclinical and ultimately clinical success.
Affiliation(s)
- David M Dolivo
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Adrian E Rodrigues
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Robert D Galiano
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Thomas A Mustoe
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Seok Jong Hong
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
10
Machine Learning Scoring Functions for Drug Discovery from Experimental and Computer-Generated Protein-Ligand Structures: Towards Per-Target Scoring Functions. Molecules 2023; 28:molecules28041661. [PMID: 36838647 PMCID: PMC9966217 DOI: 10.3390/molecules28041661] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/07/2022] [Revised: 02/05/2023] [Accepted: 02/06/2023] [Indexed: 02/12/2023] Open
Abstract
In recent years, machine learning has been proposed as a promising strategy to build accurate scoring functions for computational docking aimed at numerically empowered drug discovery. However, the latest studies have suggested that over-optimistic results have been reported due to the correlations present in the experimental databases used for training and testing. Here, we investigate the performance of an artificial neural network in binding affinity predictions, comparing results obtained using both experimental protein-ligand structures and larger sets of computer-generated structures created using commercial software. Interestingly, similar performances are obtained on both databases. We find a noticeable drop in performance when moving from random horizontal tests to vertical tests performed on target proteins not included in the training data. The possibility of training the network on relatively easily created computer-generated databases leads us to explore per-target scoring functions, trained and tested ad hoc on complexes including only one target protein. Encouraging results are obtained, depending on the type of protein being addressed.
11
Tran TTV, Tayara H, Chong KT. Recent Studies of Artificial Intelligence on In Silico Drug Distribution Prediction. Int J Mol Sci 2023; 24:1815. [PMID: 36768139 PMCID: PMC9915725 DOI: 10.3390/ijms24031815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 01/11/2023] [Accepted: 01/13/2023] [Indexed: 01/19/2023] Open
Abstract
Drug distribution is an important process in pharmacokinetics because it has the potential to influence both the amount of medicine reaching the active sites and the effectiveness as well as safety of the drug. The main causes of 90% of drug failures in clinical development are lack of efficacy and uncontrolled toxicity. In recent years, several advances and promising developments in drug distribution property prediction have been achieved, especially in silico, which helped to drastically reduce the time and expense of screening undesired drug candidates. In this study, we provide comprehensive knowledge of drug distribution background, influencing factors, and artificial intelligence-based distribution property prediction models from 2019 to the present. Additionally, we gathered and analyzed public databases and datasets commonly utilized by the scientific community for distribution prediction. The distribution property prediction performance of five large ADMET prediction tools is mentioned as a benchmark for future research. On this basis, we also offer future challenges in drug distribution prediction and research directions. We hope that this review will provide researchers with helpful insight into distribution prediction, thus facilitating the development of innovative approaches for drug discovery.
Affiliation(s)
- Thi Tuyet Van Tran
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
  - Department of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
  - Vietnam National University–Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Kil To Chong
  - Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
12
Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system. Mol Divers 2022; 27:959-985. [PMID: 35819579 DOI: 10.1007/s11030-022-10489-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/27/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
CNS disorders are indications with very high unmet medical need, a relatively small number of available drugs, and a subpar satisfaction level among patients and caregivers. Discovery of CNS drugs is an extremely expensive affair with its own unique challenges, leading to extremely high attrition rates and low efficiency. With the explosion of data in the information age, there is hardly any aspect of life that has not been touched by data-driven technologies such as artificial intelligence (AI) and machine learning (ML). Drug discovery is no exception: the emergence of big data via genomic, proteomic, biological, and chemical technologies has driven pharmaceutical giants to collaborate with AI-oriented companies to revolutionise drug discovery, with the goal of increasing the efficiency of the process. In recent years, many examples of innovative applications of AI and ML techniques in CNS drug discovery have been reported. Research on therapeutics for diseases such as schizophrenia, Alzheimer's and Parkinsonism has been given a new direction and thrust by these developments. AI and ML have been applied to both ligand-based and structure-based drug discovery and design of CNS therapeutics. In this review, we summarise the general aspects of AI and ML from the perspective of drug discovery, followed by comprehensive coverage of recent developments in the applications of AI/ML techniques in CNS drug discovery.
13
From traditional to data-driven medicinal chemistry: a case study. Drug Discov Today 2022; 27:2065-2070. [PMID: 35452790 DOI: 10.1016/j.drudis.2022.04.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/09/2022] [Revised: 04/08/2022] [Accepted: 04/13/2022] [Indexed: 12/20/2022]
Abstract
Artificial intelligence (AI) and data science are beginning to impact drug discovery. It usually takes considerable time and effort until new scientific concepts or technologies make a transition from conceptual stages to practical applicability and until experience values are gathered. Especially for computational approaches, demonstrating measurable impact on drug discovery projects is not a trivial task. A pilot study at Daiichi Sankyo Company has attempted to integrate data-driven approaches into practical medicinal chemistry and quantify the impact, as reported herein. Although the organization and focal points of early-phase drug discovery naturally vary at different pharmaceutical companies, the results of this pilot study indicate the significant potential of data-driven medicinal chemistry and suggest new models for internal training of next-generation medicinal chemists. Keywords: medicinal chemistry; drug discovery; chemoinformatics; data science; data-driven R&D.
14
Alharbi E, Skeva R, Juty N, Jay C, Goble C. Exploring the Current Practices, Costs and Benefits of FAIR Implementation in Pharmaceutical Research and Development: A Qualitative Interview Study. DATA INTELLIGENCE 2021. [DOI: 10.1162/dint_a_00109] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/21/2023] Open
Abstract
The findable, accessible, interoperable, reusable (FAIR) principles for scientific data management and stewardship aim to facilitate data reuse at scale by both humans and machines. Research and development (R&D) in the pharmaceutical industry is becoming increasingly data driven, but managing its data assets according to FAIR principles remains costly and challenging. To date, little scientific evidence exists about how FAIR is currently implemented in practice, what its associated costs and benefits are, and how decisions are made about the retrospective FAIRification of data sets in pharmaceutical R&D. This paper reports the results of semi-structured interviews with 14 pharmaceutical professionals who participate in various stages of drug R&D in seven pharmaceutical businesses. Inductive thematic analysis identified three primary themes of the benefits and costs of FAIRification, and the elements that influence the decision-making process for FAIRifying legacy data sets. Participants collectively acknowledged the potential contribution of FAIRification to data reusability in diverse research domains and the subsequent potential for cost-savings. Implementation costs, however, were still considered a barrier by participants, with the need for considerable expenditure in terms of resources, and cultural change. How decisions were made about FAIRification was influenced by legal and ethical considerations, management commitment, and data prioritisation. The findings have significant implications for those in the pharmaceutical R&D industry who are engaged in driving FAIR implementation, and for external parties who seek to better understand existing practices and challenges.
Affiliation(s)
- Ebtisam Alharbi
- School of Computer Science, University of Manchester, Manchester, Manchester M13 9PL, UK
- College of Computer and Information Systems, Umm Al-Qura University, Mecca, Makkah 21421, Saudi Arabia
| | - Rigina Skeva
- School of Computer Science, University of Manchester, Manchester, Manchester M13 9PL, UK
| | - Nick Juty
- School of Computer Science, University of Manchester, Manchester, Manchester M13 9PL, UK
| | - Caroline Jay
- School of Computer Science, University of Manchester, Manchester, Manchester M13 9PL, UK
| | - Carole Goble
- School of Computer Science, University of Manchester, Manchester, Manchester M13 9PL, UK
| |
|
15
|
Asai A, Konno M, Taniguchi M, Vecchione A, Ishii H. Computational healthcare: Present and future perspectives (Review). Exp Ther Med 2021; 22:1351. [PMID: 34659497 PMCID: PMC8515560 DOI: 10.3892/etm.2021.10786] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2021] [Accepted: 07/19/2021] [Indexed: 12/05/2022] Open
Abstract
Artificial intelligence (AI) has been developed through repeated new discoveries since around 1960. The use of AI is now becoming widespread within society and our daily lives. AI is also being introduced into healthcare, such as medicine and drug development; however, it is currently biased towards specific domains. The present review traces the history of the development of various AI-based applications in healthcare and compares AI-based healthcare with conventional healthcare to show the future prospects for this type of care. Knowledge of the past and present development of AI-based applications would be useful for the future utilization of novel AI approaches in healthcare.
Affiliation(s)
- Ayumu Asai
- Center of Medical Innovation and Translational Research, Department of Medical Data Science, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan; Artificial Intelligence Research Center, Osaka University, Ibaraki, Osaka 567-0047, Japan; The Institute of Scientific and Industrial Research, Osaka University, Ibaraki, Osaka 567-0047, Japan
| | - Masamitsu Konno
- Center of Medical Innovation and Translational Research, Department of Medical Data Science, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
| | - Masateru Taniguchi
- The Institute of Scientific and Industrial Research, Osaka University, Ibaraki, Osaka 567-0047, Japan
| | - Andrea Vecchione
- Department of Clinical and Molecular Medicine, University of Rome 'Sapienza', Santo Andrea Hospital, I-1035-00189 Rome, Italy
| | - Hideshi Ishii
- Center of Medical Innovation and Translational Research, Department of Medical Data Science, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
| |
|
16
|
Gironda-Martínez A, Donckele EJ, Samain F, Neri D. DNA-Encoded Chemical Libraries: A Comprehensive Review with Succesful Stories and Future Challenges. ACS Pharmacol Transl Sci 2021; 4:1265-1279. [PMID: 34423264 PMCID: PMC8369695 DOI: 10.1021/acsptsci.1c00118] [Citation(s) in RCA: 154] [Impact Index Per Article: 38.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2021] [Indexed: 12/27/2022]
Abstract
DNA-encoded chemical libraries (DELs) represent a versatile and powerful technology platform for the discovery of small-molecule ligands to protein targets of biological and pharmaceutical interest. DELs are collections of molecules, individually coupled to distinctive DNA tags serving as amplifiable identification barcodes. Thanks to advances in DNA-compatible reactions, selection methodologies, next-generation sequencing, and data analysis, DEL technology allows the construction and screening of libraries of unprecedented size, which has led to the discovery of highly potent ligands, some of which have progressed to clinical trials. In this Review, we present an overview of diverse approaches for the generation and screening of DEL molecular repertoires. Recent success stories are described, detailing how novel ligands were isolated from DEL screening campaigns and were further optimized by medicinal chemistry. The goal of the Review is to capture some of the most recent developments in the field, while also elaborating on future challenges to further improve DEL technology as a therapeutic discovery platform.
Affiliation(s)
- Florent Samain
- Philochem AG, Libernstrasse 3, CH-8112 Otelfingen, Switzerland
| | - Dario Neri
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zürich, Switzerland
- Philogen S.p.A, 53100 Siena, Italy
| |
|
17
|
Kropiwnicki E, Evangelista JE, Stein DJ, Clarke DJB, Lachmann A, Kuleshov MV, Jeon M, Jagodnik KM, Ma’ayan A. Drugmonizome and Drugmonizome-ML: integration and abstraction of small molecule attributes for drug enrichment analysis and machine learning. Database (Oxford) 2021; 2021:baab017. [PMID: 33787872 PMCID: PMC8011435 DOI: 10.1093/database/baab017] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2020] [Revised: 03/11/2021] [Accepted: 03/19/2021] [Indexed: 12/15/2022]
Abstract
Understanding the underlying molecular and structural similarities between seemingly heterogeneous sets of drugs can aid in identifying drug repurposing opportunities and assist in the discovery of novel properties of preclinical small molecules. A wealth of information about drug and small molecule structure, targets, indications and side effects; induced gene expression signatures; and other attributes are publicly available through web-based tools, databases and repositories. By processing, abstracting and aggregating information from these resources into drug set libraries, knowledge about novel properties of drugs and small molecules can be systematically imputed with machine learning. In addition, drug set libraries can be used as the underlying database for drug set enrichment analysis. Here, we present Drugmonizome, a database with a search engine for querying annotated sets of drugs and small molecules for performing drug set enrichment analysis. Utilizing the data within Drugmonizome, we also developed Drugmonizome-ML. Drugmonizome-ML enables users to construct customized machine learning pipelines using the drug set libraries from Drugmonizome. To demonstrate the utility of Drugmonizome, drug sets from 12 independent SARS-CoV-2 in vitro screens were subjected to consensus enrichment analysis. Despite the low overlap among these 12 independent in vitro screens, we identified common biological processes critical for blocking viral replication. To demonstrate Drugmonizome-ML, we constructed a machine learning pipeline to predict whether approved and preclinical drugs may induce peripheral neuropathy as a potential side effect. Overall, the Drugmonizome and Drugmonizome-ML resources provide rich and diverse knowledge about drugs and small molecules for direct systems pharmacology applications. Database URL: https://maayanlab.cloud/drugmonizome/.
Affiliation(s)
- Eryk Kropiwnicki
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - John E Evangelista
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Daniel J Stein
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Daniel J B Clarke
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Alexander Lachmann
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Maxim V Kuleshov
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Minji Jeon
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Kathleen M Jagodnik
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Avi Ma’ayan
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-Based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG); Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| |
|
18
|
Hastings J, Glauer M, Memariani A, Neuhaus F, Mossakowski T. Learning chemistry: exploring the suitability of machine learning for the task of structure-based chemical ontology classification. J Cheminform 2021; 13:23. [PMID: 33726837 PMCID: PMC7962259 DOI: 10.1186/s13321-021-00500-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2020] [Accepted: 02/26/2021] [Indexed: 12/22/2022] Open
Abstract
Chemical data is increasingly openly available in databases such as PubChem, which contains approximately 110 million compound entries as of February 2021. With the availability of data at such scale, the burden has shifted to organisation, analysis and interpretation. Chemical ontologies provide structured classifications of chemical entities that can be used for navigation and filtering of the large chemical space. ChEBI is a prominent example of a chemical ontology, widely used in life science contexts. However, ChEBI is manually maintained and as such cannot easily scale to the full scope of public chemical data. There is a need for tools that are able to automatically classify chemical data into chemical ontologies, which can be framed as a hierarchical multi-class classification problem. In this paper we evaluate machine learning approaches for this task, comparing different learning frameworks including logistic regression, decision trees and long short-term memory artificial neural networks, and different encoding approaches for the chemical structures, including cheminformatics fingerprints and character-based encoding from chemical line notation representations. We find that classical learning approaches such as logistic regression perform well with sets of relatively specific, disjoint chemical classes, while the neural network is able to handle larger sets of overlapping classes but needs more examples per class to learn from, and is not able to make a class prediction for every molecule. Future work will explore hybrid and ensemble approaches, as well as alternative network architectures including neuro-symbolic approaches.
Affiliation(s)
- Janna Hastings
- Department of Computer Science, Otto-von-Guericke University of Magdeburg, Magdeburg, Germany
| | - Martin Glauer
- Department of Computer Science, Otto-von-Guericke University of Magdeburg, Magdeburg, Germany
| | - Adel Memariani
- Department of Computer Science, Otto-von-Guericke University of Magdeburg, Magdeburg, Germany
| | - Fabian Neuhaus
- Department of Computer Science, Otto-von-Guericke University of Magdeburg, Magdeburg, Germany
| | - Till Mossakowski
- Department of Computer Science, Otto-von-Guericke University of Magdeburg, Magdeburg, Germany
| |
|
19
|
Capuccini M, Dahlö M, Toor S, Spjuth O. MaRe: Processing Big Data with application containers on Apache Spark. Gigascience 2020; 9:giaa042. [PMID: 32369166 PMCID: PMC7199472 DOI: 10.1093/gigascience/giaa042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2019] [Revised: 02/10/2020] [Accepted: 04/07/2020] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND Life science is increasingly driven by Big Data analytics, and the MapReduce programming model has been proven successful for data-intensive analyses. However, current MapReduce frameworks offer poor support for reusing existing processing tools in bioinformatics pipelines. Furthermore, these frameworks do not have native support for application containers, which are becoming popular in scientific data processing. RESULTS Here we present MaRe, an open source programming library that introduces support for Docker containers in Apache Spark. Apache Spark and Docker are the MapReduce framework and container engine that have collected the largest open source community; thus, MaRe provides interoperability with the cutting-edge software ecosystem. We demonstrate MaRe on 2 data-intensive applications in life science, showing ease of use and scalability. CONCLUSIONS MaRe enables scalable data-intensive processing in life science with Apache Spark and application containers. When compared with current best practices, which involve the use of workflow systems, MaRe has the advantage of providing data locality, ingestion from heterogeneous storage systems, and interactive processing. MaRe is generally applicable and available as open source software.
Affiliation(s)
- Marco Capuccini
- Department of Information Technology, Uppsala University, Box 337, 75105, Uppsala, Sweden
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden
| | - Martin Dahlö
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden
- Science for Life Laboratory, Uppsala University, Box 591, 751 24, Uppsala, Sweden
- Uppsala Multidisciplinary Center for Advanced Computational Science, Uppsala University, Box 337, 75105, Uppsala, Sweden
| | - Salman Toor
- Department of Information Technology, Uppsala University, Box 337, 75105, Uppsala, Sweden
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden
| |
|
20
|
The current state of drug repurposing and rare diseases: an interview with Paul Trippier. FUTURE DRUG DISCOVERY 2020. [DOI: 10.4155/fdd-2019-0037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Paul Trippier is an Associate Professor of Medicinal Chemistry at the University of Nebraska Medical Center (UNMC, NE, USA) and an Editorial Board member of Future Drug Discovery. Here, he speaks to Managing Editor Francesca Lake about drug repurposing, focusing on the key challenges, its application to rare diseases and what we can look forward to in the future.
|
21
|
Schneider P, Walters WP, Plowright AT, Sieroka N, Listgarten J, Goodnow RA, Fisher J, Jansen JM, Duca JS, Rush TS, Zentgraf M, Hill JE, Krutoholow E, Kohler M, Blaney J, Funatsu K, Luebkemann C, Schneider G. Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov 2019. [DOI: 10.1038/s41573-019-0050-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/29/2022]
|
22
|
Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov 2019; 19:353-364. [PMID: 31801986 DOI: 10.1038/s41573-019-0050-3] [Citation(s) in RCA: 348] [Impact Index Per Article: 58.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/28/2019] [Indexed: 12/17/2022]
|
23
|
Rodrigues T. The good, the bad, and the ugly in chemical and biological data for machine learning. DRUG DISCOVERY TODAY. TECHNOLOGIES 2019; 32-33:3-8. [PMID: 33386092 PMCID: PMC7382642 DOI: 10.1016/j.ddtec.2020.07.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Revised: 07/08/2020] [Accepted: 07/09/2020] [Indexed: 02/05/2023]
Abstract
Machine learning and artificial intelligence (ML/AI) have become important research tools in molecular medicine and chemistry. Their rise and recent success in drug discovery promises a rapid progression of development pipelines while reshaping how fundamental and clinical research is conducted. By taking advantage of the ever-growing wealth of publicly available and proprietary data, learning algorithms now provide an attractive means to generate statistically motivated research hypotheses. Hitherto unknown data patterns may guide and prioritize experiments, and augment expert intuition. Therefore, data is a key component in the model building workflow. Herein, I aim to discuss types of chemical and biological data according to their quality and reemphasize general recommendations for their use in ML/AI.
Affiliation(s)
- Tiago Rodrigues
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina da Universidade de Lisboa, Av Prof Egaz Moniz, 1649-028 Lisboa, Portugal; Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Av. Prof. Gama Pinto 1649-003, Lisboa, Portugal.
| |
|
24
|
López-López E, Naveja JJ, Medina-Franco JL. DataWarrior: an evaluation of the open-source drug discovery tool. Expert Opin Drug Discov 2019; 14:335-341. [PMID: 30806519 DOI: 10.1080/17460441.2019.1581170] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
INTRODUCTION DataWarrior is open and interactive software for data analysis and visualization that integrates well-established and novel chemoinformatics algorithms in a single environment. Since its public release in 2014, DataWarrior has been used by research groups in universities, government, and industry. Areas covered: Herein, the authors discuss, in a critical manner, the tools and distinct technical features of DataWarrior and analyze the areas of opportunity. Authors also present the most common applications as well as emerging uses in research areas beyond drug discovery with an emphasis on multidisciplinary projects. Expert opinion: In the era of big data and data-driven science, DataWarrior stands out as a technology that combines prediction of physicochemical properties of pharmaceutical interest, cheminformatics calculations, multivariate data analysis, and interactive visualization with dynamic plots. The well-established chemoinformatics tools implemented in DataWarrior, as well as the innovative algorithms, make the technology useful and attractive as revealed by the increasing number of documented applications.
Affiliation(s)
- Edgar López-López
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico; Medicinal Chemistry Laboratory, University of Veracruz, Veracruz, Mexico
| | - J Jesús Naveja
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico; PECEM, Faculty of Medicine, National Autonomous University of Mexico, Mexico City, Mexico
| | - José L Medina-Franco
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
| |
|
25
|
Wise J, de Barron AG, Splendiani A, Balali-Mood B, Vasant D, Little E, Mellino G, Harrow I, Smith I, Taubert J, van Bochove K, Romacker M, Walgemoed P, Jimenez RC, Winnenburg R, Plasterer T, Gupta V, Hedley V. Implementation and relevance of FAIR data principles in biopharmaceutical R&D. Drug Discov Today 2019; 24:933-938. [PMID: 30690198 DOI: 10.1016/j.drudis.2019.01.008] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2018] [Revised: 12/21/2018] [Accepted: 01/20/2019] [Indexed: 10/27/2022]
Abstract
Biopharmaceutical industry R&D, and indeed other life sciences R&D such as biomedical, environmental, agricultural and food production, is becoming increasingly data-driven and can significantly improve its efficiency and effectiveness by implementing the FAIR (findable, accessible, interoperable, reusable) guiding principles for scientific data management and stewardship. By so doing, the plethora of new and powerful analytical tools such as artificial intelligence and machine learning will be able, automatically and at scale, to access the data from which they learn, and on which they thrive. FAIR is a fundamental enabler for digital transformation.
|
26
|
Mode-of-Action-Guided, Molecular Modeling-Based Toxicity Prediction: A Novel Approach for In Silico Predictive Toxicology. CHALLENGES AND ADVANCES IN COMPUTATIONAL CHEMISTRY AND PHYSICS 2019. [DOI: 10.1007/978-3-030-16443-0_6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
27
|
Li Y, Idakwo G, Thangapandian S, Chen M, Hong H, Zhang C, Gong P. Target-specific toxicity knowledgebase (TsTKb): a novel toolkit for in silico predictive toxicology. JOURNAL OF ENVIRONMENTAL SCIENCE AND HEALTH. PART C, ENVIRONMENTAL CARCINOGENESIS & ECOTOXICOLOGY REVIEWS 2018; 36:219-236. [PMID: 30426823 DOI: 10.1080/10590501.2018.1537148] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
As the number of man-made chemicals increases at an unprecedented pace, efforts of quickly screening and accurately evaluating their potential adverse biological effects have been hampered by prohibitively high costs of in vivo/vitro toxicity testing. While it is unrealistic and unnecessary to test every uncharacterized chemical, it remains a major challenge to develop alternative in silico tools with high reliability and precision in toxicity prediction. To address this urgent need, we have developed a novel mode-of-action-guided, molecular modeling-based, and machine learning-enabled modeling approach for in silico chemical toxicity prediction. Here we introduce the core element of this approach, Target-specific Toxicity Knowledgebase (TsTKb), which consists of two main components: Chemical Mode of Action (ChemMoA) database and a suite of prediction model libraries.
Affiliation(s)
- Yan Li
- Bennett Aerospace Inc., Cary, NC, USA
| | - Gabriel Idakwo
- School of Computing Science and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, USA
| | - Sundar Thangapandian
- Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg, MS, USA
| | - Minjun Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Chaoyang Zhang
- School of Computing Science and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, USA
| | - Ping Gong
- Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg, MS, USA
| |
|