1
Rahiyab M, Khan I, Ali SS, Hussain Z, Ali S, Iqbal A. Computational profiling of molecular biomarkers in congenital disorders of glycosylation Type-I and binding analysis of Ginkgolide A with P4HB. Comput Biol Med 2025; 190:110042. [PMID: 40117797 DOI: 10.1016/j.compbiomed.2025.110042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2024] [Revised: 03/14/2025] [Accepted: 03/16/2025] [Indexed: 03/23/2025]
Abstract
AIMS Congenital disorders of glycosylation (CDG) comprise a diverse group of genetic diseases characterized by aberrant glycosylation that leads to severe multisystemic effects. Despite advancements in understanding the underlying molecular mechanisms, curative options remain limited. This study employed computational methods to identify key molecular biomarkers for CDG type I (CDG-I) and examine the pharmacological effects of Ginkgolide A (GA), a potent bioactive natural compound. METHODS We used GEO2R to analyze the GSE8440 microarray dataset and identify differentially expressed genes (DEGs) in patients with CDG-I compared to healthy individuals. Functional enrichment analyses, including gene ontology (GO) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analyses, were conducted to contextualize the biological mechanisms and molecular signatures involved in CDG-I. The protein-protein interaction (PPI) network for the DEGs was constructed using the STRING database, and the central hub genes within the PPI network were identified using Cytohubba. Furthermore, the 3D structure of the protein encoded by the top hub gene (P4HB) was predicted using the Robetta server. CASTp was employed to evaluate the active sites. Molecular docking of P4HB with GA was carried out with the PyRx tool to investigate the binding affinity, and the stability of the docked complex was validated through molecular dynamics (MD) simulation. The pharmacokinetics, toxicity, and bioactivity score of GA were assessed using SwissADME, ProTox-II, and Molinspiration. RESULTS Our findings indicated 247 significant DEGs, including 146 up-regulated and 101 down-regulated genes. GO and KEGG pathway analyses indicated that the up-regulated and hub genes were strongly associated with protein folding, glycoprotein processing in the endoplasmic reticulum, and endoplasmic reticulum (ER) stress pathways. P4HB emerged as the top hub gene in CDG-I, playing a significant role in protein folding and ER stress. The 3D structure of P4HB was refined and validated, with 95.8% of residues in the most favored region of the Ramachandran plot and an overall quality of 92.97%. The CASTp server predicted the largest active site with an area of 2243.660 Å² and a volume of 3236.584 Å³. Molecular docking revealed that GA has a strong binding affinity for P4HB (-8.9 kcal/mol). The ADME (Absorption, Distribution, Metabolism, Excretion) and toxicity assessments indicated promising drug-like characteristics, excellent bioavailability, and minimal toxicity risk. CONCLUSION This study highlights GA as a potential treatment option to alleviate CDG-I pathology by targeting protein misfolding and ER stress, which are fundamental aspects of the disease. Additionally, our findings indicate that P4HB is a critical molecular target in CDG-I. These results pave the way for future preclinical and clinical investigations aimed at advancing targeted and tailored treatments for CDG.
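As a rough illustration of how a DEG table such as the one GEO2R produces can be split into up- and down-regulated genes, here is a minimal sketch. The cutoffs (`lfc_cutoff`, `padj_cutoff`), the gene names, and the toy values are assumptions for illustration only; the abstract does not state which thresholds the authors used.

```python
def classify_degs(rows, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split genes into up- and down-regulated lists.

    rows: iterable of (gene, log2_fold_change, adjusted_p_value).
    The default cutoffs are common conventions, not values from the study.
    """
    up, down = [], []
    for gene, lfc, padj in rows:
        # Skip genes that are not significant or whose change is too small.
        if padj >= padj_cutoff or abs(lfc) < lfc_cutoff:
            continue
        (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical toy table (gene, log2FC, adjusted p-value).
toy_rows = [
    ("GENE_A", 2.1, 0.001),   # significant, up
    ("GENE_B", -1.4, 0.02),   # significant, down
    ("GENE_C", 0.3, 0.0001),  # significant but small change
    ("GENE_D", 1.8, 0.2),     # large change but not significant
]
up_genes, down_genes = classify_degs(toy_rows)
print(up_genes, down_genes)
```

Counting the two lists on a real table would give figures comparable to the 146 up-regulated and 101 down-regulated genes reported above, provided the same thresholds were applied.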
Affiliation(s)
- Muhammad Rahiyab
- Center for Biotechnology and Microbiology, University of Swat, KPK, Pakistan
| | - Ishaq Khan
- Center for Biotechnology and Microbiology, University of Swat, KPK, Pakistan
| | - Syed Shujait Ali
- Center for Biotechnology and Microbiology, University of Swat, KPK, Pakistan
| | - Zahid Hussain
- Center for Biotechnology and Microbiology, University of Swat, KPK, Pakistan
| | - Shahid Ali
- Center for Biotechnology and Microbiology, University of Swat, KPK, Pakistan
| | - Arshad Iqbal
- Center for Biotechnology and Microbiology, University of Swat, KPK, Pakistan.
2
Li J, Zhang J, Guo R, Dai J, Niu Z, Wang Y, Wang T, Jiang X, Hu W. Progress of machine learning in the application of small molecule druggability prediction. Eur J Med Chem 2025; 285:117269. [PMID: 39808972 DOI: 10.1016/j.ejmech.2025.117269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2024] [Revised: 01/07/2025] [Accepted: 01/08/2025] [Indexed: 01/16/2025]
Abstract
Machine learning (ML) has become an important tool for predicting the pharmaceutical properties of small molecules. Recent advancements in ML algorithms enable the rapid and accurate evaluation of solubility, activity, toxicity, pharmacokinetics, and other molecular properties through ML-based models. By conducting virtual screening of drug targets and elucidating drug-target protein interactions, researchers can conduct preliminary evaluations of the activity and safety of compounds from the ultra-large drug compound libraries, thereby accelerating the screening process for lead compounds. Moreover, ML leverages existing experimental data to train and generate new datasets, addressing the challenge of limited compounds and protein target data. This review provided a concise overview of ML applications in predicting small molecule properties, focusing on model construction principles, molecular feature selection, and other essential aspects. It also discussed the potential applications of ML in the screening of pharmaceutical small molecules.
Affiliation(s)
- Junyao Li
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China; School of Life Sciences, Huaiyin Normal University, Huaian, 223300, China; Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
| | - Jianmei Zhang
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
| | - Rui Guo
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China; Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
| | - Jiawei Dai
- Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
| | - Zhiqiang Niu
- Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
| | - Yan Wang
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
| | - Taoyun Wang
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China.
| | - Xiaojian Jiang
- School of Life Sciences, Huaiyin Normal University, Huaian, 223300, China.
| | - Weicheng Hu
- Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China.
3
Rasul HO, Ghafour DD, Aziz BK, Hassan BA, Rashid TA, Kivrak A. Decoding Drug Discovery: Exploring A-to-Z In Silico Methods for Beginners. Appl Biochem Biotechnol 2025; 197:1453-1503. [PMID: 39630336 DOI: 10.1007/s12010-024-05110-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/19/2024] [Indexed: 03/29/2025]
Abstract
The drug development process is a critical challenge in the pharmaceutical industry due to its time-consuming nature and the need to discover new drug potentials to address various ailments. The initial step in drug development, drug target identification, often consumes considerable time. While valid, traditional methods such as in vivo and in vitro approaches are limited in their ability to analyze vast amounts of data efficiently, leading to wasteful outcomes. To expedite and streamline drug development, an increasing reliance on computer-aided drug design (CADD) approaches has emerged. These sophisticated in silico methods offer a promising avenue for efficiently identifying viable drug candidates, thus providing pharmaceutical firms with significant opportunities to uncover new prospective drug targets. The main goal of this work is to review in silico methods used in the drug development process, with a focus on identifying therapeutic targets linked to specific diseases at the genetic or protein level. This article thoroughly discusses A-to-Z in silico techniques, which are essential for identifying the targets of bioactive compounds and their potential therapeutic effects. This review intends to improve drug discovery processes by illuminating the state of these cutting-edge approaches, thereby maximizing the effectiveness and duration of clinical trials for novel drug target investigation.
Affiliation(s)
- Hezha O Rasul
- Department of Pharmaceutical Chemistry, College of Science, Charmo University, Peshawa Street, Chamchamal, 46023, Sulaimani, Iraq.
| | - Dlzar D Ghafour
- Department of Medical Laboratory Science, College of Science, Komar University of Science and Technology, 46001, Sulaimani, Iraq
- Department of Chemistry, College of Science, University of Sulaimani, 46001, Sulaimani, Iraq
| | - Bakhtyar K Aziz
- Department of Nanoscience and Applied Chemistry, College of Science, Charmo University, Peshawa Street, Chamchamal, 46023, Sulaimani, Iraq
| | - Bryar A Hassan
- Computer Science and Engineering Department, School of Science and Engineering, University of Kurdistan Hewler, KRI, Iraq
- Department of Computer Science, College of Science, Charmo University, Peshawa Street, Chamchamal, 46023, Sulaimani, Iraq
| | - Tarik A Rashid
- Computer Science and Engineering Department, School of Science and Engineering, University of Kurdistan Hewler, KRI, Iraq
| | - Arif Kivrak
- Department of Chemistry, Faculty of Sciences and Arts, Eskisehir Osmangazi University, Eskişehir, 26040, Turkey
4
Zhang Y, Xie Z, Xiao F, Yu J, Fan Z, Sun S, Shi J, Fu Z, Li X, Wang D, Zheng M, Luo X. Prediction of Multi-Pharmacokinetics Property in Multi-Species: Bayesian Neural Network Stacking Model with Uncertainty. Mol Pharm 2024. [PMID: 39508275 DOI: 10.1021/acs.molpharmaceut.4c00406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2024]
Abstract
Pharmacokinetic (PK) properties of a drug are vital attributes influencing its therapeutic effectiveness, playing an important role in the drug development process. Focusing on the difficult task of predicting PK parameters, we compiled an extensive data set comprising parameters across multiple species. Building upon this groundwork, we introduced the PKStack ensemble model to predict PK parameters across diverse species. PKStack integrates a variety of base models and includes uncertainty in its predictions. We also manually collected PK data from animals as an external test set. We predicted a total of 45 tasks for nine PK parameters in five species, and in general, the prediction accuracy was better for intravenous injections, including parameters such as human Vd (R² = 0.72, RMSE = 0.31), human CL (R² = 0.52, RMSE = 0.32), and others. In addition to predictive accuracy, we also considered the interpretability of the results and the definition of the model's application domain. Based on the findings, our model has great potential for practical applications in drug discovery.
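The two accuracy figures quoted above can be computed as in the following minimal sketch. It assumes R² means the coefficient of determination (the abstract does not say whether this or a squared correlation was used), and the observed and predicted values are made-up numbers, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, e.g. a log-transformed PK parameter.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(observed, predicted), 3), round(rmse(observed, predicted), 3))
```

An R² near 1 with a small RMSE means predictions track the observations closely; the reported human CL result (R² = 0.52) indicates a noticeably weaker fit than human Vd (R² = 0.72).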
Affiliation(s)
- Yuanyuan Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhiyin Xie
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Fu Xiao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Jie Yu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Lingang Laboratory, Shanghai 200031, China
| | - Zhehuan Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Shihui Sun
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Jiangshan Shi
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zunyun Fu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | | | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
5
Suzuki H, Kokabu T, Yamada K, Ishikawa Y, Yabu A, Yanagihashi Y, Hyakumachi T, Tachi H, Shimizu T, Endo T, Ohnishi T, Ukeba D, Nagahama K, Takahata M, Sudo H, Iwasaki N. Deep learning-based detection of lumbar spinal canal stenosis using convolutional neural networks. Spine J 2024; 24:2086-2101. [PMID: 38909909 DOI: 10.1016/j.spinee.2024.06.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 06/13/2024] [Accepted: 06/14/2024] [Indexed: 06/25/2024]
Abstract
BACKGROUND CONTEXT Lumbar spinal canal stenosis (LSCS) is the most common spinal degenerative disorder in elderly people and is usually first seen by primary care physicians or orthopedic surgeons who are not spine surgery specialists. Magnetic resonance imaging (MRI) is useful in the diagnosis of LSCS, but the equipment is often unavailable or the images are difficult to read. LSCS patients with progressive neurologic deficits have difficulty with recovery if surgical treatment is delayed. Early diagnosis and determination of appropriate surgical indications are therefore crucial in the treatment of LSCS. Convolutional neural networks (CNNs), a type of deep learning, offer significant advantages for image recognition and classification, and work well with radiographs, which can be easily taken at any facility. PURPOSE Our purpose was to develop an algorithm to diagnose the presence or absence of LSCS requiring surgery from plain radiographs using CNNs. STUDY DESIGN Retrospective analysis of a consecutive, nonrandomized series of patients at a single institution. PATIENT SAMPLE Data of 150 patients who underwent surgery for LSCS, including degenerative spondylolisthesis, at a single institution from January 2022 to August 2022 were collected. Additionally, 25 patients who underwent surgery at 2 other hospitals were included for extra external validation. OUTCOME MEASURES In annotation 1, the area under the curve (AUC) computed from the receiver operating characteristic (ROC) curve, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, positive likelihood ratio (PLR), and negative likelihood ratio (NLR) were calculated. In annotation 2, correlation coefficients were used. METHODS Four intervertebral levels from L1/2 to L4/5 were extracted as regions of interest from each lateral plain lumbar spine radiograph, yielding a total of 600 images. Based on the date of surgery, 500 images derived from the first 125 cases were used for internal validation, and 100 images from the subsequent 25 cases were used for external validation. Additionally, 100 images from other hospitals were used for extra external validation. In annotation 1, binary classification of operative and nonoperative levels was used, and in annotation 2, the spinal canal area measured on axial MRI was used as the target output. For internal validation, the 500 images were divided into 5 datasets on a per-patient basis and 5-fold cross-validation was performed. The five trained models were then used to assess prediction performance in external validation. Grad-CAM was used to visualize the areas with the highest feature density extracted by the CNNs. RESULTS In internal validation, the AUC and accuracy for annotation 1 ranged from 0.85 to 0.89 and from 79% to 83%, respectively, and the correlation coefficients for annotation 2 ranged between 0.53 and 0.64 (all p<.01). In external validation, the AUC and accuracy for annotation 1 were 0.90 and 82%, respectively, and the correlation coefficient for annotation 2 was 0.69, using 5 trained CNN models. In the extra external validation, the AUC and accuracy for annotation 1 were 0.89 and 84%, respectively, and the correlation coefficient for annotation 2 was 0.56. Grad-CAM showed high feature density in the intervertebral joints and posterior intervertebral discs. CONCLUSIONS This technology automatically detects LSCS from plain lumbar spine radiographs, making it possible for medical facilities without MRI, or for nonspecialists, to diagnose LSCS, and may help eliminate delays in the diagnosis and treatment of LSCS that requires early treatment.
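Two ideas from the abstract above lend themselves to a short sketch: splitting images into folds per patient (so that all four levels from one radiograph stay in the same fold) and deriving the listed outcome measures from a confusion matrix. The function names, the round-robin fold assignment, and the confusion-matrix counts are assumptions for illustration; the study's actual splitting procedure and counts are not given in the abstract.

```python
from collections import Counter


def patient_folds(patient_ids, k=5):
    """Return a fold index for every image, keeping each patient in one fold."""
    unique_patients = sorted(set(patient_ids))
    # Round-robin assignment; a real pipeline might shuffle or stratify first.
    fold_of = {pid: i % k for i, pid in enumerate(unique_patients)}
    return [fold_of[pid] for pid in patient_ids]


def diagnostic_metrics(tp, fp, tn, fn):
    """Binary-classification measures from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes present and predicted).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
    }


# 125 patients with 4 intervertebral levels each gives 500 images.
image_patient_ids = [pid for pid in range(125) for _ in range(4)]
folds = patient_folds(image_patient_ids, k=5)
fold_sizes = Counter(folds)

# Hypothetical counts for one validation fold of 100 images.
metrics = diagnostic_metrics(tp=40, fp=10, tn=40, fn=10)
print(dict(fold_sizes), metrics["accuracy"])
```

Splitting by patient rather than by image avoids placing levels from the same radiograph in both the training and validation sets, which would otherwise inflate the measured performance.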
Affiliation(s)
- Hisataka Suzuki
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan; Department of Orthopaedic Surgery, Eniwa Hospital, 2-1-1 Kogane Chuo, Eniwa, Hokkaido 061-1449, Japan
| | - Terufumi Kokabu
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan; Department of Orthopaedic Surgery, Eniwa Hospital, 2-1-1 Kogane Chuo, Eniwa, Hokkaido 061-1449, Japan
| | - Katsuhisa Yamada
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan.
| | - Yoko Ishikawa
- Department of Orthopaedic Surgery, Eniwa Hospital, 2-1-1 Kogane Chuo, Eniwa, Hokkaido 061-1449, Japan
| | - Akito Yabu
- Department of Orthopaedic Surgery, Eniwa Hospital, 2-1-1 Kogane Chuo, Eniwa, Hokkaido 061-1449, Japan
| | - Yasushi Yanagihashi
- Department of Orthopaedic Surgery, Eniwa Hospital, 2-1-1 Kogane Chuo, Eniwa, Hokkaido 061-1449, Japan
| | - Takahiko Hyakumachi
- Department of Orthopaedic Surgery, Eniwa Hospital, 2-1-1 Kogane Chuo, Eniwa, Hokkaido 061-1449, Japan
| | - Hiroyuki Tachi
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Tomohiro Shimizu
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Tsutomu Endo
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Takashi Ohnishi
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Daisuke Ukeba
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Ken Nagahama
- Department of Orthopaedic Surgery, Sapporo Endoscopic Spine Surgery, N16E16, Sapporo, Hokkaido 065-0016, Japan
| | - Masahiko Takahata
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Hideki Sudo
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
| | - Norimasa Iwasaki
- Department of Orthopaedic Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15W7, Sapporo, Hokkaido 060-8638, Japan
6
Mukherjee J, Sharma R, Dutta P, Bhunia B. Artificial intelligence in healthcare: a mastery. Biotechnol Genet Eng Rev 2024; 40:1659-1708. [PMID: 37013913 DOI: 10.1080/02648725.2023.2196476] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/11/2023] [Accepted: 03/22/2023] [Indexed: 04/05/2023]
Abstract
There has been vast development in artificial intelligence (AI) in recent years. Computational technology, digitized data collection and enormous advancement in this field have allowed AI applications to penetrate core areas of human specialization. In this review article, we describe current progress in the AI field, highlight constraints on the smooth development of the medical AI sector, and discuss its implementation in healthcare from a commercial, regulatory and sociological standpoint. Utilizing sizable multidimensional biological datasets that capture individual heterogeneity in genomes, functionality and milieu, precision medicine strives to create and optimize approaches for diagnosis, treatment and assessment. With the rise in complexity and volume of data in the health-care industry, AI can be applied more frequently. The main application categories include indications for diagnosis and therapy, patient involvement and commitment, and administrative tasks. There has recently been a sharp rise in interest in medical AI applications due to developments in AI software and technology, particularly in deep learning algorithms and artificial neural networks (ANNs). In this overview, we list the major categories of issues that AI systems are ideally equipped to resolve, followed by clinical diagnostic tasks. We also discuss the future potential of AI, particularly for risk prediction in complex diseases, and the difficulties, constraints and biases that must be meticulously addressed for the effective delivery of AI in the health-care sector.
Affiliation(s)
- Jayanti Mukherjee
- Department of Pharmaceutical Chemistry, CMR College of Pharmacy Affiliated to Jawaharlal Nehru Technological University, Hyderabad, Telangana, India
| | - Ramesh Sharma
- Department of Bioengineering, National Institute of Technology, Agartala, India
| | - Prasenjit Dutta
- Department of Production Engineering, National Institute of Technology, Agartala, India
| | - Biswanath Bhunia
- Department of Bioengineering, National Institute of Technology, Agartala, India
7
Walter M, Borghardt JM, Humbeck L, Skalic M. Multi-Task ADME/PK prediction at industrial scale: leveraging large and diverse experimental datasets. Mol Inform 2024; 43:e202400079. [PMID: 38973777 DOI: 10.1002/minf.202400079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 04/10/2024] [Accepted: 05/04/2024] [Indexed: 07/09/2024]
Abstract
ADME (Absorption, Distribution, Metabolism, Excretion) properties are key parameters to judge whether a drug candidate exhibits a desired pharmacokinetic (PK) profile. In this study, we tested multi-task machine learning (ML) models to predict ADME and animal PK endpoints, trained on in-house data generated at Boehringer Ingelheim. Models were evaluated both at the design stage of a compound (i.e., no experimental data of test compounds available) and at the testing stage, when a particular assay would be conducted (i.e., experimental data of earlier conducted assays may be available). Using realistic time-splits, we found a clear benefit in performance of multi-task graph-based neural network models over single-task models, which was even stronger when experimental data of earlier assays are available. In an attempt to explain the success of multi-task models, we found that especially endpoints with the largest numbers of data points (physicochemical endpoints, clearance in microsomes) are responsible for increased predictivity in more complex ADME and PK endpoints. In summary, our study provides insight into how data for multiple ADME/PK endpoints in a pharmaceutical company can be best leveraged to optimize predictivity of ML models.
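The "realistic time-splits" mentioned above can be pictured with a minimal sketch: compounds measured before a cutoff date form the training set, and later compounds form the test set, mimicking prospective use. The record layout, compound identifiers, dates, and values are hypothetical; the abstract does not describe the authors' exact splitting scheme.

```python
from datetime import date


def time_split(records, cutoff):
    """Split records into (train, test) by measurement date.

    records: list of (compound_id, measured_on, value) tuples.
    Records measured strictly before `cutoff` go to training.
    """
    train = [r for r in records if r[1] < cutoff]
    test = [r for r in records if r[1] >= cutoff]
    return train, test


# Hypothetical assay records.
assay_records = [
    ("CMPD-001", date(2021, 3, 1), 0.42),
    ("CMPD-002", date(2021, 9, 15), 0.77),
    ("CMPD-003", date(2022, 2, 10), 0.31),
    ("CMPD-004", date(2022, 6, 30), 0.58),
]
train_set, test_set = time_split(assay_records, cutoff=date(2022, 1, 1))
print(len(train_set), len(test_set))
```

Compared with a random split, a time split usually gives a more conservative estimate, because later compounds tend to come from chemical series the model has not yet seen.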
Affiliation(s)
- Moritz Walter
- Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
| | - Jens M Borghardt
- Drug Discovery Sciences Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
| | - Lina Humbeck
- Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
| | - Miha Skalic
- Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
8
Shahabuddin, Uzma, Azam M, Parveen M, Kadir NHA, Min K, Alam M. Exploring 7β-amino-6-nitrocholestens as COVID-19 antivirals: in silico, synthesis, evaluation, and integration of artificial intelligence (AI) in drug design: assessing the cytotoxicity and antioxidant activity of 3β-acetoxynitrocholestane. RSC Med Chem 2024:d4md00257a. [PMID: 39430952 PMCID: PMC11485945 DOI: 10.1039/d4md00257a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2024] [Accepted: 09/22/2024] [Indexed: 10/22/2024] Open
Abstract
In light of the ongoing pandemic caused by SARS-CoV-2, effective and clinically translatable treatments are desperately needed for COVID-19 and its emerging variants. In this study, several derivatives, including 7β-aminocholestene compounds and 3β-acetoxy-6-nitrocholesta-4,6-diene, were synthesized in quantitative yields from 7β-bromo-6-nitrocholest-5-enes (1-3) with a small library of amines. The synthesized steroidal products were then thoroughly characterized using a range of physicochemical techniques, including IR, NMR, UV, MS, and elemental analysis. Next, a structure-based virtual screening using docking studies was conducted to investigate the potential of these synthesized compounds as therapeutic candidates against SARS-CoV-2. Specifically, we evaluated the binding energies of the reactants and their products with three SARS-CoV-2 functional proteins: the papain-like protease, the 3C-like protease (main protease), and the RNA-dependent RNA polymerase. Our results indicate that the 7β-aminocholestene derivatives (4-8) display intermediate to excellent binding energy, suggesting that they interact strongly with the receptor's active amino acids and may be promising drug candidates for inhibiting SARS-CoV-2. Although the starting steroid derivatives, 7β-bromo-6-nitrocholest-5-enes (1-3), and one steroid product, 3β-acetoxy-6-nitrocholesta-4,6-diene (9), exhibited strong binding energies with various SARS-CoV-2 receptors, they did not satisfy the Lipinski rule or the ADMET criteria required for drug development. These compounds showed either mutagenic or reproductive/developmental toxicity when assessed using toxicity prediction software. The findings from structure-based virtual screening suggest that the 7β-aminocholestenes (4-8) may be useful for reducing susceptibility to SARS-CoV-2 infection. The docking pose of compound 4, which has a high score of -7.4 kcal mol⁻¹, was subjected to AI-assisted deep learning to generate 60 AI-designed molecules for drug design. Molecular docking of these AI molecules was performed to select optimal candidates for further analysis and visualization. The cytotoxicity and antioxidant effects of 3β-acetoxy-6-nitrocholesta-4,6-diene were tested in vitro, showing marked cytotoxicity and antioxidant activity. To elucidate the molecular basis for these effects, steroidal compound 9 was subjected to molecular docking analysis to identify potential binding interactions. The stability of the top-ranked docking pose was subsequently assessed using molecular dynamics simulations.
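The Lipinski rule mentioned in the abstract above is the "rule of five": molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors. The sketch below counts violations from precomputed descriptors. The descriptor values are invented for illustration and are not those of the study's compounds, and tolerating at most one violation is a common convention rather than something stated in the abstract.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five criteria a molecule breaks."""
    broken = [
        mol_weight > 500,   # molecular weight in Da
        logp > 5,           # octanol/water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(broken)


def passes_rule_of_five(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Common convention: tolerate at most one violation (an assumption here)."""
    return lipinski_violations(mol_weight, logp, h_donors, h_acceptors) <= max_violations


# Hypothetical descriptor sets: (MW, logP, H-bond donors, H-bond acceptors).
small_polar = (350.0, 3.2, 2, 5)
large_lipophilic = (620.0, 7.1, 1, 4)
print(lipinski_violations(*small_polar), lipinski_violations(*large_lipophilic))
```

Large, lipophilic steroid scaffolds often break the logP criterion, which is one plausible reason a compound can dock well and still fail a drug-likeness filter.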
Affiliation(s)
- Shahabuddin
- Department of Applied Chemistry, Z. H. College of Engineering & Technology, Aligarh Muslim University Aligarh 202002 India
| | - Uzma
- Division of Organic Synthesis, Department of Chemistry, Aligarh Muslim University Aligarh 202002 India
| | - Mohammad Azam
- Department of Chemistry, College of Science, King Saud University PO 2455 Riyadh 11451 Saudi Arabia
| | - Mehtab Parveen
- Division of Organic Synthesis, Department of Chemistry, Aligarh Muslim University Aligarh 202002 India
| | - Nurul Huda Abd Kadir
- Faculty of Science and Environmental Marine, Universiti Malaysia Terengganu 21030 Terengganu Malaysia
| | - Kim Min
- Department of Safety Engineering, Dongguk University 123 Dongdae-ro Gyeongju-si Gyeongbuk 780714 South Korea
| | - Mahboob Alam
- Department of Safety Engineering, Dongguk University 123 Dongdae-ro Gyeongju-si Gyeongbuk 780714 South Korea
9
Li B, Tan K, Lao AR, Wang H, Zheng H, Zhang L. A comprehensive review of artificial intelligence for pharmacology research. Front Genet 2024; 15:1450529. [PMID: 39290983 PMCID: PMC11405247 DOI: 10.3389/fgene.2024.1450529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Accepted: 08/26/2024] [Indexed: 09/19/2024] Open
Abstract
With the innovation and advancement of artificial intelligence, more and more artificial intelligence techniques are being employed in drug research, biomedical frontier research, and clinical medicine practice, especially in the field of pharmacology research. This review therefore focuses on the applications of artificial intelligence in drug discovery, compound pharmacokinetic prediction, and clinical pharmacology. We briefly introduced the basic knowledge and development of artificial intelligence, presented a comprehensive review, and then summarized the latest studies and discussed the strengths and limitations of artificial intelligence models. Additionally, we highlighted several important studies and pointed out possible research directions.
Affiliation(s)
- Bing Li
- College of Computer Science, Sichuan University, Chengdu, China
| | - Kan Tan
- College of Computer Science, Sichuan University, Chengdu, China
| | - Angelyn R Lao
- Department of Mathematics and Statistics, De La Salle University, Manila, Philippines
| | - Haiying Wang
- School of Computing, Ulster University, Belfast, United Kingdom
| | - Huiru Zheng
- School of Computing, Ulster University, Belfast, United Kingdom
| | - Le Zhang
- College of Computer Science, Sichuan University, Chengdu, China
| |
|
10
|
Xiao Z, Zhu M, Chen J, You Z. Integrated Transfer Learning and Multitask Learning Strategies to Construct Graph Neural Network Models for Predicting Bioaccumulation Parameters of Chemicals. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:15650-15660. [PMID: 39051472 DOI: 10.1021/acs.est.4c02421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Accurate prediction of parameters related to the environmental exposure of chemicals is crucial for the sound management of chemicals. However, the lack of large data sets for training models may result in poor prediction accuracy and robustness. Herein, integrated transfer learning (TL) and multitask learning (MTL) was proposed for constructing a graph neural network (GNN) model (abbreviated as TL-MTL-GNN model) using n-octanol/water partition coefficients as a source domain. The TL-MTL-GNN model was trained to predict three bioaccumulation parameters based on enlarged data sets that cover 2496 compounds with at least one bioaccumulation parameter. Results show that the TL-MTL-GNN model outperformed single-task GNN models with and without the TL, as well as conventional machine learning models trained with molecular descriptors or fingerprints. Applicability domains were characterized by a state-of-the-art structure-activity landscape-based (abbreviated as ADSAL) methodology. The TL-MTL-GNN model coupled with the optimal ADSAL was employed to predict bioaccumulation parameters for around 60,000 chemicals, with more than 13,000 compounds identified as bioaccumulative chemicals. The high predictive accuracy and robustness of the TL-MTL-GNN model demonstrate the feasibility of integrating the TL and MTL strategy in modeling small-sized data sets. The strategy holds significant potential for addressing small data challenges in modeling environmental chemicals.
Affiliation(s)
- Zijun Xiao
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
| | - Minghua Zhu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Key Laboratory of Integrated Regulation and Resources Development of Shallow Lakes of Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
| | - Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
| | - Zecang You
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
| |
|
11
|
Komissarov L, Manevski N, Groebke Zbinden K, Schindler T, Zitnik M, Sach-Peltason L. Actionable Predictions of Human Pharmacokinetics at the Drug Design Stage. Mol Pharm 2024; 21:4356-4371. [PMID: 39132855 DOI: 10.1021/acs.molpharmaceut.4c00311] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/13/2024]
Abstract
We present a novel computational approach for predicting human pharmacokinetics (PK) that addresses the challenges of early stage drug design. Our study introduces and describes a large-scale data set of 11 clinical PK end points, encompassing over 2700 unique chemical structures to train machine learning models. To that end multiple advanced training strategies are compared, including the integration of in vitro data and a novel self-supervised pretraining task. In addition to the predictions, our final model provides meaningful epistemic uncertainties for every data point. This allows us to successfully identify regions of exceptional predictive performance, with an absolute average fold error (AAFE/geometric mean fold error) of less than 2.5 across multiple end points. Together, these advancements represent a significant leap toward actionable PK predictions, which can be utilized early on in the drug design process to expedite development and reduce reliance on nonclinical studies.
Affiliation(s)
- Leonid Komissarov
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Nenad Manevski
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Katrin Groebke Zbinden
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Torsten Schindler
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Marinka Zitnik
- Harvard Medical School, Department of Biomedical Informatics, Boston, Massachusetts 02115, United States
| | - Lisa Sach-Peltason
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| |
|
12
|
Bendtsen KM, Harder MWH, Glendorf T, Kjeldsen TB, Kristensen NR, Refsgaard HHF. Predicting human half-life for insulin analogs: An inter-drug approach. Eur J Pharm Biopharm 2024; 201:114375. [PMID: 38897553 DOI: 10.1016/j.ejpb.2024.114375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 06/14/2024] [Accepted: 06/16/2024] [Indexed: 06/21/2024]
Abstract
An inter-drug approach, applying pharmacokinetic information for insulin analogs in different animal species (rat, dog, and pig), performed better than allometric scaling for human translation of intravenous half-life and required data from only a single animal species for reliable predictions. Average fold errors (AFE) of 1.2-1.7 were determined for all species, whereas the AFE for multispecies allometric scaling was 1.9. A slightly larger prediction error for human half-life was determined from in vitro human insulin receptor affinity data (AFE of 2.3-2.6). The requirements for the inter-drug approach were shown to be a span of at least 2 orders of magnitude in half-life for the included drugs and a shared clearance mechanism. The insulin analogs in this study were the five fatty acid protracted analogs insulin degludec, insulin icodec, insulin 320, insulin 338, and insulin 362, as well as the non-acylated analog insulin aspart.
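For readers unfamiliar with the allometric-scaling baseline mentioned in this abstract, the following is a minimal sketch of simple multispecies allometry, assuming the usual power-law form Y = a·W^b fitted by least squares on log-transformed values. The function names, the body weights, and the half-life values are hypothetical illustrations (the values are generated from a known power law, not taken from the paper), and the paper's own scaling procedure may differ.

```python
import math

def fit_allometry(body_weights_kg, values):
    """Least-squares fit of log10(value) = log10(a) + b * log10(weight)."""
    xs = [math.log10(w) for w in body_weights_kg]
    ys = [math.log10(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # allometric exponent
    a = 10 ** (mean_y - b * mean_x)     # allometric coefficient
    return a, b

def predict(a, b, body_weight_kg):
    """Extrapolate the fitted power law to another body weight."""
    return a * body_weight_kg ** b

# Synthetic, noise-free values generated from t = 2.0 * W**0.25.
# Weights and half-lives are illustrative assumptions, not study data.
weights = [0.25, 10.0, 30.0]            # assumed rat, dog, pig weights (kg)
half_lives = [2.0 * w ** 0.25 for w in weights]
a, b = fit_allometry(weights, half_lives)
human = predict(a, b, 70.0)             # assumed 70 kg human
```

Because the synthetic inputs follow the power law exactly, the fit recovers the generating coefficient and exponent; with real data the extrapolation error is what the fold-error metrics in the abstract quantify.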
Affiliation(s)
- Kristian M Bendtsen
- Digital Sciences & Innovation, Research & Early Development, Novo Nordisk, DK-2760 Måløv, Denmark
| | - Magnus W H Harder
- Global Drug Discovery, Research & Early Development, Novo Nordisk, DK-2760 Måløv, Denmark
| | - Tine Glendorf
- Global Research Technologies, Research & Early Development, Novo Nordisk, DK-2760 Måløv, Denmark
| | - Thomas B Kjeldsen
- Global Research Technologies, Research & Early Development, Novo Nordisk, DK-2760 Måløv, Denmark
| | | | - Hanne H F Refsgaard
- Global Drug Discovery, Research & Early Development, Novo Nordisk, DK-2760 Måløv, Denmark.
| |
|
13
|
Yang Z, Wang Y, Du G, Zhan Y, Zhan W. Prediction method of pharmacokinetic parameters of small molecule drugs based on GCN network model. J Mol Model 2024; 30:264. [PMID: 38995407 DOI: 10.1007/s00894-024-06051-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 06/26/2024] [Indexed: 07/13/2024]
Abstract
CONTEXT Accurately predicting plasma protein binding rate (PPBR) and oral bioavailability (OBA) helps to better reveal the absorption and distribution of drugs in the human body and subsequent drug design. Although machine learning models have achieved good results in prediction accuracy, they often suffer from insufficient accuracy when dealing with data with irregular topological structures. METHODS In view of this, this study proposes a pharmacokinetic parameter prediction framework based on graph convolutional networks (GCN), which predicts the PPBR and OBA of small molecule drugs. In the framework, GCN is first used to extract spatial feature information on the topological structure of drug molecules, in order to better learn node features and association information between nodes. Then, based on the principle of drug similarity, this study calculates the similarity between small molecule drugs, selects different thresholds to construct datasets, and establishes a prediction model centered on the GCN algorithm. The experimental results show that compared with traditional machine learning prediction models, the prediction model constructed based on the GCN method performs best on PPBR and OBA datasets with an inter-molecular similarity threshold of 0.25, with MAE of 0.155 and 0.167, respectively. In addition, in order to further improve the accuracy of the prediction model, GCN is combined with other algorithms. Compared to using a single GCN method, the distribution of the predicted values obtained by the combined model is highly consistent with the true values. In summary, this work provides a new method for improving the rate of early drug screening in the future.
Affiliation(s)
- Zhihua Yang
- Department of Radiation Oncology, General Hospital of Ningxia Medical University, Yinchuan, 750004, China
| | - Ying Wang
- Engineering Research Center of Molecular and Neuro Imaging of the Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Getao Du
- Engineering Research Center of Molecular and Neuro Imaging of the Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Yonghua Zhan
- Engineering Research Center of Molecular and Neuro Imaging of the Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China.
| | - Wenhua Zhan
- Department of Radiation Oncology, General Hospital of Ningxia Medical University, Yinchuan, 750004, China.
| |
|
14
|
Kim J, Chang W, Ji H, Joung I. Quantum-Informed Molecular Representation Learning Enhancing ADMET Property Prediction. J Chem Inf Model 2024; 64:5028-5040. [PMID: 38916580 DOI: 10.1021/acs.jcim.4c00772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
We examined pretraining tasks leveraging abundant labeled data to effectively enhance molecular representation learning in downstream tasks, specifically emphasizing graph transformers to improve the prediction of ADMET properties. Our investigation revealed limitations in previous pretraining tasks and identified more meaningful training targets, ranging from 2D molecular descriptors to extensive quantum chemistry simulations. These data were seamlessly integrated into supervised pretraining tasks. The implementation of our pretraining strategy and multitask learning outperforms conventional methods, achieving state-of-the-art outcomes in 7 out of 22 ADMET tasks within the Therapeutics Data Commons by utilizing a shared encoder across all tasks. Our approach underscores the effectiveness of learning molecular representations and highlights the potential for scalability when leveraging extensive data sets, marking a significant advancement in this domain.
Affiliation(s)
- Jungwoo Kim
- Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea
| | - Woojae Chang
- Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea
| | - Hyunjun Ji
- Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea
| | - InSuk Joung
- Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea
| |
|
15
|
Guo W, Dong Y, Hao GF. Transfer learning empowers accurate pharmacokinetics prediction of small samples. Drug Discov Today 2024; 29:103946. [PMID: 38460571 DOI: 10.1016/j.drudis.2024.103946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 02/22/2024] [Accepted: 03/05/2024] [Indexed: 03/11/2024]
Abstract
Accurate assessment of pharmacokinetic (PK) properties is crucial for selecting optimal candidates and avoiding downstream failures. Transfer learning is an innovative machine learning approach enabling high-throughput prediction with limited data. Recently, transfer learning methods showed promise in predicting ADME/PK parameters. Given the prolific growth of research on transfer learning for PK prediction, a comprehensive review of its advantages and challenges is imperative. This study explores the fundamentals, classifications, toolkits and applications of various transfer learning techniques for PK prediction, demonstrating their utility through three practical case studies. This work will serve as a reference for drug design researchers.
Affiliation(s)
- Wenbo Guo
- National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Guizhou University, Guiyang 550025, China
| | - Yawen Dong
- School of Pharmaceutical Sciences, Guizhou University, Guiyang 550025, China.
| | - Ge-Fei Hao
- National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Guizhou University, Guiyang 550025, China.
| |
|
16
|
Melo L, Scotti L, Scotti MT. Development of a standardized methodology for transfer learning with QSAR models: a purely data-driven approach for source task selection. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2024; 35:183-198. [PMID: 38312090 DOI: 10.1080/1062936x.2024.2311693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 01/23/2024] [Indexed: 02/06/2024]
Abstract
Transfer learning is a machine learning technique that works well with chemical endpoints, with several papers confirming its efficiency. Although effective, because the choice of source/assistant tasks is non-trivial, the application of this technique is severely limited by the domain knowledge of the modeller. Considering this limitation, we developed a purely data-driven approach for source task selection that abstracts the need for domain knowledge. To achieve this, we created a supervised learning setting in which transfer outcome (positive/negative) is the variable to be predicted, and a set of six transferability metrics, calculated based on information from target and source datasets, are the features for prediction. We used the ChEMBL database to generate 100,000 transfers using random pairing, and with these transfers, we trained and evaluated our transferability prediction model (TP-Model). Our TP-Model achieved a 135-fold increase in precision while achieving a sensitivity of 92%, demonstrating a clear superiority against random search. In addition, we observed that transfer learning could provide considerable performance increases when applicable, with an average Matthews Correlation Coefficient (MCC) increase of 0.19 when using a single source and an average MCC increase of 0.44 when using multiple sources.
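The Matthews Correlation Coefficient (MCC) used in this abstract to measure the gain from transfer can be computed directly from a binary confusion matrix. The sketch below uses the standard MCC formula; the function name and the convention of returning 0.0 when the denominator is zero are assumptions made here for illustration, and the authors' own implementation is not described in the abstract.

```python
import math

def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from binary confusion-matrix counts; ranges from -1 to +1."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return numerator / denominator

perfect = matthews_corrcoef(tp=50, fp=0, tn=50, fn=0)
inverted = matthews_corrcoef(tp=0, fp=50, tn=0, fn=50)
chance = matthews_corrcoef(tp=25, fp=25, tn=25, fn=25)
```

On this scale, an average increase of 0.19 or 0.44 is a shift along a range whose endpoints are total disagreement (-1) and perfect agreement (+1), with 0 corresponding to chance-level prediction.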
Affiliation(s)
- L Melo
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
| | - L Scotti
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
| | - M T Scotti
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
| |
|
17
|
Karalis VD. The Integration of Artificial Intelligence into Clinical Practice. APPLIED BIOSCIENCES 2024; 3:14-44. [DOI: 10.3390/applbiosci3010002] [Citation(s) in RCA: 60] [Impact Index Per Article: 60.0] [Indexed: 01/06/2025]
Abstract
The purpose of this literature review is to provide a fundamental synopsis of current research pertaining to artificial intelligence (AI) within the domain of clinical practice. Artificial intelligence has revolutionized the field of medicine and healthcare by providing innovative solutions to complex problems. One of the most important benefits of AI in clinical practice is its ability to investigate extensive volumes of data with efficiency and precision. This has led to the development of various applications that have improved patient outcomes and reduced the workload of healthcare professionals. AI can support doctors in making more accurate diagnoses and developing personalized treatment plans. Successful examples of AI applications are outlined for a series of medical specialties like cardiology, surgery, gastroenterology, pneumology, nephrology, urology, dermatology, orthopedics, neurology, gynecology, ophthalmology, pediatrics, hematology, and critically ill patients, as well as diagnostic methods. Special reference is made to legal and ethical considerations like accuracy, informed consent, privacy issues, data security, regulatory framework, product liability, explainability, and transparency. Finally, this review closes by critically appraising AI use in clinical practice and its future perspectives. However, it is also important to approach its development and implementation cautiously to ensure ethical considerations are met.
Affiliation(s)
- Vangelis D. Karalis
- Department of Pharmacy, School of Health Sciences, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Institute of Applied and Computational Mathematics, Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece
| |
|
18
|
Mozafari N, Mozafari N, Dehshahri A, Azadi A. Knowledge Gaps in Generating Cell-Based Drug Delivery Systems and a Possible Meeting with Artificial Intelligence. Mol Pharm 2023; 20:3757-3778. [PMID: 37428824 DOI: 10.1021/acs.molpharmaceut.3c00162] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 07/12/2023]
Abstract
Cell-based drug delivery systems are new strategies in targeted delivery in which cells or cell-membrane-derived systems are used as carriers and release their cargo in a controlled manner. Recently, great attention has been directed to cells as carrier systems for treating several diseases. There are various challenges in the development of cell-based drug delivery systems. The prediction of the properties of these platforms is a prerequisite step in their development to reduce undesirable effects. Integrating nanotechnology and artificial intelligence leads to more innovative technologies. Artificial intelligence quickly mines data and makes decisions more quickly and accurately. Machine learning as a subset of the broader artificial intelligence has been used in nanomedicine to design safer nanomaterials. Here, how challenges of developing cell-based drug delivery systems can be solved with potential predictive models of artificial intelligence and machine learning is portrayed. The most famous cell-based drug delivery systems and their challenges are described. Last but not least, artificial intelligence and most of its types used in nanomedicine are highlighted. The present Review has shown the challenges of developing cells or their derivatives as carriers and how they can be used with potential predictive models of artificial intelligence and machine learning.
Affiliation(s)
- Negin Mozafari
- Department of Pharmaceutics, School of Pharmacy, Shiraz University of Medical Sciences, 71468 64685 Shiraz, Iran
| | - Niloofar Mozafari
- Design and System Operations Department, Regional Information Center for Science and Technology, 71946 94171 Shiraz, Iran
| | - Ali Dehshahri
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, 71468 64685 Shiraz, Iran
- Pharmaceutical Sciences Research Centre, Shiraz University of Medical Sciences, 71468 64685 Shiraz, Iran
| | - Amir Azadi
- Department of Pharmaceutics, School of Pharmacy, Shiraz University of Medical Sciences, 71468 64685 Shiraz, Iran
- Pharmaceutical Sciences Research Centre, Shiraz University of Medical Sciences, 71468 64685 Shiraz, Iran
| |
|
19
|
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues, such as data diversity, imputation, noise, imbalance, and high-dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Zailiang Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Ekaterina Merkurjev
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Lu Ke
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Long Chen
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Jian Jiang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yueying Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Jie Liu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Bengong Zhang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
| |
|
20
|
Einarson K, Bendtsen KM, Li K, Thomsen M, Kristensen NR, Winther O, Fulle S, Clemmensen L, Refsgaard HH. Molecular Representations in Machine-Learning-Based Prediction of PK Parameters for Insulin Analogs. ACS OMEGA 2023; 8:23566-23578. [PMID: 37426277 PMCID: PMC10324072 DOI: 10.1021/acsomega.3c01218] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/23/2023] [Accepted: 06/06/2023] [Indexed: 07/11/2023]
Abstract
Therapeutic peptides and proteins derived from either endogenous hormones, such as insulin, or de novo design via display technologies occupy a distinct pharmaceutical space in between small molecules and large proteins such as antibodies. Optimizing the pharmacokinetic (PK) profile of drug candidates is of high importance when it comes to prioritizing lead candidates, and machine-learning models can provide a relevant tool to accelerate the drug design process. Predicting PK parameters of proteins remains difficult due to the complex factors that influence PK properties; furthermore, the data sets are small compared to the variety of compounds in the protein space. This study describes a novel combination of molecular descriptors for proteins such as insulin analogs, where many contained chemical modifications, e.g., attached small molecules for protraction of the half-life. The underlying data set consisted of 640 structural diverse insulin analogs, of which around half had attached small molecules. Other analogs were conjugated to peptides, amino acid extensions, or fragment crystallizable regions. The PK parameters clearance (CL), half-life (T1/2), and mean residence time (MRT) could be predicted by using classical machine-learning models such as Random Forest (RF) and Artificial Neural Networks (ANN) with root-mean-square errors of CL of 0.60 and 0.68 (log units) and average fold errors of 2.5 and 2.9 for RF and ANN, respectively. Both random and temporal data splittings were employed to evaluate ideal and prospective model performance with the best models, regardless of data splitting, achieving a minimum of 70% of predictions within a twofold error. 
The tested molecular representations include (1) global physicochemical descriptors combined with descriptors encoding the amino acid composition of the insulin analogs, (2) physicochemical descriptors of the attached small molecule, (3) protein language model (evolutionary scale modeling) embedding of the amino acid sequence of the molecules, and (4) a natural language processing inspired embedding (mol2vec) of the attached small molecule. Encoding the attached small molecule via (2) or (4) significantly improved the predictions, while the benefit of using the protein language model-based encoding (3) depended on the machine-learning model used. Using Shapley additive explanations (SHAP) values, the most important molecular descriptors were identified as those related to the molecular size of both the protein and the protraction part. Overall, the results show that combining representations of proteins and small molecules was key for PK predictions of insulin analogs.
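Two of the performance figures quoted in this abstract, the root-mean-square error in log units and the share of predictions within a twofold error, can be sketched as follows. This is an illustrative reading that assumes base-10 logarithms and a symmetric fold criterion (predicted/observed or observed/predicted no larger than the fold limit); the function names and example values are hypothetical and the authors' exact definitions are not given in the abstract.

```python
import math

def log10_rmse(predicted, observed):
    """Root-mean-square error of log10(predicted) versus log10(observed)."""
    squared = [(math.log10(p) - math.log10(o)) ** 2
               for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def fraction_within_fold(predicted, observed, fold=2.0):
    """Share of predictions whose fold error does not exceed `fold`."""
    hits = sum(1 for p, o in zip(predicted, observed)
               if max(p / o, o / p) <= fold)
    return hits / len(predicted)

# Made-up values for illustration only, not data from the study.
pred = [1.0, 2.0, 10.0, 0.4]
obs = [1.0, 1.0, 1.0, 1.0]
share = fraction_within_fold(pred, obs)   # fold errors: 1, 2, 10, 2.5
rmse = log10_rmse([10.0, 10.0], [1.0, 1.0])
```

A log-unit RMSE of 1.0 corresponds to predictions that are off by a factor of ten on average in the root-mean-square sense, which is why values such as 0.60 and 0.68 are reported alongside fold errors.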
Affiliation(s)
- Kasper A. Einarson
- Danish Technical University (DTU), Applied Mathematics and Computer Science, Kongens Lyngby 2800, Denmark
- Novo Nordisk A/S, Global Drug Discovery, Research & Early Development (R&ED), Måløv 2760, Denmark
| | | | - Kang Li
- Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark
| | - Maria Thomsen
- Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark
| | | | - Ole Winther
- Danish Technical University (DTU), Applied Mathematics and Computer Science, Kongens Lyngby 2800, Denmark
- Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen 2100, Denmark
- Department of Biology, Bioinformatics Centre, University of Copenhagen, Copenhagen 2200, Denmark
| | - Simone Fulle
- Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark
| | - Line Clemmensen
- Danish Technical University (DTU), Applied Mathematics and Computer Science, Kongens Lyngby 2800, Denmark
| | - Hanne H.F. Refsgaard
- Novo Nordisk A/S, Global Drug Discovery, Research & Early Development (R&ED), Måløv 2760, Denmark
| |
|
21
|
Manevski N, Umehara K, Parrott N. Drug Design and Success of Prospective Mouse In Vitro-In Vivo Extrapolation (IVIVE) for Predictions of Plasma Clearance (CLp) from Hepatocyte Intrinsic Clearance (CLint). Mol Pharm 2023. [PMID: 37235687 DOI: 10.1021/acs.molpharmaceut.2c01001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Hepatocyte intrinsic clearance (CLint) and methods of in vitro-in vivo extrapolation (IVIVE) are often used to predict plasma clearance (CLp) in drug discovery. While the prediction success of this approach is dependent on the chemotype, specific molecular properties and drug design features that govern these outcomes are poorly understood. To address this challenge, we investigated the success of prospective mouse CLp IVIVE across 2142 chemically diverse compounds. Dilution scaling, which assumes that the free fraction in hepatocyte incubations (fu,inc) is governed by binding to the 10% of serum in the incubation medium, was used as our default CLp IVIVE approach. Results show that predictions of CLp are better for smaller (molecular weight (MW) < 500 Da), less polar (total polar surface area (TPSA) < 100 Å², hydrogen bond donor (HBD) ≤ 1, hydrogen bond acceptor (HBA) ≤ 6), lipophilic (log D > 3), and neutral compounds, with low HBD count playing the key role. If compounds are classified according to their chemical space, predictions were good for compounds resembling central nervous system (CNS) drugs [average absolute fold error (AAFE) of 2.05, average fold error (AFE) of 0.90], moderate for classical druglike compounds (according to Lipinski, Veber, and Ghose guidelines; AAFE of 2.55; AFE of 0.68), and poor for nonclassical "beyond the rule of 5" compounds (AAFE of 3.31; AFE of 0.41). From the perspective of measured druglike properties, predictions of CLp were better for compounds with moderate-to-high hepatocyte CLint (>10 μL/min/10⁶ cells), high passive cellular permeability (Papp > 100 nm/s), and moderate observed CLp (5-50 mL/min/kg). Influences of plasma protein binding (fu,p) and P-glycoprotein (Pgp) apical efflux ratio (AP-ER) were less pronounced. 
If the extended clearance classification system (ECCS) is applied, predictions were good for class 2 (Papp > 50 nm/s; neutral or basic; AAFE of 2.35; AFE of 0.70) and acceptable for class 1A compounds (AAFE of 2.98; AFE of 0.70). Classes 1B, 3A/B, and 4 showed poor outcomes (AAFE > 3.80; AFE < 0.60). Functional groups trending toward weaker CLp IVIVE were esters, carbamates, sulfonamides, carboxylic acids, ketones, primary and secondary amines, primary alcohols, oxetanes, and compounds liable to aldehyde oxidase metabolism, likely due to multifactorial reasons. Multivariate analysis showed that multiple properties are relevant, combining to define the overall success of CLp IVIVE. Our results indicate that the current practice of prospective CLp IVIVE is suitable only for CNS-like compounds and well-behaved classical druglike space (e.g., high permeability or ECCS class 2) without challenging functional groups. Unfortunately, based on existing mouse data, prospective CLp IVIVE for complex and nonclassical chemotypes is poor and hardly better than random guessing. This is likely due to complexities such as extrahepatic metabolism and transporter-mediated disposition, which are poorly captured by this methodology. With small-molecule drug discovery increasingly evolving toward nonclassical and complex chemotypes, existing CLp IVIVE methodology will require improvement. While empirical correction factors may bridge the gap in the near future, improved and new in vitro assays, data integration models, and machine learning (ML) methods are increasingly needed to address this challenge and reduce the number of nonclinical pharmacokinetic (PK) studies.
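For readers unfamiliar with the two fold-error metrics quoted in this abstract, the following is a minimal sketch using the conventional definitions, AFE = 10^mean(log10(pred/obs)) and AAFE = 10^mean(|log10(pred/obs)|). The abstract does not spell out the formulas, so these definitions, the function name `fold_errors`, and the sample values are assumptions for illustration only.

```python
import math

def fold_errors(predicted, observed):
    """Return (AFE, AAFE) for paired, strictly positive predictions and observations.

    AFE  (bias):      10 ** mean(log10(pred / obs))
    AAFE (precision): 10 ** mean(abs(log10(pred / obs)))
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    logs = [math.log10(p / o) for p, o in zip(predicted, observed)]
    n = len(logs)
    afe = 10 ** (sum(logs) / n)
    aafe = 10 ** (sum(abs(x) for x in logs) / n)
    return afe, aafe

# Hypothetical clearance values (mL/min/kg), not taken from the paper.
pred = [10.0, 40.0, 5.0]
obs = [20.0, 20.0, 5.0]
afe, aafe = fold_errors(pred, obs)
print(f"AFE = {afe:.2f}, AAFE = {aafe:.2f}")
```

In this toy case one twofold underprediction and one twofold overprediction cancel in the AFE, which stays near 1, while the AAFE still registers the spread. This is why the two numbers are usually reported together: an AFE below 1 indicates systematic underprediction, and the AAFE measures the typical size of the error regardless of direction.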
Affiliation(s)
- Nenad Manevski
- Roche Pharmaceutical Research and Early Development (pRED), Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Kenichi Umehara
- Roche Pharmaceutical Research and Early Development (pRED), Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Neil Parrott
- Roche Pharmaceutical Research and Early Development (pRED), Roche Innovation Center Basel, 4070 Basel, Switzerland
| |
|
22
|
Mucllari E, Zadorozhnyy V, Ye Q, Nguyen DD. Novel Molecular Representations Using Neumann-Cayley Orthogonal Gated Recurrent Unit. J Chem Inf Model 2023; 63:2656-2666. [PMID: 37075324 DOI: 10.1021/acs.jcim.2c01526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/21/2023]
Abstract
Advances in deep neural networks (DNNs) have made a very powerful machine learning method available to researchers across many fields of study, including the biomedical and cheminformatics communities, where DNNs help to improve tasks such as protein performance, molecular design, drug discovery, etc. Many of those tasks rely on molecular descriptors for representing molecular characteristics in cheminformatics. Despite significant efforts and the introduction of numerous methods that derive molecular descriptors, the quantitative prediction of molecular properties remains challenging. One widely used method of encoding molecule features into bit strings is the molecular fingerprint. In this work, we propose using new Neumann-Cayley Gated Recurrent Units (NC-GRU) inside the Neural Nets encoder (AutoEncoder) to create neural molecular fingerprints (NC-GRU fingerprints). The NC-GRU AutoEncoder introduces orthogonal weights into widely used GRU architecture, resulting in faster, more stable training, and more reliable molecular fingerprints. Integrating novel NC-GRU fingerprints and Multi-Task DNN schematics improves the performance of various molecular-related tasks such as toxicity, partition coefficient, lipophilicity, and solvation-free energy, producing state-of-the-art results on several benchmarks.
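The abstract does not describe how the orthogonal weights are constructed. As a hedged illustration of the general idea only, the sketch below builds an orthogonal matrix from a skew-symmetric one with the textbook Cayley transform, W = (I + A)^{-1}(I - A), and optionally replaces the exact inverse with a truncated Neumann series. That this is what "Neumann-Cayley" refers to is an assumption; the function name, matrix size, and scaling are hypothetical, and this is not the authors' NC-GRU implementation.

```python
import numpy as np

def cayley_orthogonal(A, neumann_terms=None):
    """Map a skew-symmetric matrix A to W = (I + A)^{-1} (I - A).

    If neumann_terms is given, (I + A)^{-1} is approximated by the truncated
    Neumann series sum_k (-A)^k, which converges only when the spectral
    radius of A is below 1.
    """
    n = A.shape[0]
    eye = np.eye(n)
    if neumann_terms is None:
        inv = np.linalg.inv(eye + A)
    else:
        inv = np.zeros((n, n))
        term = eye.copy()
        for _ in range(neumann_terms):
            inv += term          # accumulate (-A)^k
            term = term @ (-A)
    return inv @ (eye - A)

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) * 0.02   # small entries keep the series convergent
A = M - M.T                          # skew-symmetric: A.T == -A
W_exact = cayley_orthogonal(A)
W_approx = cayley_orthogonal(A, neumann_terms=30)
print(np.allclose(W_exact.T @ W_exact, np.eye(4)))
```

An orthogonal recurrent matrix preserves vector norms under repeated multiplication, which is the usual argument for why such constraints stabilize training of recurrent networks.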
Affiliation(s)
- Edison Mucllari
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Vasily Zadorozhnyy
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Qiang Ye
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Duc Duy Nguyen
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| |
|
23
|
Obrezanova O. Artificial intelligence for compound pharmacokinetics prediction. Curr Opin Struct Biol 2023; 79:102546. [PMID: 36804676 DOI: 10.1016/j.sbi.2023.102546] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2022] [Revised: 01/04/2023] [Accepted: 01/13/2023] [Indexed: 02/17/2023]
Abstract
Optimisation of compound pharmacokinetics (PK) is an integral part of drug discovery and development. Animal in vivo PK data as well as human and animal in vitro systems are routinely utilised to evaluate PK in humans. In recent years machine learning and artificial intelligence (AI) emerged as a major tool for modelling of in vivo animal and human PK, enabling prediction from chemical structure early in drug discovery, and therefore offering opportunities to guide the design and prioritisation of molecules based on relevant in vivo properties and, ultimately, predicting human PK at the point of design. This review presents recent advances in machine learning and AI models for in vivo animal and human PK for small-molecule compounds as well as some examples for antibody therapeutics.
Affiliation(s)
- Olga Obrezanova
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, CB4 0WJ, UK.
| |
|
24
|
García-Andrade X, García Tahoces P, Pérez-Ríos J, Martínez Núñez E. Barrier Height Prediction by Machine Learning Correction of Semiempirical Calculations. J Phys Chem A 2023; 127:2274-2283. [PMID: 36877614 PMCID: PMC10845151 DOI: 10.1021/acs.jpca.2c08340] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 02/19/2023] [Indexed: 03/07/2023]
Abstract
Different machine learning (ML) models are proposed in the present work to predict density functional theory-quality barrier heights (BHs) from semiempirical quantum mechanical (SQM) calculations. The ML models include a multitask deep neural network, gradient-boosted trees by means of the XGBoost interface, and Gaussian process regression. The obtained mean absolute errors are similar to those of previous models considering the same number of data points. The ML corrections proposed in this paper could be useful for rapid screening of the large reaction networks that appear in combustion chemistry or in astrochemistry. Finally, our results show that 70% of the features with the highest impact on model output are bespoke predictors. This custom-made set of predictors could be employed by future Δ-ML models to improve the quantitative prediction of other reaction properties.
Affiliation(s)
| | - Pablo García Tahoces
- Department of Electronics and Computer Science, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
| | - Jesús Pérez-Ríos
- Department of Physics, Stony Brook University, Stony Brook, New York 11794, United States
- Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794-3800, United States
| | - Emilio Martínez Núñez
- Department of Physical Chemistry, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
| |
|
25
|
Wang N, Zhang Y, Wang W, Ye Z, Chen H, Hu G, Ouyang D. How can machine learning and multiscale modeling benefit ocular drug development? Adv Drug Deliv Rev 2023; 196:114772. [PMID: 36906232 DOI: 10.1016/j.addr.2023.114772] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2022] [Revised: 02/06/2023] [Accepted: 03/05/2023] [Indexed: 03/12/2023]
Abstract
The eyes possess sophisticated physiological structures, diverse disease targets, limited drug delivery space, distinctive barriers, and complicated biomechanical processes, requiring a more in-depth understanding of the interactions between drug delivery systems and biological systems for ocular formulation development. However, the tiny size of the eyes makes sampling difficult and invasive studies costly and ethically constrained. Developing ocular formulations following conventional trial-and-error formulation and manufacturing process screening procedures is inefficient. Along with the popularity of computational pharmaceutics, non-invasive in silico modeling & simulation offer new opportunities for the paradigm shift of ocular formulation development. The current work first systematically reviews the theoretical underpinnings, advanced applications, and unique advantages of data-driven machine learning and multiscale simulation approaches represented by molecular simulation, mathematical modeling, and pharmacokinetic (PK)/pharmacodynamic (PD) modeling for ocular drug development. Following this, a new computer-driven framework for rational pharmaceutical formulation design is proposed, inspired by the potential of in silico explorations in understanding drug delivery details and facilitating drug formulation design. Lastly, to promote the paradigm shift, integrated in silico methodologies were highlighted, and discussions on data challenges, model practicality, personalized modeling, regulatory science, interdisciplinary collaboration, and talent training were conducted in detail with a view to achieving more efficient objective-oriented pharmaceutical formulation design.
Affiliation(s)
- Nannan Wang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
| | - Yunsen Zhang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
| | - Wei Wang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
| | - Zhuyifan Ye
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
| | - Hongyu Chen
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China; Faculty of Science and Technology (FST), University of Macau, Macau, China
| | - Guanghui Hu
- Faculty of Science and Technology (FST), University of Macau, Macau, China
| | - Defang Ouyang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China; Department of Public Health and Medicinal Administration, Faculty of Health Sciences (FHS), University of Macau, Macau, China.
| |
|
26
|
Baccari W, Saidi I, Znati M, Mustafa AM, Caprioli G, Harrath AH, Ben Jannet H. HPLC-MS/MS analysis, antioxidant and α-amylase inhibitory activities of the endemic plant Ferula tunetana using in vitro and in silico methods. Process Biochem 2023. [DOI: 10.1016/j.procbio.2023.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 03/17/2023]
|
27
|
Stoyanova R, Katzberger PM, Komissarov L, Khadhraoui A, Sach-Peltason L, Groebke Zbinden K, Schindler T, Manevski N. Computational Predictions of Nonclinical Pharmacokinetics at the Drug Design Stage. J Chem Inf Model 2023; 63:442-458. [PMID: 36595708 DOI: 10.1021/acs.jcim.2c01134] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
Abstract
Although computational predictions of pharmacokinetics (PK) are desirable at the drug design stage, existing approaches are often limited by prediction accuracy and human interpretability. Using a discovery data set of mouse and rat PK studies at Roche (9,685 unique compounds), we performed a proof-of-concept study to predict key PK properties from chemical structure alone, including plasma clearance (CLp), volume of distribution at steady-state (Vss), and oral bioavailability (F). Ten machine learning (ML) models were evaluated, including Single-Task, Multitask, and transfer learning approaches (i.e., pretraining with in vitro data). In addition to prediction accuracy, we emphasized human interpretability of outcomes, especially the quantification of uncertainty, applicability domains, and explanations of predictions in terms of molecular features. Results show that intravenous (IV) PK properties (CLp and Vss) can be predicted with good precision (average absolute fold error, AAFE of 1.96-2.84 depending on data split) and low bias (average fold error, AFE of 0.98-1.36), with AutoGluon, Gaussian Process Regressor (GP), and ChemProp displaying the best performance. Driven by higher complexity of oral PK studies, predictions of F were more challenging, with the best AAFE values of 2.35-2.60 and higher overprediction bias (AFE of 1.45-1.62). Multi-Task approaches and pretraining of ChemProp neural networks with in vitro data showed similar precision to Single-Task models but helped reduce the bias and increase correlations between observations and predictions. A combination of GP-computed prediction variance, molecular clustering, and dimensionality-reduction provided valuable quantitative insights into prediction uncertainty and applicability domains. SHAPley Additive exPlanations (SHAPs) highlighted molecular features contributing to prediction outcomes of Vss, providing explanations that could aid drug design. 
Combined results show that computational predictions of PK are feasible at the drug design stage, with several ML technologies converging to successfully leverage historical PK data sets. Further studies are needed to unlock the full potential of this approach, especially with respect to data set sizes and quality, transfer learning between in vitro and in vivo data sets, model-independent quantification of uncertainty, and explainability of predictions.
Affiliation(s)
- Raya Stoyanova
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Paul Maximilian Katzberger
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Leonid Komissarov
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Aous Khadhraoui
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Lisa Sach-Peltason
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Katrin Groebke Zbinden
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Torsten Schindler
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| | - Nenad Manevski
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
| |
|
28
|
Tran TTV, Tayara H, Chong KT. Recent Studies of Artificial Intelligence on In Silico Drug Distribution Prediction. Int J Mol Sci 2023; 24:1815. [PMID: 36768139 PMCID: PMC9915725 DOI: 10.3390/ijms24031815] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2022] [Revised: 01/11/2023] [Accepted: 01/13/2023] [Indexed: 01/19/2023] Open
Abstract
Drug distribution is an important process in pharmacokinetics because it has the potential to influence both the amount of medicine reaching the active sites and the effectiveness as well as safety of the drug. The main causes of 90% of drug failures in clinical development are lack of efficacy and uncontrolled toxicity. In recent years, several advances and promising developments in drug distribution property prediction have been achieved, especially in silico, which helped to drastically reduce the time and expense of screening undesired drug candidates. In this study, we provide comprehensive knowledge of drug distribution background, influencing factors, and artificial intelligence-based distribution property prediction models from 2019 to the present. Additionally, we gathered and analyzed public databases and datasets commonly utilized by the scientific community for distribution prediction. The distribution property prediction performance of five large ADMET prediction tools is mentioned as a benchmark for future research. On this basis, we also offer future challenges in drug distribution prediction and research directions. We hope that this review will provide researchers with helpful insight into distribution prediction, thus facilitating the development of innovative approaches for drug discovery.
Affiliation(s)
- Thi Tuyet Van Tran
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Department of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
- Vietnam National University–Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
| | - Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
| | - Kil To Chong
- Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
| |
|
29
|
Synthesis of new 1,2,3-triazole linked benzimidazolidinone : single crystal X-ray structure, biological activities evaluation and molecular docking studies. ARAB J CHEM 2023. [DOI: 10.1016/j.arabjc.2023.104566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
|
30
|
Ota R, Yamashita F. Application of machine learning techniques to the analysis and prediction of drug pharmacokinetics. J Control Release 2022; 352:961-969. [PMID: 36370876 DOI: 10.1016/j.jconrel.2022.11.014] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2022] [Revised: 10/23/2022] [Accepted: 11/07/2022] [Indexed: 11/17/2022]
Abstract
In this review, we describe the current status and challenges in applying machine-learning techniques to the analysis and prediction of pharmacokinetic data. The theory of pharmacokinetics has been developed over decades on the basis of physiology and reaction kinetics. Mathematical models allow the reduction of pharmacokinetic data to parameter values, giving insight and understanding into ADME processes and predicting the outcome of different dosing scenarios. However, much information hidden in the data is lost through conceptual simplification with models. It is difficult to use mechanistic models alone to predict diverse pharmacokinetic time profiles, including inter-drug and inter-individual differences, in a cross-sectional manner. Machine learning is a prediction platform that can handle complex phenomena through data-driven analysis. As a result, machine learning has been successfully adopted in various fields, including image recognition and language processing, and has been used for over two decades in pharmacokinetic research, primarily in the area of quantitative structure-activity relationships for pharmacokinetic parameters. Machine-learning models are generally known to provide better predictive performance than conventional linear models. Owing to the recent success in deep learning, models with new structures are being consistently proposed. These models include transfer learning and generative adversarial networks, which contribute to the effective use of a limited amount of data by diverting existing similar models or generating pseudo-data. How to make such newly emerging machine learning technologies applicable to meet challenges in the pharmacokinetics/pharmacodynamics field is now the key issue.
Affiliation(s)
- Ryosaku Ota
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
| | - Fumiyoshi Yamashita
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan; Department of Applied Pharmacy and Pharmacokinetics, Graduate School of Pharmaceutical Sciences, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan.
| |
|
31
|
Parrott N, Manevski N, Olivares-Morales A. Can We Predict Clinical Pharmacokinetics of Highly Lipophilic Compounds by Integration of Machine Learning or In Vitro Data into Physiologically Based Models? A Feasibility Study Based on 12 Development Compounds. Mol Pharm 2022; 19:3858-3868. [PMID: 36150125 DOI: 10.1021/acs.molpharmaceut.2c00350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
While high lipophilicity tends to improve potency, its effects on pharmacokinetics (PK) are complex and often unfavorable. To predict clinical PK in early drug discovery, we built human physiologically based PK (PBPK) models integrating either (i) machine learning (ML)-predicted properties or (ii) discovery stage in vitro data. Our test set was composed of 12 challenging development compounds with high lipophilicity (mean calculated log P 4.2), low plasma-free fraction (50% of compounds with fu,p < 1%), and low aqueous solubility. Predictions focused on key human PK parameters, including plasma clearance (CL), volume of distribution at steady state (Vss), and oral bioavailability (%F). For predictions of CL, the ML inputs showed acceptable accuracy and slight underprediction bias [an average absolute fold error (AAFE) of 3.55; an average fold error (AFE) of 0.95]. Surprisingly, use of measured data only slightly improved accuracy but introduced an overprediction bias (AAFE = 3.35; AFE = 2.63). Predictions of Vss were more successful, with both ML (AAFE = 2.21; AFE = 0.90) and in vitro (AAFE = 2.24; AFE = 1.72) inputs showing good accuracy and moderate bias. The %F was poorly predicted using ML inputs [average absolute prediction error (AAPE) of 45%], and use of measured data for solubility and permeability improved this to 34%. Sensitivity analysis showed that predictions of CL limited the overall accuracy of human PK predictions, partly due to high nonspecific binding of lipophilic compounds, leading to uncertainty of unbound clearance. For accurate predictions of %F, solubility was the key factor. Despite current limitations, this work encourages further development of ML models and integration of their results within PBPK models to enable human PK prediction at the drug design stage, even before compounds are synthesized. Further evaluation of this approach with more diverse chemical types is warranted.
Affiliation(s)
- Neil Parrott
- Pharmaceutical Sciences, Pharma Research and Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070 Basel, Switzerland
| | - Nenad Manevski
- Pharmaceutical Sciences, Pharma Research and Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070 Basel, Switzerland
| | - Andrés Olivares-Morales
- Pharmaceutical Sciences, Pharma Research and Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070 Basel, Switzerland
| |
|
32
|
Synthesis and In Silico Docking Study towards M-Pro of Novel Heterocyclic Compounds Derived from Pyrazolopyrimidinone as Putative SARS-CoV-2 Inhibitors. MOLECULES (BASEL, SWITZERLAND) 2022; 27:molecules27165303. [PMID: 36014537 PMCID: PMC9416631 DOI: 10.3390/molecules27165303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/30/2022] [Revised: 08/11/2022] [Accepted: 08/15/2022] [Indexed: 12/19/2022]
Abstract
In addition to vaccines, antiviral drugs are essential in order to suppress COVID-19. Although some inhibitor candidates have been determined to target the SARS-CoV-2 protein, there is still an urgent need to continue researching novel inhibitors of the SARS-CoV-2 main protease 'Omicron P132H', a protein that has recently been discovered. In the present study, in the search for therapeutic alternatives to treat COVID-19 and its recent variants, we conducted a structure-based virtual screening using docking studies for a new series of pyrazolo[3,4-d]pyrimidin-4(5H)-one derivatives 5-13, which were synthesized from the condensation reaction of pyrazolopyrimidinone-hydrazide (4) with a series of electrophiles. Some significant ADMET predictions, in addition to the docking results, were obtained based on the types of interactions formed, and the binding energy values were compared to those of the reference anti-SARS-CoV-2 redocked drug nirmatrelvir.
|
33
|
Kim H, Park M, Lee I, Nam H. BayeshERG: a robust, reliable and interpretable deep learning model for predicting hERG channel blockers. Brief Bioinform 2022; 23:6609519. [PMID: 35709752 DOI: 10.1093/bib/bbac211] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2022] [Revised: 04/19/2022] [Accepted: 05/06/2022] [Indexed: 11/13/2022] Open
Abstract
Unintended inhibition of the human ether-à-go-go-related gene (hERG) ion channel by small molecules leads to severe cardiotoxicity. Thus, hERG channel blockage is a significant concern in the development of new drugs. Several computational models have been developed to predict hERG channel blockage, including deep learning models; however, they lack robustness, reliability and interpretability. Here, we developed a graph-based Bayesian deep learning model for hERG channel blocker prediction, named BayeshERG, which has robust predictive power, high reliability and high resolution of interpretability. First, we applied transfer learning, with initial pre-training on a large set of 300 000 data points, to increase the predictive performance. Second, we implemented a Bayesian neural network with Monte Carlo dropout to calibrate the uncertainty of the prediction. Third, we utilized global multihead attentive pooling to augment the high resolution of structural interpretability for the hERG channel blockers and nonblockers. We conducted both internal and external validations for stringent evaluation; in particular, we benchmarked most of the publicly available hERG channel blocker prediction models. We showed that our proposed model outperformed them in both predictive performance and uncertainty calibration. Furthermore, we found that our model learned to focus on the essential substructures of hERG channel blockers via an attention mechanism. Finally, we validated the prediction results of our model by conducting in vitro experiments and confirmed its high validity. In summary, BayeshERG could serve as a versatile tool for discovering hERG channel blockers and helping maximize the possibility of successful drug discovery. The data and source code are available at our GitHub repository (https://github.com/GIST-CSBL/BayeshERG).
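As a rough illustration of the Monte Carlo dropout idea mentioned in this abstract, the sketch below keeps dropout active at prediction time, runs many stochastic forward passes, and reports the mean and standard deviation of the predicted probabilities. It uses a tiny fully connected network with random weights in NumPy; BayeshERG itself is graph-based, so the architecture, the function name `mc_dropout_predict`, and all parameter values here are hypothetical and are not taken from the authors' repository.

```python
import numpy as np

def mc_dropout_predict(x, W1, W2, drop_p=0.2, n_samples=100, seed=0):
    """Return (mean, std) of sigmoid outputs over stochastic forward passes.

    Dropout is applied to the hidden layer on every pass, so the spread of
    the outputs serves as a simple estimate of predictive uncertainty.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
        mask = rng.random(h.shape) >= drop_p     # keep units with prob 1 - drop_p
        h = h * mask / (1.0 - drop_p)            # inverted-dropout rescaling
        logits = h @ W2
        probs.append(1.0 / (1.0 + np.exp(-logits)))
    probs = np.stack(probs)                      # (n_samples, batch, 1)
    return probs.mean(axis=0), probs.std(axis=0)

# Random weights and inputs stand in for a trained model and real descriptors.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 1))
x = rng.normal(size=(3, 8))
mean, std = mc_dropout_predict(x, W1, W2)
print(mean.ravel(), std.ravel())
```

A large standard deviation for a given input suggests the prediction should be treated with caution; how well such estimates are calibrated depends on the trained model and has to be checked empirically.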
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
| | - Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
| | - Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
| |
|
34
|
Serov N, Vinogradov V. Artificial intelligence to bring nanomedicine to life. Adv Drug Deliv Rev 2022; 184:114194. [PMID: 35283223 DOI: 10.1016/j.addr.2022.114194] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Revised: 03/04/2022] [Accepted: 03/07/2022] [Indexed: 12/13/2022]
Abstract
The technology of drug delivery systems (DDSs) has demonstrated an outstanding performance and effectiveness in production of pharmaceuticals, as evidenced by many FDA-approved nanomedicines that have an enhanced selectivity, manageable drug release kinetics and synergistic therapeutic actions. Nonetheless, to date, the rational design and high-throughput development of nanomaterial-based DDSs for specific purposes is far from a routine practice and is still in its infancy, mainly due to the limitations in scientists' capabilities to effectively acquire, analyze, manage, and comprehend complex and ever-growing sets of experimental data, which is vital to develop DDSs with a set of desired functionalities. At the same time, this task is feasible for the data-driven approaches, high throughput experimentation techniques, process automatization, artificial intelligence (AI) technology, and machine learning (ML) approaches, which is referred to as The Fourth Paradigm of scientific research. Therefore, an integration of these approaches with nanomedicine and nanotechnology can potentially accelerate the rational design and high-throughput development of highly efficient nanoformulated drugs and smart materials with pre-defined functionalities. In this Review, we survey the important results and milestones achieved to date in the application of data science, high throughput, as well as automatization approaches, combined with AI and ML to design and optimize DDSs and related nanomaterials. The mission of this manuscript is not only to reflect the state of the art in data-driven nanomedicine, but also to show how recent findings in the related fields can transform the nanomedicine's image.
We discuss how all these results can be used to boost nanomedicine translation to the clinic, as well as highlight the future directions for the development, data-driven, high throughput experimentation-, and AI-assisted design, as well as the production of nanoformulated drugs and smart materials with pre-defined properties and behavior. This Review will be of high interest to the chemists involved in materials science, nanotechnology, and DDSs development for biomedical applications, although the general nature of the presented approaches enables knowledge translation to many other fields of science.
Affiliation(s)
- Nikita Serov
- International Institute "Solution Chemistry of Advanced Materials and Technologies", ITMO University, Saint-Petersburg 191002, Russian Federation
| | - Vladimir Vinogradov
- International Institute "Solution Chemistry of Advanced Materials and Technologies", ITMO University, Saint-Petersburg 191002, Russian Federation.
| |
Collapse
|
35
|
Obrezanova O, Martinsson A, Whitehead T, Mahmoud S, Bender A, Miljković F, Grabowski P, Irwin B, Oprisiu I, Conduit G, Segall M, Smith GF, Williamson B, Winiwarter S, Greene N. Prediction of In Vivo Pharmacokinetic Parameters and Time-Exposure Curves in Rats Using Machine Learning from the Chemical Structure. Mol Pharm 2022; 19:1488-1504. [PMID: 35412314 DOI: 10.1021/acs.molpharmaceut.2c00027] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 11/30/2022]
Abstract
Animal pharmacokinetic (PK) data as well as human and animal in vitro systems are utilized in drug discovery to define the rate and route of drug elimination. Accurate prediction and mechanistic understanding of drug clearance and disposition in animals provide a degree of confidence for extrapolation to humans. In addition, prediction of in vivo properties can be used to improve design during drug discovery, help select compounds with better properties, and reduce the number of in vivo experiments. In this study, we generated machine learning models able to predict rat in vivo PK parameters and concentration-time PK profiles based on the molecular chemical structure and either measured or predicted in vitro parameters. The models were trained on internal in vivo rat PK data for over 3000 diverse compounds from multiple projects and therapeutic areas, and the predicted endpoints include clearance and oral bioavailability. We compared the performance of various traditional machine learning algorithms and deep learning approaches, including graph convolutional neural networks. The best models for PK parameters achieved R² = 0.63 [root mean squared error (RMSE) = 0.26] for clearance and R² = 0.55 (RMSE = 0.46) for bioavailability. The models provide a fast and cost-efficient way to guide the design of molecules with optimal PK profiles, to enable the prediction of virtual compounds at the point of design, and to drive prioritization of compounds for in vivo assays.
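The coefficient of determination and RMSE quoted in this abstract can be illustrated with a minimal sketch, assuming the standard textbook definitions (the abstract does not state which exact variants were used). The function names and the example values below are hypothetical and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted endpoint values (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

Reporting both metrics is useful because R² depends on the spread of the observed values, whereas RMSE is expressed in the units of the endpoint itself.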
Affiliation(s)
- Olga Obrezanova
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
- Anton Martinsson
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Tom Whitehead
- Intellegens Ltd., Eagle Labs, Cambridge CB4 3AZ, U.K
- Samar Mahmoud
- Optibrium Ltd., Cambridge Innovation Park, Cambridge CB25 9PB, U.K
- Andreas Bender
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K.; Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Cambridge CB2 1EW, U.K
- Filip Miljković
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Piotr Grabowski
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
- Ben Irwin
- Optibrium Ltd., Cambridge Innovation Park, Cambridge CB25 9PB, U.K
- Ioana Oprisiu
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Matthew Segall
- Optibrium Ltd., Cambridge Innovation Park, Cambridge CB25 9PB, U.K
- Graham F Smith
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
- Beth Williamson
- Drug Metabolism and Pharmacokinetics, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge CB10 1XL, U.K
- Susanne Winiwarter
- Drug Metabolism and Pharmacokinetics, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), Biopharmaceutical R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Nigel Greene
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Waltham, Massachusetts 02451, United States
|
36
|
|
37
|
Deep Learning-Based Prediction of Physical Stability considering Class Imbalance for Amorphous Solid Dispersions. J CHEM-NY 2022. [DOI: 10.1155/2022/4148443] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023] Open
Abstract
This research aims to predict the physical stability of amorphous solid dispersions using deep learning methods. We propose a prediction model that learns effectively from a small, class-imbalanced dataset. To overcome the imbalance problem, the model applies hybrid sampling during data preprocessing, combining the synthetic minority oversampling technique (SMOTE) with the edited nearest neighbor (ENN) algorithm, and reduces the dimensionality of the dataset using principal component analysis (PCA). After preprocessing, learning is performed by a carefully designed neural network with a simple but effective structure. Experimental results show that the proposed model converges faster during training and performs better on the test set than the existing DNN model. Furthermore, it significantly reduces the computational complexity of both training and testing.
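The oversampling half of the hybrid sampling described above can be sketched as follows. This is a minimal illustration of the core SMOTE idea only (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation; the ENN cleaning step and the PCA step are omitted, and the function name, parameters, and data are hypothetical.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature minority class (illustration only).
minority_class = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(minority_class, n_new=6)
print(len(new_points), "synthetic minority samples generated")
```

Because every synthetic point lies on a line segment between two real minority samples, the new samples stay inside the region the minority class already occupies; a cleaning step such as ENN would then remove samples whose neighbours mostly belong to the other class.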
|
38
|
Ye Z, Ouyang D. Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms. J Cheminform 2021; 13:98. [PMID: 34895323 PMCID: PMC8665485 DOI: 10.1186/s13321-021-00575-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 06/22/2021] [Accepted: 11/22/2021] [Indexed: 11/26/2022] Open
Abstract
Rapid solvent selection is of great significance in chemistry. However, solubility prediction remains a crucial challenge. This study aimed to develop machine learning models that can accurately predict compound solubility in organic solvents. A dataset of 5081 experimental temperature and solubility data points for compounds in organic solvents was extracted and standardized. Molecular fingerprints were selected to characterize structural features. LightGBM was compared with deep learning and traditional machine learning (PLS, Ridge regression, kNN, DT, ET, RF, SVM) to develop models for predicting solubility in organic solvents at different temperatures. Compared to the other models, LightGBM exhibited significantly better overall generalization (logS ± 0.20). For unseen solutes, our model gave a prediction accuracy (logS ± 0.59) close to the expected noise level of experimental solubility data. LightGBM revealed the physicochemical relationship between solubility and structural features. Our method enables rapid solvent screening in chemistry and may be applied to solubility prediction in other solvents.
Affiliation(s)
- Zhuyifan Ye
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
- Defang Ouyang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China.
|
39
|
Miljković F, Martinsson A, Obrezanova O, Williamson B, Johnson M, Sykes A, Bender A, Greene N. Machine Learning Models for Human In Vivo Pharmacokinetic Parameters with In-House Validation. Mol Pharm 2021; 18:4520-4530. [PMID: 34758626 DOI: 10.1021/acs.molpharmaceut.1c00718] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Indexed: 11/28/2022]
Abstract
Prior to clinical development, a comprehensive pharmacokinetic characterization of a novel drug is required to understand its exposure at the site of action and elimination. Accordingly, in vitro assays and animal pharmacokinetic studies are regularly employed to predict drug exposure in humans, which is often costly and time-consuming. For this reason, the prediction of human pharmacokinetics at the point of design would be of high value for drug discovery. Therefore, we have established a comprehensive data curation protocol that enables machine learning evaluation of 12 human in vivo pharmacokinetic parameters using only chemical structure information and available doses for 1001 unique compounds. These machine learning models were thoroughly investigated and validated using both an independent hold-out test set and AstraZeneca clinical data. In addition, the availability of preclinical predictions for a subset of internal clinical candidates allowed us to compare our in silico approach with state-of-the-art pharmacokinetic predictions. Based on this evaluation, three fit-for-purpose models for AUC PO (test R² = 0.63; test RMSE = 0.76), Cmax PO (test R² = 0.68; test RMSE = 0.62), and Vdss IV (test R² = 0.47; test RMSE = 0.50) were identified. Based on the findings, our machine learning models have considerable potential for practical applications in drug discovery, such as influencing decision-making in drug discovery projects and progression of drug candidates toward the clinic.
Affiliation(s)
- Filip Miljković
- Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Anton Martinsson
- Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Olga Obrezanova
- Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
- Beth Williamson
- Drug Metabolism and Pharmacokinetics, Research and Early Development, Oncology, R&D, AstraZeneca, Cambridge CB10 1XL, U.K
- Martin Johnson
- Clinical Pharmacology & Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge SG8 6HB, U.K
- Andy Sykes
- Clinical Pharmacology & Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge SG8 6HB, U.K
- Andreas Bender
- Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K.; Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Cambridge CB2 1EW, U.K
- Nigel Greene
- Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Waltham, Massachusetts 02451, United States
|
40
|
Abstract
The use of artificial intelligence methods in drug safety began in the early 2000s with applications such as predicting bacterial mutagenicity and hERG inhibition. The field has been endlessly expanding ever since and the models have become more complex. These approaches are now integrated into molecule risk assessment processes along with in vitro and in vivo methods. Today, artificial intelligence can be used in every phase of drug discovery and development, from profiling chemical libraries in early discovery, to predicting off-target effects in the mid-discovery phase, to assessing potential mutagenic impurities in development and degradants as part of life cycle management. This chapter provides an overview of artificial intelligence in drug safety and describes its application throughout the entire discovery and development process.
|
41
|
Wang W, Ye Z, Gao H, Ouyang D. Computational pharmaceutics - A new paradigm of drug delivery. J Control Release 2021; 338:119-136. [PMID: 34418520 DOI: 10.1016/j.jconrel.2021.08.030] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Received: 03/07/2021] [Revised: 08/17/2021] [Accepted: 08/17/2021] [Indexed: 01/18/2023]
Abstract
In recent decades, pharmaceutics and drug delivery have become increasingly critical in the pharmaceutical industry because the development of new molecular entities (NMEs) has become slower, costlier, and less productive. However, current formulation development still relies on traditional trial-and-error experiments, which are time-consuming, costly, and unpredictable. With the exponential growth of computing capability and algorithms over the past ten years, a new discipline named "computational pharmaceutics" has integrated big data, artificial intelligence, and multi-scale modeling techniques into pharmaceutics, offering great potential to shift the paradigm of drug delivery. Computational pharmaceutics can provide multi-scale lenses to pharmaceutical scientists, revealing physical, chemical, mathematical, and data-driven details ranging across pre-formulation studies, formulation screening, in vivo prediction in the human body, and precision medicine in the clinic. The present paper provides a comprehensive and detailed review of all areas of computational pharmaceutics and "Pharma 4.0", including artificial intelligence and machine learning algorithms, molecular modeling, mathematical modeling, process simulation, and physiologically based pharmacokinetic (PBPK) modeling. We not only summarize the theories and progress of these technologies but also discuss the regulatory requirements, current challenges, and future perspectives in the area, such as talent training and a culture change in the future pharmaceutical industry.
Affiliation(s)
- Wei Wang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
- Zhuyifan Ye
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
- Hanlu Gao
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
- Defang Ouyang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China.
|
42
|
Danishuddin, Kumar V, Faheem M, Woo Lee K. A decade of machine learning-based predictive models for human pharmacokinetics: Advances and challenges. Drug Discov Today 2021; 27:529-537. [PMID: 34592448 DOI: 10.1016/j.drudis.2021.09.013] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 05/31/2021] [Revised: 07/21/2021] [Accepted: 09/22/2021] [Indexed: 11/28/2022]
Abstract
Traditionally, in vitro and in vivo methods are useful for estimating human pharmacokinetic (PK) parameters; however, it is impractical to perform these complex and expensive experiments on a large number of compounds. The integration of publicly available chemical or medical Big Data with artificial intelligence (AI)-based approaches has led to qualitative and quantitative prediction of the human PK of a candidate drug. However, predicting drug response with these approaches is challenging, partly because of the need to adapt the algorithms and partly because of limitations in the experimental data. In this report, we provide an overview of machine learning (ML)-based quantitative structure-activity relationship (QSAR) models used in the assessment or prediction of PK values, as well as the databases available for obtaining such data.
Affiliation(s)
- Danishuddin
- Department of Bio & Medical Big Data (BK4), Division of Life Sciences, Research Institute of Natural Sciences (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea
- Vikas Kumar
- Department of Bio & Medical Big Data (BK4), Division of Life Sciences, Research Institute of Natural Sciences (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea
- Mohammad Faheem
- Department of Biotechnology, Indian Institute of Technology, Roorkee, Uttarakhand 247667, India
- Keun Woo Lee
- Department of Bio & Medical Big Data (BK4), Division of Life Sciences, Research Institute of Natural Sciences (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea.
|
43
|
Piroozmand F, Mohammadipanah F, Sajedi H. Spectrum of deep learning algorithms in drug discovery. Chem Biol Drug Des 2021; 96:886-901. [PMID: 33058458 DOI: 10.1111/cbdd.13674] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/31/2019] [Revised: 02/11/2020] [Accepted: 02/19/2020] [Indexed: 12/16/2022]
Abstract
Deep learning (DL) algorithms are a subset of machine learning algorithms that aim to model complex mappings between a set of elements and their classes. In parallel with advances in revealing the molecular bases of diseases, notable efforts have been made to apply DL to data and library management, reaction optimization, differentiating uncertainties, molecule construction, creating metrics from qualitative results, and predicting structures or interactions. From source identification to lead discovery and the medicinal chemistry of the drug candidate, drug delivery, and modification, the challenges can be subjected to artificial intelligence algorithms to aid in the generation and interpretation of data. Both discovery and design approaches demand automation, large-scale data management, and data fusion as high-throughput methods advance. The application of DL can accelerate the exploration of drug mechanisms, the finding of novel indications for existing drugs (drug repositioning), drug development, and preclinical and clinical studies. The impact of DL on the workflow of drug discovery and design, and on their complementary tools, is highlighted in this review. Additionally, the types of DL algorithms used for this purpose, their pros and cons, and the dominant directions of future research are presented.
Affiliation(s)
- Firoozeh Piroozmand
- Pharmaceutical Biotechnology Lab, Department of Microbiology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
- Fatemeh Mohammadipanah
- Pharmaceutical Biotechnology Lab, Department of Microbiology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
- Hedieh Sajedi
- Department of Computer Science, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
|
44
|
Ye Z, Yang W, Yang Y, Ouyang D. Interpretable machine learning methods for in vitro pharmaceutical formulation development. FOOD FRONTIERS 2021. [DOI: 10.1002/fft2.78] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022] Open
Affiliation(s)
- Zhuyifan Ye
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
- Wenmian Yang
- State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau, China
- Yilong Yang
- School of Software, Beihang University, Beijing, China
- Defang Ouyang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
|
45
|
Machine Learning Attempts for Predicting Human Subcutaneous Bioavailability of Monoclonal Antibodies. Pharm Res 2021; 38:451-460. [PMID: 33710513 DOI: 10.1007/s11095-021-03022-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/01/2021] [Accepted: 02/22/2021] [Indexed: 10/21/2022]
Abstract
PURPOSE One knowledge gap related to subcutaneous (SC) delivery is its unpredictable and variable bioavailability. This study aimed to develop machine learning methods to predict whether a mAb's bioavailability is ≥70% or below, without complete knowledge of the mechanism and causality between inputs and outputs. METHODS A database of mAb SC products was built. Model training and validation were based on this database, and a set of inputs (product properties) was mapped to the output (bioavailability) using different machine learning algorithms. Dimensionality reduction was undertaken using principal component analysis (PCA). RESULTS The bioavailability of the mAb products investigated varied from 35% to 90%. The tree-based methods, including random forest (RF), Adaptive Boost (AdaBoost), and decision tree (DT), showed the best predictability and generalization power in bioavailability classification. The models based on multi-layer perceptron (MLP), Gaussian Naïve Bayes (GaussianNB), and k nearest neighbor (kNN) algorithms also provided acceptable prediction accuracy. CONCLUSION Machine learning could be a potential tool to predict a mAb's bioavailability. Since all input features were acquired using theoretical calculations and predictions rather than experiments, the models may be particularly applicable to early-stage research activities such as mAb molecule triage, design/optimization, mutant screening, molecule selection, and formulation design.
|
46
|
Olasupo SB, Uzairu A, Shallangwa GA, Uba S. Unveiling novel inhibitors of dopamine transporter via in silico drug design, molecular docking, and bioavailability predictions as potential antischizophrenic agents. FUTURE JOURNAL OF PHARMACEUTICAL SCIENCES 2021. [DOI: 10.1186/s43094-021-00198-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022] Open
Abstract
Background
Inhibition of the dopamine transporter is known to play a significant role in the treatment of schizophrenia-related and other mental disorders. In continuation of our previous study, a computational drug design approach, molecular docking simulation, and a pharmacokinetics study were used to identify novel dopamine transporter inhibitors as potential antischizophrenic agents. Thirteen (13) new inhibitors of the dopamine transporter were designed using the molecule with serial number 39 from our previous study as the template, because it exhibits good pharmacological attributes.
Results
Molecular docking simulation results revealed excellent molecular interactions between the protein target (PDB: 4m48) and the ligands (designed inhibitors), with the major interactions involving hydrogen bonding and hydrophobic contacts. Some of the designed inhibitors displayed superior binding affinities, ranging from −10.0 to −10.7 kcal/mol, compared with the reference drug (Lumateperone), which had a binding affinity of −9.7 kcal/mol. Computed physicochemical parameters showed that none of the designed inhibitors, nor the reference drug, violates Lipinski's rule of five, indicating that all the designed inhibitors would be orally bioavailable as potential drug candidates. Similarly, the ADMET/pharmacokinetics evaluations of some designed inhibitors revealed good absorption, distribution, metabolism, and excretion properties, and none of the inhibitors was predicted to be carcinogenic, to inhibit the human ether-a-go-go-related gene (hERG I), or to cause skin sensitization. Likewise, the BOILED-Egg plot indicates that all the designed inhibitors have a high probability of being absorbed by the human gastrointestinal tract and could permeate into the brain. In addition, the predicted bioactivity parameters suggested that all the selected inhibitors would be active as drug candidates. Furthermore, the synthetic accessibility scores for all the selected inhibitors and the reference drug lay within the easy zone (i.e., between 1 and 4), with computed values ranging from 2.55 to 3.92, which implies that all the selected inhibitors would be very easy to synthesize in the laboratory.
Conclusions
Hence, all the designed inhibitors, having shown excellent pharmacokinetic properties and good bioavailability together with remarkable biochemical interactions, could be developed and optimized as novel antischizophrenic agents once further experimental investigations are concluded.
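The Lipinski rule-of-five check referred to in the Results can be illustrated with a minimal sketch, assuming the commonly cited thresholds (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The class, function, and descriptor values below are hypothetical and do not come from the study, which does not say which tool or exact criteria were applied.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    mol_weight: float       # molecular weight in Da
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)

# Hypothetical descriptor values (illustration only).
candidate = Descriptors(mol_weight=393.5, log_p=3.9, h_bond_donors=1, h_bond_acceptors=4)
oversized = Descriptors(mol_weight=812.0, log_p=6.3, h_bond_donors=2, h_bond_acceptors=9)

for name, mol in [("candidate", candidate), ("oversized", oversized)]:
    print(name, "violations:", lipinski_violations(mol))
```

Returning a violation count rather than a pass/fail flag lets the caller choose how strict to be, for example requiring zero violations as reported in the abstract.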
|
47
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. BIOTECHNOL BIOPROC E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data, such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized. In addition, the review provides an in-depth analysis of the remaining challenges and limitations, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
|
48
|
Mathew S, Tess D, Burchett W, Chang G, Woody N, Keefer C, Orozco C, Lin J, Jordan S, Yamazaki S, Jones R, Di L. Evaluation of Prediction Accuracy for Volume of Distribution in Rat and Human Using In Vitro, In Vivo, PBPK and QSAR Methods. J Pharm Sci 2020; 110:1799-1823. [PMID: 33338491 DOI: 10.1016/j.xphs.2020.12.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/13/2020] [Revised: 11/17/2020] [Accepted: 12/03/2020] [Indexed: 10/22/2022]
Abstract
Volume of distribution at steady state (Vss) is an important pharmacokinetic parameter of a drug candidate. In this study, Vss prediction accuracy was evaluated by using: (1) seven methods for rat with 56 compounds, (2) four methods for human with 1276 compounds, and (3) four in vivo methods and three Kp (partition coefficient) scalar methods from scaling of three preclinical species with 125 compounds. The results showed that the global QSAR models outperformed the PBPK methods. The tissue fraction unbound (fu,t) method with adipose and muscle also provided high Vss prediction accuracy. Overall, the high-performing methods for human Vss prediction are the global QSAR models, the Øie-Tozer and equivalency methods from scaling of preclinical species, and PBPK methods with Kp scalars from preclinical species. Certain input parameter ranges rendered PBPK models inaccurate due to mass balance issues. These were addressed using appropriate theoretical limit checks. The prediction accuracy of tissue Kp was also examined. The fu,t method predicted Kp values more accurately than the PBPK methods for adipose, heart, and muscle. All the methods overpredicted brain Kp and underpredicted liver Kp due to transporter effects. Successful Vss prediction involves strategic integration of in silico, in vitro, and in vivo approaches.
Affiliation(s)
- Shibin Mathew
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Cambridge, MA 02139, USA
- David Tess
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Cambridge, MA 02139, USA
- Woodrow Burchett
- Early Clinical Development, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- George Chang
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- Nathaniel Woody
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- Christopher Keefer
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- Christine Orozco
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- Jian Lin
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- Samantha Jordan
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA
- Shinji Yamazaki
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, San Diego, CA 92121, USA
- Rhys Jones
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, San Diego, CA 92121, USA
- Li Di
- Pharmacokinetics, Dynamics and Metabolism, Pfizer Worldwide Research and Development, Groton, CT 06340, USA.
|
49
|
Abstract
Artificial intelligence (AI), and machine learning in particular, have gained significant interest in many fields, including pharmaceutical sciences. The enormous growth of data from several sources, recent advances in various analytical tools, and continuous developments in machine learning algorithms have resulted in a rapid increase in new machine learning applications in different areas of pharmaceutical sciences. This review summarizes the past, present, and potential future impacts of machine learning technologies on different areas of pharmaceutical sciences, including drug design and discovery, preformulation, and formulation. The machine learning methods commonly used in pharmaceutical sciences are discussed, with a specific emphasis on artificial neural networks due to their capability to model the nonlinear relationships that are commonly encountered in pharmaceutical research. AI and machine learning technologies for common day-to-day pharma needs, as well as industrial and regulatory insights, are reviewed. The potential of digital technologies that use machine learning to deliver more efficient, faster, and more economical solutions in pharmaceutical sciences, beyond their traditional uses, is also discussed.
|
50
|
Cai C, Wang S, Xu Y, Zhang W, Tang K, Ouyang Q, Lai L, Pei J. Transfer Learning for Drug Discovery. J Med Chem 2020; 63:8683-8694. [PMID: 32672961 DOI: 10.1021/acs.jmedchem.9b02147] [Citation(s) in RCA: 162] [Impact Index Per Article: 32.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
Abstract
The data sets available to train models for in silico drug discovery efforts are often small. Indeed, the sparse availability of labeled data is a major barrier to artificial-intelligence-assisted drug discovery. One solution to this problem is to develop algorithms that can cope with relatively heterogeneous and scarce data. Transfer learning is a type of machine learning that can leverage existing, generalizable knowledge from other related tasks to enable learning of a separate task with a small set of data. Deep transfer learning is the most commonly used type of transfer learning in the field of drug discovery. This Perspective provides an overview of transfer learning and related applications to drug discovery to date. Furthermore, it provides outlooks on the future development of transfer learning for drug discovery.
Affiliation(s)
- Chenjing Cai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
| | - Shiwei Wang
- PTN Graduate Program, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
| | - Youjun Xu
- BNLMS and Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, P. R. China
| | - Weilin Zhang
- Beijing Intelligent Pharma Technology Co., Ltd., Beijing 100083, P. R. China
| | - Ke Tang
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, P. R. China
| | - Qi Ouyang
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China; The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, P. R. China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China; BNLMS and Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, P. R. China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
| |
|