101
Zhu Q, Gao S, Xiao B, He Z, Hu S. Plasmer: an Accurate and Sensitive Bacterial Plasmid Prediction Tool Based on Machine Learning of Shared k-mers and Genomic Features. Microbiol Spectr 2023; 11:e0464522. [PMID: 37191574 PMCID: PMC10269668 DOI: 10.1128/spectrum.04645-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 04/26/2023] [Indexed: 05/17/2023] Open
Abstract
Identification of plasmids in bacterial genomes is important in many contexts, including horizontal gene transfer, antibiotic resistance genes, host-microbe interactions, cloning vectors, and industrial production. There are several in silico methods to predict plasmid sequences in assembled genomes. However, existing methods have evident shortcomings, such as an imbalance between sensitivity and specificity, dependency on species-specific models, and reduced performance on sequences shorter than 10 kb, which have limited their scope of applicability. In this work, we proposed Plasmer, a novel plasmid predictor based on machine learning of shared k-mers and genomic features. Unlike existing k-mer or genomic-feature based methods, Plasmer employs the random forest algorithm to make predictions using the percent of shared k-mers with plasmid and chromosome databases combined with other genomic features, including alignment E value and replicon distribution scores (RDS). Plasmer can predict on multiple species and has achieved an average area under the curve (AUC) of 0.996 with an accuracy of 98.4%. Compared to existing methods, tests on sliding sequences, simulated assemblies, and de novo assemblies have consistently shown that Plasmer has superior accuracy and stable performance across long and short contigs above 500 bp, demonstrating its applicability to fragmented assemblies. Plasmer also has excellent and balanced performance on both sensitivity and specificity (both >0.95 above 500 bp) with the highest F1-score, which has eliminated the bias toward sensitivity or specificity that was common in existing methods. Plasmer also provides taxonomy classification to help identify the origin of plasmids.
IMPORTANCE In this study, we proposed a novel plasmid prediction tool named Plasmer. Technically, unlike existing k-mer or genomic features-based methods, Plasmer is the first tool to combine the advantages of the percent of shared k-mers and the alignment score of genomic features. This has given Plasmer (i) evident improvement in performance compared to other methods, with the best F1-score and accuracy on sliding sequences, simulated contigs, and de novo assemblies; (ii) applicability to contigs above 500 bp with the highest accuracy, enabling plasmid prediction in fragmented short-read assemblies; (iii) excellent and balanced performance between sensitivity and specificity (both >0.95 above 500 bp) with the highest F1-score, which eliminated the bias toward sensitivity or specificity that commonly existed in other methods; and (iv) no dependency on species-specific training models. We believe that Plasmer provides a more reliable alternative for plasmid prediction in bacterial genome assemblies.
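The "percent of shared k-mers" feature described in this abstract can be illustrated with a minimal sketch. It is not Plasmer's implementation: the k value, the toy sequences, and the helper names `kmers` and `percent_shared` are assumptions made here for illustration, and the random forest step and the other genomic features (alignment E value, RDS) are omitted.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def percent_shared(query, reference_kmers, k):
    """Percent of the query's distinct k-mers that also occur in the reference set."""
    q = kmers(query, k)
    if not q:
        # Query shorter than k: no k-mers to compare.
        return 0.0
    return 100.0 * len(q & reference_kmers) / len(q)


# Toy "databases" built from hypothetical sequences, not real plasmid or chromosome data.
k = 4
plasmid_db = kmers("ATGCGTACGTTAGC", k)
chromosome_db = kmers("GGGCCCAAATTTGGGCCC", k)

# A contig that is a prefix of the toy plasmid sequence.
contig = "ATGCGTACGT"

# Two-element feature vector: [% shared with plasmid DB, % shared with chromosome DB].
features = [
    percent_shared(contig, plasmid_db, k),
    percent_shared(contig, chromosome_db, k),
]
print(features)
```

In a pipeline of the kind the abstract describes, a vector like `features` would be combined with further genomic features and passed to a trained classifier; that part is left out because the abstract does not specify it.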
Affiliation(s)
- Qianhui Zhu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Shenghan Gao
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Binghan Xiao
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Sino-Danish College, University of Chinese Academy of Sciences, Beijing, China
- Zilong He
- School of Engineering Medicine, Beihang University, Beijing, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Interdisciplinary Innovation Institute of Medicine and Engineering, Beihang University, Beijing, China
- Songnian Hu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Sino-Danish College, University of Chinese Academy of Sciences, Beijing, China
102
Bo T, Lin Y, Han J, Hao Z, Liu J. Machine learning-assisted data filtering and QSAR models for prediction of chemical acute toxicity on rat and mouse. Journal of Hazardous Materials 2023; 452:131344. [PMID: 37027914 DOI: 10.1016/j.jhazmat.2023.131344] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/07/2022] [Revised: 03/20/2023] [Accepted: 03/31/2023] [Indexed: 05/03/2023]
Abstract
Machine learning (ML) methods provide a new opportunity to build quantitative structure-activity relationship (QSAR) models for predicting chemicals' toxicity based on large toxicity data sets, but they are limited by insufficient model robustness due to poor data set quality for chemicals with certain structures. To address this issue and improve model robustness, we built a large data set on rat oral acute toxicity for thousands of chemicals, then used ML to filter chemicals favorable for regression models (CFRM). In comparison to chemicals not favorable for regression models (CNRM), CFRM accounted for 67% of chemicals in the original data set, and had a higher structural similarity and a narrower toxicity distribution, within 2-4 log10 (mg/kg). The performance of established regression models for CFRM was greatly improved, with root-mean-square errors (RMSE) in the range of 0.45-0.48 log10 (mg/kg). Classification models were built for CNRM using all chemicals in the original data set, and the area under the receiver operating characteristic curve (AUROC) reached 0.75-0.76. The proposed strategy was successfully applied to a mouse oral acute toxicity data set, yielding RMSE and AUROC in the range of 0.36-0.38 log10 (mg/kg) and 0.79, respectively.
Affiliation(s)
- Tao Bo
- School of Environment, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China; State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing 100085, China
- Yaohui Lin
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing 100085, China; Key Laboratory for Analytical Science of Food Safety and Biology of MOE, Fujian Provincial Key Lab of Analysis and Detection for Food Safety, College of Chemistry, Fuzhou University, Fuzhou, Fujian 350116, China
- Jinglong Han
- State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology Shenzhen, Shenzhen 518055, China
- Zhineng Hao
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing 100085, China.
- Jingfu Liu
- School of Environment, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China; State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing 100085, China.
103
Lunghini F, Fava A, Pisapia V, Sacco F, Iaconis D, Beccari AR. ProfhEX: AI-based platform for small molecules liability profiling. J Cheminform 2023; 15:60. [PMID: 37296454 DOI: 10.1186/s13321-023-00728-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 05/28/2023] [Indexed: 06/12/2023] Open
Abstract
Off-target drug interactions are a major reason for candidate failure in the drug discovery process. Anticipating a drug's potential adverse effects in the early stages is necessary to minimize health risks to patients, animal testing, and economic costs. With the constantly increasing size of virtual screening libraries, AI-driven methods can be exploited as first-tier screening tools to provide liability estimation for drug candidates. In this work we present ProfhEX, an AI-driven suite of 46 OECD-compliant machine learning models that can profile small molecules on 7 relevant liability groups: cardiovascular, central nervous system, gastrointestinal, endocrine, renal, pulmonary and immune system toxicities. Experimental affinity data were collected from public and commercial data sources. The entire chemical space comprised 289,202 activity data points for a total of 210,116 unique compounds, spanning 46 targets with dataset sizes ranging from 819 to 18,896. Gradient boosting and random forest algorithms were initially employed and ensembled for the selection of a champion model. Models were validated according to the OECD principles, including robust internal (cross validation, bootstrap, y-scrambling) and external validation. Champion models achieved an average Pearson correlation coefficient of 0.84 (SD of 0.05), an R2 determination coefficient of 0.68 (SD of 0.1) and a root mean squared error of 0.69 (SD of 0.08). All liability groups showed good hit-detection power with an average enrichment factor at 5% of 13.1 (SD of 4.5) and AUC of 0.92 (SD of 0.05). Benchmarking against existing tools demonstrated the predictive power of ProfhEX models for large-scale liability profiling. This platform will be further expanded with the inclusion of new targets and through complementary modelling approaches, such as structure and pharmacophore-based models. ProfhEX is freely accessible at the following address: https://profhex.exscalate.eu/ .
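The "enrichment factor at 5%" reported in this abstract is commonly defined as the hit rate among the top-ranked 5% of compounds divided by the hit rate in the whole set. The sketch below follows that common definition; whether ProfhEX computes it in exactly this way is an assumption, and the function name `enrichment_factor` and the toy data are illustrative only.

```python
def enrichment_factor(scores, labels, fraction=0.05):
    """Enrichment factor: hit rate in the top `fraction` of ranked compounds
    divided by the hit rate in the full set. `labels` are 1 (active) or 0."""
    n = len(scores)
    total_hits = sum(labels)
    if n == 0 or total_hits == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, int(round(n * fraction)))
    # Rank compounds by predicted score, best first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # (hits_top / n_top) / (total_hits / n), arranged to limit rounding error.
    return (hits_top * n) / (n_top * total_hits)


# Toy example: 100 compounds, 10 actives, 4 of them ranked in the top 5.
scores = list(range(100, 0, -1))              # already in descending order
labels = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89
ef5 = enrichment_factor(scores, labels, fraction=0.05)
print(ef5)
```

With this definition, a value of 1 means the ranking is no better than random selection, and the maximum possible value at 5% is 20.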
Affiliation(s)
- Filippo Lunghini
- EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123, Naples, Italy
- Anna Fava
- EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123, Naples, Italy
- Vincenzo Pisapia
- Professional Service Department, SAS Institute, Via Darwin 20/22, 20143, Milan, Italy
- Francesco Sacco
- Professional Service Department, SAS Institute, Via Darwin 20/22, 20143, Milan, Italy
- Daniela Iaconis
- EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123, Naples, Italy
104
Hou R, Xie C, Gui Y, Li G, Li X. Machine-Learning-Based Data Analysis Method for Cell-Based Selection of DNA-Encoded Libraries. ACS Omega 2023; 8:19057-19071. [PMID: 37273617 PMCID: PMC10233830 DOI: 10.1021/acsomega.3c02152] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/06/2023]
Abstract
DNA-encoded library (DEL) is a powerful ligand discovery technology that has been widely adopted in the pharmaceutical industry. DEL selections are typically performed with a purified protein target immobilized on a matrix or in solution phase. Recently, DELs have also been used to interrogate the targets in the complex biological environment, such as membrane proteins on live cells. However, due to the complex landscape of the cell surface, the selection inevitably involves significant nonspecific interactions, and the selection data are much noisier than the ones with purified proteins, making reliable hit identification highly challenging. Researchers have developed several approaches to denoise DEL datasets, but it remains unclear whether they are suitable for cell-based DEL selections. Here, we report the proof-of-principle of a new machine-learning (ML)-based approach to process cell-based DEL selection datasets by using a Maximum A Posteriori (MAP) estimation loss function, a probabilistic framework that can account for and quantify uncertainties of noisy data. We applied the approach to a DEL selection dataset, where a library of 7,721,415 compounds was selected against a purified carbonic anhydrase 2 (CA-2) and a cell line expressing the membrane protein carbonic anhydrase 12 (CA-12). The extended-connectivity fingerprint (ECFP)-based regression model using the MAP loss function was able to identify true binders and also reliable structure-activity relationship (SAR) from the noisy cell-based selection datasets. In addition, the regularized enrichment metric (known as MAP enrichment) could also be calculated directly without involving the specific machine-learning model, effectively suppressing low-confidence outliers and enhancing the signal-to-noise ratio. Future applications of this method will focus on de novo ligand discovery from cell-based DEL selections.
Affiliation(s)
- Rui Hou
- Department of Chemistry and State Key Laboratory of Synthetic Chemistry, The University of Hong Kong, Hong Kong SAR, China
- Laboratory for Synthetic Chemistry and Chemical Biology Limited, Health@InnoHK, Innovation and Technology Commission, Hong Kong SAR, China
- Chao Xie
- Department of Chemistry and State Key Laboratory of Synthetic Chemistry, The University of Hong Kong, Hong Kong SAR, China
- Yuhan Gui
- Department of Chemistry and State Key Laboratory of Synthetic Chemistry, The University of Hong Kong, Hong Kong SAR, China
- Gang Li
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Xiaoyu Li
- Department of Chemistry and State Key Laboratory of Synthetic Chemistry, The University of Hong Kong, Hong Kong SAR, China
- Laboratory for Synthetic Chemistry and Chemical Biology Limited, Health@InnoHK, Innovation and Technology Commission, Hong Kong SAR, China
105
Guha R, Velegol D. Harnessing Shannon entropy-based descriptors in machine learning models to enhance the prediction accuracy of molecular properties. J Cheminform 2023; 15:54. [PMID: 37211605 DOI: 10.1186/s13321-023-00712-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/07/2022] [Accepted: 03/18/2023] [Indexed: 05/23/2023] Open
Abstract
Accurate prediction of molecular properties is essential in the screening and development of drug molecules and other functional materials. Traditionally, property-specific molecular descriptors are used in machine learning models. This in turn requires the identification and development of target or problem-specific descriptors. Additionally, an increase in the prediction accuracy of the model is not always feasible from the standpoint of targeted descriptor usage. We explored the accuracy and generalizability issues using a framework of Shannon entropies, based on SMILES, SMARTS and/or InChIKey strings of the respective molecules. Using various public databases of molecules, we showed that the prediction accuracy of machine learning models could be significantly enhanced simply by using Shannon entropy-based descriptors evaluated directly from SMILES. Analogous to partial pressures and total pressure of gases in a mixture, we used atom-wise fractional Shannon entropy in combination with total Shannon entropy from the respective tokens of the string representation to model the molecule efficiently. The proposed descriptor was competitive in performance with standard descriptors such as Morgan fingerprints and SHED in regression models. Additionally, we found that either a hybrid descriptor set containing the Shannon entropy-based descriptors or an optimized ensemble architecture of multilayer perceptrons and graph neural networks using the Shannon entropies was synergistic in improving the prediction accuracy. This simple approach of coupling the Shannon entropy framework to other standard descriptors and/or using it in ensemble models could find applications in boosting the performance of molecular property predictions in chemistry and material science.
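As a rough illustration of the idea in this abstract, the Shannon entropy of a SMILES string can be computed from the frequencies of its tokens. The sketch below makes two assumptions that the abstract does not confirm: it tokenizes character by character (which is only adequate for simple SMILES such as "CCO"), and it takes the "fractional" entropy of a token to be its term -p*log2(p) in the total. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical distribution of tokens."""
    counts = Counter(tokens)
    n = sum(counts.values())
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def fractional_entropies(tokens):
    """Per-token contributions -p*log2(p); they sum to the total entropy.
    Treating these as 'fractional' entropies is an assumption of this sketch."""
    counts = Counter(tokens)
    n = sum(counts.values())
    return {t: -(c / n) * math.log2(c / n) for t, c in counts.items()}


# Character-level tokenization of ethanol's SMILES (an assumption; real
# tokenizers treat multi-character atoms such as "Cl" or "Br" as one token).
smiles = "CCO"
tokens = list(smiles)

total = shannon_entropy(tokens)
parts = fractional_entropies(tokens)
print(total, parts)
```

Here the total plays the role of the "total pressure" in the abstract's analogy and the per-token terms play the role of "partial pressures"; a descriptor vector could concatenate both.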
Affiliation(s)
- Rajarshi Guha
- Intel Corporation, 2501 NE Century Blvd, Hillsboro, OR, 97124, USA.
- Darrell Velegol
- Department of Chemical Engineering, Pennsylvania State University, University Park, PA, 16802, USA
106
Kao PY, Yang YC, Chiang WY, Hsiao JY, Cao Y, Aliper A, Ren F, Aspuru-Guzik A, Zhavoronkov A, Hsieh MH, Lin YC. Exploring the Advantages of Quantum Generative Adversarial Networks in Generative Chemistry. J Chem Inf Model 2023. [PMID: 37171372 DOI: 10.1021/acs.jcim.3c00562] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 05/13/2023]
Abstract
De novo drug design with desired biological activities is crucial for developing novel therapeutics for patients. The drug development process is time- and resource-consuming, and it has a low probability of success. Recent advances in machine learning and deep learning technology have reduced the time and cost of the discovery process and have therefore improved pharmaceutical research and development. In this paper, we explore the combination of two rapidly developing fields with lead candidate discovery in the drug development process. First, artificial intelligence has already been demonstrated to successfully accelerate conventional drug design approaches. Second, quantum computing has demonstrated promising potential in different applications, such as quantum chemistry, combinatorial optimization, and machine learning. This article explores hybrid quantum-classical generative adversarial networks (GAN) for small molecule discovery. We substituted each element of the GAN with a variational quantum circuit (VQC) and demonstrated the quantum advantages in small-molecule drug discovery. Utilizing a VQC in the noise generator of a GAN to generate small molecules achieves better physicochemical properties and performance in the goal-directed benchmark than the classical counterpart. Moreover, we demonstrate the potential of a VQC with only tens of learnable parameters in the generator of the GAN to generate small molecules. We also demonstrate the quantum advantage of a VQC in the discriminator of the GAN. In this hybrid model, the number of learnable parameters is significantly smaller than in the classical one, and it can still generate valid molecules. The hybrid model with only tens of training parameters in the quantum discriminator outperforms the MLP-based one in terms of both generated molecule properties and the achieved KL divergence. However, the hybrid quantum-classical GANs still face challenges in generating unique and valid molecules compared to their classical counterparts.
Affiliation(s)
- Po-Yu Kao
- Insilico Medicine Taiwan Ltd., Taipei 110208, Taiwan
- Ya-Chu Yang
- Insilico Medicine Taiwan Ltd., Taipei 110208, Taiwan
- Wei-Yin Chiang
- Hon Hai (Foxconn) Research Institute, Taipei 114699, Taiwan
- Jen-Yueh Hsiao
- Hon Hai (Foxconn) Research Institute, Taipei 114699, Taiwan
- Yudong Cao
- Zapata Computing, Inc., Boston, Massachusetts 02110, United States
- Alex Aliper
- Insilico Medicine AI Limited, Masdar City, Abu Dhabi 145748, UAE
- Feng Ren
- Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research, Toronto, ON M5S 1M1, Canada
- Min-Hsiu Hsieh
- Hon Hai (Foxconn) Research Institute, Taipei 114699, Taiwan
- Yen-Chu Lin
- Insilico Medicine Taiwan Ltd., Taipei 110208, Taiwan
- Department of Pharmacy, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
107
Li X, Wang H, Jiang M, Ding M, Xu X, Xu B, Zou Y, Yu Y, Yang W. Collision Cross Section Prediction Based on Machine Learning. Molecules 2023; 28:4050. [PMID: 37241791 DOI: 10.3390/molecules28104050] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/13/2023] [Revised: 05/10/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023] Open
Abstract
Ion mobility-mass spectrometry (IM-MS) is a powerful technique that provides an additional dimension of separation, supporting the enhanced separation and characterization of complex components from the tissue metabolome and medicinal herbs. The integration of machine learning (ML) with IM-MS can overcome the barrier posed by the lack of reference standards, promoting the creation of a large number of proprietary collision cross section (CCS) databases, which help to achieve the rapid, comprehensive, and accurate characterization of the contained chemical components. In this review, advances in CCS prediction using ML in the past 2 decades are summarized. The advantages of ion mobility-mass spectrometers and the commercially available ion mobility technologies with different principles (e.g., time dispersive, confinement and selective release, and space dispersive) are introduced and compared. The general procedures involved in CCS prediction based on ML (acquisition and optimization of the independent and dependent variables, model construction and evaluation, etc.) are highlighted. In addition, quantum chemistry, molecular dynamics, and CCS theoretical calculations are also described. Finally, the applications of CCS prediction in metabolomics, natural products, foods, and other research fields are discussed.
Affiliation(s)
- Xiaohang Li
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Hongda Wang
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Meiting Jiang
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Mengxiang Ding
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Xiaoyan Xu
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Bei Xu
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Yadan Zou
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Yuetong Yu
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Wenzhi Yang
- State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
108
Nemoto S, Mizuno T, Kusuhara H. Investigation of chemical structure recognition by encoder-decoder models in learning progress. J Cheminform 2023; 15:45. [PMID: 37046349 PMCID: PMC10100163 DOI: 10.1186/s13321-023-00713-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 03/18/2023] [Indexed: 04/14/2023] Open
Abstract
Descriptor generation methods using latent representations of encoder-decoder (ED) models with SMILES as input are useful because of the continuity of descriptor and restorability to the structure. However, it is not clear how the structure is recognized in the learning progress of ED models. In this work, we created ED models of various learning progress and investigated the relationship between structural information and learning progress. We showed that compound substructures were learned early in ED models by monitoring the accuracy of downstream tasks and input-output substructure similarity using substructure-based descriptors, which suggests that existing evaluation methods based on the accuracy of downstream tasks may not be sensitive enough to evaluate the performance of ED models with SMILES as descriptor generation methods. On the other hand, we showed that structure restoration was time-consuming, and in particular, insufficient learning led to the estimation of a larger structure than the actual one. It can be inferred that determining the endpoint of the structure is a difficult task for the model. To our knowledge, this is the first study to link the learning progress of SMILES by ED model to chemical structures for a wide range of chemicals.
Affiliation(s)
- Shumpei Nemoto
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
- Tadahaya Mizuno
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan.
- Hiroyuki Kusuhara
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
109
Gholampour M, Seradj H, Sakhteman A. Structure-Selectivity Relationship Prediction of Tau Imaging Tracers Using Machine Learning-Assisted QSAR Models and Interaction Fingerprint Map. ACS Chem Neurosci 2023. [PMID: 37037183 DOI: 10.1021/acschemneuro.3c00038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2023] Open
Abstract
Protein aggregates composed of tau fibrils are major pathologic findings in different tauopathies. An ideal agent for imaging tau fibrils must be highly selective. The molecular basis for the binding of currently available compounds to tau aggregates is not well understood. Herein, we provide insights into previously studied positron emission tomography tracers using various computational methods, including machine learning-based quantitative structure-activity relationship (QSAR) classification, docking, and molecular dynamics (MD) simulations, to investigate the structural basis of selective tau aggregate binding for potential compounds. The QSAR classification model based on the Random Forest algorithm, with an accuracy of 96.6% for the selective and 97.6% for the nonselective class of compounds, revealed essential selective moieties. The combination of molecular docking, MD simulations, and molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) binding free-energy calculation showed superior binding energy toward tau and PHF6, a key hexapeptide in tau aggregation, for ligand 63, the most selective compound in the data set. Dissecting the binding properties of ligand 63 and ligand 8 (the least selective compound) within tau and Aβ structures confirmed that these two compounds favor different binding sites of tau; however, the preferential binding site in Aβ was similar for both, with lower binding energies calculated for ligand 8. Results revealed that the number of N-heterocycles, the position of nitrogen atoms, and the presence of a tertiary amine are important components of selective binding moieties, and they should be maintained in molecules for selective binding to tau aggregates. The predicted structure-selectivity relationship will facilitate the rational design and further development of selective tau imaging agents.
Affiliation(s)
- Maryam Gholampour
- Department of Medicinal Chemistry, Faculty of Pharmacy, Shiraz University of Medical Sciences, Shiraz 71468-64685, Iran
- Hassan Seradj
- Department of Medicinal Chemistry, Faculty of Pharmacy, Shiraz University of Medical Sciences, Shiraz 71468-64685, Iran
- Amirhossein Sakhteman
- Chair of Proteomics and Bioanalytics, Technical University of Munich (TUM), Freising 85354, Germany
110
Zhang T, Mo Q, Jiang N, Wu Y, Yang X, Chen W, Li Q, Yang S, Yang J, Zeng J, Huang F, Huang Q, Luo J, Wu J, Wang L. The combination of machine learning and transcriptomics reveals a novel megakaryopoiesis inducer, MO-A, that promotes thrombopoiesis by activating FGF1/FGFR1/PI3K/Akt/NF-κB signaling. Eur J Pharmacol 2023; 944:175604. [PMID: 36804544 DOI: 10.1016/j.ejphar.2023.175604] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/29/2022] [Revised: 01/20/2023] [Accepted: 02/16/2023] [Indexed: 02/19/2023]
Abstract
Radiation-induced thrombocytopenia (RIT) occurs widely and causes high mortality and morbidity in cancer patients who receive radiotherapy. However, specific drugs for treating RIT remain woefully inadequate. Here, we first developed a drug screening model using naive Bayes, a machine learning (ML) algorithm, to virtually screen the active compounds promoting megakaryopoiesis and thrombopoiesis. A natural product library was screened by the model, and methylophiopogonanone A (MO-A) was identified as the most active compound. The activity of MO-A was then validated in vitro and showed that MO-A could markedly induce megakaryocyte (MK) differentiation of K562 and Meg-01 cells in a concentration-dependent manner. Furthermore, the therapeutic action of MO-A on RIT was evaluated, and MO-A significantly accelerated platelet level recovery, platelet activation, megakaryopoiesis, MK differentiation in RIT mice. Moreover, RNA-sequencing (RNA-seq) indicated that the PI3K cascade was closely related to MK differentiation induced by MO-A. Finally, experimental verification demonstrated that MO-A obviously induced the expression of FGF1 and FGFR1, and increased the phosphorylation of PI3K, Akt and NF-κB. Blocking FGFR1 with its inhibitor dovitinib suppressed MO-A-induced MK differentiation, and PI3K, Akt and NF-κB phosphorylation. Similarly, inhibition of PI3K-Akt signal pathway by its inhibitor LY294002 suppressed MK differentiation, and PI3K, Akt and NF-κB phosphorylation induced by MO-A. Taken together, our study provides an efficient drug discovery strategy for hematological diseases, and demonstrates that MO-A is a novel countermeasure for treating RIT through activation of the FGF1/FGFR1/PI3K/Akt/NF-κB signaling pathway.
Affiliation(s)
- Ting Zhang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Qi Mo
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Nan Jiang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Yuesong Wu
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Xin Yang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Wang Chen
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Qinyao Li
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Shuo Yang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Jing Yang
- Department of Pharmacy, Chengdu Fifth People's Hospital, Chengdu University of Traditional Chinese Medicine, Chengdu, Sichuan, 611137, China
| | - Jing Zeng
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Feihong Huang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Qianqian Huang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China
| | - Jiesi Luo
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China.
| | - Jianming Wu
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China; School of Basic Medical Sciences, Southwest Medical University, Luzhou, Sichuan, 646000, China; Education Ministry Key Laboratory of Medical Electrophysiology, Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, Southwest Medical University, Luzhou, Sichuan, 646000, China.
| | - Long Wang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, Sichuan, 646000, China.
| |
|
111
|
Jaramillo DN, Millán D, Guevara-Pulido J. Design, synthesis and cytotoxic evaluation of a selective serotonin reuptake inhibitor (SSRI) by virtual screening. Eur J Pharm Sci 2023; 183:106403. [PMID: 36758772 DOI: 10.1016/j.ejps.2023.106403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 01/24/2023] [Accepted: 02/06/2023] [Indexed: 02/11/2023]
Abstract
Depression is one of the most common mental illnesses, affecting almost 300 million people. According to the WHO, depression is one of the world's leading causes of disability and morbidity. People with this illness require both psychological and pharmaceutical treatment because severe depressive episodes often result in suicide. Selective serotonin reuptake inhibitors (SSRI) are widely used antidepressants that target the human serotonin transporter (hSERT). The crystallization of hSERT and the experimental data available allow cost and time-efficient computational tools like virtual screening (VS) to be utilized in the development of therapeutic agents. Here, we synthesized, characterized, and evaluated the biological activity of a novel SSRI analog of paroxetine, rationally designed by applying an artificial neural network-based QSAR model and a molecular docking analysis on hSERT. The N-substituted analog 18a showed higher affinity for the transporter (-10.2 kcal/mol), lower Ki value (1.19 nM) and a safer toxicological profile than paroxetine and was synthesized with a 71% yield. The in vitro cytotoxicity of the analog was evaluated using human glioblastoma (U87 MG), human neuroblastoma (SH SY5Y) and murine fibroblast (L929) cell lines. Also, the hemolytic ability of the compound was assessed on human erythrocytes. Results showed that analog 18a did not exhibit cytotoxic activity on the cell lines used and has no hemolytic activity at any of the concentrations tested, whereas with paroxetine, hemolysis was observed at 2.3, 1.29, and 0.67 mM. Based on these results, it is possible to suggest that analog 18a could be a promising new SSRI candidate for the treatment of this illness.
Affiliation(s)
- Deissy N Jaramillo
- INQA, Applied Chemistry Research Group - Faculty of Chemistry, Universidad El Bosque, Bogotá, Colombia
| | - Diana Millán
- GIBAT, Basic and Translational Research Group - Faculty of Medicine, Universidad El Bosque, Bogotá, Colombia
| | - James Guevara-Pulido
- INQA, Applied Chemistry Research Group - Faculty of Chemistry, Universidad El Bosque, Bogotá, Colombia.
| |
|
112
|
Lien ST, Lin TE, Hsieh JH, Sung TY, Chen JH, Hsu KC. Establishment of extensive artificial intelligence models for kinase inhibitor prediction: Identification of novel PDGFRB inhibitors. Comput Biol Med 2023; 156:106722. [PMID: 36878123 DOI: 10.1016/j.compbiomed.2023.106722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 02/16/2023] [Accepted: 02/26/2023] [Indexed: 03/06/2023]
Abstract
Identifying hit compounds is an important step in drug development. Unfortunately, this process continues to be a challenging task. Several machine learning models have been generated to aid in simplifying and improving the prediction of candidate compounds. Models tuned for predicting kinase inhibitors have been established. However, an effective model can be limited by the size of the chosen training dataset. In this study, we tested several machine learning models to predict potential kinase inhibitors. A dataset was curated from a number of publicly available repositories. This resulted in a comprehensive dataset covering more than half of the human kinome. More than 2,000 kinase models were established using different model approaches. The performances of the models were compared, and the Keras-MLP model was determined to be the best performing model. The model was then used to screen a chemical library for potential inhibitors targeting platelet-derived growth factor receptor-β (PDGFRB). Several PDGFRB candidates were selected, and in vitro assays confirmed four compounds with PDGFRB inhibitory activity and IC50 values in the nanomolar range. These results show the effectiveness of machine learning models trained on the reported dataset. This report would aid in the establishment of machine learning models as well as in the discovery of novel kinase inhibitors.
Affiliation(s)
- Ssu-Ting Lien
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
| | - Tony Eight Lin
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
| | - Jui-Hua Hsieh
- Division of Translational Toxicology, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, NC, USA
| | - Tzu-Ying Sung
- Biomedical Translation Research Center, Academia Sinica, Taipei, Taiwan
| | - Jun-Hong Chen
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
| | - Kai-Cheng Hsu
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Ph.D. Program in Drug Discovery and Development Industry, College of Pharmacy, Taipei Medical University, Taipei, Taiwan; Cancer Center, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan; TMU Research Center of Cancer Translational Medicine, Taipei Medical University, Taipei, Taiwan; TMU Research Center of Drug Discovery, Taipei Medical University, Taipei, Taiwan.
| |
|
113
|
Verhaegen F, Butterworth KT, Chalmers AJ, Coppes RP, de Ruysscher D, Dobiasch S, Fenwick JD, Granton PV, Heijmans SHJ, Hill MA, Koumenis C, Lauber K, Marples B, Parodi K, Persoon LCGG, Staut N, Subiel A, Vaes RDW, van Hoof S, Verginadis IL, Wilkens JJ, Williams KJ, Wilson GD, Dubois LJ. Roadmap for precision preclinical x-ray radiation studies. Phys Med Biol 2023; 68:06RM01. [PMID: 36584393 DOI: 10.1088/1361-6560/acaf45] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/12/2022] [Accepted: 12/30/2022] [Indexed: 12/31/2022]
Abstract
This Roadmap paper covers the field of precision preclinical x-ray radiation studies in animal models. It is mostly focused on models for cancer and normal tissue response to radiation, but also discusses other disease models. The recent technological evolution in imaging, irradiation, dosimetry and monitoring that has empowered these kinds of studies is discussed, and many developments in the near future are outlined. Finally, clinical translation and reverse translation are discussed.
Affiliation(s)
- Frank Verhaegen
- MAASTRO Clinic, Radiotherapy Division, GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
- SmART Scientific Solutions BV, Maastricht, The Netherlands
| | - Karl T Butterworth
- Patrick G. Johnston Centre for Cancer Research, Queen's University Belfast, Belfast, Northern Ireland, United Kingdom
| | - Anthony J Chalmers
- School of Cancer Sciences, University of Glasgow, Glasgow G61 1QH, United Kingdom
| | - Rob P Coppes
- Departments of Biomedical Sciences of Cells & Systems, Section Molecular Cell Biology and Radiation Oncology, University Medical Center Groningen, University of Groningen, 9700 AD Groningen, The Netherlands
| | - Dirk de Ruysscher
- MAASTRO Clinic, Radiotherapy Division, GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | - Sophie Dobiasch
- Department of Radiation Oncology, Technical University of Munich (TUM), School of Medicine and Klinikum rechts der Isar, Germany
- Department of Medical Physics, Institute of Radiation Medicine (IRM), Department of Radiation Sciences (DRS), Helmholtz Zentrum München, Germany
| | - John D Fenwick
- Department of Medical Physics & Biomedical Engineering, University College London, Malet Place Engineering Building, London WC1E 6BT, United Kingdom
| | | | | | - Mark A Hill
- MRC Oxford Institute for Radiation Oncology, University of Oxford, ORCRB Roosevelt Drive, Oxford OX3 7DQ, United Kingdom
| | - Constantinos Koumenis
- Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Kirsten Lauber
- Department of Radiation Oncology, University Hospital, LMU München, Munich, Germany
- German Cancer Consortium (DKTK), Partner site Munich, Germany
| | - Brian Marples
- Department of Radiation Oncology, University of Rochester, NY, United States of America
| | - Katia Parodi
- German Cancer Consortium (DKTK), Partner site Munich, Germany
- Department of Medical Physics, Faculty of Physics, Ludwig-Maximilians-Universität München, Garching b. Munich, Germany
| | | | - Nick Staut
- SmART Scientific Solutions BV, Maastricht, The Netherlands
| | - Anna Subiel
- National Physical Laboratory, Medical Radiation Science Hampton Road, Teddington, Middlesex, TW11 0LW, United Kingdom
| | - Rianne D W Vaes
- MAASTRO Clinic, Radiotherapy Division, GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | | | - Ioannis L Verginadis
- Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Jan J Wilkens
- Department of Radiation Oncology, Technical University of Munich (TUM), School of Medicine and Klinikum rechts der Isar, Germany
- Physics Department, Technical University of Munich (TUM), Germany
| | - Kaye J Williams
- Division of Pharmacy and Optometry, University of Manchester, Manchester, United Kingdom
| | - George D Wilson
- Department of Radiation Oncology, Beaumont Health, MI, United States of America
- Henry Ford Health, Detroit, MI, United States of America
| | - Ludwig J Dubois
- The M-Lab, Department of Precision Medicine, GROW-School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
| |
|
114
|
Zheng W, Chen Q, Yao L, Zhuang J, Huang J, Hu Y, Yu S, Chen T, Wei N, Zeng Y, Zhang Y, Fan C, Wang Y. Prediction Models for Sleep Quality Among College Students During the COVID-19 Outbreak: Cross-sectional Study Based on the Internet New Media. J Med Internet Res 2023; 25:e45721. [PMID: 36961495 PMCID: PMC10131726 DOI: 10.2196/45721] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/14/2023] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 02/18/2023] Open
Abstract
BACKGROUND COVID-19 has been reported to affect the sleep quality of Chinese residents; however, the epidemic's effects on the sleep quality of college students during closed-loop management remain unclear, and a screening tool is lacking. OBJECTIVE This study aimed to understand the sleep quality of college students in Fujian Province during the epidemic and determine sensitive variables, in order to develop an efficient prediction model for the early screening of sleep problems in college students. METHODS From April 5 to 16, 2022, a cross-sectional internet-based survey was conducted. The Pittsburgh Sleep Quality Index (PSQI) scale, a self-designed general data questionnaire, and the sleep quality influencing factor questionnaire were used to understand the sleep quality of respondents in the previous month. A chi-square test and a multivariate unconditioned logistic regression analysis were performed, and influencing factors obtained were applied to develop prediction models. The data were divided into a training-testing set (n=14,451, 70%) and an independent validation set (n=6194, 30%) by stratified sampling. Four models using logistic regression, an artificial neural network, random forest, and naïve Bayes were developed and validated. RESULTS In total, 20,645 subjects were included in this survey, with a mean global PSQI score of 6.02 (SD 3.112). The sleep disturbance rate was 28.9% (n=5972, defined as a global PSQI score >7 points). A total of 11 variables related to sleep quality were taken as parameters of the prediction models, including age, gender, residence, specialty, respiratory history, coffee consumption, stay up, long hours on the internet, sudden changes, fears of infection, and impatient closed-loop management. 
Among the generated models, the artificial neural network model proved to be the best, with an area under the curve, accuracy, sensitivity, specificity, positive predictive value, and negative predictive value of 0.713, 73.52%, 25.51%, 92.58%, 57.71%, and 75.79%, respectively. It is noteworthy that the logistic regression, random forest, and naive Bayes models achieved high specificities of 94.41%, 94.77%, and 86.40%, respectively. CONCLUSIONS The COVID-19 containment measures affected the sleep quality of college students on multiple levels, indicating that it is desirable to provide targeted university management and social support. The artificial neural network model has presented excellent predictive efficiency and is favorable for implementing measures earlier in order to improve present conditions.
Affiliation(s)
- Wanyu Zheng
- The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
| | - Qingquan Chen
- The School of Public Health, Fujian Medical University, Fuzhou, China
- The Graduate School of Fujian Medical University, Fuzhou, China
| | - Ling Yao
- The School of Clinical Medicine, Fujian Medical University, Fuzhou, China
| | - Jiajing Zhuang
- The School of Clinical Medicine, Fujian Medical University, Fuzhou, China
| | - Jiewei Huang
- The Graduate School of Fujian Medical University, Fuzhou, China
| | - Yiming Hu
- The School of Public Health, Fujian Medical University, Fuzhou, China
| | - Shaoyang Yu
- The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
| | - Tebin Chen
- The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
| | - Nan Wei
- The School of Clinical Medicine, Fujian Medical University, Fuzhou, China
| | - Yifu Zeng
- Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, China
| | - Yixiang Zhang
- The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
| | - Chunmei Fan
- The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
| | - Youjuan Wang
- The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
| |
|
115
|
Mirza Z, Karim S. Structure-Based Profiling of Potential Phytomolecules with AKT1 a Key Cancer Drug Target. Molecules 2023; 28:molecules28062597. [PMID: 36985568 PMCID: PMC10051420 DOI: 10.3390/molecules28062597] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/11/2023] [Revised: 03/07/2023] [Accepted: 03/09/2023] [Indexed: 03/14/2023] Open
Abstract
Identifying cancer biomarkers is imperative, as upregulated genes offer a better microenvironment for the tumor; hence, targeted inhibition is preferred. The theme of our study is to predict molecular interactions between cancer biomarker proteins and selected natural compounds. We identified an overexpressed potential molecular target (AKT1) and computationally evaluated its inhibition by four dietary ligands (isoliquiritigenin, shogaol, tehranolide, and theophylline). The three-dimensional structures of protein and phytochemicals were retrieved from the RCSB PDB database (4EKL) and NCBI’s PubChem, respectively. Rational structure-based docking studies were performed using AutoDock. Results were analyzed based primarily on the estimated free binding energy (kcal/mol), hydrogen bonds, and inhibition constant, Ki, to identify the most effective anti-cancer phytomolecule. Toxicity and drug-likeness prediction were performed using OSIRIS and SwissADME. Amongst the four phytocompounds, tehranolide has better potential to suppress the expression of AKT1 and could be used for anti-cancer drug development, as inhibition of AKT1 is directly associated with the inhibition of growth, progression, and metastasis of the tumor. Docking analyses reveal that tehranolide is the most efficient at inhibiting AKT1 and has the potential to be used for the therapeutic management of cancer. Natural compounds targeting cancer biomarkers offer less rejection, minimal toxicity, and fewer side effects.
Affiliation(s)
- Zeenat Mirza
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Sajjad Karim
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| |
|
116
|
Yu T, Nantasenamat C, Kachenton S, Anuwongcharoen N, Piacham T. Cheminformatic Analysis and Machine Learning Modeling to Investigate Androgen Receptor Antagonists to Combat Prostate Cancer. ACS OMEGA 2023; 8:6729-6742. [PMID: 36844574 PMCID: PMC9948163 DOI: 10.1021/acsomega.2c07346] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/16/2022] [Accepted: 02/01/2023] [Indexed: 06/18/2023]
Abstract
Prostate cancer (PCa) is a leading cause of cancer mortality among males. There have been numerous studies to develop antagonists against androgen receptor (AR), a crucial therapeutic target for PCa. This study applies systematic cheminformatic analysis and machine learning modeling to examine the chemical space, scaffolds, structure-activity relationship, and landscape of human AR antagonists. The final data set comprises 1678 molecules. Chemical space visualization by physicochemical property visualization has demonstrated that molecules from the potent/active class generally have a mildly smaller molecular weight (MW), octanol-water partition coefficient (log P), number of hydrogen-bond acceptors (nHA), number of rotatable bonds (nRot), and topological polar surface area (TPSA) than molecules from intermediate/inactive class. The chemical space visualization in the principal component analysis (PCA) plot shows significant overlapping distributions between potent/active class molecules and intermediate/inactive class molecules; potent/active class molecules are intensively distributed, while intermediate/inactive class molecules are widely and sparsely distributed. Murcko scaffold analysis has shown low scaffold diversity in general, and scaffold diversity of potent/active class molecules is even lower than that of intermediate/inactive class molecules, indicating the necessity for developing molecules with novel scaffolds. Furthermore, scaffold visualization has identified 16 representative Murcko scaffolds. Among them, scaffolds 1, 2, 3, 4, 7, 8, 10, 11, 15, and 16 are highly favorable scaffolds due to their high scaffold enrichment factor values. Based on scaffold analysis, their local structure-activity relationships (SARs) were investigated and summarized. In addition, the global SAR landscape was explored by quantitative structure-activity relationship (QSAR) modeling and structure-activity landscape visualization. 
A QSAR classification model incorporating all of the 1678 molecules stands out as the best model from a total of 12 candidate models for AR antagonists (built on PubChem fingerprint, extra trees algorithm, accuracy for training set: 0.935, 10-fold cross-validation set: 0.735 and test set: 0.756). Deeper insights into the structure-activity landscape highlighted a total of seven significant activity cliff (AC) generators (ChEMBL molecule IDs: 160257, 418198, 4082265, 348918, 390728, 4080698, and 6530), which provide valuable SAR information for medicinal chemistry. The findings in this study provide new insights and guidelines for hit identification and lead optimization for the development of novel AR antagonists.
Affiliation(s)
- Tianshi Yu
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Chanin Nantasenamat
- Streamlit Open Source, Snowflake Inc., San Mateo, California 94402, United States
| | - Supicha Kachenton
- Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Nuttapat Anuwongcharoen
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Theeraphon Piacham
- Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| |
|
117
|
Comparative Studies on Resampling Techniques in Machine Learning and Deep Learning Models for Drug-Target Interaction Prediction. Molecules 2023; 28:molecules28041663. [PMID: 36838652 PMCID: PMC9964614 DOI: 10.3390/molecules28041663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 01/23/2023] [Accepted: 01/24/2023] [Indexed: 02/12/2023] Open
Abstract
The prediction of drug-target interactions (DTIs) is a vital step in drug discovery. The success of machine learning and deep learning methods in accurately predicting DTIs plays a huge role in drug discovery. However, when dealing with learning algorithms, the datasets used are usually high-dimensional and extremely imbalanced. To solve this issue, the dataset must be resampled accordingly. In this paper, we have compared several data resampling techniques to overcome class imbalance in machine learning methods as well as to study the effectiveness of deep learning methods in overcoming class imbalance in DTI prediction in terms of binary classification using ten (10) cancer-related activity classes from BindingDB. It is found that the use of Random Undersampling (RUS) in predicting DTIs severely affects the performance of a model, especially when the dataset is highly imbalanced, thus rendering RUS unreliable. It is also found that SVM-SMOTE can be used as a go-to resampling method when paired with the Random Forest and Gaussian Naïve Bayes classifiers, whereby a high F1 score is recorded for all activity classes that are severely and moderately imbalanced. Additionally, the deep learning method called Multilayer Perceptron recorded high F1 scores for all activity classes even when no resampling method was applied.
|
118
|
Exploring the Chemical Space of CYP17A1 Inhibitors Using Cheminformatics and Machine Learning. Molecules 2023; 28:molecules28041679. [PMID: 36838665 PMCID: PMC9966999 DOI: 10.3390/molecules28041679] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/28/2022] [Revised: 01/01/2023] [Accepted: 01/12/2023] [Indexed: 02/12/2023] Open
Abstract
Cytochrome P450 17A1 (CYP17A1) is one of the key enzymes in steroidogenesis that produces dehydroepiandrosterone (DHEA) from cholesterol. Abnormal DHEA production may lead to the progression of severe diseases, such as prostatic and breast cancers. Thus, CYP17A1 is a druggable target for anti-cancer molecule development. In this study, cheminformatic analyses and quantitative structure-activity relationship (QSAR) modeling were applied on a set of 962 CYP17A1 inhibitors (i.e., consisting of 279 steroidal and 683 nonsteroidal inhibitors) compiled from the ChEMBL database. For steroidal inhibitors, a QSAR classification model built using the PubChem fingerprint along with the extra trees algorithm achieved the best performance, reflected by the accuracy values of 0.933, 0.818, and 0.833 for the training, cross-validation, and test sets, respectively. For nonsteroidal inhibitors, a systematic cheminformatic analysis was applied for exploring the chemical space, Murcko scaffolds, and structure-activity relationships (SARs) for visualizing distributions, patterns, and representative scaffolds for drug discoveries. Furthermore, seven total QSAR classification models were established based on the nonsteroidal scaffolds, and two activity cliff (AC) generators were identified. The best performing model out of these seven was model VIII, which is built upon the PubChem fingerprint along with the random forest algorithm. It achieved a robust accuracy across the training set, the cross-validation set, and the test set, i.e., 0.96, 0.92, and 0.913, respectively. It is anticipated that the results presented herein would be instrumental for further CYP17A1 inhibitor drug discovery efforts.
|
119
|
A deep learning-based framework for automatic detection of drug resistance in tuberculosis patients. EGYPTIAN INFORMATICS JOURNAL 2023. [DOI: 10.1016/j.eij.2023.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
|
120
|
Firooz A, Funkhouser AT, Martin JC, Edenfield WJ, Valafar H, Blenda AV. Comprehensive and User-Analytics-Friendly Cancer Patient Database for Physicians and Researchers. ARXIV 2023:arXiv:2302.01337v1. [PMID: 36776819 PMCID: PMC9915752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/14/2023]
Abstract
Nuanced cancer patient care is needed, as the development and clinical course of cancer is multifactorial with influences from the general health status of the patient, germline and neoplastic mutations, co-morbidities, and environment. To effectively tailor an individualized treatment to each patient, such multifactorial data must be presented to providers in an easy-to-access and easy-to-analyze fashion. To address the need, a relational database has been developed integrating status of cancer-critical gene mutations, serum galectin profiles, serum and tumor glycomic profiles, with clinical, demographic, and lifestyle data points of individual cancer patients. The database, as a backend, provides physicians and researchers with a single, easily accessible repository of cancer profiling data to aid and enhance individualized treatment. Our interactive database allows care providers to amalgamate cohorts from these groups to find correlations between different data types with the possibility of finding "molecular signatures" based upon a combination of genetic mutations, galectin serum levels, glycan compositions, and patient clinical data and lifestyle choices. Our project provides a framework for an integrated, interactive, and growing database to analyze molecular and clinical patterns across cancer stages and subtypes and provides opportunities for increased diagnostic and prognostic power.
Affiliation(s)
- Ali Firooz
- College of Engineering and Computing, University of South Carolina, Columbia, SC, USA
| | - Avery T Funkhouser
- School of Medicine Greenville, University of South Carolina, Greenville, SC, USA
| | | | | | - Homayoun Valafar
- College of Engineering and Computing, University of South Carolina, Columbia, SC, USA
| | - Anna V Blenda
- School of Medicine Greenville, University of South Carolina, Prisma Health Cancer Institute, Greenville, SC, USA
| |
|
121
|
Mirzaei M, Furxhi I, Murphy F, Mullins M. Employing Supervised Algorithms for the Prediction of Nanomaterial's Antioxidant Efficiency. Int J Mol Sci 2023; 24:ijms24032792. [PMID: 36769135 PMCID: PMC9918003 DOI: 10.3390/ijms24032792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/25/2023] [Accepted: 01/29/2023] [Indexed: 02/05/2023] Open
Abstract
Reactive oxygen species (ROS) are compounds that readily transform into free radicals. Excessive exposure to ROS depletes antioxidant enzymes that protect cells, leading to oxidative stress and cellular damage. Nanomaterials (NMs) exhibit free radical scavenging efficiency representing a potential solution for oxidative stress-induced disorders. This study aims to demonstrate the application of machine learning (ML) algorithms for predicting the antioxidant efficiency of NMs. We manually compiled a comprehensive dataset based on a literature review of 62 in vitro studies. We extracted NMs' physico-chemical (P-chem) properties, the NMs' synthesis technique and various experimental conditions as input features to predict the antioxidant efficiency measured by a 2,2-diphenyl-1-picrylhydrazyl (DPPH) assay. Following data pre-processing, various regression models were trained and validated. The random forest model showed the highest predictive performance, reaching an R2 = 0.83. The attribute importance analysis revealed that the NM's type, core-size and dosage are the most important attributes influencing the prediction. Our findings corroborate prior research regarding the importance of P-chem characteristics. This study expands the application of ML in the nano-domain beyond safety-related outcomes by capturing the functional performance. Accordingly, this study has two objectives: (1) to develop a model to forecast the antioxidant efficiency of NMs to complement conventional in vitro assays and (2) to underline the lack of a comprehensive database and the scarcity of relevant data and/or data management practices in the nanotechnology field, especially with regard to functionality assessments.
Affiliation(s)
- Mahsa Mirzaei
  - Department of Accounting and Finance, Kemmy Business School, University of Limerick, V94PH93 Limerick, Ireland
- Irini Furxhi
  - Department of Accounting and Finance, Kemmy Business School, University of Limerick, V94PH93 Limerick, Ireland
  - Transgero Limited, Newcastle West, V42V384 Limerick, Ireland
  - Correspondence: Tel.: +353-85-106-9771
- Finbarr Murphy
  - Department of Accounting and Finance, Kemmy Business School, University of Limerick, V94PH93 Limerick, Ireland
- Martin Mullins
  - Department of Accounting and Finance, Kemmy Business School, University of Limerick, V94PH93 Limerick, Ireland

122
Vemula D, Jayasurya P, Sushmitha V, Kumar YN, Bhandari V. CADD, AI and ML in drug discovery: A comprehensive review. Eur J Pharm Sci 2023; 181:106324. [PMID: 36347444 DOI: 10.1016/j.ejps.2022.106324] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 08/24/2022] [Revised: 10/26/2022] [Accepted: 11/03/2022] [Indexed: 11/06/2022]
Abstract
Computer-aided drug design (CADD) is an emerging field that has drawn a lot of interest because of its potential to expedite and lower the cost of the drug development process. Drug discovery research is expensive and time-consuming, and it frequently took 10-15 years for a drug to be commercially available. CADD has significantly impacted this area of research. Further, the combination of CADD with Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) technologies to handle enormous amounts of biological data has reduced the time and cost associated with the drug development process. This review will discuss how CADD, AI, ML, and DL approaches help identify drug candidates and various other steps of the drug discovery process. It will also provide a detailed overview of the different in silico tools used and how these approaches interact.
Affiliation(s)
- Divya Vemula
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Perka Jayasurya
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Varthiya Sushmitha
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Vasundhra Bhandari
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India

123
McNair D. Artificial Intelligence and Machine Learning for Lead-to-Candidate Decision-Making and Beyond. Annu Rev Pharmacol Toxicol 2023; 63:77-97. [PMID: 35679624 DOI: 10.1146/annurev-pharmtox-051921-023255] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/25/2023]
Abstract
The use of artificial intelligence (AI) and machine learning (ML) in pharmaceutical research and development has to date focused on research: target identification; docking-, fragment-, and motif-based generation of compound libraries; modeling of synthesis feasibility; rank-ordering likely hits according to structural and chemometric similarity to compounds having known activity and affinity to the target(s); optimizing a smaller library for synthesis and high-throughput screening; and combining evidence from screening to support hit-to-lead decisions. Applying AI/ML methods to lead optimization and lead-to-candidate (L2C) decision-making has shown slower progress, especially regarding predicting absorption, distribution, metabolism, excretion, and toxicology properties. The present review surveys reasons why this is so, reports progress that has occurred in recent years, and summarizes some of the issues that remain. Effective AI/ML tools to derisk L2C and later phases of development are important to accelerate the pharmaceutical development process, ameliorate escalating development costs, and achieve greater success rates.
Affiliation(s)
- Douglas McNair
  - Global Health, Integrated Development, Bill & Melinda Gates Foundation, Seattle, Washington, USA

124
Yeh KB, Parekh FK, Mombo I, Leimer J, Hewson R, Olinger G, Fair JM, Sun Y, Hay J. Climate change and infectious disease: A prologue on multidisciplinary cooperation and predictive analytics. Front Public Health 2023; 11:1018293. [PMID: 36741948 PMCID: PMC9895942 DOI: 10.3389/fpubh.2023.1018293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 01/02/2023] [Indexed: 01/22/2023] Open
Abstract
Climate change impacts global ecosystems at the interface of infectious disease agents and hosts and vectors for animals, humans, and plants. The climate is changing, and the impacts are complex, with multifaceted effects. In addition to connecting climate change and infectious diseases, we aim to draw attention to the challenges of working across multiple disciplines. Doing this requires concentrated efforts in a variety of areas to advance the technological state of the art and at the same time implement ideas and explain to the everyday citizen what is happening. The world's experience with COVID-19 has revealed many gaps in our past approaches to anticipating emerging infectious diseases. Most approaches to predicting outbreaks and identifying emerging microbes of major consequence have been with those causing high morbidity and mortality in humans and animals. These lagging indicators offer limited ability to prevent disease spillover and amplifications in new hosts. Leading indicators and novel approaches are more valuable and now feasible, with multidisciplinary approaches also within our grasp to provide links to disease predictions through holistic monitoring of micro and macro ecological changes. In this commentary, we describe niches for climate change and infectious diseases as well as overarching themes for the important role of collaborative team science, predictive analytics, and biosecurity. With a multidisciplinary cooperative "all call," we can enhance our ability to engage and resolve current and emerging problems.
Affiliation(s)
- Illich Mombo
  - CIRMF, Franceville, Gabon, Central African Republic
- Roger Hewson
  - UK Health Security Agency, Salisbury, United Kingdom
  - London School of Hygiene and Tropical Medicine, London, United Kingdom
- Jeanne M. Fair
  - Los Alamos National Laboratory, Los Alamos, NM, United States
- Yijun Sun
  - Jacobs School of Medicine and Biomedical Sciences, Buffalo, NY, United States
- John Hay
  - Jacobs School of Medicine and Biomedical Sciences, Buffalo, NY, United States

125
|
Djokovic N, Rahnasto-Rilla M, Lougiakis N, Lahtela-Kakkonen M, Nikolic K. SIRT2i_Predictor: A Machine Learning-Based Tool to Facilitate the Discovery of Novel SIRT2 Inhibitors. Pharmaceuticals (Basel) 2023; 16:ph16010127. [PMID: 36678624 PMCID: PMC9864763 DOI: 10.3390/ph16010127] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2022] [Revised: 01/10/2023] [Accepted: 01/11/2023] [Indexed: 01/17/2023] Open
Abstract
A growing body of preclinical evidence recognized selective sirtuin 2 (SIRT2) inhibitors as novel therapeutics for treatment of age-related diseases. However, none of the SIRT2 inhibitors have reached clinical trials yet. Transformative potential of machine learning (ML) in early stages of drug discovery has been witnessed by widespread adoption of these techniques in recent years. Despite great potential, there is a lack of robust and large-scale ML models for discovery of novel SIRT2 inhibitors. In order to support virtual screening (VS), lead optimization, or facilitate the selection of SIRT2 inhibitors for experimental evaluation, a machine-learning-based tool titled SIRT2i_Predictor was developed. The tool was built on a panel of high-quality ML regression and classification-based models for prediction of inhibitor potency and SIRT1-3 isoform selectivity. State-of-the-art ML algorithms were used to train the models on a large and diverse dataset containing 1797 compounds. Benchmarking against structure-based VS protocol indicated comparable coverage of chemical space with great gain in speed. The tool was applied to screen the in-house database of compounds, corroborating the utility in the prioritization of compounds for costly in vitro screening campaigns. The easy-to-use web-based interface makes SIRT2i_Predictor a convenient tool for the wider community. The SIRT2i_Predictor's source code is made available online.
Affiliation(s)
- Nemanja Djokovic
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, University of Belgrade, Vojvode Stepe 450, 11221 Belgrade, Serbia
  - Correspondence: (N.D.); (K.N.)
- Minna Rahnasto-Rilla
  - School of Pharmacy, University of Eastern Finland, P.O. Box 1627, 70210 Kuopio, Finland
- Nikolaos Lougiakis
  - Laboratory of Medicinal Chemistry, Section of Pharmaceutical Chemistry, Department of Pharmacy, School of Health Sciences, National and Kapodistrian University of Athens, Panepistimiopolis-Zografou, 15771 Athens, Greece
- Katarina Nikolic
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, University of Belgrade, Vojvode Stepe 450, 11221 Belgrade, Serbia
  - Correspondence: (N.D.); (K.N.)

126
|
López Barreiro D, Folch-Fortuny A, Muntz I, Thies JC, Sagt CM, Koenderink GH. Sequence Control of the Self-Assembly of Elastin-Like Polypeptides into Hydrogels with Bespoke Viscoelastic and Structural Properties. Biomacromolecules 2023; 24:489-501. [PMID: 36516874 PMCID: PMC9832484 DOI: 10.1021/acs.biomac.2c01405] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/23/2022]
Abstract
The biofabrication of structural proteins with controllable properties via amino acid sequence design is interesting for biomedicine and biotechnology, yet a complete framework that connects amino acid sequence to material properties is unavailable, despite great progress to establish design rules for synthesizing peptides and proteins with specific conformations (e.g., unfolded, helical, β-sheets, or β-turns) and intermolecular interactions (e.g., amphipathic peptides or hydrophobic domains). Molecular dynamics (MD) simulations can help in developing such a framework, but the lack of a standardized way of interpreting the outcome of these simulations hinders their predictive value for the design of de novo structural proteins. To address this, we developed a model that unambiguously classifies a library of de novo elastin-like polypeptides (ELPs) with varying numbers and locations of hydrophobic/hydrophilic and physical/chemical-cross-linking blocks according to their thermoresponsiveness at physiological temperature. Our approach does not require long simulation times or advanced sampling methods. Instead, we apply (un)supervised data analysis methods to a data set of molecular properties from relatively short MD simulations (150 ns). We also experimentally investigate hydrogels of those ELPs from the library predicted to be thermoresponsive, revealing several handles to tune their mechanical and structural properties: chain hydrophilicity/hydrophobicity or block distribution control the viscoelasticity and thermoresponsiveness, whereas ELP concentration defines the network permeability. Our findings provide an avenue to accelerate the design of de novo ELPs with bespoke phase behavior and material properties.
Affiliation(s)
- Diego López Barreiro
  - DSM Biosciences and Process Innovation, DSM, Alexander Fleminglaan 1, 2613 AX Delft, The Netherlands
- Abel Folch-Fortuny
  - DSM Biodata and Translation, DSM, Alexander Fleminglaan 1, 2613 AX Delft, The Netherlands
- Iain Muntz
  - Department of Bionanoscience, Kavli Institute of Nanoscience Delft, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Jens C. Thies
  - DSM Biomedical, DSM, Urmonderbaan 22, 6160 BB Geleen, The Netherlands
- Cees M.J. Sagt
  - DSM Biosciences and Process Innovation, DSM, Alexander Fleminglaan 1, 2613 AX Delft, The Netherlands
- Gijsje H. Koenderink
  - Department of Bionanoscience, Kavli Institute of Nanoscience Delft, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands

127
|
Lerksuthirat T, Chitphuk S, Stitchantrakul W, Dejsuphong D, Malik AA, Nantasenamat C. PARP1pred: a web server for screening the bioactivity of inhibitors against DNA repair enzyme PARP-1. EXCLI J 2023; 22:84-107. [PMID: 36814851 PMCID: PMC9939779 DOI: 10.17179/excli2022-5602] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/14/2022] [Accepted: 12/23/2022] [Indexed: 02/24/2023]
Abstract
Cancer is the leading cause of death worldwide, resulting in the mortality of more than 10 million people in 2020, according to Global Cancer Statistics 2020. A potential cancer therapy involves targeting the DNA repair process by inhibiting PARP-1. In this study, classification models were constructed using a non-redundant set of 2018 PARP-1 inhibitors. Briefly, compounds were described by 12 fingerprint types and built using the random forest algorithm concomitant with various sampling approaches. Results indicated that PubChem with an oversampling approach yielded the best performance, with a Matthews correlation coefficient > 0.7 while also affording interpretable molecular features. Moreover, feature importance, as determined from the Gini index, revealed that the aromatic/cyclic/heterocyclic moiety, nitrogen-containing fingerprints, and the ether/aldehyde/alcohol moiety were important for PARP-1 inhibition. Finally, our predictive model was deployed as a web application called PARP1pred and is publicly available at https://parp1pred.streamlitapp.com, allowing users to predict the biological activity of query compounds using their SMILES notation as the input. It is anticipated that the model described herein will aid in the discovery of effective PARP-1 inhibitors.
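For readers unfamiliar with the Matthews correlation coefficient (MCC) reported in this abstract, the sketch below shows how it is derived from the four cells of a binary confusion matrix. It is an illustration only and not the PARP1pred code; the function name `matthews_corrcoef_from_counts` and the example counts are hypothetical.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal total is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts for an active/inactive classifier, for illustration only.
mcc = matthews_corrcoef_from_counts(tp=90, tn=80, fp=20, fn=10)
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it a common choice when active and inactive compounds are imbalanced.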
Affiliation(s)
- Tassanee Lerksuthirat
  - Research Center, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok 10400, Thailand (to whom correspondence should be addressed)
- Sermsiri Chitphuk
  - Research Center, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok 10400, Thailand
- Wasana Stitchantrakul
  - Research Center, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok 10400, Thailand
- Donniphat Dejsuphong
  - Program in Translational Medicine, Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan 10540, Thailand
- Aijaz Ahmad Malik
  - Center of Excellence in Computational Molecular Biology, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand

128
Wang CC, Hung YT, Chou CY, Hsuan SL, Chen ZW, Chang PY, Jan TR, Tung CW. Using random forest to predict antimicrobial minimum inhibitory concentrations of nontyphoidal Salmonella in Taiwan. Vet Res 2023; 54:11. [PMID: 36747286 PMCID: PMC9903507 DOI: 10.1186/s13567-023-01141-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/10/2022] [Accepted: 01/13/2023] [Indexed: 02/08/2023] Open
Abstract
Antimicrobial resistance (AMR) is a global health issue and surveillance of AMR can be useful for understanding AMR trends and planning intervention strategies. Salmonella, widely distributed in food-producing animals, has been considered the first priority for inclusion in the AMR surveillance program by the World Health Organization (WHO). Recent advances in rapid and affordable whole-genome sequencing (WGS) techniques lead to the emergence of WGS as a one-stop test to predict the antimicrobial susceptibility. Since the variation of sequencing and minimum inhibitory concentration (MIC) measurement methods could result in different results, this study aimed to develop WGS-based random forest models for predicting MIC values of 24 drugs using data generated from the same laboratories in Taiwan. The WGS data have been transformed as a feature vector of 10-mers for machine learning. Based on rigorous validation and independent tests, a good performance was obtained with an average mean absolute error (MAE) less than 1 for both validation and independent test. Feature selection was then applied to identify top-ranked 10-mers that can further improve the prediction performance. For surveillance purposes, the genome sequence-based machine learning methods could be utilized to monitor the difference between predicted and experimental MIC, where a large difference might be worthy of investigation on the emerging genomic determinants.
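The abstract describes turning WGS data into a feature vector of 10-mers and scoring predictions by mean absolute error (MAE). A minimal sketch of both steps follows, under stated assumptions: it is not the authors' pipeline, the helper names (`kmer_counts`, `feature_vector`, `mean_absolute_error`) are hypothetical, and windows containing ambiguous bases are simply skipped, which is a choice made here for illustration.

```python
from collections import Counter

DNA_BASES = {"A", "C", "G", "T"}


def kmer_counts(sequence, k=10):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= DNA_BASES:
            counts[kmer] += 1
    return counts


def feature_vector(sequence, vocabulary, k=10):
    """Map a sequence to k-mer counts in a fixed vocabulary order."""
    counts = kmer_counts(sequence, k)
    return [counts[kmer] for kmer in vocabulary]


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
```

In a setup like the one described, such vectors would be the input to a regression model (a random forest in the paper), and MAE would be computed on whatever scale the MIC values are modelled on; the abstract does not state that scale.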
Affiliation(s)
- Chia-Chi Wang
  - Department and Graduate Institute of Veterinary Medicine, School of Veterinary Medicine, National Taiwan University, Taipei, 106 Taiwan
- Yu-Ting Hung
  - Animal Technology Laboratories, Agricultural Technology Research Institute, Hsinchu City, 300 Taiwan
  - Graduate Institute of Veterinary Pathobiology, College of Veterinary Medicine, National Chung Hsing University, Taichung, 402 Taiwan
- Che-Yu Chou
  - Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, 106 Taiwan
- Shih-Ling Hsuan
  - Graduate Institute of Veterinary Pathobiology, College of Veterinary Medicine, National Chung Hsing University, Taichung, 402 Taiwan
- Zeng-Weng Chen
  - Animal Technology Laboratories, Agricultural Technology Research Institute, Hsinchu City, 300 Taiwan
- Pei-Yu Chang
  - Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County, 350 Taiwan
- Tong-Rong Jan
  - Department and Graduate Institute of Veterinary Medicine, School of Veterinary Medicine, National Taiwan University, Taipei, 106, Taiwan
- Chun-Wei Tung
  - Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, 106, Taiwan
  - Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County, 350, Taiwan

129
Sobańska AW. Immobilized artificial membrane-chromatographic and computational descriptors in studies of soil-water partition of environmentally relevant compounds. Environ Sci Pollut Res Int 2023; 30:6192-6200. [PMID: 35994147 PMCID: PMC9895004 DOI: 10.1007/s11356-022-22514-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/02/2022] [Accepted: 08/09/2022] [Indexed: 05/27/2023]
Abstract
Chromatographic retention factor log kIAM obtained from immobilized artificial membrane (IAM) HPLC with buffered, aqueous mobile phases and calculated molecular descriptors (molecular weight - log MW; molar volume - VM; polar surface area - PSA; total count of nitrogen and oxygen atoms -(N + O); count of freely rotable bonds - FRB; H-bond donor count - HD; H-bond acceptor count - HA; energy of the highest occupied molecular orbital - EHOMO; energy of the lowest unoccupied orbital - ELUMO; dipole moment - DM; polarizability - α) obtained for a group of 175 structurally unrelated compounds were tested in order to generate useful models of solutes' soil-water partition coefficient normalized to organic carbon log Koc. It was established that log kIAM obtained in the conditions described in this study is not sufficient as a sole predictor of the soil-water partition coefficient. Simple, potentially useful models based on log kIAM and a selection of readily available, calculated descriptors and accounting for over 88% of total variability were generated using multiple linear regression (MLR) and artificial neural networks (ANN). The models proposed in the study were tested on a group of 50 compounds with known experimental log Koc values by plotting the calculated vs. experimental values. There is a good close similarity between the calculated and experimental data for both MLR and ANN models for compounds from different chemical families (R2 ≥ 0.80, n = 50) which proves the models' reliability.
Affiliation(s)
- Anna W Sobańska
  - Department of Analytical Chemistry, Medical University of Łódź, ul. Muszyńskiego 1, 90-151, Lodz, Poland

130
Ogawa K, Sakamoto D, Hosoki R. Computer Science Technology in Natural Products Research: A Review of Its Applications and Implications. Chem Pharm Bull (Tokyo) 2023; 71:486-494. [PMID: 37394596 DOI: 10.1248/cpb.c23-00039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Computational approaches to drug development are rapidly growing in popularity and have been used to produce significant results. Recent developments in information science have expanded databases and chemical informatics knowledge relating to natural products. Natural products have long been well-studied, and a large number of unique structures and remarkable active substances have been reported. Analyzing accumulated natural product knowledge using emerging computational science techniques is expected to yield more new discoveries. In this article, we discuss the current state of natural product research using machine learning. The basic concepts and frameworks of machine learning are summarized. Natural product research that utilizes machine learning is described in terms of the exploration of active compounds, automatic compound design, and application to spectral data. In addition, efforts to develop drugs for intractable diseases will be addressed. Lastly, we discuss key considerations for applying machine learning in this field. This paper aims to promote progress in natural product research by presenting the current state of computational science and chemoinformatics approaches in terms of its applications, strengths, limitations, and implications for the field.
Affiliation(s)
- Keiko Ogawa
  - Laboratory of Regulatory Science, College of Pharmaceutical Sciences, Ritsumeikan University
- Daiki Sakamoto
  - Laboratory of Regulatory Science, College of Pharmaceutical Sciences, Ritsumeikan University
- Rumiko Hosoki
  - Laboratory of Regulatory Science, College of Pharmaceutical Sciences, Ritsumeikan University

131
Mou M, Pan Z, Lu M, Sun H, Wang Y, Luo Y, Zhu F. Application of Machine Learning in Spatial Proteomics. J Chem Inf Model 2022; 62:5875-5895. [PMID: 36378082 DOI: 10.1021/acs.jcim.2c01161] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/16/2022]
Abstract
Spatial proteomics is an interdisciplinary field that investigates the localization and dynamics of proteins, and it has gained extensive attention in recent years, especially the subcellular proteomics. Numerous evidence indicate that the subcellular localization of proteins is associated with various cellular processes and disease progression. Mass spectrometry (MS)-based and imaging-based experimental approaches have been developed to acquire large-scale spatial proteomic data. To allow the reliable analysis of increasingly complex spatial proteomics data, machine learning (ML) methods have been widely used in both MS-based and imaging-based spatial proteomic data analysis pipelines. Here, we comprehensively survey the applications of ML in spatial proteomics from following aspects: (1) data resources for spatial proteome are comprehensively introduced; (2) the roles of different ML algorithms in data analysis pipelines are elaborated; (3) successful applications of spatial proteomics and several analytical tools integrating ML methods are presented; (4) challenges existing in modern ML-based spatial proteomics studies are discussed. This review provides guidelines for researchers seeking to apply ML methods to analyze spatial proteomic data and can facilitate insightful understanding of cell biology as well as the future research in medical and drug discovery communities.
Affiliation(s)
- Minjie Mou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Mingkun Lu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yunxia Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yongchao Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China

132
Coutinho MG, Câmara GB, Barbosa RDM, Fernandes MA. SARS-CoV-2 virus classification based on stacked sparse autoencoder. Comput Struct Biotechnol J 2022; 21:284-298. [PMID: 36530948 PMCID: PMC9742810 DOI: 10.1016/j.csbj.2022.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 12/04/2022] [Accepted: 12/05/2022] [Indexed: 12/13/2022] Open
Abstract
Since December 2019, the world has been intensely affected by the COVID-19 pandemic, caused by the SARS-CoV-2. In the case of a novel virus identification, the early elucidation of taxonomic classification and origin of the virus genomic sequence is essential for strategic planning, containment, and treatments. Deep learning techniques have been successfully used in many viral classification problems associated with viral infection diagnosis, metagenomics, phylogenetics, and analysis. Considering that motivation, the authors proposed an efficient viral genome classifier for the SARS-CoV-2 using the deep neural network based on the stacked sparse autoencoder (SSAE). For the best performance of the model, we explored the utilization of image representations of the complete genome sequences as the SSAE input to provide a classification of the SARS-CoV-2. For that, a dataset based on k-mers image representation was applied. We performed four experiments to provide different levels of taxonomic classification of the SARS-CoV-2. The SSAE technique provided great performance results in all experiments, achieving classification accuracy between 92% and 100% for the validation set and between 98.9% and 100% when the SARS-CoV-2 samples were applied for the test set. In this work, samples of the SARS-CoV-2 were not used during the training process, only during subsequent tests, in which the model was able to infer the correct classification of the samples in the vast majority of cases. This indicates that our model can be adapted to classify other emerging viruses. Finally, the results indicated the applicability of this deep learning technique in genome classification problems.
Affiliation(s)
- Maria G.F. Coutinho
  - Laboratory of Machine Learning and Intelligent Instrumentation, IMD/nPITI, Federal University of Rio Grande do Norte, Natal, Brazil
- Gabriel B.M. Câmara
  - Laboratory of Machine Learning and Intelligent Instrumentation, IMD/nPITI, Federal University of Rio Grande do Norte, Natal, Brazil
- Raquel de M. Barbosa
  - Department of Pharmacy and Pharmaceutical Technology, University of Granada, 18071 Granada, Spain
- Marcelo A.C. Fernandes
  - Laboratory of Machine Learning and Intelligent Instrumentation, IMD/nPITI, Federal University of Rio Grande do Norte, Natal, Brazil
  - Department of Computer and Automation Engineering, Federal University of Rio Grande do Norte, Natal, Brazil

133
Baručić D, Kaushik S, Kybic J, Stanková J, Džubák P, Hajdúch M. Characterization of drug effects on cell cultures from phase-contrast microscopy images. Comput Biol Med 2022; 151:106171. [PMID: 36306582 DOI: 10.1016/j.compbiomed.2022.106171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 08/30/2022] [Accepted: 10/01/2022] [Indexed: 12/27/2022]
Abstract
In this work, we classify chemotherapeutic agents (topoisomerase inhibitors) based on their effect on U-2 OS cells. We use phase-contrast microscopy images, which are faster and easier to obtain than fluorescence images and support live cell imaging. We use a convolutional neural network (CNN) trained end-to-end directly on the input images without requiring for manual segmentations or any other auxiliary data. Our method can distinguish between tested cytotoxic drugs with an accuracy of 98%, provided that their mechanism of action differs, outperforming previous work. The results are even better when substance-specific concentrations are used. We show the benefit of sharing the extracted features over all classes (drugs). Finally, a 2D visualization of these features reveals clusters, which correspond well to known class labels, suggesting the possible use of our methodology for drug discovery application in analyzing new, unseen drugs.
Affiliation(s)
- Denis Baručić
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Sumit Kaushik
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jan Kybic
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jarmila Stanková
  - Institute of Molecular and Translational Medicine, Faculty of Medicine and Dentistry, Palacky University, Olomouc, Czech Republic
- Petr Džubák
  - Institute of Molecular and Translational Medicine, Faculty of Medicine and Dentistry, Palacky University, Olomouc, Czech Republic
- Marián Hajdúch
  - Institute of Molecular and Translational Medicine, Faculty of Medicine and Dentistry, Palacky University, Olomouc, Czech Republic

134
Machine learning and structure-based modeling for the prediction of UDP-glucuronosyltransferase inhibition. iScience 2022; 25:105290. [PMID: 36304105 PMCID: PMC9593791 DOI: 10.1016/j.isci.2022.105290] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2022] [Revised: 09/05/2022] [Accepted: 10/03/2022] [Indexed: 11/23/2022] Open
Abstract
UDP-glucuronosyltransferases (UGTs) are responsible for 35% of the phase II drug metabolism. In this study, we focused on UGT1A1, which is a key UGT isoform. Strong inhibition of UGT1A1 may trigger adverse drug/herb-drug interactions, or result in disorders of endobiotic metabolism. Most of the current machine learning methods predicting the inhibition of drug metabolizing enzymes neglect protein structure and dynamics, both being essential for the recognition of various substrates and inhibitors. We performed molecular dynamics simulations on a homology model of the human UGT1A1 structure containing both the cofactor- (UDP-glucuronic acid) and substrate-binding domains to explore UGT conformational changes. Then, we created models for the prediction of UGT1A1 inhibitors by integrating information on UGT1A1 structure and dynamics, interactions with diverse ligands, and machine learning. These models can be helpful for further prediction of drug-drug interactions of drug candidates and safety treatments.
- UGTs are responsible for 35% of the phase II drug metabolism reactions
- We created machine learning models for prediction of UGT1A1 inhibitors
- Our simulations suggested key residues of UGT1A1 involved in the substrate binding
|
135
|
Sobańska AW. Affinity of Compounds for Phosphatydylcholine-Based Immobilized Artificial Membrane-A Measure of Their Bioconcentration in Aquatic Organisms. MEMBRANES 2022; 12:membranes12111130. [PMID: 36422122 PMCID: PMC9692598 DOI: 10.3390/membranes12111130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/29/2022] [Revised: 10/29/2022] [Accepted: 11/07/2022] [Indexed: 05/14/2023]
Abstract
The BCF (bioconcentration factor) of solutes in aquatic organisms is an important parameter because many undesired chemicals enter the ecosystem and affect the wildlife. Chromatographic retention factor log k_w^IAM obtained from immobilized artificial membrane (IAM) HPLC chromatography with buffered, aqueous mobile phases and calculated molecular descriptors obtained for a group of 120 structurally unrelated compounds were used to generate useful models of log BCF. It was established that log k_w^IAM obtained in the conditions described in this study is not sufficient as a sole predictor of bioconcentration. Simple, potentially useful models based on log k_w^IAM and a selection of readily available, calculated descriptors and accounting for over 88% of total variability were generated using multiple linear regression (MLR), partial least squares (PLS) regression and artificial neural networks (ANN). The models proposed in the study were tested on an external group of 120 compounds and on a group of 40 compounds with known experimental log BCF values. It was established that a relatively simple MLR model containing four independent variables leads to satisfying BCF predictions and is more intuitive than PLS or ANN models.
Affiliation(s)
- Anna W Sobańska
- Department of Analytical Chemistry, Faculty of Pharmacy, Medical University of Lodz, ul. Muszyńskiego 1, 90-151 Lodz, Poland
|
136
|
Piekuś-Słomka N, Zapadka M, Kupcewicz B. Methoxy and methylthio-substituted trans-stilbene derivatives as CYP1B1 inhibitors – QSAR study with detailed interpretation of molecular descriptors. ARAB J CHEM 2022. [DOI: 10.1016/j.arabjc.2022.104204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
|
137
|
Mudedla SK, Braka A, Wu S. Quantum-based machine learning and AI models to generate force field parameters for drug-like small molecules. Front Mol Biosci 2022; 9:1002535. [PMID: 36304919 PMCID: PMC9592901 DOI: 10.3389/fmolb.2022.1002535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 09/15/2022] [Indexed: 11/28/2022] Open
Abstract
Force fields for drug-like small molecules play an essential role in molecular dynamics simulations and binding free energy calculations. In particular, the accurate generation of partial charges on small molecules is critical to understanding the interactions between proteins and drug-like molecules. However, it is a time-consuming process. Thus, we generated a force field for small molecules and employed a machine learning (ML) model to rapidly predict partial charges on molecules in less than a minute of time. We performed density functional theory (DFT) calculation for 31770 small molecules that covered the chemical space of drug-like molecules. The partial charges for the atoms in a molecule were predicted using an ML model trained on DFT-based atomic charges. The predicted values were comparable to the charges obtained from DFT calculations. The ML model showed high accuracy in the prediction of atomic charges for external test data sets. We also developed neural network (NN) models to assign atom types, phase angles and periodicities. All the models performed with high accuracy on test data sets. Our code calculated all the descriptors that were needed for the prediction of force field parameters and produced topologies for small molecules by combining results from ML and NN models. To assess the accuracy of the predicted force field parameters, we calculated solvation free energies for small molecules, and the results were in close agreement with experimental free energies. The AI-generated force field was effective in the fast and accurate generation of partial charges and other force field parameters for small drug-like molecules.
Affiliation(s)
- Sangwook Wu
- R&D Center, PharmCADD, Busan, South Korea
- Department of Physics, Pukyong National University, Busan, South Korea
- *Correspondence: Sangwook Wu
|
138
|
Subtypes and Mechanisms of Hypertrophic Cardiomyopathy Proposed by Machine Learning Algorithms. LIFE (BASEL, SWITZERLAND) 2022; 12:life12101566. [PMID: 36294999 PMCID: PMC9605444 DOI: 10.3390/life12101566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 09/26/2022] [Accepted: 09/30/2022] [Indexed: 11/06/2022]
Abstract
Hypertrophic cardiomyopathy (HCM) is a relatively common inherited cardiac disease that results in left ventricular hypertrophy. Machine learning uses algorithms to study patterns in data and develop models able to make predictions. The aim of this study is to identify HCM subtypes and examine the mechanisms of HCM using machine learning algorithms. Clinical and laboratory findings of 143 adult patients with a confirmed diagnosis of nonobstructive HCM are analyzed; HCM subtypes are determined by clustering, while the presence of different HCM features is predicted in classification machine learning tasks. Four clusters are determined as the optimal number of clusters for this dataset. Models that can predict the presence of particular HCM features from other genotypic and phenotypic information are generated, and subsets of features sufficient to predict the presence of other features of HCM are determined. This research proposes four subtypes of HCM assessed by machine learning algorithms and based on the overall phenotypic expression of the participants of the study. The identified subsets of features sufficient to determine the presence of particular HCM aspects could provide deeper insights into the mechanisms of HCM.
|
139
|
Wang FS, Chen PR, Chen TY, Zhang HX. Fuzzy optimization for identifying anti-cancer targets with few side effects in constraint-based models of head and neck cancer. ROYAL SOCIETY OPEN SCIENCE 2022; 9:220633. [PMID: 36303939 PMCID: PMC9597175 DOI: 10.1098/rsos.220633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Accepted: 09/27/2022] [Indexed: 06/16/2023]
Abstract
Computer-aided methods can be used to screen potential candidate targets and to reduce the time and cost of drug development. In most of these methods, synthetic lethality is used as a therapeutic criterion to identify drug targets. However, these methods do not consider the side effects during the identification stage. This study developed a fuzzy multi-objective optimization for identifying anti-cancer targets that not only evaluated cancer cell mortality, but also minimized side effects due to treatment. We identified potential anti-cancer enzymes and antimetabolites for the treatment of head and neck cancer (HNC). The identified one- and two-target enzymes were primarily involved in six major pathways, namely, purine and pyrimidine metabolism and the pentose phosphate pathway. Most of the identified targets can be regulated by approved drugs; thus, these drugs are potential candidates for drug repurposing as a treatment for HNC. Furthermore, we identified antimetabolites involved in pathways similar to those identified using a gene-centric approach. Moreover, HMGCR knockdown could not block the growth of HNC cells. However, the two-target combinations of (UMPS, HMGCR) and (CAD, HMGCR) could achieve cell mortality and improve metabolic deviation grades over 22% without reducing the cell viability grade.
Affiliation(s)
- Feng-Sheng Wang
- Department of Chemical Engineering, National Chung Cheng University, Chiayi, Taiwan
| | - Pei-Rong Chen
- Department of Chemical Engineering, National Chung Cheng University, Chiayi, Taiwan
| | - Ting-Yu Chen
- Department of Chemical Engineering, National Chung Cheng University, Chiayi, Taiwan
| | - Hao-Xiang Zhang
- Department of Chemical Engineering, National Chung Cheng University, Chiayi, Taiwan
|
140
|
Herrera-Bravo J, Farías JG, Sandoval C, Herrera-Belén L, Quiñones J, Díaz R, Beltrán JF. nAChR-PEP-PRED: A Robust Tool for Predicting Peptide Inhibitors of Acetylcholine Receptors Using the Random Forest Classifier. Int J Pept Res Ther 2022; 28:152. [DOI: 10.1007/s10989-022-10460-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/02/2022] [Indexed: 10/14/2022]
|
141
|
Machine Learning-Based Virtual Screening for the Identification of Cdk5 Inhibitors. Int J Mol Sci 2022; 23:ijms231810653. [PMID: 36142566 PMCID: PMC9502400 DOI: 10.3390/ijms231810653] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/26/2022] [Revised: 09/07/2022] [Accepted: 09/09/2022] [Indexed: 12/04/2022] Open
Abstract
Cyclin-dependent kinase 5 (Cdk5) is an atypical proline-directed serine/threonine protein kinase well-characterized for its role in the central nervous system rather than in the cell cycle. Indeed, its dysregulation has been strongly implicated in the progression of synaptic dysfunction and neurodegenerative diseases, such as Alzheimer’s disease (AD) and Parkinson’s disease (PD), and also in the development and progression of a variety of cancers. For this reason, Cdk5 is considered as a promising target for drug design, and the discovery of novel small-molecule Cdk5 inhibitors is of great interest in the medicinal chemistry field. In this context, we employed a machine learning-based virtual screening protocol with subsequent molecular docking, molecular dynamics simulations and binding free energy evaluations. Our virtual screening studies resulted in the identification of two novel Cdk5 inhibitors, highlighting an experimental hit rate of 50% and thus validating the reliability of the in silico workflow. Both identified ligands, compounds CPD1 and CPD4, showed a promising enzyme inhibitory activity and CPD1 also demonstrated a remarkable antiproliferative activity in ovarian and colon cancer cells. These ligands represent a valuable starting point for structure-based hit-optimization studies aimed at identifying new potent Cdk5 inhibitors.
|
142
|
Wang R, Xu J, Yan R, Liu H, Zhao J, Xie Y, Deng W, Liao W, Nie Y. Virtual screening and activity evaluation of multitargeting inhibitors for idiopathic pulmonary fibrosis. Front Pharmacol 2022; 13:998245. [PMID: 36160399 PMCID: PMC9493029 DOI: 10.3389/fphar.2022.998245] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/19/2022] [Accepted: 08/17/2022] [Indexed: 11/17/2022] Open
Abstract
Transforming growth factor β receptor (TGF-β1R) and receptor tyrosine kinases (RTKs), such as VEGFRs, PDGFRs and FGFRs are considered important therapeutic targets in blocking myofibroblast migration and activation of idiopathic pulmonary fibrosis (IPF). To screen and design innovative prodrug to simultaneously target these four classes of receptors, we proposed an approach based on network pharmacology combining virtual screening and machine learning activity prediction, followed by efficient in vitro and in vivo models to evaluate drug activity. We first constructed Collagen1A2-A549 cells with type I collagen as the main biomarker and evaluated the activity of compounds to inhibit collagen expression at the cellular level. The data from the first round of Collagen1A2-A549 cell screening were substituted into the machine learning model, and the model was optimized accordingly. As a result, the false positive rate of the model was reduced from 85.0% to 66.7%, and two prospective compounds, Z103080500 and Z104578368, were finally selected. Collagen levels were reduced effectively by both Z103080500 (67.88% reduction) and Z104578368 (69.54% reduction). Moreover, these two compounds showed low cellular cytotoxicity. Subsequently, the effect of Z103080500 and Z104578368 was evaluated in a bleomycin-induced C57BL/6 mouse IPF model. These results showed that 50 mg/kg Z103080500 and Z104578368 could effectively reduce the number of inflammatory cells and the expression level of α-SMA. Meanwhile, Z103080500 and Z104578368 reduced the expression of major markers and inflammatory factors of IPF, such as collagen, IFN-γ, IL-17 and HYP, indicating that these screened Z103080500 and Z104578368 effectively delayed lung tissue inflammation and had a potential therapeutic effect on IPF. Our findings demonstrate that a screening and evaluation model for prodrug against IPF has been successfully established. 
It is of great significance to further modify these compounds to enhance their potency and activity.
Affiliation(s)
- Rui Wang
- Clinical Research Institute, The First People’s Hospital of Foshan, Foshan, China
| | - Jian Xu
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
| | - Rong Yan
- Clinical Research Institute, The First People’s Hospital of Foshan, Foshan, China
| | - Huanbin Liu
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
| | - Jingxin Zhao
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
| | - Yuan Xie
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
| | - Wenbin Deng
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
| | - Weiping Liao
- Foshan Fourth People’s Hospital, Foshan, China
- *Correspondence: Weiping Liao; Yichu Nie
| | - Yichu Nie
- Clinical Research Institute, The First People’s Hospital of Foshan, Foshan, China
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- *Correspondence: Weiping Liao; Yichu Nie
|
143
|
Sajjan M, Li J, Selvarajan R, Sureshbabu SH, Kale SS, Gupta R, Singh V, Kais S. Quantum machine learning for chemistry and physics. Chem Soc Rev 2022; 51:6475-6573. [PMID: 35849066 DOI: 10.1039/d2cs00203e] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/12/2022]
Abstract
Machine learning (ML) has emerged as a formidable force for identifying hidden but pertinent patterns within a given data set with the objective of subsequent generation of automated predictive behavior. In recent years, it is safe to conclude that ML and its close cousin, deep learning (DL), have ushered in unprecedented developments in all areas of physical sciences, especially chemistry. Not only classical variants of ML, even those trainable on near-term quantum hardwares have been developed with promising outcomes. Such algorithms have revolutionized materials design and performance of photovoltaics, electronic structure calculations of ground and excited states of correlated matter, computation of force-fields and potential energy surfaces informing chemical reaction dynamics, reactivity inspired rational strategies of drug designing and even classification of phases of matter with accurate identification of emergent criticality. In this review we shall explicate a subset of such topics and delineate the contributions made by both classical and quantum computing enhanced machine learning algorithms over the past few years. We shall not only present a brief overview of the well-known techniques but also highlight their learning strategies using statistical physical insight. The objective of the review is not only to foster exposition of the aforesaid techniques but also to empower and promote cross-pollination among future research in all areas of chemistry which can benefit from ML and in turn can potentially accelerate the growth of such algorithms.
Affiliation(s)
- Manas Sajjan
- Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA; Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
| | - Junxu Li
- Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA; Department of Physics and Astronomy, Purdue University, West Lafayette, IN-47907, USA
| | - Raja Selvarajan
- Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA; Department of Physics and Astronomy, Purdue University, West Lafayette, IN-47907, USA
| | - Shree Hari Sureshbabu
- Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA; Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN-47907, USA
| | - Sumit Suresh Kale
- Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA; Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
| | - Rishabh Gupta
- Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA; Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
| | - Vinit Singh
- Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA; Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA
| | - Sabre Kais
- Department of Chemistry, Purdue University, West Lafayette, IN-47907, USA; Purdue Quantum Science and Engineering Institute, Purdue University, West Lafayette, Indiana 47907, USA; Department of Physics and Astronomy, Purdue University, West Lafayette, IN-47907, USA; Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN-47907, USA
|
144
|
Sun D, Gao W, Hu H, Zhou S. Why 90% of clinical drug development fails and how to improve it? Acta Pharm Sin B 2022; 12:3049-3062. [PMID: 35865092 PMCID: PMC9293739 DOI: 10.1016/j.apsb.2022.02.002] [Citation(s) in RCA: 618] [Impact Index Per Article: 206.0] [Received: 11/20/2021] [Revised: 02/03/2022] [Accepted: 02/06/2022] [Indexed: 12/14/2022] Open
Abstract
Ninety percent of clinical drug development fails despite implementation of many successful strategies, which raised the question whether certain aspects in target validation and drug optimization are overlooked? Current drug optimization overly emphasizes potency/specificity using structure‒activity-relationship (SAR) but overlooks tissue exposure/selectivity in disease/normal tissues using structure‒tissue exposure/selectivity-relationship (STR), which may mislead the drug candidate selection and impact the balance of clinical dose/efficacy/toxicity. We propose structure‒tissue exposure/selectivity-activity relationship (STAR) to improve drug optimization, which classifies drug candidates based on drug's potency/selectivity, tissue exposure/selectivity, and required dose for balancing clinical efficacy/toxicity. Class I drugs have high specificity/potency and high tissue exposure/selectivity, which needs low dose to achieve superior clinical efficacy/safety with high success rate. Class II drugs have high specificity/potency and low tissue exposure/selectivity, which requires high dose to achieve clinical efficacy with high toxicity and needs to be cautiously evaluated. Class III drugs have relatively low (adequate) specificity/potency but high tissue exposure/selectivity, which requires low dose to achieve clinical efficacy with manageable toxicity but are often overlooked. Class IV drugs have low specificity/potency and low tissue exposure/selectivity, which achieves inadequate efficacy/safety, and should be terminated early. STAR may improve drug optimization and clinical studies for the success of clinical drug development.
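The four STAR classes described in the abstract above amount to a two-axis lookup, which can be sketched as follows. This is a minimal illustration, not the authors' method: the function name `star_class` and the reduction of each axis to a boolean are assumptions made here, and the abstract does not give numeric thresholds for "high" potency or "high" tissue exposure.

```python
def star_class(high_potency: bool, high_tissue_exposure: bool) -> str:
    """Map the two STAR axes onto Class I-IV as summarized in the abstract.

    Both inputs are hypothetical boolean simplifications; how "high" is
    decided for a real candidate is outside the scope of this sketch.
    """
    if high_potency and high_tissue_exposure:
        return "I"    # low dose, superior efficacy/safety, high success rate
    if high_potency:
        return "II"   # high dose needed, high toxicity, evaluate cautiously
    if high_tissue_exposure:
        return "III"  # low dose, manageable toxicity, often overlooked
    return "IV"       # inadequate efficacy/safety, terminate early


# Example: adequate (not high) potency but high tissue exposure/selectivity.
print(star_class(high_potency=False, high_tissue_exposure=True))  # prints "III"
```

Keeping the axes as booleans mirrors the abstract's wording; a practical version would replace them with measured potency and tissue-exposure values.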
Affiliation(s)
- Duxin Sun
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Wei Gao
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Hongxiang Hu
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Simon Zhou
- Translational Development and Clinical Pharmacology, Bristol Myers Squibb Company, Summit, NJ, 07920, USA
|
145
|
Eckmann P, Sun K, Zhao B, Feng M, Gilson MK, Yu R. LIMO: Latent Inceptionism for Targeted Molecule Generation. PROCEEDINGS OF MACHINE LEARNING RESEARCH 2022; 162:5777-5792. [PMID: 36193121 PMCID: PMC9527083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Generation of drug-like molecules with high binding affinity to target proteins remains a difficult and resource-intensive task in drug discovery. Existing approaches primarily employ reinforcement learning, Markov sampling, or deep generative models guided by Gaussian processes, which can be prohibitively slow when generating molecules with high binding affinity calculated by computationally-expensive physics-based methods. We present Latent Inceptionism on Molecules (LIMO), which significantly accelerates molecule generation with an inceptionism-like technique. LIMO employs a variational autoencoder-generated latent space and property prediction by two neural networks in sequence to enable faster gradient-based reverse-optimization of molecular properties. Comprehensive experiments show that LIMO performs competitively on benchmark tasks and markedly outperforms state-of-the-art techniques on the novel task of generating drug-like compounds with high binding affinity, reaching nanomolar range against two protein targets. We corroborate these docking-based results with more accurate molecular dynamics-based calculations of absolute binding free energy and show that one of our generated drug-like compounds has a predicted K_D (a measure of binding affinity) of 6 × 10^-14 M against the human estrogen receptor, well beyond the affinities of typical early-stage drug candidates and most FDA-approved drugs to their respective targets. Code is available at https://github.com/Rose-STL-Lab/LIMO.
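To put the reported dissociation constant in energetic terms, the standard relation ΔG° = RT ln(K_D / 1 M) can be applied. The sketch below is illustrative only: the temperature of 298.15 K and the function name `binding_free_energy` are assumptions made here, not values or code taken from the LIMO paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol from a dissociation constant.

    Uses dG = R * T * ln(K_D / 1 M); the 1 M standard state is implicit.
    The default temperature is an assumption, not a value from the abstract.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


dg = binding_free_energy(6e-14)
print(f"{dg:.1f} kcal/mol")  # about -18.0 kcal/mol at the assumed 298.15 K
```

For comparison, a 1 nM binder gives roughly -12.3 kcal/mol under the same assumptions, so the predicted compound would bind several kcal/mol more tightly.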
Affiliation(s)
- Peter Eckmann
- Department of Computer Science and Engineering, UC San Diego, La Jolla, California, United States
| | - Kunyang Sun
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States
| | - Bo Zhao
- Department of Computer Science and Engineering, UC San Diego, La Jolla, California, United States
| | - Mudong Feng
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States
| | - Michael K. Gilson
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, California, United States
| | - Rose Yu
- Department of Computer Science and Engineering, UC San Diego, La Jolla, California, United States
|
146
|
Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022; 122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]
Abstract
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness in issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, as well as the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.
Affiliation(s)
- Christos Xiouras
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
| | - Fabio Cameli
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Gustavo Lunardon Quilló
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium; Chemical and BioProcess Technology and Control, Department of Chemical Engineering, Faculty of Engineering Technology, KU Leuven, Gebroeders de Smetstraat 1, 9000 Ghent, Belgium
| | - Mihail E Kavousanakis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
| | - Dionisios G Vlachos
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Georgios D Stefanidis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece; Laboratory for Chemical Technology, Ghent University; Tech Lane Ghent Science Park 125, B-9052 Ghent, Belgium
|
147
|
Zhang H, Saravanan KM, Yang Y, Wei Y, Yi P, Zhang JZH. Generating and screening de novo compounds against given targets using ultrafast deep learning models as core components. Brief Bioinform 2022; 23:6611918. [PMID: 35724626 DOI: 10.1093/bib/bbac226] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/08/2022] [Revised: 04/27/2022] [Accepted: 05/14/2022] [Indexed: 11/13/2022] Open
Abstract
Deep learning is an artificial intelligence technique in which models express geometric transformations over multiple levels. This method has shown great promise in various fields, including drug development. The availability of public structure databases prompted the researchers to use generative artificial intelligence models to narrow down their search of the chemical space, a novel approach to chemogenomics and de novo drug development. In this study, we developed a strategy that combined an accelerated LSTM_Chem (long short-term memory for de novo compounds generation), dense fully convolutional neural network (DFCNN), and docking to generate a large number of de novo small molecular chemical compounds for given targets. To demonstrate its efficacy and applicability, six important targets that account for various human disorders were used as test examples. Moreover, using the M protease as a proof-of-concept example, we find that iteratively training with previously selected candidates can significantly increase the chance of obtaining novel compounds with higher and higher predicted binding affinities. In addition, we also check the potential benefit of obtaining reliable final de novo compounds with the help of MD simulation and metadynamics simulation. The generation of de novo compounds and the discovery of binders against various targets proposed here would be a practical and effective approach. Assessing the efficacy of these top de novo compounds with biochemical studies is promising to promote related drug development.
Affiliation(s)
- Haiping Zhang
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Konda Mani Saravanan
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, 600073, Tamil Nadu, India
| | - Yang Yang
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for infectious disease, State Key Discipline of Infectious Disease, Shenzhen Third People's Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
| | - Yanjie Wei
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, PR China 518055
| | - Pan Yi
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, PR China 518055
| | - John Z H Zhang
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
|
148
|
Zapadka M, Dekowski P, Kupcewicz B. HATS5m as an Example of GETAWAY Molecular Descriptor in Assessing the Similarity/Diversity of the Structural Features of 4-Thiazolidinone. Int J Mol Sci 2022; 23:6576. [PMID: 35743020 PMCID: PMC9223869 DOI: 10.3390/ijms23126576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2022] [Revised: 04/30/2022] [Accepted: 06/10/2022] [Indexed: 11/29/2022] Open
Abstract
Among the various methods for drug design, the approach using molecular descriptors for quantitative structure-activity relationships (QSAR) bears promise for the prediction of innovative molecular structures with bespoke pharmacological activity. Despite the growing number of successful potential applications, the QSAR models often remain hard to interpret. The difficulty arises from the use of advanced chemometric or machine learning methods on the one hand, and the complexity of molecular descriptors on the other hand. Thus, there is a need to interpret molecular descriptors for identifying the features of molecules crucial for desirable activity. For example, the development of structure-activity modeling of different molecule endpoints confirmed the usefulness of H-GETAWAY (H-GEometry, Topology, and Atom-Weights AssemblY) descriptors in molecular sciences. However, compared with other 3D molecular descriptors, H-GETAWAY interpretation is much more complicated. The present study provides insights into the interpretation of the HATS5m descriptor (H-GETAWAY) concerning the molecular structures of the 4-thiazolidinone derivatives with antitrypanosomal activity. According to the published study, an increase in antitrypanosomal activity is associated with both a decrease and an increase in HATS5m (leverage-weighted autocorrelation with lag 5, weighted by atomic masses) values. The substructure-based method explored how the changes in molecular features affect the HATS5m value. Based on this approach, we proposed substituents that translate into low and high HATS5m. The detailed interpretation of H-GETAWAY descriptors requires the consideration of three elements: weighting scheme, leverages, and the Dirac delta function. Particular attention should be paid to the impact of chemical compounds' size and shape and the leverage values of individual atoms.
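The three elements named in the abstract above (weighting scheme, leverages, and the Dirac delta function) appear together in the usual definition of the HATS indices. The formula below is not quoted from the abstract; it is reproduced from the general GETAWAY literature as an assumption, with symbols chosen here for illustration.

```latex
% HATS autocorrelation at topological lag k for atomic weighting scheme w
\mathrm{HATS}_k(w) \;=\; \sum_{i=1}^{A-1} \sum_{j>i}
    \left( w_i \, h_{ii} \right) \left( w_j \, h_{jj} \right) \,
    \delta\!\left(k;\, d_{ij}\right),
\qquad
\delta\!\left(k;\, d_{ij}\right) =
\begin{cases}
  1 & \text{if } d_{ij} = k \\
  0 & \text{otherwise}
\end{cases}

% Leverages are the diagonal of the molecular influence matrix,
% built from the centered atomic coordinate matrix M (A rows, 3 columns)
H \;=\; M \left( M^{\mathsf{T}} M \right)^{-1} M^{\mathsf{T}}
```

Here A is the number of atoms, w_i the weight of atom i, h_ii its leverage, and d_ij the topological distance between atoms i and j. Under this reading, HATS5m corresponds to k = 5 with atomic masses as weights, so only atom pairs exactly five bonds apart contribute, each scaled by the mass and leverage of both atoms.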
Affiliation(s)
- Mariusz Zapadka
- Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Nicolaus Copernicus University in Toruń, Jurasza 2, 85-089 Bydgoszcz, Poland
| | - Przemysław Dekowski
- New Technologies Department, Softmaks.pl Sp. z o.o., Kraszewskiego 1, 85-241 Bydgoszcz, Poland
| | - Bogumiła Kupcewicz
- Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Nicolaus Copernicus University in Toruń, Jurasza 2, 85-089 Bydgoszcz, Poland
|
149
|
Herrera-Bravo J, Farías JG, Contreras FP, Herrera-Belén L, Beltrán JF. PEP-PREDNa+: A web server for prediction of highly specific peptides targeting voltage-gated Na+ channels using machine learning techniques. Comput Biol Med 2022; 145:105414. [DOI: 10.1016/j.compbiomed.2022.105414] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/08/2022] [Revised: 03/12/2022] [Accepted: 03/14/2022] [Indexed: 12/12/2022]
|
150
|
Deep Learning Based-Virtual Screening Using 2D Pharmacophore Fingerprint in Drug Discovery. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|