1. Li Y, Cardoso-Silva J, Kelly JM, Delves MJ, Furnham N, Papageorgiou LG, Tsoka S. Optimisation-based modelling for explainable lead discovery in malaria. Artif Intell Med 2024; 147:102700. [PMID: 38184363 DOI: 10.1016/j.artmed.2023.102700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 10/17/2023] [Accepted: 10/29/2023] [Indexed: 01/08/2024]
Abstract
BACKGROUND The search for new antimalarial treatments is urgent due to growing resistance to existing therapies. The Open Source Malaria (OSM) project offers a promising starting point, having extensively screened various compounds for their effectiveness. Further analysis of the chemical space surrounding these compounds could provide the means for innovative drugs. METHODS We report an optimisation-based method for quantitative structure-activity relationship (QSAR) modelling that provides explainable modelling of ligand activity through a mathematical programming formulation. The methodology is based on piecewise regression principles and offers optimal detection of breakpoint features, efficient allocation of samples into distinct sub-groups based on breakpoint feature values, and insightful regression coefficients. Analysis of OSM antimalarial compounds yields interpretable results through rules generated by the model that reflect the contribution of individual fingerprint fragments in ligand activity prediction. Using knowledge of fragment prioritisation and screening of commercially available compound libraries, potential lead compounds for antimalarials are identified and evaluated experimentally via a Plasmodium falciparum asexual growth inhibition assay (PfGIA) and a human cell cytotoxicity assay. CONCLUSIONS Three compounds are identified as potential leads for antimalarials using the methodology described above. This work illustrates how explainable predictive models based on mathematical optimisation can pave the way towards more efficient fragment-based lead discovery as applied in malaria.
Affiliation(s)
- Yutong Li
  - Department of Informatics, King's College London, Bush House, London, WC2B 4BG, UK
- Jonathan Cardoso-Silva
  - Data Science Institute, London School of Economics and Political Science, Houghton St, London, WC2A 2AE, UK
- John M Kelly
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Michael J Delves
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Nicholas Furnham
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Lazaros G Papageorgiou
  - The Sargent Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London, WC1E 7JE, UK
- Sophia Tsoka
  - Department of Informatics, King's College London, Bush House, London, WC2B 4BG, UK
2. Bathula S, Sankaranarayanan M, Malgija B, Kaliappan I, Bhandare RR, Shaik AB. 2-Amino Thiazole Derivatives as Prospective Aurora Kinase Inhibitors against Breast Cancer: QSAR, ADMET Prediction, Molecular Docking, and Molecular Dynamic Simulation Studies. ACS Omega 2023; 8:44287-44311. [PMID: 38027360 PMCID: PMC10666282 DOI: 10.1021/acsomega.3c07003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/05/2023] [Accepted: 10/23/2023] [Indexed: 12/01/2023]
Abstract
Aurora kinase is a key enzyme implicated in tumor growth. Research has revealed that small molecules targeting aurora kinase have beneficial effects as anticancer agents. In the present study, in order to identify potential anti-breast cancer agents with aurora kinase inhibitory activity, we employed QSARINS software to perform quantitative structure-activity relationship (QSAR) modeling. The statistical values resulting from the study include R² = 0.8902, CCCtr = 0.7580, Q²LOO = 0.7875, Q²LMO = 0.7624, CCCcv = 0.7535, R²ext = 0.8735, and CCCext = 0.8783. Among the four generated models, the two best models encompass five important variables: PSA, EstateVSA5, MoRSEP3, MATSp5, and RDFC24. Parameters including the atomic volume, atomic charges, and Sanderson's electronegativity played an important role in designing newer lead compounds. Based on the above data, we designed six series of compounds: 1a-e, 2a-e, 3a-e, 4a-e, 5a-e, and 6a-e. All these compounds were subjected to molecular docking studies using AutoDock v4.2.6 against the aurora kinase protein (1MQ4). Among these 30 compounds, the 2-amino thiazole derivatives 1a, 2a, 3e, 4d, 5d, and 6d have excellent binding interactions with the active site of 1MQ4. Compound 1a had the highest docking score (-9.67) and hence was additionally subjected to molecular dynamic simulation investigations for 100 ns. The stable binding of compound 1a with 1MQ4 was verified by RMSD, RMSF, RoG, H-bond, molecular mechanics-generalized Born surface area (MM-GBSA), free binding energy calculations, and solvent-accessible surface area (SASA) analyses. Furthermore, the newly designed compound 1a exhibited excellent ADMET properties. Based on the above findings, we propose that the designed compound 1a may be utilized as the best theoretical lead for future experimental research on selective inhibition of aurora kinase, thereby assisting in the creation of new anti-breast cancer drugs.
Affiliation(s)
- Sivakumar Bathula
  - Department of Pharmaceutical Chemistry, SRM College of Pharmacy, SRM Institute of Science and Technology, Kattankulathur 603203, Chengalpattu District, Tamil Nadu, India
- Murugesan Sankaranarayanan
  - Medicinal Chemistry Research Laboratory, Department of Pharmacy, Birla Institute of Technology & Science (BITS) Pilani, Pilani Campus, Pilani 333031, Rajasthan, India
- Beutline Malgija
  - MCC-MRF Innovation Park, Madras Christian College, Chennai 600059, Tamil Nadu, India
- Ilango Kaliappan
  - Department of Pharmaceutical Chemistry, SRM College of Pharmacy, SRM Institute of Science and Technology, Kattankulathur 603203, Chengalpattu District, Tamil Nadu, India
- Richie R. Bhandare
  - Department of Pharmaceutical Sciences, College of Pharmacy and Health Sciences, Ajman University, P.O. Box 346, Ajman 61001, United Arab Emirates
  - Centre of Medical and Bio-allied Health Sciences Research, Ajman University, P.O. Box 346, Ajman 61001, United Arab Emirates
- Afzal B. Shaik
  - St. Mary’s College of Pharmacy, St. Mary’s Group of Institutions Guntur, Affiliated to Jawaharlal Nehru Technological University Kakinada, Chebrolu, Guntur 522212, Andhra Pradesh, India
  - Center for Global Health Research, Saveetha Medical College, Saveetha Institute of Medical and Technical Sciences, Chennai 602105, Tamil Nadu, India
3. Niazi SK, Mariam Z. Recent Advances in Machine-Learning-Based Chemoinformatics: A Comprehensive Review. Int J Mol Sci 2023; 24:11488. [PMID: 37511247 PMCID: PMC10380192 DOI: 10.3390/ijms241411488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 06/30/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023]
Abstract
In modern drug discovery, the combination of chemoinformatics and quantitative structure-activity relationship (QSAR) modeling has emerged as a formidable alliance, enabling researchers to harness the potential of machine learning (ML) techniques for predictive molecular design and analysis. This review covers the fundamental aspects of chemoinformatics, describing the nature of chemical data and the role of molecular descriptors in revealing underlying molecular properties. Molecular descriptors, including 2D fingerprints and topological indices, in conjunction with structure-activity relationships (SARs), are pivotal to small-molecule drug discovery. The technical details of developing robust ML-QSAR models, including feature selection, model validation, and performance evaluation, are discussed. Various ML algorithms, such as regression analysis and support vector machines, are presented for their ability to predict and explain the relationships between molecular structures and biological activities. This review serves as a comprehensive guide for researchers, providing an understanding of the synergy between chemoinformatics, QSAR, and ML. By embracing these technologies, predictive molecular analysis holds promise for expediting the discovery of novel therapeutic agents in the pharmaceutical sciences.
Affiliation(s)
- Sarfaraz K Niazi
  - College of Pharmacy, University of Illinois, Chicago, IL 61820, USA
- Zamara Mariam
  - School of Interdisciplinary Engineering & Sciences (SINES), National University of Sciences & Technology (NUST), Islamabad 24090, Pakistan
4. da Costa Avelar PH, Del Coco N, Lamb LC, Tsoka S, Cardoso-Silva J. A Bayesian predictive analytics model for improving long range epidemic forecasting during an infection wave. Healthc Anal (N Y) 2022; 2:100115. [PMID: 37520620 PMCID: PMC9533637 DOI: 10.1016/j.health.2022.100115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 08/17/2022] [Accepted: 09/26/2022] [Indexed: 11/04/2022]
Abstract
Following the outbreak of the coronavirus epidemic in early 2020, municipalities, regional governments and policymakers worldwide had to plan their Non-Pharmaceutical Interventions (NPIs) amidst a scenario of great uncertainty. At this early stage of an epidemic, where no vaccine or medical treatment is in sight, algorithmic prediction can become a powerful tool to inform local policymaking. However, when we replicated one prominent epidemiological model to inform health authorities in a region in the south of Brazil, we found that this model relied too heavily on manually predetermined covariates and was too reactive to changes in data trends. Our four proposed models access data on both daily reported deaths and infections and take missing data (e.g., the under-reporting of cases) into account more explicitly, with two of the proposed versions also attempting to model the delay in test reporting. We simulated weekly forecasting of deaths for the period from 31/05/2020 until 31/01/2021, with the first week's data being used as a cold start for the algorithm, after which we used a lighter variant of the model for faster forecasting. Because our models are significantly more proactive in identifying trend changes, forecasting improved, especially in long-range predictions and after the peak of an infection wave, as the models were quicker to adapt to scenarios after these peaks in reported deaths. Assuming that cases were under-reported greatly benefited the stability of the model, whereas modelling retroactively added data (due to the "hot" nature of the data used) had a negligible impact on performance.
Affiliation(s)
- Pedro Henrique da Costa Avelar
  - Data Science Brigade, Porto Alegre, Rio Grande do Sul, Brazil
  - Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
  - Department of Informatics, King's College London, London, United Kingdom
  - Machine Intellection Department, Institute for Infocomm Research, A*STAR, Singapore
- Luis C Lamb
  - Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Sophia Tsoka
  - Department of Informatics, King's College London, London, United Kingdom
- Jonathan Cardoso-Silva
  - Data Science Brigade, Porto Alegre, Rio Grande do Sul, Brazil
  - Department of Informatics, King's College London, London, United Kingdom
  - Data Science Institute, London School of Economics and Political Science, London, United Kingdom
5. Amiri Souri E, Laddach R, Karagiannis SN, Papageorgiou LG, Tsoka S. Novel drug-target interactions via link prediction and network embedding. BMC Bioinformatics 2022; 23:121. [PMID: 35379165 PMCID: PMC8978405 DOI: 10.1186/s12859-022-04650-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2021] [Accepted: 03/17/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND As many interactions between the chemical and genomic space remain undiscovered, computational methods able to identify potential drug-target interactions (DTIs) are employed to accelerate drug discovery and reduce the required cost. Predicting new DTIs can leverage drug repurposing by identifying new targets for approved drugs. However, developing an accurate computational framework that can efficiently incorporate chemical and genomic spaces remains extremely demanding. A key issue is that most DTI predictions suffer from the lack of experimentally validated negative interactions or limited availability of target 3D structures. RESULTS We report DT2Vec, a pipeline for DTI prediction based on graph embedding and gradient boosted tree classification. It maps drug-drug and protein-protein similarity networks to low-dimensional features and the DTI prediction is formulated as binary classification based on a strategy of concatenating the drug and target embedding vectors as input features. DT2Vec was compared with three top-performing graph similarity-based algorithms on a standard benchmark dataset and achieved competitive results. In order to explore credible novel DTIs, the model was applied to data from the ChEMBL repository that contain experimentally validated positive and negative interactions which yield a strong predictive model. Then, the developed model was applied to all possible unknown DTIs to predict new interactions. The applicability of DT2Vec as an effective method for drug repurposing is discussed through case studies and evaluation of some novel DTI predictions is undertaken using molecular docking. CONCLUSIONS The proposed method was able to integrate and map chemical and genomic space into low-dimensional dense vectors and showed promising results in predicting novel DTIs.
Affiliation(s)
- E Amiri Souri
  - Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London, WC2B 4BG, UK
- R Laddach
  - Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London, WC2B 4BG, UK
  - St. John's Institute of Dermatology, School of Basic and Medical Biosciences, King's College London, Guy's Hospital, London, SE1 9RT, UK
- S N Karagiannis
  - St. John's Institute of Dermatology, School of Basic and Medical Biosciences, King's College London, Guy's Hospital, London, SE1 9RT, UK
  - Breast Cancer Now Research Unit, School of Cancer and Pharmaceutical Sciences, King's College London, Guy's Cancer Centre, London, SE1 9RT, UK
- L G Papageorgiou
  - Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London, WC1E 7JE, UK
- S Tsoka
  - Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London, WC2B 4BG, UK
6. Tse EG, Aithani L, Anderson M, Cardoso-Silva J, Cincilla G, Conduit GJ, Galushka M, Guan D, Hallyburton I, Irwin BWJ, Kirk K, Lehane AM, Lindblom JCR, Lui R, Matthews S, McCulloch J, Motion A, Ng HL, Öeren M, Robertson MN, Spadavecchio V, Tatsis VA, van Hoorn WP, Wade AD, Whitehead TM, Willis P, Todd MH. An Open Drug Discovery Competition: Experimental Validation of Predictive Models in a Series of Novel Antimalarials. J Med Chem 2021; 64:16450-16463. [PMID: 34748707 DOI: 10.1021/acs.jmedchem.1c00313] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/20/2023]
Abstract
The Open Source Malaria (OSM) consortium is developing compounds that kill the human malaria parasite, Plasmodium falciparum, by targeting PfATP4, an essential ion pump on the parasite surface. The structure of PfATP4 has not been determined. Here, we describe a public competition created to develop a predictive model for the identification of PfATP4 inhibitors, thereby reducing project costs associated with the synthesis of inactive compounds. Competition participants could see all entries as they were submitted. In the final round, featuring private sector entrants specializing in machine learning methods, the best-performing models were used to predict novel inhibitors, of which several were synthesized and evaluated against the parasite. Half possessed biological activity, with one featuring a motif that the human chemists familiar with this series would have dismissed as "ill-advised". Since all data and participant interactions remain in the public domain, this research project "lives" and may be improved by others.
Affiliation(s)
- Edwin G Tse
  - School of Pharmacy, University College London, London WC1N 1AX, U.K.
- Laksh Aithani
  - Exscientia Ltd., The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Mark Anderson
  - Drug Discovery Unit, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee DD1 5EH, U.K.
- Jonathan Cardoso-Silva
  - Department of Informatics, Faculty of Natural and Mathematical Sciences, King's College London, London WC2B 4BG, U.K.
- Gareth J Conduit
  - Intellegens Ltd., Eagle Labs, Chesterton Road, Cambridge CB4 3AZ, U.K.
  - Theory of Condensed Matter Group, Cavendish Laboratories, University of Cambridge, Cambridge CB3 0HE, U.K.
- Davy Guan
  - School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Irene Hallyburton
  - Drug Discovery Unit, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee DD1 5EH, U.K.
- Benedict W J Irwin
  - Theory of Condensed Matter Group, Cavendish Laboratories, University of Cambridge, Cambridge CB3 0HE, U.K.
  - Optibrium Ltd., Blenheim House, Denny End Road, Cambridge CB25 9QE, U.K.
- Kiaran Kirk
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Adele M Lehane
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Julia C R Lindblom
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Raymond Lui
  - School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Slade Matthews
  - School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- James McCulloch
  - Kellerberrin, 6 Wharf Rd, Balmain, Sydney, NSW 2041, Australia
- Alice Motion
  - School of Chemistry, The University of Sydney, Sydney, NSW 2006, Australia
- Ho Leung Ng
  - Department of Biochemistry and Molecular Biophysics, Kansas State University, Manhattan, Kansas 66506, United States
- Mario Öeren
  - Optibrium Ltd., Blenheim House, Denny End Road, Cambridge CB25 9QE, U.K.
- Murray N Robertson
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow G4 0RE, U.K.
- Vasileios A Tatsis
  - Exscientia Ltd., The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Willem P van Hoorn
  - Exscientia Ltd., The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Alexander D Wade
  - Theory of Condensed Matter Group, Cavendish Laboratories, University of Cambridge, Cambridge CB3 0HE, U.K.
- Paul Willis
  - Medicines for Malaria Venture, PO Box 1826, 20 rte de Pre-Bois, 1215 Geneva 15, Switzerland
- Matthew H Todd
  - School of Pharmacy, University College London, London WC1N 1AX, U.K.
7. Zhou L, Fan D, Yin W, Gu W, Wang Z, Liu J, Xu Y, Shi L, Liu M, Ji G. Comparison of seven in silico tools for evaluating of daphnia and fish acute toxicity: case study on Chinese Priority Controlled Chemicals and new chemicals. BMC Bioinformatics 2021; 22:151. [PMID: 33761866 PMCID: PMC7992851 DOI: 10.1186/s12859-020-03903-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/20/2019] [Accepted: 11/24/2020] [Indexed: 10/30/2022]
Abstract
BACKGROUND A number of predictive models for aquatic toxicity are available; however, the accuracy and ease of use of these in silico tools in risk assessment still need further study. This study evaluated the performance of seven in silico tools for predicting acute toxicity to daphnia and fish: ECOSAR, T.E.S.T., Danish QSAR Database, VEGA, KATE, Read Across and Trend Analysis. 37 Priority Controlled Chemicals in China (PCCs) and 92 New Chemicals (NCs) were used as the validation dataset. RESULTS In the quantitative evaluation of PCCs, using the criterion of a 10-fold difference between the experimental value and the estimated value, the accuracy of VEGA was the highest among all of the models in predicting both daphnia and fish acute toxicity, with accuracies of 100% and 90%, respectively, after considering the applicability domain (AD). The performance of KATE, ECOSAR and T.E.S.T. was similar, with accuracies slightly lower than that of VEGA. The accuracy of the Danish QSAR Database was the lowest among the tools for which QSAR is the main mechanism. The performance of Read Across and Trend Analysis was the lowest among all of the tested in silico tools. The predictive ability of the models for NCs was lower than for PCCs, possibly because the NCs never appeared in the training sets of the models, and ECOSAR performed better than the other in silico tools. CONCLUSION QSAR-based in silico tools had greater prediction accuracy than the category approach (Read Across and Trend Analysis) in predicting acute toxicity to daphnia and fish. The category approach requires expert knowledge to be utilized effectively. ECOSAR performs well for both PCCs and NCs, and its application should be promoted in both risk assessment and priority activities. We suggest that distribution of multiple data and water solubility should be considered when developing in silico models. Both more intelligent in silico tools and testing are necessary to identify the hazards of chemicals.
Affiliation(s)
- Linjun Zhou
  - Nanjing Tech University, Nanjing, 211816, China
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Deling Fan
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Wei Yin
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Wen Gu
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Zhen Wang
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Jining Liu
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Yanhua Xu
  - Nanjing Tech University, Nanjing, 211816, China
- Lili Shi
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Mingqing Liu
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China
- Guixiang Ji
  - Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing, 210042, China