1
He S, Barón A, Munteanu CR, de Bilbao B, Casañola-Martin GM, Chelu M, Musuc AM, Bediaga H, Ascencio E, Castellanos-Rubio I, Arrasate S, Pazos A, Insausti M, Rasulev B, González-Díaz H. Drug Release Nanoparticle System Design: Data Set Compilation and Machine Learning Modeling. ACS APPLIED MATERIALS & INTERFACES 2025; 17:5290-5306. [PMID: 39800937 DOI: 10.1021/acsami.4c16800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2025]
Abstract
Magnetic nanoparticles (NPs) are gaining significant interest in the field of biomedical functional nanomaterials because of their distinctive chemical and physical characteristics, particularly in drug delivery and magnetic hyperthermia applications. In this paper, we experimentally synthesized and characterized new Fe3O4-based NPs, functionalizing their surface with a 5-TAMRA cadaverine modified copolymer consisting of PMAO and PEG. Despite these advancements, many combinations of NP cores and coatings remain unexplored. To address this, we created a new data set of NP systems from public sources. Herein, 11 different AI/ML algorithms were used to develop the predictive AI/ML models. The linear discriminant analysis (LDA) and random forest (RF) models showed high values of sensitivity and specificity (>0.9) in training/validation series and 3-fold cross-validation, respectively. The AI/ML models are able to predict 14 output properties (CC50 (μM), EC50 (μM), inhibition (%), etc.) for all combinations of 54 different NP core classes vs. 25 different coats and vs. 41 different cell lines, allowing the shortlisting of the best results for experimental assays. The results of this work may help to reduce the cost of traditional trial-and-error procedures.
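To give a sense of the search space this abstract describes, the sketch below enumerates every core/coat/cell-line triple and shortlists the highest-scoring ones. The labels, the `toy_score` function, and the `shortlist` helper are hypothetical placeholders introduced only for illustration; the real class names and the trained models are in the paper and are not reproduced here.

```python
import heapq
from itertools import product

# Hypothetical placeholder labels; only the counts (54, 25, 41) come from the abstract.
cores = [f"core_{i}" for i in range(54)]
coats = [f"coat_{i}" for i in range(25)]
cell_lines = [f"cell_{i}" for i in range(41)]


def count_systems(cores, coats, cell_lines):
    """Count every (core, coat, cell line) triple by lazy enumeration."""
    return sum(1 for _ in product(cores, coats, cell_lines))


def shortlist(systems, score, k=5):
    """Return the k systems with the highest score (ties keep enumeration order)."""
    return heapq.nlargest(k, systems, key=score)


def toy_score(system):
    """Placeholder scorer: sums the numeric suffixes. A trained model would go here."""
    return sum(int(label.split("_")[1]) for label in system)


n_systems = count_systems(cores, coats, cell_lines)  # 54 * 25 * 41 = 55350
best = shortlist(product(cores, coats, cell_lines), toy_score, k=3)
```

With 14 output properties per triple, a full screen would need 14 predictions for each of the 55,350 combinations, which is why ranking by a model rather than by assay is attractive.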
Affiliation(s)
- Shan He
  - Department of Coatings and Polymer Materials, North Dakota State University, Fargo, North Dakota 58102, United States
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - IKERDATA S.L., UPVEHU ZITEK, Rectorate Building, 48940 Leioa, Basque Country, Spain
- Ander Barón
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Cristian R Munteanu
  - Computer Science Faculty, University of A Coruna, CITIC, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Begoña de Bilbao
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Gerardo M Casañola-Martin
  - Department of Coatings and Polymer Materials, North Dakota State University, Fargo, North Dakota 58102, United States
- Mariana Chelu
  - "Ilie Murgulescu" Institute of Physical Chemistry, 202 Spl. Independentei, 060021 Bucharest, Romania
- Adina Magdalena Musuc
  - "Ilie Murgulescu" Institute of Physical Chemistry, 202 Spl. Independentei, 060021 Bucharest, Romania
- Harbil Bediaga
  - IKERDATA S.L., UPVEHU ZITEK, Rectorate Building, 48940 Leioa, Basque Country, Spain
  - Painting Department, Fine Arts Faculty, University of the Basque Country UPV/EHU, 48940, Leioa, Biscay, Basque Country, Spain
- Estefania Ascencio
  - Department of Coatings and Polymer Materials, North Dakota State University, Fargo, North Dakota 58102, United States
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - IKERDATA S.L., UPVEHU ZITEK, Rectorate Building, 48940 Leioa, Basque Country, Spain
- Idoia Castellanos-Rubio
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Alejandro Pazos
  - Computer Science Faculty, University of A Coruna, CITIC, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Maite Insausti
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - BCMaterials, Basque Center for Materials, Applications and Nanostructures, 48940 Leioa, Spain
- Bakhtiyor Rasulev
  - Department of Coatings and Polymer Materials, North Dakota State University, Fargo, North Dakota 58102, United States
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - BIOFISIKA: Basque Center for Biophysics, CSIC-UPVEH, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
2
Bediaga-Bañeres H, Moreno-Benítez I, Arrasate S, Pérez-Álvarez L, Halder AK, Cordeiro MNDS, González-Díaz H, Vilas-Vilela JL. Artificial Intelligence-Driven Modeling for Hydrogel Three-Dimensional Printing: Computational and Experimental Cases of Study. Polymers (Basel) 2025; 17:121. [PMID: 39795524 PMCID: PMC11723248 DOI: 10.3390/polym17010121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Revised: 12/26/2024] [Accepted: 12/29/2024] [Indexed: 01/13/2025] Open
Abstract
Determining the values of various properties for new bio-inks for 3D printing is a very important task in the design of new materials. For this purpose, a large number of experimental works have been consulted, and a database with more than 1200 bioprinting tests has been created. These tests cover different combinations of conditions in terms of print pressure, temperature, and needle values, for example. These data are difficult to deal with in terms of determining combinations of conditions to optimize the tests and analyze new options. The best model demonstrated a specificity (Sp) of 88.4% and a sensitivity (Sn) of 86.2% in the training series while achieving an Sp of 85.9% and an Sn of 80.3% in the external validation series. This model utilizes operators based on perturbation theory to analyze the complexity of the data. For comparative purposes, neural networks have been used, and very similar results have been obtained. The developed tool could easily be applied to predict the properties of bioprinting assays in silico. These findings could significantly improve the efficiency and accuracy of predictive models in bioprinting without resorting to trial-and-error tests, thereby saving time and funds. Ultimately, this tool may help pave the way for advances in personalized medicine and tissue engineering.
Affiliation(s)
- Harbil Bediaga-Bañeres
  - Department of Physical Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Isabel Moreno-Benítez
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Leyre Pérez-Álvarez
  - Department of Physical Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - BCMaterials, Basque Center for Materials, Applications and Nanostructures, UPV/EHU Science Park, 48940 Leioa, Spain
- Amit K. Halder
  - LAQV-REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
  - Dr. B. C. Roy College of Pharmacy and Allied Health Sciences, Durgapur 713206, India
- M. Natalia D. S. Cordeiro
  - LAQV-REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Basque Center for Biophysics, CSIC-UPV/EHU, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
- José Luis Vilas-Vilela
  - Department of Physical Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - BCMaterials, Basque Center for Materials, Applications and Nanostructures, UPV/EHU Science Park, 48940 Leioa, Spain
3
He S, Segura Abarrategi J, Bediaga H, Arrasate S, González-Díaz H. On the additive artificial intelligence-based discovery of nanoparticle neurodegenerative disease drug delivery systems. BEILSTEIN JOURNAL OF NANOTECHNOLOGY 2024; 15:535-555. [PMID: 38774585 PMCID: PMC11106676 DOI: 10.3762/bjnano.15.47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/23/2024] [Indexed: 05/24/2024]
Abstract
Neurodegenerative diseases are characterized by slowly progressing neuronal cell death. Conventional drug treatment strategies often fail because of poor solubility, low bioavailability, and the inability of the drugs to effectively cross the blood-brain barrier. Therefore, the development of new neurodegenerative disease drugs (NDDs) requires immediate attention. Nanoparticle (NP) systems are of increasing interest for transporting NDDs to the central nervous system. However, discovering effective nanoparticle neuronal disease drug delivery systems (N2D3Ss) is challenging because of the vast number of combinations of NP and NDD compounds, as well as the various assays involved. Artificial intelligence/machine learning (AI/ML) algorithms have the potential to accelerate this process by predicting the most promising NDD and NP candidates for assaying. Nevertheless, the relatively limited amount of reported data on N2D3S activity compared to assayed NDDs makes AI/ML analysis challenging. In this work, the IFPTML technique, which combines information fusion (IF), perturbation theory (PT), and machine learning (ML), was employed to address this challenge. Initially, we conducted the fusion into a unified dataset comprising 4403 NDD assays from ChEMBL and 260 NP cytotoxicity assays from journal articles. Through a resampling process, three new working datasets were generated, each containing 500,000 cases. We utilized linear discriminant analysis (LDA) along with artificial neural network (ANN) algorithms, such as multilayer perceptron (MLP) and deep learning networks (DLN), to construct linear and non-linear IFPTML models. The IFPTML-LDA models exhibited sensitivity (Sn) and specificity (Sp) values in the range of 70% to 73% (>375,000 training cases) and 70% to 80% (>125,000 validation cases), respectively. In contrast, the IFPTML-MLP and IFPTML-DLN achieved Sn and Sp values in the range of 85% to 86% for both training and validation series. Additionally, IFPTML-ANN models showed an area under the receiver operating curve (AUROC) of approximately 0.93 to 0.95. These results indicate that the IFPTML models could serve as valuable tools in the design of drug delivery systems for neurosciences.
Affiliation(s)
- Shan He
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Julen Segura Abarrategi
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Harbil Bediaga
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - Painting Department, Fine Arts Faculty, University of the Basque Country UPV/EHU, 48940, Leioa, Biscay, Basque Country, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Instituto Biofisika (UPV/EHU-CSIC), 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
4
Diéguez-Santana K, Casañola-Martin GM, Torres R, Rasulev B, Green JR, González-Díaz H. Machine Learning Study of Metabolic Networks vs ChEMBL Data of Antibacterial Compounds. Mol Pharm 2022; 19:2151-2163. [PMID: 35671399 PMCID: PMC9986951 DOI: 10.1021/acs.molpharmaceut.2c00029] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
Abstract
Antibacterial drugs (AD) change the metabolic status of bacteria, contributing to bacterial death. However, antibiotic resistance and the emergence of multidrug-resistant bacteria increase interest in understanding metabolic network (MN) mutations and the interaction of AD vs MN. In this study, we employed the IFPTML = Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML) algorithm on a huge dataset from the ChEMBL database, which contains >155,000 AD assays vs >40 MNs of multiple bacteria species. We built a linear discriminant analysis (LDA) and 17 ML models centered on the linear index and based on atoms to predict antibacterial compounds. The IFPTML-LDA model presented the following results for the training subset: specificity (Sp) = 76% out of 70,000 cases, sensitivity (Sn) = 70%, and Accuracy (Acc) = 73%. The same model also presented the following results for the validation subsets: Sp = 76%, Sn = 70%, and Acc = 73.1%. Among the IFPTML nonlinear models, the k nearest neighbors (KNN) showed the best results with Sn = 99.2%, Sp = 95.5%, Acc = 97.4%, and Area Under Receiver Operating Characteristic (AUROC) = 0.998 in training sets. In the validation series, the Random Forest had the best results: Sn = 93.96% and Sp = 87.02% (AUROC = 0.945). The IFPTML linear and nonlinear models regarding the ADs vs MNs have good statistical parameters, and they could contribute toward finding new metabolic mutations in antibiotic resistance and reducing time/costs in antibacterial drug research.
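The sensitivity (Sn), specificity (Sp), and accuracy (Acc) values quoted in these abstracts follow the standard confusion-matrix definitions. A minimal sketch, assuming binary labels coded as 1 (active) and 0 (inactive); the function names are illustrative and not taken from the papers:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 1 (active) / 0 (inactive)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 or 1")
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Sn = TP/(TP+FN), Sp = TN/(TN+FP), Acc = (TP+TN)/N."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "Sn": tp / (tp + fn),
        "Sp": tn / (tn + fp),
        "Acc": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy data: 4 active and 6 inactive cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
```

Reporting Sn and Sp separately, as the abstracts do, matters when the active and inactive classes are unbalanced, because accuracy alone can hide poor performance on the minority class.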
Affiliation(s)
- Karel Diéguez-Santana
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Universidad Regional Amazónica IKIAM, Tena, Napo 150150, Ecuador
- Gerardo M Casañola-Martin
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
  - Department of Systems and Computer Engineering, Carleton University, K1S5B6 Ottawa, Ontario, Canada
- Roldan Torres
  - Universidad Regional Amazónica IKIAM, Tena, Napo 150150, Ecuador
- Bakhtiyor Rasulev
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
- James R Green
  - Department of Systems and Computer Engineering, Carleton University, K1S5B6 Ottawa, Ontario, Canada
- Humbert González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - BIOFISIKA, Basque Center for Biophysics CSIC-UPVEH, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
5
Quevedo-Tumailli V, Ortega-Tenezaca B, González-Díaz H. IFPTML Mapping of Drug Graphs with Protein and Chromosome Structural Networks vs. Pre-Clinical Assay Information for Discovery of Antimalarial Compounds. Int J Mol Sci 2021; 22:13066. [PMID: 34884870 PMCID: PMC8657696 DOI: 10.3390/ijms222313066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/18/2021] [Revised: 11/23/2021] [Accepted: 11/24/2021] [Indexed: 11/16/2022] Open
Abstract
Parasite species of the genus Plasmodium cause Malaria, which remains a major global health problem due to parasite resistance to available Antimalarial drugs and increasing treatment costs. Consequently, computational prediction of new Antimalarial compounds with novel targets in the proteome of Plasmodium sp. is a very important goal for the pharmaceutical industry. We can expect that the success of the pre-clinical assay depends on the conditions of the assay per se, the chemical structure of the drug, the structure of the target protein, as well as on factors governing the expression of this protein in the proteome such as gene (Deoxyribonucleic acid, DNA) sequence and/or chromosome structure. However, there are no reports of computational models that consider all these factors simultaneously. Some of the difficulties for this kind of analysis are the dispersion of data in different datasets, the high heterogeneity of data, etc. In this work, we analyzed three databases, ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information-Genome Data Viewer), to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential Antimalarial compounds including numeric descriptors (variables) for the structure of compounds as well as a huge amount of information about the conditions of assays. The NCBI-GDV and UniProt datasets include the sequence of genes, proteins, and their functions. In addition, we also created two partitions (cassayj = caj and cdataj = cdj) of categorical variables from the ChEMBL dataset. These partitions contain variables that encode information about experimental conditions of preclinical assays (caj) or about the nature and quality of data (cdj). These categorical variables include information about 22 parameters of biological activity (ca0), 28 target proteins (ca1), and 9 organisms of assay (ca2), etc. We also created another partition (cprotj = cpj) including categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes (cp0), 10 chromosomes (cp1), gene orientation (cp2), and 31 protein functions (cp3). We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to fuse all this information (from three databases) and train a predictive model. Shannon's entropy measure Shk (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes in the same information scale. Perturbation Theory Operators (PTOs) with the form of Moving Average (MA) operators have been used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC presented the best performance with Sensitivity Sn(%) = 83.6/85.1, and Specificity Sp(%) = 89.8/89.7 for training/validation sets, respectively. This model could become a useful tool for the optimization of preclinical assays of new Antimalarial compounds vs. different proteins in the proteome of Plasmodium.
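The moving-average perturbation operators mentioned in this abstract can be read as the deviation of a structural descriptor from its mean within a subset of cases that share a categorical condition. The sketch below illustrates that reading only; the function name, the toy descriptor values, and the activity-parameter labels are assumptions for illustration and do not reproduce the authors' implementation.

```python
from collections import defaultdict


def moving_average_deviation(values, categories):
    """Subtract from each value the mean of all values sharing its category.

    Returns (deviations, means), where means maps category -> subset average.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, category in zip(values, categories):
        sums[category] += value
        counts[category] += 1
    means = {category: sums[category] / counts[category] for category in sums}
    deviations = [value - means[category] for value, category in zip(values, categories)]
    return deviations, means


# Toy descriptor values grouped by a hypothetical assay condition
# (for example, the biological activity parameter).
descriptor = [1.0, 3.0, 10.0, 14.0]
condition = ["IC50", "IC50", "Ki", "Ki"]
deviations, means = moving_average_deviation(descriptor, condition)
```

Centering each descriptor on its condition-specific mean lets a single model compare cases measured under different assay conditions on a common scale.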
Affiliation(s)
- Viviana Quevedo-Tumailli
  - Grupo RNASA-IMEDIR, Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain
  - Research Department, Puyo Campus, Universidad Estatal Amazónica, Puyo 160150, Ecuador
- Bernabe Ortega-Tenezaca
  - Grupo RNASA-IMEDIR, Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain
  - Information and Communications Technology Management Department, Puyo Campus, Universidad Estatal Amazónica, Puyo 160150, Ecuador
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
  - BIOFISIKA, Basque Centre for Biophysics, CSIC-UPV/EHU, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
6
Diéguez-Santana K, González-Díaz H. Towards machine learning discovery of dual antibacterial drug-nanoparticle systems. NANOSCALE 2021; 13:17854-17870. [PMID: 34671801 DOI: 10.1039/d1nr04178a] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
Artificial Intelligence/Machine Learning (AI/ML) algorithms may speed up the design of DADNP systems formed by Antibacterial Drugs (AD) and Nanoparticles (NP). In this work, we used the IFPTML = Information Fusion (IF) + Perturbation-Theory (PT) + Machine Learning (ML) algorithm for the first time to study a large dataset of putative DADNP systems composed of >165 000 ChEMBL AD assays and 300 NP assays vs. multiple bacteria species. We trained alternative models with Linear Discriminant Analysis (LDA), Artificial Neural Networks (ANN), Bayesian Networks (BNN), K-Nearest Neighbour (KNN) and other algorithms. The IFPTML-LDA model was simpler, with values of Sp ≈ 90% and Sn ≈ 74% in both training (>124 K cases) and validation (>41 K cases) series. The IFPTML-ANN and KNN models are notably more complicated, although they are more balanced, with Sn ≈ Sp ≈ 88.5%-99.0% and AUROC ≈ 0.94-0.99 in both series. We also carried out a simulation (>1900 calculations) of the expected behavior for putative DADNPs in 72 different biological assays. The putative DADNPs studied are formed by 27 different drugs with multiple classes of NP and types of coats. In addition, we tested the validity of our additive model with 80 DADNP complexes experimentally synthesized and biologically tested (reported in >45 papers). All these DADNPs show values of MIC < 50 μg mL⁻¹ (cutoff used), better than the MIC of the AD and NP alone (synergistic or additive effect). The assays involve DADNP complexes with 10 types of NP, 6 coating materials, NP size range 5-100 nm vs. 15 different antibiotics, and 12 bacteria species. The IFPTML-LDA model correctly classified 100% (80 out of 80) of the DADNP complexes as biologically active. The IFPTML additive strategy may become a useful tool to assist the design of DADNP systems for antibacterial therapy, taking into consideration only information about the AD and NP components separately.
Affiliation(s)
- Karel Diéguez-Santana
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Basque Center for Biophysics CSIC-UPVEH, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
7
Prediction of Anti-Glioblastoma Drug-Decorated Nanoparticle Delivery Systems Using Molecular Descriptors and Machine Learning. Int J Mol Sci 2021; 22:ijms222111519. [PMID: 34768951 PMCID: PMC8584266 DOI: 10.3390/ijms222111519] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/14/2021] [Revised: 10/08/2021] [Accepted: 10/22/2021] [Indexed: 12/22/2022] Open
Abstract
The theoretical prediction of drug-decorated nanoparticles (DDNPs) has become a very important task in medical applications. For the current paper, Perturbation Theory Machine Learning (PTML) models were built to predict the probability of different pairs of drugs and nanoparticles creating DDNP complexes with anti-glioblastoma activity. PTML models use the perturbations of molecular descriptors of drugs and nanoparticles as inputs in experimental conditions. The raw dataset was obtained by mixing the nanoparticle experimental data with drug assays from the ChEMBL database. Ten types of machine learning methods have been tested. Only 41 features have been selected for 855,129 drug-nanoparticle complexes. The best model was obtained with the Bagging classifier, an ensemble meta-estimator based on 20 decision trees, with an area under the receiver operating characteristic curve (AUROC) of 0.96, and an accuracy of 87% (test subset). This model could be useful for the virtual screening of nanoparticle-drug complexes in glioblastoma. All the calculations can be reproduced with the datasets and python scripts, which are freely available as a GitHub repository from authors.
8
Ortega-Tenezaca B, Quevedo-Tumailli V, Bediaga H, Collados J, Arrasate S, Madariaga G, Munteanu CR, Cordeiro MND, González-Díaz H. PTML Multi-Label Algorithms: Models, Software, and Applications. Curr Top Med Chem 2020; 20:2326-2337. [DOI: 10.2174/1568026620666200916122616] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/18/2020] [Revised: 07/19/2020] [Accepted: 07/20/2020] [Indexed: 12/17/2022]
Abstract
By combining Machine Learning (ML) methods with Perturbation Theory (PT), it is possible to develop predictive models for a variety of response targets. Such a combination, often known as Perturbation Theory Machine Learning (PTML) modeling, comprises a set of techniques that can handle various physical and chemical properties of different organisms, complex biological or material systems under multiple input conditions. In so doing, these techniques effectively integrate a manifold of diverse chemical and biological data into a single computational framework that can then be applied for screening lead chemicals as well as to find clues for improving the targeted response(s). PTML models have thus been extremely helpful in drug or material design efforts and found to be predictive and applicable across a broad space of systems. After a brief outline of the applied methodology, this work reviews the different uses of PTML in Medicinal Chemistry, as well as in other applications. Finally, we cover the development of software available nowadays for setting up PTML models from large datasets.
Affiliation(s)
- Harbil Bediaga
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Jon Collados
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Gotzon Madariaga
  - Department of Condensed Matter Physics, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Cristian R Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruna, Spain
- M. Natália D.S. Cordeiro
  - LAQV@REQUIMTE, Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
- Humbert González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
9
Urista DV, Carrué DB, Otero I, Arrasate S, Quevedo-Tumailli VF, Gestal M, González-Díaz H, Munteanu CR. Prediction of Antimalarial Drug-Decorated Nanoparticle Delivery Systems with Random Forest Models. BIOLOGY 2020; 9:biology9080198. [PMID: 32751710 PMCID: PMC7465777 DOI: 10.3390/biology9080198] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/24/2020] [Revised: 07/22/2020] [Accepted: 07/27/2020] [Indexed: 12/13/2022]
Abstract
Drug-decorated nanoparticles (DDNPs) have important medical applications. The current work combined Perturbation Theory with Machine Learning and Information Fusion (PTMLIF). Thus, PTMLIF models were proposed to predict the probability of nanoparticle–compound/drug complexes having antimalarial activity (against Plasmodium). The aim is to save experimental resources and time by using a virtual screening for DDNPs. The raw data was obtained by the fusion of experimental data for nanoparticles with compound chemical assays from the ChEMBL database. The inputs for the eight Machine Learning classifiers were transformed features of drugs/compounds and nanoparticles as perturbations of molecular descriptors in specific experimental conditions (experiment-centered features). The resulting dataset contains 107 input features and 249,992 examples. The best classification model was provided by Random Forest, with 27 selected features of drugs/compounds and nanoparticles in all experimental conditions considered. The high performance of the model was demonstrated by the mean Area Under the Receiver Operating Characteristics (AUC) in a test subset with a value of 0.9921 ± 0.000244 (10-fold cross-validation). The results demonstrated the power of information fusion of the experimental-centered features of drugs/compounds and nanoparticles for the prediction of nanoparticle–compound antimalarial activity. The scripts and dataset for this project are available in the open GitHub repository.
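The "mean AUC ± spread over 10 folds" figure reported here can be unpacked as follows. This is a minimal sketch, assuming binary labels (1 = active, 0 = inactive) and one score per case; it computes the AUROC with the pairwise-ranking definition rather than any particular library, and `summarize_folds` is an illustrative helper, not code from the authors' repository.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def summarize_folds(fold_results):
    """Mean and sample standard deviation of the per-fold AUROC values.

    fold_results is a list of (labels, scores) pairs, one per test fold.
    """
    aucs = [auroc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), stdev(aucs)


# Two toy test folds; a 10-fold run would supply ten such pairs.
folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.2]),
]
mean_auc, sd_auc = summarize_folds(folds)
```

The pairwise loop is quadratic in the number of cases, so for data sets of the size described in the abstract a rank-based implementation from an established library would be the practical choice.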
Affiliation(s)
- Diana V. Urista
  - Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain
- Diego B. Carrué
  - RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain
- Iago Otero
  - RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain
- Sonia Arrasate
  - Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain
- Viviana F. Quevedo-Tumailli
  - RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain
  - Universidad Estatal Amazónica UEA, Km. 2 1/2 vía Puyo a Tena (paso lateral), Puyo 160150, Pastaza, Ecuador
- Marcos Gestal
  - RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), Hospital Teresa Herrera, Xubias de Arriba 84, 15006 A Coruña, Spain
- Humbert González-Díaz
  - Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, Alameda Urquijo 36, 48011 Bilbao, Spain
  - Basque Centre for Biophysics CSIC-UPVEHU, University of Basque Country UPV/EHU, Barrio Sarriena, 48940 Leioa, Spain
- Cristian R. Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), Hospital Teresa Herrera, Xubias de Arriba 84, 15006 A Coruña, Spain
10
Ilan Y. Overcoming randomness does not rule out the importance of inherent randomness for functionality. J Biosci 2019. [DOI: 10.1007/s12038-019-9958-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
11
Concu R, D. S. Cordeiro MN, Munteanu CR, González-Díaz H. PTML Model of Enzyme Subclasses for Mining the Proteome of Biofuel Producing Microorganisms. J Proteome Res 2019; 18:2735-2746. [DOI: 10.1021/acs.jproteome.8b00949] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/29/2022]
Affiliation(s)
- Riccardo Concu
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- M. Natália. D. S. Cordeiro
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Cristian R. Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruña, 15071 A Coruña, Spain
  - INIBIC Biomedical Research Institute of Coruña, CHUAC University Hospital, 15006 A Coruña, Spain
- Humbert González-Díaz
  - Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Biscay, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain