1
Fusani L, Cabrera AC. Active learning strategies with COMBINE analysis: new tricks for an old dog. J Comput Aided Mol Des 2019; 33:287-294. [PMID: 30564994 PMCID: PMC7087723 DOI: 10.1007/s10822-018-0181-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/25/2018] [Accepted: 12/14/2018] [Indexed: 11/09/2022]
Abstract
The COMBINE method was designed to study congeneric series of compounds using structural information from ligand-protein complexes. Although very successful, the method has not received the same level of attention as other approaches to Quantitative Structure-Activity Relationships (QSAR), mainly because of the lack of ways to measure the uncertainty of its predictions and the need for large datasets. Active learning, a semi-supervised learning approach that uses uncertainty to enhance model performance while reducing the size of the training set, has been used in this work to address both problems. We propose two estimators of uncertainty: the pool of regressors and the distance to the training set. The performance of the methods has been evaluated by testing the resulting active learning workflows on three diverse datasets: HIV-1 protease inhibitors, Taxol derivatives and BRD4 inhibitors. The proposed strategies were successful in 80% of the cases for the Taxol derivatives and BRD4 inhibitors, and outperformed random selection in the time-split of the HIV-1 protease inhibitors. Our results suggest that AL-COMBINE might be an effective way of producing consistently superior QSAR models with a limited number of samples.
Affiliation(s)
- Lucia Fusani
- Molecular Design UK. GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2NY, UK
- Alvaro Cortes Cabrera
- Data Science and Computational Chemistry, Galchimia S.A. Severo Ochoa 2, Tres Cantos, 28760, Spain.
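The two uncertainty estimators named in the abstract above (pool of regressors, distance to the training set) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the descriptors are plain numeric tuples, and a bootstrapped nearest-neighbour regressor stands in for the regression model that COMBINE analysis would actually fit.

```python
import math
import random
import statistics

def nearest_distance(x, train_X):
    """Euclidean distance from x to its closest training descriptor vector."""
    return min(math.dist(x, t) for t in train_X)

def knn_predict(x, train_X, train_y, k=1):
    """Mean activity of the k training samples closest to x (stand-in regressor)."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: math.dist(x, p[0]))
    return statistics.fmean(y for _, y in ranked[:k])

def pool_uncertainty(x, train_X, train_y, n_models=25, seed=0):
    """Spread of predictions across regressors fitted on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(train_X)
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        preds.append(
            knn_predict(x, [train_X[i] for i in idx], [train_y[i] for i in idx])
        )
    return statistics.pstdev(preds)

def select_next(candidates, train_X, train_y, strategy="distance"):
    """Index of the unlabeled candidate with the highest uncertainty score."""
    if strategy == "distance":
        scores = [nearest_distance(x, train_X) for x in candidates]
    else:
        scores = [pool_uncertainty(x, train_X, train_y) for x in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)

# Hypothetical two-descriptor data, for illustration only.
train_X = [(0.0, 0.0), (1.0, 0.0)]
train_y = [5.0, 6.0]
candidates = [(0.5, 0.1), (4.0, 3.0)]
next_index = select_next(candidates, train_X, train_y, strategy="distance")
```

In an active learning loop, the selected candidate would be measured, moved into the training set, and the model refitted before the next selection.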
2
Fukunishi Y, Nakamura H. Improved estimation of protein-ligand binding free energy by using the ligand-entropy and mobility of water molecules. Pharmaceuticals (Basel) 2013; 6:604-22. [PMID: 24276169 PMCID: PMC3817721 DOI: 10.3390/ph6050604] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 02/27/2013] [Revised: 04/17/2013] [Accepted: 04/17/2013] [Indexed: 11/16/2022]
Abstract
We previously developed the direct interaction approximation (DIA) method to estimate the protein-ligand binding free energy (ΔG). The DIA method estimates the ΔG value from the direct van der Waals and electrostatic interaction energies between the protein and the ligand. In the current study, the effect of the ligand entropy was introduced together with protein dynamic properties obtained from molecular dynamics simulations, and the interaction between each residue of the protein and the ligand was weighted according to the hydration of that residue. The molecular dynamics simulation of the apo target protein gave the hydration effect of each residue, under the assumption that residues that strongly bind water molecules are important in protein-ligand binding. These two effects improved the reliability of the DIA method. In fact, the parameters used in the DIA became independent of the target protein. The average error of the ΔG estimation was 1.3 kcal/mol, and the correlation coefficient between the experimental and calculated ΔG values was 0.75.
Affiliation(s)
- Yoshifumi Fukunishi
- Molecular Profiling Research Center for Drug Discovery (molprof), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, Tokyo 135-0064, Japan
- Author to whom correspondence should be addressed; Tel.: +81-3-3599-8290; Fax: +81-3-3599-8099
- Haruki Nakamura
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan
3
Fukunishi Y, Nakamura H. Statistical estimation of the protein-ligand binding free energy based on direct protein-ligand interaction obtained by molecular dynamics simulation. Pharmaceuticals (Basel) 2012; 5:1064-79. [PMID: 24281257 PMCID: PMC3816655 DOI: 10.3390/ph5101064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/29/2012] [Revised: 09/19/2012] [Accepted: 09/21/2012] [Indexed: 11/28/2022]
Abstract
We have developed a method for estimating protein-ligand binding free energy (ΔG) based on the direct protein-ligand interaction obtained by a molecular dynamics simulation. Using this method, we estimated the ΔG value statistically by the average values of the van der Waals and electrostatic interactions between each amino acid of the target protein and the ligand molecule. In addition, we introduced fluctuations in the accessible surface area (ASA) and dihedral angles of the protein-ligand complex system as the entropy terms of the ΔG estimation. The present method included the fluctuation term of structural change of the protein and the effective dielectric constant. We applied this method to 34 protein-ligand complex structures. As a result, the correlation coefficient between the experimental and calculated ΔG values was 0.81, and the average error of ΔG was 1.2 kcal/mol with the use of the fixed parameters. These results were obtained from a 2 ns molecular dynamics simulation.
Affiliation(s)
- Yoshifumi Fukunishi
- Biological Information Research Center (BIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, Tokyo 135-0064, Japan
- Haruki Nakamura
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan
4
Abia D, Bastolla U, Chacón P, Fábrega C, Gago F, Morreale A, Tramontano A. In memoriam. Proteins 2010; 78:iii-viii. [DOI: 10.1002/prot.22660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
5
Henrich S, Feierberg I, Wang T, Blomberg N, Wade RC. Comparative binding energy analysis for binding affinity and target selectivity prediction. Proteins 2009; 78:135-53. [DOI: 10.1002/prot.22579] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 11/07/2022]
6
Gil-Redondo R, Klett J, Gago F, Morreale A. gCOMBINE: A graphical user interface to perform structure-based comparative binding energy (COMBINE) analysis on a set of ligand-receptor complexes. Proteins 2009; 78:162-72. [DOI: 10.1002/prot.22543] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]
7
Martín-Santamaría S, Muñoz-Muriedas J, Luque FJ, Gago F. Modulation of Binding Strength in Several Classes of Active Site Inhibitors of Acetylcholinesterase Studied by Comparative Binding Energy Analysis. J Med Chem 2004; 47:4471-82. [PMID: 15317459 DOI: 10.1021/jm049877p] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
The comparative binding energy (COMBINE) methodology has been used to identify the key residues that modulate the inhibitory potencies of three structurally different classes of acetylcholinesterase inhibitors (tacrines, huprines, and dihydroquinazolines) targeting the catalytic active site of this enzyme. The extended set of energy descriptors and the partial least-squares methodology used by COMBINE analysis on a unique training set containing all the compounds yielded an interpretable model that was able to fit and predict the activities of the whole series of inhibitors reasonably well (r² = 0.91 and q² = 0.76, 4 principal components). A more robust model (q² = 0.81 and SDEP = 0.25, 3 principal components) was obtained when the same chemometric analysis was applied to the huprines set alone, but the method was unable to provide predictive models for the other two families when they were treated separately from the rest. This finding appears to indicate that the enrichment in chemical information brought about by the inclusion of different classes of compounds into a single training set can be beneficial when an internally consistent set of pharmacological data can be derived. The COMBINE model was externally validated when it was shown to predict the activity of an additional set of compounds that were not employed in model construction. Remarkably, the differences in inhibitory potency within the whole series were found to be finely tuned by the electrostatic contribution to the desolvation of the binding site and a network of secondary interactions established between the inhibitor and several protein residues that are distinct from those directly involved in the anchoring of the ligand. This information can now be used to advantage in the design of more potent inhibitors.
Affiliation(s)
- Sonsoles Martín-Santamaría
- Departament de Fisicoquímica, Facultat de Farmàcia, Universitat de Barcelona, Av. Diagonal 643, 08028 Barcelona, Spain
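The cross-validation statistics quoted in the abstract above (the cross-validated q² and SDEP) are commonly computed from leave-one-out predictions as sketched below. The definitions used here are the conventional chemometric ones and are an assumption about this paper, which does not spell them out in the abstract; the function name and the example values are hypothetical.

```python
import math

def loo_statistics(observed, loo_predicted):
    """Return (q2, SDEP) from observed activities and leave-one-out predictions.

    PRESS is the sum of squared prediction errors; q2 compares it with the
    total sum of squares around the mean observed activity, and SDEP is the
    root of PRESS divided by the number of compounds.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    press = sum((y - p) ** 2 for y, p in zip(observed, loo_predicted))
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    q2 = 1.0 - press / ss_total
    sdep = math.sqrt(press / n)
    return q2, sdep

# Hypothetical activities (e.g. pIC50) and their leave-one-out predictions.
observed = [6.1, 7.4, 8.0, 6.8, 7.9]
predicted = [6.4, 7.1, 7.8, 7.0, 7.5]
q2, sdep = loo_statistics(observed, predicted)
```

A model that predicts no better than the mean observed activity gives q² = 0, which is why a q² well above zero is read as evidence of internal predictive ability.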
8
Rodríguez-Barrios F, Gago F. Chemometrical identification of mutations in HIV-1 reverse transcriptase conferring resistance or enhanced sensitivity to arylsulfonylbenzonitriles. J Am Chem Soc 2004; 126:2718-9. [PMID: 14995186 DOI: 10.1021/ja038893t] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
A comparative binding energy (COMBINE) analysis on a series of nonnucleosidic reverse transcriptase inhibitors yields a QSAR model with high predictive ability and correctly identifies the effect of mutations on relevant protein residues.
9
Murcia M, Ortiz AR. Virtual screening with flexible docking and COMBINE-based models. Application to a series of factor Xa inhibitors. J Med Chem 2004; 47:805-20. [PMID: 14761183 DOI: 10.1021/jm030137a] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Indexed: 11/28/2022]
Abstract
A two-step, fully automatic virtual screening procedure consisting of flexible docking followed by activity prediction by COMparative BINding Energy (COMBINE) analysis is presented. This novel approach has been successfully applied, as an example with medicinal chemistry interest, to a recently reported series of 133 factor Xa (fXa) inhibitors whose activities encompass 4 orders of magnitude. The docking algorithm is linked to the COMBINE analysis program and used to derive independent regression models of the 133 inhibitors docked within three different fXa structures (PDB entries 1fjs, 1f0r, and 1xka), so as to explore the effect of receptor conformation on the overall results. Reliable docking conformations and predictive regression models requiring eight latent variables could be derived for two of the fXa structures, with the best model achieving a Q² of 0.63 and a standard deviation of errors of prediction (SDEP) of 0.51 (leave-one-out). The two-step procedure was then employed to screen a designed virtual library of 112 ligands, containing both active and inactive compounds. While docking energies alone could perform well in selecting hits, including structurally diverse ones, inclusion of COMBINE analysis regression models provided improved rankings for the identification of structurally related molecules in external sets. In our best case, a recognition rate of approximately 80% of known binders at a false-positive rate of approximately 15% was achieved, corresponding to an enrichment factor of approximately 450% over random.
Affiliation(s)
- Marta Murcia
- Department of Physiology & Biophysics, Mount Sinai School of Medicine, New York University, One Gustave Levy Place, Box 1218, New York, New York 10029, USA
10
Kmunícek J, Bohác M, Luengo S, Gago F, Wade RC, Damborský J. Comparative binding energy analysis of haloalkane dehalogenase substrates: modelling of enzyme-substrate complexes by molecular docking and quantum mechanical calculations. J Comput Aided Mol Des 2003; 17:299-311. [PMID: 14635723 DOI: 10.1023/a:1026159215220] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
We evaluate the applicability of automated molecular docking techniques and quantum mechanical calculations to the construction of a set of structures of enzyme-substrate complexes for use in Comparative binding energy (COMBINE) analysis to obtain 3D structure-activity relationships. The data set studied consists of the complexes of eighteen substrates docked within the active site of haloalkane dehalogenase (DhlA) from Xanthobacter autotrophicus GJ10. The results of the COMBINE analysis are compared with previously reported data obtained for the same dataset from modelled complexes that were based on an experimentally determined structure of the DhlA-dichloroethane complex. The quality of fit and the internal predictive power of the two COMBINE models are comparable, but better external predictions are obtained with the new approach. Both models show a similar composition of the principal components. Small differences in the relative contributions that are assigned to important residues for explaining binding affinity differences can be directly linked to structural differences in the modelled enzyme-substrate complexes: (i) rotation of all substrates in the active site about their longitudinal axis, (ii) repositioning of the ring of epihalohydrines and the halogen substituents of 1,2-dihalopropanes, and (iii) altered conformation of the long-chain molecules (halobutanes and halohexanes). For external validation, both a novel substrate not included in the training series and two different mutant proteins were used. The results obtained can be useful in the future to guide the rational engineering of substrate specificity in DhlA and other related enzymes.
Affiliation(s)
- Jan Kmunícek
- National Centre for Biomolecular Research, Masaryk University, Kotlarska 2, 611 37 Brno, Czech Republic
11
de la Fuente JA, Manzanaro S, Martín MJ, de Quesada TG, Reymundo I, Luengo SM, Gago F. Synthesis, Activity, and Molecular Modeling Studies of Novel Human Aldose Reductase Inhibitors Based on a Marine Natural Product. J Med Chem 2003; 46:5208-21. [PMID: 14613323 DOI: 10.1021/jm030957n] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Abstract
Aldose reductase (ALR2) has been implicated in the etiology of diabetic complications, including blindness. Because of the limited number of currently available drugs for the prevention of these long-term complications, the discovery of new ALR2 inhibitors appears highly desirable. In this study, a polybrominated diphenyl ether (1) naturally occurring in a marine sponge was found to inhibit recombinant human ALR2 with an IC50 of 6.4 µM. A series of polyhalogenated analogues that were synthesized and tested in vitro to explore the structure-activity relationships displayed various degrees of inhibitory activity. The most active compounds were also capable of preventing sorbitol accumulation inside human retinal cells. In this cell-based assay, the most potent synthesized analogue (16) showed a 17-fold increase in inhibitory activity compared to that of sorbinil (IC50 = 0.24 vs 4 µM). A molecular representation of human ALR2 in complex with the natural product was built using homology modeling, automated docking, and energy refinement methods. AMBER parameters for the halogen atoms were derived and calibrated using condensed phase molecular dynamics simulations of fluorobenzene, chlorobenzene, and bromobenzene. Inhibitor binding is proposed to cause a conformational change similar to that recently reported for zenarestat. A free energy perturbation thermodynamic cycle allowed us to assess the importance of a crucial bromine atom that distinguishes the active lead compound from a much less active close natural analogue. Remarkably, the spatial location of this bromine atom is equivalent to that occupied by the only bromine atom present in zenarestat.