1
Easwaran M, Govindaraj RG, Naderi M, Brylinski M, De Zoysa M, Shin HJ. Evaluating the antibacterial activity of engineered phage ФEcSw endolysin against multidrug-resistant Escherichia coli strain Sw1. Int J Antimicrob Agents 2025; 65:107395. [PMID: 39612993 DOI: 10.1016/j.ijantimicag.2024.107395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2024] [Revised: 11/14/2024] [Accepted: 11/20/2024] [Indexed: 12/01/2024]
Abstract
OBJECTIVE Bacteriophage-encoded endolysins hold significant promise as novel antibacterial agents, particularly against the growing threat of antibiotic-resistant bacteria. We therefore investigated the phage ФEcSw endolysin, aiming to enhance its lytic activity against multidrug-resistant Escherichia coli Sw1 through site-directed mutagenesis (SDM) guided by in silico identification of critical residues. METHODS A computational analysis was conducted to elucidate the protein folding pattern, identify the active domains, and recognize critical residues of ФEcSw endolysin. Structural similarity-based docking simulations were employed to identify residues potentially involved in both recognition and cleavage of the bacterial peptidoglycan. Phage endolysin was amplified, cloned, expressed, and purified from phage ФEcSw. Pure endolysin (EL) activity was subsequently validated through SDM. RESULTS Our studies revealed both open and closed conformations of ФEcSw endolysin within specific residue ranges (51-60 and 128-141). Notably, the active site was identified and contains the crucial catalytic residues, Glu19 and Asp34. A time-kill assay demonstrated that the holin (HL)-EL combination reduced E. coli Sw1 growth by 46% within 12 h. Furthermore, treatment with HL, EL, and HL-EL significantly increased bacterial membrane permeability (11%, 74%, and 85%, respectively) within just 1 h. Importantly, SDM identified a double mutant (K19/H34) of the endolysin exhibiting the highest lytic activity compared to the wild-type and other mutants (E19D, E19K, D34E, and D34H), owing to an increase in net charge from +3.23 to +6.29. CONCLUSIONS Our findings demonstrate that phage endolysins and HLs, or engineered endolysins, hold significant potential as therapeutic agents to combat multidrug-resistant bacterial infections.
Affiliation(s)
- Maheswaran Easwaran
  - Department of Research Analytics, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, India
- Rajiv Gandhi Govindaraj
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA; HotSpot Therapeutics, Boston, MA, USA
- Misagh Naderi
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
- Mahanama De Zoysa
  - College of Veterinary Medicine, Research Institute of Veterinary Medicine, Chungnam National University, Daejeon, Republic of Korea
- Hyun-Jin Shin
  - College of Veterinary Medicine, Research Institute of Veterinary Medicine, Chungnam National University, Daejeon, Republic of Korea.
2
Utgés JS, Barton GJ. Comparative evaluation of methods for the prediction of protein-ligand binding sites. J Cheminform 2024; 16:126. [PMID: 39529176 PMCID: PMC11552181 DOI: 10.1186/s13321-024-00923-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 10/28/2024] [Indexed: 11/16/2024]
Abstract
The accurate identification of protein-ligand binding sites is of critical importance in understanding and modulating protein function. Accordingly, ligand binding site prediction has remained a research focus for over three decades, with over 50 methods developed and a paradigm shift from geometry-based methods to machine learning. In this work, we collate 13 ligand binding site predictors, spanning 30 years, focusing on the latest machine learning-based methods such as VN-EGNN, IF-SitePred, GrASP, PUResNet, and DeepPocket, and compare them to the established P2Rank, PRANK and fpocket and to earlier methods like PocketFinder, Ligsite and Surfnet. We benchmark the methods against the human subset of our new curated reference dataset, LIGYSIS. LIGYSIS is a comprehensive protein-ligand complex dataset comprising 30,000 proteins with bound ligands, which aggregates biologically relevant unique protein-ligand interfaces across biological units of multiple structures from the same protein. LIGYSIS is an improvement for testing methods over earlier datasets like sc-PDB, PDBbind, binding MOAD, COACH420 and HOLO4K, which either include 1:1 protein-ligand complexes or consider asymmetric units. Re-scoring of fpocket predictions by PRANK and DeepPocket displays the highest recall (60%), whilst IF-SitePred presents the lowest recall (39%). We demonstrate the detrimental effect that redundant prediction of binding sites has on performance as well as the beneficial impact of stronger pocket scoring schemes, with improvements of up to 14% in recall (IF-SitePred) and 30% in precision (Surfnet). Finally, we propose top-N+2 recall as the universal benchmark metric for ligand binding site prediction and urge authors to share not only the source code of their methods, but also of their benchmark.
Scientific contributions: This study conducts the largest benchmark of ligand binding site prediction methods to date, comparing 13 original methods and 15 variants using 10 informative metrics. The LIGYSIS dataset is introduced, which aggregates biologically relevant protein-ligand interfaces across multiple structures of the same protein. The study highlights the detrimental effect of redundant binding site prediction and demonstrates significant improvement in recall and precision through stronger scoring schemes. Finally, top-N+2 recall is proposed as a universal benchmark metric for ligand binding site prediction, with a recommendation for open-source sharing of both methods and benchmarks.
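The abstract names top-N+2 recall without defining it. The sketch below illustrates one plausible reading and is not the authors' implementation: for each protein with N observed sites, only the N+2 highest-ranked predictions are kept, and an observed site counts as recovered if any kept prediction lies close to it. The function name, the use of pocket centres, and the 12 Å hit radius are all assumptions made for illustration.

```python
import math

def top_n_plus_2_recall(proteins, hit_radius=12.0):
    """Fraction of observed sites recovered by the top N+2 predictions.

    proteins: list of (observed_centres, ranked_predicted_centres) pairs,
    where each centre is an (x, y, z) tuple and predictions are sorted
    from best to worst score. hit_radius is an assumed cut-off in angstroms.
    """
    hits = 0
    total = 0
    for observed, predicted in proteins:
        n = len(observed)
        shortlist = predicted[: n + 2]  # keep only the N+2 best-ranked pockets
        for site in observed:
            total += 1
            if any(math.dist(site, p) <= hit_radius for p in shortlist):
                hits += 1
    return hits / total if total else 0.0

# Hypothetical example: two observed sites, five ranked predictions.
example = [(
    [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 100.0, 0.0),
     (200.0, 0.0, 0.0), (31.0, 0.0, 0.0)],
)]
recall = top_n_plus_2_recall(example)
```

In this example the fifth prediction would match the second site, but it falls outside the N+2 = 4 shortlist, which is how redundant or poorly ranked predictions lower the metric.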
Affiliation(s)
- Javier S Utgés
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK
- Geoffrey J Barton
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK.
3
Vural O, Jololian L, Pan L. DeepLigType: Predicting Ligand Types of Protein-Ligand Binding Sites Using a Deep Learning Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; PP:116-123. [PMID: 39509302 DOI: 10.1109/tcbb.2024.3493820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2024]
Abstract
The analysis of protein-ligand binding sites plays a crucial role in the initial stages of drug discovery. Accurately predicting the ligand types that are likely to bind to protein-ligand binding sites enables more informed decision making in drug design. Our study, DeepLigType, determines protein-ligand binding sites using Fpocket and then predicts the ligand type of these pockets with a deep learning model, the Convolutional Block Attention Module (CBAM) with ResNet. CBAM-ResNet has been trained to predict five distinct ligand types. We classified protein-ligand binding sites into five categories according to the type of response ligands cause when they bind to their target proteins: antagonist, agonist, activator, inhibitor, and others. We created a novel dataset, referred to as LigType5, from the widely recognized PDBbind and scPDB datasets for training and testing our model. While the literature mostly focuses on the specificity and characteristic analysis of protein binding sites by experimental (laboratory-based) methods, we propose a computational method with the DeepLigType architecture. DeepLigType demonstrated an accuracy of 74.30% and an AUC of 0.83 in ligand type prediction on a novel test dataset using the CBAM-ResNet deep learning model.
4
Hoogstraten CA, Koenderink JB, van Straaten CE, Scheer-Weijers T, Smeitink JAM, Schirris TJJ, Russel FGM. Pyruvate dehydrogenase is a potential mitochondrial off-target for gentamicin based on in silico predictions and in vitro inhibition studies. Toxicol In Vitro 2024; 95:105740. [PMID: 38036072 DOI: 10.1016/j.tiv.2023.105740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 11/08/2023] [Accepted: 11/22/2023] [Indexed: 12/02/2023]
Abstract
During the drug development process, organ toxicity is estimated to cause the failure of one-third of novel chemical entities. Drug-induced toxicity is increasingly associated with mitochondrial dysfunction, but identifying the underlying molecular mechanisms remains a challenge. Computational modeling techniques have proven to be a good tool in searching for drug off-targets. Here, we aimed to identify mitochondrial off-targets of the nephrotoxic drugs tenofovir and gentamicin using different in silico approaches (KRIPO, ProBis and PDID). Dihydroorotate dehydrogenase (DHODH) and pyruvate dehydrogenase (PDH) were predicted as potential novel off-target sites for tenofovir and gentamicin, respectively. The predicted targets were evaluated in vitro, using (colorimetric) enzymatic activity measurements. Tenofovir did not inhibit DHODH activity, while gentamicin potently reduced PDH activity. In conclusion, the use of in silico methods appeared to be a valuable approach for predicting PDH as a mitochondrial off-target of gentamicin. Further research is required to investigate the contribution of PDH inhibition to the overall renal toxicity of gentamicin.
Affiliation(s)
- Charlotte A Hoogstraten
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Jan B Koenderink
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Carolijn E van Straaten
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Tom Scheer-Weijers
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Jan A M Smeitink
  - Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Department of Pediatrics, Amalia Children's Hospital, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Khondrion BV, Nijmegen 6525 EX, the Netherlands
- Tom J J Schirris
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Frans G M Russel
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands.
5
Callil-Soares PH, Biasi LCK, Pessoa Filho PDA. Effect of preprocessing and simulation parameters on the performance of molecular docking studies. J Mol Model 2023; 29:251. [PMID: 37452150 DOI: 10.1007/s00894-023-05637-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 06/26/2023] [Indexed: 07/18/2023]
Abstract
CONTEXT Molecular docking is an important and rapid tool that provides a comprehensive view of different molecular mechanisms. It is often used to verify the binding interactions of many pairs of molecules and is much faster than more rigorous approaches. However, its application requires carefully preprocessing each molecule and selecting a series of simulation parameters, which is not always done correctly. We show how preprocessing and simulation parameters can positively or negatively impact molecular docking performance. For example, the inclusion of hydrogen atoms leads to better redocking scores, but molecular dynamics simulations must be performed under certain constraints; otherwise, they may worsen performance rather than improve it. This study clarifies the importance and influence of these different parameters on the simulation results. METHODS We analyzed the influence of different parameters on the predictive ability of molecular docking techniques using two software packages: AutoDock Vina and AutoDock-GPU. To this end, 90 receptor-ligand complexes were redocked, evaluating the root mean square deviation (RMSD) between the original position of the ligand (receptor-ligand complex obtained experimentally) and that obtained by the software for every analysis. We investigated the influence of hydrogen atoms (on the receptor and on the receptor-ligand complex), partial charges (QEq, QTPIE, EEM, EEM2015ha, MMFF94, Gasteiger-Marsili, and no charge), search boxes (size and exhaustiveness), ligand characteristics (size and number of torsions), and the use of molecular dynamics (of the receptor or the receptor-ligand complex) before docking analyses.
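The redocking evaluation described above rests on the RMSD between the experimental and the docked ligand pose. A minimal sketch of that calculation follows, assuming both poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is applied. The function name is illustrative, and symmetry-equivalent atoms are not handled, which dedicated tools usually do.

```python
import math

def ligand_rmsd(reference, docked):
    """RMSD in angstroms between two poses given as lists of (x, y, z) tuples.

    Assumes identical atom ordering and no re-alignment, which matches a
    redocking setting where both poses sit in the same receptor frame.
    """
    if not reference or len(reference) != len(docked):
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for ref_atom, dock_atom in zip(reference, docked)
        for a, b in zip(ref_atom, dock_atom)
    )
    return math.sqrt(squared / len(reference))

# Hypothetical two-atom ligand shifted by 1 angstrom along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = ligand_rmsd(crystal_pose, docked_pose)
```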
Affiliation(s)
- Pedro Henrique Callil-Soares
  - Chemical Engineering Department, Polytechnic School of the University of São Paulo, Av. Lineu Prestes, 580, São Paulo, 05508-000, Brazil
- Lilian Caroline Kramer Biasi
  - Chemical Engineering Department, Polytechnic School of the University of São Paulo, Av. Lineu Prestes, 580, São Paulo, 05508-000, Brazil
- Pedro de Alcântara Pessoa Filho
  - Chemical Engineering Department, Polytechnic School of the University of São Paulo, Av. Lineu Prestes, 580, São Paulo, 05508-000, Brazil
6
Eguida M, Rognan D. Estimating the Similarity between Protein Pockets. Int J Mol Sci 2022; 23:12462. [PMID: 36293316 PMCID: PMC9604425 DOI: 10.3390/ijms232012462] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/06/2022] [Revised: 10/15/2022] [Accepted: 10/16/2022] [Indexed: 10/28/2023]
Abstract
With the exponential increase in publicly available protein structures, the comparison of protein binding sites naturally emerged as a scientific topic to explain observations or generate hypotheses for ligand design, notably to predict ligand selectivity for on- and off-targets, explain polypharmacology, and design target-focused libraries. The current review summarizes the state-of-the-art computational methods applied to pocket detection and comparison as well as structural druggability estimates. The major strengths and weaknesses of current pocket descriptors, alignment methods, and similarity search algorithms are presented. Lastly, an exhaustive survey of both retrospective and prospective applications in diverse medicinal chemistry scenarios illustrates the capability of the existing methods and the hurdle that still needs to be overcome for more accurate predictions.
Affiliation(s)
- Didier Rognan
  - Laboratoire d’Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 67400 Illkirch, France
7
Abstract
Computationally identifying new targets for existing drugs has drawn much attention in drug repurposing due to its advantages over de novo drugs, including low risk, low costs, and rapid pace. To facilitate the drug repurposing computation, we constructed an automated and parameter-free virtual screening server, namely DrugRep, which performed molecular 3D structure construction, binding pocket prediction, docking, similarity comparison and binding affinity screening in a fully automatic manner. DrugRep repurposed drugs not only by receptor-based screening but also by ligand-based screening. The former automatically detected possible binding pockets of the receptor with our cavity detection approach, and then performed batch docking over drugs with a widespread docking program, AutoDock Vina. The latter explored drugs using seven well-established similarity measuring tools, including our recently developed ligand-similarity-based methods LigMate and FitDock. DrugRep utilized easy-to-use graphic interfaces for the user operation, and offered interactive predictions with state-of-the-art accuracy. We expect that this freely available online drug repurposing tool could be beneficial to the drug discovery community. The web site is http://cao.labshare.cn/drugrep/.
8
Shi W, Singha M, Pu L, Srivastava G, Ramanujam J, Brylinski M. GraphSite: Ligand Binding Site Classification with Deep Graph Learning. Biomolecules 2022; 12:1053. [PMID: 36008947 PMCID: PMC9405584 DOI: 10.3390/biom12081053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/03/2022] [Revised: 07/18/2022] [Accepted: 07/20/2022] [Indexed: 12/10/2022]
Abstract
The binding of small organic molecules to protein targets is fundamental to a wide array of cellular functions. It is also routinely exploited to develop new therapeutic strategies against a variety of diseases. On that account, the ability to effectively detect and classify ligand binding sites in proteins is of paramount importance to modern structure-based drug discovery. These complex and non-trivial tasks require sophisticated algorithms from the field of artificial intelligence to achieve a high prediction accuracy. In this communication, we describe GraphSite, a deep learning-based method utilizing a graph representation of local protein structures and a state-of-the-art graph neural network to classify ligand binding sites. Using neural weighted message passing layers to effectively capture the structural, physicochemical, and evolutionary characteristics of binding pockets mitigates model overfitting and improves the classification accuracy. Indeed, comprehensive cross-validation benchmarks against a large dataset of binding pockets belonging to 14 diverse functional classes demonstrate that GraphSite yields the class-weighted F1-score of 81.7%, outperforming other approaches such as molecular docking and binding site matching. Further, it also generalizes well to unseen data with the F1-score of 70.7%, which is the expected performance in real-world applications. We also discuss new directions to improve and extend GraphSite in the future.
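The headline figures above are class-weighted F1-scores. As a rough illustration, the sketch below computes a per-class F1 and averages it with weights proportional to how often each class occurs among the true labels. Reading "class-weighted" as support-weighted is an assumption, and the function name and labels are made up for the example.

```python
from collections import Counter

def class_weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores over the true classes."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_cls / total) * f1  # weight by class frequency
    return score

# Hypothetical pocket labels for four binding sites.
true_labels = ["atp", "atp", "atp", "heme"]
pred_labels = ["atp", "atp", "heme", "heme"]
weighted_f1 = class_weighted_f1(true_labels, pred_labels)
```

Weighting by support keeps a few large classes from being swamped by many small ones, at the cost of hiding poor performance on rare pocket types.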
Affiliation(s)
- Wentao Shi
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
- Gopal Srivastava
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Jagannathan Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
9
Paiva VDA, Gomes IDS, Monteiro CR, Mendonça MV, Martins PM, Santana CA, Gonçalves-Almeida V, Izidoro SC, Melo-Minardi RCD, Silveira SDA. Protein structural bioinformatics: An overview. Comput Biol Med 2022; 147:105695. [DOI: 10.1016/j.compbiomed.2022.105695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2021] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 11/27/2022]
10
Scafuri B, Verdino A, D'Arminio N, Marabotti A. Computational methods to assist in the discovery of pharmacological chaperones for rare diseases. Brief Bioinform 2022; 23:6590149. [PMID: 35595532 DOI: 10.1093/bib/bbac198] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/22/2022] [Revised: 04/13/2022] [Accepted: 04/28/2022] [Indexed: 12/21/2022]
Abstract
Pharmacological chaperones are chemical compounds able to bind proteins and stabilize them against denaturation and subsequent degradation. Some pharmacological chaperones have been approved, or are under investigation, for the treatment of rare inborn errors of metabolism, which are caused by genetic mutations that often destabilize the structure of the protein expressed by the affected gene. Given the general lack of pharmacological treatments for rare diseases, expectations for this type of compound are high. However, their discovery is not straightforward. In this review, we focus on the computational methods that can assist and accelerate the search for these compounds, and we show examples in which these methods were successfully applied to the discovery of promising molecules belonging to this new category of pharmacologically active compounds.
Affiliation(s)
- Bernardina Scafuri
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Anna Verdino
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Nancy D'Arminio
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Anna Marabotti
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
11
Simončič M, Lukšič M, Druchok M. Machine learning assessment of the binding region as a tool for more efficient computational receptor-ligand docking. J Mol Liq 2022; 353:118759. [PMID: 35273421 PMCID: PMC8903148 DOI: 10.1016/j.molliq.2022.118759] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]
Abstract
We present a combined computational approach to protein-ligand binding, which consists of two steps: (1) a deep neural network is used to locate a binding region on a target protein, and (2) molecular docking of a ligand is performed within the specified region to obtain the best pose using AutoDock Vina. Our in-house designed neural network was trained using the PepBDB dataset. Although the training dataset consisted of protein-peptide complexes, we show that the approach is not limited to peptides, but also works remarkably well for a large class of non-peptide ligands. The results are compared with those in which the binding region (first step) was provided by Accluster. In cases where no prior experimental data on the binding region are available, our deep neural network provides a fast and effective alternative to classical software for its localization. Our code is available at https://github.com/mksmd/NNforDocking.
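The hand-off between the two steps can be pictured as turning a predicted binding region into a docking search box. The sketch below is not taken from the authors' repository: it assumes the predicted region is available as a list of atom coordinates, pads its bounding box by an arbitrary 4 Å, and prints the result using the box options that AutoDock Vina reads from a configuration file (center_x, center_y, center_z, size_x, size_y, size_z). The helper names are hypothetical.

```python
def search_box(region_coords, padding=4.0):
    """Axis-aligned box around predicted binding-region atoms.

    region_coords: list of (x, y, z) tuples in angstroms.
    padding: assumed margin added on every side so the ligand can move.
    Returns (center, size) as two 3-tuples.
    """
    if not region_coords:
        raise ValueError("binding region is empty")
    axes = list(zip(*region_coords))  # [(x1, x2, ...), (y1, ...), (z1, ...)]
    center = tuple((max(a) + min(a)) / 2 for a in axes)
    size = tuple((max(a) - min(a)) + 2 * padding for a in axes)
    return center, size

def vina_box_lines(center, size):
    """Format the box as 'key = value' lines for a Vina config file."""
    keys = ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z")
    return [f"{key} = {value:.3f}" for key, value in zip(keys, center + size)]

# Hypothetical two-atom region predicted in step (1).
center, size = search_box([(0.0, 0.0, 0.0), (10.0, 4.0, 2.0)])
config_lines = vina_box_lines(center, size)
```

A tighter box shortens the search but risks clipping the true pose, so the padding is worth tuning per target.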
Affiliation(s)
- Matjaž Simončič
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
- Miha Lukšič
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
- Maksym Druchok
  - Institute for Condensed Matter Physics, 1 Svientsitskii Str., UA-79011 Lviv, Ukraine
  - SoftServe Inc., 2d Sadova Str., UA-79021 Lviv, Ukraine
12
Shi W, Singha M, Srivastava G, Pu L, Ramanujam J, Brylinski M. Pocket2Drug: An Encoder-Decoder Deep Neural Network for the Target-Based Drug Design. Front Pharmacol 2022; 13:837715. [PMID: 35359869 PMCID: PMC8962739 DOI: 10.3389/fphar.2022.837715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 02/10/2022] [Indexed: 11/13/2022]
Abstract
Computational modeling is an essential component of modern drug discovery. One of its most important applications is to select promising drug candidates for pharmacologically relevant target proteins. Because of continuing advances in structural biology, putative binding sites for small organic molecules are being discovered in numerous proteins linked to various diseases. These valuable data offer new opportunities to build efficient computational models predicting binding molecules for target sites through the application of data mining and machine learning. In particular, deep neural networks are powerful techniques capable of learning from complex data in order to make informed drug binding predictions. In this communication, we describe Pocket2Drug, a deep graph neural network model to predict binding molecules for a given ligand binding site. This approach first learns the conditional probability distribution of small molecules from a large dataset of pocket structures with supervised training, followed by the sampling of drug candidates from the trained model. Comprehensive benchmarking simulations show that using Pocket2Drug significantly improves the chances of finding molecules binding to target pockets compared to traditional drug selection procedures. Specifically, known binders are generated for as many as 80.5% of targets present in the testing set, which consists of data dissimilar from those used to train the deep graph neural network model. Overall, Pocket2Drug is a promising computational approach to inform the discovery of novel biopharmaceuticals.
Affiliation(s)
- Wentao Shi
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Gopal Srivastava
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- J. Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
  - *Correspondence: Michal Brylinski,
13
Liu G, Singha M, Pu L, Neupane P, Feinstein J, Wu HC, Ramanujam J, Brylinski M. GraphDTI: A robust deep learning predictor of drug-target interactions from multiple heterogeneous data. J Cheminform 2021; 13:58. [PMID: 34380569 PMCID: PMC8356453 DOI: 10.1186/s13321-021-00540-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/08/2021] [Accepted: 07/31/2021] [Indexed: 12/22/2022]
Abstract
Traditional techniques to identify macromolecular targets for drugs utilize solely the information on a query drug and a putative target. Nonetheless, the mechanisms of action of many drugs depend not only on their binding affinity toward a single protein, but also on the signal transduction through cascades of molecular interactions leading to certain phenotypes. Although using protein-protein interaction networks and drug-perturbed gene expression profiles can facilitate system-level investigations of drug-target interactions, utilizing such large and heterogeneous data poses notable challenges. To improve the state-of-the-art in drug target identification, we developed GraphDTI, a robust machine learning framework integrating the molecular-level information on drugs, proteins, and binding sites with the system-level information on gene expression and protein-protein interactions. In order to properly evaluate the performance of GraphDTI, we compiled a high-quality benchmarking dataset and devised a new cluster-based cross-validation protocol. Encouragingly, GraphDTI not only yields an AUC of 0.996 against the validation dataset, but it also generalizes well to unseen data with an AUC of 0.939, significantly outperforming other predictors. Finally, selected examples of identified drug-target interactions are validated against the biomedical literature. Numerous applications of GraphDTI include the investigation of drug polypharmacological effects, side effects through off-target binding, and repositioning opportunities.
Affiliation(s)
- Guannan Liu
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Prasanga Neupane
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
- Joseph Feinstein
  - Department of Computer Science, Brown University, Providence, RI, 02902, USA
- Hsiao-Chun Wu
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
- J Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
14
Konc J, Lešnik S, Škrlj B, Janežič D. ProBiS-Dock Database: A Web Server and Interactive Web Repository of Small Ligand-Protein Binding Sites for Drug Design. J Chem Inf Model 2021; 61:4097-4107. [PMID: 34319727 DOI: 10.1021/acs.jcim.1c00454] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 01/12/2023]
Abstract
We have developed a new system, ProBiS-Dock, which can be used to determine the different types of protein binding sites for small ligands. The binding sites identified this way are then used to construct a new binding site database, the ProBiS-Dock Database, that allows for the ranking of binding sites according to their utility for drug development. The newly constructed database currently has more than 1.4 million binding sites and offers the possibility to investigate potential drug targets originating from different biological species. The interactive ProBiS-Dock Database, a web server and repository that consists of all small-molecule ligand binding sites in all of the protein structures in the Protein Data Bank, is freely available at http://probis-dock-database.insilab.org. The ProBiS-Dock Database will be regularly updated to keep pace with the growth of the Protein Data Bank, and our anticipation is that it will be useful in drug discovery.
Affiliation(s)
- Janez Konc
- Theory Department, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
| | - Samo Lešnik
- Theory Department, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
| | - Blaž Škrlj
- Theory Department, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia.,Jozef Stefan International Postgraduate School, Jamova cesta 39, SI-1000 Ljubljana, Slovenia
| | - Dušanka Janežič
- Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška ulica 8, SI-6000 Koper, Slovenia
| |
|
15
|
Das S, Scholes HM, Sen N, Orengo C. CATH functional families predict functional sites in proteins. Bioinformatics 2021; 37:1099-1106. [PMID: 33135053 PMCID: PMC8150129 DOI: 10.1093/bioinformatics/btaa937] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/05/2020] [Revised: 09/30/2020] [Accepted: 10/27/2020] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION Identification of functional sites in proteins is essential for functional characterization, variant interpretation and drug design. Several methods are available for predicting either a generic functional site, or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein-protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams). RESULTS FunSite's prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed other publicly available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found FunSite's performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyze which structural and evolutionary features are most predictive for functional sites. AVAILABILITY AND IMPLEMENTATION https://github.com/UCL/cath-funsite-predictor. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sayoni Das
- PrecisionLife Ltd., Long Hanborough, OX29 8LJ Oxford, UK
| | - Harry M Scholes
- Institute of Structural and Molecular Biology, University College London, WC1E 6BT, London, UK
| | - Neeladri Sen
- Institute of Structural and Molecular Biology, University College London, WC1E 6BT, London, UK
| | - Christine Orengo
- Institute of Structural and Molecular Biology, University College London, WC1E 6BT, London, UK
| |
|
16
|
Patel PK, Bhatt HG. Improved 3D-QSAR Prediction by Multiple Conformational Alignments and Molecular Docking Studies to Design and Discover HIV-I Protease Inhibitors. Curr HIV Res 2021; 19:154-171. [PMID: 33213349 DOI: 10.2174/1570162x18666201119143457] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/03/2020] [Revised: 09/18/2020] [Accepted: 10/02/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Inhibition of the HIV-I protease enzyme is a strategic step for providing better treatment in retrovirus infections, which avoids resistance and possesses less toxicity. OBJECTIVES In the course of our research to discover new and potent protease inhibitors, 3D-QSAR (CoMFA and CoMSIA) models were generated for 63 compounds using 3 different alignment techniques: multifit alignment, docking-based alignment and Distill-based alignment. Novel molecules were designed from the output of this study. METHODS A total of 3 alignment methods were used to generate CoMFA and CoMSIA models. The Distill-based alignment method was considered the better method according to different validation parameters. A 3D-QSAR model was generated and contour maps were discussed. The biological activity of the designed molecules was predicted using the generated QSAR model to validate it. The newly designed molecules were docked to predict binding affinity. RESULTS In CoMFA, the leave-one-out cross-validated coefficient (q²), conventional coefficient (r²) and predicted correlation coefficient (r²predicted) values were found to be 0.721, 0.991 and 0.780, respectively. The best obtained CoMSIA model also showed significant cross-validated coefficient (q²), conventional coefficient (r²) and predicted correlation coefficient (r²predicted) values of 0.714, 0.987 and 0.721, respectively. Steric and electrostatic contour maps generated from CoMFA and hydrophobic, hydrogen bond donor and hydrogen bond acceptor contour maps from CoMSIA models were used to design new and bioactive protease inhibitors by incorporating bioisosterism and knowledge-based structure-activity relationships. CONCLUSION The results from both these approaches, ligand-based drug design and structure-based drug design, are adequate and promising to discover protease inhibitors.
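For readers unfamiliar with the two coefficients reported in this abstract, the following is a minimal sketch of how a conventional r² and a leave-one-out cross-validated q² are commonly computed. It is not the authors' CoMFA/CoMSIA workflow: those models use partial least squares on field descriptors, whereas this sketch swaps in a one-descriptor ordinary least-squares line purely for illustration. The function names and the toy data are hypothetical, and q² is assumed here to be defined as 1 - PRESS/SS_total with SS_total taken about the mean of all observed activities.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Conventional r^2: the model is fitted to and scored on all compounds."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each compound is predicted by a model fitted without it."""
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical toy data: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]
print("r2 =", round(r_squared(descriptor, activity), 3))
print("q2 =", round(q_squared_loo(descriptor, activity), 3))
```

Because every held-out compound is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which is consistent with the gap between the q² and r² values quoted in the abstract.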
Affiliation(s)
- Paresh K Patel
- Department of Pharmaceutical Chemistry, Institute of Pharmacy, Nirma University, Ahmedabad 382 481, India
| | - Hardik G Bhatt
- Department of Pharmaceutical Chemistry, Institute of Pharmacy, Nirma University, Ahmedabad 382 481, India
| |
|
17
|
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
Therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods and their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
| |
|
18
|
Brackenridge DA, McGuffin LJ. Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods with a Focus on FunFOLD3. Methods Mol Biol 2021; 2365:43-58. [PMID: 34432238 DOI: 10.1007/978-1-0716-1665-9_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Proteins are essential molecules with a diverse range of functions; elucidating their biological and biochemical characteristics can be difficult and time consuming using in vitro and/or in vivo methods. Additionally, in vivo protein-ligand binding site elucidation is unable to keep pace with current growth in sequencing, leaving the majority of new protein sequences without known functions. Therefore, the development of new methods, which aim to predict the protein-ligand interactions and ligand-binding site residues directly from amino acid sequences, is becoming increasingly important. In silico prediction can utilise either sequence information, structural information or a combination of both. In this chapter, we will discuss the broad range of methods for ligand-binding site prediction from protein structure and we will describe our method, FunFOLD3, for the prediction of protein-ligand interactions and ligand-binding sites based on template-based modelling. Additionally, we will describe the step-by-step instructions using the FunFOLD3 downloadable application along with examples from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) where FunFOLD3 has been used to aid ligand and ligand-binding site prediction. Finally, we will introduce our newer method, FunFOLD3-D, a version of FunFOLD3 which aims to improve template-based protein-ligand binding site prediction through the integration of docking, using AutoDock Vina.
|
19
|
Shi W, Lemoine JM, Shawky AEMA, Singha M, Pu L, Yang S, Ramanujam J, Brylinski M. BionoiNet: ligand-binding site classification with off-the-shelf deep neural network. Bioinformatics 2020; 36:3077-3083. [PMID: 32053156 DOI: 10.1093/bioinformatics/btaa094] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/04/2019] [Revised: 01/27/2020] [Accepted: 02/05/2020] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION Fast and accurate classification of ligand-binding sites in proteins with respect to the class of binding molecules is invaluable not only to the automatic functional annotation of large datasets of protein structures but also to projects in protein evolution, protein engineering and drug development. Deep learning techniques, which have already been successfully applied to address challenging problems across various fields, are inherently suitable to classify ligand-binding pockets. Our goal is to demonstrate that off-the-shelf deep learning models can be employed with minimum development effort to recognize nucleotide- and heme-binding sites with a comparable accuracy to highly specialized, voxel-based methods. RESULTS We developed BionoiNet, a new deep learning-based framework implementing a popular ResNet model for image classification. BionoiNet first transforms the molecular structures of ligand-binding sites to 2D Voronoi diagrams, which are then used as the input to a pretrained convolutional neural network classifier. The ResNet model generalizes well to unseen data achieving the accuracy of 85.6% for nucleotide- and 91.3% for heme-binding pockets. BionoiNet also computes significance scores of pocket atoms, called BionoiScores, to provide meaningful insights into their interactions with ligand molecules. BionoiNet is a lightweight alternative to computationally expensive 3D architectures. AVAILABILITY AND IMPLEMENTATION BionoiNet is implemented in Python with the source code freely available at: https://github.com/CSBG-LSU/BionoiNet. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wentao Shi
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Jeffrey M Lemoine
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Abd-El-Monsif A Shawky
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.,Department of Cell Biology, National Research Centre, 12622 Giza, Egypt
| | - Manali Singha
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Limeng Pu
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Shuangyan Yang
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
| | - J Ramanujam
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA.,Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.,Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
| |
|
20
|
Zhao J, Cao Y, Zhang L. Exploring the computational methods for protein-ligand binding site prediction. Comput Struct Biotechnol J 2020; 18:417-426. [PMID: 32140203 PMCID: PMC7049599 DOI: 10.1016/j.csbj.2020.02.008] [Citation(s) in RCA: 103] [Impact Index Per Article: 20.6] [Received: 10/14/2019] [Revised: 01/23/2020] [Accepted: 02/11/2020] [Indexed: 12/21/2022] Open
Abstract
Proteins participate in various essential processes in vivo via interactions with other molecules. Identifying the residues participating in these interactions not only provides biological insights for protein function studies but also has great significance for drug discoveries. Therefore, predicting protein-ligand binding sites has long been under intense research in the fields of bioinformatics and computer aided drug discovery. In this review, we first introduce the research background of predicting protein-ligand binding sites and then classify the methods into four categories, namely, 3D structure-based, template similarity-based, traditional machine learning-based and deep learning-based methods. We describe representative algorithms in each category and elaborate on machine learning and deep learning-based prediction methods in more detail. Finally, we discuss the trends and challenges of the current research such as molecular dynamics simulation based cryptic binding sites prediction, and highlight prospective directions for the near future.
Affiliation(s)
- Jingtian Zhao
- College of Computer Science, Sichuan University, Chengdu 610065, China
| | - Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Le Zhang
- College of Computer Science, Sichuan University, Chengdu 610065, China
| |
|
21
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 200] [Impact Index Per Article: 40.0] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid relying solely on costly, laborious and not-always-deterministic experiments to determine drug–target interactions (DTIs). These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction followed by a comprehensive catalog consisting of machine learning methods and databases, which have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in prediction of DTIs using machine learning approaches are highlighted and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
| | | | - Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
22
|
Liu Y, Grimm M, Dai WT, Hou MC, Xiao ZX, Cao Y. CB-Dock: a web server for cavity detection-guided protein-ligand blind docking. Acta Pharmacol Sin 2020; 41:138-144. [PMID: 31263275 PMCID: PMC7471403 DOI: 10.1038/s41401-019-0228-6] [Citation(s) in RCA: 421] [Impact Index Per Article: 84.2] [Received: 12/05/2018] [Accepted: 03/14/2019] [Indexed: 12/19/2022]
Abstract
As the number of elucidated protein structures is rapidly increasing, the growing data call for methods to efficiently exploit the structural information for biological and pharmaceutical purposes. Given the three-dimensional (3D) structure of a protein and a ligand, predicting their binding sites and affinity is a key task for computer-aided drug discovery. To address this task, a variety of docking tools have been developed. Most of them focus on docking in the preset binding sites given by users. To automatically predict binding modes without information about binding sites, we developed a user-friendly blind docking web server, named CB-Dock, which predicts binding sites of a given protein, calculates their centers and sizes with a novel curvature-based cavity detection approach, and performs docking with a popular docking program, AutoDock Vina. This method was carefully optimized and achieved a ~70% success rate for the top-ranking poses whose root mean square deviation (RMSD) was within 2 Å of the X-ray pose, which outperformed the state-of-the-art blind docking tools in our benchmark tests. CB-Dock offers an interactive 3D visualization of results, and is freely available at http://cao.labshare.cn/cb-dock/.
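The success criterion quoted in this abstract (a top-ranked pose within 2 Å RMSD of the X-ray pose) can be illustrated with a minimal sketch. This is not CB-Dock code: the function names and toy coordinates are hypothetical, and the sketch assumes that both poses list the same ligand atoms in the same order, share one coordinate frame (no superposition is performed), and need no symmetry correction. Treating "within 2 Å" as "less than or equal to 2 Å" is also an assumption.

```python
import math


def rmsd(pose, reference):
    """Root mean square deviation between two equally ordered lists of (x, y, z) atoms."""
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(pose, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(pose))


def success_rate(cases, cutoff=2.0):
    """Fraction of (top_pose, xray_pose) pairs whose RMSD is within the cutoff (in Å)."""
    hits = sum(1 for pose, reference in cases if rmsd(pose, reference) <= cutoff)
    return hits / len(cases)


# Hypothetical three-atom ligand and two docked poses shifted along x.
xray = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
near = [(x + 1.0, y, z) for x, y, z in xray]  # every atom displaced by 1 Å
far = [(x + 3.0, y, z) for x, y, z in xray]   # every atom displaced by 3 Å

print(rmsd(near, xray))                           # 1.0
print(success_rate([(near, xray), (far, xray)]))  # 0.5
```

A rigid shift of every atom by the same distance gives an RMSD equal to that distance, so the first toy pose counts as a success and the second does not.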
Affiliation(s)
- Yang Liu
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Maximilian Grimm
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Wen-Tao Dai
- Shanghai Center for Bioinformation Technology & Shanghai Engineering Research Center of Pharmaceutical Translation, Shanghai Industrial Technology Institute, Shanghai 201203, China
| | - Mu-Chun Hou
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Zhi-Xiong Xiao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China.
| |
|
23
|
Naderi M, Lemoine JM, Govindaraj RG, Kana OZ, Feinstein WP, Brylinski M. Binding site matching in rational drug design: algorithms and applications. Brief Bioinform 2019; 20:2167-2184. [PMID: 30169563 PMCID: PMC6954434 DOI: 10.1093/bib/bby078] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 06/15/2018] [Revised: 07/18/2018] [Accepted: 07/29/2018] [Indexed: 01/06/2023] Open
Abstract
Interactions between proteins and small molecules are critical for biological functions. These interactions often occur in small cavities within protein structures, known as ligand-binding pockets. Understanding the physicochemical qualities of binding pockets is essential to improve not only our basic knowledge of biological systems, but also drug development procedures. In order to quantify similarities among pockets in terms of their geometries and chemical properties, either bound ligands can be compared to one another or binding sites can be matched directly. Both perspectives routinely take advantage of computational methods including various techniques to represent and compare small molecules as well as local protein structures. In this review, we survey 12 tools widely used to match pockets. These methods are divided into five categories based on the algorithm implemented to construct binding-site alignments. In addition to the comprehensive analysis of their algorithms, test sets and the performance of each method are described. We also discuss general pharmacological applications of computational pocket matching in drug repurposing, polypharmacology and side effects. Reflecting on the importance of these techniques in drug discovery, in the end, we elaborate on the development of more accurate meta-predictors, the incorporation of protein flexibility and the integration of powerful artificial intelligence technologies such as deep learning.
Affiliation(s)
- Misagh Naderi
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Jeffrey Mitchell Lemoine
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Division of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
| | | | - Omar Zade Kana
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Wei Pan Feinstein
- High-Performance Computing, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
| |
|
24
|
Thaljeh LF, Rothschild JA, Naderi M, Coghill LM, Brown JM, Brylinski M. Hinge Region in DNA Packaging Terminase pUL15 of Herpes Simplex Virus: A Potential Allosteric Target for Antiviral Drugs. Biomolecules 2019; 9:biom9100603. [PMID: 31614784 PMCID: PMC6843332 DOI: 10.3390/biom9100603] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/10/2019] [Revised: 09/30/2019] [Accepted: 10/08/2019] [Indexed: 12/23/2022] Open
Abstract
Approximately 80% of adults are infected with a member of the herpesviridae family. Herpesviruses establish life-long latent infections within neurons, which may reactivate into lytic infections due to stress or immune suppression. There are nine human herpesviruses (HHV) posing health concerns from benign conditions to life threatening encephalitis, including cancers associated with viral infections. The current treatment options for most HHV conditions mainly include several nucleoside and nucleotide analogs targeting viral DNA polymerase. Although these drugs help manage infections, their common mechanism of action may lead to the development of drug resistance, which is particularly devastating in immunocompromised patients. Therefore, new classes of drugs directed against novel targets in HHVs are necessary to alleviate this issue. We analyzed the conservation rates of all proteins in herpes simplex virus 1 (HHV-1), a representative of the HHV family and one of the most common viruses infecting the human population. Furthermore, we generated a full-length structure model of the most conserved HHV-1 protein, the DNA packaging terminase pUL15. A series of computational analyses were performed on the model to identify ATP and DNA binding sites and characterize the dynamics of the protein. Our study indicates that proteins involved in HHV-1 DNA packaging and cleavage are amongst the most conserved gene products of HHVs. Since the packaging protein pUL15 is the most conserved among all HHV-1 gene products, the virus will have a lower chance of developing resistance to small molecules targeting pUL15. A subsequent analysis of the structure of pUL15 revealed distinct ATP and DNA binding domains and the elastic network model identifies a functionally important hinge region between the two domains of pUL15. 
The atomic information on the active and allosteric sites in the ATP- and DNA-bound model of pUL15 presented in this study can inform the structure-based drug discovery of a new class of drugs to treat a wide range of HHVs.
Affiliation(s)
- Lana F Thaljeh
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
| | - J Ainsley Rothschild
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
| | - Misagh Naderi
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
| | - Lyndon M Coghill
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA.
| | - Jeremy M Brown
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA.
| |
|
25
|
Kana O, Brylinski M. Elucidating the druggability of the human proteome with eFindSite. J Comput Aided Mol Des 2019; 33:509-519. [PMID: 30888556 PMCID: PMC6516084 DOI: 10.1007/s10822-019-00197-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/08/2018] [Accepted: 03/12/2019] [Indexed: 01/12/2023]
Abstract
Identifying the viability of protein targets is one of the preliminary steps of drug discovery. Determining the ability of a protein to bind drugs in order to modulate its function, termed the druggability, requires a non-trivial amount of time and resources. Inability to properly measure druggability has accounted for a significant portion of failures in drug discovery. This problem is only further exacerbated by the large sample space of proteins involved in human diseases. With these barriers, the druggability space within the human proteome remains unexplored and has made it difficult to develop drugs for numerous diseases. Hence, we present a new feature developed in eFindSite that employs supervised machine learning to predict the druggability of a given protein. Benchmarking calculations against the Non-Redundant data set of Druggable and Less Druggable binding sites demonstrate that an AUC for druggability prediction with eFindSite is as high as 0.88. With eFindSite, we elucidated the human druggability space to be 10,191 proteins. Considering the disease space from the Open Targets Platform and excluding already known targets from the predicted data set reveal 2731 potentially novel therapeutic targets. eFindSite is freely available as a stand-alone software at https://github.com/michal-brylinski/efindsite .
Affiliation(s)
- Omar Kana
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA.
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, 70803, USA.
| |
|
26
|
Kaiser F, Labudde D. Unsupervised Discovery of Geometrically Common Structural Motifs and Long-Range Contacts in Protein 3D Structures. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:671-680. [PMID: 29990265 DOI: 10.1109/tcbb.2017.2786250] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
The essential role of small evolutionarily conserved structural units in proteins has been extensively researched and validated. A popular example is the serine proteases, where the peptide cleavage reaction is realized by a configuration of only three residues. Brought into spatial proximity during the protein folding process, such structural motifs are often long-range contacts and usually hard to detect at the sequence level. Due to the constantly increasing resource of protein 3D structure data, the computational identification of structural motifs can contribute significantly to the understanding of protein fold and function. Thus, we propose a method to discover structural motifs of high geometrical similarity and desired sequence separation in protein 3D structure data. By utilizing methods originating from data mining, no a priori knowledge is required. The applicability of the method is demonstrated by the identification of the catalytic unit of serine proteases and the ion-coordination center of cupredoxins. Furthermore, large-scale analysis of the entire Protein Data Bank points towards the presence of ubiquitous structural motifs, independent of any specific fold or function. We envision that our method is suitable to uncover functional mechanisms and to derive fingerprint libraries of structural motifs, which could be used to assess protein family association.
|
27
|
Pu L, Govindaraj RG, Lemoine JM, Wu HC, Brylinski M. DeepDrug3D: Classification of ligand-binding pockets in proteins with a convolutional neural network. PLoS Comput Biol 2019; 15:e1006718. [PMID: 30716081 PMCID: PMC6375647 DOI: 10.1371/journal.pcbi.1006718] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 08/13/2018] [Revised: 02/14/2019] [Accepted: 12/16/2018] [Indexed: 01/19/2023] Open
Abstract
Comprehensive characterization of ligand-binding sites is invaluable to infer molecular functions of hypothetical proteins, trace evolutionary relationships between proteins, engineer enzymes to achieve a desired substrate specificity, and develop drugs with improved selectivity profiles. These research efforts pose significant challenges owing to the fact that similar pockets are commonly observed across different folds, leading to a high degree of promiscuity of ligand-protein interactions at the system level. On that account, novel algorithms to accurately classify binding sites are needed. Deep learning is attracting significant attention due to its successful applications in a wide range of disciplines. In this communication, we present DeepDrug3D, a new approach to characterize and classify binding pockets in proteins with deep learning. It employs a state-of-the-art convolutional neural network in which biomolecular structures are represented as voxels assigned interaction energy-based attributes. The current implementation of DeepDrug3D, trained to detect and classify nucleotide- and heme-binding sites, not only achieves a high accuracy of 95%, but also has the ability to generalize to unseen data as demonstrated for steroid-binding proteins and peptidase enzymes. Interestingly, the analysis of strongly discriminative regions of binding pockets reveals that this high classification accuracy arises from learning the patterns of specific molecular interactions, such as hydrogen bonds, aromatic and hydrophobic contacts. DeepDrug3D is available as an open-source program at https://github.com/pulimeng/DeepDrug3D with the accompanying TOUGH-C1 benchmarking dataset accessible from https://osf.io/enz69/.
Affiliation(s)
- Limeng Pu
  - Division of Electrical & Computer Engineering, Louisiana State University, Baton Rouge, LA, United States of America
- Rajiv Gandhi Govindaraj
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States of America
- Jeffrey Mitchell Lemoine
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States of America
  - Division of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA, United States of America
- Hsiao-Chun Wu
  - Division of Electrical & Computer Engineering, Louisiana State University, Baton Rouge, LA, United States of America
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, United States of America
|
28
|
Zhu M, Song X, Chen P, Wang W, Wang B. dbHDPLS: A database of human disease-related protein-ligand structures. Comput Biol Chem 2019; 78:353-358. [PMID: 30665056 DOI: 10.1016/j.compbiolchem.2018.12.023] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/28/2018] [Revised: 12/11/2018] [Accepted: 12/30/2018] [Indexed: 12/31/2022]
Abstract
Protein-ligand complexes perform specific functions, most of which are related to human diseases. The database of human disease-related protein-ligand structures (dbHDPLS) collects 8833 structures extracted from the Protein Data Bank (PDB) and other related databases. It is annotated with comprehensive information on ligands and drugs, related human diseases, and protein-ligand interactions, together with protein structure information. The database may be a reliable resource for structure-based drug target discovery, druggability prediction of protein-ligand binding sites, and the study of drug-disease relationships based on protein-ligand complex structures. It can be publicly accessed at http://DeepLearner.ahu.edu.cn/web/dbDPLS/.
Affiliation(s)
- Muchun Zhu
  - Institutes of Physical Science and Information Technology, Anhui University, 230601 Hefei, Anhui, China
- Xiaoping Song
  - Institutes of Physical Science and Information Technology, Anhui University, 230601 Hefei, Anhui, China
- Peng Chen
  - School of Electrical and Information Engineering, Anhui University of Technology, 243032 Ma'anshan, Anhui, China; Institutes of Physical Science and Information Technology, Anhui University, 230601 Hefei, Anhui, China.
- Wenyan Wang
  - School of Electrical and Information Engineering, Anhui University of Technology, 243032 Ma'anshan, Anhui, China
- Bing Wang
  - School of Electrical and Information Engineering, Anhui University of Technology, 243032 Ma'anshan, Anhui, China.
|
29
|
Krivák R, Hoksza D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. J Cheminform 2018; 10:39. [PMID: 30109435 PMCID: PMC6091426 DOI: 10.1186/s13321-018-0285-8] [Citation(s) in RCA: 226] [Impact Index Per Article: 32.3] [Received: 11/12/2017] [Accepted: 06/29/2018] [Indexed: 01/29/2023]
Abstract
Background Ligand binding site prediction from protein structure has many applications related to elucidation of protein function and structure-based drug discovery. It often represents only one step of many in complex computational drug design efforts. Although many methods have been published to date, only a few of them are suitable for use in automated pipelines or for processing large datasets. These use cases require stability and speed, which disqualifies many of the recently introduced tools that are either template based or available only as web servers.
Results We present P2Rank, a stand-alone template-free tool for prediction of ligand binding sites based on machine learning. It is based on prediction of ligandability of local chemical neighbourhoods that are centered on points placed on the solvent accessible surface of a protein. We show that P2Rank outperforms several existing tools, which include two widely used stand-alone tools (Fpocket, SiteHound), a comprehensive consensus based tool (MetaPocket 2.0), and a recent deep learning based method (DeepSite). P2Rank is among the fastest available tools (it requires under 1 s for prediction on one protein), with the additional advantage of a multi-threaded implementation.
Conclusions P2Rank is a new open source software package for ligand binding site prediction from protein structure. It is available as a user-friendly stand-alone command line program and a Java library. P2Rank has a lightweight installation and does not depend on other bioinformatics tools or large structural or sequence databases. Thanks to its speed and ability to make fully automated predictions, it is particularly well suited for processing large datasets or as a component of scalable structural bioinformatics pipelines.
Electronic supplementary material The online version of this article (10.1186/s13321-018-0285-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Radoslav Krivák
  - Department of Software Engineering, Charles University, Prague, Czech Republic.
- David Hoksza
  - Department of Software Engineering, Charles University, Prague, Czech Republic.
|
30
|
Naderi M, Govindaraj RG, Brylinski M. eModel-BDB: a database of comparative structure models of drug-target interactions from the Binding Database. Gigascience 2018; 7:5057873. [PMID: 30052959 PMCID: PMC6131211 DOI: 10.1093/gigascience/giy091] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/04/2017] [Accepted: 07/16/2018] [Indexed: 01/14/2023]
Abstract
Background The structural information on proteins in their ligand-bound conformational state is invaluable for protein function studies and rational drug design. Compared to the number of available sequences, not only is the repertoire of the experimentally determined structures of holo-proteins limited, but these structures also do not always include pharmacologically relevant compounds at their binding sites. In addition, binding affinity databases provide vast quantities of information on interactions between drug-like molecules and their targets, but often lack structural data. On that account, there is a need for computational methods to complement existing repositories by constructing atomic-level models of drug-protein assemblies that will not be determined experimentally in the near future. Results We created eModel-BDB, a database of 200,005 comparative models of drug-bound proteins based on 1,391,403 interaction data obtained from the Binding Database and the PDB library of 31 January 2017. Complex models in eModel-BDB were generated with a collection of state-of-the-art techniques, including protein meta-threading, template-based structure modeling, refinement and binding site detection, and ligand similarity-based docking. In addition to a rigorous quality control maintained during dataset generation, a subset of weakly homologous models was selected for the retrospective validation against experimental structural data recently deposited to the Protein Data Bank. Validation results indicate that eModel-BDB contains models that are accurate not only at the global protein structure level but also with respect to the atomic details of bound ligands. Conclusions Freely available eModel-BDB can be used to support structure-based drug discovery and repositioning, drug target identification, and protein structure determination.
Affiliation(s)
- Misagh Naderi
  - Department of Biological Sciences, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
- Rajiv Gandhi Govindaraj
  - Department of Biological Sciences, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
  - Center for Computation & Technology, Louisiana State University, 2054 Digital Media Center, Baton Rouge, LA 70803, USA
|
31
|
Large-scale computational drug repositioning to find treatments for rare diseases. NPJ Syst Biol Appl 2018; 4:13. [PMID: 29560273 PMCID: PMC5847522 DOI: 10.1038/s41540-018-0050-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 11/02/2017] [Revised: 01/22/2018] [Accepted: 02/03/2018] [Indexed: 11/08/2022]
Abstract
Rare, or orphan, diseases are conditions afflicting a small subset of people in a population. Although these disorders collectively pose significant health care problems, drug companies require government incentives to develop drugs for rare diseases due to extremely limited individual markets. Computer-aided drug repositioning, i.e., finding new indications for existing drugs, is a cheaper and faster alternative to traditional drug discovery, offering a promising avenue for orphan drug research. Structure-based matching of drug-binding pockets is among the most promising computational techniques to inform drug repositioning. In order to find new targets for known drugs ultimately leading to drug repositioning, we recently developed eMatchSite, a new computer program to compare drug-binding sites. In this study, eMatchSite is combined with virtual screening to systematically explore opportunities to reposition known drugs to proteins associated with rare diseases. The effectiveness of this integrated approach is demonstrated for a kinase inhibitor, which is a confirmed candidate for repositioning to synapsin Ia. The resulting dataset comprises 31,142 putative drug-target complexes linked to 980 orphan diseases. The modeling accuracy is evaluated against the structural data recently released for tyrosine-protein kinase HCK. To illustrate how potential therapeutics for rare diseases can be identified, we discuss the possibility of repurposing a steroidal aromatase inhibitor to treat Niemann-Pick disease type C. Overall, the exhaustive exploration of the drug repositioning space exposes new opportunities to combat orphan diseases with existing drugs. DrugBank/Orphanet repositioning data are freely available to the research community at https://osf.io/qdjup/.
|
32
|
Brylinski M, Naderi M, Govindaraj RG, Lemoine J. eRepo-ORP: Exploring the Opportunity Space to Combat Orphan Diseases with Existing Drugs. J Mol Biol 2017; 430:2266-2273. [PMID: 29237557 DOI: 10.1016/j.jmb.2017.12.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/01/2017] [Revised: 11/15/2017] [Accepted: 12/05/2017] [Indexed: 01/29/2023]
Abstract
About 7000 rare, or orphan, diseases affect more than 350 million people worldwide. Although these conditions collectively pose significant health care problems, drug companies seldom develop drugs for orphan diseases due to extremely limited individual markets. Consequently, developing new treatments for often life-threatening orphan diseases is primarily contingent on financial incentives from governments, special research grants, and private philanthropy. Computer-aided drug repositioning is a cheaper and faster alternative to traditional drug discovery, offering a promising avenue for orphan drug research. Here, we present eRepo-ORP, a comprehensive resource constructed by a large-scale repositioning of existing drugs to orphan diseases with a collection of structural bioinformatics tools, including eThread, eFindSite, and eMatchSite. Specifically, a systematic exploration of 320,856 possible links between known drugs in DrugBank and orphan proteins obtained from Orphanet reveals as many as 18,145 candidates for repurposing. In order to illustrate how potential therapeutics for rare diseases can be identified with eRepo-ORP, we discuss the repositioning of a kinase inhibitor for Ras-associated autoimmune leukoproliferative disease. The eRepo-ORP data set is available through the Open Science Framework at https://osf.io/qdjup/.
Affiliation(s)
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA.
- Misagh Naderi
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Jeffrey Lemoine
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA; Division of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
|
33
|
Sam E, Athri P. Web-based drug repurposing tools: a survey. Brief Bioinform 2017; 20:299-316. [DOI: 10.1093/bib/bbx125] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 07/03/2017] [Indexed: 12/15/2022]
Affiliation(s)
- Elizabeth Sam
  - Department of Computer Science & Engineering, Amrita University, Bengaluru, India
- Prashanth Athri
  - Department of Computer Science & Engineering, Amrita University, Bengaluru, India
|
34
|
Brylinski M. Aromatic interactions at the ligand-protein interface: Implications for the development of docking scoring functions. Chem Biol Drug Des 2017; 91:380-390. [PMID: 28816025 DOI: 10.1111/cbdd.13084] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 05/27/2017] [Revised: 06/29/2017] [Accepted: 08/11/2017] [Indexed: 12/22/2022]
Abstract
The ability to design and fine-tune non-covalent interactions between organic ligands and proteins is indispensable to rational drug development. Aromatic stacking has long been recognized as one of the key constituents of ligand-protein interfaces. In this communication, we employ a two-parameter geometric model to conduct a large-scale statistical analysis of aromatic contacts in the experimental and computer-generated structures of ligand-protein complexes, considering various combinations of aromatic amino acid residues and ligand rings. The geometry of interfacial π-π stacking in crystal structures accords with experimental and theoretical data collected for simple systems, such as the benzene dimer. Many contemporary ligand docking programs implicitly treat aromatic stacking with van der Waals and Coulombic potentials. Although this approach generally provides sufficient specificity to model aromatic interactions, the geometry of π-π contacts in high-scoring docking conformations could still be improved. The comprehensive analysis of aromatic geometries at ligand-protein interfaces lays the foundation for the development of type-specific statistical potentials to more accurately describe aromatic interactions in molecular docking. A Perl script to detect and calculate the geometric parameters of aromatic interactions in ligand-protein complexes is available at https://github.com/michal-brylinski/earomatic. The dataset comprising experimental complex structures and computer-generated models is available at https://osf.io/rztha/.
Affiliation(s)
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, USA
|
35
|
Maheshwari S, Brylinski M. Across-proteome modeling of dimer structures for the bottom-up assembly of protein-protein interaction networks. BMC Bioinformatics 2017; 18:257. [PMID: 28499419 PMCID: PMC5427563 DOI: 10.1186/s12859-017-1675-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/11/2017] [Accepted: 05/03/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Deciphering complete networks of interactions between proteins is key to comprehending cellular regulatory mechanisms. A significant effort has been devoted to expanding the coverage of the proteome-wide interaction space at the molecular level. Although a growing body of research shows that protein docking can, in principle, be used to predict biologically relevant interactions, the accuracy of the across-proteome identification of interacting partners and the selection of near-native complex structures still need to be improved. RESULTS In this study, we developed a new method to discover and model protein interactions employing an exhaustive all-to-all docking strategy. This approach integrates molecular modeling, structural bioinformatics, machine learning, and functional annotation filters in order to provide interaction data for the bottom-up assembly of protein interaction networks. Encouragingly, the success rates for dimer modeling are 57.5% and 48.7% when experimental and computer-generated monomer structures are employed, respectively. Further, our protocol correctly identifies 81% of protein-protein interactions at the expense of a false positive rate of only 19%. As a proof of concept, 61,913 protein-protein interactions were confidently predicted and modeled for the proteome of E. coli. Finally, we validated our method against the human immune disease pathway. CONCLUSIONS Protein docking supported by evolutionary restraints and machine learning can be used to reliably identify and model biologically relevant protein assemblies at the proteome scale. Moreover, the accuracy of the identification of protein-protein interactions is improved by considering only those protein pairs co-localized in the same cellular compartment and involved in the same biological process. The modeling protocol described in this communication can be applied to detect protein-protein interactions in other organisms and pathways as well as to construct dimer structures and estimate the confidence of protein interactions experimentally identified with high-throughput techniques.
Affiliation(s)
- Surabhi Maheshwari
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, USA
|
36
|
Bartuzi D, Kaczor AA, Targowska-Duda KM, Matosiuk D. Recent Advances and Applications of Molecular Docking to G Protein-Coupled Receptors. Molecules 2017; 22:molecules22020340. [PMID: 28241450 PMCID: PMC6155844 DOI: 10.3390/molecules22020340] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 12/12/2016] [Revised: 01/27/2017] [Accepted: 02/15/2017] [Indexed: 12/16/2022]
Abstract
The growing number of studies on the G protein-coupled receptor (GPCR) family is a source of noticeable improvement in our understanding of how these proteins function. GPCRs are responsible for a vast part of signaling in vertebrates and, as such, invariably remain in the spotlight of medicinal chemistry. A deeper insight into the underlying mechanisms of interesting phenomena observed in GPCRs, such as biased signaling or allosteric modulation, can be gained with experimental and computational studies. The latter play an important role in this process, since they allow for observations on scales inaccessible for most other methods. One of the key steps in such studies is proper computational reconstruction of actual ligand-receptor or protein-protein interactions, a process called molecular docking. A number of improvements and innovative applications of this method were documented recently. In this review, we focus particularly on innovations in docking to GPCRs.
Affiliation(s)
- Damian Bartuzi
  - Department of Synthesis and Chemical Technology of Pharmaceutical Substances with Computer Modelling Lab, Medical University of Lublin, 4A Chodźki Str., PL20093 Lublin, Poland.
- Agnieszka A Kaczor
  - Department of Synthesis and Chemical Technology of Pharmaceutical Substances with Computer Modelling Lab, Medical University of Lublin, 4A Chodźki Str., PL20093 Lublin, Poland.
  - School of Pharmacy, University of Eastern Finland, Yliopistonranta 1, P.O. Box 1627, FI-70211 Kuopio, Finland.
- Dariusz Matosiuk
  - Department of Synthesis and Chemical Technology of Pharmaceutical Substances with Computer Modelling Lab, Medical University of Lublin, 4A Chodźki Str., PL20093 Lublin, Poland.
|
37
|
Brylinski M. Local Alignment of Ligand Binding Sites in Proteins for Polypharmacology and Drug Repositioning. Methods Mol Biol 2017; 1611:109-122. [PMID: 28451975 PMCID: PMC5513668 DOI: 10.1007/978-1-4939-7015-5_9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 05/30/2023]
Abstract
The administration of drugs is a key strategy in pharmacotherapy to treat diseases. Drugs are typically developed to modulate the function of specific proteins, which are directly associated with particular disease states. Nonetheless, recent studies suggest that protein-drug interactions are rather promiscuous and the majority of pharmaceuticals exhibit activity against multiple, often unrelated proteins. Certainly, the lack of selectivity often leads to drug side effects; on the other hand, these polypharmacological attributes can be used to develop drugs acting on multiple targets within a unique disease pathway, as well as to identify new targets for existing drugs, which is known as drug repositioning. To support drug development and repurposing, we developed eMatchSite, a new approach to detect binding sites capable of binding similar compounds. eMatchSite is available as standalone software and a webserver at http://www.brylinski.org/ematchsite.
Affiliation(s)
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, 407 Choppin Hall, Baton Rouge, LA, 70803, USA.
  - Center for Computation and Technology, Louisiana State University, 2054 Digital Media Center, Baton Rouge, LA, 70803, USA.
|
38
|
Assessing the similarity of ligand binding conformations with the Contact Mode Score. Comput Biol Chem 2016; 64:403-413. [PMID: 27620381 DOI: 10.1016/j.compbiolchem.2016.08.007] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 05/20/2016] [Revised: 08/17/2016] [Accepted: 08/25/2016] [Indexed: 11/22/2022]
Abstract
Structural and computational biologists often need to measure the similarity of ligand binding conformations. The commonly used root-mean-square deviation (RMSD) is not only ligand-size dependent, but also may fail to capture biologically meaningful binding features. To address these issues, we developed the Contact Mode Score (CMS), a new metric to assess the conformational similarity based on intermolecular protein-ligand contacts. The CMS is less dependent on the ligand size and has the ability to include flexible receptors. In order to effectively compare binding poses of non-identical ligands bound to different proteins, we further developed the eXtended Contact Mode Score (XCMS). We believe that CMS and XCMS provide a meaningful assessment of the similarity of ligand binding conformations. CMS and XCMS are freely available at http://brylinski.cct.lsu.edu/content/contact-mode-score and http://geaux-computational-bio.github.io/contact-mode-score/.
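The contrast between RMSD and a contact-based comparison can be sketched as follows. This is only an illustration under stated assumptions: the 4.5 Å cutoff, the function names, and the use of a Jaccard overlap of contact sets are choices made for this example and should not be read as the published CMS or XCMS formula.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float, float]


def rmsd(pose_a: List[Point], pose_b: List[Point]) -> float:
    """Root-mean-square deviation between two poses with matched atom order."""
    sq = sum(math.dist(p, q) ** 2 for p, q in zip(pose_a, pose_b))
    return math.sqrt(sq / len(pose_a))


def contacts(ligand: List[Point], protein: List[Point],
             cutoff: float = 4.5) -> Set[Tuple[int, int]]:
    """Set of (ligand atom index, protein atom index) pairs within the cutoff."""
    return {(i, j)
            for i, lig in enumerate(ligand)
            for j, prot in enumerate(protein)
            if math.dist(lig, prot) <= cutoff}


def contact_similarity(pose_a: List[Point], pose_b: List[Point],
                       protein: List[Point], cutoff: float = 4.5) -> float:
    """Jaccard overlap of the two contact sets (a stand-in for a contact score)."""
    ca = contacts(pose_a, protein, cutoff)
    cb = contacts(pose_b, protein, cutoff)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 1.0


if __name__ == "__main__":
    protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    pose_a = [(1.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
    pose_b = [(2.0, 0.0, 0.0), (8.0, 0.0, 0.0)]   # shifted, same contacts
    pose_c = [(9.0, 0.0, 0.0), (1.0, 0.0, 0.0)]   # flipped, contacts swapped
    print(rmsd(pose_a, pose_b), contact_similarity(pose_a, pose_b, protein))
    print(rmsd(pose_a, pose_c), contact_similarity(pose_a, pose_c, protein))
```

In the toy data, the shifted pose keeps every contact despite a nonzero RMSD, while the flipped pose loses all of them, which is the kind of distinction a contact-based measure is meant to capture.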
|
39
|
Abstract
The success of molecular modeling and computational chemistry efforts is, by definition, dependent on quality software applications. Open source software development provides many advantages to users of modeling applications, not the least of which is that the software is free and completely extendable. In this review we categorize, enumerate, and describe available open source software packages for molecular modeling and computational chemistry. An updated online version of this catalog can be found at https://opensourcemolecularmodeling.github.io.
|
40
|
Fang Y, Ding Y, Feinstein WP, Koppelman DM, Moreno J, Jarrell M, Ramanujam J, Brylinski M. GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing. PLoS One 2016; 11:e0158898. [PMID: 27420300 PMCID: PMC4946785 DOI: 10.1371/journal.pone.0158898] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/01/2016] [Accepted: 06/23/2016] [Indexed: 12/19/2022]
Abstract
Computational modeling of drug binding to proteins is an integral component of direct drug design. Particularly, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of the large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern, multi-core Central Processing Units (CPUs) as well as massively parallel accelerators, Intel Xeon Phi and NVIDIA Graphics Processing Unit (GPU). First, we carried out a thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs, yielding near-ideal scaling. Further, using Xeon Phi gives a 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, GeForce GTX 980, achieves a speedup as high as 3.5×. On that account, GeauxDock can take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open-sourced and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249.
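The abstract describes GeauxDock as built upon the Monte Carlo algorithm. A generic Metropolis Monte Carlo loop is sketched below on a toy one-dimensional energy function. The energy function, step size, temperature, and the function name `metropolis` are illustrative assumptions; the sketch does not reproduce GeauxDock's scoring function, move set, or parallelization.

```python
import math
import random
from typing import Callable, Tuple


def metropolis(energy: Callable[[float], float],
               x0: float,
               steps: int = 5000,
               step_size: float = 0.5,
               temperature: float = 1.0,
               seed: int = 0) -> Tuple[float, float]:
    """Return the lowest-energy state visited by a Metropolis random walk."""
    rng = random.Random(seed)  # seeded for reproducible runs
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        cand = x + rng.uniform(-step_size, step_size)  # random trial move
        e_cand = energy(cand)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / T).
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / temperature):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


if __name__ == "__main__":
    # Toy "scoring function" with a single minimum at x = 3.
    best_x, best_e = metropolis(lambda x: (x - 3.0) ** 2, x0=0.0)
    print(best_x, best_e)
```

In a docking setting the state would be a ligand pose (translation, rotation, and torsions) rather than a single number, and the energy would be the scoring function; each trial pose is independent work, which is one reason such searches map well onto multi-core CPUs and accelerators.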
Affiliation(s)
- Ye Fang
  - School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Yun Ding
  - Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Wei P. Feinstein
  - High-Performance Computing, Louisiana State University, Baton Rouge, Louisiana, United States of America
- David M. Koppelman
  - School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Juana Moreno
  - Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Mark Jarrell
  - Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- J. Ramanujam
  - School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
|
41
|
Tan Z, Chaudhai R, Zhang S. Polypharmacology in Drug Development: A Minireview of Current Technologies. ChemMedChem 2016; 11:1211-8. [PMID: 27154144 DOI: 10.1002/cmdc.201600067] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 02/01/2016] [Revised: 03/21/2016] [Indexed: 01/09/2023]
Abstract
Polypharmacology, the process in which a single drug is able to bind to multiple targets specifically and simultaneously, is an emerging paradigm in drug development. The potency of a given drug can be increased through the engagement of multiple targets involved in a certain disease. Polypharmacology may also help identify novel applications of existing drugs through drug repositioning. However, many problems and challenges remain in this field. Rather than covering all aspects of polypharmacology, this Minireview is focused primarily on recently reported techniques, from bioinformatics technologies to cheminformatics approaches as well as text-mining-based methods, all of which have made significant contributions to the research of polypharmacology.
Affiliation(s)
- Zhi Tan
  - Integrated Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
  - The University of Texas Graduate School of Biomedical Sciences, Houston, TX, 77030, USA
- Rajan Chaudhai
  - Integrated Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Shuxing Zhang
  - Integrated Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
  - The University of Texas Graduate School of Biomedical Sciences, Houston, TX, 77030, USA
|
42
|
Wang C, Hu G, Wang K, Brylinski M, Xie L, Kurgan L. PDID: database of molecular-level putative protein-drug interactions in the structural human proteome. Bioinformatics 2016; 32:579-86. [PMID: 26504143 PMCID: PMC5963357 DOI: 10.1093/bioinformatics/btv597] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 06/25/2015] [Revised: 09/24/2015] [Accepted: 10/12/2015] [Indexed: 12/31/2022]
Abstract
MOTIVATION Many drugs interact with numerous proteins besides their intended therapeutic targets, and a substantial portion of these interactions is yet to be elucidated. The Protein-Drug Interaction Database (PDID) addresses the incompleteness of these data by providing access to putative protein-drug interactions that cover the entire structural human proteome. RESULTS PDID covers 9652 structures from 3746 proteins and houses 16 800 putative interactions generated from close to 1.1 million accurate, all-atom structure-based predictions for several dozen popular drugs. The predictions were generated with three modern methods: ILbind, SMAP and eFindSite. They are accompanied by propensity scores that quantify the likelihood of interactions and by coordinates of the putative location of the binding drugs in the corresponding protein structures. PDID complements the current databases that focus on curated interactions and the BioDrugScreen database, which relies on docking to find putative interactions. Moreover, we also include experimentally curated interactions, which are linked to their sources: DrugBank, BindingDB and Protein Data Bank. Our database can be used to facilitate studies related to the polypharmacology of drugs, including repurposing and explaining side effects of drugs. AVAILABILITY AND IMPLEMENTATION The PDID database is freely available at http://biomine.ece.ualberta.ca/PDID/.
Affiliation(s)
- Chen Wang
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada T6G 2V4
- Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, People's Republic of China
- Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, People's Republic of China
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Lei Xie
- Department of Computer Science, Hunter College, City University of New York (CUNY), New York, NY 10065, USA
- Lukasz Kurgan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada T6G 2V4; Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
43
44
Ding Y, Fang Y, Feinstein WP, Ramanujam J, Koppelman DM, Moreno J, Brylinski M, Jarrell M. GeauxDock: A novel approach for mixed-resolution ligand docking using a descriptor-based force field. J Comput Chem 2015; 36:2013-26. [PMID: 26250822 DOI: 10.1002/jcc.24031] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/10/2015] [Revised: 06/07/2015] [Accepted: 07/03/2015] [Indexed: 12/26/2022]
Abstract
Molecular docking is an important component of computer-aided drug discovery. In this communication, we describe GeauxDock, a new docking approach that builds on the ideas of ligand homology modeling. GeauxDock features a descriptor-based scoring function integrating evolutionary constraints with physics-based energy terms, a mixed-resolution molecular representation of protein-ligand complexes, and an efficient Monte Carlo sampling protocol. To drive docking simulations toward experimental conformations, the scoring function was carefully optimized to produce a correlation between the total pseudoenergy and the native-likeness of binding poses. Indeed, benchmarking calculations demonstrate that GeauxDock has a strong capacity to identify near-native conformations across docking trajectories, with an area under the receiver operating characteristic curve of 0.85. By excluding closely related templates, we show that GeauxDock maintains its accuracy at lower levels of homology through the increased contribution from physics-based energy terms compensating for weak evolutionary constraints. GeauxDock is available at http://www.institute.loni.org/lasigma/package/dock/.
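The area under the ROC curve quoted in this abstract measures how well a score separates near-native poses from the rest. The sketch below is only an illustration of that metric, not GeauxDock's benchmarking code: the function name, the toy pseudoenergies, and the near-native labels are all hypothetical, and it assumes that a lower pseudoenergy should indicate a more native-like pose.

```python
def auroc_low_energy_is_native(energies, near_native):
    """Probability that a random near-native pose has a lower pseudoenergy
    than a random non-native pose (ties count as one half).

    This pairwise definition is equivalent to the area under the ROC curve.
    Both argument names are hypothetical, chosen for this illustration.
    """
    positives = [e for e, n in zip(energies, near_native) if n]
    negatives = [e for e, n in zip(energies, near_native) if not n]
    if not positives or not negatives:
        raise ValueError("need at least one pose of each class")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p < q:
                wins += 1.0   # near-native pose ranked better
            elif p == q:
                wins += 0.5   # tie
    return wins / (len(positives) * len(negatives))


# Toy trajectory: six poses with made-up pseudoenergies and labels.
toy_energies = [-9.1, -8.4, -7.9, -7.7, -6.0, -5.2]
toy_labels = [True, True, False, True, False, False]
toy_auc = auroc_low_energy_is_native(toy_energies, toy_labels)
```

A value of 1.0 means every near-native pose scores better than every non-native one, and 0.5 corresponds to random ranking.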
Affiliation(s)
- Yun Ding
- Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, 70803
- Ye Fang
- School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
- Wei P Feinstein
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
- Jagannathan Ramanujam
- School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
- David M Koppelman
- School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, 70803
- Juana Moreno
- Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, 70803; Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
- Michal Brylinski
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
- Mark Jarrell
- Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, 70803; Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
45
Abstract
Drug discovery utilizes chemical biology and computational drug design approaches for the efficient identification and optimization of lead compounds. Chemical biology is mostly involved in the elucidation of the biological function of a target and the mechanism of action of a chemical modulator. On the other hand, computer-aided drug design makes use of the structural knowledge of either the target (structure-based) or known ligands with bioactivity (ligand-based) to facilitate the determination of promising candidate drugs. Various virtual screening techniques are now being used by both pharmaceutical companies and academic research groups to reduce the cost and time required for the discovery of a potent drug. Despite the rapid advances in these methods, continuous improvements are critical for future drug discovery tools. Advantages presented by structure-based and ligand-based drug design suggest that their complementary use, as well as their integration with experimental routines, has a powerful impact on rational drug design. In this article, we give an overview of current computational drug design methods and their application in integrated rational drug development to aid in the progress of drug discovery research.
Affiliation(s)
- Stephani Joy Y Macalino
- National Leading Research Laboratory of Molecular Modeling and Drug Design, College of Pharmacy and Graduate School of Pharmaceutical Sciences, and Global Top 5 Research Program, Ewha Womans University, Seoul, 120-750, Korea
- Vijayakumar Gosu
- National Leading Research Laboratory of Molecular Modeling and Drug Design, College of Pharmacy and Graduate School of Pharmaceutical Sciences, and Global Top 5 Research Program, Ewha Womans University, Seoul, 120-750, Korea
- Sunhye Hong
- National Leading Research Laboratory of Molecular Modeling and Drug Design, College of Pharmacy and Graduate School of Pharmaceutical Sciences, and Global Top 5 Research Program, Ewha Womans University, Seoul, 120-750, Korea
- Sun Choi
- National Leading Research Laboratory of Molecular Modeling and Drug Design, College of Pharmacy and Graduate School of Pharmaceutical Sciences, and Global Top 5 Research Program, Ewha Womans University, Seoul, 120-750, Korea
46
Feinstein WP, Brylinski M. Calculating an optimal box size for ligand docking and virtual screening against experimental and predicted binding pockets. J Cheminform 2015; 7:18. [PMID: 26082804 PMCID: PMC4468813 DOI: 10.1186/s13321-015-0067-5] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Received: 12/20/2014] [Accepted: 04/14/2015] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Computational approaches have emerged as an instrumental methodology in modern research. For example, virtual screening by molecular docking is routinely used in computer-aided drug discovery. One of the critical parameters for ligand docking is the size of the search space used to identify low-energy binding poses of drug candidates. Currently available docking packages often come with a default protocol for calculating the box size; however, many of these procedures have not been systematically evaluated. METHODS In this study, we investigate how the docking accuracy of AutoDock Vina is affected by the selection of a search space. We propose a new procedure for calculating the optimal docking box size that maximizes the accuracy of binding pose prediction against a non-redundant and representative dataset of 3,659 protein-ligand complexes selected from the Protein Data Bank. Subsequently, we use the Directory of Useful Decoys, Enhanced to demonstrate that the optimized docking box size also yields an improved ranking in virtual screening. Binding pockets in both datasets are derived from the experimental complex structures and, additionally, predicted by eFindSite. RESULTS A systematic analysis of ligand binding poses generated by AutoDock Vina shows that the highest accuracy is achieved when the dimensions of the search space are 2.9 times larger than the radius of gyration of a docking compound. Subsequent virtual screening benchmarks demonstrate that this optimized docking box size also improves compound ranking. For instance, using predicted ligand binding sites, the average enrichment factor calculated for the top 1% (10%) of the screening library is 8.20 (3.28) for the optimized protocol, compared to 7.67 (3.19) for the default procedure. Depending on the evaluation metric, the optimal docking box size gives better ranking in virtual screening for about two-thirds of target proteins. CONCLUSIONS This fully automated procedure can be used to optimize docking protocols in order to improve the ranking accuracy in production virtual screening simulations. Importantly, the optimized search space systematically yields better results than the default method not only for experimental pockets, but also for those predicted from protein structures. A script for calculating the optimal docking box size is freely available at www.brylinski.org/content/docking-box-size.
Graphical abstract: We developed a procedure to optimize the box size in molecular docking calculations. Left panel shows the predicted binding pose of NADP (green sticks) compared to the experimental complex structure of human aldose reductase (blue sticks) using a default protocol. Right panel shows the docking accuracy using an optimized box size.
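The rule reported in this abstract, a search-space dimension of 2.9 times the ligand's radius of gyration, can be illustrated with a short calculation. This is a minimal sketch and not the authors' published script: the function names are hypothetical, the radius of gyration is computed without mass weighting (an assumption, since the abstract does not say how it is defined), and the coordinates are a toy example.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a set of (x, y, z) atom coordinates.

    Assumption: all atoms are weighted equally; the original procedure
    may define the quantity differently.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("at least one atom is required")
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def docking_box_edge(coords, ratio=2.9):
    """Edge length of a cubic search space: ratio times the radius of gyration.

    The default ratio of 2.9 is the value reported in the abstract.
    """
    return ratio * radius_of_gyration(coords)


# Toy ligand: four atoms placed 1 unit from the origin in the xy-plane.
toy_ligand = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
edge = docking_box_edge(toy_ligand)
```

In an AutoDock Vina configuration, a value like `edge` would presumably be supplied for `size_x`, `size_y`, and `size_z`, with the box centered on the binding pocket.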
Affiliation(s)
- Wei P Feinstein
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803 USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803 USA
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803 USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803 USA
47
Feinstein WP, Moreno J, Jarrell M, Brylinski M. Accelerating the Pace of Protein Functional Annotation With Intel Xeon Phi Coprocessors. IEEE Trans Nanobioscience 2015; 14:429-439. [PMID: 25769169 DOI: 10.1109/tnb.2015.2403776] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Intel Xeon Phi is a new addition to the family of powerful parallel accelerators. The range of its potential applications in computationally driven research is broad; however, at present, the repository of scientific codes is still relatively limited. In this study, we describe the development and benchmarking of a parallel version of eFindSite, a structural bioinformatics algorithm for the prediction of ligand-binding sites in proteins. Implemented for the Intel Xeon Phi platform, the parallelization of the structure alignment portion of eFindSite using pragma-based OpenMP brings about the desired performance improvements, which scale well with the number of computing cores. Compared to a serial version, the parallel code runs 11.8 and 10.1 times faster on the CPU and the coprocessor, respectively; when both resources are utilized simultaneously, the speedup is 17.6. For example, ligand-binding predictions for 501 benchmarking proteins are completed in 2.1 hours on a single Stampede node equipped with the Intel Xeon Phi card compared to 3.1 hours without the accelerator and 36.8 hours required by a serial version. In addition to the satisfactory parallel performance, porting existing scientific codes to the Intel Xeon Phi architecture is relatively straightforward with a short development time due to the support of common parallel programming models by the coprocessor. The parallel version of eFindSite is freely available to the academic community at www.brylinski.org/efindsite.
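The speedup figures in this abstract can be roughly cross-checked against the quoted wall times. The short calculation below is only an arithmetic illustration with a hypothetical helper name; the wall times are taken from the abstract, and because they are rounded to one decimal place the recomputed ratios are expected to differ slightly from the reported 11.8 and 17.6.

```python
def speedup(serial_hours, parallel_hours):
    """Ratio of serial to parallel wall time (hypothetical helper)."""
    if parallel_hours <= 0:
        raise ValueError("parallel time must be positive")
    return serial_hours / parallel_hours


# Wall times quoted in the abstract for 501 benchmarking proteins.
SERIAL_HOURS = 36.8        # serial version
CPU_ONLY_HOURS = 3.1       # parallel code, node without the accelerator
CPU_PLUS_PHI_HOURS = 2.1   # parallel code, node with the Xeon Phi card

cpu_speedup = speedup(SERIAL_HOURS, CPU_ONLY_HOURS)           # about 11.9
combined_speedup = speedup(SERIAL_HOURS, CPU_PLUS_PHI_HOURS)  # about 17.5
```

Both ratios agree with the reported speedups to within the rounding of the wall times.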
48
Petrey D, Chen TS, Deng L, Garzon JI, Hwang H, Lasso G, Lee H, Silkov A, Honig B. Template-based prediction of protein function. Curr Opin Struct Biol 2015; 32:33-8. [PMID: 25678152 DOI: 10.1016/j.sbi.2015.01.007] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 11/21/2014] [Revised: 01/13/2015] [Accepted: 01/19/2015] [Indexed: 12/11/2022]
Abstract
We discuss recent approaches for structure-based protein function annotation. We focus on template-based methods where the function of a query protein is deduced from that of a template for which both the structure and function are known. We describe the different ways of identifying a template. These are typically based on sequence analysis but new methods based on purely structural similarity are also being developed that allow function annotation based on structural relationships that cannot be recognized by sequence. The growing number of available structures of known function, improved homology modeling techniques and new developments in the use of structure allow template-based methods to be applied on a proteome-wide scale and in many different biological contexts. This progress significantly expands the range of applicability of structural information in function annotation to a level that previously was only achievable by sequence comparison.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- T Scott Chen
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Lei Deng
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Jose Ignacio Garzon
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Howook Hwang
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Gorka Lasso
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Hunjoong Lee
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Antonina Silkov
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
- Barry Honig
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Department of Systems Biology, Center for Computational Biology and Bioinformatics, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, United States
49
Maheshwari S, Brylinski M. Prediction of protein-protein interaction sites from weakly homologous template structures using meta-threading and machine learning. J Mol Recognit 2015; 28:35-48. [DOI: 10.1002/jmr.2410] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/26/2014] [Revised: 06/19/2014] [Accepted: 06/27/2014] [Indexed: 11/11/2022]
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
50
Abstract
Ligand binding is required for many proteins to function properly. A large number of bioinformatics tools have been developed to predict ligand binding sites as a first step in understanding a protein's function or to facilitate docking computations in virtual screening based drug design. The prediction usually requires only the three-dimensional structure (experimentally determined or computationally modeled) of the target protein to be searched for ligand binding site(s), and Web servers have been built, allowing the free and simple use of prediction tools. In this chapter, we review the underlying concepts of the methods used by various tools, and discuss their different features and the related issues of ligand binding site prediction. Some cautionary notes about the use of these tools are also provided.
Affiliation(s)
- Zhong-Ru Xie
- Institute of Biomedical Sciences, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei, 115, Taiwan