1
Guo D, Zhao H, Huang J, Zhao J, Xu X, Liu Y, Yang Y. PocketSCP: A Method for Spatiotemporal Topological Visualization and Analysis of Protein Pocket Dynamics. J Chem Inf Model 2025; 65:5231-5241. [PMID: 40358406 DOI: 10.1021/acs.jcim.5c00728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/15/2025]
Abstract
The identification and analysis of pockets are crucial for understanding the functional mechanisms and therapeutic potential of proteins. However, it is challenging to track the dynamic characteristics of the pockets. In this paper, we present a method for the visualization and analysis of protein pocket dynamics called PocketSCP. Initially, the representation of lining amino acid atoms is proposed to characterize the spatiotemporal and topological properties of pockets. Subsequently, 3D mapping based on a reference molecular conformation is designed to generate 3D distribution maps of pockets. To facilitate observation and analysis, 3D to 2D plane mapping based on equidistant azimuthal projection is designed, leveraging the near-spherical shape properties of protein molecules. Finally, the efficacy of our method in identifying potential patterns within protein pockets is demonstrated through experimental validation.
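The 3D-to-2D step named in this abstract can be illustrated with a minimal sketch of an azimuthal equidistant projection. This is not the PocketSCP implementation: the function name `azimuthal_equidistant`, the choice of the +z axis as the projection pole, and the use of a user-supplied molecular center are all assumptions made for illustration.

```python
import math

def azimuthal_equidistant(point, center=(0.0, 0.0, 0.0)):
    """Project a 3D point onto a 2D plane (hypothetical helper).

    The radial distance in the plane equals the angle (in radians) between
    the point's direction from `center` and the +z pole; the in-plane
    direction equals the azimuth around that pole.
    """
    x, y, z = (p - c for p, c in zip(point, center))
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point coincides with the projection center")
    # Angular distance from the pole, clamped against rounding error.
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    # Azimuth measured in the xy-plane.
    phi = math.atan2(y, x)
    return theta * math.cos(phi), theta * math.sin(phi)
```

Under these assumptions, a pocket atom lying on the pole maps to the origin of the 2D map, and an atom on the opposite side of the molecule maps to the outer circle of radius π, so angular distances from the pole are preserved.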
Affiliation(s)
- Dongliang Guo
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, P. R. China
  - The Key Laboratory for Software Engineering of Hebei Province, Qinhuangdao 066004, China
- Hanqing Zhao
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, P. R. China
- Jiabin Huang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, P. R. China
- Jiawei Zhao
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, P. R. China
- Ximing Xu
  - School of Medicine and Pharmacy, Key Laboratory of Marine Drugs, Chinese Ministry of Education, Ocean University of China, Qingdao 266100, P. R. China
  - Marine Biomedical Research Institute of Qingdao, Qingdao 266100, P. R. China
- Yapeng Liu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, P. R. China
- Ying Yang
  - Liren College, Yanshan University, Qinhuangdao 066004, P. R. China
2
Chandel S, Parashar B, Ali SA, Sharma S. Predictive cavity and binding site identification: Techniques and applications. Advances in Pharmacology (San Diego, Calif.) 2025; 103:43-63. [PMID: 40175054 DOI: 10.1016/bs.apha.2025.02.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2025]
Abstract
Strategies for recognizing predictive cavities and binding site identification are decisive for drug discovery, molecular docking, and tracing protein-ligand interactions. The two major approaches, experimental and computational, strive to predict and distinguish a protein's binding sites. Numerous small molecules are associated with the binding sites and influence normal biological functioning. The various structure-based strategies, such as molecular dynamics, docking simulations, algorithms for pocket identification, and homology modeling, are covered under computational techniques, where they provide a comprehensive understanding of possible binding pockets that hinges on the structure of the protein and its physicochemical properties. The various sequence-based approaches rely on the homology of the sequence and on machine learning models trained on already known protein-ligand complexes to anticipate the interactive sites of novel proteins. High-resolution structures and protein footprinting, which map the conformational changes attributable to ligands and binding sites, can be obtained through diverse experimental methods such as NMR spectroscopy, mass spectrometry, and X-ray crystallography. These techniques are pivotal for drug discovery and design, as the efficiency and specificity of ligands can be amplified through virtual screening and structure-based drug design. Moreover, the ongoing developments in this domain promise to drive the revolution and efficiency in drug discovery and research in molecular biology.
Affiliation(s)
- Shilpa Chandel
  - Faculty of Pharmaceutical Sciences, The ICFAI University, Himachal Pradesh, India; Department of Pharmacy, Banasthali Vidyapith, Banasthali, Rajasthan, India.
- Bharat Parashar
  - Divine International College of Pharmacy, Gwalior, Madhya Pradesh, India
- Syed Atif Ali
  - Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Shailesh Sharma
  - Amar Shaheed Baba Ajit Singh Jujhar Singh Memorial College of Pharmacy, Bela, Punjab, India
3
Sacher S, Ray A. In Silico Strategies for Characterizing Inner Cavities of Lipid-Binding Proteins. Methods Mol Biol 2025; 2888:305-320. [PMID: 39699739 DOI: 10.1007/978-1-0716-4318-1_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2024]
Abstract
Cavities in proteins perform diverse functions such as substrate binding, enzyme catalysis, passage for transportation of small molecules, and protein oligomerization. Often, the physical properties of these cavities are closely linked to the protein function; such as the hydrophobic lipid-binding cavities in lipid-binding proteins (LBPs) that protect lipid substrates from the larger aqueous milieu. Therefore, the characterization of protein cavities can provide valuable insights into protein structure-function relationships, hinting toward their mechanism of action while aiding in the identification of ligand binding sites that are essential for drug discovery approaches. Several algorithms have historically been designed to identify and characterize the different types of cavities in protein structures. We summarize these algorithms and provide a step-by-step guide for locating and characterizing internal cavities in proteins using CICLOP by using ATP-binding cassette transporter A1 (ABCA1) as an example.
Affiliation(s)
- Sukriti Sacher
  - Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi, India
- Arjun Ray
  - Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi, India.
4
Wang L. sesA: A Program for the Analytic Computation of Solvent-Excluded Surface Areas. ChemistryOpen 2024; 13:e202400172. [PMID: 39439129 PMCID: PMC11625950 DOI: 10.1002/open.202400172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Revised: 07/10/2024] [Indexed: 10/25/2024]
Abstract
The surface area of a molecule, an inherent geometric property of its structure, plays important roles in its solvation and functioning. Here we present an accurate and robust program, sesA, for the analytic computation of solvent-excluded surface (SES) areas. The accuracy and robustness are achieved through the analytic computations of all the solvent-accessible surface (SAS) regions for a surface atom and probe-probe intersections. The detailed comparisons of the areas for a large set of protein structures by sesA and msms, a de facto standard for analytic SAS and SES computations, confirm sesA's accuracy to a good extent and at the same time reveal significant differences between them. The unprecedented accuracy and robustness of sesA make it possible to analyze in great detail the surface areas of any molecules in general and biomolecules in particular.
Affiliation(s)
- Lincong Wang
  - The College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
5
Vural O, Jololian L, Pan L. DeepLigType: Predicting Ligand Types of Protein-Ligand Binding Sites Using a Deep Learning Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; PP:116-123. [PMID: 39509302 DOI: 10.1109/tcbb.2024.3493820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2024]
Abstract
The analysis of protein-ligand binding sites plays a crucial role in the initial stages of drug discovery. Accurately predicting the ligand types that are likely to bind to protein-ligand binding sites enables more informed decision making in drug design. Our study, DeepLigType, determines protein-ligand binding sites using Fpocket and then predicts the ligand type of these pockets with the deep learning model, Convolutional Block Attention Module (CBAM) with ResNet. CBAM-ResNet has been trained to accurately predict five distinct ligand types. We classified protein-ligand binding sites into five different categories according to the type of response ligands cause when they bind to their target proteins, which are antagonist, agonist, activator, inhibitor, and others. We created a novel dataset, referred to as LigType5, from the widely recognized PDBbind and scPDB dataset for training and testing our model. While the literature mostly focuses on the specificity and characteristic analysis of protein binding sites by experimental (laboratory-based) methods, we propose a computational method with the DeepLigType architecture. DeepLigType demonstrated an accuracy of 74.30% and an AUC of 0.83 in ligand type prediction on a novel test dataset using the CBAM-ResNet deep learning model.
6
Lee D, Hwang W, Byun J, Shin B. Turbocharging protein binding site prediction with geometric attention, inter-resolution transfer learning, and homology-based augmentation. BMC Bioinformatics 2024; 25:306. [PMID: 39304807 DOI: 10.1186/s12859-024-05923-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 09/05/2024] [Indexed: 09/22/2024]
Abstract
BACKGROUND Locating small molecule binding sites in target proteins, in the resolution of either pocket or residue, is critical in many drug-discovery scenarios. Since it is not always easy to find such binding sites using conventional methods, different deep learning methods to predict binding sites out of protein structures have been developed in recent years. The existing deep learning based methods have several limitations, including (1) the inefficiency of the CNN-only architecture, (2) loss of information due to excessive post-processing, and (3) the under-utilization of available data sources. METHODS We present a new model architecture and training method that resolves the aforementioned problems. First, by layering geometric self-attention units on top of residue-level 3D CNN outputs, our model overcomes the problems of CNN-only architectures. Second, by configuring the fundamental units of computation as residues and pockets instead of voxels, our method reduced the information loss from post-processing. Lastly, by employing inter-resolution transfer learning and homology-based augmentation, our method maximizes the utilization of available data sources to a significant extent. RESULTS The proposed method significantly outperformed all state-of-the-art baselines regarding both resolutions, pocket and residue. An ablation study demonstrated the indispensability of our proposed architecture, as well as transfer learning and homology-based augmentation, for achieving optimal performance. We further scrutinized our model's performance through a case study involving human serum albumin, which demonstrated our model's superior capability in identifying multiple binding sites of the protein, outperforming the existing methods. CONCLUSIONS We believe that our contribution to the literature is twofold. Firstly, we introduce a novel computational method for binding site prediction with practical applications, substantiated by its strong performance across diverse benchmarks and case studies. Secondly, the innovative aspects in our method, specifically the design of the model architecture, inter-resolution transfer learning, and homology-based augmentation, would serve as useful components for future work.
Affiliation(s)
- Bonggun Shin
  - Deargen, Seoul, Republic of Korea.
  - SK Life Science, Inc., Paramus, NJ, USA.
7
Gao J, Liu H, Zhuo C, Zeng C, Zhao Y. Predicting Small Molecule Binding Nucleotides in RNA Structures Using RNA Surface Topography. J Chem Inf Model 2024. [PMID: 39230508 DOI: 10.1021/acs.jcim.4c01264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
RNA small molecule interactions play a crucial role in drug discovery and inhibitor design. Identifying RNA small molecule binding nucleotides is essential and requires methods that exhibit a high predictive ability to facilitate drug discovery and inhibitor design. Existing methods can predict the binding nucleotides of simple RNA structures, but it is hard to predict binding nucleotides in complex RNA structures with junctions. To address this limitation, we developed a new deep learning model based on spatial correlation, ZHmolReSTasite, which can accurately predict binding nucleotides of small and large RNA with junctions. We utilize RNA surface topography to consider the spatial correlation, characterizing nucleotides from sequence and tertiary structures to learn a high-level representation. Our method outperforms existing methods for benchmark test sets composed of simple RNA structures, achieving precision values of 72.9% on TE18 and 76.7% on RB9 test sets. For a challenging test set composed of RNA structures with junctions, our method outperforms the second best method by 11.6% in precision. Moreover, ZHmolReSTasite demonstrates robustness regarding the predicted RNA structures. In summary, ZHmolReSTasite successfully incorporates spatial correlation, outperforms previous methods on small and large RNA structures using RNA surface topography, and can provide valuable insights into RNA small molecule prediction and accelerate RNA inhibitor design.
Affiliation(s)
- Jiaming Gao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chen Zhuo
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
8
Lv N, Cao Z. Subpocket-Based Analysis Approach for the Protein Pocket Dynamics. J Chem Theory Comput 2024; 20:4909-4920. [PMID: 38772734 DOI: 10.1021/acs.jctc.4c00476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2024]
Abstract
Structural and dynamic characteristics of protein pockets remarkably influence their biological functions and are also important for enzyme engineering and new drug research and development. To date, several software tools have been developed to analyze the dynamic properties of protein pockets. However, due to the complexity and diversity of the pocket information during the kinetic relaxation, further improvement and capacity expansion of current tools are required. Here, we developed a software platform, AlphaTraj, which proposes and implements a computational strategy that divides the whole protein pocket into subpockets and examines various properties of the subpockets, such as survival time, stability, and correlation. We also proposed a scoring function for the subpockets as well as the whole pocket to visualize the quality of the pocket. Furthermore, we implemented automated conformational search functions for ligand docking and ligand optimization. These functions may help us to gain a deep understanding of the dynamic properties of protein pockets and accelerate protein engineering and the design of inhibitors and small-molecule drugs. The software is freely available at https://github.com/dooo12332/AlphaTraj.git under the GNU GPL license.
Affiliation(s)
- Nan Lv
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, People's Republic of China
- Zexing Cao
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, People's Republic of China
9
Jeevan K, Palistha S, Tayara H, Chong KT. PUResNetV2.0: a deep learning model leveraging sparse representation for improved ligand binding site prediction. J Cheminform 2024; 16:66. [PMID: 38849917 PMCID: PMC11157904 DOI: 10.1186/s13321-024-00865-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 05/27/2024] [Indexed: 06/09/2024]
Abstract
Accurate ligand binding site prediction (LBSP) within proteins is essential for drug discovery. We developed ProteinUNetResNetV2.0 (PUResNetV2.0), leveraging sparse representation of protein structures to improve LBSP accuracy. Our training dataset included protein complexes from 4729 protein families. Evaluations on benchmark datasets showed that PUResNetV2.0 achieved an 85.4% Distance Center Atom (DCA) success rate and a 74.7% F1 Score on the Holo801 dataset, outperforming existing methods. However, its performance in specific cases, such as RNA, DNA, peptide-like ligand, and ion binding site prediction, was limited due to constraints in our training data. Our findings underscore the potential of sparse representation in LBSP, especially for oligomeric structures, suggesting PUResNetV2.0 as a promising tool for computational drug discovery.
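The DCA success rate reported in this abstract can be illustrated with a short sketch. It assumes the commonly used convention that a prediction counts as a success when the predicted pocket center lies within 4 Å of any ligand atom; the abstract itself does not state the threshold, and the names `dca` and `dca_success_rate` are hypothetical.

```python
import math

def dca(predicted_center, ligand_atoms):
    """Distance from a predicted pocket center to the closest ligand atom."""
    return min(math.dist(predicted_center, atom) for atom in ligand_atoms)

def dca_success_rate(cases, threshold=4.0):
    """Fraction of (center, ligand_atoms) cases with DCA <= threshold.

    The 4.0 angstrom default is an assumption, not taken from the abstract.
    """
    if not cases:
        raise ValueError("no cases to evaluate")
    hits = sum(1 for center, atoms in cases if dca(center, atoms) <= threshold)
    return hits / len(cases)
```

Each case pairs one predicted center with the coordinates of the bound ligand, so the rate is computed per structure rather than per atom.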
Affiliation(s)
- Kandel Jeevan
  - Graduate School of Integrated Energy-AI, Jeonbuk National University, Jeonju, 54896, South Korea
- Shrestha Palistha
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea.
- Kil T Chong
  - Graduate School of Integrated Energy-AI, Jeonbuk National University, Jeonju, 54896, South Korea.
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea.
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea.
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju, 54896, South Korea.
10
Gahlawat A, Singh A, Sandhu H, Garg P. CRAFT: a web-integrated cavity prediction tool based on flow transfer algorithm. J Cheminform 2024; 16:12. [PMID: 38291536 PMCID: PMC10829215 DOI: 10.1186/s13321-024-00803-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 01/13/2024] [Indexed: 02/01/2024]
Abstract
Numerous computational methods, including evolutionary-based, energy-based, and geometrical-based methods, are utilized to identify cavities inside proteins. Cavity information aids protein function annotation, drug design, poly-pharmacology, and allosteric site investigation. This article introduces the "flow transfer algorithm" for rapid and effective identification of diverse protein cavities through multidimensional cavity scan. Initially, it identifies delimiter and susceptible tetrahedra to establish boundary regions and provide seed tetrahedra. Seed tetrahedron faces are precisely scanned using the maximum circle radius to transfer seed flow to neighboring tetrahedra. Seed flow continues until terminated by boundaries or forbidden faces, where a face is forbidden if the estimated maximum circle radius is less than or equal to the user-defined maximum circle radius. After a seed scanning, tetrahedra involved in the flow are clustered to locate the cavity. The CRAFT web interface integrates this algorithm for protein cavity identification with enhanced user control. It supports proteins with cofactors, hydrogens, and ligands and provides comprehensive features such as 3D visualization, cavity physicochemical properties, percentage contribution graphs, and highlighted residues for each cavity. CRAFT can be accessed through its web interface at http://pitools.niper.ac.in/CRAFT, complemented by the command version available at https://github.com/PGlab-NIPER/CRAFT/. Scientific contribution: The flow transfer algorithm is a novel geometric approach for accurate and reliable prediction of diverse protein cavities. This algorithm employs a distinct concept involving maximum circle radius within the 3D Delaunay triangulation to address diverse van der Waals radii, while existing methods overlook atom-specific van der Waals radii or rely on complex weighted geometric techniques.
Affiliation(s)
- Anuj Gahlawat
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Sector 67, S.A.S. Nagar, 160062, Punjab, India
- Anjali Singh
  - Department of Computer Science, Kurukshetra University, Kurukshetra, Haryana, India
- Hardeep Sandhu
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Sector 67, S.A.S. Nagar, 160062, Punjab, India
- Prabha Garg
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Sector 67, S.A.S. Nagar, 160062, Punjab, India.
11
Zhu Y, Zhao L, Wen N, Wang J, Wang C. DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction. Bioinformatics 2023; 39:btad560. [PMID: 37688568 PMCID: PMC10516524 DOI: 10.1093/bioinformatics/btad560] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/03/2022] [Revised: 05/09/2023] [Accepted: 09/07/2023] [Indexed: 09/11/2023]
Abstract
MOTIVATION Accurate prediction of drug-target binding affinity (DTA) is crucial for drug discovery. The increase in the publication of large-scale DTA datasets enables the development of various computational methods for DTA prediction. Numerous deep learning-based methods have been proposed to predict affinities, some of which only utilize original sequence information or complex structures, but the effective combination of various information and protein-binding pockets have not been fully mined. Therefore, a new method that integrates available key information is urgently needed to predict DTA and accelerate the drug discovery process. RESULTS In this study, we propose a novel deep learning-based predictor termed DataDTA to estimate the affinities of drug-target pairs. DataDTA utilizes descriptors of predicted pockets and sequences of proteins, as well as low-dimensional molecular features and SMILES strings of compounds as inputs. Specifically, the pockets were predicted from the three-dimensional structure of proteins and their descriptors were extracted as the partial input features for DTA prediction. The molecular representation of compounds based on algebraic graph features was collected to supplement the input information of targets. Furthermore, to ensure effective learning of multiscale interaction features, a dual-interaction aggregation neural network strategy was developed. DataDTA was compared with state-of-the-art methods on different datasets, and the results showed that DataDTA is a reliable prediction tool for affinities estimation. Specifically, the concordance index (CI) of DataDTA is 0.806 and the Pearson correlation coefficient (R) value is 0.814 on the test dataset, which is higher than other methods. AVAILABILITY AND IMPLEMENTATION The codes and datasets of DataDTA are available at https://github.com/YanZhu06/DataDTA.
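The concordance index (CI) quoted in this abstract can be sketched as follows. This is the usual pairwise definition used in affinity prediction, with tied predictions counted as half-concordant; whether DataDTA handles ties exactly this way is an assumption, and the function name is illustrative.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinities are skipped; pairs with equal
    predictions contribute 0.5 (an assumed tie convention).
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            d_true = y_true[i] - y_true[j]
            d_pred = y_pred[i] - y_pred[j]
            if d_pred == 0:
                concordant += 0.5
            elif (d_true > 0) == (d_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every pair is ranked correctly and 0.5 corresponds to a constant or random predictor, which gives a scale for reading a reported CI such as 0.806.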
Affiliation(s)
- Yan Zhu
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Naifeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
12
Guerra JVS, Alves LFG, Bourissou D, Lopes-de-Oliveira PS, Szalóki G. Cavity Characterization in Supramolecular Cages. J Chem Inf Model 2023. [PMID: 37129917 DOI: 10.1021/acs.jcim.3c00328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Confining molecular guests within artificial hosts has provided a major driving force in the rational design of supramolecular cages with tailored properties. Over the last 30 years, a set of design strategies have been developed that enabled the controlled synthesis of a myriad of cages. Recently, there has been a growing interest in involving in silico methods in this toolbox. Cavity shape and size are important parameters that can be easily accessed by inexpensive geometric algorithms. Although these algorithms are well developed for the detection of nonartificial cavities (e.g., enzymes), they are not routinely used for the rational design of supramolecular cages. In order to test the capabilities of this tool, we have evaluated the performance and characteristics of seven different cavity characterization software in the context of 22 analogues of well-known supramolecular cages. Among the tested software, KVFinder project and Fpocket proved to be the most software to characterize supramolecular cavities. With the results of this work, we aim to popularize this underused technique within the supramolecular community.
Affiliation(s)
- João V S Guerra
  - Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), Rua Giuseppe Máximo Scolfaro, 10000, Bosque Das Palmeiras, Campinas, SP 13083-100, Brazil
- Luiz F G Alves
  - Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), Rua Giuseppe Máximo Scolfaro, 10000, Bosque Das Palmeiras, Campinas, SP 13083-100, Brazil
- Didier Bourissou
  - Laboratoire Hétérochimie Fondamentale et Appliquée (LHFA, UMR 5069), CNRS, Université Toulouse III─Paul Sabatier, 118 Route de Narbonne, Toulouse 31062, Cedex 09, France
- Paulo S Lopes-de-Oliveira
  - Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), Rua Giuseppe Máximo Scolfaro, 10000, Bosque Das Palmeiras, Campinas, SP 13083-100, Brazil
- György Szalóki
  - Laboratoire Hétérochimie Fondamentale et Appliquée (LHFA, UMR 5069), CNRS, Université Toulouse III─Paul Sabatier, 118 Route de Narbonne, Toulouse 31062, Cedex 09, France
13
RPflex: A Coarse-Grained Network Model for RNA Pocket Flexibility Study. Int J Mol Sci 2023; 24:ijms24065497. [PMID: 36982570 PMCID: PMC10058308 DOI: 10.3390/ijms24065497] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/17/2023] [Revised: 03/09/2023] [Accepted: 03/11/2023] [Indexed: 03/18/2023]
Abstract
RNA regulates various biological processes, such as gene regulation, RNA splicing, and intracellular signal transduction. RNA’s conformational dynamics play crucial roles in performing its diverse functions. Thus, it is essential to explore the flexibility characteristics of RNA, especially pocket flexibility. Here, we propose a computational approach, RPflex, to analyze pocket flexibility using the coarse-grained network model. We first clustered 3154 pockets into 297 groups by similarity calculation based on the coarse-grained lattice model. Then, we introduced the flexibility score to quantify the flexibility by global pocket features. The results show strong correlations between the flexibility scores and root-mean-square fluctuation (RMSF) values, with Pearson correlation coefficients of 0.60, 0.76, and 0.53 in Testing Sets I–III. Considering both flexibility score and network calculations, the Pearson correlation coefficient was increased to 0.71 in flexible pockets on Testing Set IV. The network calculations reveal that the long-range interaction changes contributed most to flexibility. In addition, the hydrogen bonds in the base–base interactions greatly stabilize the RNA structure, while backbone interactions determine RNA folding. The computational analysis of pocket flexibility could facilitate RNA engineering for biological or medical applications.
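The two quantities compared in this abstract, root-mean-square fluctuation (RMSF) and the Pearson correlation coefficient, can be sketched in a few lines. This is a generic illustration rather than the RPflex code: it assumes the trajectory frames are already superimposed on a common reference, and the names `rmsf` and `pearson` are chosen here for illustration.

```python
import math

def rmsf(frames):
    """Per-atom RMSF from a list of frames, each a list of (x, y, z) tuples.

    Assumes all frames are pre-aligned and contain the same atoms in order.
    """
    n_frames = len(frames)
    values = []
    for a in range(len(frames[0])):
        # Mean position of atom `a` over the trajectory.
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In an evaluation like the one described, `pearson` would be applied to a list of flexibility scores and a matching list of RMSF-derived values, one pair per pocket.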
14
Vemula D, Jayasurya P, Sushmitha V, Kumar YN, Bhandari V. CADD, AI and ML in drug discovery: A comprehensive review. Eur J Pharm Sci 2023; 181:106324. [PMID: 36347444 DOI: 10.1016/j.ejps.2022.106324] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Received: 08/24/2022] [Revised: 10/26/2022] [Accepted: 11/03/2022] [Indexed: 11/06/2022]
Abstract
Computer-aided drug design (CADD) is an emerging field that has drawn a lot of interest because of its potential to expedite and lower the cost of the drug development process. Drug discovery research is expensive and time-consuming, and it frequently takes 10-15 years for a drug to become commercially available. CADD has significantly impacted this area of research. Further, the combination of CADD with Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) technologies to handle enormous amounts of biological data has reduced the time and cost associated with the drug development process. This review will discuss how CADD, AI, ML, and DL approaches help identify drug candidates and support various other steps of the drug discovery process. It will also provide a detailed overview of the different in silico tools used and how these approaches interact.
Affiliation(s)
- Divya Vemula
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Perka Jayasurya
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Varthiya Sushmitha
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Vasundhra Bhandari
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India.
| |
Collapse
|
15
|
Liao J, Wang Q, Wu F, Huang Z. In Silico Methods for Identification of Potential Active Sites of Therapeutic Targets. Molecules 2022; 27:7103. [PMID: 36296697 PMCID: PMC9609013 DOI: 10.3390/molecules27207103] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 06/30/2022] [Revised: 08/12/2022] [Accepted: 08/25/2022] [Indexed: 07/30/2023]
Abstract
Target identification is an important step in drug discovery, and computer-aided drug target identification methods are attracting more attention than traditional drug target identification methods, which are time-consuming and costly. Computer-aided drug target identification methods can greatly reduce the searching scope of experimental targets and associated costs by identifying the disease-related targets and their binding sites and evaluating the druggability of the predicted active sites for clinical trials. In this review, we introduce the principles of computer-based active site identification methods, including the identification of binding sites and assessment of druggability. We provide some guidelines for selecting methods for the identification of binding sites and assessment of druggability. In addition, we list the databases and tools commonly used with these methods, present examples of individual and combined applications, and compare the methods and tools. Finally, we discuss the challenges and limitations of binding site identification and druggability assessment at the current stage and provide some recommendations and future perspectives.
Affiliation(s)
- Jianbo Liao
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- The Second School of Clinical Medicine, Guangdong Medical University, Dongguan 523808, China
- Qinyu Wang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Fengxu Wu
- Hubei Key Laboratory of Wudang Local Chinese Medicine Research, School of Pharmaceutical Sciences, Hubei University of Medicine, Shiyan 442000, China
- Zunnan Huang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang 524023, China
16
Eguida M, Rognan D. Estimating the Similarity between Protein Pockets. Int J Mol Sci 2022; 23:12462. [PMID: 36293316 PMCID: PMC9604425 DOI: 10.3390/ijms232012462] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/06/2022] [Revised: 10/15/2022] [Accepted: 10/16/2022] [Indexed: 10/28/2023]
Abstract
With the exponential increase in publicly available protein structures, the comparison of protein binding sites naturally emerged as a scientific topic to explain observations or generate hypotheses for ligand design, notably to predict ligand selectivity for on- and off-targets, explain polypharmacology, and design target-focused libraries. The current review summarizes the state-of-the-art computational methods applied to pocket detection and comparison as well as structural druggability estimates. The major strengths and weaknesses of current pocket descriptors, alignment methods, and similarity search algorithms are presented. Lastly, an exhaustive survey of both retrospective and prospective applications in diverse medicinal chemistry scenarios illustrates the capability of the existing methods and the hurdles that still need to be overcome for more accurate predictions.
Affiliation(s)
- Didier Rognan
- Laboratoire d’Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 67400 Illkirch, France
17
Guerra JVDS, Ribeiro-Filho HV, Jara GE, Bortot LO, Pereira JGDC, Lopes-de-Oliveira PS. pyKVFinder: an efficient and integrable Python package for biomolecular cavity detection and characterization in data science. BMC Bioinformatics 2021; 22:607. [PMID: 34930115 PMCID: PMC8685811 DOI: 10.1186/s12859-021-04519-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/25/2021] [Accepted: 12/07/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Biomolecular interactions that modulate biological processes occur mainly in cavities throughout the surface of biomolecular structures. In the data science era, structural biology has benefited from the increasing availability of biostructural data due to advances in structural determination and computational methods. In this scenario, data-intensive cavity analysis demands efficient scripting routines built on easily manipulated data structures. To fulfill this need, we developed pyKVFinder, a Python package to detect and characterize cavities in biomolecular structures for data science and automated pipelines. RESULTS: pyKVFinder efficiently detects cavities in biomolecular structures and computes their volume, area, depth and hydropathy, storing these cavity properties in NumPy arrays. Benefiting from Python ecosystem interoperability and data structures, pyKVFinder can be integrated with third-party scientific packages and libraries for mathematical calculations, machine learning and 3D visualization in automated workflows. As proof of pyKVFinder's capabilities, we successfully identified and compared the ADRP substrate-binding site of SARS-CoV-2 and a set of homologous proteins with pyKVFinder, showing its integrability with data science packages such as matplotlib, NGL Viewer, SciPy and Jupyter notebook. CONCLUSIONS: We introduce efficient, highly versatile and easily integrable software for detecting and characterizing biomolecular cavities in data science applications and automated protocols. pyKVFinder facilitates biostructural data analysis with scripting routines in the Python ecosystem and can serve as a building block for data science and drug design applications.
Affiliation(s)
- João Victor da Silva Guerra
- Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), R. Giuseppe Máximo Scolfaro, 10000 - Bosque das Palmeiras, Campinas, SP, 13083-100, Brazil; Graduate Program in Pharmaceutical Sciences, Faculty of Pharmaceutical Sciences, University of Campinas, Campinas, SP, Brazil
- Helder Veras Ribeiro-Filho
- Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), R. Giuseppe Máximo Scolfaro, 10000 - Bosque das Palmeiras, Campinas, SP, 13083-100, Brazil
- Gabriel Ernesto Jara
- Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), R. Giuseppe Máximo Scolfaro, 10000 - Bosque das Palmeiras, Campinas, SP, 13083-100, Brazil
- Leandro Oliveira Bortot
- Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), R. Giuseppe Máximo Scolfaro, 10000 - Bosque das Palmeiras, Campinas, SP, 13083-100, Brazil
- José Geraldo de Carvalho Pereira
- Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), R. Giuseppe Máximo Scolfaro, 10000 - Bosque das Palmeiras, Campinas, SP, 13083-100, Brazil
- Paulo Sérgio Lopes-de-Oliveira
- Brazilian Center for Research in Energy and Materials (CNPEM), Brazilian Biosciences National Laboratory (LNBio), R. Giuseppe Máximo Scolfaro, 10000 - Bosque das Palmeiras, Campinas, SP, 13083-100, Brazil; Graduate Program in Pharmaceutical Sciences, Faculty of Pharmaceutical Sciences, University of Campinas, Campinas, SP, Brazil
18
Feng L, Wang F, Zhang J, Tang Y, Zhao J, Zhou L, Wang J, Guo D, Singh AK. Particle-based calculation and visualization of protein cavities using SES models. IEEE J Biomed Health Inform 2021; 26:2447-2457. [PMID: 34843433 DOI: 10.1109/jbhi.2021.3130897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
The analysis of molecular cavities, where ligands interact with protein structures, plays a critical role in protein structure-based drug design. However, it is a challenge because of the ambiguous definition of the cavity boundaries in most cavity detection methods. The cavities are mostly calculated from input parameters, which makes it difficult for users to visualize cavities interactively. In this paper, we propose a novel method for the interactive exploration of cavity calculation and visualization. Firstly, the proposed method combines two solvent-excluded surface (SES) models of a given protein to define the boundaries and provide cavity emission points. Secondly, the system provides a user-guided interactive method that allows users to select cavities with simple click operations and to track the cavity identification and filling process based on position constraints. Finally, the selected cavities are represented with the colorful depth perception method. Experiments show that our work can effectively identify and calculate cavities.
19
Lu ZC, Jiang F, Wu YD. Phosphate binding sites prediction in phosphorylation-dependent protein-protein interactions. Bioinformatics 2021; 37:4712-4718. [PMID: 34270697 DOI: 10.1093/bioinformatics/btab525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2021] [Revised: 06/07/2021] [Accepted: 07/13/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Phosphate binding plays an important role in modulating protein-protein interactions, which are ubiquitous in various biological processes. Accurate prediction of phosphate binding sites is an important but challenging task. The small size and diversity of phosphate binding sites pose a substantial challenge for developing accurate prediction methods. RESULTS: Here we present the phosphate binding site predictor (PBSP), a novel and accurate approach to identifying phosphate binding sites from protein structures. PBSP combines an energy-based ligand-binding site identification method with reverse focused docking using a phosphate probe. We show that PBSP outperforms not only general ligand binding site predictors but also other existing phospholigand-specific binding site predictors. It achieves a ∼95% success rate for the top 10 predicted sites with an average Matthews correlation coefficient (MCC) value of 0.84 for successful predictions. PBSP can accurately predict phosphate binding modes, with average position errors of 1.4 Å and 2.4 Å in bound and unbound datasets, respectively. Lastly, visual inspection of the predictions is conducted. Reasons for failed predictions are further analyzed and possible ways to improve the performance are provided. These results demonstrate a novel and accurate approach to phosphate binding site identification in protein structures. AVAILABILITY: The software and benchmark datasets are freely available at http://web.pkusz.edu.cn/wu/PBSP/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
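For readers unfamiliar with the Matthews correlation coefficient cited in this abstract, the sketch below computes it from the four confusion-matrix counts. The function name and the example counts are hypothetical and do not reproduce PBSP's evaluation protocol; returning 0.0 for an empty row or column is a common convention assumed here.

```python
import math

def matthews_corrcoef(tp, fp, fn, tn):
    """MCC from true/false positive and negative counts; ranges from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Undefined when any marginal total is zero; 0.0 is assumed by convention.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical residue-level counts for one predicted binding site.
mcc = matthews_corrcoef(tp=8, fp=2, fn=1, tn=89)
```

Unlike plain accuracy, the MCC stays informative when binding-site residues are a small minority of all residues, which is the usual situation in site prediction.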
Affiliation(s)
- Zheng-Chang Lu
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Fan Jiang
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; NanoAI Biotech Co., Ltd, Shenzhen, 518118, China
- Yun-Dong Wu
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; Shenzhen Bay Laboratory, Shenzhen, 518055, China; College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
20
An Updated Review of Computer-Aided Drug Design and Its Application to COVID-19. Biomed Res Int 2021; 2021:8853056. [PMID: 34258282 PMCID: PMC8241505 DOI: 10.1155/2021/8853056] [Citation(s) in RCA: 79] [Impact Index Per Article: 19.8] [Received: 09/08/2020] [Revised: 05/31/2021] [Accepted: 06/11/2021] [Indexed: 12/23/2022]
Abstract
The recent outbreak of the deadly coronavirus disease 19 (COVID-19) pandemic poses serious health concerns around the world. The lack of approved drugs or vaccines continues to be a challenge and further necessitates the discovery of new therapeutic molecules. Computer-aided drug design has helped to expedite the drug discovery and development process by minimizing the cost and time. In this review article, we highlight two important categories of computer-aided drug design (CADD), viz., ligand-based and structure-based drug discovery. The molecular modeling techniques involved in structure-based drug design include molecular docking and molecular dynamics simulation, whereas ligand-based drug design includes pharmacophore modeling, quantitative structure-activity relationships (QSAR), and artificial intelligence (AI). We briefly discuss the significance of computer-aided drug design in the context of COVID-19 and how researchers continue to rely on these computational techniques in the rapid identification of promising drug candidate molecules against various drug targets implicated in the pathogenesis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The structural elucidation of pharmacological drug targets and the discovery of preclinical drug candidate molecules have accelerated both structure-based and ligand-based drug design. This review article will help clinicians and researchers to exploit the immense potential of computer-aided drug design in the design and identification of drug molecules, thereby helping in the management of fatal disease.
21
Molecular docking and density functional theory studies of potent 1,3-disubstituted-9H-pyrido[3,4-b]indoles antifilarial compounds. Struct Chem 2021. [DOI: 10.1007/s11224-021-01772-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
22
Brackenridge DA, McGuffin LJ. Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods with a Focus on FunFOLD3. Methods Mol Biol 2021; 2365:43-58. [PMID: 34432238 DOI: 10.1007/978-1-0716-1665-9_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Proteins are essential molecules with a diverse range of functions; elucidating their biological and biochemical characteristics can be difficult and time consuming using in vitro and/or in vivo methods. Additionally, in vivo protein-ligand binding site elucidation is unable to keep pace with current growth in sequencing, leaving the majority of new protein sequences without known functions. Therefore, the development of new methods, which aim to predict the protein-ligand interactions and ligand-binding site residues directly from amino acid sequences, is becoming increasingly important. In silico prediction can utilise either sequence information, structural information or a combination of both. In this chapter, we will discuss the broad range of methods for ligand-binding site prediction from protein structure and we will describe our method, FunFOLD3, for the prediction of protein-ligand interactions and ligand-binding sites based on template-based modelling. Additionally, we will provide step-by-step instructions for using the FunFOLD3 downloadable application along with examples from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) where FunFOLD3 has been used to aid ligand and ligand-binding site prediction. Finally, we will introduce our newer method, FunFOLD3-D, a version of FunFOLD3 which aims to improve template-based protein-ligand binding site prediction through the integration of docking, using AutoDock Vina.
23
Giulini M, Menichetti R, Shell MS, Potestio R. An Information-Theory-Based Approach for Optimal Model Reduction of Biomolecules. J Chem Theory Comput 2020; 16:6795-6813. [PMID: 33108737 PMCID: PMC7659038 DOI: 10.1021/acs.jctc.0c00676] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 06/29/2020] [Indexed: 02/06/2023]
Abstract
In theoretical modeling of a physical system, a crucial step consists of the identification of those degrees of freedom that enable a synthetic yet informative representation of it. While in some cases this selection can be carried out on the basis of intuition and experience, straightforward discrimination of the important features from the negligible ones is difficult for many complex systems, most notably heteropolymers and large biomolecules. We here present a thermodynamics-based theoretical framework to gauge the effectiveness of a given simplified representation by measuring its information content. We employ this method to identify those reduced descriptions of proteins, in terms of a subset of their atoms, that retain the largest amount of information from the original model; we show that these highly informative representations share common features that are intrinsically related to the biological properties of the proteins under examination, thereby establishing a bridge between protein structure, energetics, and function.
Affiliation(s)
- Marco Giulini
- Physics Department, University of Trento, via Sommarive 14, I-38123 Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, I-38123 Trento, Italy
- Roberto Menichetti
- Physics Department, University of Trento, via Sommarive 14, I-38123 Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, I-38123 Trento, Italy
- M. Scott Shell
- Department of Chemical Engineering, University of California Santa Barbara, Santa Barbara, California 93106, United States
- Raffaello Potestio
- Physics Department, University of Trento, via Sommarive 14, I-38123 Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, I-38123 Trento, Italy
24
Zhao J, Cao Y, Zhang L. Exploring the computational methods for protein-ligand binding site prediction. Comput Struct Biotechnol J 2020; 18:417-426. [PMID: 32140203 PMCID: PMC7049599 DOI: 10.1016/j.csbj.2020.02.008] [Citation(s) in RCA: 106] [Impact Index Per Article: 21.2] [Received: 10/14/2019] [Revised: 01/23/2020] [Accepted: 02/11/2020] [Indexed: 12/21/2022]
Abstract
Proteins participate in various essential processes in vivo via interactions with other molecules. Identifying the residues participating in these interactions not only provides biological insights for protein function studies but also has great significance for drug discovery. Therefore, predicting protein-ligand binding sites has long been under intense research in the fields of bioinformatics and computer-aided drug discovery. In this review, we first introduce the research background of predicting protein-ligand binding sites and then classify the methods into four categories, namely, 3D structure-based, template similarity-based, traditional machine learning-based and deep learning-based methods. We describe representative algorithms in each category and elaborate on machine learning and deep learning-based prediction methods in more detail. Finally, we discuss the trends and challenges of the current research, such as the prediction of cryptic binding sites based on molecular dynamics simulation, and highlight prospective directions for the near future.
Affiliation(s)
- Jingtian Zhao
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Le Zhang
- College of Computer Science, Sichuan University, Chengdu 610065, China
25
CavBench: A benchmark for protein cavity detection methods. PLoS One 2019; 14:e0223596. [PMID: 31609980 PMCID: PMC6791542 DOI: 10.1371/journal.pone.0223596] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/26/2019] [Accepted: 09/24/2019] [Indexed: 11/19/2022]
Abstract
Extensive research has been devoted to discovering new techniques and methods to model protein-ligand interactions. In particular, considerable efforts have focused on identifying candidate binding sites, which quite often are active sites that correspond to protein pockets or cavities. Thus, these cavities play an important role in molecular docking. However, there is no established benchmark to assess the accuracy of new cavity detection methods. In practice, each new technique is evaluated using a small set of proteins with known binding sites as ground truth. However, studies supported by large datasets of known cavities and/or binding sites and statistical classification (i.e., false positives, false negatives, true positives, and true negatives) would yield much stronger and more reliable assessments. To this end, we propose CavBench, a generic and extensible benchmark to compare different cavity detection methods relative to diverse ground truth datasets (e.g., PDBsum) using statistical classification methods.
26
Chen Z, Zhang X, Peng C, Wang J, Xu Z, Chen K, Shi J, Zhu W. D3Pockets: A Method and Web Server for Systematic Analysis of Protein Pocket Dynamics. J Chem Inf Model 2019; 59:3353-3358. [DOI: 10.1021/acs.jcim.9b00332] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 12/28/2022]
Affiliation(s)
- Zhaoqiang Chen
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Xinben Zhang
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Cheng Peng
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Jinan Wang
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Zhijian Xu
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Kaixian Chen
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Open Studio for Druggability Research of Marine Natural Products, Pilot National Laboratory for Marine Science and Technology (Qingdao), 1 Wenhai Road, Aoshanwei, Jimo, Qingdao 266237, China
- Jiye Shi
- UCB Biopharma SPRL, Chemin du Foriest, Braine-l’Alleud B-1420, Belgium
- Weiliang Zhu
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Open Studio for Druggability Research of Marine Natural Products, Pilot National Laboratory for Marine Science and Technology (Qingdao), 1 Wenhai Road, Aoshanwei, Jimo, Qingdao 266237, China
27
Simões TMC, Gomes AJP. CavVis-A Field-of-View Geometric Algorithm for Protein Cavity Detection. J Chem Inf Model 2019; 59:786-796. [PMID: 30629446 DOI: 10.1021/acs.jcim.8b00572] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
Abstract
Several geometry-based methods have been developed over the last two to three decades to detect and identify cavities (i.e., putative binding sites) on proteins, as needed to study protein-ligand interactions and protein docking. This paper introduces a new protein cavity detection method, called CavVis, which combines voxelization (i.e., a grid of voxels) and an analytic formulation of Gaussian surfaces that approximates the solvent-excluded surface. This method builds upon the visibility of points on the protein surface to find its cavities. Specifically, the visibility criterion combines three concepts we borrow from computer graphics: the field-of-view of each surface point, voxel ray casting, and back-face culling.
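The abstract does not state which analytic Gaussian formulation CavVis uses. The sketch below therefore assumes one widely used form, in which each atom with center c and radius r contributes exp(-d(|x - c|²/r² - 1)) and the molecular surface is the level set where the summed density equals 1. The decay parameter d, the function names, and the single toy atom are all hypothetical, and none of the visibility tests (field-of-view, ray casting, back-face culling) is reproduced.

```python
import math

def gaussian_density(point, atoms, decay=2.3):
    """Summed Gaussian density at `point`; `atoms` is a list of ((x, y, z), radius)."""
    total = 0.0
    for center, radius in atoms:
        dist_sq = sum((p - c) ** 2 for p, c in zip(point, center))
        # Each term equals exactly 1 on the sphere of the given radius.
        total += math.exp(-decay * (dist_sq / radius ** 2 - 1.0))
    return total

def voxelize(atoms, lo, hi, step):
    """Indices (i, j, k) of cubic-grid voxels whose centers lie inside the surface."""
    n = int(round((hi - lo) / step))
    inside = set()
    for i in range(n):
        for j in range(n):
            for k in range(n):
                center = tuple(lo + (idx + 0.5) * step for idx in (i, j, k))
                if gaussian_density(center, atoms) >= 1.0:
                    inside.add((i, j, k))
    return inside

# One hypothetical atom of radius 1.5 at the origin, on a 4 x 4 x 4 grid.
atoms = [((0.0, 0.0, 0.0), 1.5)]
protein_voxels = voxelize(atoms, lo=-2.0, hi=2.0, step=1.0)
```

In a voxel-based detector, cells classified as outside the surface are the candidates that later tests would label as cavity or bulk solvent.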
Affiliation(s)
- Tiago M C Simões
- Instituto de Telecomunicações, Delegação da Covilhã, 6200-001 Covilhã, Portugal; Departamento de Informática, Universidade da Beira Interior, 6200-001 Covilhã, Portugal
- Abel J P Gomes
- Instituto de Telecomunicações, Delegação da Covilhã, 6200-001 Covilhã, Portugal; Departamento de Informática, Universidade da Beira Interior, 6200-001 Covilhã, Portugal
28
New Binding Sites, New Opportunities for GPCR Drug Discovery. Trends Biochem Sci 2019; 44:312-330. [PMID: 30612897 DOI: 10.1016/j.tibs.2018.11.011] [Citation(s) in RCA: 100] [Impact Index Per Article: 16.7] [Received: 07/15/2018] [Revised: 08/11/2018] [Accepted: 11/27/2018] [Indexed: 12/29/2022]
Abstract
Many central biological events rely on protein-ligand interactions. The identification and characterization of protein-binding sites for ligands are crucial for understanding the functions of both endogenous ligands and synthetic drug molecules. G protein-coupled receptors (GPCRs) typically detect extracellular signal molecules on the cell surface and transfer these chemical signals across the membrane, inducing downstream cellular responses via G proteins or β-arrestin. GPCRs mediate many central physiological processes, making them important targets for modern drug discovery. Here, we focus on the most recent breakthroughs in finding new binding sites and binding modes of GPCRs and their potential for the development of new medicines.
29
Krivák R, Hoksza D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. J Cheminform 2018; 10:39. [PMID: 30109435 PMCID: PMC6091426 DOI: 10.1186/s13321-018-0285-8] [Citation(s) in RCA: 228] [Impact Index Per Article: 32.6] [Received: 11/12/2017] [Accepted: 06/29/2018] [Indexed: 01/29/2023]
Abstract
Background: Ligand binding site prediction from protein structure has many applications related to elucidation of protein function and structure-based drug discovery. It often represents only one step of many in complex computational drug design efforts. Although many methods have been published to date, only a few of them are suitable for use in automated pipelines or for processing large datasets. These use cases require stability and speed, which disqualifies many of the recently introduced tools that are either template based or available only as web servers. Results: We present P2Rank, a stand-alone template-free tool for prediction of ligand binding sites based on machine learning. It is based on prediction of ligandability of local chemical neighbourhoods that are centered on points placed on the solvent accessible surface of a protein. We show that P2Rank outperforms several existing tools, which include two widely used stand-alone tools (Fpocket, SiteHound), a comprehensive consensus based tool (MetaPocket 2.0), and a recent deep learning based method (DeepSite). P2Rank is among the fastest available tools (it requires under 1 s for prediction on one protein), with the additional advantage of a multi-threaded implementation. Conclusions: P2Rank is a new open source software package for ligand binding site prediction from protein structure. It is available as a user-friendly stand-alone command line program and a Java library. P2Rank has a lightweight installation and does not depend on other bioinformatics tools or large structural or sequence databases. Thanks to its speed and ability to make fully automated predictions, it is particularly well suited for processing large datasets or as a component of scalable structural bioinformatics pipelines.
Affiliation(s)
- Radoslav Krivák
- Department of Software Engineering, Charles University, Prague, Czech Republic
- David Hoksza
- Department of Software Engineering, Charles University, Prague, Czech Republic
30
Simões T, Lopes D, Dias S, Fernandes F, Pereira J, Jorge J, Bajaj C, Gomes A. Geometric Detection Algorithms for Cavities on Protein Surfaces in Molecular Graphics: A Survey. Comput Graph Forum 2017; 36:643-683. [PMID: 29520122 PMCID: PMC5839519 DOI: 10.1111/cgf.13158] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 05/17/2023]
Abstract
Detecting and analyzing protein cavities provides significant information about active sites for biological processes (e.g., protein-protein or protein-ligand binding) in molecular graphics and modeling. Using the three-dimensional structure of a given protein (i.e., atom types and their locations in 3D) as retrieved from a PDB (Protein Data Bank) file, it is now computationally viable to determine a description of these cavities. Such cavities correspond to pockets, clefts, invaginations, voids, tunnels, channels, and grooves on the surface of a given protein. In this work, we survey the literature on protein cavity computation and classify algorithmic approaches into three categories: evolution-based, energy-based, and geometry-based. Our survey focuses on geometric algorithms, whose taxonomy is extended to include not only sphere-, grid-, and tessellation-based methods, but also surface-based, hybrid geometric, consensus, and time-varying methods. Finally, we detail those techniques that have been customized for GPU (Graphics Processing Unit) computing.
Affiliation(s)
- Tiago Simões
- Instituto de Telecomunicações, Portugal
- Universidade da Beira Interior, Portugal
| | | | - Sérgio Dias
- Instituto de Telecomunicações, Portugal
- Universidade da Beira Interior, Portugal
| | | | - João Pereira
- INESC-ID Lisboa, Portugal
- Instituto Superior Técnico, Universidade de Lisboa, Portugal
| | - Joaquim Jorge
- INESC-ID Lisboa, Portugal
- Instituto Superior Técnico, Universidade de Lisboa, Portugal
| | | | - Abel Gomes
- Instituto de Telecomunicações, Portugal
- Universidade da Beira Interior, Portugal
|
31
|
Dias SED, Martins AM, Nguyen QT, Gomes AJP. GPU-based detection of protein cavities using Gaussian surfaces. BMC Bioinformatics 2017; 18:493. [PMID: 29145826 PMCID: PMC5691400 DOI: 10.1186/s12859-017-1913-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 11/26/2016] [Accepted: 11/01/2017] [Indexed: 11/10/2022]
Abstract
Background: Protein cavities play a key role in biomolecular recognition and function, particularly in protein-ligand interactions, as usual in drug discovery and design. Grid-based cavity detection methods aim at finding cavities as aggregates of grid nodes outside the molecule, under the condition that such cavities are bracketed by nodes on the molecule surface along a set of directions (not necessarily aligned with coordinate axes). Therefore, these methods are sensitive to scanning directions, a problem that we call cavity ground-and-walls ambiguity, i.e., they depend on the position and orientation of the protein in the discretized domain. Also, it is hard to distinguish grid nodes belonging to protein cavities amongst all those outside the protein, a problem that we call cavity ceiling ambiguity. Results: We solve those two ambiguity problems using two implicit isosurfaces of the protein, the protein surface itself (called inner isosurface) that excludes all its interior nodes from any cavity, and the outer isosurface that excludes most of its exterior nodes from any cavity. Summing up, the cavities are formed from nodes located between these two isosurfaces. It is worth noting that these two surfaces do not need to be evaluated (i.e., sampled), triangulated, and rendered on the screen to find the cavities in between; their defining analytic functions are enough to determine which grid nodes are in the empty space between them. Conclusion: This article introduces a novel geometric algorithm to detect cavities on the protein surface that takes advantage of the real analytic functions describing two Gaussian surfaces of a given protein.
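The two-isosurface idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' GPU implementation: the Gaussian form of the density, the `decay` parameter, the isovalues `inner_iso` and `outer_iso`, and all function names are assumptions chosen for illustration. A grid node is kept as a cavity candidate when the analytic density at that node lies between the two isovalues, so neither surface has to be sampled or triangulated.

```python
import math

def gaussian_density(point, atoms, decay=1.0):
    """Sum of atom-centred Gaussians; atoms are (x, y, z, radius) tuples.

    The functional form is an assumed, commonly used Gaussian surface
    definition, not necessarily the one used in the paper.
    """
    total = 0.0
    for ax, ay, az, radius in atoms:
        d2 = (point[0] - ax) ** 2 + (point[1] - ay) ** 2 + (point[2] - az) ** 2
        total += math.exp(-decay * (d2 / radius ** 2 - 1.0))
    return total

def classify_node(point, atoms, inner_iso=1.0, outer_iso=0.3, decay=1.0):
    """Label a grid node using only the analytic density (hypothetical isovalues)."""
    density = gaussian_density(point, atoms, decay)
    if density >= inner_iso:
        return "protein"    # on or inside the inner isosurface
    if density >= outer_iso:
        return "candidate"  # between the inner and outer isosurfaces
    return "exterior"       # beyond the outer isosurface

def candidate_nodes(atoms, lower, upper, spacing=1.0, **kwargs):
    """Scan a regular grid and collect the nodes lying between the two isosurfaces."""
    counts = [int(round((upper[k] - lower[k]) / spacing)) + 1 for k in range(3)]
    nodes = []
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                p = (lower[0] + i * spacing,
                     lower[1] + j * spacing,
                     lower[2] + k * spacing)
                if classify_node(p, atoms, **kwargs) == "candidate":
                    nodes.append(p)
    return nodes

# Two hypothetical atoms with a gap between them.
atoms = [(-3.0, 0.0, 0.0, 2.0), (3.0, 0.0, 0.0, 2.0)]
gap_label = classify_node((0.0, 0.0, 0.0), atoms)
```

The band between the two isovalues also contains a thin shell around convex parts of the molecule, which is consistent with the abstract's statement that the outer isosurface excludes most, not all, exterior nodes; any further filtering is outside the scope of this sketch.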
Affiliation(s)
- Sérgio E D Dias
- Universidade da Beira Interior, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
- Instituto de Telecomunicações, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
| | - Ana Mafalda Martins
- Universidade da Beira Interior, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
| | - Quoc T Nguyen
- Universidade da Beira Interior, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
- Instituto de Telecomunicações, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
| | - Abel J P Gomes
- Universidade da Beira Interior, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
- Instituto de Telecomunicações, Av. Marques D'Ávila e Bolama, Covilhã, 6200-001, Portugal
|
32
|
Glantz-Gashai Y, Meirson T, Samson AO. Normal Modes Expose Active Sites in Enzymes. PLoS Comput Biol 2016; 12:e1005293. [PMID: 28002427 PMCID: PMC5225006 DOI: 10.1371/journal.pcbi.1005293] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 12/27/2015] [Revised: 01/10/2017] [Accepted: 12/07/2016] [Indexed: 01/10/2023]
Abstract
Accurate prediction of active sites is an important tool in bioinformatics. Here we present an improved structure based technique to expose active sites that is based on large changes of solvent accessibility accompanying normal mode dynamics. The technique which detects EXPOsure of active SITes through normal modEs is named EXPOSITE. The technique is trained using a small 133 enzyme dataset and tested using a large 845 enzyme dataset, both with known active site residues. EXPOSITE is also tested in a benchmark protein ligand dataset (PLD) comprising 48 proteins with and without bound ligands. EXPOSITE is shown to successfully locate the active site in most instances, and is found to be more accurate than other structure-based techniques. Interestingly, in several instances, the active site does not correspond to the largest pocket. EXPOSITE is advantageous due to its high precision and paves the way for structure based prediction of active site in enzymes. In this paper, we present an improved technique to predict active sites in enzymes. Our technique is based on changes of solvent accessibility that accompany normal mode dynamics. We assert the technique strength using several enzyme datasets with known catalytic residues. We show the technique successfully locates the active site in most cases, and consistently surpasses the accuracy of other techniques. We show how the technique is advantageous and paves the way for high precision prediction of active sites.
Affiliation(s)
| | - Tomer Meirson
- Faculty of Medicine in the Galilee, Bar Ilan University, Safed, Israel
| | - Abraham O. Samson
- Faculty of Medicine in the Galilee, Bar Ilan University, Safed, Israel
|
33
|
Broomhead NK, Soliman ME. Can We Rely on Computational Predictions To Correctly Identify Ligand Binding Sites on Novel Protein Drug Targets? Assessment of Binding Site Prediction Methods and a Protocol for Validation of Predicted Binding Sites. Cell Biochem Biophys 2016; 75:15-23. [PMID: 27796788 DOI: 10.1007/s12013-016-0769-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 04/18/2016] [Accepted: 10/19/2016] [Indexed: 11/30/2022]
Abstract
In the field of medicinal chemistry there is increasing focus on identifying key proteins whose biochemical functions can firmly be linked to serious diseases. Such proteins become targets for drug or inhibitor molecules that could treat or halt the disease through therapeutic action or by blocking the protein function respectively. The protein must be targeted at the relevant biologically active site for drug or inhibitor binding to be effective. As insufficient experimental data is available to confirm the biologically active binding site for novel protein targets, researchers often rely on computational prediction methods to identify binding sites. Presented herein is a short review on structure-based computational methods that (i) predict putative binding sites and (ii) assess the druggability of predicted binding sites on protein targets. This review briefly covers the principles upon which these methods are based, where they can be accessed and their reliability in identifying the correct binding site on a protein target. Based on this review, we believe that these methods are useful in predicting putative binding sites, but as they do not account for the dynamic nature of protein-ligand binding interactions, they cannot definitively identify the correct site from a ranked list of putative sites. To overcome this shortcoming, we strongly recommend using molecular docking to predict the most likely protein-ligand binding site(s) and mode(s), followed by molecular dynamics simulations and binding thermodynamics calculations to validate the docking results. This protocol provides a valuable platform for experimental and computational efforts to design novel drugs and inhibitors that target disease-related proteins.
Affiliation(s)
- Neal K Broomhead
- Molecular Modelling & Drug Design Research Group, School of Health Sciences, University of KwaZulu-Natal, Westville, Durban, 4001, South Africa
| | - Mahmoud E Soliman
- Molecular Modelling & Drug Design Research Group, School of Health Sciences, University of KwaZulu-Natal, Westville, Durban, 4001, South Africa.
|
34
|
Abstract
Ligand binding is required for many proteins to function properly. A large number of bioinformatics tools have been developed to predict ligand binding sites as a first step in understanding a protein's function or to facilitate docking computations in virtual screening based drug design. The prediction usually requires only the three-dimensional structure (experimentally determined or computationally modeled) of the target protein to be searched for ligand binding site(s), and Web servers have been built, allowing the free and simple use of prediction tools. In this chapter, we review the underlying concepts of the methods used by various tools, and discuss their different features and the related issues of ligand binding site prediction. Some cautionary notes about the use of these tools are also provided.
Affiliation(s)
- Zhong-Ru Xie
- Institute of Biomedical Sciences, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei, 115, Taiwan
|
35
|
Brylinski M, Feinstein WP. eFindSite: improved prediction of ligand binding sites in protein models using meta-threading, machine learning and auxiliary ligands. J Comput Aided Mol Des 2013; 27:551-67. [PMID: 23838840 DOI: 10.1007/s10822-013-9663-5] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 04/06/2013] [Accepted: 07/01/2013] [Indexed: 02/02/2023]
Abstract
Molecular structures and functions of the majority of proteins across different species are yet to be identified. Much needed functional annotation of these gene products often benefits from the knowledge of protein-ligand interactions. Towards this goal, we developed eFindSite, an improved version of FINDSITE, designed to more efficiently identify ligand binding sites and residues using only weakly homologous templates. It employs a collection of effective algorithms, including highly sensitive meta-threading approaches, improved clustering techniques, advanced machine learning methods and reliable confidence estimation systems. Depending on the quality of target protein structures, eFindSite outperforms geometric pocket detection algorithms by 15-40 % in binding site detection and by 5-35 % in binding residue prediction. Moreover, compared to FINDSITE, it identifies 14 % more binding residues in the most difficult cases. When multiple putative binding pockets are identified, the ranking accuracy is 75-78 %, which can be further improved by 3-4 % by including auxiliary information on binding ligands extracted from biomedical literature. As a first across-genome application, we describe structure modeling and binding site prediction for the entire proteome of Escherichia coli. Carefully calibrated confidence estimates strongly indicate that highly reliable ligand binding predictions are made for the majority of gene products, thus eFindSite holds a significant promise for large-scale genome annotation and drug development projects. eFindSite is freely available to the academic community at http://www.brylinski.org/efindsite .
Affiliation(s)
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
|
36
|
Xie ZR, Liu CK, Hsiao FC, Yao A, Hwang MJ. LISE: a server using ligand-interacting and site-enriched protein triangles for prediction of ligand-binding sites. Nucleic Acids Res 2013; 41:W292-6. [PMID: 23609546 PMCID: PMC3692107 DOI: 10.1093/nar/gkt300] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
Abstract
LISE is a web server for a novel method for predicting small molecule binding sites on proteins. It differs from a number of servers currently available for such predictions in two aspects. First, rather than relying on knowledge of similar protein structures, identification of surface cavities or estimation of binding energy, LISE computes a score by counting geometric motifs extracted from sub-structures of interaction networks connecting protein and ligand atoms. These network motifs take into account spatial and physicochemical properties of ligand-interacting protein surface atoms. Second, LISE has now been more thoroughly tested, as, in addition to the evaluation we previously reported using two commonly used small benchmark test sets and targets of two community-based experiments on ligand-binding site predictions, we now report an evaluation using a large non-redundant data set containing >2000 protein–ligand complexes. This unprecedented test, the largest ever reported to our knowledge, demonstrates LISE’s overall accuracy and robustness. Furthermore, we have identified some hard to predict protein classes and provided an estimate of the performance that can be expected from a state-of-the-art binding site prediction server, such as LISE, on a proteome scale. The server is freely available at http://lise.ibms.sinica.edu.tw.
Affiliation(s)
- Zhong-Ru Xie
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
|
37
|
Chemogenomics in drug discovery: computational methods based on the comparison of binding sites. Future Med Chem 2013; 4:1971-9. [PMID: 23088277 DOI: 10.4155/fmc.12.147] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract
Novel computational methods for understanding relationships between ligands and all possible biological targets have emerged in recent years. Proteins are connected to each other based on the similarity of their ligands or based on the similarity of their binding sites. The assumption is that compounds sharing chemical similarity should share targets and that targets with a similar binding site should also share ligands. A large number of computational techniques have been developed to assess ligand and binding site similarity, which can be used to make quantitative predictions of the most probable biological target of a given compound. This review covers the recent advances in new computational methods for relating biological targets based on the similarity of their binding sites. Binding site comparisons are used for the prediction of their most likely ligands, their possible cross reactivity and selectivity. These comparisons can also be used to infer the function of novel uncharacterized proteins.
|
38
|
Xie ZR, Hwang MJ. Ligand-binding site prediction using ligand-interacting and binding site-enriched protein triangles. Bioinformatics 2012; 28:1579-85. [PMID: 22495747 DOI: 10.1093/bioinformatics/bts182] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022]
Abstract
Motivation: Knowledge about the site at which a ligand binds provides an important clue for predicting the function of a protein and is also often a prerequisite for performing docking computations in virtual drug design and screening. We have previously shown that certain ligand-interacting triangles of protein atoms, called protein triangles, tend to occur more frequently at ligand-binding sites than at other parts of the protein. Results: In this work, we describe a new ligand-binding site prediction method that was developed based on binding site-enriched protein triangles. The new method was tested on 2 benchmark datasets and on 19 targets from two recent community-based studies of such predictions, and excellent results were obtained. Where comparisons were made, the success rates for the new method for the first predicted site were significantly better than methods that are not a meta-predictor. Further examination showed that, for most of the unsuccessful predictions, the pocket of the ligand-binding site was identified, but not the site itself, whereas for some others, the failure was not due to the method itself but due to the use of an incorrect biological unit in the structure examined, although using correct biological units would not necessarily improve the prediction success rates. These results suggest that the new method is a valuable new addition to a suite of existing structure-based bioinformatics tools for studies of molecular recognition and related functions of proteins in post-genomics research. Availability: The executable binaries and a web server for our method are available from http://sourceforge.net/projects/msdock/ and http://lise.ibms.sinica.edu.tw, respectively, free for academic users.
Affiliation(s)
- Zhong-Ru Xie
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 112, Taiwan
|
39
|
Fauman EB, Rai BK, Huang ES. Structure-based druggability assessment--identifying suitable targets for small molecule therapeutics. Curr Opin Chem Biol 2011; 15:463-8. [PMID: 21704549 DOI: 10.1016/j.cbpa.2011.05.020] [Citation(s) in RCA: 118] [Impact Index Per Article: 8.4] [Received: 03/05/2011] [Revised: 05/10/2011] [Accepted: 05/23/2011] [Indexed: 01/08/2023]
Abstract
A target is druggable if it can be modulated in vivo by a drug-like molecule. The general properties of oral drugs are summarized by the 'rule of 5' which specifies parameters related to size and lipophilicity. Structure-based target druggability assessment consists of predicting ligand-binding sites on the protein that are complementary to these drug-like properties. Automated identification of ligand-binding sites can use geometrical considerations alone or include specific physicochemical properties of the protein surface. Features of a pocket's size and shape, together with measures of its hydrophobicity, are most informative in identifying suitable drug-binding pockets. The recent availability of several validation sets of druggable versus undruggable targets has helped fuel the development of more elaborate methods.
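The 'rule of 5' mentioned in this abstract can be written as a small check. The sketch below uses the commonly cited thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); these thresholds are not stated in the abstract itself, and the function name and example descriptor values are hypothetical.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four commonly cited rule-of-5 criteria are violated."""
    checks = [
        mol_weight > 500,   # size: molecular weight above 500 Da
        log_p > 5,          # lipophilicity: logP above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)

# Hypothetical descriptor values for two illustrative molecules.
small_molecule = rule_of_five_violations(180.2, 1.2, 1, 4)
large_lipophilic = rule_of_five_violations(650.0, 6.1, 2, 8)
```

A pocket-level druggability method of the kind the abstract describes would go further and ask whether a binding site can accommodate molecules that pass such a check; that step is not shown here.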
Affiliation(s)
- Eric B Fauman
- Computational Sciences Center of Emphasis, Pfizer Worldwide Research and Development, Cambridge, MA, United States
|