1. Skolnick J, Srinivasan B, Skolnick S, Edelman B, Zhou H. Entabolons: How Metabolites Modify the Biochemical Function of Proteins and Cause the Correlated Behavior of Proteins in Pathways. J Chem Inf Model 2025. [PMID: 40378093 DOI: 10.1021/acs.jcim.5c00462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2025]
Abstract
Although there are over 100,000 distinct human metabolites, their biological significance is often not fully appreciated. Metabolites can reshape the protein pockets to which they bind by COLIG formation, thereby influencing enzyme kinetics and altering the monomer-multimer equilibrium in protein complexes. Binding a common metabolite to a set of protein monomers or multimers results in metabolic entanglements that couple the conformational states and functions of nonhomologous, nonphysically interacting proteins that bind the same metabolite. These shared metabolites might provide the collective behavior responsible for protein pathway formation. Proteins whose binding and functional behavior is modified by a set of metabolites are termed an "entabolon", a portmanteau of metabolic entanglement and metabolon. For 55%-60% (22%-24%) of pairs of nonenzymatic proteins that likely bind the same metabolite, the p-value for being in the same pathway is <0.05 (0.0005). Interestingly, the most populated pairs of proteins common to multiple pathways bind ancient metabolites. Similarly, we suggest how metabolites can possibly activate, terminate, or preclude transcription and other nucleic acid functions and may facilitate or inhibit the binding of nucleic acids to proteins, thereby influencing transcription and translation processes. Consequently, metabolites likely play a critical role in the organization and function of biological systems.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology Georgia Institute of Technology 950 Atlantic Dr NW Atlanta, Georgia 30332, United States
- Bharath Srinivasan
- School of Pharmacy and Life Sciences Robert Gordon University, Aberdeen, Scotland AB10 7AQ, United Kingdom
- Cancer Research Horizons Cancer Research U.K., London CB22 3AT, United Kingdom
- Samuel Skolnick
- Center for the Study of Systems Biology Georgia Institute of Technology 950 Atlantic Dr NW Atlanta, Georgia 30332, United States
- Brice Edelman
- Center for the Study of Systems Biology Georgia Institute of Technology 950 Atlantic Dr NW Atlanta, Georgia 30332, United States
- Hongyi Zhou
- Center for the Study of Systems Biology Georgia Institute of Technology 950 Atlantic Dr NW Atlanta, Georgia 30332, United States
2. Maity D, Qiao B. AlloBench: A Data Set Pipeline for the Development and Benchmarking of Allosteric Site Prediction Tools. ACS Omega 2025; 10:17973-17982. [PMID: 40352555 PMCID: PMC12059942 DOI: 10.1021/acsomega.5c01263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2025] [Revised: 04/14/2025] [Accepted: 04/17/2025] [Indexed: 05/14/2025]
Abstract
Allostery refers to the activity regulation of biological macromolecules originating from the binding of an effector molecule at the allosteric site that is distant from the active site. The few existing allosteric data sets have not been updated with recent discoveries of allosteric proteins and are challenging to use for data-intensive tasks. Instead of providing another data set bound to become outdated, we present the AlloBench pipeline to create high-quality data sets of biomolecules with allosteric and active site information suitable for computational and data-driven studies of protein allostery. The pipeline produces a data set of 2141 allosteric sites from 2034 protein structures with 418 unique protein chains by integrating information from AlloSteric Database, UniProt, Mechanism and Catalytic Site Atlas, and Protein Data Bank. Furthermore, we use a subset of 100 proteins from the AlloBench data set to quantitatively compare the performance of currently available allosteric site prediction tools: APOP, PASSer, Ohm, ALLO, Allosite, STRESS, and AlloPred. Such a large-scale benchmarking of these programs has not been undertaken on a common test set. The results show a significant need for improvement, as the accuracy for all programs is well below 60%, with PASSer (Ensemble) outperforming the rest. The AlloBench pipeline will not only promote the development of improved allosteric site prediction tools but also serve as a reference for studying allostery in general.
Affiliation(s)
- Dibyajyoti Maity
- Department of Natural Sciences, Baruch College, City University of New York, New York 10010, New York United States
- Baofu Qiao
- Department of Natural Sciences, Baruch College, City University of New York, New York 10010, New York United States
3. Zhang R, Chen Z, Li S, Lv H, Li J, Yang N, Dai S. Proteome-Wide Identification and Comparison of Drug Pockets for Discovering New Drug Indications and Side Effects. Molecules 2025; 30:260. [PMID: 39860130 PMCID: PMC11767986 DOI: 10.3390/molecules30020260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2024] [Revised: 01/04/2025] [Accepted: 01/08/2025] [Indexed: 01/27/2025]
Abstract
Drug development faces significant financial and time challenges, highlighting the need for more efficient strategies. This study evaluated the druggability of the entire human proteome using Fpocket. We identified 15,043 druggable pockets in 20,255 predicted protein structures, significantly expanding the estimated druggable proteome from 3000 to over 11,000 proteins. Notably, many druggable pockets were found in less studied proteins, suggesting untapped therapeutic opportunities. The results of a pairwise pocket similarity analysis identified 220,312 similar pocket pairs, with 3241 pairs across different protein families, indicating shared drug-binding potential. In addition, 62,077 significant matches were found between druggable pockets and 1872 known drug pockets, highlighting candidates for drug repositioning. We repositioned progesterone to ADGRD1 for pemphigus and breast cancer, as well as estradiol to ANO2 for shingles and medulloblastoma, which were validated via molecular docking. Off-target effects were analyzed to assess the safety of drugs such as axitinib, linking newly identified targets with known side effects. For axitinib, 127 new targets were identified, and 46 out of 48 documented side effects were linked to these targets. These findings demonstrate the utility of pocket similarity in drug repositioning, target expansion, and improved drug safety evaluation, offering new avenues for the discovery of new indications and side effects of existing drugs.
Affiliation(s)
- Renxin Zhang
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
- Zhiyuan Chen
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
- Shuhan Li
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
- Haohao Lv
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
- Jinjun Li
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
- Naixue Yang
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
- Shaoxing Dai
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Primate Biomedical Research, Kunming 650500, China
4. Zhang H, Gur M, Bahar I. Global hinge sites of proteins as target sites for drug binding. Proc Natl Acad Sci U S A 2024; 121:e2414333121. [PMID: 39585988 PMCID: PMC11626116 DOI: 10.1073/pnas.2414333121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 10/17/2024] [Indexed: 11/27/2024]
Abstract
Hinge sites of proteins play a key role in mediating conformational mechanics. Among them, those involved in the most collective modes of motion, also called global hinges, are of particular interest, as they support cooperative rearrangements that are often functional. Yet, the utility of targeting global hinges for modulating function remains to be established. We present here a systematic study of a series of proteins resolved in drug-bound forms to examine the probabilistic occurrence of spatial overlaps between hinge sites and drug-binding pockets. Our analysis reveals a high propensity of drug binding to hinge sites compared to random. Notably, one-third of currently approved drugs are colocalized with hinge sites. These mechanosensitive sites are predictable by simple models such as the Gaussian Network Model. Their targeting thus emerges as a viable strategy for developing a new class of drugs that would exploit and modulate the target proteins' intrinsic dynamics, and potentially alleviate drug-resistance when used in combination with orthosteric or allosteric drugs.
Affiliation(s)
- Haotian Zhang
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15261
- Mert Gur
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15261
- Ivet Bahar
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15261
- Laufer Center for Physical and Quantitative Biology and Department of Biochemistry and Cell Biology, School of Medicine, Stony Brook University, New York, NY 11794
5. Gao M, Skolnick J. Predicting protein interactions of the kinase Lck critical to T cell modulation. Structure 2024; 32:2168-2179.e2. [PMID: 39368461 PMCID: PMC11560573 DOI: 10.1016/j.str.2024.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 08/19/2024] [Accepted: 09/10/2024] [Indexed: 10/07/2024]
Abstract
Protein-protein interactions (PPIs) play pivotal roles in directing T cell fate. One key player is the non-receptor tyrosine protein kinase Lck that helps to transduce T cell activation signals. Lck is mediated by other proteins via interactions that are inadequately understood. Here, we use the deep learning method AF2Complex to predict PPIs involving Lck, by screening it against ∼1,000 proteins implicated in immune responses, followed by extensive structural modeling for selected interactions. Remarkably, we describe how Lck may be specifically targeted by a palmitoyltransferase using a phosphotyrosine motif. We uncover "hotspot" interactions between Lck and the tyrosine phosphatase CD45, leading to a significant conformational shift of Lck for activation. Lastly, we present intriguing interactions between the phosphotyrosine-binding domain of Lck and the cytoplasmic tail of the immune checkpoint LAG3 and propose a molecular mechanism for its inhibitory role. Together, this multifaceted study provides valuable insights into T cell regulation and signaling.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA; AgnistaBio Inc, Palo Alto, CA 94301, USA.
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA.
6. Ugurlu SY, McDonald D, He S. MEF-AlloSite: an accurate and robust Multimodel Ensemble Feature selection for the Allosteric Site identification model. J Cheminform 2024; 16:116. [PMID: 39444016 PMCID: PMC11515501 DOI: 10.1186/s13321-024-00882-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Accepted: 07/09/2024] [Indexed: 10/25/2024]
Abstract
A crucial mechanism for controlling the actions of proteins is allostery. Allosteric modulators have the potential to provide many benefits compared to orthosteric ligands, such as increased selectivity and saturability of their effect. The identification of new allosteric sites presents prospects for the creation of innovative medications and enhances our comprehension of fundamental biological mechanisms. Allosteric sites are increasingly found in different protein families through various techniques, such as machine learning applications, which opens up possibilities for creating completely novel medications with a diverse variety of chemical structures. Machine learning methods, such as PASSer, exhibit limited efficacy in accurately finding allosteric binding sites when relying solely on 3D structural information.
Scientific Contribution: Prior to conducting feature selection for allosteric binding site identification, integration of supporting amino-acid-based information to 3D structural knowledge is advantageous. This approach can enhance performance by ensuring accuracy and robustness. Therefore, we have developed an accurate and robust model called Multimodel Ensemble Feature Selection for Allosteric Site Identification (MEF-AlloSite) after collecting 9460 relevant and diverse features from the literature to characterise pockets. The model employs an accurate and robust multimodal feature selection technique for the small training set size of only 90 proteins to improve predictive performance. This state-of-the-art technique increased the performance in allosteric binding site identification by selecting promising features from 9460 features. Also, the relationship between selected features and allosteric binding sites enlightened the understanding of complex allostery for proteins by analysing selected features. MEF-AlloSite and state-of-the-art allosteric site identification methods such as PASSer2.0 and PASSerRank have been tested on three test cases 51 times with a different split of the training set. The Student's t test and Cohen's D value have been used to evaluate the average precision and ROC AUC score distribution. On three test cases, most of the p-values (< 0.05) and the majority of Cohen's D values (> 0.5) showed that MEF-AlloSite's 1-6% higher mean of average precision and ROC AUC than state-of-the-art allosteric site identification methods are statistically significant.
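The abstract above compares score distributions with a Student's t test and Cohen's D. As a minimal sketch of what those two statistics measure, the Python below computes both for two independent samples using a pooled variance. The abstract does not say whether the test was paired or which Cohen's D variant was used, so the independent, equal-variance form here is an assumption, and the function names and sample scores are hypothetical illustrations rather than values from the paper.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(a, b):
    """Pooled sample variance of two independent samples (assumed form)."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def cohens_d(a, b):
    """Effect size: difference of means in units of the pooled standard deviation."""
    return (mean(a) - mean(b)) / sqrt(pooled_variance(a, b))


def student_t(a, b):
    """Two-sample Student's t statistic, equal-variance form."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / sqrt(pooled_variance(a, b) * (1 / na + 1 / nb))


# Hypothetical average-precision scores from repeated training splits.
model_a = [0.80, 0.82, 0.84]
model_b = [0.76, 0.78, 0.80]
print(cohens_d(model_a, model_b), student_t(model_a, model_b))
```

A Cohen's D above 0.5 is conventionally read as at least a medium effect, which is the threshold the abstract quotes; turning the t statistic into a p-value additionally requires the t distribution with `na + nb - 2` degrees of freedom, which this sketch leaves out.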
Affiliation(s)
- Sadettin Y Ugurlu
- School of Computer Science, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Shan He
- School of Computer Science, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK.
- AIA Insights Ltd, Birmingham, UK.
7. Reim T, Ehrt C, Graef J, Günther S, Meents A, Rarey M. SiteMine: Large-scale binding site similarity searching in protein structure databases. Arch Pharm (Weinheim) 2024; 357:e2300661. [PMID: 38335311 DOI: 10.1002/ardp.202300661] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/13/2023] [Revised: 01/10/2024] [Accepted: 01/16/2024] [Indexed: 02/12/2024]
Abstract
Drug discovery and design challenges, such as drug repurposing, analyzing protein-ligand and protein-protein complexes, ligand promiscuity studies, or function prediction, can be addressed by protein binding site similarity analysis. Although numerous tools exist, they all have individual strengths and drawbacks with regard to run time, provision of structure superpositions, and applicability to diverse application domains. Here, we introduce SiteMine, an all-in-one database-driven, alignment-providing binding site similarity search tool to tackle the most pressing challenges of binding site comparison. The performance of SiteMine is evaluated on the ProSPECCTs benchmark, showing a promising performance on most of the data sets. The method performs convincingly regarding all quality criteria for reliable binding site comparison, offering a novel state-of-the-art approach for structure-based molecular design based on binding site comparisons. In a SiteMine showcase, we discuss the high structural similarity between cathepsin L and calpain 1 binding sites and give an outlook on the impact of this finding on structure-based drug design. SiteMine is available at https://uhh.de/naomi.
Affiliation(s)
- Thorben Reim
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
- Christiane Ehrt
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
- Joel Graef
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
- Sebastian Günther
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
- Alke Meents
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
- Matthias Rarey
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
8. Pallante L, Cannariato M, Androutsos L, Zizzi EA, Bompotas A, Hada X, Grasso G, Kalogeras A, Mavroudi S, Di Benedetto G, Theofilatos K, Deriu MA. VirtuousPocketome: a computational tool for screening protein-ligand complexes to identify similar binding sites. Sci Rep 2024; 14:6296. [PMID: 38491261 PMCID: PMC10943019 DOI: 10.1038/s41598-024-56893-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 03/12/2024] [Indexed: 03/18/2024]
Abstract
Protein residues within binding pockets play a critical role in determining the range of ligands that can interact with a protein, influencing its structure and function. Identifying structural similarities in proteins offers valuable insights into their function and activation mechanisms, aiding in predicting protein-ligand interactions, anticipating off-target effects, and facilitating the development of therapeutic agents. Numerous computational methods assessing global or local similarity in protein cavities have emerged, but their utilization is impeded by complexity, impractical automation for amino acid pattern searches, and an inability to evaluate the dynamics of scrutinized protein-ligand systems. Here, we present a general, automatic and unbiased computational pipeline, named VirtuousPocketome, aimed at screening huge databases of proteins for similar binding pockets starting from a protein-ligand complex of interest. We demonstrate the pipeline's potential by exploring a recently solved human bitter taste receptor, TAS2R46, complexed with strychnine. We pinpointed 145 proteins sharing similar binding sites compared to the analysed bitter taste receptor and the enrichment analysis highlighted the related biological processes, molecular functions and cellular components. This work represents the foundation for future studies aimed at understanding the effective role of tastants outside the gustatory system: this could pave the way towards the rationalization of the diet as a supplement to standard pharmacological treatments and the design of novel tastants-inspired compounds to target other proteins involved in specific diseases or disorders. The proposed pipeline is publicly accessible, can be applied to any protein-ligand complex, and could be expanded to screen any database of protein structures.
Affiliation(s)
- Lorenzo Pallante
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Marco Cannariato
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Eric A Zizzi
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Agorakis Bompotas
- Industrial Systems Institute, Athena Research Center, 265 04, Patras, Greece
- Xhesika Hada
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Gianvito Grasso
- Dalle Molle Institute for Artificial Intelligence IDSIA USI-SUPSI, 6962, Lugano-Viganello, Switzerland
- Seferina Mavroudi
- Department of Nursing, School of Health Rehabilitation Sciences, University of Patras, 265 04, Patras, Greece
- Marco A Deriu
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy.
9. Shen Y, Parks JM, Smith JC. HLA-Clus: HLA class I clustering based on 3D structure. BMC Bioinformatics 2023; 24:189. [PMID: 37161375 PMCID: PMC10169335 DOI: 10.1186/s12859-023-05297-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 04/18/2023] [Indexed: 05/11/2023]
Abstract
BACKGROUND: In a previous paper, we classified populated HLA class I alleles into supertypes and subtypes based on the similarity of the 3D landscape of peptide binding grooves, using a newly defined structure distance metric and a hierarchical clustering approach. Compared to other approaches, our method achieves higher correlation with peptide binding specificity, intra-cluster similarity (cohesion), and robustness. Here we introduce HLA-Clus, a Python package for clustering HLA Class I alleles using the method we developed recently and describe additional features including a new nearest neighbor clustering method that facilitates clustering based on user-defined criteria.
RESULTS: The HLA-Clus pipeline includes three stages: First, HLA Class I structural models are coarse grained and transformed into clouds of labeled points. Second, similarities between alleles are determined using a newly defined structure distance metric that accounts for spatial and physicochemical similarities. Finally, alleles are clustered via hierarchical or nearest-neighbor approaches. We also interfaced HLA-Clus with the peptide:HLA affinity predictor MHCnuggets. By using the nearest neighbor clustering method to select optimal allele-specific deep learning models in MHCnuggets, the average accuracy of peptide binding prediction of rare alleles was improved.
CONCLUSIONS: The HLA-Clus package offers a solution for characterizing the peptide binding specificities of a large number of HLA alleles. This method can be applied in HLA functional studies, such as the development of peptide affinity predictors, disease association studies, and HLA matching for grafting. HLA-Clus is freely available at our GitHub repository (https://github.com/yshen25/HLA-Clus).
Affiliation(s)
- Yue Shen
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, 37996, USA
- Jerry M Parks
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Jeremy C Smith
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, 37996, USA.
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA.
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN, 37996, USA.
10. Shen Y, Parks JM, Smith JC. HLA Class I Supertype Classification Based on Structural Similarity. J Immunol 2023; 210:103-114. [PMID: 36453976 DOI: 10.4049/jimmunol.2200685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 10/31/2022] [Indexed: 12/24/2022]
Abstract
HLA class I proteins, a critical component in adaptive immunity, bind and present intracellular Ags to CD8+ T cells. The extreme polymorphism of HLA genes and associated peptide binding specificities leads to challenges in various endeavors, including neoantigen vaccine development, disease association studies, and HLA typing. Supertype classification, defined by clustering functionally similar HLA alleles, has proven helpful in reducing the complexity of distinguishing alleles. However, determining supertypes via experiments is impractical, and current in silico classification methods exhibit limitations in stability and functional relevance. In this study, by incorporating three-dimensional structures we present a method for classifying HLA class I molecules with improved breadth, accuracy, stability, and flexibility. Critical for these advances is our finding that structural similarity highly correlates with peptide binding specificity. The new classification should be broadly useful in peptide-based vaccine development and HLA-disease association studies.
Affiliation(s)
- Yue Shen
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN
- Jerry M Parks
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN
- Jeremy C Smith
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN
11. Chan L, Kumar R, Verdonk M, Poelking C. A multilevel generative framework with hierarchical self-contrasting for bias control and transparency in structure-based ligand design. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00564-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
12. Guo Z, Liu J, Skolnick J, Cheng J. Prediction of inter-chain distance maps of protein complexes with 2D attention-based deep neural networks. Nat Commun 2022; 13:6963. [PMID: 36379943 PMCID: PMC9666547 DOI: 10.1038/s41467-022-34600-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 06/08/2022] [Accepted: 10/24/2022] [Indexed: 11/16/2022]
Abstract
Residue-residue distance information is useful for predicting tertiary structures of protein monomers or quaternary structures of protein complexes. Many deep learning methods have been developed to predict intra-chain residue-residue distances of monomers accurately, but few methods can accurately predict inter-chain residue-residue distances of complexes. We develop a deep learning method CDPred (i.e., Complex Distance Prediction) based on the 2D attention-powered residual network to address the gap. Tested on two homodimer datasets, CDPred achieves the precision of 60.94% and 42.93% for top L/5 inter-chain contact predictions (L: length of the monomer in homodimer), respectively, substantially higher than DeepHomo's 37.40% and 23.08% and GLINTER's 48.09% and 36.74%. Tested on the two heterodimer datasets, the top Ls/5 inter-chain contact prediction precision (Ls: length of the shorter monomer in heterodimer) of CDPred is 47.59% and 22.87% respectively, surpassing GLINTER's 23.24% and 13.49%. Moreover, the prediction of CDPred is complementary with that of AlphaFold2-multimer.
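The "top L/5" precision quoted above can be illustrated with a short Python sketch: rank the predicted inter-chain residue pairs by score, keep the top L/5 (L being the monomer length, or the shorter monomer's length for heterodimers), and report the fraction that are true contacts. The function names, the dictionary-based input format, and the use of floor division for L/5 are assumptions made for illustration; the abstract does not specify CDPred's evaluation code or its contact definition.

```python
def top_k_precision(scores, true_contacts, k):
    """Fraction of the k highest-scoring residue pairs that are true contacts.

    scores: dict mapping (i, j) residue-index pairs to predicted contact scores.
    true_contacts: set of (i, j) pairs regarded as real inter-chain contacts.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


def top_l_over_5_precision(scores, true_contacts, monomer_length):
    """Top-L/5 precision; rounding L/5 down (at least 1) is an assumed convention."""
    k = max(1, monomer_length // 5)
    return top_k_precision(scores, true_contacts, k)


# Hypothetical toy example: a 10-residue monomer, so the top 2 pairs are scored.
predicted = {(0, 5): 0.9, (1, 6): 0.8, (2, 7): 0.1}
actual = {(0, 5), (2, 7)}
print(top_l_over_5_precision(predicted, actual, monomer_length=10))
```

Because only the highest-ranked pairs count, this metric rewards confident correct predictions and ignores how the remaining pairs are ordered.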
Affiliation(s)
- Zhiye Guo
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
- Jeffrey Skolnick
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-200 USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
13. Eguida M, Rognan D. Estimating the Similarity between Protein Pockets. Int J Mol Sci 2022; 23:12462. [PMID: 36293316 PMCID: PMC9604425 DOI: 10.3390/ijms232012462] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/06/2022] [Revised: 10/15/2022] [Accepted: 10/16/2022] [Indexed: 10/28/2023]
Abstract
With the exponential increase in publicly available protein structures, the comparison of protein binding sites naturally emerged as a scientific topic to explain observations or generate hypotheses for ligand design, notably to predict ligand selectivity for on- and off-targets, explain polypharmacology, and design target-focused libraries. The current review summarizes the state-of-the-art computational methods applied to pocket detection and comparison as well as structural druggability estimates. The major strengths and weaknesses of current pocket descriptors, alignment methods, and similarity search algorithms are presented. Lastly, an exhaustive survey of both retrospective and prospective applications in diverse medicinal chemistry scenarios illustrates the capability of the existing methods and the hurdle that still needs to be overcome for more accurate predictions.
Affiliation(s)
- Didier Rognan
- Laboratoire d’Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 67400 Illkirch, France
14. Skolnick J, Zhou H. Implications of the Essential Role of Small Molecule Ligand Binding Pockets in Protein-Protein Interactions. J Phys Chem B 2022; 126:6853-6867. [PMID: 36044742 PMCID: PMC9484464 DOI: 10.1021/acs.jpcb.2c04525] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/29/2022] [Revised: 08/18/2022] [Indexed: 11/28/2022]
Abstract
Protein-protein interactions (PPIs) and protein-metabolite interactions play a key role in many biochemical processes, yet they are often viewed as being independent. However, the fact that small molecule drugs have been successful in inhibiting PPIs suggests a deeper relationship between protein pockets that bind small molecules and PPIs. We demonstrate that 2/3 of PPI interfaces, including antibody-epitope interfaces, contain at least one significant small molecule ligand binding pocket. In a representative library of 50 distinct protein-protein interactions involving hundreds of mutations, >75% of hot spot residues overlap with small molecule ligand binding pockets. Hence, ligand binding pockets play an essential role in PPIs. In representative cases, evolutionarily unrelated monomers that are involved in different multimeric interactions yet share the same pocket are predicted to bind the same metabolites/drugs; these results are confirmed by examples in the PDB. Thus, the binding of a metabolite can shift the equilibrium between monomers and multimers. This implicit coupling of PPI equilibria, termed "metabolic entanglement", was successfully employed to suggest novel functional relationships among protein multimers that do not directly interact. Thus, the current work provides an approach to unify metabolomics and protein interactomics.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332, United States
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332, United States
|
15
|
D’Arrigo G, Autiero I, Gianquinto E, Siragusa L, Baroni M, Cruciani G, Spyrakis F. Exploring Ligand Binding Domain Dynamics in the NRs Superfamily. Int J Mol Sci 2022; 23:8732. [PMID: 35955864 PMCID: PMC9369052 DOI: 10.3390/ijms23158732] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/22/2022] [Revised: 07/29/2022] [Accepted: 08/04/2022] [Indexed: 11/16/2022] Open
Abstract
Nuclear receptors (NRs) are transcription factors that play an important role in multiple diseases, such as cancer, inflammation, and metabolic disorders. They share a common structural organization composed of five domains, of which the ligand-binding domain (LBD) can adopt different conformations in response to substrate, agonist, and antagonist binding, leading to distinct transcription effects. A key feature of NRs is, indeed, their intrinsic dynamics that make them a challenging target in drug discovery. This work aims to provide a meaningful investigation of NR structural variability to outline a dynamic profile for each of them. To do that, we propose a methodology based on the computation and comparison of protein cavities among the crystallographic structures of NR LBDs. First, pockets were detected with the FLAPsite algorithm and then an "all against all" approach was applied by comparing each pair of pockets within the same sub-family on the basis of their similarity score. The analysis concerned all the detectable cavities in NRs, with particular attention paid to the active site pockets. This approach can guide the investigation of NR intrinsic dynamics, the selection of reference structures to be used in drug design and the easy identification of alternative binding sites.
Affiliation(s)
- Giulia D’Arrigo
- Department of Drug Science and Technology, University of Turin, Via Giuria 9, 10125 Turin, Italy
- Ida Autiero
- Molecular Horizon Srl, Via Montelino 30, 06084 Bettona, Italy
- National Research Council, Institute of Biostructures and Bioimaging, 80138 Naples, Italy
- Eleonora Gianquinto
- Department of Drug Science and Technology, University of Turin, Via Giuria 9, 10125 Turin, Italy
- Lydia Siragusa
- Molecular Horizon Srl, Via Montelino 30, 06084 Bettona, Italy
- Molecular Discovery Ltd., Theobald Street, Elstree Borehamwood, Hertfordshire WD6 4PJ, UK
- Massimo Baroni
- Molecular Discovery Ltd., Theobald Street, Elstree Borehamwood, Hertfordshire WD6 4PJ, UK
- Gabriele Cruciani
- Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
- Consortium for Computational Molecular and Materials Sciences (CMS), Via Elce di Sotto 8, 06123 Perugia, Italy
- Correspondence: (G.C.); (F.S.); Tel.: +39-075-5855629 (G.C.); +39-011-6707185 (F.S.)
- Francesca Spyrakis
- Department of Drug Science and Technology, University of Turin, Via Giuria 9, 10125 Turin, Italy
- Correspondence: (G.C.); (F.S.); Tel.: +39-075-5855629 (G.C.); +39-011-6707185 (F.S.)
|
16
|
Yang L, He W, Yun Y, Gao Y, Zhu Z, Teng M, Liang Z, Niu L. Defining A Global Map of Functional Group-based 3D Ligand-binding Motifs. Genomics Proteomics Bioinformatics 2022; 20:765-779. [PMID: 35288344 PMCID: PMC9881048 DOI: 10.1016/j.gpb.2021.08.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2020] [Revised: 06/30/2021] [Accepted: 09/27/2021] [Indexed: 01/31/2023]
Abstract
Uncovering conserved 3D protein-ligand binding patterns on the basis of functional groups (FGs) shared by a variety of small molecules can greatly expand our knowledge of protein-ligand interactions. Although conserved binding patterns for a few commonly used FGs have been reported in the literature, large-scale identification and evaluation of FG-based 3D binding motifs are still lacking. Here, we propose a computational method, Automatic FG-based Three-dimensional Motif Extractor (AFTME), for automatic mapping of 3D motifs to different FGs of a specific ligand. Applying our method to 233 naturally occurring ligands, we define 481 FG-binding motifs that are highly conserved across different ligand-binding pockets. Systematic analysis further reveals four main classes of binding motifs corresponding to distinct sets of FGs. Combinations of FG-binding motifs facilitate the binding of proteins to a wide spectrum of ligands with various binding affinities. Finally, we show that our FG-motif map can be used to nominate FGs that potentially bind to specific drug targets, thus providing useful insights and guidance for rational design of small-molecule drugs.
Affiliation(s)
- Liu Yang
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Wei He
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Yuehui Yun
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Yongxiang Gao
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Zhongliang Zhu
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Maikun Teng
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Zhi Liang
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
- Liwen Niu
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Division of Molecular and Cellular Biophysics, Hefei National Laboratory for Physical Sciences at the Microscale, Hefei 230026, China
|
17
|
Dankwah KO, Mohl JE, Begum K, Leung MY. What Makes GPCRs from Different Families Bind to the Same Ligand? Biomolecules 2022; 12:863. [PMID: 35883418 PMCID: PMC9313020 DOI: 10.3390/biom12070863] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/15/2022] [Revised: 06/09/2022] [Accepted: 06/19/2022] [Indexed: 12/10/2022] Open
Abstract
G protein-coupled receptors (GPCRs) are the largest class of cell-surface receptor proteins with important functions in signal transduction and often serve as therapeutic drug targets. With the rapidly growing public data on three-dimensional (3D) structures of GPCRs and GPCR-ligand interactions, computational prediction of GPCR ligand binding becomes a convincing alternative to high-throughput screening and other experimental approaches during the early phases of ligand discovery. In this work, we set out to computationally uncover and understand the binding of a single ligand to GPCRs from several different families. Three-dimensional structural comparisons of the GPCRs that bind to the same ligand revealed local 3D structural similarities, and these regions often overlap with the locations of binding pockets. These pockets were found to be similar (based on backbone geometry and side-chain orientation using APoc), and their similarity correlates positively with the electrostatic properties of the pockets. Moreover, the more similar the pockets, the more likely a ligand binding to the pockets will interact with similar residues, have similar conformations, and produce similar binding affinities across the pockets. These findings can be exploited to improve protein function inference, drug repurposing and drug toxicity prediction, and accelerate the development of new drugs.
Affiliation(s)
- Kwabena Owusu Dankwah
- Computational Science Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Jonathon E. Mohl
- Computational Science Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Bioinformatics Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, TX 79968, USA
- Border Biomedical Research Center, The University of Texas at El Paso, El Paso, TX 79968, USA
- Khodeza Begum
- Bioinformatics Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Border Biomedical Research Center, The University of Texas at El Paso, El Paso, TX 79968, USA
- Ming-Ying Leung
- Computational Science Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Bioinformatics Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, TX 79968, USA
- Border Biomedical Research Center, The University of Texas at El Paso, El Paso, TX 79968, USA
|
18
|
Sankar S, Chandra N. SiteMotif: A graph-based algorithm for deriving structural motifs in Protein Ligand binding sites. PLoS Comput Biol 2022; 18:e1009901. [PMID: 35202398 PMCID: PMC8903255 DOI: 10.1371/journal.pcbi.1009901] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/03/2021] [Revised: 03/08/2022] [Accepted: 02/07/2022] [Indexed: 12/03/2022] Open
Abstract
Studying similarities in protein molecules has become a fundamental activity in much of biology and biomedical research, for which methods such as multiple sequence alignments are widely used. Most methods available for such comparisons cater to studying proteins which have clearly recognizable evolutionary relationships but not to proteins that recognize the same or similar ligands but do not share similarities in their sequence or structural folds. In many cases, proteins in the latter class share structural similarities only in their binding sites. While several algorithms are available for comparing binding sites, there are none for deriving structural motifs of the binding sites, independent of the whole proteins. We report the development of SiteMotif, a new algorithm that compares binding sites from multiple proteins and derives sequence-order independent structural site motifs. We have tested the algorithm at multiple levels of complexity and demonstrate its performance in different scenarios. We have benchmarked against 3 current methods available for binding site comparison and demonstrate superior performance of our algorithm. We show that SiteMotif identifies new structural motifs of spatially conserved residues in proteins, even when there is no sequence or fold-level similarity. We expect SiteMotif to be useful for deriving key mechanistic insights into the mode of ligand interaction, predicting the ligand type that a protein can bind, and improving the sensitivity of functional annotation.
A large number of biological functions are orchestrated by proteins. The function of a protein is governed by its structure and its interacting ligand. However, it is known that not all residues are involved in ligand recognition. More specifically, residues that are located within 4.5 Å of ligand atoms are considered to be 'binding sites'. Here, we have developed an algorithm called SiteMotif that efficiently aligns multiple binding sites into a common frame. This process enables us to derive conservation among the binding site residues in a sequence-order independent manner. The algorithm was validated extensively across five different levels and measured binding site similarities in each of them. Previous research has found multiple instances where different proteins have comparable binding sites and hence perform the same function. We present the ability of our method to detect such scenarios. Finally, as a use case, we applied SiteMotif to a set of glutathione binding proteins and derived a site-based sequence motif characteristic of all glutathione binding proteins.
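The 4.5 Å criterion quoted in this abstract can be illustrated with a minimal Python sketch. This is not SiteMotif itself: the function name `binding_site_residues`, the `(residue_id, xyz)` atom representation, and the toy coordinates are all assumptions made for illustration, and a real workflow would read atoms from a structure file.

```python
import math

def binding_site_residues(protein_atoms, ligand_atoms, cutoff=4.5):
    """Return the sorted ids of residues with at least one atom within
    `cutoff` angstroms of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical format)
    ligand_atoms:  iterable of (x, y, z) tuples
    """
    ligand_atoms = list(ligand_atoms)
    site = set()
    for res_id, coord in protein_atoms:
        if res_id in site:
            continue  # residue already selected through another atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            site.add(res_id)
    return sorted(site)

# Toy coordinates, invented for illustration only.
protein_atoms = [
    ("GLY12", (0.0, 0.0, 0.0)),
    ("GLY12", (1.5, 0.0, 0.0)),
    ("SER45", (4.0, 0.0, 0.0)),
    ("LEU80", (10.0, 0.0, 0.0)),
]
ligand_atoms = [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)]

print(binding_site_residues(protein_atoms, ligand_atoms))
```

The brute-force double loop is adequate for a single pocket; for whole-structure scans a spatial index (for example a k-d tree) would likely be preferable.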
Affiliation(s)
- Santhosh Sankar
- Department of Biochemistry, Indian Institute of Science, Bangalore, Karnataka, India
- Nagasuma Chandra
- Department of Biochemistry, Indian Institute of Science, Bangalore, Karnataka, India
- BioSystems Science and Engineering, Indian Institute of Science, Bangalore, Karnataka, India
|
19
|
Zhang W, Huang J. EViS: An Enhanced Virtual Screening Approach Based on Pocket-Ligand Similarity. J Chem Inf Model 2022; 62:498-510. [PMID: 35084171 DOI: 10.1021/acs.jcim.1c00944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Virtual screening (VS), which can be classified into ligand-based and structure-based approaches, is a popular technology in drug discovery for identifying new scaffolds of actives for a specific drug target. As the number of protein-ligand complex structures available in public databases increases, it becomes possible to develop a template searching-based VS approach that utilizes such information. In this work, we propose an enhanced VS approach, termed EViS, that integrates ligand docking, protein pocket template searching, and ligand template shape similarity calculation. A novel and simple PL-score that characterizes local pocket-ligand template similarity was used to evaluate the screening compounds. Benchmark tests were performed on three datasets: DUDE, LIT-PCBA, and DEKOIS. EViS achieved average enrichment factors (EFs) of 27.8 and 23.4 at a 1% cutoff for experimental and predicted structures, respectively, on the widely used DUDE dataset. Detailed data analysis shows that EViS benefits from obtaining favorable ligand poses from docking and using such ligand geometric information to perform three-dimensional (3D) ligand similarity calculations, and that the PL-score is efficient for screening compounds based on template searching in the protein-ligand structure database.
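For readers unfamiliar with the metric, the enrichment factor at a cutoff is commonly defined as the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole library. The Python sketch below uses that common definition; it is not taken from the EViS paper, whose exact protocol may differ. The function name, the "higher score is better" convention, and the toy library are assumptions.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at the given top fraction of a ranked library.

    scores:    one score per compound; ASSUMPTION: higher means better
               (negate docking energies, where lower is better, before calling)
    is_active: one bool per compound, True for known actives
    """
    n_total = len(scores)
    n_actives = sum(is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(n_total * fraction))  # size of the top slice
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits = sum(active for _, active in ranked[:n_top])
    return (hits / n_top) / (n_actives / n_total)

# Toy library, invented for illustration: 200 compounds, 10 actives,
# two of which receive the two best scores.
scores = list(range(200))
is_active = [False] * 200
for index in (199, 198, 0, 1, 2, 3, 4, 5, 6, 7):
    is_active[index] = True

print(enrichment_factor(scores, is_active, fraction=0.01))
```

With this definition, the largest attainable value at a 1% cutoff is 100, reached when every compound in the top slice is active and actives make up no more than 1% of the library.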
Affiliation(s)
- Wenyi Zhang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China; Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
- Jing Huang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China; Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
|
20
|
Rao L, Jia NX, Hu J, Yu DJ, Zhang GJ. ATPdock: a template-based method for ATP-specific protein-ligand docking. Bioinformatics 2022; 38:556-558. [PMID: 34546290 DOI: 10.1093/bioinformatics/btab667] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/24/2021] [Revised: 09/15/2021] [Accepted: 09/18/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Accurately identifying protein-ATP binding poses is valuable for both basic structural biology and drug discovery. Although many docking methods have been designed, most of them require a user-defined binding site and struggle to produce high-quality protein-ATP docking results. It is critical to develop a protein-ATP-specific blind docking method without user-defined binding sites.
RESULTS: Here, we present ATPdock, a template-based method for docking ATP into proteins. For each query protein, if no pocket site is given, ATPdock first identifies its most likely pocket using ATPbind, an ATP-binding site predictor; second, the template pocket that is most similar to the given or identified pocket is searched for in the database of pocket-ligand structures using APoc, a pocket structural alignment tool; third, the rough docking pose of ATP (rdATP) is generated using LS-align, a ligand structural alignment tool, to align the initial ATP pose to the template ligand corresponding to the template pocket; finally, Metropolis Monte Carlo simulation is used to fine-tune the rdATP under the guidance of the AutoDock Vina energy function. Benchmark tests show that ATPdock significantly outperforms other state-of-the-art methods in docking accuracy.
AVAILABILITY AND IMPLEMENTATION: https://jun-csbio.github.io/atpdock/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Liang Rao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Ning-Xin Jia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Jun Hu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
21
|
Gao M, Nakajima An D, Skolnick J. Deep learning-driven insights into super protein complexes for outer membrane protein biogenesis in bacteria. eLife 2022; 11:e82885. [PMID: 36576775 PMCID: PMC9797188 DOI: 10.7554/elife.82885] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/22/2022] [Accepted: 11/28/2022] [Indexed: 12/29/2022] Open
Abstract
To reach their final destinations, outer membrane proteins (OMPs) of gram-negative bacteria undertake an eventful journey beginning in the cytosol. Multiple molecular machines, chaperones, proteases, and other enzymes facilitate the translocation and assembly of OMPs. These helpers usually associate, often transiently, forming large protein assemblies. They are not well understood due to experimental challenges in capturing and characterizing protein-protein interactions (PPIs), especially transient ones. Using AF2Complex, we introduce a high-throughput, deep learning pipeline to identify PPIs within the Escherichia coli cell envelope and apply it to several proteins from an OMP biogenesis pathway. Among the top confident hits obtained from screening ~1500 envelope proteins, we find not only expected interactions but also unexpected ones with profound implications. Subsequently, we predict atomic structures for these protein complexes. These structures, typically of high confidence, explain experimental observations and lead to mechanistic hypotheses for how a chaperone assists a nascent, precursor OMP emerging from a translocon, how another chaperone prevents it from aggregating and docks to a β-barrel assembly port, and how a protease performs quality control. This work presents a general strategy for investigating biological pathways by using structural insights gained from deep learning-based predictions.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
- Davi Nakajima An
- School of Computer Science, Georgia Institute of Technology, Atlanta, United States
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
|
22
|
Li J, Moumbock AFA, Qaseem A, Xu Q, Feng Y, Wang D, Günther S. AroCageDB: A Web-Based Resource for Aromatic Cage Binding Sites and Their Intrinsic Ligands. J Chem Inf Model 2021; 61:5327-5330. [PMID: 34738791 DOI: 10.1021/acs.jcim.1c00927] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
While aromatic cages have been extensively investigated in the context of structural biology, molecular recognition, and drug discovery, there exists to date no comprehensive resource for proteins sharing this conserved structural motif. To this end, we parsed the Protein Data Bank and constructed the Aromatic Cage Database (AroCageDB), a database for investigating the binding pocket descriptors and ligand binding space of aromatic-cage-containing proteins (ACCPs). AroCageDB contains 487 unique ACCPs bound to 890 unique ligands, for a total of 1636 complexes. This web-accessible database provides a user-friendly interface for the interactive visualization of ligand-bound ACCP structures, with a variety of search options that will open up opportunities for structural analyses and drug discovery campaigns. AroCageDB is freely available at http://www.pharmbioinf.uni-freiburg.de/arocagedb/.
Affiliation(s)
- Jianyu Li
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
- Aurélien F A Moumbock
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
- Ammar Qaseem
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
- Qianqing Xu
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
- Yue Feng
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
- Dan Wang
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
- Stefan Günther
- Institute of Pharmaceutical Sciences, Faculty of Chemistry and Pharmacy, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, D-79104 Freiburg, Germany
|
23
|
Gao M, Lund-Andersen P, Morehead A, Mahmud S, Chen C, Chen X, Giri N, Roy RS, Quadir F, Effler TC, Prout R, Abraham S, Elwasif W, Haas NQ, Skolnick J, Cheng J, Sedova A. High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function. Workshop on Machine Learning in HPC Environments 2021; 2021:46-57. [PMID: 35112110 PMCID: PMC8802329 DOI: 10.1109/mlhpc54614.2021.00010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022]
Abstract
Computational biology is one of many scientific disciplines ripe for innovation and acceleration with the advent of high-performance computing (HPC). In recent years, the field of machine learning has also seen significant benefits from adopting HPC practices. In this work, we present a novel HPC pipeline that incorporates various machine-learning approaches for structure-based functional annotation of proteins on the scale of whole genomes. Our pipeline makes extensive use of deep learning and provides computational insights into best practices for training advanced deep-learning models for high-throughput data such as proteomics data. We showcase methodologies our pipeline currently supports and detail future tasks for our pipeline to envelop, including large-scale sequence comparison using SAdLSA and prediction of protein tertiary structures using AlphaFold2.
Affiliation(s)
- Mu Gao
- Georgia Institute of Technology, Atlanta, GA
- Chen Chen
- University of Missouri, Columbia, MO
- Xiao Chen
- University of Missouri, Columbia, MO
- Ryan Prout
- Oak Ridge National Laboratory, Oak Ridge, TN
- Ada Sedova
- Oak Ridge National Laboratory, Oak Ridge, TN
|
24
|
Guterres H, Park SJ, Zhang H, Im W. CHARMM-GUI LBS Finder & Refiner for Ligand Binding Site Prediction and Refinement. J Chem Inf Model 2021; 61:3744-3751. [PMID: 34296608 DOI: 10.1021/acs.jcim.1c00561] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022]
Abstract
A protein performs its task by binding a variety of ligands in a local region known as the ligand-binding site (LBS). Therefore, accurate prediction, characterization, and refinement of LBS can facilitate protein functional annotations and structure-based drug design. In this work, we present CHARMM-GUI LBS Finder & Refiner (https://www.charmm-gui.org/input/lbsfinder), which predicts potential LBS, offers interactive features for local LBS structure analysis, and prepares various molecular dynamics (MD) systems and inputs by setting up distance restraint potentials for LBS structure refinement. LBS Finder & Refiner supports five commonly used simulation programs (NAMD, AMBER, GROMACS, GENESIS, and OpenMM) for LBS structure refinement together with hydrogen mass repartitioning. The capability of LBS Finder & Refiner is illustrated through LBS structure predictions and refinements of 48 modeled and 20 apo benchmark target proteins. Overall, successful LBS structure predictions and refinements are seen in our benchmark tests. We hope that LBS Finder & Refiner will be useful for predicting, characterizing, and refining potential LBS on any given protein of interest.
Affiliation(s)
- Hugo Guterres
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Sang-Jun Park
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Han Zhang
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Wonpil Im
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
|
25
|
Li S, Cai C, Gong J, Liu X, Li H. A fast protein binding site comparison algorithm for proteome-wide protein function prediction and drug repurposing. Proteins 2021; 89:1541-1556. [PMID: 34245187 DOI: 10.1002/prot.26176] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/09/2021] [Revised: 06/26/2021] [Accepted: 06/30/2021] [Indexed: 01/18/2023]
Abstract
The expansion of three-dimensional protein structures and enhanced computing power have significantly facilitated our understanding of protein sequence/structure/function relationships. A challenge in structural genomics is to predict the function of uncharacterized proteins. Protein function deconvolution based on global sequence or structural homology is impracticable when a protein is related to no other protein with known function; in such cases, functional relationships can be established by detecting local ligand binding site similarity. Here, we introduce a sequence order-independent comparison algorithm, PocketShape, for structural proteome-wide exploration of protein functional sites by fully considering the geometry of the backbones, the orientation of the sidechains, and the physicochemical properties of the pocket-lining residues. PocketShape is efficient in distinguishing similar from dissimilar ligand binding site pairs, retrieving 99.3% of the similar pairs while rejecting 100% of the dissimilar pairs on a dataset containing 1538 binding site pairs. This method successfully classifies 83 enzyme structures with diverse functions into 12 clusters, which agrees closely with the actual Structural Classification of Proteins (SCOP) classification. PocketShape also outperforms other methods in protein profiling based on experimental data. Potential new applications for the representative SARS-CoV-2 drugs Remdesivir and 11a are predicted. The high accuracy and time efficiency of PocketShape make it a promising complementary tool for proteome-wide protein function inference and drug repurposing studies.
Affiliation(s)
- Shiliang Li
- State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Chaoqian Cai
- State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China; School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Jiayu Gong
- State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China; School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Xiaofeng Liu
- State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Honglin Li
- State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China; School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China; Research and Development Department, Jiangzhong Pharmaceutical Co., Ltd., Nanchang, China
26
|
Gao M, Skolnick J. A novel sequence alignment algorithm based on deep learning of the protein folding code. Bioinformatics 2021; 37:490-496. [PMID: 32960943 PMCID: PMC8599902 DOI: 10.1093/bioinformatics/btaa810] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Revised: 08/11/2020] [Accepted: 09/08/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: From evolutionary inference and function annotation to structural prediction, protein sequence comparison has provided crucial biological insights. While many sequence alignment algorithms have been developed, existing approaches often cannot detect hidden structural relationships in the 'twilight zone' of low sequence identity. To address this critical problem, we introduce a computational algorithm that performs protein Sequence Alignments from deep-Learning of Structural Alignments (SAdLSA, silent 'd'). The key idea is to implicitly learn the protein folding code from many thousands of structural alignments using experimentally determined protein structures. RESULTS: To demonstrate that the folding code was learned, we first show that SAdLSA trained on pure α-helical proteins successfully recognizes pairs of structurally related pure β-sheet protein domains. Subsequent training and benchmarking on larger, highly challenging datasets show significant improvement over established approaches. For challenging cases, SAdLSA is ∼150% better than HHsearch for generating pairwise alignments and ∼50% better for identifying the proteins with the best alignments in a sequence library. The time complexity of SAdLSA is O(N) thanks to GPU acceleration. AVAILABILITY AND IMPLEMENTATION: Datasets and source codes of SAdLSA are available free of charge for academic users at http://sites.gatech.edu/cssb/sadlsa/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
27
|
Gao M, Skolnick J. A General Framework to Learn Tertiary Structure for Protein Sequence Characterization. Front Bioinform 2021; 1. [PMID: 34308415 PMCID: PMC8301223 DOI: 10.3389/fbinf.2021.689960] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open
Abstract
During the past five years, deep-learning algorithms have enabled ground-breaking progress towards the prediction of tertiary structure from a protein sequence. Very recently, we developed SAdLSA, a new computational algorithm for protein sequence comparison via deep-learning of protein structural alignments. SAdLSA shows significant improvement over established sequence alignment methods. In this contribution, we show that SAdLSA provides a general machine-learning framework for structurally characterizing protein sequences. By aligning a protein sequence against itself, SAdLSA generates a fold distogram for the input sequence, including challenging cases whose structural folds were not present in the training set. About 70% of the predicted distograms are statistically significant. Although at present the accuracy of the intra-sequence distogram predicted by SAdLSA self-alignment is not as good as deep-learning algorithms specifically trained for distogram prediction, it is remarkable that the prediction of single protein structures is encoded by an algorithm that learns ensembles of pairwise structural comparisons, without being explicitly trained to recognize individual structural folds. As such, SAdLSA can not only predict protein folds for individual sequences, but also detects subtle, yet significant, structural relationships between multiple protein sequences using the same deep-learning neural network. The former reduces to a special case in this general framework for protein sequence annotation.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States
28
|
Bhadra A, Yeturu K. Site2Vec: a reference frame invariant algorithm for vector embedding of protein–ligand binding sites. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/abad88] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Protein–ligand interactions are one of the fundamental types of molecular interactions in living systems. Ligands are small molecules that interact with protein molecules at specific regions on their surfaces called binding sites. Binding sites also determine the ADMET properties of a drug molecule. Tasks such as assessment of protein functional similarity and detection of side effects of drugs need identification of similar binding sites of disparate proteins across diverse pathways. To this end, methods for computing similarities between binding sites are still evolving and remain an active area of research. Machine learning methods for similarity assessment require feature descriptors of binding sites. Traditional methods based on hand-engineered motifs and atomic configurations are not scalable across several thousands of sites. In this regard, deep neural network algorithms are now deployed which can capture very complex input feature spaces. However, one fundamental challenge in applying deep learning to structures of binding sites is the input representation and the reference frame. We report here a novel algorithm, Site2Vec, that derives a reference frame invariant vector embedding of a protein–ligand binding site. The method is based on pairwise distances between representative points and chemical compositions in terms of constituent amino acids of a site. The vector embedding serves as a locality sensitive hash function for proximity queries and determining similar sites. The method has been the top performer with more than 95% quality scores in extensive benchmarking studies carried out over 10 data sets and against 23 other site comparison methods in the field. The algorithm serves for high throughput processing and has been evaluated for stability with respect to reference frame shifts, coordinate perturbations and residue mutations.
We also provide the method as a standalone executable and a web service hosted at (http://services.iittp.ac.in/bioinfo/home).
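The reference frame invariance described in the abstract above can be illustrated with a small sketch. This is not the Site2Vec implementation: the function names, the bin width, and the toy coordinates are assumptions made for illustration, and the chemical-composition part of the embedding is left out. It only shows that a descriptor built from pairwise distances does not change when the site is rotated or translated.

```python
import math
from itertools import combinations


def distance_histogram(points, bin_width=2.0, n_bins=10):
    """Histogram of all pairwise distances between 3D points.

    Hypothetical descriptor: distances are unaffected by rigid motions,
    so the histogram is the same in any reference frame.
    """
    hist = [0] * n_bins
    for p, q in combinations(points, 2):
        d = math.dist(p, q)
        # Clamp very long distances into the last bin.
        idx = min(int(d / bin_width), n_bins - 1)
        hist[idx] += 1
    return hist


def rotate_z_and_shift(points, angle, shift):
    """Rigid motion: rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


# Toy "binding site" of four representative points (made-up coordinates).
site = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 1.0, 5.0)]
moved = rotate_z_and_shift(site, angle=0.7, shift=(10.0, -2.0, 3.5))

print(distance_histogram(site))
print(distance_histogram(site) == distance_histogram(moved))
```

The toy coordinates were chosen so that no distance falls on a bin boundary; with real data, a distance close to a boundary can move to a neighboring bin through floating-point rounding, which is one reason a practical descriptor might use soft binning instead.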
29
|
Predicting binding sites from unbound versus bound protein structures. Sci Rep 2020; 10:15856. [PMID: 32985584 PMCID: PMC7522209 DOI: 10.1038/s41598-020-72906-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2020] [Accepted: 07/27/2020] [Indexed: 11/30/2022] Open
Abstract
We present the application of seven binding-site prediction algorithms to a meticulously curated dataset of ligand-bound and ligand-free crystal structures for 304 unique protein sequences (2528 crystal structures). We probe the influence of starting protein structures on the results of binding-site prediction, so the dataset contains a minimum of two ligand-bound and two ligand-free structures for each protein. We use this dataset in a brief survey of five geometry-based, one energy-based, and one machine-learning-based methods: Surfnet, Ghecom, LIGSITEcsc, Fpocket, Depth, AutoSite, and Kalasanty. Distributions of the F scores and Matthews correlation coefficients for ligand-bound versus ligand-free structure performance show no statistically significant difference in structure type versus performance for most methods. Only Fpocket showed a statistically significant but low magnitude enhancement in performance for holo structures. Lastly, we found that most methods will succeed on some crystal structures and fail on others within the same protein family, despite all structures being relatively high-quality structures with low structural variation. We expected better consistency across varying protein conformations of the same sequence. Interestingly, the success or failure of a given structure cannot be predicted by quality metrics such as resolution, Cruickshank Diffraction Precision index, or unresolved residues. Cryptic sites were also examined.
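The two evaluation metrics named in the abstract above can be computed from confusion counts. The sketch below is illustrative only: it assumes the F score is the balanced F1 (the abstract does not state the weighting) and that predictions are scored per residue. The helper names and the toy residue sets are hypothetical.

```python
import math


def confusion_counts(predicted, actual, all_residues):
    """Count true/false positives and negatives over residue identifiers."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tn = len(set(all_residues) - predicted - actual)
    return tp, tn, fp, fn


def f_score(tp, fp, fn, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy protein of 20 residues: the true site is residues 3-6,
# and a hypothetical predictor returns residues 4-8.
all_residues = range(1, 21)
actual_site = {3, 4, 5, 6}
predicted_site = {4, 5, 6, 7, 8}

tp, tn, fp, fn = confusion_counts(predicted_site, actual_site, all_residues)
print(f"F1  = {f_score(tp, fp, fn):.3f}")   # F1  = 0.667
print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")   # MCC = 0.577
```

Unlike the F score, the MCC also counts true negatives, so the two can rank predictors differently when the binding site is a small fraction of the protein.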
30
|
Chaudhari R, Fong LW, Tan Z, Huang B, Zhang S. An up-to-date overview of computational polypharmacology in modern drug discovery. Expert Opin Drug Discov 2020; 15:1025-1044. [PMID: 32452701 PMCID: PMC7415563 DOI: 10.1080/17460441.2020.1767063] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2020] [Accepted: 05/06/2020] [Indexed: 12/30/2022]
Abstract
INTRODUCTION In recent years, computational polypharmacology has gained significant attention to study the promiscuous nature of drugs. Despite tremendous challenges, community-wide efforts have led to a variety of novel approaches for predicting drug polypharmacology. In particular, some rapid advances using machine learning and artificial intelligence have been reported with great success. AREAS COVERED In this article, the authors provide a comprehensive update on the current state-of-the-art polypharmacology approaches and their applications, focusing on those reports published after our 2017 review article. The authors particularly discuss some novel, groundbreaking concepts, and methods that have been developed recently and applied to drug polypharmacology studies. EXPERT OPINION Polypharmacology is evolving and novel concepts are being introduced to counter the current challenges in the field. However, major hurdles remain including incompleteness of high-quality experimental data, lack of in vitro and in vivo assays to characterize multi-targeting agents, shortage of robust computational methods, and challenges to identify the best target combinations and design effective multi-targeting agents. Fortunately, numerous national/international efforts including multi-omics and artificial intelligence initiatives as well as most recent collaborations on addressing the COVID-19 pandemic have shown significant promise to propel the field of polypharmacology forward.
Affiliation(s)
- Rajan Chaudhari
- Intelligent Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, United States
- Long Wolf Fong
- Intelligent Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, United States
- MD Anderson UTHealth Graduate School of Biomedical Sciences, 6767 Bertner Avenue, Houston, Texas 77030, United States
- Zhi Tan
- Intelligent Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, United States
- Beibei Huang
- Intelligent Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, United States
- Shuxing Zhang
- Intelligent Molecular Discovery Laboratory, Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, United States
- MD Anderson UTHealth Graduate School of Biomedical Sciences, 6767 Bertner Avenue, Houston, Texas 77030, United States
31
|
Trigueiro-Louro J, Correia V, Figueiredo-Nunes I, Gíria M, Rebelo-de-Andrade H. Unlocking COVID therapeutic targets: A structure-based rationale against SARS-CoV-2, SARS-CoV and MERS-CoV Spike. Comput Struct Biotechnol J 2020; 18:2117-2131. [PMID: 32913581 PMCID: PMC7452956 DOI: 10.1016/j.csbj.2020.07.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2020] [Revised: 07/20/2020] [Accepted: 07/22/2020] [Indexed: 12/11/2022] Open
Abstract
There are no approved target therapeutics against SARS-CoV-2 or other beta-CoVs. The beta-CoV Spike protein is a promising target considering the critical role in viral infection and pathogenesis and its surface exposed features. We performed a structure-based strategy targeting highly conserved druggable regions resulting from a comprehensive large-scale sequence analysis and structural characterization of Spike domains across SARSr- and MERSr-CoVs. We have disclosed 28 main consensus druggable pockets within the Spike. The RBD and SD1 (S1 subunit); and the CR, HR1 and CH (S2 subunit) represent the most promising conserved druggable regions. Additionally, we have identified 181 new potential hot spot residues for the hSARSr-CoVs and 72 new hot spot residues for the SARSr- and MERSr-CoVs, which have not been described before in the literature. These sites/residues exhibit advantageous structural features for targeted molecular and pharmacological modulation. This study establishes the Spike as a promising anti-CoV target using an approach with a potential higher resilience to resistance development and directed to a broad spectrum of Beta-CoVs, including the new SARS-CoV-2 responsible for COVID-19. This research also provides a structure-based rationale for the design and discovery of chemical inhibitors, antibodies or other therapeutic modalities successfully targeting the Beta-CoV Spike protein.
Key Words
- ACE2, angiotensin-converting enzyme2
- Bat-SL-CoVs, bat SARS-like coronavirus
- Beta-CoVs, betacoronavirus
- Betacoronavirus
- CC, conserved cluster
- CD, connector domain
- CDP, consensus druggable pocket
- CDR, consensus druggable residue
- CH, central helix
- CP, cytoplasmic domain
- CR, connecting region
- CS, conservation score
- CoVs, coronavirus
- Coronavirus disease
- DGSS, DoGSiteScorer
- DPP4, dipeptidyl peptidase-4
- Druggability prediction
- FP, fusion peptide
- HR1, heptad repeat 1
- HR2, heptad repeat 2
- MERS-CoVs, middle east respiratory syndrome coronavirus
- MERSr-CoVs, middle east respiratory syndrome-related coronavirus
- MSA, multiple sequence alignment
- NTD, N-terminal domain
- Novel antiviral targets
- PDB, Protein Data Bank
- PDS, PockDrug-Server
- RBD, Receptor-Binding Domain
- S, Spike
- SARS-CoV-2
- SARS-CoV-2, severe acute respiratory syndrome coronavirus 2
- SARS-CoVs, severe acute respiratory syndrome coronavirus
- SARSr-CoVs, severe acute respiratory syndrome-related coronavirus
- SD1, subdomain 1
- SD2, subdomain 2
- SF, SiteFinder from MOE
- SP, small pocket
- Sequence conservation
- Spike protein
- Sv, shorter variant
- T-RHS, top-ranked hot spots
- TMPRSS2, transmembrane protease serine 2
- aa, amino acid
- hSARSr-CoVs, human Severe acute respiratory syndrome-related coronavirus
- nts, nucleotides
Affiliation(s)
- João Trigueiro-Louro
- Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016 Lisbon, Portugal
- Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
- Vanessa Correia
- Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016 Lisbon, Portugal
- Inês Figueiredo-Nunes
- Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
- Marta Gíria
- Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
- Helena Rebelo-de-Andrade
- Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016 Lisbon, Portugal
- Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
32
|
Cao Y, Park SJ, Im W. A systematic analysis of protein-carbohydrate interactions in the Protein Data Bank. Glycobiology 2020; 31:126-136. [PMID: 32614943 DOI: 10.1093/glycob/cwaa062] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2020] [Revised: 06/26/2020] [Accepted: 06/26/2020] [Indexed: 12/17/2022] Open
Abstract
Protein-carbohydrate interactions underlie essential biological processes. Elucidating the mechanism of protein-carbohydrate recognition is a prerequisite for modeling and optimizing protein-carbohydrate interactions, which will help in discovery of carbohydrate-derived therapeutics. In this work, we present a survey of a curated database consisting of 6,402 protein-carbohydrate complexes in the Protein Data Bank (PDB). We performed an all-against-all comparison of a subset of nonredundant binding sites, and the result indicates that the interaction pattern similarity is not completely relevant to the binding site structural similarity. Investigation of both binding site and ligand promiscuities reveals that the geometry of chemical feature points is more important than local backbone structure in determining protein-carbohydrate interactions. A further analysis on the frequency and geometry of atomic interactions shows that carbohydrate functional groups are not equally involved in binding interactions. Finally, we discuss the usefulness of protein-carbohydrate complexes in the PDB with acknowledgement that the carbohydrates in many structures are incomplete.
Affiliation(s)
- Yiwei Cao
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Sciences and Engineering, Lehigh University, Bethlehem, PA 18015, USA
- Sang-Jun Park
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Sciences and Engineering, Lehigh University, Bethlehem, PA 18015, USA
- Wonpil Im
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Sciences and Engineering, Lehigh University, Bethlehem, PA 18015, USA; School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
33
|
Katigbak J, Li H, Rooklin D, Zhang Y. AlphaSpace 2.0: Representing Concave Biomolecular Surfaces Using β-Clusters. J Chem Inf Model 2020; 60:1494-1508. [PMID: 31995373 PMCID: PMC7093224 DOI: 10.1021/acs.jcim.9b00652] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Modern rational modulator design and structure-function characterization often concentrate on concave regions of biomolecular surfaces, ranging from well-defined small-molecule binding sites to large protein-protein interaction interfaces. Here, we introduce a β-cluster as a pseudomolecular representation of fragment-centric pockets detected by AlphaSpace [J. Chem. Inf. Model. 2015, 55, 1585], a recently developed computational analysis tool for topographical mapping of biomolecular concavities. By mimicking the shape as well as atomic details of potential molecular binders, this new β-cluster representation allows direct pocket-to-ligand shape comparison and can be used to guide ligand optimization. Furthermore, we defined the β-score, the optimal Vina score of the β-cluster, as an indicator of pocket ligandability and developed an ensemble β-cluster approach, which allows one-to-one pocket mapping and comparison among aligned protein structures. We demonstrated the utility of β-cluster representation by applying the approach to a wide variety of problems including binding site detection and comparison, characterization of protein-protein interactions, and fragment-based ligand optimization. These new β-cluster functionalities have been implemented in AlphaSpace 2.0, which is freely available on the web at http://www.nyu.edu/projects/yzhang/AlphaSpace2.
Affiliation(s)
- Joseph Katigbak
- Department of Chemistry, New York University, New York, New York 10003, United States
- Haotian Li
- Department of Chemistry, New York University, New York, New York 10003, United States
- David Rooklin
- Department of Chemistry, New York University, New York, New York 10003, United States
- Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
34
|
Ribeiro VS, Santana CA, Fassio AV, Cerqueira FR, da Silveira CH, Romanelli JPR, Patarroyo-Vargas A, Oliveira MGA, Gonçalves-Almeida V, Izidoro SC, de Melo-Minardi RC, Silveira SDA. visGReMLIN: graph mining-based detection and visualization of conserved motifs at 3D protein-ligand interface at the atomic level. BMC Bioinformatics 2020; 21:80. [PMID: 32164574 PMCID: PMC7068867 DOI: 10.1186/s12859-020-3347-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
Background Interactions between proteins and non-proteic small molecule ligands play important roles in the biological processes of living systems. Thus, the development of computational methods to support our understanding of the ligand-receptor recognition process is of fundamental importance since these methods are a major step towards ligand prediction, target identification, lead discovery, and more. This article presents visGReMLIN, a web server that couples a graph mining-based strategy to detect motifs at the protein-ligand interface with an interactive platform to visually explore and interpret these motifs in the context of protein-ligand interfaces. Results To illustrate the potential of visGReMLIN, we conducted two cases in which our strategy was compared with previous experimentally and computationally determined results. visGReMLIN allowed us to detect patterns previously documented in the literature in a totally visual manner. In addition, we found some motifs that we believe are relevant to protein-ligand interactions in the analyzed datasets. Conclusions We aimed to build a visual analytics-oriented web server to detect and visualize common motifs at the protein-ligand interface. visGReMLIN motifs can support users in gaining insights on the key atoms/residues responsible for protein-ligand interactions in a dataset of complexes.
Affiliation(s)
- Vagner S Ribeiro
- Department of Computer Science, Universidade Federal de Viçosa, Viçosa, 36570-900, Brazil
- Charles A Santana
- Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Alexandre V Fassio
- Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Fabio R Cerqueira
- Department of Production Engineering, Universidade Federal Fluminense, Petrópolis, 25650-050, Brazil
- Carlos H da Silveira
- Department of Computer Engineering, Advanced Campus at Itabira, Universidade Federal de Itajubá, Itabira, 35903-087, Brazil
- João P R Romanelli
- Department of Computer Engineering, Advanced Campus at Itabira, Universidade Federal de Itajubá, Itabira, 35903-087, Brazil
- Adriana Patarroyo-Vargas
- Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa, 36570-900, Brazil
- Maria G A Oliveira
- Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa, 36570-900, Brazil; Instituto de Biotecnologia aplicada à Agropecuária (BIOAGRO), Universidade Federal de Viçosa, Viçosa, 36570-900, Brazil
- Valdete Gonçalves-Almeida
- Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Sandro C Izidoro
- Department of Computer Engineering, Advanced Campus at Itabira, Universidade Federal de Itajubá, Itabira, 35903-087, Brazil
- Raquel C de Melo-Minardi
- Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Sabrina de A Silveira
- Department of Computer Science, Universidade Federal de Viçosa, Viçosa, 36570-900, Brazil; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK
35
|
Simonovsky M, Meyers J. DeeplyTough: Learning Structural Comparison of Protein Binding Sites. J Chem Inf Model 2020; 60:2356-2366. [DOI: 10.1021/acs.jcim.9b00554] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Affiliation(s)
- Martin Simonovsky
- BenevolentAI, London W1T 5HD, U.K
- École des Ponts ParisTech, Champs sur Marne 77455, France
- Université Paris-Est, Champs sur Marne 77455, France
36
|
On the possible origin of protein homochirality, structure, and biochemical function. Proc Natl Acad Sci U S A 2019; 116:26571-26579. [PMID: 31822617 DOI: 10.1073/pnas.1908241116] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open
Abstract
Living systems have chiral molecules, e.g., native proteins that almost entirely contain L-amino acids. How protein homochirality emerged from a background of equal numbers of L and D amino acids is among many questions about life's origin. The origin of homochirality and its implications are explored in computer simulations examining the stability and structural and functional properties of an artificial library of compact proteins containing 1:1 (termed demi-chiral), 3:1, and 1:3 ratios of D:L and purely L or D amino acids generated without functional selection. Demi-chiral proteins have shorter secondary structures and fewer internal hydrogen bonds and are less stable than homochiral proteins. Selection for hydrogen bonding yields a preponderance of L or D amino acids. Demi-chiral proteins have native global folds, including similarity to early ribosomal proteins, similar small molecule ligand binding pocket geometries, and many constellations of L-chiral amino acids with a 1.0-Å RMSD to native enzyme active sites. For a representative subset containing 550 active site geometries matching 457 (2) 4-digit (3-digit) enzyme classification (E.C.) numbers, native active site amino acids were generated at random for 472 of 550 cases. This increases to 548 of 550 cases when similar residues are allowed. The most frequently generated sequences correspond to ancient enzymatic functions, e.g., glycolysis, replication, and nucleotide biosynthesis. Surprisingly, even without selection, demi-chiral proteins possess the requisite marginal biochemical function and structure of modern proteins, but were thermodynamically less stable. If demi-chiral proteins were present, they could engage in early metabolism, which created the feedback loop for transcription and cell formation.
37
|
Naderi M, Lemoine JM, Govindaraj RG, Kana OZ, Feinstein WP, Brylinski M. Binding site matching in rational drug design: algorithms and applications. Brief Bioinform 2019; 20:2167-2184. [PMID: 30169563 PMCID: PMC6954434 DOI: 10.1093/bib/bby078] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2018] [Revised: 07/18/2018] [Accepted: 07/29/2018] [Indexed: 01/06/2023] Open
Abstract
Interactions between proteins and small molecules are critical for biological functions. These interactions often occur in small cavities within protein structures, known as ligand-binding pockets. Understanding the physicochemical qualities of binding pockets is essential to improve not only our basic knowledge of biological systems, but also drug development procedures. In order to quantify similarities among pockets in terms of their geometries and chemical properties, either bound ligands can be compared to one another or binding sites can be matched directly. Both perspectives routinely take advantage of computational methods including various techniques to represent and compare small molecules as well as local protein structures. In this review, we survey 12 tools widely used to match pockets. These methods are divided into five categories based on the algorithm implemented to construct binding-site alignments. In addition to the comprehensive analysis of their algorithms, test sets and the performance of each method are described. We also discuss general pharmacological applications of computational pocket matching in drug repurposing, polypharmacology and side effects. Reflecting on the importance of these techniques in drug discovery, in the end, we elaborate on the development of more accurate meta-predictors, the incorporation of protein flexibility and the integration of powerful artificial intelligence technologies such as deep learning.
Affiliation(s)
- Misagh Naderi
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Jeffrey Mitchell Lemoine
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Division of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Omar Zade Kana
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Wei Pan Feinstein
- High-Performance Computing, Louisiana State University, Baton Rouge, LA 70803, USA
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
|
38
|
Guterres H, Lee HS, Im W. Ligand-Binding-Site Structure Refinement Using Molecular Dynamics with Restraints Derived from Predicted Binding Site Templates. J Chem Theory Comput 2019; 15:6524-6535. [PMID: 31557013 DOI: 10.1021/acs.jctc.9b00751] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
Accurate modeling of ligand-binding-site structures plays a critical role in structure-based virtual screening. However, the structures of the ligand-binding site in most predicted protein models are generally of low quality and need refinements. In this work, we present a ligand-binding-site structure refinement protocol using molecular dynamics simulation with restraints derived from predicted binding site templates. Our benchmark validation shows great performance for 40 diverse sets of proteins from the Astex list. The ligand-binding sites on modeled protein structures are consistently refined using our method with an average Cα RMSD improvement of 0.90 Å. Comparison of ligand binding modes from ligand docking to initial unrefined and refined structures shows an average of 1.97 Å RMSD improvement in the refined structures. These results demonstrate a promising new method of structure refinement for protein ligand-binding-site structures.
Affiliation(s)
- Hugo Guterres
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Hui Sun Lee
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Wonpil Im
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
|
39
|
Rifaioglu AS, Atas H, Martin MJ, Cetin-Atalay R, Atalay V, Doğan T. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform 2019; 20:1878-1912. [PMID: 30084866 PMCID: PMC6917215 DOI: 10.1093/bib/bby061] [Citation(s) in RCA: 267] [Impact Index Per Article: 44.5] [Received: 01/25/2018] [Revised: 05/25/2018] [Indexed: 01/16/2023]
Abstract
The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches are insufficient to explore novel drug-target interactions, mainly because of feasibility problems, as they are labour intensive, costly and time consuming. A computational field known as 'virtual screening' (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. These methods use the physico-chemical and structural properties of compounds and/or target proteins along with the experimentally verified bio-interaction information to generate predictive models. Lately, sophisticated machine learning techniques have been applied in VS to improve predictive performance. The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented number of research studies considering the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges, and suggest future directions. We believe that this survey will provide insight to researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
Affiliation(s)
- Ahmet Sureyya Rifaioglu
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey
- Heval Atas
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Maria Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
- Rengul Cetin-Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Volkan Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Tunca Doğan
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey and European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
|
40
|
Cerisier N, Petitjean M, Regad L, Bayard Q, Réau M, Badel A, Camproux AC. High Impact: The Role of Promiscuous Binding Sites in Polypharmacology. Molecules 2019; 24:2529. [PMID: 31295958 PMCID: PMC6680532 DOI: 10.3390/molecules24142529] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/03/2019] [Revised: 06/27/2019] [Accepted: 06/27/2019] [Indexed: 02/06/2023]
Abstract
The literature focuses on drug promiscuity, which is a drug’s ability to bind to several targets, because it plays an essential role in polypharmacology. However, little work has been completed regarding binding site promiscuity, even though its properties are now recognized among the key factors that impact drug promiscuity. Here, we quantified and characterized the promiscuity of druggable binding sites from protein-ligand complexes in the high quality Mother Of All Databases while using statistical methods. Most of the sites (80%) exhibited promiscuity, irrespective of the protein class. Nearly half were highly promiscuous and able to interact with various types of ligands. The corresponding pockets were rather large and hydrophobic, with high sulfur atom and aliphatic residue frequencies, but few side chain atoms. Consequently, their interacting ligands can be large, rigid, and weakly hydrophilic. The selective sites that interacted with one ligand type presented less favorable pocket properties for establishing ligand contacts. Thus, their ligands were highly adaptable, small, and hydrophilic. In the dataset, the promiscuity of the site rather than the drug mainly explains the multiple interactions between the drug and target, as most ligand types are dedicated to one site. This underlines the essential contribution of binding site promiscuity to drug promiscuity between different protein classes.
Affiliation(s)
- Natacha Cerisier
- Université de Paris, Biologie Fonctionnelle et Adaptative, UMR 8251, CNRS, ERL U1133, INSERM, Computational Modeling of Protein Ligand Interactions, F-75013 Paris, France
- Michel Petitjean
- Université de Paris, Biologie Fonctionnelle et Adaptative, UMR 8251, CNRS, ERL U1133, INSERM, Computational Modeling of Protein Ligand Interactions, F-75013 Paris, France
- Leslie Regad
- Université de Paris, Biologie Fonctionnelle et Adaptative, UMR 8251, CNRS, ERL U1133, INSERM, Computational Modeling of Protein Ligand Interactions, F-75013 Paris, France
- Quentin Bayard
- Centre de Recherche des Cordeliers, Sorbonne Universités, INSERM, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Functional Genomics of Solid Tumors Laboratory, F-75006 Paris, France
- Manon Réau
- Laboratoire Génomique Bioinformatique et Chimie Moléculaire, EA 7528, Conservatoire National des Arts et Métiers, F-75003 Paris, France
- Anne Badel
- Université de Paris, Biologie Fonctionnelle et Adaptative, UMR 8251, CNRS, ERL U1133, INSERM, Computational Modeling of Protein Ligand Interactions, F-75013 Paris, France
- Anne-Claude Camproux
- Université de Paris, Biologie Fonctionnelle et Adaptative, UMR 8251, CNRS, ERL U1133, INSERM, Computational Modeling of Protein Ligand Interactions, F-75013 Paris, France
|
41
|
Engineering brain activity patterns by neuromodulator polytherapy for treatment of disorders. Nat Commun 2019; 10:2620. [PMID: 31197165 PMCID: PMC6565674 DOI: 10.1038/s41467-019-10541-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/24/2018] [Accepted: 05/15/2019] [Indexed: 11/08/2022]
Abstract
Conventional drug screens and treatments often ignore the underlying complexity of brain network dysfunctions, resulting in suboptimal outcomes. Here we ask whether we can correct abnormal functional connectivity of the entire brain by identifying and combining multiple neuromodulators that perturb connectivity in complementary ways. Our approach avoids the combinatorial complexity of screening all drug combinations. We develop a high-speed platform capable of imaging more than 15,000 neurons in 50 ms to map the entire brain functional connectivity in large numbers of vertebrates under many conditions. Screening a panel of drugs in a zebrafish model of human Dravet syndrome, we show that even drugs with related mechanisms of action can modulate functional connectivity in significantly different ways. By clustering connectivity fingerprints, we algorithmically select small subsets of complementary drugs and rapidly identify combinations that are significantly more effective at correcting abnormal networks and reducing spontaneous seizures than monotherapies, while minimizing behavioral side effects. Even at low concentrations, our polytherapy outperforms individual drugs at their highest tolerated concentrations. Brain disorders are associated with network dysfunctions that are not addressed by conventional drug screens. Here, the authors use high-throughput functional imaging of brain activity in zebrafish larvae to study the effects of individual drugs on network connectivity and demonstrate an algorithm that predicts the most effective drug combinations to normalize both the activity patterns and the animal behavior.
|
42
|
Ehrt C, Brinkjost T, Koch O. Binding site characterization - similarity, promiscuity, and druggability. MedChemComm 2019; 10:1145-1159. [PMID: 31391887 DOI: 10.1039/c9md00102f] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/19/2019] [Accepted: 05/31/2019] [Indexed: 12/19/2022]
Abstract
The elucidation of non-obvious binding site similarities has provided useful indications for the establishment of polypharmacology, the identification of potential off-targets, or the repurposing of known drugs. The concept underlying all of these approaches is promiscuous binding which can be analyzed from a ligand-based or a binding site-based perspective. Herein, we applied methods for the automated analysis and comparison of protein binding sites to study promiscuous binding on a novel dataset of sites in complex with ligands sharing common shape and physicochemical properties. We show the suitability of this dataset for the benchmarking of novel binding site comparison methods. Our investigations also reveal promising directions for further in-depth analyses of promiscuity and druggability in a pocket-centered manner. Drawbacks concerning binding site similarity assessment and druggability prediction are outlined, enabling researchers to avoid the typical pitfalls of binding site analyses.
Affiliation(s)
- Christiane Ehrt
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
- Tobias Brinkjost
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
- Department of Computer Science, TU Dortmund University, Dortmund, Germany
- Oliver Koch
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
|
43
|
Updates to Binding MOAD (Mother of All Databases): Polypharmacology Tools and Their Utility in Drug Repurposing. J Mol Biol 2019; 431:2423-2433. [PMID: 31125569 DOI: 10.1016/j.jmb.2019.05.024] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 02/04/2019] [Revised: 05/13/2019] [Accepted: 05/14/2019] [Indexed: 01/02/2023]
Abstract
The goal of Binding MOAD is to provide users with a data set focused on high-quality x-ray crystal structures that have been solved with biologically relevant ligands bound. Where available, experimental binding affinities (Ka, Kd, Ki, IC50) are provided from the primary literature of the crystal structure. The database has been updated regularly since 2005, and this most recent update has added nearly 7000 new structures (growth of 21%). MOAD currently contains 32,747 structures, composed of 9117 protein families and 16,044 unique ligands. The data are freely available on www.BindingMOAD.org. This paper outlines updates to the data in Binding MOAD as well as improvements made to both the website and its contents. The NGL viewer has been added to improve visualization of the ligands and protein structures. MarvinJS has been implemented, over the outdated MarvinView, to work with JChem for small molecule searching in the database. To add tools for predicting polypharmacology, we have added information about sequence, binding-site, and ligand similarity between entries in the database. A main premise behind polypharmacology is that similar binding sites will bind similar ligands. The large amount of protein-ligand information available in Binding MOAD allows us to compute pairwise ligand and binding-site similarities. Lists of similar ligands and similar binding sites have been added to allow users to identify potential polypharmacology pairs. To show the utility of the polypharmacology data, we detail a few examples from Binding MOAD of drug repurposing targets with their respective similarities.
|
44
|
Trigueiro-Louro JM, Correia V, Santos LA, Guedes RC, Brito RMM, Rebelo-de-Andrade H. To hit or not to hit: Large-scale sequence analysis and structure characterization of influenza A NS1 unlocks new antiviral target potential. Virology 2019; 535:297-307. [PMID: 31104825 DOI: 10.1016/j.virol.2019.04.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 02/19/2019] [Revised: 04/22/2019] [Accepted: 04/23/2019] [Indexed: 12/13/2022]
Abstract
Influenza NS1 protein is among the most promising novel druggable anti-influenza targets, based on its structure, multiple interactions, and global function in influenza replication and pathogenesis. Notwithstanding, drug development guidance based on NS1 structural biology is lacking. Here, we design a promising strategy directed to highly conserved druggable regions as a result of an exhaustive large-scale sequence analysis and structure characterization of NS1 protein across human-infecting influenza A subtypes over the past 100 years. We have identified 3 druggable pockets and 8 new potential hot spot residues in the NS1 protein, not described before, in addition to 16 other sites previously identified, which represent attractive targets for pharmacological modulation. This study provides a rationale for structure-function studies of NS1 druggable sites, which have the potential to accelerate NS1 target validation. This research also contributes to a deeper understanding of the evolutionary dynamics of influenza A NS1 protein.
Affiliation(s)
- João M Trigueiro-Louro
- Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003, Lisbon, Portugal; Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016, Lisbon, Portugal
- Vanessa Correia
- Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016, Lisbon, Portugal
- Luís A Santos
- Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016, Lisbon, Portugal
- Rita C Guedes
- Medicinal Chemistry Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003, Lisbon, Portugal
- Rui M M Brito
- Chemistry Department and Coimbra Chemistry Centre, Faculty of Science and Technology, University of Coimbra, 3004-535, Coimbra, Portugal
- Helena Rebelo-de-Andrade
- Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003, Lisbon, Portugal; Antiviral Resistance Lab, Research & Development Unit, Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016, Lisbon, Portugal
|
45
|
Kaiser F, Labudde D. Unsupervised Discovery of Geometrically Common Structural Motifs and Long-Range Contacts in Protein 3D Structures. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:671-680. [PMID: 29990265 DOI: 10.1109/tcbb.2017.2786250] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
The essential role of small evolutionarily conserved structural units in proteins has been extensively researched and validated. A popular example is the serine proteases, where the peptide cleavage reaction is realized by a configuration of only three residues. Brought into spatial proximity during the protein folding process, such structural motifs are often long-range contacts and usually hard to detect at the sequence level. Due to the constantly increasing resource of protein 3D structure data, the computational identification of structural motifs can contribute significantly to the understanding of protein fold and function. Thus, we propose a method to discover structural motifs of high geometrical similarity and desired sequence separation in protein 3D structure data. By utilizing methods originating from data mining, no a priori knowledge is required. The applicability of the method is demonstrated by the identification of the catalytic unit of serine proteases and the ion-coordination center of cupredoxins. Furthermore, large-scale analysis of the entire Protein Data Bank points towards the presence of ubiquitous structural motifs, independent of any specific fold or function. We envision that our method is suitable to uncover functional mechanisms and to derive fingerprint libraries of structural motifs, which could be used to assess protein family association.
|
46
|
Bhagavat R, Sankar S, Srinivasan N, Chandra N. An Augmented Pocketome: Detection and Analysis of Small-Molecule Binding Pockets in Proteins of Known 3D Structure. Structure 2019. [PMID: 29514079 DOI: 10.1016/j.str.2018.02.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 12/22/2022]
Abstract
Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets.
Affiliation(s)
- Raghu Bhagavat
- National Mathematics Initiative, Indian Institute of Science, Bangalore 560012, India
- Santhosh Sankar
- Department of Biochemistry, Indian Institute of Science, Bangalore 560012, India
- Narayanaswamy Srinivasan
- National Mathematics Initiative, Indian Institute of Science, Bangalore 560012, India; Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- Nagasuma Chandra
- National Mathematics Initiative, Indian Institute of Science, Bangalore 560012, India; Department of Biochemistry, Indian Institute of Science, Bangalore 560012, India
|
47
|
Yamaotsu N, Hirono S. In silico fragment-mapping method: a new tool for fragment-based/structure-based drug discovery. J Comput Aided Mol Des 2018; 32:1229-1245. [PMID: 30196523 DOI: 10.1007/s10822-018-0160-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/19/2018] [Accepted: 09/04/2018] [Indexed: 01/09/2023]
Abstract
Here, we propose an in silico fragment-mapping method as a potential tool for fragment-based/structure-based drug discovery (FBDD/SBDD). For this method, we created a database named Canonical Subsite-Fragment DataBase (CSFDB) and developed a knowledge-based fragment-mapping program, Fsubsite. CSFDB consists of various pairs of subsite-fragments derived from X-ray crystal structures of known protein-ligand complexes. Using three-dimensional similarity-matching between subsites on one protein and another, Fsubsite compares the surface of a target protein with all subsites in CSFDB. When a local topography similar to the subsite is found on the surface, Fsubsite places a fragment combined with the subsite in CSFDB on the target protein. For validation purposes, we applied the method to the apo-structure of cyclin-dependent kinase 2 (CDK2) and identified four compounds containing three mapped fragments that existed in the list of known inhibitors of CDK2. Next, the utility of our fragment-mapping method for fragment-growing was examined on the complex structure of tRNA-guanine transglycosylase with a small ligand. Fsubsite mapped appropriate fragments on the same position as the binding ligand or in the vicinity of the ligand. Finally, a 3D-pharmacophore model was constructed from the fragments mapped on the apo-structure of heat shock protein 90-α (HSP90α). Then, 3D pharmacophore-based virtual screening was carried out using a commercially available compound database. The resultant hit compounds were very similar to a known ligand of HSP90α. As a result of these findings, this in silico fragment-mapping method seems to be a useful tool for computational FBDD and SBDD.
Affiliation(s)
- Noriyuki Yamaotsu
- Department of Pharmaceutical Sciences, School of Pharmacy, Kitasato University, 5-9-1 Shirokane, Minato-ku, Tokyo, 108-8641, Japan
- Shuichi Hirono
- Department of Pharmaceutical Sciences, School of Pharmacy, Kitasato University, 5-9-1 Shirokane, Minato-ku, Tokyo, 108-8641, Japan
|
48
|
Correia V, Abecasis AB, Rebelo-de-Andrade H. Molecular footprints of selective pressure in the neuraminidase gene of currently circulating human influenza subtypes and lineages. Virology 2018; 522:122-130. [PMID: 30029011 DOI: 10.1016/j.virol.2018.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/17/2017] [Revised: 07/03/2018] [Accepted: 07/04/2018] [Indexed: 12/20/2022]
Abstract
Influenza neuraminidase (NA) is under selective pressure (SP) from both the host immune system and drug use. Here, we assembled large datasets of NA sequences of worldwide circulating viruses to estimate the global and site-specific SP acting on all current subtypes/lineages of human influenza NA. An overall negative SP of similar magnitude and a prevalence of negatively selected sites were observed for all subtypes/lineages. Positively selected sites varied according to the subtype/lineage, including N1-NA sites 247 and 275, N2-NA sites 148 and 151, and B/Victoria-NA site 395, associated with drug resistance or reduced susceptibility. These results provide evidence of a potential role of positive selection in the low-level spread of A(H1N1)pdm09-H275Y drug-resistant viruses, and point to a potentially higher risk of spread of a synergistic A(H1N1)pdm09 drug-resistant variant (H275Y/S247N). The positive selection detected at N2-NA sites 148 and 151 was probably an artefact of cell culture. Overall mapping revealed six potential new druggable regions.
Affiliation(s)
- Vanessa Correia
- Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016 Lisbon, Portugal; Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
- Ana B Abecasis
- Global Health and Tropical Medicine, Instituto de Higiene e Medicina Tropical, Universidade NOVA de Lisboa, Rua da Junqueira 100, 1349-008 Lisbon, Portugal
- Helena Rebelo-de-Andrade
- Infectious Diseases Department, Instituto Nacional de Saúde Doutor Ricardo Jorge, IP, Av. Padre Cruz, 1649-016 Lisbon, Portugal; Host-Pathogen Interaction Unit, Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
|
49
|
Budowski-Tal I, Kolodny R, Mandel-Gutfreund Y. A Novel Geometry-Based Approach to Infer Protein Interface Similarity. Sci Rep 2018; 8:8192. [PMID: 29844500 PMCID: PMC5974305 DOI: 10.1038/s41598-018-26497-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/24/2017] [Accepted: 05/10/2018] [Indexed: 11/21/2022]
Abstract
The protein interface is key to understanding protein function, providing vital insight into how proteins interact with each other and with other molecules. Over the years, many computational methods to compare protein structures have been developed, yet evaluating interface similarity remains a very difficult task. Here, we present PatchBag, a geometry-based method for efficient comparison of protein surfaces and interfaces. PatchBag is a Bag-Of-Words approach, which represents complex objects as vectors, enabling highly efficient searches for interface similarity. Using a novel framework for evaluating interface similarity, we show that PatchBag performance is comparable to state-of-the-art alignment-based structural comparison methods. The great advantage of PatchBag is that it does not rely on sequence or fold information, thus enabling the detection of similarities between interfaces in unrelated proteins. We propose that PatchBag can help reveal novel evolutionary and functional relationships between protein interfaces.
Affiliation(s)
- Inbal Budowski-Tal
- Faculty of Biology, Technion, Israel Institute of Technology, Haifa, 3200003, Israel
- Department of Computer Science, University of Haifa, Mount Carmel, Haifa, 3498838, Israel
- Rachel Kolodny
- Department of Computer Science, University of Haifa, Mount Carmel, Haifa, 3498838, Israel
- Yael Mandel-Gutfreund
- Faculty of Biology, Technion, Israel Institute of Technology, Haifa, 3200003, Israel
|
50
|
Govindaraj RG, Brylinski M. Comparative assessment of strategies to identify similar ligand-binding pockets in proteins. BMC Bioinformatics 2018. [PMID: 29523085 PMCID: PMC5845264 DOI: 10.1186/s12859-018-2109-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 12/01/2022]
Abstract
Background: Detecting similar ligand-binding sites in globally unrelated proteins has a wide range of applications in modern drug discovery, including drug repurposing, the prediction of side effects, and drug-target interactions. Although a number of techniques to compare binding pockets have been developed, this problem still poses significant challenges. Results: We evaluate the performance of three algorithms to calculate similarities between ligand-binding sites, APoc, SiteEngine, and G-LoSA. Our assessment considers not only the capabilities to identify similar pockets and to construct accurate local alignments, but also the dependence of these alignments on the sequence order. We point out certain drawbacks of previously compiled datasets, such as the inclusion of structurally similar proteins, leading to an overestimated performance. To address these issues, a rigorous procedure to prepare unbiased, high-quality benchmarking sets is proposed. Further, we conduct a comparative assessment of techniques directly aligning binding pockets to indirect strategies employing structure-based virtual screening with AutoDock Vina and rDock. Conclusions: Thorough benchmarks reveal that G-LoSA offers a fairly robust overall performance, whereas the accuracy of APoc and SiteEngine is satisfactory only against easy datasets. Moreover, combining various algorithms into a meta-predictor improves the performance of existing methods to detect similar binding sites in unrelated proteins by 5–10%. All data reported in this paper are freely available at https://osf.io/6ngbs/.
Affiliation(s)
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, USA
|