1
Xia R, Li W, Cheng Y, Xie L, Xu X. Molecular surfaces modeling: Advancements in deep learning for molecular interactions and predictions. Biochem Biophys Res Commun 2025; 763:151799. [PMID: 40239539 DOI: 10.1016/j.bbrc.2025.151799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2025] [Revised: 03/20/2025] [Accepted: 04/10/2025] [Indexed: 04/18/2025]
Abstract
Molecular surface analysis can provide a high-dimensional, rich representation of molecular properties and interactions, which is crucial for enabling powerful predictive modeling and rational molecular design across diverse scientific and technological domains. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in accelerating molecular discovery and innovation. The integration of AI techniques with molecular surface analysis has opened up new frontiers, allowing researchers to uncover hidden patterns, relationships, and design principles that were previously elusive. By leveraging the complementary strengths of molecular surface representations and advanced AI algorithms, scientists can now explore chemical space more efficiently, optimize molecular properties with greater precision, and drive transformative advancements in areas like drug development, materials engineering, and catalysis. In this review, we aim to provide an overview of recent advancements in the field of molecular surface analysis and its integration with AI techniques. These AI-driven approaches have led to significant advancements in various downstream tasks, including interface site prediction, protein-protein interaction prediction, surface-centric molecular generation and design.
Affiliation(s)
- Renjie Xia
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Wei Li
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Yi Cheng
  - College of Engineering, Lishui University, Lishui, 323000, China
- Liangxu Xie
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
2
Carpenter KA, Altman RB. Databases of ligand-binding pockets and protein-ligand interactions. Comput Struct Biotechnol J 2024; 23:1320-1338. [PMID: 38585646 PMCID: PMC10997877 DOI: 10.1016/j.csbj.2024.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 03/16/2024] [Accepted: 03/17/2024] [Indexed: 04/09/2024] Open
Abstract
Many research groups and institutions have created a variety of databases curating experimental and predicted data related to protein-ligand binding. The landscape of available databases is dynamic, with new databases emerging and established databases becoming defunct. Here, we review the current state of databases that contain binding pockets and protein-ligand binding interactions. We have compiled a list of such databases, fifty-three of which are currently available for use. We discuss variation in how binding pockets are defined and summarize pocket-finding methods. We organize the fifty-three databases into subgroups based on goals and contents, and describe standard use cases. We also illustrate that pockets within the same protein are characterized differently across different databases. Finally, we assess critical issues of sustainability, accessibility and redundancy.
Affiliation(s)
- Kristy A. Carpenter
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Medicine, Stanford University, Stanford, CA 94305, USA
3
Ohno S, Manabe N, Yamaguchi Y. Prediction of protein structure and AI. J Hum Genet 2024; 69:477-480. [PMID: 38177398 DOI: 10.1038/s10038-023-01215-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/18/2023] [Accepted: 12/10/2023] [Indexed: 01/06/2024]
Abstract
AlphaFold, an artificial intelligence (AI)-based tool for predicting the 3D structure of proteins, is now widely recognized for its high accuracy and versatility in the folding of human proteins. AlphaFold is useful for understanding structure-function relationships from protein 3D structure models and can serve as a template or a reference for experimental structural analysis including X-ray crystallography, NMR and cryo-EM analysis. Its use is expanding among researchers, not only in structural biology but also in other research fields. Researchers are currently exploring the full potential of AlphaFold-generated protein models. Predicting disease severity caused by missense mutations is one such application. This article provides an overview of the 3D structural modeling of AlphaFold based on deep learning techniques and highlights the challenges in predicting the pathogenicity of missense mutations.
Affiliation(s)
- Shiho Ohno
  - Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
- Noriyoshi Manabe
  - Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
- Yoshiki Yamaguchi
  - Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
4
Comajuncosa-Creus A, Jorba G, Barril X, Aloy P. Comprehensive detection and characterization of human druggable pockets through binding site descriptors. Nat Commun 2024; 15:7917. [PMID: 39256431 PMCID: PMC11387482 DOI: 10.1038/s41467-024-52146-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 08/27/2024] [Indexed: 09/12/2024] Open
Abstract
Druggable pockets are protein regions that have the ability to bind organic small molecules, and their characterization is essential in target-based drug discovery. However, deriving pocket descriptors is challenging and existing strategies are often limited in applicability. We introduce PocketVec, an approach to generate pocket descriptors via inverse virtual screening of lead-like molecules. PocketVec performs comparably to leading methodologies while addressing key limitations. Additionally, we systematically search for druggable pockets in the human proteome, using experimentally determined structures and AlphaFold2 models, identifying over 32,000 binding sites across 20,000 protein domains. We then generate PocketVec descriptors for each site and conduct an extensive similarity search, exploring over 1.2 billion pairwise comparisons. Our results reveal druggable pocket similarities not detected by structure- or sequence-based methods, uncovering clusters of similar pockets in proteins lacking crystallized inhibitors and opening the door to strategies for prioritizing chemical probe development to explore the druggable space.
Affiliation(s)
- Arnau Comajuncosa-Creus
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Guillem Jorba
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Xavier Barril
  - Facultat de Farmàcia and Institut de Biomedicina, Universitat de Barcelona, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
- Patrick Aloy
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
5
Liu JX, Zhang X, Huang YQ, Hao GF, Yang GF. Multi-level bioinformatics resources support drug target discovery of protein-protein interactions. Drug Discov Today 2024; 29:103979. [PMID: 38608830 DOI: 10.1016/j.drudis.2024.103979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/14/2024] [Accepted: 04/05/2024] [Indexed: 04/14/2024]
Abstract
Drug discovery often begins with a new target. Protein-protein interactions (PPIs) are crucial to numerous cellular processes and offer a promising avenue for drug-target discovery. PPIs are characterized by multi-level complexity: at the protein level, interaction networks can be used to identify potential targets, whereas at the residue level, the details of the interactions of individual PPIs can be used to examine a target's druggability. Great progress has been made in target discovery through multi-level PPI-related computational approaches, but these resources have not been fully discussed. Here, we systematically survey bioinformatics tools for identifying and assessing potential drug targets, examining their characteristics, limitations and applications. This work will aid the integration of the broader protein-to-network context with the analysis of detailed binding mechanisms to support the discovery of drug targets.
Affiliation(s)
- Jia-Xin Liu
  - National Key Laboratory of Green Pesticide, Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, PR China
- Xiao Zhang
  - State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, PR China
- Yuan-Qin Huang
  - State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, PR China
- Ge-Fei Hao
  - National Key Laboratory of Green Pesticide, Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, PR China
  - State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, PR China
- Guang-Fu Yang
  - National Key Laboratory of Green Pesticide, Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, PR China
6
Tsuchiya Y, Yonezawa T, Yamamori Y, Inoura H, Osawa M, Ikeda K, Tomii K. PoSSuM v.3: A Major Expansion of the PoSSuM Database for Finding Similar Binding Sites of Proteins. J Chem Inf Model 2023; 63:7578-7587. [PMID: 38016694 PMCID: PMC10716853 DOI: 10.1021/acs.jcim.3c01405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/28/2023] [Accepted: 11/01/2023] [Indexed: 11/30/2023]
Abstract
Information on structures of protein-ligand complexes, including comparisons of known and putative protein-ligand-binding pockets, is valuable for protein annotation and drug discovery and development. To facilitate biomedical and pharmaceutical research, we developed PoSSuM (https://possum.cbrc.pj.aist.go.jp/PoSSuM/), a database for identifying similar binding pockets in proteins. The current PoSSuM database includes 191 million similar pairs among almost 10 million identified pockets. PoSSuM drug search (PoSSuMds) is a resource for investigating ligand and receptor diversity among a set of pockets that can bind to an approved drug compound. The enhanced PoSSuMds covers pockets associated with both approved drugs and drug candidates in clinical trials from the latest release of ChEMBL. Additionally, we developed two new databases: PoSSuMAg for investigating antibody-antigen interactions and PoSSuMAF to simplify exploring putative pockets in AlphaFold human protein models.
Affiliation(s)
- Yuko Tsuchiya
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Tomoki Yonezawa
  - Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
- Yu Yamamori
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Hiroko Inoura
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Masanori Osawa
  - Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
- Kazuyoshi Ikeda
  - Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
  - Medicinal Chemistry Applied AI Unit, HPC- and AI-driven Drug Development Platform Division, RIKEN Center for Computational Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
7
Stevenson GA, Kirshner D, Bennion BJ, Yang Y, Zhang X, Zemla A, Torres MW, Epstein A, Jones D, Kim H, Bennett WFD, Wong SE, Allen JE, Lightstone FC. Clustering Protein Binding Pockets and Identifying Potential Drug Interactions: A Novel Ligand-Based Featurization Method. J Chem Inf Model 2023; 63:6655-6666. [PMID: 37847557 PMCID: PMC10647021 DOI: 10.1021/acs.jcim.3c00722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Indexed: 10/18/2023]
Abstract
Protein-ligand interactions are essential to drug discovery and drug development efforts. Desirable on-target or multitarget interactions are the first step in finding an effective therapeutic, while undesirable off-target interactions are the first step in assessing safety. In this work, we introduce a novel ligand-based featurization and mapping of human protein pockets to identify closely related protein targets and to project novel drugs into a hybrid protein-ligand feature space to identify their likely protein interactions. Using structure-based template matches from PDB, protein pockets are featured by the ligands that bind to their best co-complex template matches. The simplicity and interpretability of this approach provide a granular characterization of the human proteome at the protein-pocket level instead of the traditional protein-level characterization by family, function, or pathway. We demonstrate the power of this featurization method by clustering a subset of the human proteome and evaluating the predicted cluster associations of over 7000 compounds.
Affiliation(s)
- Garrett A. Stevenson
  - Computational Engineering Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Dan Kirshner
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Brian J. Bennion
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Yue Yang
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Xiaohua Zhang
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Adam Zemla
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Marisa W. Torres
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Aidan Epstein
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Derek Jones
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92093, United States
- Hyojin Kim
  - Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- W. F. Drew Bennett
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Sergio E. Wong
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Jonathan E. Allen
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Felice C. Lightstone
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
8
Nunes-Alves A, Merz K. AlphaFold2 in Molecular Discovery. J Chem Inf Model 2023; 63:5947-5949. [PMID: 37807755 DOI: 10.1021/acs.jcim.3c01459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Affiliation(s)
- Ariane Nunes-Alves
  - Institute of Chemistry, Technische Universität Berlin, Berlin 10623, Germany
- Kenneth Merz
  - Department of Chemistry, Michigan State University, East Lansing 48824, Michigan, United States
9
Ravnik V, Jukič M, Bren U. Identifying Metal Binding Sites in Proteins Using Homologous Structures, the MADE Approach. J Chem Inf Model 2023; 63:5204-5219. [PMID: 37557084 PMCID: PMC10466382 DOI: 10.1021/acs.jcim.3c00558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Indexed: 08/11/2023]
Abstract
In order to identify the locations of metal ions in the binding sites of proteins, we have developed a method named the MADE (MAcromolecular DEnsity and Structure Analysis) approach. The MADE approach represents an evolution of our previous toolset, the ProBiS H2O (MD) methodology, for the identification of conserved water molecules. Our method uses experimental structures of proteins homologous to a query, which are subsequently superimposed upon it. Areas with a particular species present in a similar location among many homologous protein structures are identified using a clustering algorithm. Dense clusters likely represent positions containing species important to the query protein structure or function. We analyze well-characterized apo protein structures and show that the MADE approach can identify clusters corresponding to the expected positions of metal ions in their binding sites. The greatest advantage of our method lies in its generality. It can in principle be applied to any species found in protein records; it is not only limited to metal ions. We additionally demonstrate that the MADE approach can be successfully applied to predict the location of cofactors in computer-modeled structures, e.g., via AlphaFold. We also conduct a careful protein superposition method comparison and find our methodology robust and the results largely independent of the selected protein superposition algorithm. We postulate that with increasing structural data availability, additional applications of the MADE approach will be possible such as non-protein systems, water network identification, protein binding site elaboration, and analysis of binding events, all in a dynamic manner. We have implemented the MADE approach as a plugin for the PyMOL molecular visualization tool. The MADE plugin is available free of charge at https://gitlab.com/Jukic/made_software.
Affiliation(s)
- Vid Ravnik
  - Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova ulica 17, Maribor SI-2000, Slovenia
- Marko Jukič
  - Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova ulica 17, Maribor SI-2000, Slovenia
  - The Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, Koper SI-6000, Slovenia
  - Institute for Environmental Protection and Sensors, Beloruska ulica 7, Maribor SI-2000, Slovenia
- Urban Bren
  - Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova ulica 17, Maribor SI-2000, Slovenia
  - The Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, Koper SI-6000, Slovenia
  - Institute for Environmental Protection and Sensors, Beloruska ulica 7, Maribor SI-2000, Slovenia
10
Konc J, Janežič D. Protein binding sites for drug design. Biophys Rev 2022; 14:1413-1421. [PMID: 36532870 PMCID: PMC9734416 DOI: 10.1007/s12551-022-01028-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/12/2022] [Accepted: 12/01/2022] [Indexed: 12/13/2022] Open
Abstract
Drug development is a lengthy and challenging process that can be accelerated at early stages by new mathematical approaches and modern computers. To address this important issue, we are developing new mathematical solutions for the detection and characterization of protein binding sites that are important for new drug development. In this review, we present algorithms based on graph theory combined with molecular dynamics simulations that we have developed for studying biological target proteins to provide important data for optimizing the early stages of new drug development. A particular focus is the development of new protein binding site prediction algorithms (ProBiS) and new web tools for modeling pharmaceutically interesting molecules, the ProBiS Tools (algorithm, database, web server), which have evolved into a full-fledged graphical tool for studying proteins in the proteome. ProBiS differs from other structural algorithms in that it can align proteins with different folds without prior knowledge of the binding sites. It allows detection of similar binding sites and can predict molecular ligands of various types of pharmaceutical interest that could be advanced to drugs to treat a disease, based on the entire Protein Data Bank (PDB) and AlphaFold database, including proteins not yet in the PDB. All ProBiS Tools are freely available to the academic community at http://insilab.org and https://probis.nih.gov.
Affiliation(s)
- Janez Konc
  - Theory Department, National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
- Dušanka Janežič
  - Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, SI-6000 Koper, Slovenia