1
Hashemi AS, Vaisman II. Topology-based protein classification: A deep learning approach. Biochem Biophys Res Commun 2025; 746:151240. [PMID: 39742787 DOI: 10.1016/j.bbrc.2024.151240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 11/29/2024] [Accepted: 12/23/2024] [Indexed: 01/04/2025]
Abstract
Utilizing Artificial Intelligence (AI) in computational biology techniques could offer significant advantages in alleviating the growing workloads faced by structural biologists, especially with the emergence of big data. In this study, we employed Delaunay tessellation as a promising method to obtain the overall structural topology of proteins. Subsequently, we developed multi-class deep neural network models to classify protein superfamilies based on their local topology. Our models achieved a test accuracy of approximately 0.92 in classifying proteins into 18 well-populated superfamilies. We believe that the results of this study hold substantial value since, to the best of our knowledge, no previous studies have reported the utilization of protein topological data for protein classification through deep learning and Delaunay tessellation.
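As a rough illustration of what a Delaunay tessellation of a protein involves, the sketch below tessellates a handful of 3D points into tetrahedra. It is not the authors' pipeline: the abstract does not say which point represents each residue, so using one coordinate per residue (for example the Cα atom) is an assumption, and the function names `circumsphere` and `delaunay_tetrahedra` are hypothetical. The code uses a brute-force empty-circumsphere test, which is only practical for very small point sets; a real analysis would likely rely on a computational geometry library.

```python
from itertools import combinations


def circumsphere(p0, p1, p2, p3):
    """Return (center, radius_squared) of the sphere through four 3D points,
    or None when the points are (nearly) coplanar."""
    # The center c satisfies 2*(pi - p0) . c = |pi|^2 - |p0|^2 for i = 1..3.
    n0 = sum(x * x for x in p0)
    rows = [[2.0 * (p[k] - p0[k]) for k in range(3)] for p in (p1, p2, p3)]
    rhs = [sum(x * x for x in p) - n0 for p in (p1, p2, p3)]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(rows)
    if abs(d) < 1e-12:
        return None
    # Cramer's rule: replace column k with the right-hand side.
    center = []
    for k in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][k] = rhs[i]
        center.append(det3(m) / d)
    r2 = sum((center[k] - p0[k]) ** 2 for k in range(3))
    return center, r2


def delaunay_tetrahedra(points, eps=1e-9):
    """Brute-force Delaunay tessellation: keep every 4-point subset whose
    circumsphere contains no other point. Returns tuples of point indices."""
    tets = []
    for quad in combinations(range(len(points)), 4):
        sphere = circumsphere(*(points[i] for i in quad))
        if sphere is None:
            continue
        center, r2 = sphere
        empty = all(
            sum((points[j][k] - center[k]) ** 2 for k in range(3)) >= r2 - eps
            for j in range(len(points)) if j not in quad
        )
        if empty:
            tets.append(quad)
    return tets


# Toy coordinates standing in for one representative point per residue.
residue_points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 2, 2)]
print(delaunay_tetrahedra(residue_points))
```

Each returned tuple names four residues that are mutual nearest neighbours in space; counts of such quadruplets by residue type are one plausible kind of topological feature a classifier could consume.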
Affiliation(s)
- Aliye Sadat Hashemi
  - School of Systems Biology, George Mason University, Manassas, VA, 20110, USA.
- Iosif I Vaisman
  - School of Systems Biology, George Mason University, Manassas, VA, 20110, USA.
2
Sachdeva S, Joo H, Tsai J, Jasti B, Li X. A Rational Approach for Creating Peptides Mimicking Antibody Binding. Sci Rep 2019; 9:997. [PMID: 30700733 PMCID: PMC6353898 DOI: 10.1038/s41598-018-37201-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 07/30/2018] [Accepted: 11/28/2018] [Indexed: 12/22/2022] Open
Abstract
This study reports a novel method to design peptides that mimic antibody binding. Using the knob-socket model for protein-protein interaction, the interaction surface between Cetuximab and EGFR was mapped. EGFR-binding peptides were designed based on geometry and the probability of the mapped knob-socket pairs. Designed peptides were synthesized and then characterized for binding specificity, affinity, cytotoxicity of the drug-peptide conjugate, and inhibition of phosphorylation. In cell culture studies, designed peptides specifically bound to and were internalized by EGFR-overexpressing cells, with three- to four-fold higher uptake compared to control cells that do not overexpress EGFR. The designed peptide, Pep11, bound to EGFR with a KD of 252 nM. Cytotoxicity of the Monomethyl Auristatin E (MMAE)-EGFR-Pep11 peptide-drug conjugate was more than 2,000-fold higher against the EGFR-overexpressing cell lines A431 and MDA MB 468 than against control HEK 293 cells, which lack EGFR overexpression. The MMAE-EGFR-Pep11 conjugate also showed more than 90-fold lower cytotoxicity towards non-EGFR-overexpressing HEK 293 cells when compared with the cytotoxicity of MMAE itself. In conclusion, a method that can rationally design peptides using the knob-socket model is presented. This method was successfully applied to create peptides based on the antigen-antibody interaction to mimic the specificity, affinity and functionality of an antibody.
Affiliation(s)
- Sameer Sachdeva
  - Department of Pharmaceutics and Medicinal Chemistry, University of the Pacific, Stockton, CA, 95211, USA
  - Amneal Pharmaceuticals, Piscataway, NJ, 08854, USA
- Hyun Joo
  - Department of Chemistry, University of the Pacific, Stockton, CA, 95211, USA
- Jerry Tsai
  - Department of Chemistry, University of the Pacific, Stockton, CA, 95211, USA
- Bhaskara Jasti
  - Department of Pharmaceutics and Medicinal Chemistry, University of the Pacific, Stockton, CA, 95211, USA
- Xiaoling Li
  - Department of Pharmaceutics and Medicinal Chemistry, University of the Pacific, Stockton, CA, 95211, USA
3
Aydinkal RM, Bagci EZ. Residue packing in globular and intrinsically disordered proteins. Proteins 2018; 86:434-438. [DOI: 10.1002/prot.25459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/01/2017] [Accepted: 01/11/2018] [Indexed: 11/05/2022]
Affiliation(s)
- Rasim Murat Aydinkal
  - Department of Bioengineering, Institute of Pure and Applied Sciences, Marmara University, Istanbul, Turkey
4
Fraga KJ, Joo H, Tsai J. An amino acid code to define a protein's tertiary packing surface. Proteins 2015; 84:201-16. [PMID: 26575337 DOI: 10.1002/prot.24966] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 03/11/2015] [Revised: 09/24/2015] [Accepted: 11/09/2015] [Indexed: 01/28/2023]
Abstract
One difficult aspect of the protein-folding problem is characterizing the nonspecific interactions that define packing in protein tertiary structure. To better understand tertiary structure, this work extends the knob-socket model by classifying the interactions of a single knob residue packed into a set of contiguous sockets, or a pocket made up of 4 or more residues. The knob-socket construct allows for a symbolic two-dimensional mapping of pockets. The two-dimensional mapping of pockets provides a simple method to investigate the variety of pocket shapes to understand the geometry of protein tertiary surfaces. The diversity of pocket geometries can be organized into groups of pockets that share a common core, which suggests that some interactions in pockets are ancillary to packing. Further analysis of pocket geometries displays a preferred configuration that is right-handed in α-helices and left-handed in β-sheets. The amino acid composition of pockets illustrates the importance of nonpolar amino acids in packing as well as position specificity. As expected, all pocket shapes prefer to pack with hydrophobic knobs; however, knobs are not selective for the pockets they pack. Investigating side-chain rotamer preferences for certain pocket shapes uncovers no strong correlations. These findings allow a simple vocabulary based on knobs and sockets to describe protein tertiary packing that supports improved analysis, design, and prediction of protein structure.
Affiliation(s)
- Keith J Fraga
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Hyun Joo
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Jerry Tsai
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
5
Joo H, Chavan AG, Fraga KJ, Tsai J. An amino acid code for irregular and mixed protein packing. Proteins 2015; 83:2147-61. [PMID: 26370334 DOI: 10.1002/prot.24929] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 05/04/2015] [Revised: 09/01/2015] [Accepted: 09/02/2015] [Indexed: 11/10/2022]
Abstract
To advance our understanding of protein tertiary structure, the development of the knob-socket model is completed in an analysis of the packing in irregular coil and turn secondary structure as well as between mixed secondary structures. The knob-socket model simplifies packing based on repeated patterns of two motifs: a three-residue socket for packing within secondary (2°) structure and a four-residue knob-socket for tertiary (3°) packing. For coil and turn secondary structure, knob-sockets allow identification of a correlation between amino acid composition and tertiary arrangements in space. Coil contributes almost as much as α-helices to tertiary packing. In irregular sockets, Gly, Pro, Asp, and Ser are favored, while in irregular knobs, the preference order is Arg, Asp, Pro, Asn, Thr, Leu, and Gly. Cys, His, Met, and Trp are not favored in either. In mixed packing, the knob amino acid preferences are a function of the socket that they are packing into, whereas the amino acid composition of the sockets does not depend on the secondary structure of the knob. A unique motif of a coil knob with an XYZ β-sheet socket may potentially function to inhibit β-sheet extension. In addition, analysis of the preferred crossing angles for strands within a β-sheet and in mixed α-helix/β-sheet packing identifies canonical packing patterns useful in protein design. Lastly, the knob-socket model abstracts the complexity of protein tertiary structure into an intuitive packing surface topology map.
Affiliation(s)
- Hyun Joo
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Archana G Chavan
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Keith J Fraga
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Jerry Tsai
  - Department of Chemistry, University of the Pacific, Stockton, California, 95211
6
Abstract
Modularity is known as one of the most important features of the robust and efficient design of proteins. The architecture and topology of proteins play a vital role by providing the robust scaffolds necessary to support an organism's growth and survival under constant evolutionary pressure. These complex biomolecules can be represented by several layers of modular architecture, but it is pivotal to understand and explore the smallest biologically relevant structural component. In the present study, we have developed a component-based method that uses a protein's secondary structures and their arrangements (i.e. patterns) to investigate its structural space. Our results on all-alpha proteins show that the known structural space is highly populated with a limited set of structural patterns. We have also noticed that these frequently observed structural patterns are present as modules or "building blocks" in large proteins (i.e. those with higher secondary structure content). From structural descriptor analysis, the observed patterns are found to be within similar deviation; however, frequent patterns are found to occur distinctly in diverse functions, e.g. in enzymatic classes and reactions. In this study, we introduce a simple approach to explore protein structural space using combinatorial and graph-based geometry methods, which can be used to describe modularity in protein structures. Moreover, the analysis indicates that protein function seems to be the driving force that shapes the known structure space.
Affiliation(s)
- Taushif Khan
  - School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Indira Ghosh
  - School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
7
Deng L, Wu A, Dai W, Song T, Cui Y, Jiang T. Exploring protein domain organization by recognition of secondary structure packing interfaces. Bioinformatics 2014; 30:2440-6. [PMID: 24813541 DOI: 10.1093/bioinformatics/btu327] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Protein domains are fundamental units of protein structure, function and evolution; thus, it is critical to gain a deep understanding of protein domain organization. Previous works have attempted to identify key residues involved in organization of domain architecture. Because one of the most important characteristics of domain architecture is the arrangement of secondary structure elements (SSEs), here we present a picture of domain organization through an integrated consideration of SSE arrangements and residue contact networks. RESULTS: In this work, by representing SSEs as main-chain scaffolds and side-chain interfaces and through construction of residue contact networks, we have identified the SSE interfaces well packed within protein domains as SSE packing clusters. In total, 17 334 SSE packing clusters were recognized from 9015 Structural Classification of Proteins domains of <40% sequence identity. The similar SSE packing clusters were observed not only among domains of the same folds, but also among domains of different folds, indicating their roles as common scaffolds for organization of protein domains. Further analysis of 14 small single-domain proteins reveals a high correlation between the SSE packing clusters and the folding nuclei. Consistent with their important roles in domain organization, SSE packing clusters were found to be more conserved than other regions within the same proteins. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lizong Deng
  - Key Laboratory of Protein & Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101 and University of the Chinese Academy of Sciences, Beijing 100049, China
- Aiping Wu
  - Key Laboratory of Protein & Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101 and University of the Chinese Academy of Sciences, Beijing 100049, China
- Wentao Dai
  - Key Laboratory of Protein & Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101 and University of the Chinese Academy of Sciences, Beijing 100049, China
- Tingrui Song
  - Key Laboratory of Protein & Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101 and University of the Chinese Academy of Sciences, Beijing 100049, China
- Ya Cui
  - Key Laboratory of Protein & Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101 and University of the Chinese Academy of Sciences, Beijing 100049, China
- Taijiao Jiang
  - Key Laboratory of Protein & Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101 and University of the Chinese Academy of Sciences, Beijing 100049, China
8
Joo H, Tsai J. An amino acid code for β-sheet packing structure. Proteins 2014; 82:2128-40. [PMID: 24668690 DOI: 10.1002/prot.24569] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 09/13/2013] [Revised: 03/17/2014] [Accepted: 03/19/2014] [Indexed: 11/09/2022]
Abstract
To understand the relationship between protein sequence and structure, this work extends the knob-socket model in an investigation of β-sheet packing. Over a comprehensive set of β-sheet folds, the contacts between residues were used to identify packing cliques: sets of residues that all contact each other. These packing cliques were then classified based on size and contact order. From this analysis, the two types of four-residue packing cliques necessary to describe β-sheet packing were characterized. Both occur between two adjacent hydrogen-bonded β-strands. First, defining the secondary structure packing within β-sheets, the combined socket or XY:HG pocket consists of four residues: i, i+2 on one strand and j, j+2 on the other. Second, characterizing the tertiary packing between β-sheets, the knob-socket XY:H+B consists of a three-residue XY:H socket (i, i+2 on one strand and j on the other) packed against a knob B residue (residue k distant in sequence). Depending on the packing depth of the knob B residue, two types of knob-sockets are found: side-chain and main-chain sockets. The amino acid composition of the pockets and knob-sockets reveals the sequence specificity of β-sheet packing. For β-sheet formation, the XY:HG pocket clearly shows sequence specificity of amino acids. For tertiary packing, the XY:H+B side-chain and main-chain sockets exhibit distinct amino acid preferences at each position. These relationships define an amino acid code for β-sheet structure and provide an intuitive topological mapping of β-sheet packing.
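The definition of a packing clique given above (a set of residues that all contact each other) can be sketched in a few lines. This is a minimal illustration, not the authors' method: the residue numbers and contact list are made up, the function name `packing_cliques` is hypothetical, and the exhaustive search over residue subsets is a simple stand-in that only suits small contact maps.

```python
from itertools import combinations


def packing_cliques(contacts, size=4):
    """Return every group of `size` residues in which each pair is in contact.

    `contacts` is an iterable of (residue_a, residue_b) pairs; how contacts are
    detected (distance cutoff, tessellation, etc.) is outside this sketch."""
    neighbors = {}
    for a, b in contacts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    cliques = []
    # Exhaustive check of all residue subsets of the requested size.
    for group in combinations(sorted(neighbors), size):
        if all(b in neighbors[a] for a, b in combinations(group, 2)):
            cliques.append(group)
    return cliques


# Hypothetical contact list: residues 1 and 3 on one strand, 10 and 12 on the
# neighbouring strand, plus one extra contact that does not complete a clique.
toy_contacts = [(1, 3), (1, 10), (1, 12), (3, 10), (3, 12), (10, 12), (12, 20)]
print(packing_cliques(toy_contacts, size=4))
print(packing_cliques(toy_contacts, size=3))
```

In this toy layout the single four-residue clique has the i, i+2 / j, j+2 shape described for the XY:HG pocket, while the three-residue groups correspond to candidate sockets.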
Affiliation(s)
- Hyun Joo
  - Department of Chemistry, University of the Pacific, Stockton, California, 95212
9
Day R, Joo H, Chavan AC, Lennox KP, Chen YA, Dahl DB, Vannucci M, Tsai JW. Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. Comput Biol Chem 2013; 42:40-8. [DOI: 10.1016/j.compbiolchem.2012.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2012] [Revised: 10/30/2012] [Accepted: 10/31/2012] [Indexed: 11/16/2022]
10
Zhou W, Yan H. Alpha shape and Delaunay triangulation in studies of protein-related interactions. Brief Bioinform 2012. [PMID: 23193202 DOI: 10.1093/bib/bbs077] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 01/09/2023] Open
Abstract
In recent years, more 3D protein structures have become available, which has made the analysis of large molecular structures much easier. There is a strong demand for geometric models for the study of protein-related interactions. Alpha shape and Delaunay triangulation are powerful tools to represent protein structures and have advantages in characterizing the surface curvature and atom contacts. This review presents state-of-the-art applications of alpha shape and Delaunay triangulation in the studies on protein-DNA, protein-protein, protein-ligand interactions and protein structure analysis.
Affiliation(s)
- Weiqiang Zhou
  - Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue 83, Hong Kong.
11
Joo H, Chavan AG, Phan J, Day R, Tsai J. An amino acid packing code for α-helical structure and protein design. J Mol Biol 2012; 419:234-54. [PMID: 22426125 DOI: 10.1016/j.jmb.2012.03.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 12/29/2011] [Revised: 02/22/2012] [Accepted: 03/07/2012] [Indexed: 11/19/2022]
Abstract
This work demonstrates that all packing in α-helices can be simplified to repetitive patterns of a single motif: the knob-socket. Using the precision of Voronoi Polyhedra/Delaunay Tessellations to identify contacts, the knob-socket is a four-residue tetrahedral motif: a knob residue on one α-helix packs into the three-residue socket on another α-helix. The principle of the knob-socket model relates the packing between levels of protein structure: the intra-helical packing arrangements within secondary structure that permit inter-helix tertiary packing interactions. Within an α-helix, the three-residue sockets arrange residues into a uniform packing lattice. Inter-helix packing results from a definable pattern of interdigitated knob-socket motifs between two α-helices. Furthermore, the knob-socket model classifies three types of sockets: (1) free, favoring only intra-helical packing; (2) filled, favoring inter-helical interactions; and (3) non, disfavoring α-helical structure. The amino acid propensities in these three socket classes essentially represent an amino acid code for structure in α-helical packing. Using this code, we used a novel yet straightforward approach for the design of α-helical structure to validate the knob-socket model. Unique sequences for three peptides were created to produce a predicted amount of α-helical structure: mostly helical, some helical, and no helix. These three peptides were synthesized, and helical content was assessed using CD spectroscopy. The measured α-helicity of each peptide was consistent with the expected predictions. These results and analysis demonstrate that the knob-socket motif functions as the basic unit of packing and presents an intuitive tool to decipher the rules governing packing in protein structure.
Affiliation(s)
- Hyun Joo
  - Department of Chemistry, University of the Pacific, Stockton, CA 95211, USA