1
Balakrishnan A, Mishra SK, Georrge JJ. Insight into Protein Engineering: From In silico Modelling to In vitro Synthesis. Curr Pharm Des 2025; 31:179-202. [PMID: 39354773 DOI: 10.2174/0113816128349577240927071706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Revised: 09/12/2024] [Accepted: 09/13/2024] [Indexed: 10/03/2024]
Abstract
Protein engineering alters the polypeptide chain to obtain a novel protein with improved functional properties. This field constantly evolves with advanced in silico tools and techniques to design novel proteins and peptides. Rationally incorporating mutations, unnatural amino acids, and post-translational modifications increases the applications of engineered proteins and peptides. It aids in developing drugs with maximum efficacy and minimum side effects. Currently, the engineering of peptides is gaining attention due to their high stability, binding specificity, low immunogenicity, and reduced toxicity. Engineered peptides are potent candidates for drug development due to their high specificity and low cost of production compared with other biologics, including proteins and antibodies. Therefore, understanding the current state of peptide design and engineering with the help of currently available in silico tools is crucial. This review extensively studies various in silico tools available for protein engineering with a view to designing peptides as therapeutics, followed by in vitro aspects. Moreover, a discussion on the chemical synthesis and purification of peptides, a case study, and challenges are also incorporated.
Affiliation(s)
- Anagha Balakrishnan
  - Department of Bioinformatics, University of North Bengal, Siliguri, District-Darjeeling, West Bengal 734013, India
- Saurav K Mishra
  - Department of Bioinformatics, University of North Bengal, Siliguri, District-Darjeeling, West Bengal 734013, India
- John J Georrge
  - Department of Bioinformatics, University of North Bengal, Siliguri, District-Darjeeling, West Bengal 734013, India
2
Kim DN, Yin T, Zhang T, Im AK, Cort JR, Rozum JC, Pollock D, Qian WJ, Feng S. Artificial Intelligence Transforming Post-Translational Modification Research. Bioengineering (Basel) 2024; 12:26. [PMID: 39851300 PMCID: PMC11762806 DOI: 10.3390/bioengineering12010026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Revised: 12/16/2024] [Accepted: 12/29/2024] [Indexed: 01/26/2025] Open
Abstract
Post-Translational Modifications (PTMs) are covalent changes to amino acids that occur after protein synthesis, including covalent modifications on side chains and peptide backbones. Many PTMs profoundly impact cellular and molecular functions and structures, and their significance extends to evolutionary studies as well. In light of these implications, we have explored how artificial intelligence (AI) can be utilized in researching PTMs. Initially, rationales for adopting AI and its advantages in understanding the functions of PTMs are discussed. Then, various deep learning architectures and programs, including recent applications of language models, for predicting PTM sites on proteins and the regulatory functions of these PTMs are compared. Finally, our high-throughput PTM-data-generation pipeline, which formats data suitably for AI training and predictions, is described. We hope this review illuminates areas where future AI models on PTMs can be improved, thereby contributing to the field of PTM bioengineering.
Affiliation(s)
- Doo Nam Kim
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Tianzhixi Yin
  - National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Tong Zhang
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Alexandria K. Im
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- John R. Cort
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Jordan C. Rozum
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- David Pollock
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Wei-Jun Qian
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Song Feng
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
3
Zhang O, Naik SA, Liu ZH, Forman-Kay J, Head-Gordon T. A curated rotamer library for common post-translational modifications of proteins. Bioinformatics 2024; 40:btae444. [PMID: 38995731 PMCID: PMC11254353 DOI: 10.1093/bioinformatics/btae444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 06/06/2024] [Accepted: 07/11/2024] [Indexed: 07/14/2024] Open
Abstract
MOTIVATION: Sidechain rotamer libraries of the common amino acids of a protein are useful for folded protein structure determination and for generating ensembles of intrinsically disordered proteins (IDPs). However, much of protein function is modulated beyond the translated sequence through the introduction of post-translational modifications (PTMs).
RESULTS: In this work, we have provided a curated set of side chain rotamers for the most common PTMs derived from the RCSB PDB database, including phosphorylated, methylated, and acetylated sidechains. Our rotamer libraries improve upon existing methods such as SIDEpro, Rosetta, and AlphaFold3 in predicting the experimental structures for PTMs in folded proteins. In addition, we showcase our PTM libraries in full use by generating ensembles with the Monte Carlo Side Chain Entropy (MCSCE) for folded proteins, and combining MCSCE with the Local Disordered Region Sampling algorithms within IDPConformerGenerator for proteins with intrinsically disordered regions.
AVAILABILITY AND IMPLEMENTATION: The codes for dihedral angle computations and library creation are available at https://github.com/THGLab/ptm_sc.git.
Affiliation(s)
- Oufan Zhang
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, CA 94720, United States
- Shubhankar A Naik
  - Department of Chemistry, University of California, Berkeley, CA 94720, United States
- Zi Hao Liu
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Julie Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Teresa Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, CA 94720, United States
  - Department of Chemistry, University of California, Berkeley, CA 94720, United States
  - Department of Bioengineering, University of California, Berkeley, CA 94720, United States
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, United States
4
Kellman BP, Mariethoz J, Zhang Y, Shaul S, Alteri M, Sandoval D, Jeffris M, Armingol E, Bao B, Lisacek F, Bojar D, Lewis NE. Decoding glycosylation potential from protein structure across human glycoproteins with a multi-view recurrent neural network. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.15.594334. [PMID: 38798633 PMCID: PMC11118808 DOI: 10.1101/2024.05.15.594334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Glycosylation is described as a non-templated biosynthesis. Yet, the template-free premise is antithetical to the observation that different N-glycans are consistently placed at specific sites. It has been proposed that glycosite-proximal protein structures could constrain glycosylation and explain the observed microheterogeneity. Using site-specific glycosylation data, we trained a hybrid neural network to parse glycosites (recurrent neural network) and match them to feasible N-glycosylation events (graph neural network). From glycosite-flanking sequences, the algorithm predicts most human N-glycosylation events documented in the GlyConnect database and proposed structures corresponding to observed monosaccharide composition of the glycans at these sites. The algorithm also recapitulated glycosylation in Enhanced Aromatic Sequons, SARS-CoV-2 spike, and IgG3 variants, thus demonstrating the ability of the algorithm to predict both glycan structure and abundance. Thus, protein structure constrains glycosylation, and the neural network enables predictive in silico glycosylation of uncharacterized or novel protein sequences and genetic variants.
Affiliation(s)
- Benjamin P. Kellman
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
  - Augment Biologics, La Jolla, CA 92092
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
- Julien Mariethoz
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
- Yujie Zhang
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Sigal Shaul
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Mia Alteri
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Daniel Sandoval
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Mia Jeffris
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Erick Armingol
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Bokan Bao
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Frederique Lisacek
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Computer Science Department & Section of Biology, University of Geneva, route de Drize 7, CH-1227, Geneva, Switzerland
- Daniel Bojar
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
- Nathan E. Lewis
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
5
Zhang O, Naik SA, Liu ZH, Forman-Kay J, Head-Gordon T. A Curated Rotamer Library for Common Post-Translational Modifications of Proteins. ARXIV 2024:arXiv:2405.03120v1. [PMID: 38764597 PMCID: PMC11100909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2024]
Abstract
Sidechain rotamer libraries of the common amino acids of a protein are useful for folded protein structure determination and for generating ensembles of intrinsically disordered proteins (IDPs). However, much of protein function is modulated beyond the translated sequence through the introduction of post-translational modifications (PTMs). In this work, we have provided a curated set of side chain rotamers for the most common PTMs derived from the RCSB PDB database, including phosphorylated, methylated, and acetylated sidechains. Our rotamer libraries improve upon existing methods such as SIDEpro and Rosetta in predicting the experimental structures for PTMs in folded proteins. In addition, we showcase our PTM libraries in full use by generating ensembles with the Monte Carlo Side Chain Entropy (MCSCE) for folded proteins, and combining MCSCE with the Local Disordered Region Sampling algorithms within IDPConformerGenerator for proteins with intrinsically disordered regions.
Affiliation(s)
- Oufan Zhang
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Shubhankar A. Naik
  - Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Zi Hao Liu
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Julie Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Teresa Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
  - Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
  - Department of Bioengineering, University of California, Berkeley, Berkeley, California 94720, USA
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, California 94720, USA
6
Esmaili F, Pourmirzaei M, Ramazi S, Shojaeilangari S, Yavari E. A Review of Machine Learning and Algorithmic Methods for Protein Phosphorylation Site Prediction. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:1266-1285. [PMID: 37863385 PMCID: PMC11082408 DOI: 10.1016/j.gpb.2023.03.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 03/16/2022] [Revised: 01/16/2023] [Accepted: 03/23/2023] [Indexed: 10/22/2023]
Abstract
Post-translational modifications (PTMs) have key roles in extending the functional diversity of proteins and, as a result, regulating diverse cellular processes in prokaryotic and eukaryotic organisms. Phosphorylation modification is a vital PTM that occurs in most proteins and plays a significant role in many biological processes. Disorders in the phosphorylation process lead to multiple diseases, including neurological disorders and cancers. The purpose of this review is to organize this body of knowledge associated with phosphorylation site (p-site) prediction to facilitate future research in this field. At first, we comprehensively review all related databases and introduce all steps regarding dataset creation, data preprocessing, and method evaluation in p-site prediction. Next, we investigate p-site prediction methods, which are divided into two computational groups: algorithmic and machine learning (ML). Additionally, it is shown that there are basically two main approaches for p-site prediction by ML: conventional and end-to-end deep learning methods, both of which are given an overview. Moreover, this review introduces the most important feature extraction techniques, which have mostly been used in p-site prediction. Finally, we create three test sets from new proteins related to the released version of the database of protein post-translational modifications (dbPTM) in 2022 based on general and human species. Evaluating online p-site prediction tools on newly added proteins introduced in the dbPTM 2022 release, distinct from those in the dbPTM 2019 release, reveals their limitations. In other words, the actual performance of these online p-site prediction tools on unseen proteins is notably lower than the results reported in their respective research papers.
Affiliation(s)
- Farzaneh Esmaili
  - Department of Information Technology, Tarbiat Modares University, Tehran 14115-111, Iran
- Mahdi Pourmirzaei
  - Department of Information Technology, Tarbiat Modares University, Tehran 14115-111, Iran
- Shahin Ramazi
  - Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran 14115-111, Iran
- Seyedehsamaneh Shojaeilangari
  - Biomedical Engineering Group, Department of Electrical Engineering and Information Technology, Iranian Research Organization for Science and Technology (IROST), Tehran 33535-111, Iran
- Elham Yavari
  - Department of Information Technology, Tarbiat Modares University, Tehran 14115-111, Iran
7
Afshinpour M, Smith LA, Chakravarty S. AQcalc: A web server that identifies weak molecular interactions in protein structures. Protein Sci 2023; 32:e4762. [PMID: 37596782 PMCID: PMC10503417 DOI: 10.1002/pro.4762] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/27/2023] [Revised: 07/25/2023] [Accepted: 08/15/2023] [Indexed: 08/20/2023]
Abstract
Weak molecular interactions play an important role in protein structure and function. Computational tools that identify weak molecular interactions are, therefore, valuable for the study of proteins. Here, we present AQcalc, a web server (https://aqcalcbiocomputing.com/) that can be used to identify anion-quadrupole (AQ) interactions, which are weak interactions involving aromatic residue (Trp, Tyr, and Phe) ring edges and anions (Asp, Glu, and phosphate ion) both within proteins and at their interfaces (protein-protein, protein-nucleic acids, and protein-lipid bilayer). AQcalc identifies AQ interactions as well as clusters involving AQ, cation-π, and salt bridges, among others. Utilizing AQcalc we analyzed weak interactions in protein models, even in the absence of experimental structures, to understand the contributions of weak interactions to deleterious structural changes, including those associated with oncogenic and germline disease variants. We identified several deleterious variants with disrupted AQ interactions (comparable in frequency to cation-π disruptions). Amyloid fibrils utilize AQ to bury anions at frequencies that far exceed those observed for globular proteins. AQ interactions were detected three and five times more frequently than the hydrogen-bonded AQ (HBAQ) in fibril structures and protein-lipid bilayer interfaces, respectively. By contrast, AQ and HBAQ interactions were detected with similar frequencies in globular proteins. Collectively, these findings suggest AQcalc will be effective in facilitating fine structural analysis. As other web utilities designed to identify protein residue interaction networks do not report AQ interactions, wide use of AQcalc will enrich our understanding of residue interaction networks and facilitate hypothesis testing by identifying and experimentally characterizing these comparably weak but important interactions.
Affiliation(s)
- Maral Afshinpour
  - Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
- Logan A. Smith
  - Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
- Suvobrata Chakravarty
  - Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
8
Sharapov SZ, Timoshchuk AN, Aulchenko YS. Genetic control of N-glycosylation of human blood plasma proteins. Vavilovskii Zhurnal Genet Selektsii 2023; 27:224-239. [PMID: 37293449 PMCID: PMC10244589 DOI: 10.18699/vjgb-23-29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/20/2023] [Accepted: 01/23/2022] [Indexed: 06/10/2023] Open
Abstract
Glycosylation is an important protein modification, which influences the physical and chemical properties as well as biological function of these proteins. Large-scale population studies have shown that the levels of various plasma protein N-glycans are associated with many multifactorial human diseases. Observed associations between protein glycosylation levels and human diseases have led to the conclusion that N-glycans can be considered a potential source of biomarkers and therapeutic targets. Although biochemical pathways of glycosylation are well studied, the understanding of the mechanisms underlying general and tissue-specific regulation of these biochemical reactions in vivo is limited. This complicates both the interpretation of the observed associations between protein glycosylation levels and human diseases, and the development of glycan-based biomarkers and therapeutics. By the beginning of the 2010s, high-throughput methods of N-glycome profiling had become available, allowing research into the genetic control of N-glycosylation using quantitative genetics methods, including genome-wide association studies (GWAS). Application of these methods has made it possible to find previously unknown regulators of N-glycosylation and expanded the understanding of the role of N-glycans in the control of multifactorial diseases and human complex traits. The present review considers the current knowledge of the genetic control of variability in the levels of N-glycosylation of plasma proteins in human populations. It briefly describes the most popular physical-chemical methods of N-glycome profiling and the databases that contain genes involved in the biosynthesis of N-glycans. It also reviews the results of studies of environmental and genetic factors contributing to the variability of N-glycans as well as the mapping results of the genomic loci of N-glycans by GWAS. The results of functional in vitro and in silico studies are described. 
The review summarizes the current progress in human glycogenomics and suggests possible directions for further research.
Affiliation(s)
- S Zh Sharapov
  - MSU Institute for Artificial Intelligence, Lomonosov Moscow State University, Moscow, Russia
- A N Timoshchuk
  - MSU Institute for Artificial Intelligence, Lomonosov Moscow State University, Moscow, Russia
- Y S Aulchenko
  - MSU Institute for Artificial Intelligence, Lomonosov Moscow State University, Moscow, Russia
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
9
Kuschert S, Stroet M, Chin YKY, Conibear AC, Jia X, Lee T, Bartling CRO, Strømgaard K, Güntert P, Rosengren KJ, Mark AE, Mobli M. Facilitating the structural characterisation of non-canonical amino acids in biomolecular NMR. MAGNETIC RESONANCE (GOTTINGEN, GERMANY) 2023; 4:57-72. [PMID: 37904802 PMCID: PMC10583272 DOI: 10.5194/mr-4-57-2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 02/07/2023] [Indexed: 11/01/2023]
Abstract
Peptides and proteins containing non-canonical amino acids (ncAAs) are a large and important class of biopolymers. They include non-ribosomally synthesised peptides, post-translationally modified proteins, expressed or synthesised proteins containing unnatural amino acids, and peptides and proteins that are chemically modified. Here, we describe a general procedure for generating atomic descriptions required to incorporate ncAAs within popular NMR structure determination software such as CYANA, CNS, Xplor-NIH and ARIA. This procedure is made publicly available via the existing Automated Topology Builder (ATB) server (https://atb.uq.edu.au, last access: 17 February 2023) with all submitted ncAAs stored in a dedicated database. The described procedure also includes a general method for linking of side chains of amino acids from CYANA templates. To ensure compatibility with other systems, atom names comply with IUPAC guidelines. In addition to describing the workflow, 3D models of complex natural products generated by CYANA are presented, including vancomycin. In order to demonstrate the manner in which the templates for ncAAs generated by the ATB can be used in practice, we use a combination of CYANA and CNS to solve the structure of a synthetic peptide designed to disrupt Alzheimer-related protein-protein interactions. Automating the generation of structural templates for ncAAs will extend the utility of NMR spectroscopy to studies of more complex biomolecules, with applications in the rapidly growing fields of synthetic biology and chemical biology. The procedures we outline can also be used to standardise the creation of structural templates for any amino acid and thus have the potential to impact structural biology more generally.
Affiliation(s)
- Sarah Kuschert
  - Centre for Advanced Imaging, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia
- Martin Stroet
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Yanni Ka-Yan Chin
  - Centre for Advanced Imaging, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia
- Anne Claire Conibear
  - Institute of Applied Synthetic Chemistry, Technische Universität Wien, Getreidemarkt 9/163, Wien 1060, Vienna, Austria
- Xinying Jia
  - Centre for Advanced Imaging, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia
- Thomas Lee
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Kristian Strømgaard
  - Department of Drug Design and Pharmacology, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen, Denmark
- Peter Güntert
  - Laboratory of Physical Chemistry, ETH Zürich, 8093 Zurich, Switzerland
  - Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
  - Department of Chemistry, Tokyo Metropolitan University, Hachiōji, Tokyo 192-0397, Japan
- Karl Johan Rosengren
  - School of Biomedical Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Alan Edward Mark
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Mehdi Mobli
  - Centre for Advanced Imaging, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia
10
Weigle AT, Feng J, Shukla D. Thirty years of molecular dynamics simulations on posttranslational modifications of proteins. Phys Chem Chem Phys 2022; 24:26371-26397. [PMID: 36285789 PMCID: PMC9704509 DOI: 10.1039/d2cp02883b] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 09/06/2023]
Abstract
Posttranslational modifications (PTMs) are an integral component to how cells respond to perturbation. While experimental advances have enabled improved PTM identification capabilities, the same throughput for characterizing how structural changes caused by PTMs equate to altered physiological function has not been maintained. In this Perspective, we cover the history of computational modeling and molecular dynamics simulations which have characterized the structural implications of PTMs. We distinguish results from different molecular dynamics studies based upon the timescales simulated and analysis approaches used for PTM characterization. Lastly, we offer insights into how opportunities for modern research efforts on in silico PTM characterization may proceed given current state-of-the-art computing capabilities and methodological advancements.
Affiliation(s)
- Austin T Weigle
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Jiangyan Feng
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Diwakar Shukla
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
11
England WE, Wang J, Chen S, Baldi P, Flynn RA, Spitale RC. An atlas of posttranslational modifications on RNA binding proteins. Nucleic Acids Res 2022; 50:4329-4339. [PMID: 35438783 PMCID: PMC9071496 DOI: 10.1093/nar/gkac243] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/14/2021] [Revised: 03/24/2022] [Accepted: 04/15/2022] [Indexed: 12/13/2022] Open
Abstract
RNA structure and function are intimately tied to RNA binding protein recognition and regulation. Posttranslational modifications are chemical modifications which can control protein biology. The role of PTMs in the regulation of RBPs is not well understood, in part because analyses of PTM deposition on RBPs are lacking. Herein, we present an analysis of posttranslational modifications (PTMs) on RNA binding proteins (RBPs; a PTM RBP Atlas). We curate published datasets and primary literature to understand the landscape of PTMs and use protein-protein interaction data to provide a framework for understanding which enzymes are controlling PTM deposition and removal on the RBP landscape. Intersection of our data with The Cancer Genome Atlas also provides researchers with an understanding of mutations that would alter PTM deposition. Additional characterization of the RNA-protein interface provided from in-cell UV crosslinking experiments provides a framework for hypotheses about which PTMs could be regulating RNA binding and thus RBP function. Finally, we provide an online database for our data that is easy to use for the community. It is our hope that our efforts will provide researchers with an invaluable tool to test the function of PTMs controlling RBP function and thus RNA biology.
Collapse
Affiliation(s)
- Whitney E England
- Department of Pharmaceutical Sciences, University of California, Irvine. Irvine, CA, USA
| | - Jingtian Wang
- Department of Pharmaceutical Sciences, University of California, Irvine. Irvine, CA, USA
| | - Siwei Chen
- School of Information and Computer Sciences, University of California, Irvine. Irvine, CA, USA
| | - Pierre Baldi
- School of Information and Computer Sciences, University of California, Irvine. Irvine, CA, USA
| | - Ryan A Flynn
- Stem Cell Program, Boston Children's Hospital, Boston, MA, USA.,Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA
| | - Robert C Spitale
- Department of Pharmaceutical Sciences, University of California, Irvine. Irvine, CA, USA.,Department of Developmental and Cellular Biology, University of California, Irvine. Irvine, CA, USA.,Department of Chemistry, University of California, Irvine. Irvine, CA, USA
|
12
|
Methodological advances in the design of peptide-based vaccines. Drug Discov Today 2022; 27:1367-1380. [DOI: 10.1016/j.drudis.2022.03.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/12/2021] [Revised: 12/02/2021] [Accepted: 03/07/2022] [Indexed: 12/11/2022]
|
13
|
Gan SKE, Phua SX, Yeo JY. Sagacious epitope selection for vaccines, and both antibody-based therapeutics and diagnostics: tips from virology and oncology. Antib Ther 2022; 5:63-72. [PMID: 35372784 PMCID: PMC8972324 DOI: 10.1093/abt/tbac005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/03/2021] [Revised: 01/24/2022] [Accepted: 02/12/2022] [Indexed: 11/12/2022] Open
Abstract
The target of an antibody plays a significant role in the success of antibody-based therapeutics and diagnostics, and vaccine development. This importance is focused on the target binding site—epitope, where epitope selection as a part of design thinking beyond traditional antigen selection using whole cell or whole protein immunization can positively impact success. With purified recombinant protein production and peptide synthesis to display limited/selected epitopes, intrinsic factors that can affect the functioning of resulting antibodies can be more easily selected for. Many of these factors stem from the location of the epitope that can impact accessibility of the antibody to the epitope at a cellular or molecular level, direct inhibition of target antigen activity, conservation of function despite escape mutations, and even non-competitive inhibition sites. By incorporating novel computational methods for predicting antigen changes to model-informed drug discovery and development, superior vaccines and antibody-based therapeutics or diagnostics can be easily designed to mitigate failures. With detailed examples, this review highlights the new opportunities, factors and methods of predicting antigenic changes for consideration in sagacious epitope selection.
Affiliation(s)
- Samuel Ken-En Gan
- Antibody & Product Development Lab, EDDC-BII, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- APD SKEG Pte Ltd, Singapore 439444, Singapore
| | - Ser-Xian Phua
- Antibody & Product Development Lab, EDDC-BII, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
| | - Joshua Yi Yeo
- Antibody & Product Development Lab, EDDC-BII, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
|
14
|
de Brevern AG, Rebehmed J. Current status of PTMs structural databases: applications, limitations and prospects. Amino Acids 2022; 54:575-590. [PMID: 35020020 DOI: 10.1007/s00726-021-03119-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/26/2021] [Accepted: 12/20/2021] [Indexed: 12/11/2022]
Abstract
Protein 3D structures, determined by their amino acid sequences, underpin crucial biological functions. Post-translational modifications (PTMs) play an essential role in regulating these functions by altering the physicochemical properties of proteins. By virtue of their importance, several PTM databases have been developed and released over recent decades, but very few of these databases incorporate real 3D structural data. Since PTMs influence the function of the protein and their aberrant states are frequently implicated in human diseases, providing structural insights to understand the influence and dynamics of PTMs is crucial for unraveling the underlying processes. This review is dedicated to the current status of databases providing 3D structural data on PTM sites in proteins. Some of these databases are general, covering multiple types of PTMs in different organisms, while others are specific to one particular type of PTM, class of proteins or organism. The importance of these databases is illustrated with two major types of in silico applications: predicting PTM sites in proteins using machine learning approaches and investigating protein structure-function relationships involving PTMs. Finally, these databases suffer from multiple problems, and care must be taken when analyzing PTM data.
Affiliation(s)
- Alexandre G de Brevern
- Université de Paris, INSERM, UMR_S 1134, DSIMB, 75739, Paris, France.,Université de la Réunion, INSERM, UMR_S 1134, DSIMB, 97715, Saint-Denis de La Réunion, France.,Laboratoire d'Excellence GR-Ex, 75739, Paris, France
| | - Joseph Rebehmed
- Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon.
|
15
|
Craveur P, Narwani TJ, Srinivasan N, Gelly JC, Rebehmed J, de Brevern AG. Shaking the β-Bulges. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:14-18. [PMID: 34115590 DOI: 10.1109/tcbb.2021.3088444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
β-bulges are irregularities inside β-sheets. They represent more than 3 percent of protein residues, i.e., they are as frequent as 3₁₀ helices. In terms of evolution, β-bulges are no more conserved than any other local protein conformation within homologous protein structures. In a first-of-its-kind study, we investigated the dynamical behaviour of β-bulges using the largest known set of protein molecular dynamics simulations. We observed that more than 50 percent of the β-bulges present in protein crystal structures remained stable during dynamics, while more than 1/6th were not stable at all and disappeared entirely. Surprisingly, 1.1 percent of the β-bulges that appeared remained stable. β-bulges have been categorized into different subtypes. The most common types, which have the smallest insertions in β-strands (namely AC and AG), are as stable as the whole β-bulge dataset. Rarely occurring types (namely PC and AS), which have the largest insertions, are significantly more stable than expected. Thus, this pioneering study made it possible to precisely quantify the stability of β-bulges, demonstrating their structural robustness, with a few unexpected cases raising structural questions.
|
16
|
Zhang H, He J, Hu G, Zhu F, Jiang H, Gao J, Zhou H, Lin H, Wang Y, Chen K, Meng F, Hao M, Zhao K, Luo C, Liang Z. Dynamics of Post-Translational Modification Inspires Drug Design in the Kinase Family. J Med Chem 2021; 64:15111-15125. [PMID: 34668699 DOI: 10.1021/acs.jmedchem.1c01076] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/08/2023]
Abstract
Post-translational modification (PTM) of proteins plays important roles in the regulation of cellular function and disease pathogenesis. The systematic analysis of PTM dynamics presents great opportunities to enlarge the target space through PTM allosteric regulation. Here, we present a framework that integrates sequence, structural topology, and particular dynamics features to characterize the functional context and druggability of PTMs in the well-known kinase family. Machine learning models built on these biophysical features could successfully predict PTMs. In addition, PTMs were found to be significantly enriched in the reported allosteric pockets, and the allosteric potential of PTM pockets was thus proposed on the basis of these biophysical features. Finally, the covalent inhibitor DC-Srci-6668, targeting the PTM pocket in c-Src kinase, was identified; it inhibited phosphorylation and locked c-Src in the inactive state. Our findings represent a crucial step toward PTM-inspired drug design in the kinase family.
Affiliation(s)
- Huimin Zhang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China.,Drug Discovery and Design Center, the Center for Chemical Biology, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, Shanghai Tech University, 100 Haike Road, Shanghai 201210, China.,University of Chinese Academy of Sciences (UCAS), 19 Yuquan Road, Beijing 100049, China
| | - Jixiao He
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China
| | - Guang Hu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China
| | - Fei Zhu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China
| | - Hao Jiang
- Drug Discovery and Design Center, the Center for Chemical Biology, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences (UCAS), 19 Yuquan Road, Beijing 100049, China
| | - Jing Gao
- Drug Discovery and Design Center, the Center for Chemical Biology, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences (UCAS), 19 Yuquan Road, Beijing 100049, China
| | - Hu Zhou
- Drug Discovery and Design Center, the Center for Chemical Biology, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences (UCAS), 19 Yuquan Road, Beijing 100049, China
| | - Hua Lin
- Biomedical Research Center of South China, College of Life Sciences, Fujian Normal University, 1 Keji Road, Fuzhou 350117, China
| | - Yingjuan Wang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China
| | - Kaixian Chen
- Drug Discovery and Design Center, the Center for Chemical Biology, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, Shanghai Tech University, 100 Haike Road, Shanghai 201210, China.,University of Chinese Academy of Sciences (UCAS), 19 Yuquan Road, Beijing 100049, China
| | - Fanwang Meng
- Department of Chemistry and Chemical Biology, McMaster University, Hamilton, ON L8S 4L8, Canada
| | - Minghong Hao
- Ensem Therapeutics, Inc., 200 Boston Avenue, Medford, Massachusetts 02155, United States
| | - Kehao Zhao
- School of Pharmacy, Key Laboratory of Molecular Pharmacology and Drug Evaluation (Yantai University), Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, Yantai University, Yantai 264005, China
| | - Cheng Luo
- Drug Discovery and Design Center, the Center for Chemical Biology, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, Shanghai Tech University, 100 Haike Road, Shanghai 201210, China.,University of Chinese Academy of Sciences (UCAS), 19 Yuquan Road, Beijing 100049, China.,School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
| | - Zhongjie Liang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China
|
17
|
Bonne Køhler J, Jers C, Senissar M, Shi L, Derouiche A, Mijakovic I. Importance of protein Ser/Thr/Tyr phosphorylation for bacterial pathogenesis. FEBS Lett 2020; 594:2339-2369. [PMID: 32337704 DOI: 10.1002/1873-3468.13797] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 03/01/2020] [Revised: 04/16/2020] [Accepted: 04/20/2020] [Indexed: 12/13/2022]
Abstract
Protein phosphorylation regulates a large variety of biological processes in all living cells. In pathogenic bacteria, the study of serine, threonine, and tyrosine (Ser/Thr/Tyr) phosphorylation has shed light on the course of infectious diseases, from adherence to host cells to pathogen virulence, replication, and persistence. Mass spectrometry (MS)-based phosphoproteomics has provided global maps of Ser/Thr/Tyr phosphosites in bacterial pathogens. Despite recent developments, a quantitative and dynamic view of phosphorylation events that occur during bacterial pathogenesis is currently lacking. Temporal, spatial, and subpopulation resolution of phosphorylation data is required to identify key regulatory nodes underlying bacterial pathogenesis. Herein, we discuss how technological improvements in sample handling, MS instrumentation, data processing, and machine learning should improve bacterial phosphoproteomic datasets and the information extracted from them. Such information is expected to significantly extend the current knowledge of Ser/Thr/Tyr phosphorylation in pathogenic bacteria and should ultimately contribute to the design of novel strategies to combat bacterial infections.
Affiliation(s)
- Julie Bonne Køhler
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| | - Carsten Jers
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| | - Mériem Senissar
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| | - Lei Shi
- Systems and Synthetic Biology Division, Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Abderahmane Derouiche
- Systems and Synthetic Biology Division, Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Ivan Mijakovic
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark.,Systems and Synthetic Biology Division, Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
|
18
|
Li F, Fan C, Marquez-Lago TT, Leier A, Revote J, Jia C, Zhu Y, Smith AI, Webb GI, Liu Q, Wei L, Li J, Song J. PRISMOID: a comprehensive 3D structure database for post-translational modifications and mutations with functional impact. Brief Bioinform 2020; 21:1069-1079. [PMID: 31161204 PMCID: PMC7299293 DOI: 10.1093/bib/bbz050] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 02/21/2019] [Revised: 03/26/2019] [Accepted: 03/29/2019] [Indexed: 12/26/2022] Open
Abstract
Post-translational modifications (PTMs) play very important roles in various cell signaling pathways and biological processes. Because of these roles, many major PTMs have been studied, and their functional and mechanical characterization is well documented in several databases. However, most currently available databases mainly focus on protein sequences, while the real 3D structures of PTMs have been largely ignored. Therefore, studies of PTM 3D structural signatures have been severely limited by the lack of data. Here, we develop PRISMOID, a novel, publicly available and free 3D structure database for a wide range of PTMs. PRISMOID is an up-to-date and interactive online knowledge base with a specific focus on the 3D structural contexts of PTM sites and on mutations with functional impact that occur at PTM sites or in their close proximity. The first version of PRISMOID encompasses 17 145 non-redundant modification sites on 3919 related protein 3D structure entries pertaining to 37 different types of PTMs. Our entry web page is organized in a comprehensive manner, including detailed PTM annotation on the 3D structure and biological information in terms of mutations affecting PTMs, secondary structure features and per-residue solvent accessibility features of PTM sites, domain context, predicted natively disordered regions and sequence alignments. In addition, high-definition JavaScript packages are employed to enhance information visualization in PRISMOID. PRISMOID offers a variety of interactive and customizable search options and data browsing functions; these capabilities allow users to access data via keyword, ID and advanced combined-option searches in an efficient and user-friendly way. A download page is also provided to enable users to download the SQL file, computational structural features and PTM site data.
We anticipate PRISMOID will swiftly become an invaluable online resource, assisting both biologists and bioinformaticians to conduct experiments and develop applications supporting discovery efforts in the sequence-structural-functional relationship of PTMs and providing important insight into mutations and PTM sites interaction mechanisms. The PRISMOID database is freely accessible at http://prismoid.erc.monash.edu/. The database and web interface are implemented in MySQL, JSP, JavaScript and HTML with all major browsers supported.
Affiliation(s)
- Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| | - Cunshuo Fan
- College of Information Engineering, Northwest A&F University, Yangling, China
| | - Tatiana T Marquez-Lago
- Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
| | - André Leier
- Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
| | - Jerico Revote
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
| | - Cangzhi Jia
- College of Science, Dalian Maritime University, Dalian, China
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Yan Zhu
- Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, Victoria, Australia
| | - A Ian Smith
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
| | - Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| | - Quanzhong Liu
- College of Information Engineering, Northwest A&F University, Yangling, China
| | - Leyi Wei
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jian Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
|
19
|
Santhosh R, Bankoti N, Gurudarshan M, Jeyakanthan J, Sekar K. IMRPS: Inserted and Modified Residues in Protein Structures. A database. J Appl Crystallogr 2020. [DOI: 10.1107/s1600576720001880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
Abstract
Modified residues present in proteins are the result of post-translational modifications (PTMs). These PTMs increase the functional diversity of the proteome and influence various biological processes and diseased conditions. Therefore, identification and understanding of PTMs in various protein structures is of great significance. In view of this, an online database, Inserted and Modified Residues in Protein Structures (IMRPS), has been developed. IMRPS is a derived database that furnishes information on the residues modified and inserted in the protein structures available in the Protein Data Bank (PDB). The database is equipped with a graphical user interface and has an option to view the data for non-redundant protein structures (25 and 90%) as well. A quality criteria cutoff has been incorporated to assist in displaying the specific set of PDB codes. The entire protein structure along with the inserted or modified residues can be visualized in JSmol. This database will be updated regularly (presently, every three months) and can be accessed through the URL http://cluster.physics.iisc.ac.in/imrps/.
|
20
|
Accurate Representation of Protein-Ligand Structural Diversity in the Protein Data Bank (PDB). Int J Mol Sci 2020; 21:ijms21062243. [PMID: 32213914 PMCID: PMC7139665 DOI: 10.3390/ijms21062243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2020] [Revised: 03/06/2020] [Accepted: 03/20/2020] [Indexed: 11/16/2022] Open
Abstract
The number of available protein structures in the Protein Data Bank (PDB) has considerably increased in recent years. Thanks to the growth of structures and complexes, numerous large-scale studies have been done in various research areas, e.g., protein-protein, protein-DNA, or in drug discovery. While protein redundancy has usually been managed with a simple protein sequence identity threshold, the similarity of protein-ligand complexes should also be considered from a structural perspective. Protein-ligand duplicates in the PDB are widely known, but have never been quantitatively assessed, as they are quite complex to analyze and compare. Here, we present a specific clustering of protein-ligand structures to avoid the bias found in different studies. The methodology is based on binding site superposition and a combination of weighted Root Mean Square Deviation (RMSD) assessment and hierarchical clustering. Repeated structures of proteins of interest are highlighted, and only representative conformations are retained for a non-biased view of protein distribution. Three types of cases are described based on the number of distinct conformations identified for each complex. Defining these categories decreases the number of complexes 3.84-fold and offers more refined results than a protein sequence-based method. Widely distinct conformations were analyzed using normalized B-factors. Furthermore, a non-redundant dataset was generated for future molecular interaction analysis or virtual screening studies.
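The combination of weighted RMSD and hierarchical clustering described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the weighting scheme, the linkage criterion, or the cutoff, so the per-atom weights, the complete-linkage rule, and the 1.0 cutoff below are all assumptions chosen for illustration, and the coordinates are toy values assumed to be already superposed on the binding site.

```python
import math

def weighted_rmsd(a, b, weights):
    """Weighted RMSD between two coordinate lists that are already superposed."""
    total = sum(
        w * sum((p - q) ** 2 for p, q in zip(x, y))
        for x, y, w in zip(a, b, weights)
    )
    return math.sqrt(total / sum(weights))

def cluster(conformations, weights, cutoff):
    """Agglomerative clustering with complete linkage (an assumed choice).

    Merging stops once the closest pair of clusters is farther apart than cutoff.
    Returns a list of clusters, each a list of conformation indices.
    """
    clusters = [[i] for i in range(len(conformations))]

    def dist(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(
            weighted_rmsd(conformations[i], conformations[j], weights)
            for i in c1 for j in c2
        )

    while len(clusters) > 1:
        d, i, j = min(
            (dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy data (hypothetical): three two-atom "binding sites"; the second atom is weighted double.
confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 3.0, 0.0)],
]
weights = [1.0, 2.0]
groups = cluster(confs, weights, cutoff=1.0)
representatives = [g[0] for g in groups]  # keep one conformation per cluster
```

With these toy values the first two conformations fall into one cluster and the third stays on its own, so two representatives are kept out of three structures.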
|
21
|
Narwani TJ, Craveur P, Shinada NK, Floch A, Santuz H, Vattekatte AM, Srinivasan N, Rebehmed J, Gelly JC, Etchebest C, de Brevern AG. Discrete analyses of protein dynamics. J Biomol Struct Dyn 2019; 38:2988-3002. [PMID: 31361191 DOI: 10.1080/07391102.2019.1650112] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 01/07/2023]
Abstract
Protein structures are highly dynamic macromolecules. Their dynamics are often analysed through experimental and/or computational methods, but only for an isolated protein or a limited number of proteins. Here, we explore large-scale protein dynamics simulations to observe the dynamics of local protein conformations from different perspectives. We analysed molecular dynamics to investigate protein flexibility locally, using classical approaches such as RMSf and solvent accessibility, but also innovative approaches such as local entropy. First, we focussed on classical secondary structures and analysed specifically how β-strands, β-turns, and bends evolve during molecular simulations. We underlined an interesting specific bias between β-turns and bends, which are considered to belong to the same category although their dynamics differ. Second, we used a structural alphabet that is able to approximate every part of protein structure conformations, namely protein blocks (PBs), to analyse (i) how each initial local protein conformation evolves during dynamics and (ii) whether some exchange can exist among these PBs. Interestingly, the results are far more complex than a simple regular/rigid and coil/flexible exchange.
Abbreviations: Neq, number of equivalent; PB, Protein Blocks; PDB, Protein DataBank; RMSf, root mean square fluctuations.
Communicated by Ramaswamy H. Sarma.
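The RMSf measure named in this abstract can be sketched in a few lines. This is a generic textbook-style computation, not code from the study: it assumes the trajectory frames have already been aligned to a common reference, and the function name and the toy trajectory are hypothetical.

```python
import math

def rmsf(trajectory):
    """Per-atom root mean square fluctuation.

    trajectory: list of frames, each frame a list of (x, y, z) tuples,
    assumed to be aligned to a common reference beforehand.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Mean position of atom a over all frames.
        mean = [sum(frame[a][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msf = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result

# Toy trajectory (hypothetical): atom 0 is rigid, atom 1 moves along x.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
fluctuations = rmsf(frames)
```

In this toy case the rigid atom has an RMSf of 0 and the moving atom an RMSf of 1, which is the kind of per-position flexibility profile the abstract refers to.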
Affiliation(s)
- Tarun Jairaj Narwani
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France
| | - Pierrick Craveur
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Nicolas K Shinada
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Discngine, SAS, Paris, France
| | - Aline Floch
- Laboratoire D'Excellence GR-Ex, Paris, France.,Etablissement Français du Sang Ile de France, Créteil, France.,IMRB - INSERM U955 Team 2 « Transfusion et Maladies du Globule Rouge », Paris Est- Créteil Univ, Créteil, France.,UPEC, Université Paris Est-Créteil, Créteil, France
| | - Hubert Santuz
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France
| | - Akhila Melarkode Vattekatte
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Faculté Des Sciences et Technologies, Saint Denis Messag, La Réunion, France
| | | | - Joseph Rebehmed
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
| | - Jean-Christophe Gelly
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Faculté Des Sciences et Technologies, Saint Denis Messag, La Réunion, France.,IBL, Paris, France
| | - Catherine Etchebest
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Faculté Des Sciences et Technologies, Saint Denis Messag, La Réunion, France
| | - Alexandre G de Brevern
- Biologie Intégrée du Globule Rouge UMR_S1134, Inserm, Univ. Paris, Univ. de la Réunion, Univ. des Antilles, Paris, France.,Laboratoire D'Excellence GR-Ex, Paris, France.,Institut National de la Transfusion Sanguine (INTS), Paris, France.,Faculté Des Sciences et Technologies, Saint Denis Messag, La Réunion, France.,IBL, Paris, France
|
22
|
Sharapov SZ, Tsepilov YA, Klaric L, Mangino M, Thareja G, Shadrina AS, Simurina M, Dagostino C, Dmitrieva J, Vilaj M, Vuckovic F, Pavic T, Stambuk J, Trbojevic-Akmacic I, Kristic J, Simunovic J, Momcilovic A, Campbell H, Doherty M, Dunlop MG, Farrington SM, Pucic-Bakovic M, Gieger C, Allegri M, Louis E, Georges M, Suhre K, Spector T, Williams FMK, Lauc G, Aulchenko YS. Defining the genetic control of human blood plasma N-glycome using genome-wide association study. Hum Mol Genet 2019; 28:2062-2077. [PMID: 31163085 PMCID: PMC6664388 DOI: 10.1093/hmg/ddz054] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 08/03/2018] [Revised: 03/01/2019] [Accepted: 03/06/2019] [Indexed: 01/10/2023] Open
Abstract
Glycosylation is a common post-translational modification of proteins. Glycosylation is associated with a number of human diseases. Defining genetic factors altering glycosylation may provide a basis for novel approaches to diagnostic and pharmaceutical applications. Here we report a genome-wide association study of the human blood plasma N-glycome composition in up to 3811 people measured by Ultra Performance Liquid Chromatography (UPLC) technology. Starting with the 36 original traits measured by UPLC, we computed an additional 77 derived traits leading to a total of 113 glycan traits. We studied associations between these traits and genetic polymorphisms located on human autosomes. We discovered and replicated 12 loci. This allowed us to demonstrate an overlap in genetic control between total plasma protein and IgG glycosylation. The majority of revealed loci contained genes that encode enzymes directly involved in glycosylation (FUT3/FUT6, FUT8, B3GAT1, ST6GAL1, B4GALT1, ST3GAL4, MGAT3 and MGAT5) and a known regulator of plasma protein fucosylation (HNF1A). However, we also found loci that could possibly reflect other more complex aspects of glycosylation process. Functional genomic annotation suggested the role of several genes including DERL3, CHCHD10, TMEM121, IGH and IKZF1. The hypotheses we generated may serve as a starting point for further functional studies in this research area.
Affiliation(s)
- Sodbo Zh Sharapov
- Institute of Cytology and Genetics SB RAS, Prospekt Lavrentyeva 10, Novosibirsk, Russia
- Novosibirsk State University, 1, Pirogova str., Novosibirsk, Russia
| | - Yakov A Tsepilov
- Institute of Cytology and Genetics SB RAS, Prospekt Lavrentyeva 10, Novosibirsk, Russia
- Novosibirsk State University, 1, Pirogova str., Novosibirsk, Russia
| | - Lucija Klaric
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Crewe Road South, Edinburgh, UK
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Massimo Mangino
- Department of Twin Research and Genetic Epidemiology, School of Life Course Sciences, King’s College London, St Thomas’ Campus, London, UK
- NIHR Biomedical Research Centre at Guy’s and St Thomas’ Foundation Trust, London, UK
| | - Gaurav Thareja
- Department of Physiology and Biophysics, Weill Cornell Medicine-Qatar, Education City, Doha, Qatar
| | | | - Mirna Simurina
- Faculty of Pharmacy and Biochemistry, University of Zagreb, Ante Kovacica 1, Zagreb, Croatia
| | - Concetta Dagostino
- Department of Medicine and Surgery, University of Parma, Via Gramsci 14, Parma, Italy
| | - Julia Dmitrieva
- Unit of Animal Genomics, WELBIO, GIGA-R and Faculty of Veterinary Medicine, University of Liège, Liège, Belgium
| | - Marija Vilaj
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Frano Vuckovic
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Tamara Pavic
- Faculty of Pharmacy and Biochemistry, University of Zagreb, Ante Kovacica 1, Zagreb, Croatia
| | - Jerko Stambuk
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | | | - Jasminka Kristic
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Jelena Simunovic
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Ana Momcilovic
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Harry Campbell
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
- Colon Cancer Genetics Group, MRC Human Genetics Unit, MRC Institute of Genetics & Molecular Medicine, Western General Hospital, The University of Edinburgh, Edinburgh, UK
| | - Margaret Doherty
- Institute of Technology Sligo, Department of Life Sciences, Sligo, Ireland
- National Institute for Bioprocessing Research & Training, Dublin, Ireland
| | - Malcolm G Dunlop
- Colon Cancer Genetics Group, MRC Human Genetics Unit, MRC Institute of Genetics & Molecular Medicine, Western General Hospital, The University of Edinburgh, Edinburgh, UK
| | - Susan M Farrington
- Colon Cancer Genetics Group, MRC Human Genetics Unit, MRC Institute of Genetics & Molecular Medicine, Western General Hospital, The University of Edinburgh, Edinburgh, UK
| | - Maja Pucic-Bakovic
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
| | - Christian Gieger
- Institute of Epidemiology II, Research Unit of Molecular Epidemiology, Helmholtz Centre Munich, German Research Center for Environmental Health, Ingolstädter Landstr. 1, Neuherberg, Germany
| | - Massimo Allegri
- Pain Therapy Department, Policlinico Monza Hospital, Monza, Italy
| | - Edouard Louis
- CHU-Liège and Unit of Gastroenterology, GIGA-R and Faculty of Medicine, University of Liège, 1 Avenue de l’Hôpital, Liège, Belgium
| | - Michel Georges
- Unit of Animal Genomics, WELBIO, GIGA-R and Faculty of Veterinary Medicine, University of Liège, Liège, Belgium
| | - Karsten Suhre
- Department of Physiology and Biophysics, Weill Cornell Medicine-Qatar, Education City, Doha, Qatar
| | - Tim Spector
- Department of Twin Research and Genetic Epidemiology, School of Life Course Sciences, King’s College London, St Thomas’ Campus, London, UK
| | - Frances M K Williams
- Department of Twin Research and Genetic Epidemiology, School of Life Course Sciences, King’s College London, St Thomas’ Campus, London, UK
| | - Gordan Lauc
- Genos Glycoscience Research Laboratory, Borongajska cesta 83h, Zagreb, Croatia
- Faculty of Pharmacy and Biochemistry, University of Zagreb, Ante Kovacica 1, Zagreb, Croatia
| | - Yurii S Aulchenko
- Institute of Cytology and Genetics SB RAS, Prospekt Lavrentyeva 10, Novosibirsk, Russia
- Novosibirsk State University, 1, Pirogova str., Novosibirsk, Russia
- PolyOmica, Het Vlaggeschip 61, PA 's-Hertogenbosch, The Netherlands
|
23
|
Craveur P, Narwani TJ, Rebehmed J, de Brevern AG. Investigation of the impact of PTMs on the protein backbone conformation. Amino Acids 2019; 51:1065-1079. [DOI: 10.1007/s00726-019-02747-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/18/2018] [Accepted: 05/18/2019] [Indexed: 12/17/2022]
|
24
|
Pascovici D, Wu JX, McKay MJ, Joseph C, Noor Z, Kamath K, Wu Y, Ranganathan S, Gupta V, Mirzaei M. Clinically Relevant Post-Translational Modification Analyses-Maturing Workflows and Bioinformatics Tools. Int J Mol Sci 2018; 20:E16. [PMID: 30577541 PMCID: PMC6337699 DOI: 10.3390/ijms20010016] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 11/15/2018] [Revised: 12/09/2018] [Accepted: 12/17/2018] [Indexed: 01/04/2023]
Abstract
Post-translational modifications (PTMs) can occur soon after translation or at any stage in the lifecycle of a given protein, and they may help regulate protein folding, stability, cellular localisation, activity, or the interactions proteins have with other proteins or biomolecular species. PTMs are crucial to our functional understanding of biology, and new quantitative mass spectrometry (MS) and bioinformatics workflows are maturing both in labelled multiplexed and label-free techniques, offering increasing coverage and new opportunities to study human health and disease. Techniques such as Data Independent Acquisition (DIA) are emerging as promising approaches due to their re-mining capability. Many bioinformatics tools have been developed to support the analysis of PTMs by mass spectrometry, from prediction and PTM site assignment, to open searches enabling better mining of unassigned mass spectra (many of which likely harbour PTMs), through to understanding PTM associations and interactions. The remaining challenge lies in extracting functional information from clinically relevant PTM studies. This review focuses on canvassing the options and progress of PTM analysis for large quantitative studies, from choosing the platform, through to data analysis, with an emphasis on clinically relevant samples such as plasma and other body fluids, and well-established tools and options for data interpretation.
Affiliation(s)
- Dana Pascovici
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
| | - Jemma X Wu
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
| | - Matthew J McKay
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
| | - Chitra Joseph
- Department of Clinical Medicine, Macquarie University, Sydney, NSW 2109, Australia.
| | - Zainab Noor
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
| | - Karthik Kamath
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
| | - Yunqi Wu
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
| | - Shoba Ranganathan
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
| | - Vivek Gupta
- Department of Clinical Medicine, Macquarie University, Sydney, NSW 2109, Australia.
| | - Mehdi Mirzaei
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
- Department of Clinical Medicine, Macquarie University, Sydney, NSW 2109, Australia.
|
25
|
Dehzangi A, López Y, Taherzadeh G, Sharma A, Tsunoda T. SumSec: Accurate Prediction of Sumoylation Sites Using Predicted Secondary Structure. Molecules 2018; 23:E3260. [PMID: 30544729 PMCID: PMC6320791 DOI: 10.3390/molecules23123260] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/28/2018] [Revised: 11/30/2018] [Accepted: 12/05/2018] [Indexed: 12/13/2022]
Abstract
Post-translational modification (PTM) is defined as the modification of amino acids along the protein sequences after the translation process. These modifications significantly affect the functioning of proteins. Therefore, having a comprehensive understanding of the underlying mechanism of PTMs turns out to be critical in studying the biological roles of proteins. Among a wide range of PTMs, sumoylation is one of the most important modifications due to its known cellular functions, which include transcriptional regulation, protein stability, and protein subcellular localization. Despite its importance, determining sumoylation sites via experimental methods is time-consuming and costly. This has led to a great demand for the development of fast computational methods able to accurately determine sumoylation sites in proteins. In this study, we present a new machine learning-based method for predicting sumoylation sites called SumSec. To do this, we employed the predicted secondary structure of amino acids to extract two types of structural features from neighboring amino acids along the protein sequence, which have never been used for this task. As a result, our proposed method is able to enhance the sumoylation site prediction task, outperforming previously proposed methods in the literature. SumSec demonstrated high sensitivity (0.91), accuracy (0.94) and MCC (0.88). The prediction accuracy achieved in this study is 21% better than those reported in previous studies. The script and extracted features are publicly available at: https://github.com/YosvanyLopez/SumSec.
Affiliation(s)
- Abdollah Dehzangi
- Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA.
| | - Yosvany López
- Genesis Institute of Genetic Research, Genesis Healthcare Co., Tokyo 150-6015, Japan.
| | - Ghazaleh Taherzadeh
- School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia.
| | - Alok Sharma
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane 4111, Australia.
- School of Engineering & Physics, University of the South Pacific, Suva, Fiji.
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
- CREST, JST, Tokyo 102-0076, Japan.
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
- CREST, JST, Tokyo 102-0076, Japan.
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
|
26
|
Srivastava A, Nagai T, Srivastava A, Miyashita O, Tama F. Role of Computational Methods in Going beyond X-ray Crystallography to Explore Protein Structure and Dynamics. Int J Mol Sci 2018; 19:E3401. [PMID: 30380757 PMCID: PMC6274748 DOI: 10.3390/ijms19113401] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 10/01/2018] [Revised: 10/20/2018] [Accepted: 10/27/2018] [Indexed: 12/13/2022]
Abstract
Protein structural biology has come a long way since the determination of the first three-dimensional structure of myoglobin about six decades ago. Across this period, X-ray crystallography was the most important experimental method for gaining atomic-resolution insight into protein structures. However, as the role of dynamics gained importance in the function of proteins, the limitations of X-ray crystallography in not being able to capture dynamics came to the forefront. Computational methods proved to be immensely successful in understanding protein dynamics in solution, and they continue to improve in terms of both the scale and the types of systems that can be studied. In this review, we briefly discuss the limitations of X-ray crystallography in studying protein dynamics, and then provide an overview of different computational methods that are instrumental in understanding the dynamics of proteins and biomacromolecular complexes.
Affiliation(s)
- Ashutosh Srivastava
- Institute of Transformative Bio-Molecules (WPI), Nagoya University, Nagoya, Aichi 464-8601, Japan.
| | - Tetsuro Nagai
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan.
| | - Arpita Srivastava
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan.
| | - Osamu Miyashita
- RIKEN-Center for Computational Science, Kobe, Hyogo 650-0047, Japan.
| | - Florence Tama
- Institute of Transformative Bio-Molecules (WPI), Nagoya University, Nagoya, Aichi 464-8601, Japan.
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan.
- RIKEN-Center for Computational Science, Kobe, Hyogo 650-0047, Japan.
|
27
|
Ledesma L, Sandoval E, Cruz-Martínez U, Escalante AM, Mejía S, Moreno-Álvarez P, Ávila E, García E, Coello G, Torres-Quiroz F. YAAM: Yeast Amino Acid Modifications Database. Database (Oxford) 2018; 2018:4797096. [PMID: 29688347 PMCID: PMC7206644 DOI: 10.1093/database/bax099] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 03/24/2017] [Accepted: 12/06/2017] [Indexed: 01/29/2023]
Abstract
Proteins are dynamic molecules that regulate a myriad of cellular functions; these functions may be regulated by protein post-translational modifications (PTMs) that mediate the activity, localization and interaction partners of proteins. Thus, understanding the meaning of a single PTM or the combination of several of them is essential to unravel the mechanisms of protein regulation. Yeast Amino Acid Modification (YAAM) (http://yaam.ifc.unam.mx) is a comprehensive database that contains information from 121 921 residues of proteins, which are post-translationally modified in the yeast model Saccharomyces cerevisiae. All the PTMs contained in YAAM have been confirmed experimentally. The YAAM database maps PTM residues in a 3D canvas for 680 proteins with a known 3D structure. The structure can be visualized and manipulated using the most common web browsers without the need for any additional plugin. The aim of our database is to retrieve and organize data about the location of modified amino acids, providing information in a concise but comprehensive and user-friendly way, enabling users to find relevant information on PTMs. Given that PTMs influence almost all aspects of the biology of both healthy and diseased cells, identifying and understanding PTMs is critical in the study of molecular and cell biology. YAAM allows users to perform multiple searches, up to three modifications at the same residue, giving the possibility to explore possible regulatory mechanisms for some proteins. Using the YAAM search engine, we found three different PTMs of lysine residues involved in protein translation. This suggests an important regulatory mechanism for protein translation that needs to be further studied. Database URL: http://yaam.ifc.unam.mx/
Affiliation(s)
- Leonardo Ledesma
- Unidad de Cómputo, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Eduardo Sandoval
- Unidad de Cómputo, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Uriel Cruz-Martínez
- División de Ciencia Básica, Departamento de Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Ana María Escalante
- Unidad de Cómputo, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Selene Mejía
- Coordinación de Difusión y Divulgación, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Paola Moreno-Álvarez
- División de Ciencia Básica, Departamento de Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Emiliano Ávila
- División de Ciencia Básica, Departamento de Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Erik García
- División de Ciencia Básica, Departamento de Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Gerardo Coello
- Unidad de Cómputo, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
| | - Francisco Torres-Quiroz
- División de Ciencia Básica, Departamento de Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México, Ciudad de México 04510, México
|
28
|
Su MG, Weng JTY, Hsu JBK, Huang KY, Chi YH, Lee TY. Investigation and identification of functional post-translational modification sites associated with drug binding and protein-protein interactions. BMC Syst Biol 2017; 11:132. [PMID: 29322920 PMCID: PMC5763307 DOI: 10.1186/s12918-017-0506-1] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Indexed: 12/31/2022]
Abstract
Background: Protein post-translational modification (PTM) plays an essential role in various cellular processes, modulating the physical and chemical properties, folding, conformation, stability and activity of proteins, thereby modifying the functions of proteins. The improved throughput of mass spectrometry (MS) or MS/MS technology has not only brought about a surge in proteome-scale studies, but also contributed to a fruitful list of identified PTMs. However, with the increase in the number of identified PTMs, perhaps the more crucial question is what kind of biological mechanisms these PTMs are involved in. This is particularly important in light of the fact that most protein-based pharmaceuticals deliver their therapeutic effects through some form of PTM. Yet, our understanding is still limited with respect to the local effects and frequency of PTM sites near pharmaceutical binding sites and the interfaces of protein-protein interaction (PPI). Understanding PTM’s function is critical to our ability to manipulate the biological mechanisms of proteins. Results: In this study, to understand the regulation of protein functions by PTMs, we mapped 25,835 PTM sites to proteins with available three-dimensional (3D) structural information in the Protein Data Bank (PDB), including 1785 modified PTM sites on the 3D structure. Based on the acquired structural PTM sites, we proposed to use five properties for the structural characterization of PTM substrate sites: the spatial composition of amino acids, residues and side-chain orientations surrounding the PTM substrate sites, as well as the secondary structure, division of acidic and alkaline residues, and solvent-accessible surface area. We further mapped the structural PTM sites to the structures of drug binding and PPI sites, identifying a total of 1917 PTM sites that may affect PPI and 3951 PTM sites associated with drug-target binding. An integrated analytical platform (CruxPTM), with a variety of methods and online molecular docking tools for exploring the structural characteristics of PTMs, is presented. In addition, all tertiary structures of PTM sites on proteins can be visualized using the JSmol program. Conclusion: Resolving the function of PTM sites is important for understanding the role that proteins play in biological mechanisms. Our work attempted to delineate the structural correlation between PTM sites and PPI or drug-target binding. CruxPTM could help scientists narrow the scope of their PTM research and enhance the efficiency of PTM identification in the face of big proteome data. CruxPTM is now available at http://csb.cse.yzu.edu.tw/CruxPTM/.
Affiliation(s)
- Min-Gang Su
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan
| | - Julia Tzu-Ya Weng
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan
| | - Justin Bo-Kai Hsu
- Department of Medical Research, Taipei Medical University Hospital, Taipei, 110, Taiwan
| | - Kai-Yao Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan.,Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
| | - Yu-Hsiang Chi
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan
| | - Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan. .,Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan, 320, Taiwan.
|
29
|
Abriata LA. Structural database resources for biological macromolecules. Brief Bioinform 2017; 18:659-669. [PMID: 27273290 DOI: 10.1093/bib/bbw049] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/11/2016] [Indexed: 12/30/2022]
Abstract
This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often also including complex metrics precomputed by experts or external programs, and connections to sequence and functional annotation databases. Importantly, updates of most of these databases involve steps of curation and error checks based on specific expertise about the subject molecules or interactions, and removal of sequence redundancy, both leading to better data sets for mining studies compared with the full list of raw PDB entries. The article presents the databases in groups such as those aimed to facilitate browsing through PDB entries, their molecules and their general information, those built to link protein structure with sequence and dynamics, those specific for transmembrane proteins, nucleic acids, interactions of biomacromolecules with each other and with small molecules or metal ions, and those concerning specific structural features or specific protein families. A few webservers directly connected to active databases, and a few databases that have been discontinued but would be important to have back, are also briefly commented on. Throughout the Briefing, sample cases where these databases have been used to aid structural studies or advance our knowledge about biological macromolecules are referenced. A few specific examples are also given where using these databases is easier and more informative than using raw PDB data.
|
30
|
Laurie J, Chattopadhyay AK, Flower DR. Protein lipograms. J Theor Biol 2017; 430:109-116. [PMID: 28716385 DOI: 10.1016/j.jtbi.2017.07.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2017] [Revised: 06/30/2017] [Accepted: 07/12/2017] [Indexed: 11/20/2022]
Abstract
Linguistic analysis of protein sequences is an underexploited technique. Here, we capitalize on the concept of the lipogram to characterize sequences at the proteome levels. A lipogram is a literary composition which omits one or more letters. A protein lipogram likewise omits one or more types of amino acid. In this article, we establish a usable terminology for the decomposition of a sequence collection in terms of the lipogram. Next, we characterize Uniref50 using a lipogram decomposition. At the global level, protein lipograms exhibit power-law properties. A clear correlation with metabolic cost is seen. Finally, we use the lipogram construction to assign proteomes to the four branches of the tree-of-life: archaea, bacteria, eukaryotes and viruses. We conclude from this pilot study that the lipogram demonstrates considerable potential as an additional tool for sequence analysis and proteome classification.
Affiliation(s)
- Jason Laurie
- School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK; Systems Analytics Research Institute, Aston University, Birmingham B4 7ET, UK
| | - Amit K Chattopadhyay
- School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK; Systems Analytics Research Institute, Aston University, Birmingham B4 7ET, UK
| | - Darren R Flower
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK.
|
31
|
Shen X, Klarić L, Sharapov S, Mangino M, Ning Z, Wu D, Trbojević-Akmačić I, Pučić-Baković M, Rudan I, Polašek O, Hayward C, Spector TD, Wilson JF, Lauc G, Aulchenko YS. Multivariate discovery and replication of five novel loci associated with Immunoglobulin G N-glycosylation. Nat Commun 2017; 8:447. [PMID: 28878392 PMCID: PMC5587582 DOI: 10.1038/s41467-017-00453-3] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 03/07/2016] [Accepted: 06/29/2017] [Indexed: 01/20/2023]
Abstract
Joint modeling of a number of phenotypes using multivariate methods has often been neglected in genome-wide association studies and, if used, replication has not been sought. Modern omics technologies allow characterization of functional phenomena using a large number of related phenotype measures, which can benefit from such joint analysis. Here, we report a multivariate genome-wide association study of 23 immunoglobulin G (IgG) N-glycosylation phenotypes. In the discovery cohort, our multi-phenotype method uncovers ten genome-wide significant loci, of which five are novel (IGH, ELL2, HLA-B-C, AZI1, FUT6-FUT3). We convincingly replicate all novel loci via multivariate tests. We show that IgG N-glycosylation loci are strongly enriched for genes expressed in the immune system, in particular antibody-producing cells and B lymphocytes. We empirically demonstrate the efficacy of multivariate methods to discover novel, reproducible pleiotropic effects. Multivariate analysis methods can uncover the relationship between phenotypic measures characterised by modern omic techniques. Here the authors conduct a multivariate GWAS on IgG N-glycosylation phenotypes and identify 5 novel loci enriched in immune system genes.
Affiliation(s)
- Xia Shen
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, Scotland, UK.
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Nobels väg 12 A, SE-17 177, Stockholm, Sweden.
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crew Road, Edinburgh, EH4 2XU, Scotland, UK.
| | - Lucija Klarić
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, Scotland, UK
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crew Road, Edinburgh, EH4 2XU, Scotland, UK
- Genos Glycoscience Research Laboratory, Hondlova 2/11, Zagreb, 10000, Croatia
| | - Sodbo Sharapov
- Novosibirsk State University, Pirogova 2, Novosibirsk, 630090, Russia
- Institute of Cytology and Genetics SB RAS, Lavrentyeva ave. 10, Novosibirsk, 630090, Russia
| | - Massimo Mangino
- Department for Twin Research, King's College London, London, WC2R 2LS, England, UK
- National Institute for Health Research (NIHR) Biomedical Research Centre at Guy's and St. Thomas' Foundation Trust, London, SE1 9RT, England, UK
| | - Zheng Ning
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Nobels väg 12 A, SE-17 177, Stockholm, Sweden
| | - Di Wu
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Tomtebodavägen 23B, Stockholm, SE-171 65, Sweden
| | | | - Maja Pučić-Baković
- Genos Glycoscience Research Laboratory, Hondlova 2/11, Zagreb, 10000, Croatia
| | - Igor Rudan
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, Scotland, UK
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crew Road, Edinburgh, EH4 2XU, Scotland, UK
| | - Ozren Polašek
- Faculty of Medicine, University of Split, Šoltanska ul. 2, Split, 21000, Croatia
| | - Caroline Hayward
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crew Road, Edinburgh, EH4 2XU, Scotland, UK
| | - Timothy D Spector
- Department for Twin Research, King's College London, London, WC2R 2LS, England, UK
| | - James F Wilson
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, Scotland, UK
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crew Road, Edinburgh, EH4 2XU, Scotland, UK
| | - Gordan Lauc
- Genos Glycoscience Research Laboratory, Hondlova 2/11, Zagreb, 10000, Croatia
- Faculty of Pharmacy and Biochemistry, University of Zagreb, A. Kovacica 1, Zagreb, 10000, Croatia
| | - Yurii S Aulchenko
- Novosibirsk State University, Pirogova 2, Novosibirsk, 630090, Russia.
- Institute of Cytology and Genetics SB RAS, Lavrentyeva ave. 10, Novosibirsk, 630090, Russia.
- PolyOmica, Het Vlaggeschip 61, 's-Hertogenbosch, 5237PA, The Netherlands.
|
32
|
Tay AP, Pang CNI, Winter DL, Wilkins MR. PTMOracle: A Cytoscape App for Covisualizing and Coanalyzing Post-Translational Modifications in Protein Interaction Networks. J Proteome Res 2017; 16:1988-2003. [PMID: 28349685 DOI: 10.1021/acs.jproteome.6b01052] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
Post-translational modifications of proteins (PTMs) act as key regulators of protein activity and of protein-protein interactions (PPIs). To date, it has been difficult to comprehensively explore functional links between PTMs and PPIs. To address this, we developed PTMOracle, a Cytoscape app for coanalyzing PTMs within PPI networks. PTMOracle also allows extensive data to be integrated and coanalyzed with PPI networks, allowing the role of domains, motifs, and disordered regions to be considered. For proteins of interest, or a whole proteome, PTMOracle can generate network visualizations to reveal complex PTM-associated relationships. This is assisted by OraclePainter for coloring proteins by modifications, OracleTools for network analytics, and OracleResults for exploring tabulated findings. To illustrate the use of PTMOracle, we investigate PTM-associated relationships and their role in PPIs in four case studies. In the yeast interactome and its rich set of PTMs, we construct and explore histone-associated and domain-domain interaction networks and show how integrative approaches can predict kinases involved in phosphodegrons. In the human interactome, a phosphotyrosine-associated network is analyzed but highlights the sparse nature of human PPI networks and lack of PTM-associated data. PTMOracle is open source and available at the Cytoscape app store: http://apps.cytoscape.org/apps/ptmoracle.
Affiliation(s)
- Aidan P Tay
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Chi Nam Ignatius Pang
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Daniel L Winter
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Marc R Wilkins
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
|
33
|
Dewhurst HM, Torres MP. Systematic analysis of non-structural protein features for the prediction of PTM function potential by artificial neural networks. PLoS One 2017; 12:e0172572. [PMID: 28225828 PMCID: PMC5321281 DOI: 10.1371/journal.pone.0172572] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/03/2017] [Accepted: 02/07/2017] [Indexed: 12/31/2022] Open
Abstract
Post-translational modifications (PTMs) provide an extensible framework for regulation of protein behavior beyond the diversity represented within the genome alone. While the rate of identification of PTMs has rapidly increased in recent years, our knowledge of PTM functionality encompasses less than 5% of this data. We previously developed SAPH-ire (Structural Analysis of PTM Hotspots) for the prioritization of eukaryotic PTMs based on function potential of discrete modified alignment positions (MAPs) in a set of 8 protein families. A proteome-wide expansion of the dataset to all families of PTM-bearing, eukaryotic proteins with a representational crystal structure and the application of artificial neural network (ANN) models demonstrated the broader applicability of this approach. Although structural features of proteins have been repeatedly demonstrated to be predictive of PTM functionality, the availability of adequately resolved 3D structures in the Protein Data Bank (PDB) limits the scope of these methods. In order to bridge this gap and capture the larger set of PTM-bearing proteins without an available, homologous structure, we explored all available MAP features as ANN inputs to identify predictive models that do not rely on 3D protein structural data. This systematic, algorithmic approach explores 8 available input features in exhaustive combinations (247 models; size 2-8). To control for potential bias in random sampling for holdback in training sets, we iterated each model across 100 randomized, sample training and testing sets, yielding 24,700 individual ANNs. The size of the analyzed dataset and iterative generation of ANNs represent the largest and most thorough investigation of predictive models for PTM functionality to date. Comparison of input layer combinations allows us to quantify ANN performance with a high degree of confidence and subsequently select a top-ranked, robust fit model which highlights 3,687 MAPs, including 10,933 PTMs with a high probability of biological impact but without a currently known functional role.
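The model counts quoted in this abstract can be checked by enumerating feature subsets. The sketch below is illustrative only: the feature names are placeholders (the abstract does not list the 8 inputs), and it reproduces the counting, not the authors' ANN training or their code.

```python
from itertools import combinations
from math import comb

# Placeholder names: the abstract states there are 8 input features
# but does not name them, so these labels are hypothetical.
FEATURES = [f"feature_{i}" for i in range(1, 9)]
N_ITERATIONS = 100  # randomized train/test splits per model, per the abstract


def feature_subsets(features, min_size=2):
    """Yield every combination of features with at least min_size members."""
    for size in range(min_size, len(features) + 1):
        yield from combinations(features, size)


subsets = list(feature_subsets(FEATURES))
n_models = len(subsets)            # number of distinct input-layer combinations
n_anns = n_models * N_ITERATIONS   # one ANN per model per randomized split

# Closed-form cross-check: sum of C(8, k) for k = 2..8
closed_form = sum(comb(len(FEATURES), k) for k in range(2, len(FEATURES) + 1))

print(n_models, n_anns, closed_form)
```

Equivalently, 2^8 = 256 subsets minus the empty set and the 8 single-feature subsets leaves 247 models, and 247 x 100 splits gives the 24,700 ANNs reported.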
Affiliation(s)
- Henry M. Dewhurst
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
| | - Matthew P. Torres
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
|
34
|
Abstract
Post-translational modifications (PTMs) are an important source of protein regulation; they fine-tune the function, localization, and interaction with other molecules of the majority of proteins and are partially responsible for their multifunctionality. Usually, proteins have several potential modification sites, and their patterns of occupancy are associated with certain functional states. These patterns imply cross talk among PTMs within and between proteins, the majority of which are still to be discovered. Several methods detect associations between PTMs; these have recently been combined into a global resource, the PTMcode database, which contains known and predicted functional associations between pairs of PTMs from more than 45,000 proteins in 19 eukaryotic species.
Affiliation(s)
- Pablo Minguez
- Department of Genetics and Genomics, Instituto de Investigacion Sanitaria-University Hospital Fundacion Jimenez Diaz (IIS-FJD), Avda. Reyes Católicos 2, 28040, Madrid, Spain.
| | - Peer Bork
- European Molecular Biology Laboratory, Structural and Computational Biology Unit, 69117, Heidelberg, Germany
- Max Delbrück Centre for Molecular Medicine, 13125, Berlin, Germany
|
35
|
Korkuć P, Walther D. Towards understanding the crosstalk between protein post-translational modifications: Homo- and heterotypic PTM pair distances on protein surfaces are not random. Proteins 2016; 85:78-92. [PMID: 27802577 DOI: 10.1002/prot.25200] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/18/2016] [Revised: 09/29/2016] [Accepted: 10/20/2016] [Indexed: 12/18/2022]
Affiliation(s)
- Paula Korkuć
- Max Planck Institute for Molecular Plant Physiology; Am Mühlenberg 1 Potsdam-Golm 14476 Germany
| | - Dirk Walther
- Max Planck Institute for Molecular Plant Physiology; Am Mühlenberg 1 Potsdam-Golm 14476 Germany
|
36
|
Noël F, Malpertuy A, de Brevern AG. Global analysis of VHHs framework regions with a structural alphabet. Biochimie 2016; 131:11-19. [PMID: 27613403 DOI: 10.1016/j.biochi.2016.09.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 08/01/2016] [Revised: 09/05/2016] [Accepted: 09/05/2016] [Indexed: 02/08/2023]
Abstract
VHHs are the antigen-binding domains of camelid heavy chain antibodies (HCAbs). They have many interesting biotechnological and biomedical properties due to their small size, high solubility and stability, and high affinity and specificity for their antigens. HCAbs and classical IgGs are evolutionarily related and share a common fold. VHHs are composed of regions considered as constant, called the frameworks (FRs), connected by Complementarity Determining Regions (CDRs), highly variable regions that provide the interaction with the epitope. To date, no systematic structural analysis has been performed on VHH structures despite the significant number of structures available. This work is the first study to analyse the structural diversity of the FRs of VHHs. Using a structural alphabet that allows approximating the local conformation, we show that none of the four FRs has a unique structure; each exhibits many structural variant patterns. Moreover, no direct simple link between the local conformational change and amino acid composition can be detected. These results indicate that long-range interactions affect the local conformation of FRs and impact the building of structural models.
Affiliation(s)
- Floriane Noël
- INSERM, U 1134, DSIMB, F-75739 Paris, France; Univ Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, F-75739 Paris, France; Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France; Laboratoire d'Excellence GR-Ex, F-75739 Paris, France
| | | | - Alexandre G de Brevern
- INSERM, U 1134, DSIMB, F-75739 Paris, France; Univ Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, F-75739 Paris, France; Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France; Laboratoire d'Excellence GR-Ex, F-75739 Paris, France.
|
37
|
Sirota FL, Maurer-Stroh S, Eisenhaber B, Eisenhaber F. Single-residue posttranslational modification sites at the N-terminus, C-terminus or in-between: To be or not to be exposed for enzyme access. Proteomics 2016; 15:2525-46. [PMID: 26038108 PMCID: PMC4745020 DOI: 10.1002/pmic.201400633] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 12/31/2014] [Revised: 04/17/2015] [Accepted: 05/29/2015] [Indexed: 11/30/2022]
Abstract
Many protein posttranslational modifications (PTMs) are the result of an enzymatic reaction. The modifying enzyme has to recognize the substrate protein's sequence motif containing the residue(s) to be modified; thus, the enzyme's catalytic cleft engulfs these residue(s) and the respective sequence environment. This residue accessibility condition principally limits the range where enzymatic PTMs can occur in the protein sequence. Non‐globular, flexible, intrinsically disordered segments or large loops/accessible long side chains should be preferred, whereas residues buried in the core of structures should be devoid of what we call canonical, enzyme‐generated PTMs. We investigate whether PTM sites annotated in UniProtKB (with MOD_RES/LIPID keys) are situated within sequence ranges that can be mapped to known 3D structures. We find that N‐ or C‐termini harbor essentially exclusively canonical PTMs. We also find that the overwhelming majority of all other PTMs are also canonical though, later in the protein's life cycle, the PTM sites can become buried due to complex formation. Among the remaining cases, some can be explained (i) with autocatalysis, (ii) with modification before folding or after temporary unfolding, or (iii) as products of interaction with small, diffusible reactants. Others require further research into how these PTMs are mechanistically generated in vivo.
Affiliation(s)
- Fernanda L Sirota
- Bioinformatics Institute (BII), Agency for Science and Technology (A*STAR), Matrix, Singapore
| | - Sebastian Maurer-Stroh
- Bioinformatics Institute (BII), Agency for Science and Technology (A*STAR), Matrix, Singapore; School of Biological Sciences (SBS), Nanyang Technological University (NTU), Singapore
| | - Birgit Eisenhaber
- Bioinformatics Institute (BII), Agency for Science and Technology (A*STAR), Matrix, Singapore
| | - Frank Eisenhaber
- Bioinformatics Institute (BII), Agency for Science and Technology (A*STAR), Matrix, Singapore; Department of Biological Sciences (DBS), National University of Singapore (NUS), Singapore; School of Computer Engineering (SCE), Nanyang Technological University (NTU), Singapore
|
38
|
Snider NT, Omary MB. Assays for Posttranslational Modifications of Intermediate Filament Proteins. Methods Enzymol 2015; 568:113-38. [PMID: 26795469 DOI: 10.1016/bs.mie.2015.09.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 12/12/2022]
Abstract
Intermediate filament (IF) proteins are known to be regulated by a number of posttranslational modifications (PTMs). Phosphorylation is the best-studied IF PTM, whereas ubiquitination, sumoylation, acetylation, glycosylation, ADP-ribosylation, farnesylation, and transamidation are less understood in functional terms but are known to regulate specific IFs under various contexts. The number and diversity of IF PTMs are certain to grow along with rapid advances in proteomic technologies. Therefore, the need for a greater understanding of the implications of PTMs for the structure, organization, and function of the IF cytoskeleton has become more apparent with the increased availability of data from global profiling studies of normal and diseased specimens. This chapter will provide information on established methods for the isolation and monitoring of IF PTMs along with the key reagents that are necessary to carry out these experiments.
Affiliation(s)
- Natasha T Snider
- Department of Cell Biology and Physiology, University of North Carolina, Chapel Hill, North Carolina, USA.
| | - M Bishr Omary
- Department of Molecular & Integrative Physiology, Department of Medicine, University of Michigan, Ann Arbor, Michigan, USA; VA Ann Arbor Healthcare System, Ann Arbor, Michigan, USA
|
39
|
Dewhurst HM, Choudhury S, Torres MP. Structural Analysis of PTM Hotspots (SAPH-ire)--A Quantitative Informatics Method Enabling the Discovery of Novel Regulatory Elements in Protein Families. Mol Cell Proteomics 2015; 14:2285-97. [PMID: 26070665 PMCID: PMC4528253 DOI: 10.1074/mcp.m115.051177] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 04/28/2015] [Indexed: 11/08/2022] Open
Abstract
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data.
Affiliation(s)
- Henry M Dewhurst
- Georgia Institute of Technology, School of Biology, 310 Ferst Drive, Atlanta, Georgia 30332
| | - Shilpa Choudhury
- Georgia Institute of Technology, School of Biology, 310 Ferst Drive, Atlanta, Georgia 30332
| | - Matthew P Torres
- Georgia Institute of Technology, School of Biology, 310 Ferst Drive, Atlanta, Georgia 30332
|
40
|
Craveur P, Joseph AP, Esque J, Narwani TJ, Noël F, Shinada N, Goguet M, Leonard S, Poulain P, Bertrand O, Faure G, Rebehmed J, Ghozlane A, Swapna LS, Bhaskara RM, Barnoud J, Téletchéa S, Jallu V, Cerny J, Schneider B, Etchebest C, Srinivasan N, Gelly JC, de Brevern AG. Protein flexibility in the light of structural alphabets. Front Mol Biosci 2015; 2:20. [PMID: 26075209 PMCID: PMC4445325 DOI: 10.3389/fmolb.2015.00020] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 02/28/2015] [Accepted: 04/30/2015] [Indexed: 01/01/2023] Open
Abstract
Protein structures are valuable tools to understand protein function. Nonetheless, proteins are often considered as rigid macromolecules while their structures exhibit specific flexibility, which is essential to complete their functions. Analyses of protein structures and dynamics are often performed with a simplified three-state description, i.e., the classical secondary structures. A more precise and complete description of protein backbone conformation can be obtained using libraries of small protein fragments that are able to approximate every part of protein structures. These libraries, called structural alphabets (SAs), have been widely used in the structure analysis field, from the definition of ligand binding sites to the superimposition of protein structures. SAs are also well suited to analyze the dynamics of protein structures. Here, we review innovative approaches that investigate protein flexibility based on SA descriptions. Coupled to various sources of experimental data (e.g., B-factors) and computational methodology (e.g., molecular dynamics simulation), SAs turn out to be powerful tools to analyze protein dynamics, e.g., to examine allosteric mechanisms in large sets of structures in complexes or to identify order/disorder transitions. SAs were also shown to be quite efficient at predicting protein flexibility from amino-acid sequence. Finally, in this review, we exemplify the interest of SAs for studying flexibility with different cases of proteins implicated in pathologies and diseases.
Affiliation(s)
- Pierrick Craveur
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Agnel P Joseph
- Rutherford Appleton Laboratory, Science and Technology Facilities Council Didcot, UK
| | - Jeremy Esque
- Institut National de la Santé et de la Recherche Médicale U964,7 UMR Centre National de la Recherche Scientifique 7104, IGBMC, Université de Strasbourg Illkirch, France
| | - Tarun J Narwani
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Floriane Noël
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Nicolas Shinada
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Matthieu Goguet
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Sylvain Leonard
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Pierre Poulain
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France ; Ets Poulain Pointe-Noire, Congo
| | - Olivier Bertrand
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Guilhem Faure
- National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health Bethesda, MD, USA
| | - Joseph Rebehmed
- Centre National de la Recherche Scientifique UMR7590, Sorbonne Universités, Université Pierre et Marie Curie - MNHN - IRD - IUC Paris, France
| | | | - Lakshmipuram S Swapna
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India ; Hospital for Sick Children, and Departments of Biochemistry and Molecular Genetics, University of Toronto Toronto, ON, Canada
| | - Ramachandra M Bhaskara
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India ; Department of Theoretical Biophysics, Max Planck Institute of Biophysics Frankfurt, Germany
| | - Jonathan Barnoud
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France ; Laboratoire de Physique, École Normale Supérieure de Lyon, Université de Lyon, Centre National de la Recherche Scientifique UMR 5672 Lyon, France
| | - Stéphane Téletchéa
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France ; Faculté des Sciences et Techniques, Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines, Centre National de la Recherche Scientifique UMR 6286, Université Nantes Nantes, France
| | - Vincent Jallu
- Platelet Unit, Institut National de la Transfusion Sanguine Paris, France
| | - Jiri Cerny
- Institute of Biotechnology, The Czech Academy of Sciences Prague, Czech Republic
| | - Bohdan Schneider
- Institute of Biotechnology, The Czech Academy of Sciences Prague, Czech Republic
| | - Catherine Etchebest
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | | | - Jean-Christophe Gelly
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
| | - Alexandre G de Brevern
- Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
|
41
|
Matlock MK, Holehouse AS, Naegle KM. ProteomeScout: a repository and analysis resource for post-translational modifications and proteins. Nucleic Acids Res 2014; 43:D521-30. [PMID: 25414335 PMCID: PMC4383955 DOI: 10.1093/nar/gku1154] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/12/2023] Open
Abstract
ProteomeScout (https://proteomescout.wustl.edu) is a resource for the study of proteins and their post-translational modifications (PTMs) consisting of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations. The PTM database is a compendium of public PTM data, coupled with user-uploaded experimental data. ProteomeScout provides analysis tools for experimental datasets, including summary views and subset selection, which can identify relationships within subsets of data by testing for statistically significant enrichment of protein annotations. Protein annotations are incorporated in the ProteomeScout database from external resources and include terms such as Gene Ontology annotations, domains, secondary structure and non-synonymous polymorphisms. These annotations are available in the database download, in the analysis tools and in the protein viewer. The protein viewer allows for the simultaneous visualization of annotations in an interactive web graphic, which can be exported in Scalable Vector Graphics (SVG) format. Finally, quantitative data measurements associated with public experiments are also easily viewable within protein records, allowing researchers to see how PTMs change across different contexts. ProteomeScout should prove useful for protein researchers and should benefit the proteomics community by providing a stable repository for PTM experiments.
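The abstract does not say which statistical test ProteomeScout applies when it checks a data subset for annotation enrichment. As one common choice, assumed here purely for illustration, a one-sided hypergeometric test asks how likely it is to see at least the observed number of annotated proteins in a subset drawn at random from the full dataset. The function name and parameters below are hypothetical and are not part of ProteomeScout.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, subset, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: total number of proteins in the dataset
    annotated:  proteins in the dataset carrying the annotation
    subset:     proteins in the selected subset
    hits:       proteins in the subset carrying the annotation
    """
    max_hits = min(annotated, subset)
    total = comb(population, subset)  # number of ways to draw the subset
    tail = sum(
        comb(annotated, k) * comb(population - annotated, subset - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Illustrative numbers only (not taken from any real dataset):
# 1000 proteins, 50 annotated, subset of 40 containing 8 annotated proteins.
p_value = hypergeom_enrichment_p(1000, 50, 40, 8)
print(f"enrichment p-value: {p_value:.3g}")
```

When many annotation terms are tested against the same subset, the resulting p-values would normally be corrected for multiple testing before any term is called enriched.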
Affiliation(s)
- Matthew K Matlock
- Department of Biomedical Engineering and the Center for Biological Systems Engineering, Washington University, St Louis, MO 63130, USA
| | - Alex S Holehouse
- Department of Biomedical Engineering and the Center for Biological Systems Engineering, Washington University, St Louis, MO 63130, USA
| | - Kristen M Naegle
- Department of Biomedical Engineering and the Center for Biological Systems Engineering, Washington University, St Louis, MO 63130, USA
|