1
Cai J, Zhao J, Bin Y, Xia J, Zheng C. iAmyP: A Multi-view Learning for Amyloidogenic Hexapeptides Identification Based on Sequence Least Squares Programming. Interdiscip Sci 2025; 17:277-292. [PMID: 39546159 DOI: 10.1007/s12539-024-00666-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 10/07/2024] [Accepted: 10/09/2024] [Indexed: 11/17/2024]
Abstract
The development of peptide drugs is hindered by the risk of amyloidogenic aggregation; peptides that tend to aggregate in this manner may be unsuitable for drug design. Computational methods for predicting amyloidogenic sequences often struggle to extract high-quality features, and their predictive performance can be enhanced. To address these challenges, iAmyP was introduced as a specialized computational tool for predicting amyloidogenic hexapeptides. Using multi-view learning, iAmyP incorporates sequence, structural, and evolutionary features, and performs feature selection and feature fusion through recursive feature elimination and attention mechanisms. This combination of features, together with the subsequent feature selection and fusion, led to optimal performance, facilitated by an optimization algorithm based on sequence least squares programming. Notably, iAmyP exhibited robust generalization to peptides 7-10 amino acids in length. Hydrophobic amino acids play a critical role in the aggregation process, and a thorough analysis has enhanced our insight into their significance in amyloidogenic hexapeptides. This tool represents an advance in the development of peptide therapeutics by providing an understanding of amyloidogenic aggregation, establishing itself as a valuable framework for assessing amyloidogenic sequences. The data and code can be freely accessed at https://github.com/xialab-ahu/iAmyP .
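The abstract stresses hexapeptide-level prediction, generalization to peptides of 7-10 residues, and the importance of hydrophobic residues. The sketch below is not iAmyP's code. It only illustrates one simple sequence-view feature (mean Kyte-Doolittle hydropathy) and one common way, assumed here, to reduce a longer peptide to overlapping hexapeptide windows. The function names `hexapeptide_windows` and `mean_hydropathy` are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hexapeptide_windows(peptide):
    """Return every overlapping 6-residue window of a peptide.

    Assumption: longer peptides are scanned window by window. The
    abstract does not say how iAmyP itself handles 7-10 residue inputs.
    """
    peptide = peptide.upper()
    return [peptide[i:i + 6] for i in range(len(peptide) - 5)]


def mean_hydropathy(window):
    """Average Kyte-Doolittle hydropathy over the residues of a window."""
    return sum(KYTE_DOOLITTLE[aa] for aa in window) / len(window)


if __name__ == "__main__":
    # KLVFFAED is used purely as an 8-residue illustration.
    for window in hexapeptide_windows("KLVFFAED"):
        print(window, round(mean_hydropathy(window), 2))
```

A real predictor would combine many such features across views; this single descriptor only shows the kind of quantity a sequence view might contain.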
Affiliation(s)
- Jinling Cai
  - College of Mathematics and System Science, Xinjiang University, Urumqi, 830046, China
- Jianping Zhao
  - College of Mathematics and System Science, Xinjiang University, Urumqi, 830046, China.
- Yannan Bin
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Information Materials and Intelligent Sensing Laboratory of Anhui Province, and School of Artificial Intelligence, Anhui University, Hefei, 230601, China.
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, China.
- Junfeng Xia
  - College of Mathematics and System Science, Xinjiang University, Urumqi, 830046, China.
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, China.
- Chunhou Zheng
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Information Materials and Intelligent Sensing Laboratory of Anhui Province, and School of Artificial Intelligence, Anhui University, Hefei, 230601, China.
2
Basha S, Mukunda DC, Pai AR, Mahato KK. Assessing amyloid fibrils and amorphous aggregates: A review. Int J Biol Macromol 2025; 311:143725. [PMID: 40324497 DOI: 10.1016/j.ijbiomac.2025.143725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2025] [Revised: 04/23/2025] [Accepted: 04/29/2025] [Indexed: 05/07/2025]
Abstract
Protein misfolding and aggregation play a central role in the progression of neurodegenerative diseases such as Alzheimer's and Parkinson's. These aggregates manifest either as structured amyloid fibrils enriched in β-sheet conformations or as irregular amorphous aggregates with diverse morphologies. Understanding their formation, structure, and behavior is critical for deciphering disease mechanisms and developing targeted diagnostics and therapeutics. This review presents an integrated overview of both conventional and advanced techniques used to detect, distinguish, and structurally characterize these protein aggregates. It covers a range of spectroscopic and spectrometric tools, such as fluorescence, Raman, and mass spectrometry, that facilitate aggregate identification. Microscopy methods, including atomic force and electron microscopy, are highlighted for morphological analysis. The review also discusses in situ detection strategies using fluorescent dyes, conformation-specific antibodies, enzymatic reporters, and real-time imaging. Separation methods such as centrifugation, electrophoresis, and chromatography are outlined alongside structural analysis tools such as X-ray diffraction. Furthermore, the growing utility of computational approaches and artificial intelligence in predicting aggregation propensities and integrating biological data is emphasized. By critically evaluating each method's capabilities and limitations, this review provides a practical and forward-looking resource for researchers studying the complex landscape of protein aggregation.
Affiliation(s)
- Shaik Basha
  - Department of Biophysics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal 576104, Karnataka, India
- Aparna Ramakrishna Pai
  - Department of Neurology, Kasturba Medical College Manipal, Manipal Academy of Higher Education, Manipal 576104, Karnataka, India
- Krishna Kishore Mahato
  - Department of Biophysics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal 576104, Karnataka, India.
3
Hassan M, Shahzadi S, Li MS, Kloczkowski A. Prediction and Evaluation of Protein Aggregation with Computational Methods. Methods Mol Biol 2025; 2867:299-314. [PMID: 39576588 PMCID: PMC12126135 DOI: 10.1007/978-1-0716-4196-5_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Protein and peptide aggregation has recently become one of the most studied biomedical problems because of its central role in several neurodegenerative disorders and its biotechnological importance. Multiple in silico methods, databases, tools, and algorithms have been developed to predict the aggregation of proteins and peptides and to better understand the fundamental mechanisms of various aggregation diseases. Here, we provide a brief overview of bioinformatic methods and tools for understanding the molecular mechanisms of aggregation disorders. Furthermore, a better understanding of protein aggregation mechanisms might make it possible to design novel therapeutic agents to treat, and hopefully prevent, protein aggregation diseases.
Affiliation(s)
- Mubashir Hassan
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA.
- Saba Shahzadi
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Mai Suan Li
  - Institute of Physics, Polish Academy of Sciences, Warsaw, Poland
- Andrzej Kloczkowski
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA.
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA.
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA.
4
Li W, Lin H, Huang Z, Xie S, Zhou Y, Gong R, Jiang Q, Xiang C, Huang J. DOTAD: A Database of Therapeutic Antibody Developability. Interdiscip Sci 2024; 16:623-634. [PMID: 38530613 DOI: 10.1007/s12539-024-00613-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 01/25/2024] [Accepted: 01/27/2024] [Indexed: 03/28/2024]
Abstract
The development of therapeutic antibodies is an important aspect of new drug discovery pipelines. Assessing an antibody's developability, that is, its suitability for large-scale production and therapeutic use, is a particularly important step in this process. Because experimental assays for assessing antibody developability at large scale are expensive and time-consuming, computational methods have become a more efficient alternative. However, the antibody research community faces significant challenges due to the scarcity of readily accessible data on antibody developability, which is essential for training and validating computational models. To address this gap, DOTAD (Database Of Therapeutic Antibody Developability) has been built as the first database dedicated exclusively to the curation of therapeutic antibody developability information. DOTAD aggregates all available therapeutic antibody sequence data along with various developability metrics from the scientific literature, offering researchers a robust platform for data storage, retrieval, exploration, and downloading. In addition to serving as a comprehensive repository, DOTAD integrates a web-based interface that features state-of-the-art tools for the assessment of antibody developability. This ensures that users not only have access to critical data but can also conveniently analyze and interpret this information. The DOTAD database represents a valuable resource for the scientific community, facilitating the advancement of therapeutic antibody research. It is freely accessible at http://i.uestc.edu.cn/DOTAD/ , providing an open data platform that supports the continuous growth and evolution of computational methods in the field of antibody development.
Affiliation(s)
- Wenzhen Li
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Hongyan Lin
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Ziru Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Shiyang Xie
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Yuwei Zhou
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Rong Gong
  - School of Computer Science and Technology, Aba Teachers University, Aba, 623002, China
- Qianhu Jiang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- ChangCheng Xiang
  - School of Computer Science and Technology, Aba Teachers University, Aba, 623002, China.
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China.
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, 611844, China.
5
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving toxic aggregate formation in Alzheimer's and Parkinson's disease. We then highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences. Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
  - Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
  - Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
6
Szulc N, Gąsior-Głogowska M, Żyłka P, Szefczyk M, Wojciechowski JW, Żak AM, Dyrka W, Kaczorowska A, Burdukiewicz M, Tarek M, Kotulska M. Structural effects of charge destabilization and amino acid substitutions in amyloid fragments of CsgA. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2024; 313:124094. [PMID: 38503257 DOI: 10.1016/j.saa.2024.124094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 02/20/2024] [Accepted: 02/27/2024] [Indexed: 03/21/2024]
Abstract
The most studied functional amyloid is CsgA, the major curli subunit protein, which is produced by numerous strains of Enterobacteriaceae. Although CsgA sequences are highly conserved, they exhibit species diversity, which reflects the specific evolutionary and functional adaptability of the major curli subunit. Herein, we performed bioinformatics analyses to uncover the differences in the amyloidogenic properties of the R4 fragments in Escherichia coli and Salmonella enterica and proposed four mutants for more detailed studies: M1, M2, M3, and M4. The mutated sequences were characterized by various experimental techniques, such as circular dichroism, ATR-FTIR, FT-Raman, thioflavin T, transmission electron microscopy and confocal microscopy. Additionally, molecular dynamics simulations were performed to determine the role of buffer ions in the aggregation process. Our results demonstrated that the aggregation kinetics, fibril morphology, and overall structure of the peptide were significantly affected by the positions of charged amino acids within the repeat sequences of CsgA. Notably, substituting glycine with lysine resulted in the formation of distinctive spherically packed globular aggregates. The observed differences in morphology are attributed to the influence of phosphate ions, which disrupt the local electrostatic interaction network of the polypeptide chains. This study provides knowledge on the preferential formation of amyloid fibrils based on charge states within the polypeptide chain.
Affiliation(s)
- Natalia Szulc
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
  - CNRS, University of Lorraine, F-54000 Nancy, France
  - Department of Physics and Biophysics, Faculty of Biotechnology and Food Science, Wrocław University of Environmental and Life Sciences, Norwida 25, 50-375 Wrocław, Poland
- Marlena Gąsior-Głogowska
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
- Paweł Żyłka
  - Department of Electrical Engineering Fundamentals, Faculty of Electrical Engineering, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
- Monika Szefczyk
  - Department of Bioorganic Chemistry, Faculty of Chemistry, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
- Jakub W Wojciechowski
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
- Andrzej M Żak
  - Institute of Advanced Materials, Faculty of Chemistry, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
- Witold Dyrka
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
- Aleksandra Kaczorowska
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
  - Laboratory of Cytobiochemistry, Faculty of Biotechnology, University of Wroclaw, F. Joliot-Curie 14a, 50-383 Wroclaw, Poland
- Michał Burdukiewicz
  - Institute of Biotechnology and Biomedicine, Autonomous University of Barcelona, Campus Universitat Autònoma de Barcelona Plaça Cívica Bellaterra, s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
  - Clinical Research Centre, Medical University of Bialystok, Jana Kilinskiego 1, 15-089 Bialystok, Poland
- Mounir Tarek
  - CNRS, University of Lorraine, F-54000 Nancy, France.
- Malgorzata Kotulska
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland.
7
Wojciechowski JW, Szczurek W, Szulc N, Szefczyk M, Kotulska M. PACT - Prediction of amyloid cross-interaction by threading. Sci Rep 2023; 13:22268. [PMID: 38097650 PMCID: PMC10721876 DOI: 10.1038/s41598-023-48886-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 11/30/2023] [Indexed: 12/17/2023] [Open Access]
Abstract
Amyloid proteins are often associated with the onset of diseases, including Alzheimer's, Parkinson's and many others. However, there is a wide class of functional amyloids that are involved in physiological functions, e.g., formation of microbial biofilms or storage of hormones. Recent studies showed that an amyloid fibril could affect the aggregation of another protein, even from a different species. This may result in amplification or attenuation of the aggregation process. Insight into amyloid cross-interactions may be crucial for better understanding of amyloid diseases and the potential influence of microbial amyloids on human proteins. However, due to the demanding nature of the needed experiments, knowledge of such interactions is still limited. Here, we present PACT (Prediction of Amyloid Cross-interaction by Threading), a computational method for the prediction of amyloid cross-interactions. The method is based on modeling of a heterogeneous fibril formed by two amyloidogenic peptides. The resulting structure is assessed by a structural statistical potential that approximates its plausibility and energetic stability. PACT was developed and first evaluated mostly on data collected in the AmyloGraph database of interacting amyloids and achieved high values of Area Under ROC (AUC=0.88) and F1 (0.82). Then, we applied our method to study the interactions of CsgA (a bacterial biofilm protein that was not used in our reference datasets and that is expressed in several bacterial species that inhabit the human intestines) with two human proteins. The study included alpha-synuclein, a human protein that is involved in Parkinson's disease, and human islet amyloid polypeptide (hIAPP), which is involved in type 2 diabetes. In both cases, PACT predicted the appearance of cross-interactions. Importantly, the method indicated specific regions of the proteins, which were shown to play a central role in both interactions. We experimentally confirmed the novel results of the indicated CsgA fragments interacting with hIAPP based on the kinetic characteristics obtained with the ThT assay. PACT opens the possibility of high-throughput studies of amyloid interactions. Importantly, it can work with fairly long protein fragments, and as a purely physicochemical approach, it relies very little on scarce training data. The tool is available as a web server at https://pact.e-science.pl/pact/ . The local version can be downloaded from https://github.com/KubaWojciechowski/PACT .
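The abstract reports performance as Area Under ROC and F1. As a reminder of what those two numbers measure, here is a minimal standard-library sketch. The labels and scores are made-up toy values, not PACT output, and the assumption that a higher score means a predicted interaction is ours, not the paper's.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def f1_score(labels, scores, threshold=0.5):
    """F1 = harmonic mean of precision and recall at a fixed score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    toy_labels = [1, 1, 0, 0]          # hypothetical ground truth
    toy_scores = [0.9, 0.4, 0.6, 0.1]  # hypothetical predictor scores
    print(roc_auc(toy_labels, toy_scores), f1_score(toy_labels, toy_scores))
```

AUC is threshold-free, while F1 depends on the chosen cut-off, which is why the two are usually reported together.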
Affiliation(s)
- Jakub W Wojciechowski
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wrocław University of Science and Technology, 50-370, Wrocław, Poland.
- Witold Szczurek
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wrocław University of Science and Technology, 50-370, Wrocław, Poland
- Natalia Szulc
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wrocław University of Science and Technology, 50-370, Wrocław, Poland
  - Department of Physics and Biophysics, Wrocław University of Environmental and Life Sciences, Norwida 25, 50-375, Wrocław, Poland
  - LPCT, CNRS, Université de Lorraine, F-54000, Nancy, France
- Monika Szefczyk
  - Department of Bioorganic Chemistry, Faculty of Chemistry, Wrocław University of Science and Technology, 50-370, Wrocław, Poland
- Malgorzata Kotulska
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wrocław University of Science and Technology, 50-370, Wrocław, Poland.
8
Louros N, Schymkowitz J, Rousseau F. Mechanisms and pathology of protein misfolding and aggregation. Nat Rev Mol Cell Biol 2023; 24:912-933. [PMID: 37684425 DOI: 10.1038/s41580-023-00647-2] [Citation(s) in RCA: 89] [Impact Index Per Article: 44.5] [Accepted: 07/28/2023] [Indexed: 09/10/2023]
Abstract
Despite advances in machine learning-based protein structure prediction, we are still far from fully understanding how proteins fold into their native conformation. The conventional notion that polypeptides fold spontaneously to their biologically active states has gradually been replaced by our understanding that cellular protein folding often requires context-dependent guidance from molecular chaperones in order to avoid misfolding. Misfolded proteins can aggregate into larger structures, such as amyloid fibrils, which perpetuate the misfolding process, creating a self-reinforcing cascade. A surge in amyloid fibril structures has deepened our comprehension of how a single polypeptide sequence can exhibit multiple amyloid conformations, known as polymorphism. The assembly of these polymorphs is not a random process but is influenced by the specific conditions and tissues in which they originate. This observation suggests that, similar to the folding of native proteins, the kinetics of pathological amyloid assembly are modulated by interactions specific to cells and tissues. Here, we review the current understanding of how intrinsic protein conformational propensities are modulated by physiological and pathological interactions in the cell to shape protein misfolding and aggregation pathology.
Affiliation(s)
- Nikolaos Louros
  - Switch Laboratory, VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
- Joost Schymkowitz
  - Switch Laboratory, VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium.
  - Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium.
- Frederic Rousseau
  - Switch Laboratory, VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium.
  - Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium.
9
Falgarone T, Villain E, Richard F, Osmanli Z, Kajava AV. Census of exposed aggregation-prone regions in proteomes. Brief Bioinform 2023; 24:bbad183. [PMID: 37200152 DOI: 10.1093/bib/bbad183] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/23/2022] [Revised: 03/30/2023] [Accepted: 04/21/2023] [Indexed: 05/20/2023] [Open Access]
Abstract
Loss of solubility usually leads to the detrimental elimination of protein function. In some cases, however, protein aggregation is required for beneficial functions. Given the duality of this phenomenon, how natural selection controls aggregation remains a fundamental question. The exponential growth of genomic sequence data and recent progress with in silico predictors of aggregation allow this problem to be approached through large-scale bioinformatics analysis. Most aggregation-prone regions are hidden within the 3D structure, rendering them inaccessible for the intermolecular interactions responsible for aggregation. Thus, the most realistic census of aggregation-prone regions requires crossing aggregation prediction with information about the location of natively unfolded regions. This allows us to detect so-called 'exposed aggregation-prone regions' (EARs). Here, we analyzed the occurrence and distribution of EARs in 76 reference proteomes from the three kingdoms of life. For this purpose, we used a bioinformatics pipeline that provides a consensus result based on several predictors of aggregation. Our analysis revealed a number of new statistically significant correlations between the presence of EARs in different organisms and protein length, cellular localization, co-occurrence with short linear motifs, and the level of protein expression. We also obtained a list of proteins with conserved aggregation-prone sequences for further experimental tests. Insights gained from this work led to a deeper understanding of the relationship between protein evolution and aggregation.
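The "crossing" step described in the abstract can be pictured as an interval intersection between predicted aggregation-prone regions and predicted natively unfolded regions. The sketch below is only an illustration under stated assumptions: the function name `exposed_regions`, the inclusive 1-based coordinates, and the `min_length` filter are hypothetical, and the abstract does not say whether the authors require partial overlap or full containment.

```python
def exposed_regions(aggregation_prone, unfolded, min_length=1):
    """Intersect two lists of inclusive (start, end) residue intervals.

    Returns the parts of aggregation-prone regions that fall inside
    unfolded regions, keeping only overlaps of at least `min_length`
    residues. Partial overlap is accepted here (an assumption).
    """
    exposed = []
    for a_start, a_end in aggregation_prone:
        for u_start, u_end in unfolded:
            start = max(a_start, u_start)
            end = min(a_end, u_end)
            # Non-overlapping pairs give end < start and are dropped.
            if end - start + 1 >= min_length:
                exposed.append((start, end))
    return sorted(exposed)


if __name__ == "__main__":
    # Hypothetical predictor output for one protein.
    aggregation_prone = [(10, 20), (50, 60)]
    unfolded = [(15, 55)]
    print(exposed_regions(aggregation_prone, unfolded))
```

With real predictor output, each protein in a proteome would be processed this way before any statistics are computed.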
Affiliation(s)
- Théo Falgarone
  - Centre de Recherche en Biologie cellulaire de Montpellier, CNRS, Université Montpellier, Montpellier, 34293, France
- Etienne Villain
  - Centre de Recherche en Biologie cellulaire de Montpellier, CNRS, Université Montpellier, Montpellier, 34293, France
- Francois Richard
  - Centre de Recherche en Biologie cellulaire de Montpellier, CNRS, Université Montpellier, Montpellier, 34293, France
- Zarifa Osmanli
  - Centre de Recherche en Biologie cellulaire de Montpellier, CNRS, Université Montpellier, Montpellier, 34293, France
  - Biophysics Institute, Ministry of Science and Education of Azerbaijan Republic, Az1141, Baku, Azerbaijan
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier, CNRS, Université Montpellier, Montpellier, 34293, France
  - Institut de Biologie Computationnelle, Université Montpellier, 34095 Montpellier, France
10
Frenkel A, Zecharia E, Gómez-Pérez D, Sendersky E, Yegorov Y, Jacob A, Benichou JIC, Stierhof YD, Parnasa R, Golden SS, Kemen E, Schwarz R. Cell specialization in cyanobacterial biofilm development revealed by expression of a cell-surface and extracellular matrix protein. NPJ Biofilms Microbiomes 2023; 9:10. [PMID: 36864092 PMCID: PMC9981879 DOI: 10.1038/s41522-023-00376-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 02/06/2023] [Indexed: 03/04/2023] [Open Access]
Abstract
Cyanobacterial biofilms are ubiquitous and play important roles in diverse environments, yet understanding of the processes underlying the development of these aggregates is just emerging. Here we report cell specialization in the formation of Synechococcus elongatus PCC 7942 biofilms, a hitherto unknown characteristic of cyanobacterial social behavior. We show that only a quarter of the cell population expresses at high levels the four-gene ebfG-operon that is required for biofilm formation. Almost all cells, however, are assembled in the biofilm. Detailed characterization of EbfG4 encoded by this operon revealed cell-surface localization as well as its presence in the biofilm matrix. Moreover, EbfG1-3 were shown to form amyloid structures such as fibrils and are thus likely to contribute to the matrix structure. These data suggest a beneficial 'division of labor' during biofilm formation, in which only some of the cells allocate resources to produce matrix proteins: 'public goods' that support robust biofilm development by the majority of the cells. In addition, previous studies revealed the operation of a self-suppression mechanism that depends on an extracellular inhibitor, which suppresses transcription of the ebfG-operon. Here we revealed inhibitor activity at an early growth stage and its gradual accumulation along the exponential growth phase in correlation with cell density. The data, however, do not support a threshold-like phenomenon known for quorum-sensing in heterotrophs. Together, the data presented here demonstrate cell specialization and imply density-dependent regulation, thereby providing deep insights into cyanobacterial communal behavior.
Affiliation(s)
- Alona Frenkel
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Eli Zecharia
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Daniel Gómez-Pérez
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72074 Tübingen, Germany
- Eleonora Sendersky
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Yevgeni Yegorov
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Avi Jacob
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Jennifer I. C. Benichou
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- York-Dieter Stierhof
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72074 Tübingen, Germany
- Rami Parnasa
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Susan S. Golden
  - Division of Biological Sciences, University of California, San Diego, La Jolla, CA 92093, USA
  - Center for Circadian Biology, University of California, San Diego, La Jolla, CA 92093, USA
- Eric Kemen
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72074 Tübingen, Germany
- Rakefet Schwarz
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, 5290002 Ramat-Gan, Israel
|
11
|
Kamal M, Tokmakjian L, Knox J, Mastrangelo P, Ji J, Cai H, Wojciechowski JW, Hughes MP, Takács K, Chu X, Pei J, Grolmusz V, Kotulska M, Forman-Kay JD, Roy PJ. A spatiotemporal reconstruction of the C. elegans pharyngeal cuticle reveals a structure rich in phase-separating proteins. eLife 2022; 11:e79396. [PMID: 36259463 PMCID: PMC9629831 DOI: 10.7554/elife.79396] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/11/2022] [Accepted: 10/11/2022] [Indexed: 11/19/2022]
Abstract
How the cuticles of the roughly 4.5 million species of ecdysozoan animals are constructed is not well understood. Here, we systematically mine gene expression datasets to uncover the spatiotemporal blueprint for how the chitin-based pharyngeal cuticle of the nematode Caenorhabditis elegans is built. We demonstrate that the blueprint correctly predicts expression patterns and functional relevance to cuticle development. We find that as larvae prepare to molt, catabolic enzymes are upregulated and the genes that encode chitin synthase, chitin cross-linkers, and homologs of amyloid regulators subsequently peak in expression. Forty-eight percent of the gene products secreted during the molt are predicted to be intrinsically disordered proteins (IDPs), many of which belong to four distinct families whose transcripts are expressed in overlapping waves. These include the IDPAs, IDPBs, and IDPCs, which are introduced for the first time here. All four families have sequence properties that drive phase separation and we demonstrate phase separation for one exemplar in vitro. This systematic analysis represents the first blueprint for cuticle construction and highlights the massive contribution that phase-separating materials make to the structure.
Affiliation(s)
- Muntasir Kamal
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Levon Tokmakjian
  - The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
  - Department of Pharmacology and Toxicology, University of Toronto, Toronto, Canada
- Jessica Knox
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Peter Mastrangelo
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Jingxiu Ji
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Hao Cai
  - Molecular Medicine Program, The Hospital for Sick Children, Toronto, Canada
- Jakub W Wojciechowski
  - Wroclaw University of Science and Technology, Faculty of Fundamental Problems of Technology, Department of Biomedical Engineering, Wroclaw, Poland
- Michael P Hughes
  - Department of Cell and Molecular Biology, St. Jude Children’s Research Hospital, Memphis, United States
- Kristóf Takács
  - PIT Bioinformatics Group, Institute of Mathematics, Eötvös University, Budapest, Hungary
- Xiaoquan Chu
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Jianfeng Pei
  - Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Vince Grolmusz
  - PIT Bioinformatics Group, Institute of Mathematics, Eötvös University, Budapest, Hungary
- Malgorzata Kotulska
  - Wroclaw University of Science and Technology, Faculty of Fundamental Problems of Technology, Department of Biomedical Engineering, Wroclaw, Poland
- Julie Deborah Forman-Kay
  - Molecular Medicine Program, The Hospital for Sick Children, Toronto, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Canada
- Peter J Roy
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
  - Department of Pharmacology and Toxicology, University of Toronto, Toronto, Canada
|
12
|
Qing R, Hao S, Smorodina E, Jin D, Zalevsky A, Zhang S. Protein Design: From the Aspect of Water Solubility and Stability. Chem Rev 2022; 122:14085-14179. [PMID: 35921495 PMCID: PMC9523718 DOI: 10.1021/acs.chemrev.1c00757] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 08/31/2021] [Indexed: 12/13/2022]
Abstract
Water solubility and structural stability are key merits for proteins defined by the primary sequence and 3D-conformation. Their manipulation represents important aspects of the protein design field that relies on the accurate placement of amino acids and molecular interactions, guided by underlying physicochemical principles. Emulated designer proteins with well-defined properties both fuel the knowledge-base for more precise computational design models and are used in various biomedical and nanotechnological applications. The continuous developments in protein science, increasing computing power, new algorithms, and characterization techniques provide sophisticated toolkits for solubility design beyond guesswork. In this review, we summarize recent advances in the protein design field with respect to water solubility and structural stability. After introducing fundamental design rules, we discuss the transmembrane protein solubilization and de novo transmembrane protein design. Traditional strategies to enhance protein solubility and structural stability are introduced. The designs of stable protein complexes and high-order assemblies are covered. Computational methodologies behind these endeavors, including structure prediction programs, machine learning algorithms, and specialty software dedicated to the evaluation of protein solubility and aggregation, are discussed. The findings and opportunities for Cryo-EM are presented. This review provides an overview of significant progress and prospects in accurate protein design for solubility and stability.
Affiliation(s)
- Rui Qing
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
  - Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
  - The David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Shilei Hao
  - Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
  - Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400030, China
- Eva Smorodina
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo 0424, Norway
- David Jin
  - Avalon GloboCare Corp., Freehold, New Jersey 07728, United States
- Arthur Zalevsky
  - Laboratory of Bioinformatics Approaches in Combinatorial Chemistry and Biology, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow 117997, Russia
- Shuguang Zhang
  - Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
|
13
|
Computational methods to predict protein aggregation. Curr Opin Struct Biol 2022; 73:102343. [PMID: 35240456 DOI: 10.1016/j.sbi.2022.102343] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/16/2021] [Revised: 12/20/2021] [Accepted: 01/17/2022] [Indexed: 01/13/2023]
Abstract
In most cases, protein aggregation stems from the establishment of non-native intermolecular contacts. The formation of insoluble protein aggregates is associated with many human diseases and is a major bottleneck for the industrial production of protein-based therapeutics. Strikingly, fibrillar aggregates are naturally exploited for structural scaffolding or to generate molecular switches and can be artificially engineered to build up multi-functional nanomaterials. Thus, there is a high interest in rationalizing and forecasting protein aggregation. Here, we review the available computational toolbox to predict protein aggregation propensities, identify sequential or structural aggregation-prone regions, evaluate the impact of mutations on aggregation, or recognize prion-like domains. We discuss the strengths and limitations of these algorithms and how they can evolve in the near future.
|
14
|
Bioinformatics Methods in Predicting Amyloid Propensity of Peptides and Proteins. Methods Mol Biol 2022; 2340:1-15. [PMID: 35167067 DOI: 10.1007/978-1-0716-1546-1_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/26/2022]
Abstract
Several computational methods have been developed to predict amyloid propensity of a protein or peptide. These bioinformatics tools are time- and cost-saving alternatives to expensive and laborious experimental methods which are used to confirm self-aggregation of a protein. Computational approaches not only allow preselection of reliable candidates for amyloids but, most importantly, are capable of a thorough and informative analysis of a protein, indicating the sequence determinants of protein aggregation, identifying the potential causal mutations and likely mechanisms. Bioinformatics modeling applies several different approaches, which most typically include physicochemical or structure-based modeling, machine learning, or statistics-based modeling. Bioinformatics methods typically use the amino acid sequence of a protein as an input; some also include additional information, for example, an available structure. This chapter describes the methods currently used to computationally predict amyloid propensity of a protein or peptide. Since the accuracy of bioinformatics methods may be highly dependent on reference data used to develop and evaluate the predictors, we also briefly present the main databases of amyloids used by the authors of bioinformatics tools.
|
15
|
Multiple Antimicrobial Effects of Hybrid Peptides Synthesized Based on the Sequence of Ribosomal S1 Protein from Staphylococcus aureus. Int J Mol Sci 2022; 23:ijms23010524. [PMID: 35008951 PMCID: PMC8745237 DOI: 10.3390/ijms23010524] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/19/2021] [Revised: 12/21/2021] [Accepted: 01/01/2022] [Indexed: 02/06/2023]
Abstract
The need to develop new antimicrobial peptides arises from the high resistance of pathogenic bacteria to traditional antibiotics, both now and in the future. The creation of synthetic peptide constructs is a common and successful approach to the development of new antimicrobial peptides. In this work, we use a simple, flexible, and scalable technique to create hybrid antimicrobial peptides containing amyloidogenic regions of the ribosomal S1 protein from Staphylococcus aureus. While the cell-penetrating peptide allows the hybrid peptide to enter the bacterial cell, the amyloidogenic site provides an antimicrobial effect by coaggregating with functional bacterial proteins. We have demonstrated the antimicrobial effects of the R23F, R23DI, and R23EI hybrid peptides against Staphylococcus aureus, methicillin-resistant S. aureus (MRSA), Pseudomonas aeruginosa, Escherichia coli, and Bacillus cereus. R23F, R23DI, and R23EI can be used as antimicrobial peptides against Gram-positive and Gram-negative bacteria resistant to traditional antibiotics.
|
16
|
Gil‐Garcia M, Iglesias V, Pallarès I, Ventura S. Prion-like proteins: from computational approaches to proteome-wide analysis. FEBS Open Bio 2021; 11:2400-2417. [PMID: 34057308 PMCID: PMC8409284 DOI: 10.1002/2211-5463.13213] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/03/2021] [Revised: 05/07/2021] [Accepted: 05/28/2021] [Indexed: 12/16/2022]
Abstract
Prions are self-perpetuating proteins able to switch between a soluble state and an aggregated-and-transmissible conformation. These proteinaceous entities have been widely studied in yeast, where they are involved in hereditable phenotypic adaptations. The notion that such proteins could play functional roles and be positively selected by evolution has triggered the development of computational tools to identify prion-like proteins in different kingdoms of life. These algorithms have succeeded in screening multiple proteomes, allowing the identification of prion-like proteins in a diversity of unrelated organisms, evidencing that the prion phenomenon is well conserved among species. Interestingly enough, prion-like proteins are not only connected with the formation of functional membraneless protein-nucleic acid coacervates, but are also linked to human diseases. This review addresses state-of-the-art computational approaches to identify prion-like proteins, describes proteome-wide analysis efforts, discusses these unique proteins' functional role, and illustrates recently validated examples in different domains of life.
Affiliation(s)
- Marcos Gil‐Garcia
  - Departament de Bioquímica i Biologia Molecular, Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
- Valentín Iglesias
  - Departament de Bioquímica i Biologia Molecular, Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
- Irantzu Pallarès
  - Departament de Bioquímica i Biologia Molecular, Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
- Salvador Ventura
  - Departament de Bioquímica i Biologia Molecular, Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
|
17
|
Szulc N, Gąsior-Głogowska M, Wojciechowski JW, Szefczyk M, Żak AM, Burdukiewicz M, Kotulska M. Variability of Amyloid Propensity in Imperfect Repeats of CsgA Protein of Salmonella enterica and Escherichia coli. Int J Mol Sci 2021; 22:ijms22105127. [PMID: 34066237 PMCID: PMC8151669 DOI: 10.3390/ijms22105127] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/09/2021] [Revised: 04/22/2021] [Accepted: 05/07/2021] [Indexed: 11/18/2022]
Abstract
CsgA is an aggregating protein from bacterial biofilms, representing a class of functional amyloids. Its amyloid propensity is defined by five fragments (R1–R5) of the sequence, representing non-perfect repeats. Gate-keeper amino acid residues, specific to each fragment, define the fragment’s propensity for self-aggregation and aggregating characteristics of the whole protein. We study the self-aggregation and secondary structures of the repeat fragments of Salmonella enterica and Escherichia coli and comparatively analyze their potential effects on these proteins in a bacterial biofilm. Using bioinformatics predictors, ATR-FTIR and FT-Raman spectroscopy techniques, circular dichroism, and transmission electron microscopy, we confirmed self-aggregation of the R1, R3, and R5 fragments, as previously reported for Escherichia coli, although with different temporal characteristics for each species. We also observed that the aggregation propensity of the R4 fragment of Salmonella enterica differs from that of Escherichia coli. Our studies showed that amyloid structures of CsgA repeats are more easily formed and more durable in Salmonella enterica than those in Escherichia coli.
Affiliation(s)
- Natalia Szulc
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
  - LPCT, CNRS, Université de Lorraine, F-54000 Nancy, France
- Marlena Gąsior-Głogowska
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
- Jakub W. Wojciechowski
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
- Monika Szefczyk
  - Department of Bioorganic Chemistry, Faculty of Chemistry, Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
- Andrzej M. Żak
  - Electron Microscopy Laboratory, Faculty of Mechanical Engineering, Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
- Michał Burdukiewicz
  - Clinical Research Centre, Medical University of Białystok, Jana Kilińskiego 1, 15-089 Białystok, Poland
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland
  - Faculty of Natural Sciences, Brandenburg University of Technology Cottbus-Senftenberg, 01968 Senftenberg, Germany
  - Correspondence: M.B. and M.K.
- Malgorzata Kotulska
  - Department of Biomedical Engineering, Faculty of Fundamental Problems of Technology, Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
  - Correspondence: M.B. and M.K.
|
18
|
Dyrka W, Gąsior-Głogowska M, Szefczyk M, Szulc N. Searching for universal model of amyloid signaling motifs using probabilistic context-free grammars. BMC Bioinformatics 2021; 22:222. [PMID: 33926372 PMCID: PMC8086366 DOI: 10.1186/s12859-021-04139-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/09/2020] [Accepted: 04/19/2021] [Indexed: 11/16/2022]
Abstract
Background: Amyloid signaling motifs are a class of protein motifs which share basic structural and functional features despite the lack of clear sequence homology. They are hard to detect in large sequence databases either with the alignment-based profile methods (due to short length and diversity) or with generic amyloid- and prion-finding tools (due to insufficient discriminative power). We propose to address the challenge with a machine learning grammatical model capable of generalizing over diverse collections of unaligned yet related motifs. Results: First, we introduce and test improvements to our probabilistic context-free grammar framework for protein sequences that allow for inferring more sophisticated models achieving high sensitivity at low false positive rates. Then, we infer universal grammars for a collection of recently identified bacterial amyloid signaling motifs and demonstrate that the method is capable of generalizing by successfully searching for related motifs in fungi. The results are compared to available alternative methods. Finally, we conduct spectroscopy and staining analyses of selected peptides to verify their structural and functional relationship. Conclusions: While the profile HMMs remain the method of choice for modeling homologous sets of sequences, PCFGs seem more suitable for building meta-family descriptors and extrapolating beyond the seed sample.
Affiliation(s)
- Witold Dyrka
  - Wydział Podstawowych Problemów Techniki, Katedra Inżynierii Biomedycznej, Politechnika Wrocławska, Wrocław, Poland
- Marlena Gąsior-Głogowska
  - Wydział Podstawowych Problemów Techniki, Katedra Inżynierii Biomedycznej, Politechnika Wrocławska, Wrocław, Poland
- Monika Szefczyk
  - Wydział Chemiczny, Katedra Chemii Bioorganicznej, Politechnika Wrocławska, Wrocław, Poland
- Natalia Szulc
  - Wydział Podstawowych Problemów Techniki, Katedra Inżynierii Biomedycznej, Politechnika Wrocławska, Wrocław, Poland
|
19
|
Szulc N, Burdukiewicz M, Gąsior-Głogowska M, Wojciechowski JW, Chilimoniuk J, Mackiewicz P, Šneideris T, Smirnovas V, Kotulska M. Bioinformatics methods for identification of amyloidogenic peptides show robustness to misannotated training data. Sci Rep 2021; 11:8934. [PMID: 33903613 PMCID: PMC8076271 DOI: 10.1038/s41598-021-86530-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/22/2020] [Accepted: 03/08/2021] [Indexed: 02/02/2023]
Abstract
Several disorders are related to amyloid aggregation of proteins, for example Alzheimer's or Parkinson's diseases. Amyloid proteins form fibrils of aggregated beta structures. This is preceded by formation of oligomers, the most cytotoxic species. Determining amyloidogenicity is tedious and costly. The most reliable identification of amyloids is obtained with high resolution microscopies, such as electron microscopy or atomic force microscopy (AFM). More frequently, less expensive and faster methods are used, especially infrared (IR) spectroscopy or Thioflavin T staining. Different experimental methods do not always agree, especially when amyloid peptides do not readily form fibrils but oligomers. This may lead to peptide misclassification and mislabeling. Several bioinformatics methods have been proposed for in-silico identification of amyloids, many of them based on machine learning. The effectiveness of these methods heavily depends on accurate annotation of the reference training data obtained from in-vitro experiments. We study how robust bioinformatics methods are to weak supervision, that is, to imperfect training data. AmyloGram and three other amyloid predictors were applied. The results showed that a certain degree of misannotation in the reference data can be eliminated by the bioinformatics tools, even if the misannotated peptides belonged to their training set. The computational results are supported by new experiments with IR and AFM methods.
Affiliation(s)
- Natalia Szulc
  - Department of Biomedical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland
  - University of Lorraine, CNRS, 5400 Nancy, France
- Michał Burdukiewicz
  - Medical University of Bialystok, 15-089 Białystok, Poland
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland
- Marlena Gąsior-Głogowska
  - Department of Biomedical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland
- Jakub W. Wojciechowski
  - Department of Biomedical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland
- Jarosław Chilimoniuk
  - Faculty of Biotechnology, University of Wroclaw, 50-137 Wroclaw, Poland
- Paweł Mackiewicz
  - Faculty of Biotechnology, University of Wroclaw, 50-137 Wroclaw, Poland
- Tomas Šneideris
  - Life Sciences Center, Institute of Biotechnology, Vilnius University, 01513 Vilnius, Lithuania
- Vytautas Smirnovas
  - Life Sciences Center, Institute of Biotechnology, Vilnius University, 01513 Vilnius, Lithuania
- Malgorzata Kotulska
  - Department of Biomedical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland
|
20
|
Prabakaran R, Rawat P, Thangakani AM, Kumar S, Gromiha MM. Protein aggregation: in silico algorithms and applications. Biophys Rev 2021; 13:71-89. [PMID: 33747245 PMCID: PMC7930180 DOI: 10.1007/s12551-021-00778-w] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 11/06/2020] [Accepted: 01/01/2021] [Indexed: 01/08/2023]
Abstract
Protein aggregation is a topic of immense interest to the scientific community due to its role in several neurodegenerative diseases and disorders and its industrial importance. Several in silico techniques, tools, and algorithms have been developed to predict aggregation in proteins and understand the aggregation mechanisms. This review attempts to distill the vast developments in in silico approaches, the resources available, and future perspectives. It reviews aggregation-related databases, mechanistic models (aggregation-prone region and aggregation propensity prediction), kinetic models (aggregation rate prediction), and molecular dynamics studies related to aggregation. With a multitude of prediction models related to aggregation already available to the scientific community, the field of protein aggregation is rapidly maturing to tackle new applications.
Affiliation(s)
- R. Prabakaran
  - Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Puneet Rawat
  - Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- A. Mary Thangakani
  - Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Sandeep Kumar
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceutical Inc., Ridgefield, CT, USA
- M. Michael Gromiha
  - Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
  - School of Computing, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
|