1
Pretti E, Shell MS. Characterizing the Sequence Landscape of Peptide Fibrillization with a Bottom-Up Coarse-Grained Model. J Phys Chem B 2025; 129:3559-3570. [PMID: 40146906 DOI: 10.1021/acs.jpcb.4c07248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2025]
Abstract
Molecular insight into amyloid aggregation is crucial for understanding the details of protein fibril nucleation and growth, which play a significant role in a wide range of proteinopathies. The length and time scales for fibrillization make its computational study an intrinsically multiscale problem, necessitating the use of coarse-grained modeling. A wide variety of coarse-grained models for peptides have been proposed, often parametrized with a combination of top-down and bottom-up approaches. Here, we present a predictive, sequence-transferable bottom-up coarse-grained model, systematically developed using only information from atomistic simulations by applying an extended-ensemble relative entropy minimization technique. The resulting model is capable of accurately recovering conformational properties of peptides constructed from a reduced alphabet of amino acids, of predicting secondary structures of isolated and interacting peptides from their sequences alone, and of simulating aggregation of peptides that have been experimentally characterized as amyloidogenic. Finally, we couple such coarse-grained simulations with a genetic algorithm to characterize the sequence space of the reduced alphabet and identify features of sequences for which ordered fibrillar states are both thermodynamically favorable and kinetically accessible.
Affiliation(s)
- Evan Pretti
  - Department of Chemical Engineering, Engineering II Building, University of California, Santa Barbara, Santa Barbara, California 93106-5080, United States
- M Scott Shell
  - Department of Chemical Engineering, Engineering II Building, University of California, Santa Barbara, Santa Barbara, California 93106-5080, United States
2
Fantini J, Azzaz F, Di Scala C, Aulas A, Chahinian H, Yahi N. Conformationally adaptive therapeutic peptides for diseases caused by intrinsically disordered proteins (IDPs). New paradigm for drug discovery: Target the target, not the arrow. Pharmacol Ther 2025; 267:108797. [PMID: 39828029 DOI: 10.1016/j.pharmthera.2025.108797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2024] [Revised: 11/28/2024] [Accepted: 01/10/2025] [Indexed: 01/22/2025]
Abstract
The traditional model of protein structure determined by the amino acid sequence is today seriously challenged by the fact that approximately half of the human proteome is made up of proteins that do not have a stable 3D structure, either partially or entirely. These proteins, called intrinsically disordered proteins (IDPs), are involved in numerous physiological functions and are associated with severe pathologies, e.g., Alzheimer's disease, Parkinson's disease, Creutzfeldt-Jakob disease, amyotrophic lateral sclerosis (ALS), and type 2 diabetes. Targeting these proteins is challenging for two reasons: i) we need to preserve their physiological functions, and ii) drug design by molecular docking is not possible due to the lack of reliable starting conditions. Faced with this challenge, the solutions proposed by artificial intelligence (AI) such as AlphaFold are clearly unsuitable. Instead, we suggest an innovative approach consisting of mimicking, in short synthetic peptides, the conformational flexibility of IDPs. These peptides, which we call adaptive peptides, are derived from the domains of IDPs that become structured after interacting with a ligand. Adaptive peptides are designed to selectively antagonize the harmful effects of IDPs, not by targeting the IDPs directly but by acting through selected ligands, without affecting the physiological properties of the IDPs. This "target the target, not the arrow" strategy promises to open a new route to drug discovery for currently undruggable proteins.
Affiliation(s)
- Jacques Fantini
  - Aix-Marseille University, INSERM UA 16, Faculty of Medicine, 13015 Marseille, France
- Fodil Azzaz
  - Aix-Marseille University, INSERM UA 16, Faculty of Medicine, 13015 Marseille, France
- Coralie Di Scala
  - Neuroscience Center-HiLIFE, Helsinki Institute of Life Science, University of Helsinki, 00014 Helsinki, Finland
- Anaïs Aulas
  - Neuroscience Center-HiLIFE, Helsinki Institute of Life Science, University of Helsinki, 00014 Helsinki, Finland
- Henri Chahinian
  - Aix-Marseille University, INSERM UA 16, Faculty of Medicine, 13015 Marseille, France
- Nouara Yahi
  - Aix-Marseille University, INSERM UA 16, Faculty of Medicine, 13015 Marseille, France
3
Chen J, Li Q, Xia S, Arsala D, Sosa D, Wang D, Long M. The Rapid Evolution of De Novo Proteins in Structure and Complex. Genome Biol Evol 2024; 16:evae107. [PMID: 38753069 PMCID: PMC11149777 DOI: 10.1093/gbe/evae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/10/2024] [Indexed: 06/06/2024] Open
Abstract
Recent genome-wide studies in rice have established that de novo genes, evolving from noncoding sequences, enhance protein diversity through a stepwise process. However, the pattern and rate of their evolution in protein structure over time remain unclear. Here, we addressed these issues within a surprisingly short evolutionary timescale (<1 million years for 97% of Oryza de novo genes) with comparative approaches to gene duplicates. We found that de novo genes evolve faster than gene duplicates in the intrinsically disordered regions (such as random coils), secondary structure elements (such as α helix and β strand), hydrophobicity, and molecular recognition features. In de novo proteins, specifically, we observed an 8% to 14% decay in random coils and intrinsically disordered region lengths and a 2.3% to 6.5% increase in structured elements, hydrophobicity, and molecular recognition features, per million years on average. These patterns of structural evolution align with changes in amino acid composition over time as well. We also found that de novo proteins carry higher positive charges but have smaller molecular weights than duplicates. Tertiary structure predictions showed that most de novo proteins, though not typically well folded on their own, readily form low-energy and compact complexes with other proteins, facilitated by extensive residue contacts and conformational flexibility, suggesting a faster-binding scenario in de novo proteins to promote interaction. These analyses illuminate a rapid evolution of protein structure in de novo genes in rice genomes, originating from noncoding sequences, highlighting their quick transformation into active, protein complex-forming components within a remarkably short evolutionary timeframe.
Affiliation(s)
- Jianhai Chen
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Qingrong Li
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Shengqian Xia
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dong Wang
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
4
Bruley A, Bitard-Feildel T, Callebaut I, Duprat E. A sequence-based foldability score combined with AlphaFold2 predictions to disentangle the protein order/disorder continuum. Proteins 2023; 91:466-484. [PMID: 36306150 DOI: 10.1002/prot.26441] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/01/2022] [Revised: 10/14/2022] [Accepted: 10/18/2022] [Indexed: 11/11/2022]
Abstract
Order and disorder govern protein functions, but there is a great diversity in disorder, from regions that are, and stay, fully disordered to conditional order. This diversity is still difficult to decipher even though it is encoded in the amino acid sequences. Here, we developed an analytic Python package, named pyHCA, to estimate the foldability of a protein segment from its amino acid sequence alone, based on a measure of its density in regular secondary structures associated with hydrophobic clusters, as defined by the hydrophobic cluster analysis (HCA) approach. The tool was designed by optimizing the separation between foldable segments from databases of disorder (DisProt) and order (SCOPe [soluble domains] and OPM [transmembrane domains]). It allows the user to specify the ratio between order, embodied by regular secondary structures (either participating in the hydrophobic core of well-folded 3D structures or conditionally formed in intrinsically disordered regions), and disorder. We illustrated the relevance of pyHCA with several examples and applied it to the sequences of the proteomes of 21 species ranging from prokaryotes and archaea to unicellular and multicellular eukaryotes, for which structure models are provided in the AlphaFold protein structure database. Cases of low-confidence scores related to disorder were distinguished from those of sequences that we identified as foldable but are still excluded from accurate modeling by AlphaFold2 due to a lack of sequence homologs or to compositional biases. Overall, our approach is complementary to AlphaFold2, providing guides to map structural innovations through evolutionary processes, at proteome and gene scales.
Affiliation(s)
- Apolline Bruley
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Tristan Bitard-Feildel
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Elodie Duprat
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
5
Han B, Ren C, Wang W, Li J, Gong X. Computational Prediction of Protein Intrinsically Disordered Region Related Interactions and Functions. Genes (Basel) 2023; 14:432. [PMID: 36833360 PMCID: PMC9956190 DOI: 10.3390/genes14020432] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/31/2022] [Revised: 02/02/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023] Open
Abstract
Intrinsically disordered proteins (IDPs) and regions (IDRs) are widespread. Although they lack well-defined structures, they participate in many important biological processes. They are also widely implicated in human diseases and have become potential targets in drug discovery. However, there is a large gap between the number of experimentally annotated IDPs/IDRs and their actual number. In recent decades, computational methods for IDPs/IDRs have been developed extensively, covering the prediction of IDPs/IDRs themselves, their binding modes, their binding sites, and their molecular functions. In view of the correlation between these predictors, we have reviewed these prediction methods uniformly for the first time, summarized their computational methods and predictive performance, and discussed some problems and perspectives.
Affiliation(s)
- Bingqing Han
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Chongjiao Ren
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Wenda Wang
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Jiashan Li
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Xinqi Gong
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
  - Beijing Academy of Intelligence, Beijing 100083, China
6
A liquid-to-solid phase transition of Cu/Zn superoxide dismutase 1 initiated by oxidation and disease mutation. J Biol Chem 2023; 299:102857. [PMID: 36592929 PMCID: PMC9898760 DOI: 10.1016/j.jbc.2022.102857] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/23/2022] [Revised: 12/17/2022] [Accepted: 12/19/2022] [Indexed: 01/01/2023] Open
Abstract
Cu/Zn superoxide dismutase 1 (SOD1) has a high propensity to misfold and form abnormal aggregates when it is subjected to oxidative stress or carries mutations associated with amyotrophic lateral sclerosis (ALS). However, the transition from functional soluble SOD1 protein to aggregated SOD1 protein is not completely clear. Here, we propose that liquid-liquid phase separation (LLPS) represents a biophysical process that converts soluble SOD1 into aggregated SOD1. We determined that SOD1 undergoes LLPS in vitro and in cells under oxidative stress. Abnormal oxidation of SOD1 induces maturation of droplets formed by LLPS, eventually leading to protein aggregation and fibrosis, and involves residues Cys111 and Trp32. Additionally, we found that pathological mutations in SOD1 associated with ALS alter the morphology and material state of the droplets and promote the transformation of SOD1 to solid-like oligomers, which are toxic to nerve cells. Furthermore, the fibrous aggregates formed by both pathways have a concentration-dependent toxic effect on nerve cells. Thus, these combined results strongly indicate that LLPS may play a major role in pathological SOD1 aggregation, contributing to pathogenesis in ALS.
7
Bruley A, Mornon JP, Duprat E, Callebaut I. Digging into the 3D Structure Predictions of AlphaFold2 with Low Confidence: Disorder and Beyond. Biomolecules 2022; 12:1467. [PMID: 36291675 PMCID: PMC9599455 DOI: 10.3390/biom12101467] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/14/2022] [Revised: 10/04/2022] [Accepted: 10/05/2022] [Indexed: 01/12/2023] Open
Abstract
AlphaFold2 (AF2) has created a breakthrough in biology by providing three-dimensional structure models for whole-proteome sequences, with unprecedented levels of accuracy. In addition, the AF2 pLDDT score, related to the model confidence, has been shown to provide a good measure of residue-wise disorder. Here, we combined AF2 predictions with pyHCA, a tool we previously developed to identify foldable segments and estimate their order/disorder ratio, from a single protein sequence. We focused our analysis on the AF2 predictions available for 21 reference proteomes (AFDB v1), in particular on their long foldable segments (>30 amino acids) that exhibit characteristics of soluble domains, as estimated by pyHCA. Among these segments, we provided a global analysis of those with very low pLDDT values along their entire length and compared their characteristics to those of segments with very high pLDDT values. We highlighted cases containing conditional order, as well as cases that could form well-folded structures but escape the AF2 prediction due to a shallow multiple sequence alignment and/or undocumented structure or fold. AF2 and pyHCA can therefore be advantageously combined to unravel cryptic structural features in whole proteomes and to refine predictions for different flavors of disorder.
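The segment selection described in this abstract (long foldable segments with very low pLDDT along their entire length) can be illustrated with a minimal Python sketch. It is not the authors' code and does not call pyHCA: the function name `low_confidence_segments`, the 0-based half-open segment coordinates, and the cutoff of 50 (the value the AlphaFold database labels "very low" confidence) are assumptions made for illustration. The abstract itself only states the length criterion of more than 30 amino acids.

```python
from typing import List, Tuple


def low_confidence_segments(
    plddt: List[float],
    segments: List[Tuple[int, int]],
    min_length: int = 31,      # "long" segments: more than 30 residues (from the abstract)
    low_cutoff: float = 50.0,  # assumed "very low" pLDDT cutoff, not taken from the paper
) -> List[Tuple[int, int]]:
    """Keep segments that are long enough and have pLDDT below the cutoff at every residue.

    `segments` are hypothetical foldable segments given as 0-based,
    half-open (start, end) index pairs into the per-residue `plddt` list.
    """
    selected = []
    for start, end in segments:
        if end - start < min_length:
            continue  # too short to count as a long foldable segment
        if all(score < low_cutoff for score in plddt[start:end]):
            selected.append((start, end))
    return selected


if __name__ == "__main__":
    # Toy profile: 35 low-confidence residues, 40 confident ones, 10 low ones.
    scores = [40.0] * 35 + [90.0] * 40 + [30.0] * 10
    print(low_confidence_segments(scores, [(0, 35), (35, 75), (75, 85)]))
```

In the toy profile only the first segment passes both filters: the second is confidently modeled and the third is too short.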
8
Chen R, Li X, Yang Y, Song X, Wang C, Qiao D. Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 2022; 9:985022. [PMID: 36250006 PMCID: PMC9567019 DOI: 10.3389/fmolb.2022.985022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/03/2022] [Accepted: 07/27/2022] [Indexed: 11/25/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) participate in many biological processes by interacting with other proteins, including the regulation of transcription, translation, and the cell cycle. With the increasing amount of disorder sequence data available, it is crucial to identify IDP binding sites for the functional annotation of these proteins. Over the decades, many computational approaches have been developed to predict protein-protein binding sites of IDPs (IDP-PPIS) based on protein sequence information. Moreover, new IDP-PPIS predictors are developed every year with the rapid development of artificial intelligence. It is thus necessary to provide an up-to-date overview of the methods in this field. In this paper, we collected 30 representative predictors published recently and summarized the databases, features, and algorithms. We described how the features were generated from public data and used for the prediction of IDP-PPIS, along with the methods used to generate the feature representations. All the predictors were divided into three categories: scoring functions, machine learning-based prediction, and consensus approaches. For each category, we described the details of the algorithms and their performance. Hopefully, our manuscript will provide not only a full picture of the status quo of IDP binding prediction, but also a guide for selecting among different methods. More importantly, it will offer inspiration for future development trends and principles.
Affiliation(s)
- Ranran Chen
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xinlu Li
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - National Institute of Health Data Science of China, Shandong University, Jinan, China
- Yaqing Yang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xixi Song
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - National Institute of Health Data Science of China, Shandong University, Jinan, China
- Cheng Wang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - National Institute of Health Data Science of China, Shandong University, Jinan, China
- Dongdong Qiao
  - Shandong Mental Health Center, Shandong University, Jinan, China
9
Roca-Martinez J, Lazar T, Gavalda-Garcia J, Bickel D, Pancsa R, Dixit B, Tzavella K, Ramasamy P, Sanchez-Fornaris M, Grau I, Vranken WF. Challenges in describing the conformation and dynamics of proteins with ambiguous behavior. Front Mol Biosci 2022; 9:959956. [PMID: 35992270 PMCID: PMC9382080 DOI: 10.3389/fmolb.2022.959956] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022] Open
Abstract
Traditionally, our understanding of how proteins operate and how evolution shapes them is based on two main data sources: the overall protein fold and the protein amino acid sequence. However, a significant part of the proteome shows highly dynamic and/or structurally ambiguous behavior, which cannot be correctly represented by the traditional fixed set of static coordinates. Representing such protein behaviors remains challenging and necessarily involves a complex interpretation of conformational states, including probabilistic descriptions. Relating protein dynamics and multiple conformations to their function as well as their physiological context (e.g., post-translational modifications and subcellular localization), therefore, remains elusive for much of the proteome, with studies to investigate the effect of protein dynamics relying heavily on computational models. We here investigate the possibility of delineating three classes of protein conformational behavior: order, disorder, and ambiguity. These definitions are explored based on three different datasets, using interpretable machine learning from a set of features, from AlphaFold2 to sequence-based predictions, to understand the overlap and differences between these datasets. This forms the basis for a discussion on the current limitations in describing the behavior of dynamic and ambiguous proteins.
Affiliation(s)
- Joel Roca-Martinez
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Tamas Lazar
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - VIB-VUB Center for Structural Biology, Brussels, Belgium
- Jose Gavalda-Garcia
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- David Bickel
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Rita Pancsa
  - Research Centre for Natural Sciences, Institute of Enzymology, Budapest, Hungary
- Bhawna Dixit
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - IBiTech-Biommeda, Universiteit Gent, Gent, Belgium
- Konstantina Tzavella
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Pathmanaban Ramasamy
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - VIB-UGent Center for Medical Biotechnology, Universiteit Gent, Gent, Belgium
- Maite Sanchez-Fornaris
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Department of Computer Sciences, University of Camagüey, Camagüey, Cuba
- Isel Grau
  - Information Systems, Eindhoven University of Technology, Eindhoven, Netherlands
- Wim F. Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
10
Piersimoni L, Abd El Malek M, Bhatia T, Bender J, Brankatschk C, Calvo Sánchez J, Dayhoff GW, Di Ianni A, Figueroa Parra JO, Garcia-Martinez D, Hesselbarth J, Köppen J, Lauth LM, Lippik L, Machner L, Sachan S, Schmidt L, Selle R, Skalidis I, Sorokin O, Ubbiali D, Voigt B, Wedler A, Wei AAJ, Zorn P, Dunker AK, Köhn M, Sinz A, Uversky VN. Lighting up Nobel Prize-winning studies with protein intrinsic disorder. Cell Mol Life Sci 2022; 79:449. [PMID: 35882686 PMCID: PMC11072364 DOI: 10.1007/s00018-022-04468-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2022] [Revised: 06/18/2022] [Accepted: 07/04/2022] [Indexed: 11/03/2022]
Abstract
Intrinsically disordered proteins and regions (IDPs and IDRs) and their importance in biology are becoming increasingly recognized in biology, biochemistry, molecular biology and chemistry textbooks, as well as in current protein science and structural biology curricula. We argue that the sequence → dynamic conformational ensemble → function principle is of equal importance to the classical sequence → structure → function paradigm. To highlight this point, we describe the IDPs and/or IDRs behind the discoveries associated with 17 Nobel Prizes, 11 in Physiology or Medicine and 6 in Chemistry. The Nobel Laureates themselves did not always mention that the proteins underlying the phenomena investigated in their award-winning studies are in fact IDPs or contain IDRs. In several cases, IDP- or IDR-based molecular functions have been elucidated, while in other instances, it is recognized that the respective protein(s) contain IDRs, but the specific IDR-based molecular functions have yet to be determined. To highlight the importance of IDPs and IDRs as a general principle in biology, we present here illustrative examples of IDPs/IDRs in Nobel Prize-winning mechanisms and processes.
Affiliation(s)
- Lolita Piersimoni
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Marina Abd El Malek
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Twinkle Bhatia
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Julian Bender
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Christin Brankatschk
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Jaime Calvo Sánchez
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Guy W Dayhoff
  - Department of Chemistry, College of Art and Sciences, University of South Florida, Tampa, FL, 33620, USA
- Alessio Di Ianni
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Dailen Garcia-Martinez
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Julia Hesselbarth
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Janett Köppen
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Luca M Lauth
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Laurin Lippik
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Lisa Machner
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Shubhra Sachan
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Lisa Schmidt
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Robin Selle
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Ioannis Skalidis
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Oleksandr Sorokin
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Daniele Ubbiali
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Bruno Voigt
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Alice Wedler
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Alan An Jung Wei
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Peter Zorn
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Alan Keith Dunker
  - Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Marcel Köhn
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Andrea Sinz
  - Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Vladimir N Uversky
  - Department of Molecular Medicine, USF Health Byrd Alzheimer's Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
11
Bezerra RP, Conniff AS, Uversky VN. Comparative study of structures and functional motifs in lectins from the commercially important photosynthetic microorganisms. Biochimie 2022; 201:63-74. [PMID: 35839918 DOI: 10.1016/j.biochi.2022.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 06/17/2022] [Accepted: 07/08/2022] [Indexed: 11/26/2022]
Abstract
Photosynthetic microorganisms, specifically cyanobacteria and microalgae, can synthesize a vast array of biologically active molecules, such as lectins, that have great potential for various biotechnological and biomedical applications. However, since the structures of these proteins are not well established, likely due to the presence of intrinsically disordered regions, our ability to understand their functionality is hampered. We embarked on a study of the carbohydrate recognition domains (CRDs), intrinsically disordered regions (IDRs), amino acid composition, and functional motifs in lectins from cyanobacteria of the genus Arthrospira and microalgae of the genera Chlorella and Dunaliella using a combination of bioinformatics techniques. This search revealed the presence of five distinctive CRD types differently distributed between the genera. Most CRDs displayed a group-specific distribution, except for C. sorokiniana, which possesses a distinctive CRD, probably due to its specific lifestyle. We also found that all CRDs contain short IDRs. The bacterial lectin of the prokaryote Arthrospira showed lower intrinsic disorder and proline content than the lectins from the eukaryotic microalgae (Chlorella and Dunaliella). Among the important functions predicted in all lectins were several specific motifs, which directly interact with proteins involved in cell-cycle control and which may be used for pharmaceutical purposes. Since the aforementioned properties of each type of lectin were investigated in silico, they need experimental confirmation. The results of our study provide an overview of the distribution of CRDs, IDRs, and functional motifs within lectins from commercially important microalgae.
Affiliation(s)
- Raquel P Bezerra
- Department of Morphology and Animal Physiology, Federal Rural University of Pernambuco-UFRPE, Dom Manoel de Medeiros Ave, Recife, PE, 52171-900, Brazil.
- Amanda S Conniff
- Department of Medical Engineering, Morsani College of Medicine and College of Engineering, University of South Florida, Tampa, FL, 33612, USA.
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA.
|
12
|
Intrinsically disordered proteins and proteins with intrinsically disordered regions in neurodegenerative diseases. Biophys Rev 2022; 14:679-707. [DOI: 10.1007/s12551-022-00968-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/23/2021] [Accepted: 05/28/2022] [Indexed: 12/14/2022] Open
|
13
|
Tang YJ, Pang YH, Liu B. DeepIDP-2L: protein intrinsically disordered region prediction by combining convolutional attention network and hierarchical attention network. Bioinformatics 2022; 38:1252-1260. [PMID: 34864847 DOI: 10.1093/bioinformatics/btab810] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/17/2021] [Revised: 11/02/2021] [Accepted: 11/26/2021] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION Intrinsically disordered regions (IDRs) are widely distributed in proteins. Accurate prediction of IDRs is critical for protein structure and function analysis. IDRs are divided into long disordered regions (LDRs) and short disordered regions (SDRs) according to their lengths. Previous studies have shown that LDRs and SDRs have different properties. However, the existing computational methods fail to extract different features for LDRs and SDRs separately. As a result, they achieve unstable performance on datasets with different ratios of LDRs and SDRs. RESULTS In this study, a two-layer predictor called DeepIDP-2L is proposed. In the first layer, two kinds of attention-based models are used to extract different features for LDRs and SDRs, respectively. The hierarchical attention network is used to capture the distribution pattern features of LDRs, and the convolutional attention network is used to capture the local correlation features of SDRs. The second layer of DeepIDP-2L maps the features extracted in the first layer into a new feature space. A convolutional network and bidirectional long short-term memory are used to capture the local and long-range information for predicting both SDRs and LDRs. Experimental results show that DeepIDP-2L can achieve more stable performance than other existing predictors on independent test sets with different ratios of SDRs and LDRs. AVAILABILITY AND IMPLEMENTATION For the convenience of most experimental scientists, a user-friendly and publicly accessible web-server for the new predictor has been established at http://bliulab.net/DeepIDP-2L/. It is anticipated that DeepIDP-2L will become a very useful tool for identification of intrinsically disordered regions. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yi-Jun Tang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yi-He Pang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
|
14
|
Chen TR, Lo CH, Juan SH, Lo WC. The influence of dataset homology and a rigorous evaluation strategy on protein secondary structure prediction. PLoS One 2021; 16:e0254555. [PMID: 34260641 PMCID: PMC8279362 DOI: 10.1371/journal.pone.0254555] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/16/2020] [Accepted: 06/29/2021] [Indexed: 11/28/2022] Open
Abstract
The secondary structure prediction (SSP) of proteins has long been an essential structural biology technique with various applications. Despite its vital role in many research and industrial fields, in recent years, as the accuracy of state-of-the-art secondary structure predictors approaches the theoretical upper limit, SSP has been considered no longer challenging or too challenging to make advances. With the belief that the substantial improvement of SSP will move forward many fields depending on it, we conducted this study, which focused on three issues that have not been noticed or thoroughly examined yet but may have affected the reliability of the evaluation of previous SSP algorithms. These issues are all about the sequence homology between or within the developmental and evaluation datasets. We thus designed many different homology layouts of datasets to train and evaluate SSP prediction models. Multiple repeats were performed in each experiment by random sampling. The conclusions obtained with small experimental datasets were verified with large-scale datasets using state-of-the-art SSP algorithms. Very different from the long-established assumption, we discover that the sequence homology between query datasets for training, testing, and independent tests exerts little influence on SSP accuracy. Besides, the sequence homology redundancy between or within most datasets would make the accuracy of an SSP algorithm overestimated, while the redundancy within the reference dataset for extracting predictive features would make the accuracy underestimated. Since the overestimating effects are more significant than the underestimating effect, the accuracy of some SSP methods might have been overestimated. 
Based on the discoveries, we propose a rigorous procedure for developing SSP algorithms and making reliable evaluations, hoping to bring substantial improvements to future SSP methods and benefit all research and application fields relying on accurate prediction of protein secondary structures.
Affiliation(s)
- Teng-Ruei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Chia-Hua Lo
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Sheng-Hung Juan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Wei-Cheng Lo
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- The Center for Bioinformatics Research, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
|
15
|
Shmookler Reis RJ, Atluri R, Balasubramaniam M, Johnson J, Ganne A, Ayyadevara S. "Protein aggregates" contain RNA and DNA, entrapped by misfolded proteins but largely rescued by slowing translational elongation. Aging Cell 2021; 20:e13326. [PMID: 33788386 PMCID: PMC8135009 DOI: 10.1111/acel.13326] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/21/2020] [Revised: 01/12/2021] [Accepted: 02/01/2021] [Indexed: 01/03/2023] Open
Abstract
All neurodegenerative diseases feature aggregates, which usually contain disease-specific diagnostic proteins; non-protein constituents, however, have rarely been explored. Aggregates from SY5Y-APPSw neuroblastoma, a cell model of familial Alzheimer's disease, were crosslinked and sequences of linked peptides identified. We constructed a normalized "contactome" comprising 11 subnetworks, centered on 24 high-connectivity hubs. Remarkably, all 24 are nucleic acid-binding proteins. This led us to isolate and sequence RNA and DNA from Alzheimer's and control aggregates. RNA fragments were mapped to the human genome by RNA-seq and DNA by ChIP-seq. Nearly all aggregate RNA sequences mapped to specific genes, whereas DNA fragments were predominantly intergenic. These nucleic acid mappings are all significantly nonrandom, making an artifactual origin extremely unlikely. RNA (mostly cytoplasmic) exceeded DNA (chiefly nuclear) by twofold to fivefold. RNA fragments recovered from AD tissue were ~1.5- to 2.5-fold more abundant than those recovered from control tissue, similar to the increase in protein. Aggregate abundances of specific RNA sequences were strikingly differential between cultured SY5Y-APPSw neuroblastoma cells expressing APOE3 vs. APOE4, consistent with APOE4 competition for E-box/CLEAR motifs. We identified many G-quadruplex and viral sequences within RNA and DNA of aggregates, suggesting that sequestration of viral genomes may have driven the evolution of disordered nucleic acid-binding proteins. After RNA-interference knockdown of the translational-procession factor EEF2 to suppress translation in SY5Y-APPSw cells, the RNA content of aggregates declined by >90%, while reducing protein content by only 30% and altering DNA content by ≤10%. This implies that cotranslational misfolding of nascent proteins may ensnare polysomes into aggregates, accounting for most of their RNA content.
Affiliation(s)
- Robert J. Shmookler Reis
- Central Arkansas Veterans Healthcare System, Little Rock, AR, USA
- Department of Geriatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- BioInformatics Program, University of Arkansas for Medical Sciences and University of Arkansas at Little Rock, Little Rock, AR, USA
- Ramani Atluri
- Department of Geriatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Jay Johnson
- BioInformatics Program, University of Arkansas for Medical Sciences and University of Arkansas at Little Rock, Little Rock, AR, USA
- Akshatha Ganne
- BioInformatics Program, University of Arkansas for Medical Sciences and University of Arkansas at Little Rock, Little Rock, AR, USA
- Srinivas Ayyadevara
- Central Arkansas Veterans Healthcare System, Little Rock, AR, USA
- Department of Geriatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
|
16
|
Eleutherio ECA, Silva Magalhães RS, de Araújo Brasil A, Monteiro Neto JR, de Holanda Paranhos L. SOD1, more than just an antioxidant. Arch Biochem Biophys 2020; 697:108701. [PMID: 33259795 DOI: 10.1016/j.abb.2020.108701] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 09/18/2020] [Revised: 11/23/2020] [Accepted: 11/24/2020] [Indexed: 12/14/2022]
Abstract
During cellular respiration, radicals such as superoxide are produced, and at high concentrations they may cause cell damage. To combat this threat, the cell employs the enzyme Cu/Zn Superoxide Dismutase (SOD1), which converts the superoxide radical into molecular oxygen and hydrogen peroxide through redox reactions. Although this is its main function, recent studies have shown that SOD1 has other functions that deviate from its original one, including activating nuclear gene transcription and acting as an RNA-binding protein. This comprehensive review looks at the most important aspects of human SOD1 (hSOD1), including its structure, properties, and characteristics, the transcriptional and post-translational modifications (PTMs) that the enzyme can receive and their effects, and its many functions. We also discuss the strategies currently used to analyze it to better understand its participation in diseases linked to hSOD1, including Amyotrophic Lateral Sclerosis (ALS), cancer, and Parkinson's disease.
|
17
|
Shamsi A, Ahmed A, Khan MS, Husain FM, Bano B. Rosmarinic acid restrains protein glycation and aggregation in human serum albumin: Multi spectroscopic and microscopic insight - Possible Therapeutics Targeting Diseases. Int J Biol Macromol 2020; 161:187-193. [PMID: 32526295 DOI: 10.1016/j.ijbiomac.2020.06.048] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 04/07/2020] [Revised: 05/19/2020] [Accepted: 06/05/2020] [Indexed: 12/30/2022]
Abstract
Protein aggregation and glycation are directly associated with many pathological conditions, including several neurodegenerative disorders. This study investigates the potential of a naturally occurring plant product, rosmarinic acid (RA), to inhibit the glycation and aggregation processes. We report that varying concentrations of methylglyoxal (MG) induce the formation of advanced glycation end products (AGEs) and aggregates in human serum albumin (HSA) in vitro on day 6 and day 8, respectively. AGE-specific fluorescence confirmed the formation of AGEs in HSA in the presence of MG and was further used to characterize the inhibitory potential of RA. It was found that the presence of RA prevented AGE formation in vitro. Further, aggregates of HSA were characterized employing multi-spectroscopic and microscopic techniques, and RA was found to inhibit this process. This study proposes that RA could be a potential natural molecule to treat disorders where AGEs and aggregates of proteins play a pivotal role.
Affiliation(s)
- Anas Shamsi
- Department of Biochemistry, Aligarh Muslim University, Aligarh, India
- Azaj Ahmed
- Department of Biochemistry, Aligarh Muslim University, Aligarh, India
- Mohd Shahnawaz Khan
- Department of Biochemistry, College of Sciences, King Saud University, Riyadh 11451, Saudi Arabia
- Fohad Mabood Husain
- Department of Food Science and Nutrition, Faculty of Food and Agricultural Sciences, King Saud University, Riyadh 11451, Saudi Arabia
- Bilqees Bano
- Department of Biochemistry, Aligarh Muslim University, Aligarh, India.
|
18
|
Hanson J, Litfin T, Paliwal K, Zhou Y. Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning. Bioinformatics 2020; 36:1107-1113. [PMID: 31504193 DOI: 10.1093/bioinformatics/btz691] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 03/05/2019] [Revised: 07/24/2019] [Accepted: 08/31/2019] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Protein intrinsic disorder describes the tendency of sequence residues to not fold into a rigid three-dimensional shape by themselves. However, some of these disordered regions can transition from disorder to order when interacting with another molecule in segments known as molecular recognition features (MoRFs). Previous analysis has shown that these MoRF regions are indirectly encoded within the prediction of residue disorder as low-confidence predictions [i.e. in a semi-disordered state P(D)≈0.5]. Thus, what has been learned for disorder prediction may be transferable to MoRF prediction. Transferring the internal characterization of protein disorder for the prediction of MoRF residues would allow us to take advantage of the large training set available for disorder prediction, enabling the training of larger analytical models than is currently feasible on the small number of currently available annotated MoRF proteins. In this paper, we propose a new method for MoRF prediction by transfer learning from the SPOT-Disorder2 ensemble models built for disorder prediction. RESULTS We confirm that directly training on the MoRF set with a randomly initialized model yields substantially poorer performance on independent test sets than by using the transfer-learning-based method SPOT-MoRF, for both deep and simple networks. Its comparison to current state-of-the-art techniques reveals its superior performance in identifying MoRF binding regions in proteins across two independent testing sets, including our new dataset of >800 protein chains. These test chains share <30% sequence similarity to all training and validation proteins used in SPOT-Disorder2 and SPOT-MoRF, and provide a much-needed large-scale update on the performance of current MoRF predictors. The method is expected to be useful in locating functional disordered regions in proteins. 
AVAILABILITY AND IMPLEMENTATION SPOT-MoRF and its data are available as a web server and as a standalone program at: http://sparks-lab.org/jack/server/SPOT-MoRF/index.php. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane, QLD 4122, Australia
- Thomas Litfin
- Institute for Glycomics, School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane, QLD 4122, Australia
- Yaoqi Zhou
- Institute for Glycomics, School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
|
19
|
Shapovalov M, Dunbrack RL, Vucetic S. Multifaceted analysis of training and testing convolutional neural networks for protein secondary structure prediction. PLoS One 2020; 15:e0232528. [PMID: 32374785 PMCID: PMC7202669 DOI: 10.1371/journal.pone.0232528] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/24/2019] [Accepted: 04/16/2020] [Indexed: 11/30/2022] Open
Abstract
Protein secondary structure prediction remains a vital topic with broad applications. Due to lack of a widely accepted standard in secondary structure predictor evaluation, a fair comparison of predictors is challenging. A detailed examination of factors that contribute to higher accuracy is also lacking. In this paper, we present: (1) new test sets, Test2018, Test2019, and Test2018-2019, consisting of proteins from structures released in 2018 and 2019 with less than 25% identity to any protein published before 2018; (2) a 4-layer convolutional neural network, SecNet, with an input window of ±14 amino acids which was trained on proteins ≤25% identical to proteins in Test2018 and the commonly used CB513 test set; (3) an additional test set that shares no homologous domains with the training set proteins, according to the Evolutionary Classification of Proteins (ECOD) database; (4) a detailed ablation study where we reverse one algorithmic choice at a time in SecNet and evaluate the effect on the prediction accuracy; (5) new 4- and 5-label prediction alphabets that may be more practical for tertiary structure prediction methods. The 3-label accuracy (helix, sheet, coil) of the leading predictors on both Test2018 and CB513 is 81-82%, while SecNet's accuracy is 84% for both sets. Accuracy on the non-homologous ECOD set is only 0.6 points (83.9%) lower than the results on the Test2018-2019 set (84.5%). The ablation study of features, neural network architecture, and training hyper-parameters suggests the best accuracy results are achieved with good choices for each of them while the neural network architecture is not as critical as long as it is not too simple. Protocols for generating and using unbiased test, validation, and training sets are provided. Our data sets, including input features and assigned labels, and SecNet software including third-party dependencies and databases, are downloadable from dunbrack.fccc.edu/ss and github.com/sh-maxim/ss.
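To make the ±14-residue input window and the 3-label accuracy mentioned in this abstract concrete, here is a minimal Python sketch. It is not SecNet's code: the function names, the "X" padding symbol used at chain termini, and the encoding of labels as strings of H, E and C are assumptions made purely for illustration.

```python
def residue_windows(sequence: str, half_width: int = 14, pad: str = "X") -> list[str]:
    """Return one window of 2*half_width + 1 residues centred on each position.

    Padding the termini with `pad` is an assumption; the abstract does not
    say how residues near the ends of a chain are handled.
    """
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


def three_label_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted label (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


windows = residue_windows("ACDEFGHIK")
print(len(windows), len(windows[0]))         # one 29-residue window per position
print(three_label_accuracy("HHEC", "HHCC"))  # 3 of 4 labels agree
```

With the default half-width of 14, each window holds 29 residues, which matches the ±14 input window described above; a real predictor would feed numeric features for each window position rather than raw letters.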
Affiliation(s)
- Maxim Shapovalov
- Fox Chase Cancer Center, Philadelphia, PA, United States of America
- Temple University, Philadelphia, PA, United States of America
|
20
|
Badierah RA, Uversky VN, Redwan EM. Dancing with Trojan horses: an interplay between the extracellular vesicles and viruses. J Biomol Struct Dyn 2020; 39:3034-3060. [DOI: 10.1080/07391102.2020.1756409] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 02/06/2023]
Affiliation(s)
- Raied A. Badierah
- Biological Science Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Molecular Diagnostic Laboratory, King Abdulaziz University Hospital, King Abdulaziz University, Jeddah, Saudi Arabia
- Vladimir N. Uversky
- Biological Science Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center ‘Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences’, Pushchino, Moscow Region, Russia
- Elrashdy M. Redwan
- Biological Science Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
|
21
|
Hanson J, Paliwal KK, Litfin T, Zhou Y. SPOT-Disorder2: Improved Protein Intrinsic Disorder Prediction by Ensembled Deep Learning. GENOMICS PROTEOMICS & BIOINFORMATICS 2020; 17:645-656. [PMID: 32173600 PMCID: PMC7212484 DOI: 10.1016/j.gpb.2019.01.004] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 10/26/2018] [Revised: 01/18/2019] [Accepted: 02/15/2019] [Indexed: 01/13/2023]
Abstract
Intrinsically disordered or unstructured proteins (or regions in proteins) have been found to be important in a wide range of biological functions and implicated in many diseases. Due to the high cost and low efficiency of experimental determination of intrinsic disorder and the exponential increase of unannotated protein sequences, developing complementary computational prediction methods has been an active area of research for several decades. Here, we employed an ensemble of deep Squeeze-and-Excitation residual inception and long short-term memory (LSTM) networks for predicting protein intrinsic disorder with input from evolutionary information and predicted one-dimensional structural properties. The method, called SPOT-Disorder2, offers substantial and consistent improvement not only over our previous technique based on LSTM networks alone, but also over other state-of-the-art techniques in three independent tests with different ratios of disordered to ordered amino acid residues, and for sequences with either rich or limited evolutionary information. More importantly, semi-disordered regions predicted in SPOT-Disorder2 are more accurate in identifying molecular recognition features (MoRFs) than methods directly designed for MoRFs prediction. SPOT-Disorder2 is available as a web server and as a standalone program at https://sparks-lab.org/server/spot-disorder2/.
Affiliation(s)
- Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane 4111, Australia
- Kuldip K Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane 4111, Australia
- Thomas Litfin
- School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia
- Yaoqi Zhou
- School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia; Institute for Glycomics, Griffith University, Gold Coast 4222, Australia.
|
22
|
Abstract
Intrinsically disordered proteins (IDPs) and regions (IDRs) are commonly found in all proteomes analyzed so far. These proteins/regions are subject to numerous posttranslational modifications (PTMs) and alternative splicing, are involved in a wide range of cellular functions, and often facilitate protein-protein interactions (PPIs). Some of these proteins contain molecular recognition features (MoRFs), which are IDRs that bind to partner proteins and undergo disorder-to-order transitions. Although many IDPs/IDRs can fold upon binding, a large fraction of these proteins are known to maintain significant amounts of disorder in their bound states. Being well-recognized interaction specialists, IDPs/IDRs can participate in one-to-many and many-to-one interactions, where one IDP/IDR binds to multiple partners potentially gaining very different structures in the bound state, or where multiple unrelated IDPs/IDRs bind to one partner. As a result, IDPs frequently serve as hubs (i.e., proteins with many links) in complex PPI networks. The goal of this chapter is to describe computational and bioinformatics tools that can be used to look at the disorder status of proteins within a given PPI network and also to gain some knowledge on the disorder-based functionality of the members of this network. To this end, we describe the use of the UniProt and DisProt databases, several databases generating PPI networks (BioGRID, IntAct, DIP, MINT, HPRD, APID, KEGG, and STRING), Composition profiler, some tools for per-residue disorder prediction (PONDR® VLXT, PONDR® VL3, PONDR® VSL2, PONDR-FIT, and IUPred), the binary disorder classifiers CH-plot and CDF-plot and their combined CH-CDF analysis, web-based tools for the visualization of disorder distribution in a query protein (D2P2 and MobiDB), as well as some tools for evaluating the disorder-based functionality of proteins (ANCHOR, MoRFpred, DEPP, and ModPred).
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation.
|
23
|
El Hadidy N, Uversky VN. Intrinsic Disorder of the BAF Complex: Roles in Chromatin Remodeling and Disease Development. Int J Mol Sci 2019; 20:ijms20215260. [PMID: 31652801 PMCID: PMC6862534 DOI: 10.3390/ijms20215260] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/20/2019] [Revised: 10/12/2019] [Accepted: 10/21/2019] [Indexed: 12/13/2022] Open
Abstract
The two-meter-long DNA is compressed into chromatin in the nucleus of every cell, which serves as a significant barrier to transcription. Therefore, for processes such as replication and transcription to occur, the highly compacted chromatin must be relaxed, and the processes required for chromatin reorganization for the aim of replication or transcription are controlled by ATP-dependent nucleosome remodelers. One of the most highly studied remodelers of this kind is the BRG1- or BRM-associated factor complex (BAF complex, also known as SWItch/sucrose non-fermentable (SWI/SNF) complex), which is crucial for the regulation of gene expression and differentiation in eukaryotes. Chromatin remodeling complex BAF is characterized by a highly polymorphic structure, containing from four to 17 subunits encoded by 29 genes. The aim of this paper is to provide an overview of the role of BAF complex in chromatin remodeling and also to use literature mining and a set of computational and bioinformatics tools to analyze structural properties, intrinsic disorder predisposition, and functionalities of its subunits, along with the description of the relations of different BAF complex subunits to the pathogenesis of various human diseases.
Affiliation(s)
- Nashwa El Hadidy
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL 33612, USA.
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL 33612, USA.
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, 142290 Moscow Region, Russia.
|
24
|
Abstract
Entropy should directly reflect the extent of disorder in proteins. By clustering structurally related proteins and studying the multiple sequence alignment (MSA) of the sequences in these clusters, we were able to link sequence, structure, and disorder information. We introduced several parameters as measures of fluctuations at a given MSA site and used these as representative of the sequence and structure entropy at that site. In general, we found a tendency for negative correlations between disorder and structure, and significant positive correlations between disorder and the fluctuations in the system. We also found evidence for residue-type conservation for those residues proximate to potentially disordered sites. Mutations at the disordered site itself appear to be allowed. In addition, we found a positive correlation between disorder and accessible surface area, validating that disordered residues occur in exposed regions of proteins. Finally, we also found that fluctuations in the dihedral angles at the original mutated residue and disorder are positively correlated, while dihedral angle fluctuations in spatially proximal residues are negatively correlated with disorder. Our results seem to indicate permissible variability in the disordered site, but greater rigidity in the parts of the protein with which the disordered site interacts. This is another indication that disordered residues are involved in protein function.
|
25
|
Katuwawala A, Peng Z, Yang J, Kurgan L. Computational Prediction of MoRFs, Short Disorder-to-order Transitioning Protein Binding Regions. Comput Struct Biotechnol J 2019; 17:454-462. [PMID: 31007871 PMCID: PMC6453775 DOI: 10.1016/j.csbj.2019.03.013] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 02/06/2019] [Revised: 03/22/2019] [Accepted: 03/23/2019] [Indexed: 12/28/2022] Open
Abstract
Molecular recognition features (MoRFs) are short protein-binding regions that undergo disorder-to-order transitions (induced folding) upon binding protein partners. These regions are abundant in nature and can be predicted from protein sequences based on their distinctive sequence signatures. This first-of-its-kind survey covers 14 MoRF predictors and six related methods for the prediction of short protein-binding linear motifs, disordered protein-binding regions and semi-disordered regions. We show that the development of MoRF predictors has accelerated in the recent years. These predictors depend on machine learning-derived models that were generated using training datasets where MoRFs are annotated using putative disorder. Our analysis reveals that they generate accurate predictions. We identified eight methods that offer area under the ROC curve (AUC) ≥ 0.7 on experimentally-validated test datasets. We show that modern MoRF predictors accurately find experimentally annotated MoRFs even though they were trained using the putative disorder annotations. They are relatively highly-cited, particularly the methods available as webservers that on average secure three times more citations than methods without this option. MoRF predictions contribute to the experimental discovery of protein-protein interactions, annotation of protein functions and computational analysis of a variety of proteomes, protein families, and pathways. We outline future development and application directions for these tools, stressing the importance to develop novel tools that would target interactions of disordered regions with other types of partners.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, USA
| | - Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin, China
| | - Jianyi Yang
- School of Mathematical Sciences, Nankai University, Tianjin, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, USA
|
26
|
Fang C, Moriwaki Y, Tian A, Li C, Shimizu K. Identifying short disorder-to-order binding regions in disordered proteins with a deep convolutional neural network method. J Bioinform Comput Biol 2019; 17:1950004. [PMID: 30866736 DOI: 10.1142/s0219720019500045] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Molecular recognition features (MoRFs) are key functional regions of intrinsically disordered proteins (IDPs), which play important roles in the molecular interaction network of cells and are implicated in many serious human diseases. Identifying MoRFs is essential for both functional studies of IDPs and drug design. This study adopts a cutting-edge machine learning method to develop a powerful model for improving MoRF prediction. We propose a method named en_DCNNMoRF (ensemble deep convolutional neural network-based MoRF predictor). It combines the outcomes of two independent deep convolutional neural network (DCNN) classifiers that take advantage of different features. The first, DCNNMoRF1, employs a position-specific scoring matrix (PSSM) and 22 types of amino acid-related factors to describe protein sequences. The second, DCNNMoRF2, employs a PSSM and 13 types of amino acid indexes to describe protein sequences. For both single classifiers, a DCNN with a novel two-dimensional attention mechanism was adopted, and an averaging strategy was added to further process the output probabilities of each DCNN model. Finally, en_DCNNMoRF combines the two models by averaging their final scores. When compared with other well-known tools applied to the same datasets, the accuracy of the proposed method was comparable with that of state-of-the-art methods. The related web server can be accessed freely via http://vivace.bi.a.u-tokyo.ac.jp:8008/fang/en_MoRFs.php.
Affiliation(s)
- Chun Fang
- Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
| | - Yoshitaka Moriwaki
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
| | - Aikui Tian
- Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
| | - Caihong Li
- Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
| | - Kentaro Shimizu
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
|
27
|
Kumar Ghosh D, Nanaji Shrikondawar A, Ranjan A. Local structural unfolding at the edge-strands of beta sheets is the molecular basis for instability and aggregation of G85R and G93A mutants of superoxide dismutase 1. J Biomol Struct Dyn 2019; 38:647-659. [DOI: 10.1080/07391102.2019.1584125] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Affiliation(s)
- Debasish Kumar Ghosh
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Uppal, Hyderabad, India
- Graduate Studies, Manipal Academy of Higher Education, Manipal, Karnataka, India
| | - Akshaykumar Nanaji Shrikondawar
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Uppal, Hyderabad, India
- Graduate Studies, Regional Centre for Biotechnology, Faridabad, Haryana, India
| | - Akash Ranjan
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Uppal, Hyderabad, India
|
28
|
Kurvits L, Reimann E, Kadastik-Eerme L, Truu L, Kingo K, Erm T, Kõks S, Taba P, Planken A. Serum Amyloid Alpha Is Downregulated in Peripheral Tissues of Parkinson's Disease Patients. Front Neurosci 2019; 13:13. [PMID: 30760975 PMCID: PMC6361740 DOI: 10.3389/fnins.2019.00013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/25/2018] [Accepted: 01/08/2019] [Indexed: 11/13/2022] Open
Abstract
We report changed levels of serum amyloid alpha, an immunologically active protein, in Parkinson’s disease (PD) patients’ peripheral tissues. We have previously shown that Saa-1 and -2 (serum amyloid alpha-1,-2, genes) were among the top downregulated genes in PD patients’ skin, using whole-genome RNA sequencing. In the current study, we characterized the gene and protein expression profiles of skin and blood samples from patients with confirmed PD diagnosis and age/sex matched controls. qRT-PCR analysis of PD skin demonstrated downregulation of Saa-1 and -2 genes in PD patients. However, the lowered amount of protein could not be visualized using immunohistochemistry, due to the low quantity of SAA (Serum Amyloid Alpha, protein) in skin. Saa-1 and -2 expression levels in whole blood were below the detection threshold based on RNA sequencing; however, significantly lowered protein levels of SAA1/2 in PD patients’ serum were shown with ELISA, implying that SAA is secreted into the blood. These results show that SAA is differentially expressed in the peripheral tissues of PD patients.
Affiliation(s)
- Lille Kurvits
- Department of Neurology and Neurosurgery, University of Tartu, Tartu, Estonia.,Department of Neurology, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Ene Reimann
- Institute of Pathophysiology, University of Tartu, Tartu, Estonia
| | - Liis Kadastik-Eerme
- Department of Neurology and Neurosurgery, University of Tartu, Tartu, Estonia
| | - Laura Truu
- Laboratory of Bioenergetics, National Institute of Chemical Physics and Biophysics, Tallinn, Estonia
| | - Külli Kingo
- Department of Dermatology, University of Tartu, Tartu, Estonia.,Dermatology Clinic, Tartu University Hospital, Tartu, Estonia
| | - Triin Erm
- Department of Pathology, Tartu University Hospital, Tartu, Estonia
| | - Sulev Kõks
- Centre for Comparative Genomics, Murdoch University, Perth, WA, Australia.,Perron Institute for Neurological and Translational Science, University of Western Australia, Perth, WA, Australia
| | - Pille Taba
- Department of Neurology and Neurosurgery, University of Tartu, Tartu, Estonia
| | - Anu Planken
- Oncology and Haematology Clinic, North-Estonian Medical Centre, Tallinn, Estonia
|
29
|
Pedersen JN, Jiang Z, Christiansen G, Lee JC, Pedersen JS, Otzen DE. Lysophospholipids induce fibrillation of the repeat domain of Pmel17 through intermediate core-shell structures. Biochimica et Biophysica Acta - Proteins and Proteomics 2018; 1867:519-528. [PMID: 30471451 DOI: 10.1016/j.bbapap.2018.11.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 05/16/2018] [Revised: 11/02/2018] [Accepted: 11/17/2018] [Indexed: 11/26/2022]
Abstract
Lipids often play an important role in the initial steps of fibrillation. The melanosomal protein Pmel17 forms amyloid in vivo and contains a highly amyloidogenic Repeat domain (RPT), important for melanin biosynthesis. RPT fibrillation is influenced by two lysolipids, the anionic lysophosphatidylglycerol (LPG) and zwitterionic lysophosphatidylcholine (LPC), both present in vivo at elevated concentrations in melanosomes, the organelles in which Pmel17 aggregates. Here we investigate the interaction of RPT with both LPG and LPC using small-angle X-ray scattering (SAXS), isothermal titration calorimetry (ITC), electron microscopy, fluorescence and circular dichroism (CD) spectroscopy. Under non-shaking conditions, both lipids promote fibrillation but this is driven by different interactions with RPT. Each RPT binds >40 LPG molecules but only weak interactions are seen with LPC. Above LPG's critical micelle concentration (cmc), LPG and RPT form connected micelles where RPT binds to the surface as beads on a string with core-shell structures. Binding to LPG only induces α-helical structure well above the cmc, while LPC has no measurable effect on the protein structure. While low (but still super-cmc) concentrations of LPG strongly promote aggregation, at higher LPG concentrations (10 mM), only about one RPT binds per micelle, inhibiting amyloid formation. ITC and SAXS reveal some interactions between the zwitterionic lipid LPC and RPT below the cmc but little above the cmc. Nevertheless, LPC only promotes aggregation above the cmc and this process is not inhibited by high LPC concentrations, suggesting that monomers and micelles cooperate to influence amyloid formation.
Affiliation(s)
- Jannik Nedergaard Pedersen
- Interdisciplinary Nanoscience Center (iNANO), Department of Chemistry, Aarhus University, Gustav Wieds Vej 14, 8000 Aarhus, Denmark
| | - Zhiping Jiang
- Laboratory of Protein Conformation and Dynamics, Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892-8013, USA
| | - Gunna Christiansen
- Department of Biomedicine, Aarhus University, Wilhelm Meyers Allé 4, DK-8000 Aarhus, Denmark
| | - Jennifer C Lee
- Laboratory of Protein Conformation and Dynamics, Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892-8013, USA
| | - Jan Skov Pedersen
- Interdisciplinary Nanoscience Center (iNANO), Department of Chemistry, Aarhus University, Gustav Wieds Vej 14, 8000 Aarhus, Denmark.
| | - Daniel E Otzen
- Interdisciplinary Nanoscience Center (iNANO), Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 14, DK-8000 Aarhus, Denmark.
|
30
|
An in-silico method for identifying aggregation rate enhancer and mitigator mutations in proteins. Int J Biol Macromol 2018; 118:1157-1167. [DOI: 10.1016/j.ijbiomac.2018.06.102] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/29/2018] [Revised: 06/19/2018] [Accepted: 06/20/2018] [Indexed: 12/27/2022]
|
31
|
Yang Y, Gao J, Wang J, Heffernan R, Hanson J, Paliwal K, Zhou Y. Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Brief Bioinform 2018; 19:482-494. [PMID: 28040746 PMCID: PMC5952956 DOI: 10.1093/bib/bbw129] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 10/10/2016] [Revised: 11/15/2016] [Indexed: 11/13/2022] Open
Abstract
Protein secondary structure prediction began in 1951 when Pauling and Corey predicted helical and sheet conformations for the protein polypeptide backbone even before the first protein structure was determined. Sixty-five years later, powerful new methods breathe new life into this field. The highest three-state accuracy without relying on structure templates is now at 82-84%, a number unthinkable just a few years ago. These improvements came from increasingly large databases of protein sequences and structures for training, the use of template secondary structure information and more powerful deep learning techniques. As we approach the theoretical limit of three-state prediction (88-90%), alternatives to secondary structure prediction (prediction of backbone torsion angles and Cα-atom-based angles and torsion angles) not only have more room for further improvement but also allow direct prediction of three-dimensional fragment structures with constantly improved accuracy. About 20% of all 40-residue fragments in a database of 1199 non-redundant proteins have <6 Å root-mean-squared distance from the native conformations by SPIDER2. More powerful deep learning methods with improved capability of capturing long-range interactions are beginning to emerge as the next generation of techniques for secondary structure prediction. The time has come to finish off the final stretch of the long march towards protein secondary structure prediction.
Affiliation(s)
- Yuedong Yang
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Drive, Southport, QLD, Australia
| | - Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
| | - Jihua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
| | - Rhys Heffernan
- Signal Processing Laboratory, Griffith University, Brisbane, Australia
| | - Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane, Australia
| | - Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane, Australia
| | - Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Drive, Southport, QLD, Australia
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
|
32
|
Jones CL, Njomen E, Sjögren B, Dexheimer TS, Tepe JJ. Small Molecule Enhancement of 20S Proteasome Activity Targets Intrinsically Disordered Proteins. ACS Chem Biol 2017; 12:2240-2247. [PMID: 28719185 DOI: 10.1021/acschembio.7b00489] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Indexed: 11/29/2022]
Abstract
The 20S proteasome is the main protease for the degradation of oxidatively damaged and intrinsically disordered proteins. When accumulation of disordered or oxidatively damaged proteins exceeds proper clearance in neurons, imbalanced pathway signaling or aggregation occurs, both of which have been implicated in the pathogenesis of several neurological disorders. Screening of the NIH Clinical Collection and Prestwick libraries identified the neuroleptic agent chlorpromazine as a lead agent capable of enhancing 20S proteasome activity. Chemical manipulation of chlorpromazine abrogated its D2R binding affinity while retaining its ability to enhance 20S-mediated proteolysis at low micromolar concentrations. The resulting small molecule enhancers of 20S proteasome activity induced the degradation of the intrinsically disordered proteins α-synuclein and tau but not of structured proteins. These small molecule 20S agonists can serve as leads to explore the therapeutic potential of 20S activation or as new tools to provide insight into the as yet unclear mechanics of 20S-gate regulation.
Affiliation(s)
- Corey L. Jones
- Department of Chemistry and Department of Pharmacology and Toxicology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Evert Njomen
- Department of Chemistry and Department of Pharmacology and Toxicology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Benita Sjögren
- Department of Chemistry and Department of Pharmacology and Toxicology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Thomas S. Dexheimer
- Department of Chemistry and Department of Pharmacology and Toxicology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Jetze J. Tepe
- Department of Chemistry and Department of Pharmacology and Toxicology, Michigan State University, East Lansing, Michigan 48824, United States
|
33
|
Hanson J, Yang Y, Paliwal K, Zhou Y. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics 2017; 33:685-692. [PMID: 28011771 DOI: 10.1093/bioinformatics/btw678] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 06/07/2016] [Accepted: 10/26/2016] [Indexed: 11/12/2022] Open
Abstract
Motivation: Capturing long-range interactions between structural but not sequence neighbors of proteins is a long-standing challenge in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification problems by remembering useful past information in long sequential events. Here, we have implemented deep bidirectional LSTM recurrent neural networks in the problem of protein intrinsic disorder prediction. Results: The new method, named SPOT-Disorder, has steadily improved over a similar method using a traditional, window-based neural network (SPINE-D) in all datasets tested without separate training on short and long disordered regions. Independent tests on four other datasets, including the datasets from critical assessment of structure prediction (CASP) techniques and >10 000 annotated proteins from MobiDB, confirmed SPOT-Disorder as one of the best methods in disorder prediction. Moreover, initial studies indicate that the method is more accurate in predicting functional sites in disordered regions. These results highlight the usefulness of combining LSTM with deep bidirectional recurrent neural networks in capturing non-local, long-range interactions for bioinformatics applications. Availability and Implementation: SPOT-disorder is available as a web server and as a standalone program at: http://sparks-lab.org/server/SPOT-disorder/index.php. Contact: j.hanson@griffith.edu.au or yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia
| | - Yuedong Yang
- Institute for Glycomics, Griffith University, Gold Coast 4215, Australia
| | - Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia
| | - Yaoqi Zhou
- Institute for Glycomics, Griffith University, Gold Coast 4215, Australia
|
34
|
Ginn BR. The thermodynamics of protein aggregation reactions may underpin the enhanced metabolic efficiency associated with heterosis, some balancing selection, and the evolution of ploidy levels. Progress in Biophysics and Molecular Biology 2017; 126:1-21. [PMID: 28185903 DOI: 10.1016/j.pbiomolbio.2017.01.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/02/2017] [Accepted: 01/24/2017] [Indexed: 01/04/2023]
Abstract
Identifying the physical basis of heterosis (or "hybrid vigor") has remained elusive despite over a hundred years of research on the subject. The three main theories of heterosis are dominance theory, overdominance theory, and epistasis theory. Kacser and Burns (1981) identified the molecular basis of dominance, which has greatly enhanced our understanding of its importance to heterosis. This paper aims to explain how overdominance, and some features of epistasis, can similarly emerge from the molecular dynamics of proteins. Possessing multiple alleles at a gene locus results in the synthesis of different allozymes at reduced concentrations. This in turn reduces the rate at which each allozyme forms soluble oligomers, which are toxic and must be degraded, because allozymes co-aggregate at low efficiencies. The model developed in this paper can explain how heterozygosity impacts the metabolic efficiency of an organism. It can also explain why the viabilities of some inbred lines seem to decline rapidly at high inbreeding coefficients (F > 0.5), which may provide a physical basis for truncation selection for heterozygosity. Finally, the model has implications for the ploidy level of organisms. It can explain why polyploids are frequently found in environments where severe physical stresses promote the formation of soluble oligomers. The model can also explain why complex organisms, which need to synthesize aggregation-prone proteins that contain intrinsically unstructured regions (IURs) and multiple domains because they facilitate complex protein interaction networks (PINs), tend to be diploid while haploidy tends to be restricted to relatively simple organisms.
Affiliation(s)
- B R Ginn
- University of Georgia, GA 30602, United States.
|
35
|
Abstract
Over the past decade, it has become evident that a large proportion of proteins contain intrinsically disordered regions, which play important roles in pivotal cellular functions. Many computational tools have been developed with the aim of identifying the level and location of disorder within a protein. In this chapter, we describe a neural network based technique called SPINE-D that employs a unique three-state design and can accurately capture disordered residues in both short and long disordered regions. SPINE-D was trained on a large database of 4229 non-redundant proteins, and yielded an AUC of 0.86 on a cross-validation test and 0.89 on an independent test. SPINE-D can also detect a semi-disordered state that is associated with induced folders and aggregation-prone regions in disordered proteins and weakly stable or locally unfolded regions in structured proteins. We implement an online web service and an offline stand-alone program for SPINE-D; both are freely available at http://sparks-lab.org/SPINE-D/. We then walk the reader through how to use the online and offline SPINE-D in making disorder predictions, and examine the disorder and semi-disorder prediction in a case study on the p53 protein.
Affiliation(s)
- Tuo Zhang
- Department of Microbiology and Immunology, Weill Cornell Medical College, New York, NY, 10065, USA
| | - Eshel Faraggi
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, 46032, USA
- Research and Information Systems, LLC, Indianapolis, IN, USA
| | - Zhixiu Li
- Translational Genomics Group, Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, 37 Kent Street, Woolloongabba, QLD, 4102, Australia
| | - Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Science 1 (G24) 2.10, Parklands Drive, Southport, QLD, 4222, Australia.
|
36
|
Abstract
Currently available computational tools, which are many, provide a researcher with a multitude of options for predicting intrinsic disorder in a protein of interest and for finding at least some of its disorder-based functions. This chapter provides a highly subjective guideline on how not to be lost in the "dark forest" of available tools for the analysis of intrinsic disorder. By no means does it give a unique pathway through this forest; it simply presents some of the tools the author uses in his everyday research.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA.
- Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russian Federation.
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russian Federation.
|
37
|
Lieutaud P, Ferron F, Uversky AV, Kurgan L, Uversky VN, Longhi S. How disordered is my protein and what is its disorder for? A guide through the "dark side" of the protein universe. Intrinsically Disordered Proteins 2016; 4:e1259708. [PMID: 28232901 DOI: 10.1080/21690707.2016.1259708] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Received: 09/22/2016] [Revised: 11/03/2016] [Accepted: 11/04/2016] [Indexed: 12/18/2022]
Abstract
In the last 2 decades it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins lack a stable 3D structure, are ubiquitous and fulfill essential biological functions. Their conformational heterogeneity is encoded in their amino acid sequences, thereby allowing intrinsically disordered proteins or regions to be recognized based on properties of these sequences. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to structural determination with X-ray crystallization. This article discusses a comprehensive selection of databases and methods currently employed to disseminate experimental and putative annotations of disorder, predict disorder and identify regions involved in induced folding. It also provides a set of detailed instructions that should be followed to perform computational analysis of disorder.
Affiliation(s)
- Philippe Lieutaud
- Aix-Marseille Université, AFMB UMR, Marseille, France; CNRS, AFMB UMR, Marseille, France
| | - François Ferron
- Aix-Marseille Université, AFMB UMR, Marseille, France; CNRS, AFMB UMR, Marseille, France
| | - Alexey V Uversky
- Center for Data Analytics and Biomedical Informatics, Department of Computer and Information Sciences, College of Science and Technology, Temple University , Philadelphia, PA, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University , Richmond, VA, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
| | - Sonia Longhi
- Aix-Marseille Université, AFMB UMR, Marseille, France; CNRS, AFMB UMR, Marseille, France
|
38
|
Naeem A, Bhat SA, Iram A, Khan RH. Aggregation of intrinsically disordered fibrinogen as the influence of backbone conformation. Arch Biochem Biophys 2016; 603:38-47. [DOI: 10.1016/j.abb.2016.04.017] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 12/30/2015] [Revised: 04/13/2016] [Accepted: 04/30/2016] [Indexed: 10/21/2022]
|
39
|
Yu JF, Cao Z, Yang Y, Wang CL, Su ZD, Zhao YW, Wang JH, Zhou Y. Natural protein sequences are more intrinsically disordered than random sequences. Cell Mol Life Sci 2016; 73:2949-57. [PMID: 26801222 PMCID: PMC4937073 DOI: 10.1007/s00018-016-2138-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 12/18/2015] [Revised: 01/10/2016] [Accepted: 01/11/2016] [Indexed: 11/16/2022]
Abstract
Most natural protein sequences have resulted from millions or even billions of years of evolution. How they differ from random sequences is not fully understood. Previous computational and experimental studies of random proteins generated from noncoding regions yielded inconclusive results due to species-dependent codon biases and GC contents. Here, we approach this problem by investigating 10,000 sequences randomized at the amino acid level. Using well-established predictors for protein intrinsic disorder, we found that natural sequences have more long disordered regions than random sequences, even when random and natural sequences have the same overall composition of amino acid residues. We also showed that random sequences are as structured as natural sequences according to contents and length distributions of predicted secondary structure, although the structures from random sequences may be in a molten globular-like state, according to molecular dynamics simulations. The bias of natural sequences toward more intrinsic disorder suggests that natural sequences are created and evolved to avoid protein aggregation and increase functional diversity.
Affiliation(s)
- Jia-Feng Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
| | - Zanxia Cao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China
| | - Yuedong Yang
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD, 4222, Australia
| | - Chun-Ling Wang
- College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China
| | - Zhen-Dong Su
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
| | - Ya-Wei Zhao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
| | - Ji-Hua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China
| | - Yaoqi Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China.
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD, 4222, Australia.
|
40
|
DeForte S, Uversky VN. Resolving the ambiguity: Making sense of intrinsic disorder when PDB structures disagree. Protein Sci 2016; 25:676-88. [PMID: 26683124 DOI: 10.1002/pro.2864] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 10/30/2015] [Revised: 12/14/2015] [Accepted: 12/15/2015] [Indexed: 12/25/2022]
Abstract
Missing regions in X-ray crystal structures in the Protein Data Bank (PDB) have played a foundational role in the study of intrinsically disordered protein regions (IDPRs), especially in the development of in silico predictors of intrinsic disorder. However, a missing region is only a weak indication of intrinsic disorder, and this uncertainty is compounded by the presence of ambiguous regions, where more than one structure of the same protein sequence "disagrees" in terms of the presence or absence of missing residues. The question is this: are these ambiguous regions intrinsically disordered, or are they the result of static disorder that arises from experimental conditions, ensembles of structures, or domain wobbling? A novel way of looking at ambiguous regions in terms of the pattern between multiple PDB structures has been demonstrated. It was found that the propensity for intrinsic disorder increases as the level of ambiguity decreases. However, it is also shown that ambiguity is more likely to occur as the protein region is placed within different environmental conditions, and even the most ambiguous regions as a set display compositional bias that suggests flexibility. The results suggested that ambiguity is a natural result for many IDPRs crystallized under different conditions and that static disorder and wobbling domains are relatively rare. Instead, it is more likely that ambiguity arises because many of these regions were conditionally or partially disordered.
Affiliation(s)
- Shelly DeForte
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612; Department of Biological Science, Faculty of Science, King Abdulaziz University, PO Box 80203, Jeddah, Jeddah 21589, Saudi Arabia; Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russian Federation
|
41
|
Yan J, Dunker AK, Uversky VN, Kurgan L. Molecular recognition features (MoRFs) in three domains of life. Mol Biosyst 2016; 12:697-710. [DOI: 10.1039/c5mb00640f] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Indexed: 12/15/2022]
Abstract
MoRFs are widespread intrinsically disordered protein-binding regions that have similar abundance and amino acid composition across the three domains of life.
Affiliation(s)
- Jing Yan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, USA; Indiana University School of Informatics
- Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, USA
- Lukasz Kurgan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada; Department of Computer Science
|
42
|
Tokunaga Y, Matsumoto M, Sugimoto Y. Amyloid fibril formation from a 9 amino acid peptide, 55th–63rd residues of human lysozyme. Int J Biol Macromol 2015; 80:208-16. [DOI: 10.1016/j.ijbiomac.2015.06.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/21/2015] [Revised: 06/09/2015] [Accepted: 06/11/2015] [Indexed: 10/23/2022]
|
43
|
Abstract
Accumulating evidence indicates that RNA metabolism components assemble into supramolecular cellular structures to mediate functional compartmentalization within the cytoplasmic membrane of the bacterial cell. This cellular compartmentalization could play important roles in the processes of RNA degradation and maturation. These components include Hfq, the RNA chaperone protein, which is involved in the post-transcriptional control of protein synthesis mainly by virtue of its interactions with several small regulatory ncRNAs (sRNAs). The Escherichia coli Hfq is structurally organized into two domains: an N-terminal domain that folds as strongly bent β-sheets within individual protomers to assemble into a typical toroidal hexameric ring, and a C-terminal flexible domain that encompasses approximately one-third of the protein and seems intrinsically unstructured. The RNA-binding function of Hfq mainly lies within its N-terminal core, whereas the function of the flexible domain remains controversial and largely unknown. In the present study, we demonstrate that the Hfq C-terminal region (CTR) has an intrinsic property to self-assemble into long amyloid-like fibrillar structures in vitro. We show that normal localization of Hfq within membrane-associated coiled structures in vivo requires this C-terminal domain. This finding establishes for the first time a function for the hitherto puzzling CTR, with a plausible central role in RNA transactions. We showed that the Hfq C-terminal region (CTR) has an intrinsic property to self-assemble into amyloid-like fibrils. This region is required for cellular assembly of Hfq into membrane-associated coiled structures. The work establishes a new function for this naturally unstructured Hfq domain.
|
44
|
DeForte S, Reddy KD, Uversky VN. Digested disorder, Quarterly intrinsic disorder digest (October-November-December, 2013). Intrinsically Disord Proteins 2015; 3:e984569. [PMID: 28293487 DOI: 10.4161/21690707.2014.984569] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/09/2014] [Revised: 10/09/2014] [Accepted: 10/10/2014] [Indexed: 11/19/2022]
Abstract
This is the 4th issue of the Digested Disorder series, which represents a reader's digest of the scientific literature on intrinsically disordered proteins. The only two criteria for inclusion in this digest are the publication date (a paper should be published within the covered time frame) and topic (a paper should be dedicated to any aspect of protein intrinsic disorder). The current digest issue covers papers published during the fourth quarter of 2013, i.e., during October, November, and December of 2013. As in previous issues, the papers are grouped hierarchically by the topics they cover, and for each included paper a short description of its major findings is given.
Affiliation(s)
- Shelly DeForte
- Department of Molecular Medicine; Morsani College of Medicine; University of South Florida; Tampa, FL USA; These authors contributed equally to this work
- Krishna D Reddy
- Department of Molecular Medicine; Morsani College of Medicine; University of South Florida; Tampa, FL USA; These authors contributed equally to this work
- Vladimir N Uversky
- Department of Molecular Medicine; Morsani College of Medicine; University of South Florida; Tampa, FL USA; USF Health Byrd Alzheimer Research Institute; Morsani College of Medicine; University of South Florida; Tampa, FL USA; Biology Department; Faculty of Science; King Abdulaziz University; Jeddah, Kingdom of Saudi Arabia; Laboratory of Structural Dynamics, Stability, and Folding of Proteins; Institute of Cytology; Russian Academy of Sciences; St. Petersburg, Russia; Institute for Biological Instrumentation; Russian Academy of Sciences; Moscow Region, Russia
|
45
|
Andrich K, Bieschke J. The Effect of (-)-Epigallo-catechin-(3)-gallate on Amyloidogenic Proteins Suggests a Common Mechanism. Adv Exp Med Biol 2015; 863:139-61. [PMID: 26092630 DOI: 10.1007/978-3-319-18365-7_7] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 12/12/2022]
Abstract
Studies on the interaction of the green tea polyphenol (-)-Epigallocatechin-3-gallate (EGCG) with fourteen disease-related amyloid polypeptides and prions (Huntingtin, Amyloid-beta, alpha-Synuclein, islet amyloid polypeptide (IAPP), Sup35, NM25 and NM4, tau, MSP2, semen-derived enhancer of virus infection (SEVI), immunoglobulin light chains, beta-microglobulin, prion protein (PrP) and Insulin) have yielded a variety of experimental observations. Here, we analyze whether these observations could be explained by a common mechanism and give a broad overview of the published experimental data on the actions of EGCG. Firstly, we look at the influence of EGCG on aggregate toxicity, morphology, seeding competence, stability and conformational changes. Secondly, we screen publications elucidating the biochemical mechanism of EGCG intervention, notably the effect of EGCG on aggregation kinetics, oligomeric aggregation intermediates, and its binding mode to polypeptides. We hypothesize that the experimental results may be reconciled in a common mechanism, in which EGCG binds to cross-beta sheet aggregation intermediates. The relative position of these species in the energy profile of the amyloid cascade would determine the net effect of EGCG on aggregation and disaggregation of amyloid fibrils.
Affiliation(s)
- Kathrin Andrich
- Department of Biomedical Engineering, Washington University in St. Louis, Saint Louis, MO, USA
|
46
|
Dunker AK, Oldfield CJ. Back to the Future: Nuclear Magnetic Resonance and Bioinformatics Studies on Intrinsically Disordered Proteins. Adv Exp Med Biol 2015; 870:1-34. [PMID: 26387098 DOI: 10.1007/978-3-319-20164-1_1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]
Abstract
From the 1970s to the present, regions of missing electron density in protein structures determined by X-ray diffraction and the characterization of the functions of these regions have suggested that not all protein regions depend on prior 3D structure to carry out function. Motivated by these observations, in early 1996 we began to use bioinformatics approaches to study these intrinsically disordered proteins (IDPs) and IDP regions. At just about the same time, several laboratory groups began to study a collection of IDPs and IDP regions using nuclear magnetic resonance. The temporal overlap of the bioinformatics and NMR studies played a significant role in the development of our understanding of IDPs. Here the goal is to recount some of this history and to project from this experience possible directions for future work.
Affiliation(s)
- A Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 46202, Indianapolis, IN, USA.
- Christopher J Oldfield
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 46202, Indianapolis, IN, USA.
|
47
|
Wang J, Yang Y, Cao Z, Li Z, Zhao H, Zhou Y. The role of semidisorder in temperature adaptation of bacterial FlgM proteins. Biophys J 2014; 105:2598-605. [PMID: 24314090 DOI: 10.1016/j.bpj.2013.10.026] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 09/02/2013] [Revised: 10/18/2013] [Accepted: 10/18/2013] [Indexed: 01/08/2023] Open
Abstract
Probabilities of disorder for FlgM proteins of 39 species whose optimal growth temperature ranges from 273 K (0°C) to 368 K (95°C) were predicted by a newly developed method called Sequence-based Prediction with Integrated NEural networks for Disorder (SPINE-D). We showed that the temperature-dependent behavior of FlgM proteins could be separated into two subgroups according to their sequence lengths. Only shorter sequences evolved to adapt to high temperatures (>318 K or 45°C). Their ability to adapt to high temperatures was achieved through a transition from a fully disordered state with little secondary structure to a semidisordered state with high predicted helical probability at the N-terminal region. The predicted results are consistent with available experimental data. An analysis of all orthologous protein families in 39 species suggests that such a transition from a fully disordered state to semidisordered and/or ordered states is one of the strategies employed by nature for adaptation to high temperatures.
Affiliation(s)
- Jihua Wang
- Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics, Dezhou University, Dezhou, Shandong Province, China; School of Physics and Electronic Information, Dezhou University, Dezhou, Shandong Province, China
|
48
|
Abstract
Intrinsically disordered proteins (IDPs) and IDP regions fail to form a stable structure, yet they exhibit biological activities. Their mobile flexibility and structural instability are encoded by their amino acid sequences. They recognize proteins, nucleic acids, and other types of partners; they accelerate interactions and chemical reactions between bound partners; and they help accommodate posttranslational modifications, alternative splicing, protein fusions, and insertions or deletions. Overall, IDP-associated biological activities complement those of structured proteins. Recently, there has been an explosion of studies on IDP regions and their functions, yet the discovery and investigation of these proteins have a long, mostly ignored history. Along with recent discoveries, we present several early examples and the mechanisms by which IDPs contribute to function, which we hope will encourage comprehensive discussion of IDPs and IDP regions in biochemistry textbooks. Finally, we propose future directions for IDP research.
Affiliation(s)
- Christopher J Oldfield
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202
|
49
|
Relini A, Marano N, Gliozzi A. Misfolding of amyloidogenic proteins and their interactions with membranes. Biomolecules 2013; 4:20-55. [PMID: 24970204 PMCID: PMC4030986 DOI: 10.3390/biom4010020] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 11/04/2013] [Revised: 12/13/2013] [Accepted: 12/17/2013] [Indexed: 01/07/2023] Open
Abstract
In this paper, we discuss amyloidogenic proteins, their misfolding, resulting structures, and interactions with membranes, which lead to membrane damage and subsequent cell death. Many of these proteins are implicated in serious illnesses such as Alzheimer’s disease and Parkinson’s disease. Misfolding of amyloidogenic proteins leads to the formation of polymorphic oligomers and fibrils. Oligomeric aggregates are widely thought to be the toxic species; however, fibrils also play a role in membrane damage. We focus on the structure of these aggregates and their interactions with model membranes. Study of interactions of amyloidogenic proteins with model and natural membranes has shown the importance of the lipid bilayer in protein misfolding and aggregation and has led to the development of several models for membrane permeabilization by the resulting amyloid aggregates. We discuss several of these models: formation of structured pores by misfolded amyloidogenic proteins, extraction of lipids, interactions with receptors in biological membranes, and membrane destabilization by amyloid aggregates, perhaps analogous to that caused by antimicrobial peptides.
Affiliation(s)
- Annalisa Relini
- Department of Physics, University of Genoa, Genoa 16146, Italy.
- Nadia Marano
- Department of Physics, University of Genoa, Genoa 16146, Italy.
|
50
|
Zhao H, Yang Y, Lin H, Zhang X, Mort M, Cooper DN, Liu Y, Zhou Y. DDIG-in: discriminating between disease-associated and neutral non-frameshifting micro-indels. Genome Biol 2013; 14:R23. [PMID: 23497682 PMCID: PMC4053752 DOI: 10.1186/gb-2013-14-3-r23] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 07/11/2012] [Accepted: 03/13/2013] [Indexed: 02/07/2023] Open
Abstract
Micro-indels (insertions or deletions shorter than 21 bp) constitute the second most frequent class of human gene mutation after single nucleotide variants. Despite the relative abundance of non-frameshifting indels, their damaging effect on protein structure and function has gone largely unstudied. We have developed a support vector machine-based method named DDIG-in (Detecting disease-causing genetic variations due to indels) to prioritize non-frameshifting indels by comparing disease-associated mutations with putatively neutral mutations from the 1000 Genomes Project. The final model gives good discrimination for indels and is robust against annotation errors. A webserver implementing DDIG-in is available at http://sparks-lab.org/ddig.
|