1
Guo L, Yu Q, Wang D, Wu X, Wolynes PG, Chen M. Generating the polymorph landscapes of amyloid fibrils using AI: RibbonFold. Proc Natl Acad Sci U S A 2025; 122:e2501321122. [PMID: 40232799 PMCID: PMC12037047 DOI: 10.1073/pnas.2501321122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2025] [Accepted: 03/07/2025] [Indexed: 04/16/2025]
Abstract
The concept that proteins are selected to fold into a well-defined native state has been effectively addressed within the framework of energy landscapes, underpinning the recent successes of structure prediction tools like AlphaFold. The amyloid fold, however, does not represent a unique minimum for a given single sequence. While the cross-β hydrogen-bonding pattern is common to all amyloids, other aspects of amyloid fiber structures are sensitive not only to the sequence of the aggregating peptides but also to the experimental conditions. This polymorphic nature of amyloid structures challenges structure predictions. In this paper, we use AI to explore the landscape of possible amyloid protofilament structures composed of a single stack of peptides aligned in a parallel, in-register manner. This perspective enables a practical method for predicting protofilament structures of arbitrary sequences: RibbonFold. RibbonFold is adapted from AlphaFold2, incorporating parallel in-register constraints within AlphaFold2's template module, along with an appropriate polymorphism loss function to address the structural diversity of folds. RibbonFold outperforms AlphaFold2/3 on independent test sets, achieving a mean TM-score of 0.5. RibbonFold proves well-suited to study the polymorphic landscapes of widely studied sequences with documented polymorphisms. The resulting landscapes capture these observed polymorphisms effectively. We show that while well-known amyloid-forming sequences exhibit a limited number of plausible polymorphs on their "solubility" landscape, randomly shuffled sequences with the same composition appear to be negatively selected in terms of their relative solubility. RibbonFold is a valuable framework for structurally characterizing amyloid polymorphism landscapes.
Affiliation(s)
- Qilin Yu
- Changping Laboratory, Beijing 102206, China
- Di Wang
- Changping Laboratory, Beijing 102206, China
- Xiaoyu Wu
- Changping Laboratory, Beijing 102206, China
- Peter G. Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Department of Chemistry, Rice University, Houston, TX 77005
- Department of Physics and Astronomy, Rice University, Houston, TX 77005
- Department of Biosciences, Rice University, Houston, TX 77005
2
Wang A, Lin X, Chau KN, Onuchic JN, Levine H, George JT. RACER-m leverages structural features for sparse T cell specificity prediction. Sci Adv 2024; 10:eadl0161. [PMID: 38748791 PMCID: PMC11095454 DOI: 10.1126/sciadv.adl0161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 04/10/2024] [Indexed: 05/19/2024]
Abstract
Reliable prediction of T cell specificity against antigenic signatures is a formidable task, complicated by the immense diversity of T cell receptor and antigen sequence space and the resulting limited availability of training sets for inferential models. Recent modeling efforts have demonstrated the advantage of incorporating structural information to overcome the need for extensive training sequence data, yet disentangling the heterogeneous TCR-antigen interface to accurately predict MHC-allele-restricted TCR-peptide interactions has remained challenging. Here, we present RACER-m, a coarse-grained structural model leveraging key biophysical information from the diversity of publicly available TCR-antigen crystal structures. Explicit inclusion of structural content substantially reduces the required number of training examples and maintains reliable predictions of TCR-recognition specificity and sensitivity across diverse biological contexts. Our model capably identifies biophysically meaningful point-mutant peptides that affect binding affinity, distinguishing its ability in predicting TCR specificity of point-mutants from alternative sequence-based methods. Its application is broadly applicable to studies involving both closely related and structurally diverse TCR-peptide pairs.
Affiliation(s)
- Ailun Wang
- Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Xingcheng Lin
- Department of Physics, North Carolina State University, Raleigh, NC, USA
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Kevin Ng Chau
- Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- José N. Onuchic
- Departments of Physics and Astronomy, Chemistry, and Biosciences, Rice University, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Herbert Levine
- Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Department of Bioengineering, Northeastern University, Boston, MA, USA
- Jason T. George
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Biomedical Engineering, Texas A&M University, Houston, TX, USA
3
Evans R, Ramisetty S, Kulkarni P, Weninger K. Illuminating Intrinsically Disordered Proteins with Integrative Structural Biology. Biomolecules 2023; 13:124. [PMID: 36671509 PMCID: PMC9856150 DOI: 10.3390/biom13010124] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/01/2022] [Revised: 01/01/2023] [Accepted: 01/04/2023] [Indexed: 01/11/2023]
Abstract
Intense study of intrinsically disordered proteins (IDPs) did not begin in earnest until the late 1990s when a few groups, working independently, convinced the community that these 'weird' proteins could have important functions. Over the past two decades, it has become clear that IDPs play critical roles in a multitude of biological phenomena with prominent examples including coordination in signaling hubs, enabling gene regulation, and regulating ion channels, just to name a few. One contributing factor that delayed appreciation of IDP functional significance is the experimental difficulty in characterizing their dynamic conformations. The combined application of multiple methods, termed integrative structural biology, has emerged as an essential approach to understanding IDP phenomena. Here, we review some of the recent applications of the integrative structural biology philosophy to study IDPs.
Affiliation(s)
- Rachel Evans
- Department of Physics, North Carolina State University, Raleigh, NC 27695, USA
- Sravani Ramisetty
- Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA 91010, USA
- Prakash Kulkarni
- Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA 91010, USA
- Department of Systems Biology, City of Hope National Medical Center, Duarte, CA 91010, USA
- Keith Weninger
- Department of Physics, North Carolina State University, Raleigh, NC 27695, USA
4
BAP1 forms a trimer with HMGB1 and HDAC1 that modulates gene × environment interaction with asbestos. Proc Natl Acad Sci U S A 2021; 118:2111946118. [PMID: 34815344 DOI: 10.1073/pnas.2111946118] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Accepted: 10/15/2021] [Indexed: 12/25/2022]
Abstract
Carriers of heterozygous germline BAP1 mutations (BAP1 +/-) are affected by the "BAP1 cancer syndrome." Although they can develop almost any cancer type, they are unusually susceptible to asbestos carcinogenesis and mesothelioma. Here we investigate why among all carcinogens, BAP1 mutations cooperate with asbestos. Asbestos carcinogenesis and mesothelioma have been linked to a chronic inflammatory process promoted by the extracellular release of the high-mobility group box 1 protein (HMGB1). We report that BAP1 +/- cells secrete increased amounts of HMGB1, and that BAP1 +/- carriers have detectable serum levels of acetylated HMGB1 that further increase when they develop mesothelioma. We linked these findings to our discovery that BAP1 forms a trimeric protein complex with HMGB1 and with histone deacetylase 1 (HDAC1) that modulates HMGB1 acetylation and its release. Reduced BAP1 levels caused increased ubiquitylation and degradation of HDAC1, leading to increased acetylation of HMGB1 and its active secretion that in turn promoted mesothelial cell transformation.
5
Liwo A, Czaplewski C, Sieradzan AK, Lipska AG, Samsonov SA, Murarka RK. Theory and Practice of Coarse-Grained Molecular Dynamics of Biologically Important Systems. Biomolecules 2021; 11:1347. [PMID: 34572559 PMCID: PMC8465211 DOI: 10.3390/biom11091347] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/13/2021] [Revised: 09/03/2021] [Accepted: 09/09/2021] [Indexed: 12/16/2022]
Abstract
Molecular dynamics with coarse-grained models is nowadays extensively used to simulate biomolecular systems at large time and size scales, compared to those accessible to all-atom molecular dynamics. In this review article, we describe the physical basis of coarse-grained molecular dynamics, the coarse-grained force fields, the equations of motion and the respective numerical integration algorithms, and selected practical applications of coarse-grained molecular dynamics. We demonstrate that the motion of coarse-grained sites is governed by the potential of mean force and the friction and stochastic forces, resulting from integrating out the secondary degrees of freedom. Consequently, Langevin dynamics is a natural means of describing the motion of a system at the coarse-grained level and the potential of mean force is the physical basis of the coarse-grained force fields. Moreover, the choice of coarse-grained variables and the fact that coarse-grained sites often do not have spherical symmetry implies a non-diagonal inertia tensor. We describe selected coarse-grained models used in molecular dynamics simulations, including the most popular MARTINI model developed by Marrink's group and the UNICORN model of biological macromolecules developed in our laboratory. We conclude by discussing examples of the application of coarse-grained molecular dynamics to study biologically important processes.
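For orientation, the Langevin description named in this abstract can be written in its simplest textbook form. This is a sketch under simplifying assumptions, not the formulation used in the review: it takes a scalar mass m_i and a scalar friction coefficient γ_i for each coarse-grained site (the abstract itself notes that the inertia tensor is in general non-diagonal), and the symbols W (potential of mean force), γ_i, and f_i (stochastic force) are notation chosen here for illustration.

```latex
m_i \ddot{\mathbf{r}}_i
  = -\nabla_{\mathbf{r}_i} W(\mathbf{r}_1,\dots,\mathbf{r}_N)
    - \gamma_i \dot{\mathbf{r}}_i
    + \mathbf{f}_i(t),
\qquad
\langle \mathbf{f}_i(t) \rangle = 0,
\quad
\langle f_{i\alpha}(t)\, f_{j\beta}(t') \rangle
  = 2 \gamma_i k_B T \,\delta_{ij}\,\delta_{\alpha\beta}\,\delta(t - t')
```

The three terms on the right correspond, in order, to the mean force, the friction force, and the stochastic force listed in the abstract; the second relation is the standard fluctuation-dissipation condition that ties the noise strength to the friction and the temperature T.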
Affiliation(s)
- Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Cezary Czaplewski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Adam K. Sieradzan
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Agnieszka G. Lipska
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Sergey A. Samsonov
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Rajesh K. Murarka
- Department of Chemistry, Indian Institute of Science Education and Research Bhopal, Bhopal Bypass Road, Bhopal 462066, MP, India
6
Giulini M, Rigoli M, Mattiotti G, Menichetti R, Tarenzi T, Fiorentini R, Potestio R. From System Modeling to System Analysis: The Impact of Resolution Level and Resolution Distribution in the Computer-Aided Investigation of Biomolecules. Front Mol Biosci 2021; 8:676976. [PMID: 34164432 PMCID: PMC8215203 DOI: 10.3389/fmolb.2021.676976] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/06/2021] [Accepted: 05/06/2021] [Indexed: 12/18/2022]
Abstract
The ever increasing computer power, together with the improved accuracy of atomistic force fields, enables researchers to investigate biological systems at the molecular level with remarkable detail. However, the relevant length and time scales of many processes of interest are still hardly within reach even for state-of-the-art hardware, thus leaving important questions often unanswered. The computer-aided investigation of many biological physics problems thus largely benefits from the usage of coarse-grained models, that is, simplified representations of a molecule at a level of resolution that is lower than atomistic. A plethora of coarse-grained models have been developed, which differ most notably in their granularity; this latter aspect determines one of the crucial open issues in the field, i.e. the identification of an optimal degree of coarsening, which enables the greatest simplification at the expenses of the smallest information loss. In this review, we present the problem of coarse-grained modeling in biophysics from the viewpoint of system representation and information content. In particular, we discuss two distinct yet complementary aspects of protein modeling: on the one hand, the relationship between the resolution of a model and its capacity of accurately reproducing the properties of interest; on the other hand, the possibility of employing a lower resolution description of a detailed model to extract simple, useful, and intelligible information from the latter.
Affiliation(s)
- Marco Giulini
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Marta Rigoli
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Giovanni Mattiotti
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Roberto Menichetti
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Thomas Tarenzi
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Raffaele Fiorentini
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Raffaello Potestio
- Physics Department, University of Trento, Trento, Italy
- INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
7
Oliveira Junior AB, Lin X, Kulkarni P, Onuchic JN, Roy S, Leite VBP. Exploring Energy Landscapes of Intrinsically Disordered Proteins: Insights into Functional Mechanisms. J Chem Theory Comput 2021; 17:3178-3187. [PMID: 33871257 DOI: 10.1021/acs.jctc.1c00027] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/30/2022]
Abstract
Intrinsically disordered proteins (IDPs) lack a rigid three-dimensional structure and populate a polymorphic ensemble of conformations. Because of the lack of a reference conformation, their energy landscape representation in terms of reaction coordinates presents a daunting challenge. Here, our newly developed energy landscape visualization method (ELViM), a reaction coordinate-free approach, shows its prime application to explore frustrated energy landscapes of an intrinsically disordered protein, prostate-associated gene 4 (PAGE4). PAGE4 is a transcriptional coactivator that potentiates the oncogene c-Jun. Two kinases, namely, HIPK1 and CLK2, phosphorylate PAGE4, generating variants phosphorylated at different serine/threonine residues (HIPK1-PAGE4 and CLK2-PAGE4, respectively) with opposing functions. While HIPK1-PAGE4 predominantly phosphorylates Thr51 and potentiates c-Jun, CLK2-PAGE4 hyperphosphorylates PAGE4 and attenuates transactivation. To understand the underlying mechanisms of conformational diversity among different phosphoforms, we have analyzed their atomistic trajectories simulated using AWSEM forcefield, and the energy landscapes were elucidated using ELViM. This method allows us to identify and compare the population distributions of different conformational ensembles of PAGE4 phosphoforms using the same effective phase space. The results reveal a predominant conformational ensemble with an extended C-terminal segment of WT PAGE4, which exposes a functional residue Thr51, implying its potential of undertaking a fly-casting mechanism while binding to its cognate partner. In contrast, for HIPK1-PAGE4, a compact conformational ensemble enhances its population sequestering phosphorylated-Thr51. This clearly explains the experimentally observed weaker affinity of HIPK1-PAGE4 for c-Jun. 
ELViM appears as a powerful tool, especially to analyze the highly frustrated energy landscape representation of IDPs where appropriate reaction coordinates are hard to apprehend.
Affiliation(s)
- Antonio B Oliveira Junior
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005-1892, United States
- Xingcheng Lin
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Prakash Kulkarni
- Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, California 91010, United States
- José N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005-1892, United States
- Susmita Roy
- Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, West Bengal 741246, India
- Vitor B P Leite
- Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas, Universidade Estadual Paulista (UNESP), São José do Rio Preto, São Paulo 15054-000, Brazil
8
Jin S, Miller MD, Chen M, Schafer NP, Lin X, Chen X, Phillips GN, Wolynes PG. Molecular-replacement phasing using predicted protein structures from AWSEM-Suite. IUCrJ 2020; 7:1168-1178. [PMID: 33209327 PMCID: PMC7642774 DOI: 10.1107/s2052252520013494] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/27/2020] [Accepted: 10/07/2020] [Indexed: 06/11/2023]
Abstract
The phase problem in X-ray crystallography arises from the fact that only the intensities, and not the phases, of the diffracting electromagnetic waves are measured directly. Molecular replacement can often estimate the relative phases of reflections starting with those derived from a template structure, which is usually a previously solved structure of a similar protein. The key factor in the success of molecular replacement is finding a good template structure. When no good solved template exists, predicted structures based partially on templates can sometimes be used to generate models for molecular replacement, thereby extending the lower bound of structural and sequence similarity required for successful structure determination. Here, the effectiveness is examined of structures predicted by a state-of-the-art prediction algorithm, the Associative memory, Water-mediated, Structure and Energy Model Suite (AWSEM-Suite), which has been shown to perform well in predicting protein structures in CASP13 when there is no significant sequence similarity to a solved protein or only very low sequence similarity to known templates. The performance of AWSEM-Suite structures in molecular replacement is discussed and the results show that AWSEM-Suite performs well in providing useful phase information, often performing better than I-TASSER-MR and the previous algorithm AWSEM-Template.
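The phase problem described in this abstract can be stated compactly with the standard crystallographic relations. The block below is a generic textbook sketch, not notation taken from the paper: F(hkl) is the structure factor of reflection hkl, φ(hkl) its phase, I(hkl) the measured intensity, ρ the electron density, and V the unit-cell volume.

```latex
F(hkl) = |F(hkl)|\, e^{i\varphi(hkl)},
\qquad
I(hkl) \propto |F(hkl)|^{2},
\qquad
\rho(x,y,z) = \frac{1}{V} \sum_{hkl} |F(hkl)|\, e^{i\varphi(hkl)}\, e^{-2\pi i (hx + ky + lz)}
```

Because the experiment yields only I(hkl), the amplitudes |F(hkl)| are known but the phases φ(hkl) are not. In molecular replacement, trial phases are computed from a placed template or predicted model and combined with the measured amplitudes to obtain a first density map.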
Affiliation(s)
- Shikai Jin
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Biosciences, Rice University, Houston, Texas, USA
- Mingchen Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Nicholas P. Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
- Xingcheng Lin
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Xun Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
- George N. Phillips
- Department of Biosciences, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
- Peter G. Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Biosciences, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
- Department of Physics, Rice University, Houston, Texas, USA
9
Structural and Dynamical Order of a Disordered Protein: Molecular Insights into Conformational Switching of PAGE4 at the Systems Level. Biomolecules 2019; 9:biom9020077. [PMID: 30813315 PMCID: PMC6406393 DOI: 10.3390/biom9020077] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/07/2019] [Revised: 02/10/2019] [Accepted: 02/10/2019] [Indexed: 01/10/2023]
Abstract
Folded proteins show a high degree of structural order and undergo (fairly constrained) collective motions related to their functions. On the other hand, intrinsically disordered proteins (IDPs), while lacking a well-defined three-dimensional structure, do exhibit some structural and dynamical ordering, but are less constrained in their motions than folded proteins. The larger structural plasticity of IDPs emphasizes the importance of entropically driven motions. Many IDPs undergo function-related disorder-to-order transitions driven by their interaction with specific binding partners. As experimental techniques become more sensitive and become better integrated with computational simulations, we are beginning to see how the modest structural ordering and large amplitude collective motions of IDPs endow them with an ability to mediate multiple interactions with different partners in the cell. To illustrate these points, here, we use Prostate-associated gene 4 (PAGE4), an IDP implicated in prostate cancer (PCa) as an example. We first review our previous efforts using molecular dynamics simulations based on atomistic AWSEM to study the conformational dynamics of PAGE4 and how its motions change in its different physiologically relevant phosphorylated forms. Our simulations quantitatively reproduced experimental observations and revealed how structural and dynamical ordering are encoded in the sequence of PAGE4 and can be modulated by different extents of phosphorylation by the kinases HIPK1 and CLK2. This ordering is reflected in changing populations of certain secondary structural elements as well as in the regularity of its collective motions. These ordered features are directly correlated with the functional interactions of WT-PAGE4, HIPK1-PAGE4 and CLK2-PAGE4 with the AP-1 signaling axis. 
These interactions give rise to repeated transitions between (high HIPK1-PAGE4, low CLK2-PAGE4) and (low HIPK1-PAGE4, high CLK2-PAGE4) cell phenotypes, which possess differing sensitivities to the standard PCa therapies, such as androgen deprivation therapy (ADT). We argue that, although the structural plasticity of an IDP is important in promoting promiscuous interactions, the modulation of the structural ordering is important for sculpting its interactions so as to rewire with agility biomolecular interaction networks with significant functional consequences.
10
Wu H, Wolynes PG, Papoian GA. AWSEM-IDP: A Coarse-Grained Force Field for Intrinsically Disordered Proteins. J Phys Chem B 2018; 122:11115-11125. [PMID: 30091924 PMCID: PMC6713210 DOI: 10.1021/acs.jpcb.8b05791] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Indexed: 11/28/2022]
Abstract
The associative memory, water-mediated, structure and energy model (AWSEM) has been successfully used to study protein folding, binding, and aggregation problems. In this work, we introduce AWSEM-IDP, a new AWSEM branch for simulating intrinsically disordered proteins (IDPs), where the weights of the potentials determining secondary structure formation have been finely tuned, and a novel potential is introduced that helps to precisely control both the average extent of protein chain collapse and the chain's fluctuations in size. AWSEM-IDP can efficiently sample large conformational spaces, while retaining sufficient molecular accuracy to realistically model proteins. We applied this new model to two IDPs, demonstrating that AWSEM-IDP can reasonably well reproduce higher-resolution reference data, thus providing the foundation for a transferable IDP force field. Finally, we used thermodynamic perturbation theory to show that, in general, the conformational ensembles of IDPs are highly sensitive to fine-tuning of force field parameters.
Affiliation(s)
- Hao Wu
- Biophysics Program, Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Peter G. Wolynes
- Departments of Chemistry and Physics and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Garegin A. Papoian
- Biophysics Program, Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, United States
11
Neelamraju S, Gosavi S, Wales DJ. Energy Landscape of the Designed Protein Top7. J Phys Chem B 2018; 122:12282-12291. [DOI: 10.1021/acs.jpcb.8b08499] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 01/04/2023]
Affiliation(s)
- Sridhar Neelamraju
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, Karnataka 560065, India
- University Chemical Laboratories, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Shachi Gosavi
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, Karnataka 560065, India
- David J. Wales
- University Chemical Laboratories, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
12
Chen M, Lin X, Lu W, Schafer NP, Onuchic JN, Wolynes PG. Template-Guided Protein Structure Prediction and Refinement Using Optimized Folding Landscape Force Fields. J Chem Theory Comput 2018; 14:6102-6116. [PMID: 30240202 DOI: 10.1021/acs.jctc.8b00683] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 02/01/2023]
Abstract
When good structural templates can be identified, template-based modeling is the most reliable way to predict the tertiary structure of proteins. In this study, we combine template-based modeling with a realistic coarse-grained force field, AWSEM, that has been optimized using the principles of energy landscape theory. The Associative memory, Water mediated, Structure and Energy Model (AWSEM) is a coarse-grained force field having both transferable tertiary interactions and knowledge-based local-in-sequence interaction terms. We incorporate template information into AWSEM by introducing soft collective biases to the template structures, resulting in a model that we call AWSEM-Template. Structure prediction tests on eight targets, four of which are in the low sequence identity "twilight zone" of homology modeling, show that AWSEM-Template can achieve high-resolution structure prediction. Our results also confirm that using a combination of AWSEM and a template-guided potential leads to more accurate prediction of protein structures than simply using a template-guided potential alone. Free energy profile analyses demonstrate that the soft collective biases to the template effectively increase funneling toward native-like structures while still allowing significant flexibility so as to allow for correction of discrepancies between the target structure and the template. A further stage of refinement using all-atom molecular dynamics augmented with soft collective biases to the structures predicted by AWSEM-Template leads to a further improvement of both backbone and side-chain accuracy by maintaining sufficient flexibility but at the same time discouraging unproductive unfolding events often seen in unrestrained all-atom refinement simulations. The all-atom refinement simulations also reduce patches of frustration of the initial predictions. 
Some of the backbones found among the structures produced during the initial coarse-grained prediction step already have CE-RMSD values of less than 3 Å with 90% or more of the residues aligned to the experimentally solved structure for all targets. All-atom structures generated during the following all-atom refinement simulations, which started from coarse-grained structures that were chosen without reference to any knowledge about the native structure, have CE-RMSD values of less than 2.5 Å with 90% or more of the residues aligned for 6 out of 8 targets. Clustering low energy structures generated during the initial coarse-grained annealing picks out reliably structures that are within 1 Å of the best sampled structures in 5 out of 8 cases. After the all-atom refinement, structures that are within 1 Å of the best sampled structures can be selected using a simple algorithm based on energetic features alone in 7 out of 8 cases.
Affiliation(s)
- Mingchen Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77030, United States
- Department of Bioengineering, Rice University, Houston, Texas 77005, United States
- Xingcheng Lin
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77030, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Wei Lu
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77030, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Nicholas P Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77030, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- José N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77030, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Biosciences, Rice University, Houston, Texas 77005, United States
- Peter G Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77030, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Biosciences, Rice University, Houston, Texas 77005, United States
|
13
|
Chen M, Schafer NP, Wolynes PG. Surveying the Energy Landscapes of Aβ Fibril Polymorphism. J Phys Chem B 2018; 122:11414-11430. [PMID: 30215519 DOI: 10.1021/acs.jpcb.8b07364] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023]
Abstract
Many unrelated proteins and peptides have been found spontaneously to form amyloid fibers above a critical concentration. Even for a single sequence, however, the amyloid fold is not a single well-defined structure. Although the cross-β hydrogen bonding pattern is common to all amyloids, all other aspects of amyloid fiber structures are sensitive to both the sequence of the aggregating peptides and the solvent conditions under which the aggregation occurs. Amyloid fibers are easy to identify and grossly characterize using microscopy, but their insolubility and aperiodicity along the dimensions transverse to the fiber axis have complicated detailed experimental structural characterization. In this paper, we explore the landscape of possibilities for amyloid protofilament structures that are made up of a single stack of peptides associated in a parallel in-register manner. We view this landscape as a two-dimensional version of the usual three-dimensional protein folding problem: the survey of the two-dimensional folds of protein ribbons. Adopting this view leads to a practical method of predicting stable protofilament structures of arbitrary sequences. We apply this scheme to variants of Aβ, the amyloid forming peptide that is characteristically associated with Alzheimer's disease. Consistent with what is known from experiment, we find that Aβ protofibrils are polymorphic. To our surprise, however, the ribbon-folding landscape of Aβ turned out to be strikingly simple. We confirm that, at the level of the monomeric protofilament, the landscape for the Aβ sequence is reasonably well funneled toward structures that are similar to those that have been determined by experiment. The landscape has more distinct minima than does a typical globular protein landscape but fewer and deeper minima than the landscape of a randomly shuffled sequence having the same overall composition. 
It is tempting to consider the possibility that the significant degree of funneling of Aβ's ribbon-folding landscape has arisen as a result of natural selection. More likely, however, the intermediate complexity of Aβ's ribbon-folding landscape has come from the post facto selection of the Aβ sequence as an object of study by researchers because only by having a landscape with some degree of funneling can ordered aggregation of such a peptide occur at in vivo concentrations. In addition to predicting polymorph structures, we show that predicted solubilities of polymorphs correlate with experiment and with their elongation free energies computed by coarse-grained molecular dynamics.
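The shuffled-sequence control described in this abstract can be illustrated with a minimal sketch. It assumes the control is a uniform random permutation of the residues, which preserves the overall composition exactly; the abstract does not say how the shuffles were generated, and the function name `shuffle_sequence` and the seeds are hypothetical. The Aβ42 sequence below is the commonly quoted 42-residue form and serves only as example input.

```python
import random
from collections import Counter

# Commonly quoted 42-residue Abeta sequence, used here only as example input.
ABETA42 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"


def shuffle_sequence(sequence: str, seed: int) -> str:
    """Return a random permutation of `sequence` with identical composition.

    Hypothetical helper: a seeded shuffle makes each decoy reproducible.
    """
    residues = list(sequence)
    random.Random(seed).shuffle(residues)
    return "".join(residues)


# Build a small set of decoys (the number of decoys is an arbitrary choice).
decoys = [shuffle_sequence(ABETA42, seed) for seed in range(5)]

# Every decoy has the same length and amino acid counts as the original.
for decoy in decoys:
    assert len(decoy) == len(ABETA42)
    assert Counter(decoy) == Counter(ABETA42)
```

Each decoy could then be passed through the same landscape survey as the original sequence, so that any difference in the number or depth of minima reflects residue order rather than composition.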
Affiliation(s)
- Mingchen Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States; Department of Bioengineering, Rice University, Houston, Texas 77005, United States
- Nicholas P Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States; Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Peter G Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States; Department of Chemistry, Rice University, Houston, Texas 77005, United States
|
14
|
Lin X, Roy S, Jolly MK, Bocci F, Schafer NP, Tsai MY, Chen Y, He Y, Grishaev A, Weninger K, Orban J, Kulkarni P, Rangarajan G, Levine H, Onuchic JN. PAGE4 and Conformational Switching: Insights from Molecular Dynamics Simulations and Implications for Prostate Cancer. J Mol Biol 2018; 430:2422-2438. [PMID: 29758263 DOI: 10.1016/j.jmb.2018.05.011] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 02/16/2018] [Revised: 04/13/2018] [Accepted: 05/07/2018] [Indexed: 11/15/2022]
Abstract
Prostate-associated gene 4 (PAGE4) is an intrinsically disordered protein implicated in prostate cancer. The stress-response kinase homeodomain-interacting protein kinase 1 (HIPK1) phosphorylates two residues in PAGE4, serine 9 and threonine 51. Phosphorylation of these two residues facilitates the interaction of PAGE4 with activator protein-1 (AP-1) transcription factor complex to potentiate AP-1's activity. In contrast, hyperphosphorylation of PAGE4 by CDC-like kinase 2 (CLK2) attenuates this interaction with AP-1. Small-angle X-ray scattering and single-molecule fluorescence resonance energy transfer measurements have shown that PAGE4 expands upon hyperphosphorylation and that this expansion is localized to its N-terminal half. To understand the interactions underlying this structural transition, we performed molecular dynamics simulations using Atomistic AWSEM, a multi-scale molecular model that combines atomistic and coarse-grained simulation approaches. Our simulations show that electrostatic interactions drive transient formation of an N-terminal loop, the destabilization of which accounts for the dramatic change in size upon hyperphosphorylation. Phosphorylation also changes the preference of secondary structure formation of the PAGE4 ensemble, which leads to a transition between states that display different degrees of disorder. Finally, we construct a mechanism-based mathematical model that allows us to capture the interactions of different phosphoforms of PAGE4 with AP-1 and its downstream target, the androgen receptor (AR), a key therapeutic target in prostate cancer. Our model predicts intracellular oscillatory dynamics of HIPK1-PAGE4, CLK2-PAGE4, and AR activity, indicating phenotypic heterogeneity in an isogenic cell population. Thus, conformational switching of PAGE4 may potentially affect the efficiency of therapeutically targeting AR activity.
Affiliation(s)
- Xingcheng Lin
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States; Department of Physics and Astronomy, Rice University, Houston, TX 77005, United States
- Susmita Roy
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States
- Mohit Kumar Jolly
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States
- Federico Bocci
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States; Department of Chemistry, Rice University, Houston, TX 77005, United States
- Nicholas P Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States; Department of Chemistry, Rice University, Houston, TX 77005, United States
- Min-Yeh Tsai
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States; Department of Chemistry, Rice University, Houston, TX 77005, United States
- Yihong Chen
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, United States
- Yanan He
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, United States
- Alexander Grishaev
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, United States; National Institute of Standards and Technology, Gaithersburg, MD 20899, United States
- Keith Weninger
- Department of Physics, North Carolina State University, Raleigh, NC 27695, United States
- John Orban
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, United States; Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, United States
- Prakash Kulkarni
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, United States; Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA 91010, United States
- Govindan Rangarajan
- Department of Mathematics, Indian Institute of Science, Bangalore 560012, India; Center for Neuroscience, Indian Institute of Science, Bangalore 560012, India
- Herbert Levine
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States; Department of Physics and Astronomy, Rice University, Houston, TX 77005, United States
- José N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, United States; Department of Physics and Astronomy, Rice University, Houston, TX 77005, United States; Department of Chemistry, Rice University, Houston, TX 77005, United States; Department of BioSciences, Rice University, Houston, TX 77005, United States
|
15
|
Chen M, Schafer NP, Zheng W, Wolynes PG. The Associative Memory, Water Mediated, Structure and Energy Model (AWSEM)-Amylometer: Predicting Amyloid Propensity and Fibril Topology Using an Optimized Folding Landscape Model. ACS Chem Neurosci 2018; 9:1027-1039. [PMID: 29241326 DOI: 10.1021/acschemneuro.7b00436] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/31/2022] Open
Abstract
Amyloids are fibrillar protein aggregates with simple repeated structural motifs in their cores, usually β-strands but sometimes α-helices. Identifying the amyloid-prone regions within protein sequences is important both for understanding the mechanisms of amyloid-associated diseases and for understanding functional amyloids. Based on the crystal structures of seven cross-β amyloidogenic peptides with different topologies and one recently solved cross-α fiber structure, we have developed a computational approach for identifying amyloidogenic segments in protein sequences using the Associative memory, Water mediated, Structure and Energy Model (AWSEM). The AWSEM-Amylometer performs favorably in comparison with other predictors in predicting aggregation-prone sequences in multiple data sets. The method also predicts well the specific topologies (the relative arrangement of β-strands in the core) of the amyloid fibrils. An important advantage of the AWSEM-Amylometer over other existing methods is its direct connection with an efficient, optimized protein folding simulation model, AWSEM. This connection allows one to combine efficient and accurate search of protein sequences for amyloidogenic segments with the detailed study of the thermodynamic and kinetic roles that these segments play in folding and aggregation in the context of the entire protein sequence. We present new simulation results that highlight the free energy landscapes of peptides that can take on multiple fibril topologies. We also demonstrate how the Amylometer methodology can be straightforwardly extended to the study of functional amyloids that have the recently discovered cross-α fibril architecture.
|
16
|
Dawid AE, Gront D, Kolinski A. Coarse-Grained Modeling of the Interplay between Secondary Structure Propensities and Protein Fold Assembly. J Chem Theory Comput 2018; 14:2277-2287. [PMID: 29486120 DOI: 10.1021/acs.jctc.7b01242] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
We recently developed a new coarse-grained model of protein structure and dynamics [Dawid et al. J. Chem. Theory Comput. 2017, 13(11), 5766-5779]. The model assumed a single bead representation of amino acid residues, where positions of such united residues were defined by centers of mass of four amino acid fragments. Replica exchange Monte Carlo sampling of the model chains provided good pictures of modeled structures and their dynamics. In its generic form the statistical knowledge-based force field of the model has been dedicated for single-domain globular proteins. Sequence-specific interactions are defined by three-letter secondary structure data. In the present work we demonstrate that different assignments and/or predictions of secondary structures are usually sufficient for enforcing cooperative formation of native-like folds of SURPASS chains for the majority of single-domain globular proteins. Simulations of native-like structure assembly for a representative set of globular proteins have shown that the accuracy of secondary structure data is usually not crucial for model performance, although some specific errors can strongly distort the obtained three-dimensional structures.
Affiliation(s)
- Aleksandra E Dawid
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Andrzej Kolinski
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
|
17
|
Sirovetz BJ, Schafer NP, Wolynes PG. Protein structure prediction: making AWSEM AWSEM-ER by adding evolutionary restraints. Proteins 2017; 85:2127-2142. [PMID: 28799172 DOI: 10.1002/prot.25367] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 05/29/2017] [Revised: 07/29/2017] [Accepted: 08/08/2017] [Indexed: 11/07/2022]
Abstract
Protein sequences have evolved to fold into functional structures, resulting in families of diverse protein sequences that all share the same overall fold. One can harness protein family sequence data to infer likely contacts between pairs of residues. In the current study, we combine this kind of inference from coevolutionary information with a coarse-grained protein force field ordinarily used with single sequence input, the Associative memory, Water mediated, Structure and Energy Model (AWSEM), to achieve improved structure prediction. The resulting Associative memory, Water mediated, Structure and Energy Model with Evolutionary Restraints (AWSEM-ER) yields a significant improvement in the quality of protein structure prediction over the single sequence prediction from AWSEM when a sufficiently large number of homologous sequences are available. Free energy landscape analysis shows that the addition of the evolutionary term shifts the free energy minimum to more native-like structures, which explains the improvement in the quality of structures when performing predictions using simulated annealing. Simulations using AWSEM without coevolutionary information have proved useful in elucidating not only protein folding behavior, but also mechanisms of protein function. The success of AWSEM-ER in de novo structure prediction suggests that the enhanced model opens the door to functional studies of proteins even when no experimentally solved structures are available.
Affiliation(s)
- Brian J Sirovetz
- Center for Theoretical Biological Physics, Rice University, Houston, Texas; Department of Chemistry, Rice University, Houston, Texas
- Nicholas P Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, Texas
- Peter G Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas; Department of Chemistry, Rice University, Houston, Texas; Department of Physics, Rice University, Houston, Texas; Department of Biosciences, Rice University, Houston, Texas
|
18
|
Aggregation landscapes of Huntingtin exon 1 protein fragments and the critical repeat length for the onset of Huntington's disease. Proc Natl Acad Sci U S A 2017; 114:4406-4411. [PMID: 28400517 DOI: 10.1073/pnas.1702237114] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Indexed: 01/02/2023] Open
Abstract
Huntington's disease (HD) is a neurodegenerative disease caused by an abnormal expansion in the polyglutamine (polyQ) tract of the Huntingtin (HTT) protein. The severity of the disease depends on the polyQ repeat length, arising only in patients with proteins having 36 repeats or more. Previous studies have shown that the aggregation of N-terminal fragments (encoded by HTT exon 1) underlies the disease pathology in mouse models and that the HTT exon 1 gene product can self-assemble into amyloid structures. Here, we provide detailed structural mechanisms for aggregation of several protein fragments encoded by HTT exon 1 by using the associative memory, water-mediated, structure and energy model (AWSEM) to construct their free energy landscapes. We find that the addition of the N-terminal 17-residue sequence ([Formula: see text]) facilitates polyQ aggregation by encouraging the formation of prefibrillar oligomers, whereas adding the C-terminal polyproline sequence ([Formula: see text]) inhibits aggregation. The combination of both terminal additions in the HTT exon 1 fragment leads to a complex aggregation mechanism with a basic core that resembles that found for the aggregation of pure polyQ repeats using AWSEM. At the extrapolated physiological concentration, although the grand canonical free energy profiles are uphill for HTT exon 1 fragments having 20 or 30 glutamines, the aggregation landscape for fragments with 40 repeats has become downhill. This computational prediction agrees with the critical length found for the onset of HD and suggests potential therapies based on blocking early binding events involving the terminal additions to the polyQ repeats.
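How an aggregation free energy profile changes when it is extrapolated to a different concentration can be sketched with the standard ideal-solution correction, in which each monomer added to an oligomer shifts the free energy by kT ln(c/c_ref). This is an assumption made for illustration and is not necessarily the exact extrapolation used in the paper; the function name `shift_profile` and the profile values below are hypothetical.

```python
import math


def shift_profile(profile_kt: dict, conc_ratio: float) -> dict:
    """Re-express an oligomerization free energy profile at a new concentration.

    profile_kt : {oligomer size n: free energy in units of kT, relative to
                  the monomer, at the reference concentration}
    conc_ratio : c / c_ref, the new concentration over the reference one

    Assumes ideal-solution behavior: forming an n-mer removes (n - 1)
    additional monomers from solution, each contributing kT * ln(c / c_ref).
    """
    if conc_ratio <= 0:
        raise ValueError("conc_ratio must be positive")
    log_ratio = math.log(conc_ratio)
    return {n: f - (n - 1) * log_ratio for n, f in profile_kt.items()}


# Hypothetical profile that is downhill at the reference concentration.
reference = {1: 0.0, 2: -2.0, 3: -5.0, 4: -9.0}

# Diluting 1000-fold penalizes every added monomer by ln(1000) kT.
diluted = shift_profile(reference, 1e-3)
for n in sorted(diluted):
    print(n, round(diluted[n], 2))
```

With these made-up numbers the profile that was downhill at the reference concentration becomes uphill after dilution, which is the qualitative sense in which a landscape can be uphill or downhill depending on concentration.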
|