1
Ding N, Jiang Y, Lee S, Cheng Z, Ran X, Ding Y, Ge R, Zhang Y, Yang ZJ. Enzyme miniaturization: Revolutionizing future biocatalysts. Biotechnol Adv 2025; 82:108598. [PMID: 40354901 DOI: 10.1016/j.biotechadv.2025.108598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2025] [Revised: 04/05/2025] [Accepted: 05/09/2025] [Indexed: 05/14/2025]
Abstract
Enzyme miniaturization offers a transformative approach to overcome limitations posed by the large size of conventional enzymes in industrial, therapeutic, and diagnostic applications. However, the evolutionary optimization of enzymes for activity has not inherently favored compact structures, creating challenges for modern applications requiring smaller catalysts. In this review, we surveyed the advantages of miniature enzymes, including enhanced expressivity, folding efficiency, thermostability, and resistance to proteolysis. We described the applications of miniature enzymes as industrial catalysts, therapeutic agents, and diagnostic elements. We highlighted strategies such as genome mining, rational design, random deletion, and de novo design for achieving enzyme miniaturization, integrating both computational and experimental techniques. By investigating these approaches, we aim to provide a framework for advancing enzyme engineering, emphasizing the unique potential of miniature enzymes to revolutionize biocatalysis, gene therapy, and biosensing technologies.
Affiliation(s)
- Ning Ding
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, United States; Center for Structural Biology, Vanderbilt University, Nashville, TN 37235, United States
- Yaoyukun Jiang
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, United States; Department of Chemistry and California Institute for Quantitative Biosciences, University of California-Berkeley, Berkeley, CA 94720, United States
- Sangsin Lee
  - Department of Genetics, Stanford University, Stanford, CA 94305, United States
- Zihao Cheng
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, United States
- Xinchun Ran
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, United States
- Yujing Ding
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China; Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Robbie Ge
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, United States
- Yifei Zhang
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China; Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Zhongyue J Yang
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, United States; Center for Structural Biology, Vanderbilt University, Nashville, TN 37235, United States
2
Glyakina AV, Suvorina MY, Dovidchenko NV, Katina NS, Surin AK, Galzitskaya OV. Exploring Compactness and Dynamics of Apomyoglobin. Proteins 2025; 93:997-1008. [PMID: 39713842 DOI: 10.1002/prot.26786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2024] [Revised: 12/02/2024] [Accepted: 12/09/2024] [Indexed: 12/24/2024]
Abstract
The hydrogen-deuterium exchange mass spectrometry (HDX-MS) approach has become a valuable analytical complement to traditional methods. HDX-MS allows the identification of dynamic surfaces in proteins. We have shown that the introduction of various mutations into the amino acid sequence of whale apomyoglobin (apoMb) leads to a change in the number of exchangeable hydrogen atoms, which is associated with a change in its compactness in the native-like condition. Thus, the amino acid substitutions V10A, A15S, P120G, and M131A result in an increase in the number of exchangeable hydrogen atoms in the native-like condition, while the mutant form A144S leads to a decrease in the number of exchangeable hydrogen atoms. This may be due to a decrease and an increase, respectively, in the compactness of the apoMb structure compared to the wild-type apoMb. The L9F and L9E mutations did not affect the compactness of the molecule compared to the wild type. We have demonstrated that the V10A and M131A substitutions lead to the largest and a large increase, respectively, in the average number of hydrogen atoms exchangeable for deuterium, since these substitutions lead to the loss of contacts between important parts of the myoglobin structure: helices A, G, and H, which are structured at the early stage of folding.
Affiliation(s)
- Anna V Glyakina
  - Institute of Mathematical Problems of Biology, Russian Academy of Sciences, the Branch of Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, Moscow, Russia
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
- Mariya Y Suvorina
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
- Nikita V Dovidchenko
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Gamaleya Research Centre of Epidemiology and Microbiology, Moscow, Russia
- Natalya S Katina
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - The Branch of the Institute of Bioorganic Chemistry, Russian Academy of Sciences, Pushchino, Russia
- Alexey K Surin
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - The Branch of the Institute of Bioorganic Chemistry, Russian Academy of Sciences, Pushchino, Russia
  - State Research Center for Applied Microbiology and Biotechnology, Russia
- Oxana V Galzitskaya
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Gamaleya Research Centre of Epidemiology and Microbiology, Moscow, Russia
  - Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, Pushchino, Russia
3
Harihar B, Saravanan KM, Gromiha MM, Selvaraj S. Importance of Inter-residue Contacts for Understanding Protein Folding and Unfolding Rates, Remote Homology, and Drug Design. Mol Biotechnol 2025; 67:862-884. [PMID: 38498284 DOI: 10.1007/s12033-024-01119-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 02/10/2024] [Indexed: 03/20/2024]
Abstract
Inter-residue interactions in protein structures provide valuable insights into protein folding and stability. Understanding these interactions can be helpful in many crucial applications, including rational design of therapeutic small molecules and biologics, locating functional protein sites, and predicting protein-protein and protein-ligand interactions. The process of developing machine learning models incorporating inter-residue interactions has been improved recently. This review highlights the theoretical models incorporating inter-residue interactions in predicting folding and unfolding rates of proteins. Utilizing contact maps to depict inter-residue interactions aids researchers in developing computer models for detecting remote homologs and interface residues within protein-protein complexes which, in turn, enhances our knowledge of the relationship between sequence and structure of proteins. Further, the application of contact maps derived from inter-residue interactions is highlighted in the field of drug discovery. Overall, this review presents an extensive assessment of the significant models that use inter-residue interactions to investigate folding rates, unfolding rates, remote homology, and drug development, providing potential future advancements in constructing efficient computational models in structural biology.
Affiliation(s)
- Balasubramanian Harihar
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Konda Mani Saravanan
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, Tamil Nadu, 600073, India
- Michael M Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Samuel Selvaraj
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
4
Garbuzynskiy SO, Marchenkov VV, Marchenko NY, Semisotnov GV, Finkelstein AV. How proteins manage to fold and how chaperones manage to assist the folding. Phys Life Rev 2025; 52:66-79. [PMID: 39709754 DOI: 10.1016/j.plrev.2024.12.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2024] [Accepted: 12/12/2024] [Indexed: 12/24/2024]
Abstract
This review presents the current understanding of (i) spontaneous self-organization of spatial structures of protein molecules, and (ii) possible ways of chaperones' assistance to this process. Specifically, we overview the most important features of spontaneous folding of proteins (mostly, of the single-domain water-soluble globular proteins): the choice of the unique protein structure among zillions of alternatives, the nucleation of the folding process, and phase transitions within protein molecules. We consider the main experimental facts on protein folding, both in vivo and in vitro, of both kinetic and thermodynamic nature. We discuss the famous Levinthal's paradox of protein folding and its solution, theoretical models of protein folding and unfolding, and the dependence of the rates of these processes on the protein chain length. Special attention is paid to relatively small, single-domain, and water-soluble globular proteins whose structure and folding are much better studied and understood than those of large proteins, especially membrane or fibrous proteins. Lastly, we describe the chaperone-assisted protein folding with an emphasis on the chaperones' ability to prevent proteins from their irreversible aggregation. Since the possible assistance mechanisms connected with chaperones are still debatable, experimental data useful in selecting the most likely mechanisms of chaperone-assisted protein folding are presented.
Affiliation(s)
- Sergiy O Garbuzynskiy
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation
- Victor V Marchenkov
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation
- Natalia Y Marchenko
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation
- Gennady V Semisotnov
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation
- Alexei V Finkelstein
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation
5
Xiao N, Yang W, Wang J, Li J, Zhao R, Li M, Li C, Liu K, Li Y, Yin C, Chen Z, Li X, Jiang Y. Protein structuromics: A new method for protein structure-function crosstalk in glioma. Proteins 2024; 92:24-36. [PMID: 37497743 DOI: 10.1002/prot.26555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Revised: 06/16/2023] [Accepted: 07/04/2023] [Indexed: 07/28/2023]
Abstract
Glioma is a type of tumor that starts in the glial cells of the brain or spine. Since the 1800s, when the disease was first named, its survival rates have always been unsatisfactory. Despite great advances in molecular biology and traditional treatment methods, many questions regarding cancer occurrence and the underlying mechanism remain to be answered. In this study, we assessed the protein structural features of 20 oncogenes and 20 anti-oncogenes via protein structure and dynamic analysis methods and 3D structural and systematic analyses of the structure-function relationships of proteins. All of these results directly indicate that unfavorable group proteins show more complex structures than favorable group proteins. As the tumor cell microenvironment changes, the balance of oncogene-related and anti-oncogene-related proteins is disrupted, and most of the structures of the two groups of proteins will be disrupted. However, more unfavorable group proteins will maintain and refold to achieve their correct shape faster and perform their functions more quickly than favorable group proteins, and the former thus support cancer development. We hope that these analyses will help promote mechanistic research and the development of new treatments for glioma.
Affiliation(s)
- Nan Xiao
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Wenming Yang
  - Department of Neurosurgery, The First Affiliated Hospital of Jinzhou Medical University, Jinzhou, Liaoning, China
- Jin Wang
  - Department of Rehabilitation, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Jiarong Li
  - Department of Rehabilitation, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Ruoxuan Zhao
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Muzheng Li
  - Department of Rehabilitation, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Chi Li
  - Department of Anesthesiology, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Kang Liu
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Yingxin Li
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Chaoqun Yin
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Zhibo Chen
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Xingqi Li
  - Department of Medicine, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
- Yun Jiang
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou, Liaoning, China
6
Casier R, Duhamel J. Appraisal of blob-Based Approaches in the Prediction of Protein Folding Times. J Phys Chem B 2023; 127:8852-8859. [PMID: 37793094 DOI: 10.1021/acs.jpcb.3c04958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/06/2023]
Abstract
A series of reports published in the last 3 years has illustrated that a blob-based model (BBM) can predict the folding time of proteins from their primary amino acid (aa) sequence based on three simple rules established to characterize the long-range backbone dynamics (LRBD) of racemic polypeptides. The sole use of LRBD to predict protein folding times with the BBM represents a radical departure from all other prediction methods currently applied to determine protein folding times, which rely instead on parameters such as the structure content, folding kinetics, chain length, amino acid properties, or contact topography of proteins. Furthermore, the built-in modularity of the BBM enables the parametrization and inclusion of new phenomena affecting the LRBD of polypeptides, while its conceptual simplicity makes it an interesting new mathematical tool for studying protein folding. However, its novelty implies that its relationship with many other methods used to predict protein folding times has not been well researched. Consequently, the purpose of this report is to uncover the physical phenomena encountered during protein folding that are best described by the BBM through the identification of parameters that have been recognized over the years as being strong predictors for protein folding, such as protein size, topology, structural class, and folding kinetics. This was accomplished by determining the parameters most strongly correlated with the folding times predicted by the BBM. While the BBM in its present form appears to be a good indicator of the folding times of the vast majority of the 195 proteins considered so far, this report finds that it excels for moderately large proteins that are primarily composed of locally formed structural motifs such as α-helices or for proteins that fold in multiple steps. 
Altogether, these observations based on the use of the BBM support the notion that proteins fold the way they do because the LRBD of polypeptides is mostly driven by the local interactions experienced between aa's within reach of one another.
Affiliation(s)
- Remi Casier
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Jean Duhamel
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
7
Vila JA. Protein folding rate evolution upon mutations. Biophys Rev 2023; 15:661-669. [PMID: 37681091 PMCID: PMC10480377 DOI: 10.1007/s12551-023-01088-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 06/24/2023] [Indexed: 09/09/2023] Open
Abstract
Despite the spectacular success of cutting-edge protein fold prediction methods, many critical questions remain unanswered, including why proteins can reach their native state in a biologically reasonable time. A satisfactory answer to this simple question could shed light on the slowest folding rate of proteins as well as how mutations (amino-acid substitutions and/or post-translational modifications) might affect it. Preliminary results indicate that (i) Anfinsen's dogma validity ensures that proteins reach their native state on a reasonable timescale regardless of their sequence or length, and (ii) it is feasible to determine the evolution of protein folding rates without accounting for epistasis effects or the mutational trajectories between the starting and target sequences. These results have direct implications for evolutionary biology because they lay the groundwork for a better understanding of why, and to what extent, mutations (a crucial element of evolution and a factor influencing it) affect protein evolvability. Furthermore, they may spur significant progress in our efforts to solve crucial structural biology problems, such as how a sequence encodes its folding.
Affiliation(s)
- Jorge A. Vila
  - IMASL-CONICET, Universidad Nacional de San Luis, Ejército de Los Andes 950, 5700 San Luis, Argentina
8
Xiao N, Ma H, Gao H, Yang J, Tong D, Gan D, Yang J, Li C, Liu K, Li Y, Chen Z, Yin C, Li X, Wang H. Structure-function crosstalk in liver cancer research: Protein structuromics. Int J Biol Macromol 2023:125291. [PMID: 37315670 DOI: 10.1016/j.ijbiomac.2023.125291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/04/2023] [Revised: 06/04/2023] [Accepted: 06/07/2023] [Indexed: 06/16/2023]
Abstract
Liver cancer can be primary (starting in the liver) or secondary (cancer that has spread from elsewhere to the liver, known as liver metastasis). Liver metastasis is more common than primary liver cancer. Despite great advances in molecular biology methods and treatments, liver cancer is still associated with a poor survival rate and a high death rate, and there is no cure. Many questions remain regarding the mechanisms of liver cancer occurrence and development as well as tumor reoccurrence after treatment. In this study, we assessed the protein structural features of 20 oncogenes and 20 anti-oncogenes via protein structure and dynamic analysis methods and 3D structural and systematic analyses of the structure-function relationships of proteins. Our aim was to provide new insights that may inform research on the development and treatment of liver cancer.
Affiliation(s)
- Nan Xiao
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Hongming Ma
  - Department of Oncology, China Emergency General Hospital City, Beijing, China
- Hong Gao
  - Department of Oncology, China Emergency General Hospital City, Beijing, China
- Jing Yang
  - Department of Computer Center, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Dan Tong
  - Department of Nurse, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Dingzhu Gan
  - Department of Publicity, Peking Union Medical College, Beijing, China
- Jinhua Yang
  - Department of Development and Production, Institute of Medical Biology, Peking Union Medical College, Kunming City, Yunnan Province, China
- Chi Li
  - Department of Anesthesiology, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Kang Liu
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Yingxin Li
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Zhibo Chen
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Chaoqun Yin
  - Department of Medical Science, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Xingqi Li
  - Department of Medicine, Medical College of Jinzhou Medical University, Jinzhou City, Liaoning Province, China
- Hongwu Wang
  - Department of Respiratory and Critical Care Medicine, Dongzhimen Hospital Affiliated to Beijing University of Chinese Medicine, Beijing, China
9
Nithiyanandam S, Sangaraju VK, Manavalan B, Lee G. Computational prediction of protein folding rate using structural parameters and network centrality measures. Comput Biol Med 2023; 155:106436. [PMID: 36848800 DOI: 10.1016/j.compbiomed.2022.106436] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/17/2022] [Revised: 11/28/2022] [Accepted: 12/13/2022] [Indexed: 02/17/2023]
Abstract
Protein folding is a complex physicochemical process whereby a polymer of amino acids samples numerous conformations in its unfolded state before settling on an essentially unique native three-dimensional (3D) structure. To understand this process, several theoretical studies have used a set of 3D structures, identified different structural parameters, and analyzed their relationships using the natural logarithmic protein folding rate (ln(kf)). Unfortunately, these structural parameters are specific to a small set of proteins that are not capable of accurately predicting ln(kf) for both two-state (TS) and non-two-state (NTS) proteins. To overcome the limitations of the statistical approach, a few machine learning (ML)-based models have been proposed using limited training data. However, none of these methods can explain plausible folding mechanisms. In this study, we evaluated the predictive capabilities of ten different ML algorithms using eight different structural parameters and five different network centrality measures based on newly constructed datasets. In comparison to the other nine regressors, support vector machine was found to be the most appropriate for predicting ln(kf) with mean absolute differences of 1.856, 1.55, and 1.745 for the TS, NTS, and combined datasets, respectively. Furthermore, combining structural parameters and network centrality measures improves the prediction performance compared to individual parameters, indicating that multiple factors are involved in the folding process.
Affiliation(s)
- Saraswathy Nithiyanandam
  - Department of Molecular Science and Technology, Ajou University, 206 World Cup-ro, Suwon, 16499, South Korea
- Vinoth Kumar Sangaraju
  - Department of Physiology, Ajou University School of Medicine, 206 World Cup-ro, Suwon, 16499, South Korea
- Balachandran Manavalan
  - Department of Physiology, Ajou University School of Medicine, 206 World Cup-ro, Suwon, 16499, South Korea
- Gwang Lee
  - Department of Molecular Science and Technology, Ajou University, 206 World Cup-ro, Suwon, 16499, South Korea; Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, South Korea
10
Finkelstein AV, Bogatyreva NS, Ivankov DN, Garbuzynskiy SO. Protein folding problem: enigma, paradox, solution. Biophys Rev 2022; 14:1255-1272. [PMID: 36659994 PMCID: PMC9842845 DOI: 10.1007/s12551-022-01000-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/22/2022] [Accepted: 09/19/2022] [Indexed: 01/22/2023] Open
Abstract
The ability of protein chains to spontaneously form their three-dimensional structures is a long-standing mystery in molecular biology. The most conceptual aspect of this mystery is how the protein chain can find its native, "working" spatial structure (which, for not too big protein chains, corresponds to the global free energy minimum) in a biologically reasonable time, without exhaustive enumeration of all possible conformations, which would take billions of years. This is the so-called "Levinthal's paradox." In this review, we discuss the key ideas and discoveries leading to the current understanding of protein folding kinetics, including folding landscapes and funnels, free energy barriers at the folding/unfolding pathways, and the solution of Levinthal's paradox. A special role here is played by the "all-or-none" phase transition occurring at protein folding and unfolding and by the point of thermodynamic (and kinetic) equilibrium between the "native" and the "unfolded" phases of the protein chain (where the theory obtains the simplest form). The modern theory provides an understanding of key features of protein folding and, in good agreement with experiments, it (i) outlines the chain length-dependent range of protein folding times, (ii) predicts the observed maximal size of "foldable" proteins and domains. Besides, it predicts the maximal size of proteins and domains that fold under solely thermodynamic (rather than kinetic) control. Complementarily, a theoretical analysis of the number of possible protein folding patterns, performed at the level of formation and assembly of secondary structures, correctly outlines the upper limit of protein folding times.
Affiliation(s)
- Alexei V. Finkelstein
  - Institute of Protein Research of the Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
  - Biotechnology Department of the Lomonosov Moscow State University, 4 Institutskaya Str, 142290 Pushchino, Moscow Region, Russia
  - Biology Department of the Lomonosov Moscow State University, 1-12 Leninskie Gory, 119991 Moscow, Russia
- Natalya S. Bogatyreva
  - Institute of Protein Research of the Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Dmitry N. Ivankov
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
- Sergiy O. Garbuzynskiy
  - Institute of Protein Research of the Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
11
Bychkova VE, Dolgikh DA, Balobanov VA, Finkelstein AV. The Molten Globule State of a Globular Protein in a Cell Is More or Less Frequent Case Rather than an Exception. Molecules 2022; 27:4361. [PMID: 35889244 PMCID: PMC9319461 DOI: 10.3390/molecules27144361] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/27/2022] [Revised: 07/01/2022] [Accepted: 07/03/2022] [Indexed: 02/01/2023] Open
Abstract
Quite a long time ago, Oleg B. Ptitsyn put forward a hypothesis about the possible functional significance of the molten globule (MG) state for the functioning of proteins. MG is an intermediate between the unfolded and the native state of a protein. Its experimental detection and investigation in a cell are extremely difficult. In the last decades, intensive studies have demonstrated that the MG-like state of some globular proteins arises from either their modifications or interactions with protein partners or other cell components. This review summarizes such reports. In many cases, MG was evidenced to be functionally important. Thus, the MG state is quite common for functional cellular proteins. This supports Ptitsyn’s hypothesis that some globular proteins may switch between two active states, rigid (N) and soft (MG), to work in solution or interact with partners.
Affiliation(s)
- Valentina E. Bychkova
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Dmitry A. Dolgikh
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117871 Moscow, Russia
- Vitalii A. Balobanov
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Alexei V. Finkelstein
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
12
Scalvini B, Sheikhhassani V, Mashaghi A. Topological principles of protein folding. Phys Chem Chem Phys 2021; 23:21316-21328. [PMID: 34545868 DOI: 10.1039/d1cp03390e] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/21/2022]
Abstract
What is the topology of a protein and what governs protein folding to a specific topology? This is a fundamental question in biology. The protein folding reaction is a critically important cellular process, which is failing in many prevalent diseases. Understanding protein folding is also key to the design of new proteins for applications. However, our ability to predict the folding of a protein chain is quite limited and much is still unknown about the topological principles of folding. Current predictors of folding kinetics, including the contact order and size, present a limited predictive power, suggesting that these models are fundamentally incomplete. Here, we use a newly developed mathematical framework to define and extract the topology of a native protein conformation beyond knot theory, and investigate the relationship between native topology and folding kinetics in experimentally characterized proteins. We show that not only the folding rate, but also the mechanistic insight into folding mechanisms can be inferred from topological parameters. We identify basic topological features that speed up or slow down the folding process. The approach enabled the decomposition of protein 3D conformation into topologically independent elementary folding units, called circuits. The number of circuits correlates significantly with the folding rate, offering not only an efficient kinetic predictor, but also a tool for a deeper understanding of theoretical folding models. This study contributes to recent work that reveals the critical relevance of topology to protein folding with a new, contact-based, mathematically rigorous perspective. We show that topology can predict folding kinetics when geometry-based predictors like contact order and size fail.
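For readers unfamiliar with the "contact order" predictor named in this abstract, the following is a minimal sketch of the commonly used relative contact order: the mean sequence separation of residue pairs in contact, divided by chain length. It is not the circuit-topology method of the cited paper. The function name, the use of one coordinate per residue, the 8.0 Å cutoff, and the minimum separation of 2 are all illustrative assumptions.

```python
from itertools import combinations
from math import dist


def relative_contact_order(coords, cutoff=8.0, min_separation=2):
    """Relative contact order from one 3D point per residue.

    coords: list of (x, y, z) tuples, one per residue (for example
    C-alpha positions). Using a single point per residue, the cutoff
    and min_separation values are simplifying assumptions.
    """
    n_residues = len(coords)
    total_separation = 0
    n_contacts = 0
    for i, j in combinations(range(n_residues), 2):
        # Skip pairs that are too close along the chain.
        if j - i < min_separation:
            continue
        # Count the pair as a contact if it lies within the cutoff.
        if dist(coords[i], coords[j]) <= cutoff:
            total_separation += j - i
            n_contacts += 1
    if n_contacts == 0:
        return 0.0
    # Mean sequence separation of contacts, normalized by chain length.
    return total_separation / (n_contacts * n_residues)


# Toy example: four residues on a straight line, 3.8 apart.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(relative_contact_order(chain))
```

In the toy chain only the (0, 2) and (1, 3) pairs qualify as contacts, so low values indicate mostly local contacts and higher values indicate contacts between residues far apart in sequence.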
Affiliation(s)
- Barbara Scalvini
- Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands.
| | - Vahid Sheikhhassani
- Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands.
| | - Alireza Mashaghi
- Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands.
| |
|
13
|
Casier R, Duhamel J. Blob-Based Predictions of Protein Folding Times from the Amino Acid-Dependent Conformation of Polypeptides in Solution. Macromolecules 2021. [DOI: 10.1021/acs.macromol.0c02617] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Affiliation(s)
- Remi Casier
- Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L3G1, Canada
| | - Jean Duhamel
- Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L3G1, Canada
| |
|
14
|
Casier R, Duhamel J. Blob-Based Approach to Estimate the Folding Time of Proteins Supported by Pyrene Excimer Fluorescence Experiments. Macromolecules 2020. [DOI: 10.1021/acs.macromol.0c02201] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Affiliation(s)
- Remi Casier
- Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
| | - Jean Duhamel
- Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
| |
|
15
|
Li Y, Zhang Y, Lv J. An Effective Cumulative Torsion Angles Model for Prediction of Protein Folding Rates. Protein Pept Lett 2020; 27:321-328. [PMID: 31612815 DOI: 10.2174/0929866526666191014152207] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/24/2019] [Revised: 06/07/2019] [Accepted: 06/29/2019] [Indexed: 02/05/2023]
Abstract
BACKGROUND Protein folding rate is mainly determined by the size of the conformational space to search, which in turn is dictated by factors such as the size, structure and amino-acid sequence of a protein. It is important to integrate these factors effectively to form a more precise description of the conformational space, but there is no general paradigm for doing so beyond some intuitions and empirical rules. Therefore, at the present stage, predictions of the folding rate can be improved by finding new factors, which also gives some insight into the above question. OBJECTIVE The purpose is to propose a new parameter that can describe the size of the conformational space to improve the prediction accuracy of protein folding rate. METHODS Based on the optimal set of amino acids in a protein, an effective cumulative backbone torsion angle (CBTAeff) was proposed to describe the size of the conformational space. A linear regression model was used to predict protein folding rate with CBTAeff as a parameter. The degree of correlation was described by the coefficient of determination and the mean absolute error (MAE) between the predicted folding rates and experimental observations. RESULTS The model achieved a high correlation (coefficient of determination of 0.70 and MAE of 1.88) between the logarithm of folding rates and (CBTAeff)^0.5 across 112 experimentally characterized two- and multi-state folding proteins. CONCLUSION The remarkable performance of our simplistic model demonstrates that CBTA based on the optimal set is the major determinant of the conformational space of natural proteins.
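The regression described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the `cbta_eff` and `ln_kf` arrays are synthetic placeholder values invented for the example, and `fit_folding_rate` is a hypothetical helper name. It fits the logarithm of the folding rate against the square root of CBTAeff by ordinary least squares and reports the coefficient of determination and the mean absolute error.

```python
import numpy as np

def fit_folding_rate(cbta_eff, ln_kf):
    """Least-squares fit of ln(kf) = a * sqrt(CBTAeff) + b.

    Returns (a, b, r2, mae). Hypothetical helper for illustration only.
    """
    x = np.sqrt(np.asarray(cbta_eff, dtype=float))
    y = np.asarray(ln_kf, dtype=float)
    a, b = np.polyfit(x, y, 1)               # slope and intercept
    pred = a * x + b
    ss_res = np.sum((y - pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                # coefficient of determination
    mae = np.mean(np.abs(y - pred))           # mean absolute error
    return a, b, r2, mae

# Synthetic placeholder data (NOT experimental values).
rng = np.random.default_rng(0)
cbta_eff = rng.uniform(100.0, 2500.0, size=50)
ln_kf = 12.0 - 0.3 * np.sqrt(cbta_eff) + rng.normal(0.0, 1.0, size=50)

a, b, r2, mae = fit_folding_rate(cbta_eff, ln_kf)
print(f"slope={a:.3f} intercept={b:.3f} R^2={r2:.2f} MAE={mae:.2f}")
```

Reporting both metrics mirrors the abstract: the coefficient of determination measures the fraction of variance explained, while the MAE is expressed in the same units as the logarithm of the folding rate.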
Affiliation(s)
- Yanru Li
- Department of Physics, College of Science, Inner Mongolia University of Technology, Hohhot, China
| | - Ying Zhang
- Department of Physics, College of Science, Inner Mongolia University of Technology, Hohhot, China
| | - Jun Lv
- Department of Physics, College of Science, Inner Mongolia University of Technology, Hohhot, China
| |
|
16
|
Kuwajima K. The Molten Globule, and Two-State vs. Non-Two-State Folding of Globular Proteins. Biomolecules 2020; 10:biom10030407. [PMID: 32155758 PMCID: PMC7175247 DOI: 10.3390/biom10030407] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 02/19/2020] [Revised: 03/03/2020] [Accepted: 03/06/2020] [Indexed: 11/16/2022] Open
Abstract
From experimental studies of protein folding, it is now clear that there are two types of folding behavior, i.e., two-state folding and non-two-state folding, and understanding the relationships between these apparently different folding behaviors is essential for fully elucidating the molecular mechanisms of protein folding. This article describes how the presence of the two types of folding behavior has been confirmed experimentally, and discusses the relationships between the two-state and the non-two-state folding reactions, on the basis of available data on the correlations of the folding rate constant with various structure-based properties, which are determined primarily by the backbone topology of proteins. Finally, a two-stage hierarchical model is proposed as a general mechanism of protein folding. In this model, protein folding occurs in a hierarchical manner, reflecting the hierarchy of the native three-dimensional structure, as embodied in the case of non-two-state folding with an accumulation of the molten globule state as a folding intermediate. The two-state folding is thus merely a simplified version of the hierarchical folding caused either by an alteration in the rate-limiting step of folding or by destabilization of the intermediate.
Affiliation(s)
- Kunihiro Kuwajima
- Department of Physics, School of Science, the University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan; Tel.: +81-90-5435-6540
- School of Computational Sciences, Korea Institute for Advanced Study (KIAS), Seoul 02455, Korea
| |
|
17
|
How Quickly Do Proteins Fold and Unfold, and What Structural Parameters Correlate with These Values? Biomolecules 2020; 10:biom10020197. [PMID: 32013136 PMCID: PMC7072309 DOI: 10.3390/biom10020197] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/27/2019] [Revised: 01/22/2020] [Accepted: 01/26/2020] [Indexed: 11/24/2022] Open
Abstract
The correlations between the logarithm of the unfolding rate of 108 proteins and their structural parameters were calculated. We showed that there is a good correlation between the logarithms of the folding and unfolding rates (0.79) and between protein stability and unfolding rate (0.79). Thus, the faster the protein folds, the faster it unfolds. Folding and unfolding rates are higher for proteins with two-state kinetics than for proteins with multi-state kinetics. At the same time, two-state bacterial proteins fold and unfold two orders of magnitude faster than two-state eukaryotic proteins, and multi-state bacterial proteins fold and unfold more slowly than multi-state eukaryotic proteins. Although the folding rates of thermophilic and mesophilic proteins are close, the unfolding rates of thermophilic proteins are about two orders of magnitude lower than those of mesophilic proteins. The correlation between unfolding rate and stability of thermophilic proteins is high (0.90). We also found that the unfolding rate correlates with structural parameters such as the size of the protein, radius of the cross-section, logarithm of absolute contact order, and radius of gyration. This information will be useful for engineering and designing new proteins with desired properties.
|
18
|
Uversky VN, Finkelstein AV. Life in Phases: Intra- and Inter- Molecular Phase Transitions in Protein Solutions. Biomolecules 2019; 9:E842. [PMID: 31817975 PMCID: PMC6995567 DOI: 10.3390/biom9120842] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 11/19/2019] [Revised: 12/05/2019] [Accepted: 12/06/2019] [Indexed: 02/06/2023] Open
Abstract
Proteins, these evolutionarily-edited biological polymers, are able to undergo intramolecular and intermolecular phase transitions. Spontaneous intramolecular phase transitions define the folding of globular proteins, whereas binding-induced, intra- and inter- molecular phase transitions play a crucial role in the functionality of many intrinsically-disordered proteins. On the other hand, intermolecular phase transitions are the behind-the-scenes players in a diverse set of macrosystemic phenomena taking place in protein solutions, such as new phase nucleation in bulk, on the interface, and on the impurities, protein crystallization, protein aggregation, the formation of amyloid fibrils, and intermolecular liquid-liquid or liquid-gel phase transitions associated with the biogenesis of membraneless organelles in the cells. This review is dedicated to the systematic analysis of the phase behavior of protein molecules and their ensembles, and provides a description of the major physical principles governing intramolecular and intermolecular phase transitions in protein solutions.
Affiliation(s)
- Vladimir N. Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Moscow, Russia
| | - Alexei V. Finkelstein
- Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow, Russia
- Biology Department, Lomonosov Moscow State University, 119192 Moscow, Russia
- Biotechnology Department, Lomonosov Moscow State University, 142290 Pushchino, Moscow, Russia
| |
|
19
|
|
20
|
Censoni L, Martínez L. Prediction of kinetics of protein folding with non-redundant contact information. Bioinformatics 2018; 34:4034-4038. [DOI: 10.1093/bioinformatics/bty478] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/18/2018] [Accepted: 06/12/2018] [Indexed: 11/14/2022] Open
Affiliation(s)
- Luciano Censoni
- Institute of Chemistry and Center for Computational Engineering and Science, University of Campinas, Campinas, SP, Brazil
| | - Leandro Martínez
- Institute of Chemistry and Center for Computational Engineering and Science, University of Campinas, Campinas, SP, Brazil
| |
|
21
|
Comparative genomic analysis of mollicutes with and without a chaperonin system. PLoS One 2018; 13:e0192619. [PMID: 29438383 PMCID: PMC5810989 DOI: 10.1371/journal.pone.0192619] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 07/13/2017] [Accepted: 01/27/2018] [Indexed: 11/19/2022] Open
Abstract
The GroE chaperonin system, which comprises GroEL and GroES, assists protein folding in vivo and in vitro. It is conserved in all prokaryotes except in most, but not all, members of the class of mollicutes. In Escherichia coli, about 60 proteins were found to be obligatory clients of the GroE system. Here, we describe the properties of the homologs of these GroE clients in mollicutes and the evolution of chaperonins in this class of bacteria. Comparing the properties of these homologs in mollicutes with and without chaperonins enabled us to search for features correlated with the presence of GroE. Interestingly, no sequence-based features of proteins such as average length, amino acid composition and predicted folding/disorder propensity were found to be affected by the absence of GroE. Other properties such as genome size and number of proteins were also found to not differ between mollicute species with and without GroE. Our data suggest that two clades of mollicutes re-acquired the GroE system, thereby supporting the view that gaining the system occurred polyphyletically and not monophyletically, as previously debated. Our data also suggest that there might have been three isolated cases of lateral gene transfer from specific bacterial sources. Taken together, our data indicate that loss of GroE does not involve crossing a high evolutionary barrier and can be compensated for by a small number of changes within the few dozen client proteins.
|
22
|
Bychkova VE, Semisotnov GV, Balobanov VA, Finkelstein AV. The Molten Globule Concept: 45 Years Later. BIOCHEMISTRY (MOSCOW) 2018; 83:S33-S47. [DOI: 10.1134/s0006297918140043] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/23/2022]
|
23
|
Chaney JL, Steele A, Carmichael R, Rodriguez A, Specht AT, Ngo K, Li J, Emrich S, Clark PL. Widespread position-specific conservation of synonymous rare codons within coding sequences. PLoS Comput Biol 2017; 13:e1005531. [PMID: 28475588 PMCID: PMC5438181 DOI: 10.1371/journal.pcbi.1005531] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 08/03/2016] [Revised: 05/19/2017] [Accepted: 04/21/2017] [Indexed: 02/01/2023] Open
Abstract
Synonymous rare codons are considered to be sub-optimal for gene expression because they are translated more slowly than common codons. Yet surprisingly, many protein coding sequences include large clusters of synonymous rare codons. Rare codons at the 5’ terminus of coding sequences have been shown to increase translational efficiency. Although a general functional role for synonymous rare codons farther within coding sequences has not yet been established, several recent reports have identified rare-to-common synonymous codon substitutions that impair folding of the encoded protein. Here we test the hypothesis that although the usage frequencies of synonymous codons change from organism to organism, codon rarity will be conserved at specific positions in a set of homologous coding sequences, for example to tune translation rate without altering a protein sequence. Such conservation of rarity–rather than specific codon identity–could coordinate co-translational folding of the encoded protein. We demonstrate that many rare codon cluster positions are indeed conserved within homologous coding sequences across diverse eukaryotic, bacterial, and archaeal species, suggesting they result from positive selection and have a functional role. Most conserved rare codon clusters occur within rather than between conserved protein domains, challenging the view that their primary function is to facilitate co-translational folding after synthesis of an autonomous structural unit. Instead, many conserved rare codon clusters separate smaller protein structural motifs within structural domains. These smaller motifs typically fold faster than an entire domain, on a time scale more consistent with translation rate modulation by synonymous codon usage. 
While proteins with conserved rare codon clusters are structurally and functionally diverse, they are enriched in functions associated with organism growth and development, suggesting an important role for synonymous codon usage in organism physiology. The identification of conserved rare codon clusters advances our understanding of distinct, functional roles for otherwise synonymous codons and enables experimental testing of the impact of synonymous codon usage on the production of functional proteins.
Proteins are long linear polymers that must fold into complex three-dimensional shapes in order to carry out their cellular functions. Every protein is synthesized by the ribosome, which decodes each trinucleotide codon in an mRNA coding sequence in order to select the amino acid residue that will occupy each position in the protein sequence. Most amino acids can be encoded by more than one codon, but these synonymous codons are not used with equal frequency. Rare codons are associated with generally slower rates for protein synthesis, and for this reason have traditionally been considered mildly deleterious for efficient protein production. However, because synonymous codon substitutions do not change the sequence of the encoded protein, the majority view is that they merely reflect genomic ‘background noise’. To the contrary, here we show that the positions of many synonymous rare codons are conserved in mRNA sequences that encode structurally similar proteins from a diverse range of organisms. These results suggest that rare codons have a functional role related to the production of functional proteins, potentially to regulate the rate of protein synthesis and the earliest steps of protein folding, while synthesis is still underway.
Affiliation(s)
- Julie L. Chaney
- Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Aaron Steele
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Rory Carmichael
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Anabel Rodriguez
- Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Alicia T. Specht
- Department of Applied and Computational Mathematics & Statistics, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Kim Ngo
- Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Jun Li
- Department of Applied and Computational Mathematics & Statistics, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Scott Emrich
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, Indiana, United States of America
- * E-mail: (PLC); (SE)
| | - Patricia L. Clark
- Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Department of Chemical & Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana, United States of America
- * E-mail: (PLC); (SE)
| |
|
24
|
Finkelstein AV, Badretdin AJ, Galzitskaya OV, Ivankov DN, Bogatyreva NS, Garbuzynskiy SO. There and back again: Two views on the protein folding puzzle. Phys Life Rev 2017; 21:56-71. [PMID: 28190683 DOI: 10.1016/j.plrev.2017.01.025] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 09/05/2016] [Revised: 01/05/2017] [Accepted: 01/19/2017] [Indexed: 02/08/2023]
Abstract
The ability of protein chains to spontaneously form their spatial structures is a long-standing puzzle in molecular biology. Experimentally measured folding times of single-domain globular proteins range from microseconds to hours: the difference (10-11 orders of magnitude) is the same as that between the life span of a mosquito and the age of the universe. This review describes physical theories of rates of overcoming the free-energy barrier separating the natively folded (N) and unfolded (U) states of protein chains in both directions: "U-to-N" and "N-to-U". In the theory of protein folding rates a special role is played by the point of thermodynamic (and kinetic) equilibrium between the native and unfolded state of the chain; here, the theory obtains the simplest form. Paradoxically, a theoretical estimate of the folding time is easier to get from consideration of protein unfolding (the "N-to-U" transition) rather than folding, because it is easier to outline a good unfolding pathway of any structure than a good folding pathway that leads to the stable fold, which is yet unknown to the folding protein chain. And since the rates of direct and reverse reactions are equal at the equilibrium point (as follows from the physical "detailed balance" principle), the estimated folding time can be derived from the estimated unfolding time. Theoretical analysis of the "N-to-U" transition outlines the range of protein folding rates in a good agreement with experiment. Theoretical analysis of folding (the "U-to-N" transition), performed at the level of formation and assembly of protein secondary structures, outlines the upper limit of protein folding times (i.e., of the time of search for the most stable fold). 
Both theories come to essentially the same results; this is not a surprise, because they describe overcoming one and the same free-energy barrier, although the way to the top of this barrier from the side of the unfolded state is very different from the way from the side of the native state; and both theories agree with experiment. In addition, they predict the maximal size of protein domains that fold under solely thermodynamic (rather than kinetic) control and explain the observed maximal size of the "foldable" protein domains.
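The detailed-balance step invoked in the abstract above can be written out for a simple two-state transition. The symbols $k_f$, $k_u$, $K$ and $\Delta G_{\mathrm{U}\to\mathrm{N}}$ are notation introduced here for illustration and do not appear in the abstract; the sketch assumes a single free-energy barrier between U and N.

```latex
K \;=\; \frac{[\mathrm{N}]_{\mathrm{eq}}}{[\mathrm{U}]_{\mathrm{eq}}}
  \;=\; \frac{k_f}{k_u}
  \;=\; \exp\!\left(-\frac{\Delta G_{\mathrm{U}\to\mathrm{N}}}{RT}\right),
\qquad
\Delta G_{\mathrm{U}\to\mathrm{N}} = 0
\;\Longrightarrow\; K = 1
\;\Longrightarrow\; k_f = k_u .
```

At the point of equilibrium between the native and unfolded states, an estimate of the unfolding time $1/k_u$ is therefore also an estimate of the folding time $1/k_f$, which is why the "N-to-U" analysis can be used to bound folding rates.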
Affiliation(s)
- Alexei V Finkelstein
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation.
| | - Azat J Badretdin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Oxana V Galzitskaya
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation
| | - Dmitry N Ivankov
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation; Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
| | - Natalya S Bogatyreva
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation; Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
| | - Sergiy O Garbuzynskiy
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation
| |
|
25
|
Wołek K, Gómez-Sicilia À, Cieplak M. Determination of contact maps in proteins: A combination of structural and chemical approaches. J Chem Phys 2016; 143:243105. [PMID: 26723590 DOI: 10.1063/1.4929599] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Indexed: 01/05/2023] Open
Abstract
Contact map selection is a crucial step in structure-based molecular dynamics modelling of proteins. The map can be determined in many different ways. We focus on the methods in which residues are represented as clusters of effective spheres. One contact map, denoted as overlap (OV), is based on the overlap of such spheres. Another contact map, named Contacts of Structural Units (CSU), involves the geometry in a different way and, in addition, brings chemical considerations into account. We develop a variant of the CSU approach in which we also incorporate Coulombic effects such as formation of the ionic bridges and destabilization of possible links through repulsion. In this way, the most essential and well defined contacts are identified. The resulting residue-residue contact map, dubbed repulsive CSU (rCSU), is more sound in its physico-chemical justification than CSU. It also provides a clear prescription for validity of an inter-residual contact: the number of attractive atomic contacts should be larger than the number of repulsive ones - a feature that is not present in CSU. However, both of these maps do not correlate well with the experimental data on protein stretching. Thus, we propose to use rCSU together with the OV map. We find that the combined map, denoted as OV+rCSU, performs better than OV. In most situations, OV and OV+rCSU yield comparable folding properties but for some proteins rCSU provides contacts which improve folding in a substantial way. We discuss the likely residue-specificity of the rCSU contacts. Finally, we make comparisons to the recently proposed shadow contact map, which is derived from different principles.
Affiliation(s)
- Karol Wołek
- Institute of Physics, Polish Academy of Science, Al. Lotników 32/46, 02-668 Warsaw, Poland
| | - Àngel Gómez-Sicilia
- Instituto Cajal, Consejo Superior de Investigaciones Cientificas (CSIC), Av. Doctor Arce, 37, 28002 Madrid, Spain
| | - Marek Cieplak
- Institute of Physics, Polish Academy of Science, Al. Lotników 32/46, 02-668 Warsaw, Poland
| |
|
26
|
Corrales M, Cuscó P, Usmanova DR, Chen HC, Bogatyreva NS, Filion GJ, Ivankov DN. Machine Learning: How Much Does It Tell about Protein Folding Rates? PLoS One 2015; 10:e0143166. [PMID: 26606303 PMCID: PMC4659572 DOI: 10.1371/journal.pone.0143166] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 07/19/2015] [Accepted: 11/02/2015] [Indexed: 11/18/2022] Open
Abstract
The prediction of protein folding rates is a necessary step towards understanding the principles of protein folding. Due to the increasing amount of experimental data, numerous protein folding models and predictors of protein folding rates have been developed in the last decade. The problem has also attracted the attention of scientists from computational fields, which led to the publication of several machine learning-based models to predict the rate of protein folding. Some of them claim to predict the logarithm of protein folding rate with an accuracy greater than 90%. However, there are reasons to believe that such claims are exaggerated due to large fluctuations and overfitting of the estimates. When we confronted three selected published models with new data, we found a much lower predictive power than reported in the original publications. Overly optimistic predictive powers appear from violations of the basic principles of machine-learning. We highlight common misconceptions in the studies claiming excessive predictive power and propose to use learning curves as a safeguard against those mistakes. As an example, we show that the current amount of experimental data is insufficient to build a linear predictor of logarithms of folding rates based on protein amino acid composition.
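The learning-curve safeguard proposed in the abstract above can be sketched as follows. This is not the authors' code: the data are synthetic, `learning_curve` is a hypothetical helper, and an ordinary least-squares model stands in for any of the published predictors. A gap between training and test error that stays wide as the training set grows suggests that the available data are too few for the number of model parameters.

```python
import numpy as np

def learning_curve(X, y, train_sizes, n_repeats=200, seed=0):
    """Mean train/test RMSE of an OLS linear model versus training-set size.

    Hypothetical helper: each repeat draws a random train/test split.
    Returns a list of (n_train, mean_train_rmse, mean_test_rmse).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    Xb = np.column_stack([np.ones(n), X])     # prepend an intercept column
    rows = []
    for m in train_sizes:
        train_err, test_err = [], []
        for _ in range(n_repeats):
            idx = rng.permutation(n)
            tr, te = idx[:m], idx[m:]
            coef, *_ = np.linalg.lstsq(Xb[tr], y[tr], rcond=None)
            train_err.append(np.sqrt(np.mean((Xb[tr] @ coef - y[tr]) ** 2)))
            test_err.append(np.sqrt(np.mean((Xb[te] @ coef - y[te]) ** 2)))
        rows.append((m, float(np.mean(train_err)), float(np.mean(test_err))))
    return rows

# Synthetic stand-in (NOT experimental data): 100 "proteins",
# 20 composition-like features, one weakly informative feature.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))
y = X[:, 0] + rng.normal(scale=2.0, size=100)

for m, tr, te in learning_curve(X, y, train_sizes=[25, 40, 60, 80]):
    print(f"n_train={m:3d}  train RMSE={tr:.2f}  test RMSE={te:.2f}")
```

With few training points relative to the 21 fitted coefficients, the training error is optimistic and the test error is large; plotting both against training-set size makes that overfitting visible before any accuracy claim is made.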
Affiliation(s)
- Marc Corrales
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Pol Cuscó
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Dinara R. Usmanova
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
| | - Heng-Chang Chen
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Natalya S. Bogatyreva
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| | - Guillaume J. Filion
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Dmitry N. Ivankov
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| |
|
27
|
Ruiz-Blanco YB, Marrero-Ponce Y, Prieto PJ, Salgado J, García Y, Sotomayor-Torres CM. A Hooke's law-based approach to protein folding rate. J Theor Biol 2015; 364:407-17. [DOI: 10.1016/j.jtbi.2014.09.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/04/2014] [Revised: 08/28/2014] [Accepted: 09/02/2014] [Indexed: 10/24/2022]
|
28
|
Krobath H, Rey A, Faísca PFN. How determinant is N-terminal to C-terminal coupling for protein folding? Phys Chem Chem Phys 2015; 17:3512-24. [DOI: 10.1039/c4cp05178e] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
Abstract
The existence of native interactions between the protein termini is a major determinant of the free energy barrier in a two-state folding transition and is therefore a critical modulator of protein folding cooperativity.
Affiliation(s)
- Heinrich Krobath
- Centro de Física da Matéria Condensada and Departamento de Física
- Faculdade de Ciências da Universidade de Lisboa
- Portugal
| | - Antonio Rey
- Departamento de Química Física I
- Facultad de Ciencias Químicas
- Universidad Complutense
- Madrid
- Spain
| | - Patrícia F. N. Faísca
- Centro de Física da Matéria Condensada and Departamento de Física
- Faculdade de Ciências da Universidade de Lisboa
- Portugal
| |
|
29
|
Xiong H, Yang Y, Hu XP, He YM, Ma BG. Sequence determinants of prokaryotic gene expression level under heat stress. Gene 2014; 551:92-102. [PMID: 25168890 DOI: 10.1016/j.gene.2014.08.049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/19/2014] [Accepted: 08/25/2014] [Indexed: 10/24/2022]
Abstract
Prokaryotic gene expression is environment-dependent, and temperature plays an important role in shaping the gene expression profile. Revealing the temperature-related regulation mechanisms of gene expression has attracted tremendous effort in recent years, particularly owing to the transcriptome and proteome data yielded by high-throughput techniques. However, most previous work concentrated on characterizing the gene expression profile of individual organisms, and little effort has been made to disclose the commonality among organisms, especially for gene sequence features. In this report, we collected the transcriptome and proteome data measured under heat stress conditions from recently published literature and studied the sequence determinants for the expression level of heat-responsive genes on multiple layers. Our results showed that there are indeed commonalities and consistent patterns in the sequence features among organisms for the genes differentially expressed under heat stress. Some features are attributed to the requirement of thermostability, while some are dominated by gene function. The revealed sequence determinants of bacterial gene expression level under heat stress complement the knowledge about the regulation factors of prokaryotic gene expression in response to changing environmental conditions. Furthermore, comparisons to thermophilic adaptation have been performed to reveal the similarity and dissimilarity of the sequence determinants for the response to heat stress and for the adaptation to high habitat temperature, which elucidates the complex landscape of gene expression related to the same physical factor of temperature.
Affiliation(s)
- Heng Xiong
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yi Yang
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Xiao-Pan Hu
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yi-Ming He
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Bin-Guang Ma
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
|
30
|
Rollins GC, Dill KA. General mechanism of two-state protein folding kinetics. J Am Chem Soc 2014; 136:11420-7. [PMID: 25056406 DOI: 10.1021/ja5049434] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Indexed: 12/17/2022]
Abstract
We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it gives good agreement with the 9 orders of magnitude dependence of folding rates on protein size for a set of 93 proteins; at the same time, it is consistent with the near independence of the folding equilibrium constant on size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s.
Affiliation(s)
- Geoffrey C Rollins
- Department of Biochemistry and Biophysics, University of California, San Francisco, California 94143, United States
|
31
|
Wagaman AS, Jaswal SS. Capturing protein folding-relevant topology via absolute contact order variants. J Theor Comput Chem 2014. [DOI: 10.1142/s0219633614500059] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Absolute contact order is one of the simplest parameters used to predict protein folding rates. Many variants of contact order (CO) have been applied to highlight different aspects of contact neighborhoods and their relationship to folding. However, a systematic study of the influence of CO variants on correlation with folding rate has not been performed for a large combined set of multi- and two-state proteins. We explore different contact neighborhoods and resulting CO by varying the distance thresholds and weighting of sequence separation for heavy atom and residue-based counting methods for a set of 136 proteins diverse across folding and structural classes. We examine the changes in contact neighborhoods and compare correlations with our CO variants and the protein folding rates across our data set as well as by folding type and structural class. Different CO variants lead to the strongest correlations within each protein structural class. Our results demonstrate that backbone topology at a distance beyond where energetic interactions dominate is able to capture folding determinants, and suggest that more sensitive methods of characterizing contact relationships may improve ln kf prediction for diverse protein sets.
Affiliation(s)
- Amy S. Wagaman
- Mathematics Department, Amherst College, P. O. Box 5000, Amherst, MA 01002, USA
- Sheila S. Jaswal
- Chemistry Department and Program in Biochemistry and Biophysics, Amherst College, P. O. Box 5000, Amherst, MA 01002, USA
|
32
|
Hudson NE, Ding F, Bucay I, O'Brien ET, Gorkun OV, Superfine R, Lord ST, Dokholyan NV, Falvo MR. Submillisecond elastic recoil reveals molecular origins of fibrin fiber mechanics. Biophys J 2014; 104:2671-80. [PMID: 23790375 DOI: 10.1016/j.bpj.2013.04.052] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 11/01/2012] [Revised: 04/19/2013] [Accepted: 04/22/2013] [Indexed: 12/13/2022] Open
Abstract
Fibrin fibers form the structural scaffold of blood clots. Thus, their mechanical properties are of central importance to understanding hemostasis and thrombotic disease. Recent studies have revealed that fibrin fibers are elastomeric despite their high degree of molecular ordering. These results have inspired a variety of molecular models for fibrin's elasticity, ranging from reversible protein unfolding to rubber-like elasticity. An important property that has not been explored is the timescale of elastic recoil, a parameter that is critical for fibrin's mechanical function and places a temporal constraint on molecular models of fiber elasticity. Using high-frame-rate imaging and atomic force microscopy-based nanomanipulation, we measured the recoil dynamics of individual fibrin fibers and found that the recoil was orders of magnitude faster than anticipated from models involving protein refolding. We also performed steered discrete molecular-dynamics simulations to investigate the molecular origins of the observed recoil. Our results point to the unstructured αC regions of the otherwise structured fibrin molecule as being responsible for the elastic recoil of the fibers.
Affiliation(s)
- Nathan E Hudson
- Immune Disease Institute, Children's Hospital Boston, Massachusetts, USA
|
33
|
Chintapalli SV, Illingworth CJR, Upton GJG, Sacquin-Mora S, Reeves PJ, Mohammedali HS, Reynolds CA. Assessing the effect of dynamics on the closed-loop protein-folding hypothesis. J R Soc Interface 2013; 11:20130935. [PMID: 24258160 PMCID: PMC3869168 DOI: 10.1098/rsif.2013.0935] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023] Open
Abstract
The closed-loop (loop-n-lock) hypothesis of protein folding suggests that loops of about 25 residues, closed through interactions between the loop ends (locks), play an important role in protein structure. Coarse-grain elastic network simulations, and examination of loop lengths in a diverse set of proteins, each support a bias towards loops of close to 25 residues in length between residues of high stability. Previous studies have established a correlation between total contact distance (TCD), a metric of sequence distances between contacting residues (cf. contact order), and the log-folding rate of a protein. In a set of 43 proteins, we identify an improved correlation (r² = 0.76) when the metric is restricted to residues contacting the locks, compared to the equivalent result when all residues are considered (r² = 0.65). This provides qualified support for the hypothesis, albeit with an increased emphasis upon the importance of a much larger set of residues surrounding the locks. Evidence of a similar-sized protein core/extended nucleus (with significant overlap) was obtained from TCD calculations in which residues were successively eliminated according to their hydrophobicity and connectivity, and from molecular dynamics simulations. Our results suggest that while folding is determined by a subset of residues that can be predicted by application of the closed-loop hypothesis, the original hypothesis is too simplistic; efficient protein folding is dependent on a considerably larger subset of residues than those involved in lock formation.
Affiliation(s)
- Sree V Chintapalli
- School of Biological Sciences, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
|
34
|
|
35
|
Braselmann E, Chaney JL, Clark PL. Folding the proteome. Trends Biochem Sci 2013; 38:337-44. [PMID: 23764454 DOI: 10.1016/j.tibs.2013.05.001] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Received: 12/21/2012] [Revised: 05/01/2013] [Accepted: 05/02/2013] [Indexed: 02/07/2023]
Abstract
Protein folding is an essential prerequisite for protein function and hence cell function. Kinetic and thermodynamic studies of small proteins that refold reversibly were essential for developing our current understanding of the fundamentals of protein folding mechanisms. However, we still lack sufficient understanding to accurately predict protein structures from sequences, or the effects of disease-causing mutations. To date, model proteins selected for folding studies represent only a small fraction of the complexity of the proteome and are unlikely to exhibit the breadth of folding mechanisms used in vivo. We are in urgent need of new methods, both theoretical and experimental, that can quantify the folding behavior of a truly broad set of proteins under in vivo conditions. Such a shift in focus will provide a more comprehensive framework from which to understand the connections between protein folding, the molecular basis of disease, and cell function and evolution.
Affiliation(s)
- Esther Braselmann
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556 USA
|
36
|
N-terminal domains in two-domain proteins are biased to be shorter and predicted to fold faster than their C-terminal counterparts. Cell Rep 2013; 3:1051-6. [PMID: 23602567 DOI: 10.1016/j.celrep.2013.03.032] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/22/2012] [Revised: 03/08/2013] [Accepted: 03/20/2013] [Indexed: 11/22/2022] Open
Abstract
Computational analysis of proteomes in all kingdoms of life reveals a strong tendency for N-terminal domains in two-domain proteins to have shorter sequences than their neighboring C-terminal domains. Given that folding rates are affected by chain length, we asked whether the tendency for N-terminal domains to be shorter than their neighboring C-terminal domains reflects selection for faster-folding N-terminal domains. Calculations of absolute contact order, another predictor of folding rate, provide additional evidence that N-terminal domains tend to fold faster than their neighboring C-terminal domains. A possible explanation for this bias, which is more pronounced in prokaryotes than in eukaryotes, is that faster folding of N-terminal domains reduces the risk for protein aggregation during folding by preventing formation of nonnative interdomain interactions. This explanation is supported by our finding that two-domain proteins with a shorter N-terminal domain are much more abundant than those with a shorter C-terminal domain.
|
37
|
Mohazab AR, Plotkin SS. Polymer uncrossing and knotting in protein folding, and their role in minimal folding pathways. PLoS One 2013; 8:e53642. [PMID: 23365638 PMCID: PMC3554774 DOI: 10.1371/journal.pone.0053642] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 08/22/2012] [Accepted: 11/30/2012] [Indexed: 11/19/2022] Open
Abstract
We introduce a method for calculating the extent to which chain non-crossing is important in the most efficient, optimal trajectories or pathways for a protein to fold. This involves recording all unphysical crossing events of a ghost chain, and calculating the minimal uncrossing cost that would have been required to avoid such events. A depth-first tree search algorithm is applied to find minimal transformations to fold [Formula: see text], [Formula: see text], [Formula: see text], and knotted proteins. In all cases, the extra uncrossing/non-crossing distance is a small fraction of the total distance travelled by a ghost chain. Different structural classes may be distinguished by the amount of extra uncrossing distance, and the effectiveness of such discrimination is compared with other order parameters. It was seen that non-crossing distance over chain length provided the best discrimination between structural and kinetic classes. The scaling of non-crossing distance with chain length implies an inevitable crossover to entanglement-dominated folding mechanisms for sufficiently long chains. We further quantify the minimal folding pathways by collecting the sequence of uncrossing moves, which generally involve leg, loop, and elbow-like uncrossing moves, and rendering the collection of these moves over the unfolded ensemble as a multiple-transformation "alignment". The consensus minimal pathway is constructed and shown schematically for representative cases of an [Formula: see text], [Formula: see text], and knotted protein. An overlap parameter is defined between pathways; we find that [Formula: see text] proteins have minimal overlap indicating diverse folding pathways, knotted proteins are highly constrained to follow a dominant pathway, and [Formula: see text] proteins are somewhere in between. Thus we have shown how topological chain constraints can induce dominant pathway mechanisms in protein folding.
Affiliation(s)
- Ali R. Mohazab
- Department of Physics and Astronomy, University of British Columbia, Vancouver, B.C., Canada
- Steven S. Plotkin
- Department of Physics and Astronomy, University of British Columbia, Vancouver, B.C., Canada
|
38
|
Huang S, Huang JT. Inter-residue interaction is a determinant of protein folding kinetics. J Theor Biol 2013; 317:224-8. [DOI: 10.1016/j.jtbi.2012.10.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/27/2012] [Revised: 09/17/2012] [Accepted: 10/02/2012] [Indexed: 11/30/2022]
|
39
|
Wang J, Oliveira RJ, Chu X, Whitford PC, Chahine J, Han W, Wang E, Onuchic JN, Leite VB. Topography of funneled landscapes determines the thermodynamics and kinetics of protein folding. Proc Natl Acad Sci U S A 2012; 109:15763-8. [PMID: 23019359 PMCID: PMC3465441 DOI: 10.1073/pnas.1212842109] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Indexed: 11/18/2022] Open
Abstract
The energy landscape approach has played a fundamental role in advancing our understanding of protein folding. Here, we quantify protein folding energy landscapes by exploring the underlying density of states. We identify three quantities essential for characterizing landscape topography: the stabilizing energy gap between the native and nonnative ensembles δE, the energetic roughness ΔE, and the scale of landscape measured by the entropy S. We show that the dimensionless ratio between the gap, roughness, and entropy of the system Λ=δE/(ΔE√(2S)) accurately predicts the thermodynamics, as well as the kinetics of folding. Large Λ implies that the energy gap (or landscape slope towards the native state) is dominant, leading to more funneled landscapes. We investigate the role of topological and energetic roughness for proteins of different sizes and for proteins of the same size, but with different structural topologies. The landscape topography ratio Λ is shown to be monotonically correlated with the thermodynamic stability against trapping, as characterized by the ratio of folding temperature versus trapping temperature. Furthermore, Λ also monotonically correlates with the folding kinetic rates. These results provide the quantitative bridge between the landscape topography and experimental folding measurements.
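As a minimal illustration of the ratio stated in this abstract, the Python sketch below evaluates Λ = δE/(ΔE√(2S)) for given inputs. The function name, argument names, and example numbers are hypothetical and are not taken from the paper; the sketch also assumes that the gap and roughness share the same energy units and that the entropy is expressed in units of k_B, so that the ratio is dimensionless.

```python
import math


def landscape_topography_ratio(energy_gap, roughness, entropy):
    """Return Lambda = dE / (DeltaE * sqrt(2 * S)).

    energy_gap : stabilizing gap between native and nonnative ensembles (dE)
    roughness  : energetic roughness of the landscape (DeltaE), same units as dE
    entropy    : landscape scale S, assumed here to be in units of k_B
    """
    if roughness <= 0 or entropy <= 0:
        raise ValueError("roughness and entropy must be positive")
    return energy_gap / (roughness * math.sqrt(2.0 * entropy))


# Illustrative numbers only (not from the paper): gap 10, roughness 1, S = 50.
print(landscape_topography_ratio(10.0, 1.0, 50.0))  # 1.0
```

For fixed roughness and entropy, a larger gap gives a larger Λ, which the abstract associates with a more funneled landscape.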
Affiliation(s)
- Jin Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130012, China
- College of Physics and State Key Laboratory of Superhard Materials, Jilin University, Changchun, Jilin 130021, China
- Department of Chemistry, Physics and Applied Mathematics, State University of New York at Stony Brook, Stony Brook, NY 11794-3400
- Ronaldo J. Oliveira
- Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas, Universidade Estadual Paulista, 15054-000 São José do Rio Preto, Brazil
- Laboratório Nacional de Ciência e Tecnologia do Bioetanol, Centro Nacional de Pesquisa em Energia e Materiais, 13083-970 Campinas, SP, Brazil
- Xiakun Chu
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130012, China
- College of Physics and State Key Laboratory of Superhard Materials, Jilin University, Changchun, Jilin 130021, China
- Paul C. Whitford
- Center for Theoretical Biological Physics, Rice University, 6100 Main, Houston, TX 77005-1827
- Jorge Chahine
- Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas, Universidade Estadual Paulista, 15054-000 São José do Rio Preto, Brazil
- Wei Han
- College of Physics and State Key Laboratory of Superhard Materials, Jilin University, Changchun, Jilin 130021, China
- Erkang Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130012, China
- José N. Onuchic
- Center for Theoretical Biological Physics, Rice University, 6100 Main, Houston, TX 77005-1827
- Vitor B.P. Leite
- Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas, Universidade Estadual Paulista, 15054-000 São José do Rio Preto, Brazil
|
40
|
Galzitskaya OV, Glyakina AV. Nucleation-based prediction of the protein folding rate and its correlation with the folding nucleus size. Proteins 2012; 80:2711-27. [DOI: 10.1002/prot.24156] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 02/13/2012] [Revised: 07/19/2012] [Accepted: 07/21/2012] [Indexed: 11/08/2022]
|
41
|
Huang JT, Xing DJ, Huang W. Choice of synonymous codons associated with protein folding. Proteins 2012; 80:2056-62. [DOI: 10.1002/prot.24096] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/14/2012] [Revised: 03/29/2012] [Accepted: 04/05/2012] [Indexed: 11/11/2022]
|
42
|
Braselmann E, Clark PL. Autotransporters: The Cellular Environment Reshapes a Folding Mechanism to Promote Protein Transport. J Phys Chem Lett 2012; 3:1063-1071. [PMID: 23687560 PMCID: PMC3654826 DOI: 10.1021/jz201654k] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 05/25/2023]
Abstract
We know very little about how the cellular environment affects protein folding mechanisms. Here, we focus on one unique aspect of that environment that is difficult to recapitulate in the test tube: the effect of a folding vector. When protein folding is initiated at one end of the polypeptide chain, folding starts from a much smaller ensemble of conformations than during refolding of a full-length polypeptide chain. But to what extent can vectorial folding affect protein folding kinetics and the conformations of folding intermediates? We focus on recent studies of autotransporter proteins, the largest class of virulence proteins from pathogenic Gram-negative bacteria. Autotransporter proteins are secreted across the bacterial inner membrane from N→C-terminus, which, like refolding in vitro, retards folding. But in contrast, upon C→N-terminal secretion across the outer membrane autotransporter folding proceeds orders of magnitude faster. The potential impact of vectorial folding on the folding mechanisms of other proteins is also discussed.
Affiliation(s)
- Patricia L. Clark
- To whom correspondence should be addressed: (574)631-8353 [phone], (574)631-6652 [fax]
|
43
|
Buck PM, Kumar S, Wang X, Agrawal NJ, Trout BL, Singh SK. Computational methods to predict therapeutic protein aggregation. Methods Mol Biol 2012; 899:425-451. [PMID: 22735968 DOI: 10.1007/978-1-61779-921-1_26] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Indexed: 06/01/2023]
Abstract
Protein based biotherapeutics have emerged as a successful class of pharmaceuticals. However, these macromolecules endure a variety of physicochemical degradations during manufacturing, shipping, and storage, which may adversely impact the drug product quality. Of these degradations, the irreversible self-association of therapeutic proteins to form aggregates is a major challenge in the formulation of these molecules. Tools to predict and mitigate protein aggregation are, therefore, of great interest to biopharmaceutical research and development. In this chapter, a number of such computational tools developed to understand and predict the various steps involved in protein aggregation are described. These tools can be grouped into three general classes: unfolding kinetics and native state thermal stability, colloidal stability, and sequence/structure based aggregation liabilities. Chapter sections introduce each class by discussing how these predictive tools provide insight into the molecular events leading to protein aggregation. The computational methods are then explained in detail along with their advantages and limitations.
Affiliation(s)
- Patrick M Buck
- Biotherapeutics Pharmaceutical Research and Development, Pfizer, Inc, St. Louis, MO, USA
|
44
|
Galzitskaya OV, Bogatyreva NS, Glyakina AV. Bacterial proteins fold faster than eukaryotic proteins with simple folding kinetics. Biochemistry (Mosc) 2011; 76:225-35. [PMID: 21568856 DOI: 10.1134/s000629791102009x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
Abstract
Protein domain frequency and distribution among kingdoms were statistically analyzed using the SCOP structural database. It appeared that among chosen protein domains with the best resolution, eukaryotic proteins more often belong to α-helical and β-structural proteins, while proteins of bacterial origin belong to the α/β structural class. Statistical analysis of folding rates of 73 proteins with known experimental data revealed that bacterial proteins with simple kinetics (23 proteins) exhibit a higher folding rate compared to eukaryotic proteins with simple folding kinetics (27 proteins). Analysis of protein domain amino acid composition showed that the frequency of amino acid residues in proteins of eukaryotic and bacterial origin is different for proteins with simple and complex folding kinetics.
Affiliation(s)
- O V Galzitskaya
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
|
45
|
Galzitskaya OV, Bogatyreva NS, Ivankov DN. Compactness determines protein folding type. J Bioinform Comput Biol 2011; 6:667-80. [DOI: 10.1142/s0219720008003618] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 10/13/2007] [Revised: 01/02/2008] [Accepted: 01/04/2008] [Indexed: 11/18/2022]
Abstract
We have demonstrated here that protein compactness, which we define as the ratio of the accessible surface area of a protein to that of the ideal sphere of the same volume, is one of the factors determining the mechanism of protein folding. Proteins with multi-state kinetics, on average, are more compact (compactness is 1.49 ± 0.02 for proteins within the size range of 101–151 amino acid residues) than proteins with two-state kinetics (compactness is 1.59 ± 0.03 for proteins within the same size range of 101–151 amino acid residues). We have shown that compactness for homologous proteins can explain both the difference in folding rates and the difference in folding mechanisms.
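The compactness measure defined in this abstract (accessible surface area divided by the surface area of an ideal sphere of the same volume) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the accessible surface area and molecular volume are assumed to come from an external structure-analysis tool in consistent units (for example Å² and Å³).

```python
import math


def sphere_area_from_volume(volume):
    """Surface area of an ideal sphere that has the given volume."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 4.0 * math.pi * radius ** 2


def compactness(accessible_surface_area, volume):
    """Ratio of accessible surface area to the area of the equal-volume sphere.

    A perfect sphere gives a ratio of about 1; larger values mean more exposed
    surface per unit volume, i.e. a less compact shape.
    """
    return accessible_surface_area / sphere_area_from_volume(volume)


# Sanity check on a unit sphere (area 4*pi, volume 4*pi/3): ratio close to 1.
unit_area = 4.0 * math.pi
unit_volume = 4.0 * math.pi / 3.0
print(math.isclose(compactness(unit_area, unit_volume), 1.0))
```

Under this definition a lower value indicates a more compact protein, which matches the abstract's statement that multi-state folders (1.49) are more compact than two-state folders (1.59).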
Affiliation(s)
- Oxana V. Galzitskaya
- Institute of Protein Research, Russian Academy of Sciences, Institutskaya Str. 4, Pushchino, Moscow Region 142290, Russia
- Natalya S. Bogatyreva
- Institute of Protein Research, Russian Academy of Sciences, Institutskaya Str. 4, Pushchino, Moscow Region 142290, Russia
- Dmitry N. Ivankov
- Institute of Protein Research, Russian Academy of Sciences, Institutskaya Str. 4, Pushchino, Moscow Region 142290, Russia
|
46
|
Abstract
What are the physical limits to cell behavior? Often, the physical limitations can be dominated by the proteome, the cell's complement of proteins. We combine known protein sizes, stabilities, and rates of folding and diffusion, with the known protein-length distributions P(N) of proteomes (Escherichia coli, yeast, and worm), to formulate distributions and scaling relationships in order to address questions of cell physics. Why do mesophilic cells die around 50 °C? How can the maximal growth-rate temperature (around 37 °C) occur so close to the cell-death temperature? The model shows that the cell's death temperature coincides with a denaturation catastrophe of its proteome. The reason cells can function so well just a few degrees below their death temperature is that proteome denaturation is so cooperative. Why are cells so dense-packed with protein molecules (about 20% by volume)? Cells are packed at a density that maximizes biochemical reaction rates. At lower densities, proteins collide too rarely. At higher densities, proteins diffuse too slowly through the crowded cell. What limits cell sizes and growth rates? Cell growth is limited by rates of protein synthesis, by the folding rates of its slowest proteins, and, for large cells, by the rates of its protein diffusion. Useful insights into cell physics may be obtainable from scaling laws that encapsulate information from protein knowledge bases.
|
47
|
Puorger C, Vetsch M, Wider G, Glockshuber R. Structure, Folding and Stability of FimA, the Main Structural Subunit of Type 1 Pili from Uropathogenic Escherichia coli Strains. J Mol Biol 2011; 412:520-35. [DOI: 10.1016/j.jmb.2011.07.044] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 05/04/2011] [Revised: 07/16/2011] [Accepted: 07/20/2011] [Indexed: 11/26/2022]
|
48
|
Guo J, Rao N. Predicting protein folding rate from amino acid sequence. J Bioinform Comput Biol 2011; 9:1-13. [PMID: 21328704 DOI: 10.1142/s0219720011005306] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/27/2010] [Revised: 10/19/2010] [Accepted: 10/19/2010] [Indexed: 11/18/2022]
Abstract
Predicting protein folding rate from amino acid sequence is an important challenge in computational and molecular biology. Over the past few years, many methods have been developed to reflect the correlation between the folding rates and protein structures and sequences. In this paper, we present an effective method, a combined neural network and genetic algorithm approach, to predict protein folding rates only from amino acid sequences, without any explicit structural information. The originality of this paper is that, for the first time, it tackles the effect of sequence order. The proposed method provides a good correlation between the predicted and experimental folding rates. The correlation coefficient is 0.80 and the standard error is 2.65 for 93 proteins, the largest such database of proteins yet studied, when evaluated with the leave-one-out jackknife test. The comparative results demonstrate that this correlation is better than that of most other methods, and suggest the important contribution of sequence order information to the determination of protein folding rates.
Affiliation(s)
- Jianxiu Guo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, P. R. China
|
49
|
Zhang Y, Luo L. The dynamical contact order: protein folding rate parameters based on quantum conformational transitions. Sci China Life Sci 2011; 54:386-92. [PMID: 21509661 DOI: 10.1007/s11427-011-4158-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 05/17/2010] [Accepted: 08/09/2010] [Indexed: 11/25/2022]
Abstract
Protein folding is regarded as a quantum transition between the torsion states of a polypeptide chain. According to the quantum theory of conformational dynamics, we propose the dynamical contact order (DCO) defined as a characteristic of the contact described by the moment of inertia and the torsion potential energy of the polypeptide chain between contact residues. Consequently, the protein folding rate can be quantitatively studied from the point of view of dynamics. By comparing theoretical calculations and experimental data on the folding rate of 80 proteins, we successfully validate the view that protein folding is a quantum conformational transition. We conclude that (i) a correlation between the protein folding rate and the contact inertial moment exists; (ii) multi-state protein folding can be regarded as a quantum conformational transition similar to that of two-state proteins but with an intermediate delay. We have estimated the order of magnitude of the time delay; (iii) folding can be classified into two types, exergonic and endergonic. Most of the two-state proteins with higher folding rate are exergonic and most of the multi-state proteins with low folding rate are endergonic. The folding speed limit is determined by exergonic folding.
Affiliation(s)
- Ying Zhang
- Laboratory of Theoretical Biophysics, Faculty of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
|
50
|
Guo J, Rao N, Liu G, Yang Y, Wang G. Predicting protein folding rates using the concept of Chou's pseudo amino acid composition. J Comput Chem 2011; 32:1612-7. [PMID: 21328402 DOI: 10.1002/jcc.21740] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 04/22/2010] [Revised: 11/04/2010] [Accepted: 12/02/2010] [Indexed: 12/12/2022]
Abstract
One of the most important challenges in computational and molecular biology is to understand the relationship between amino acid sequences and the folding rates of proteins. Recent work suggests that topological parameters, amino acid properties, chain length and the composition index relate well to protein folding rates; however, sequence order information has seldom been considered as a property for predicting protein folding rates. In this study, amino acid sequence order was used to derive an effective method, based on an extended version of the pseudo-amino acid composition, for predicting protein folding rates without any explicit structural information. Using the jackknife cross-validation test, the method was demonstrated on the largest dataset (99 proteins) reported. The method was found to provide a good correlation between the predicted and experimental folding rates. The correlation coefficient is 0.81 (highly significant) and the standard error is 2.46. The reported algorithm was found to perform better than several representative sequence-based approaches on the same dataset. The results indicate that sequence order information is an important determinant of protein folding rates.
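The jackknife evaluation described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: a one-feature least-squares line stands in for the pseudo-amino acid composition model, the data are invented toy values, and the "standard error" is assumed here to be the root-mean-square deviation between predicted and observed values. All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b for a single feature."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def jackknife_predictions(xs, ys):
    """Leave-one-out: predict each sample from a model fit on all the others."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a * xs[i] + b)
    return preds


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu = sum(u) / n
    mv = sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def rms_error(pred, obs):
    """Root-mean-square deviation (assumed definition of the standard error)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Invented toy data: one sequence-derived feature and a folding-rate-like value.
toy_feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_rate = [-1.8, -4.1, -5.9, -8.2, -9.7, -12.1]

toy_pred = jackknife_predictions(toy_feature, toy_rate)
toy_r = pearson_r(toy_pred, toy_rate)
toy_err = rms_error(toy_pred, toy_rate)
```

Because every prediction comes from a model that never saw the held-out protein, the resulting correlation and error reflect out-of-sample performance rather than goodness of fit.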
Affiliation(s)
- Jianxiu Guo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China
|