1
Banerjee A, Saha S, Tvedt NC, Yang LW, Bahar I. Mutually beneficial confluence of structure-based modeling of protein dynamics and machine learning methods. Curr Opin Struct Biol 2023; 78:102517. [PMID: 36587424 PMCID: PMC10038760 DOI: 10.1016/j.sbi.2022.102517] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/18/2022] [Revised: 11/19/2022] [Accepted: 11/22/2022] [Indexed: 12/31/2022]
Abstract
Proteins sample an ensemble of conformers under physiological conditions, having access to a spectrum of modes of motion, also called intrinsic dynamics. These motions ensure adaptation to various interactions in the cell, and largely assist in, if not determine, viable mechanisms of biological function. In recent years, machine learning (ML) frameworks have proven uniquely useful in structural biology, and recent studies further provide evidence for the utility and/or necessity of considering intrinsic dynamics to increase their predictive ability. Efficient quantification of dynamics-based attributes by recently developed physics-based theories and models, such as elastic network models, provides a unique opportunity to generate data on dynamics for training ML models towards inferring mechanisms of protein function, assessing pathogenicity, or estimating binding affinities.
Affiliation(s)
- Anupam Banerjee
  - Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
- Satyaki Saha
  - Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
- Nathan C Tvedt
  - Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA; Computational and Applied Mathematics and Statistics, The College of William and Mary, Williamsburg, VA 23185, USA
- Lee-Wei Yang
  - Institute of Bioinformatics and Structural Biology, and PhD Program in Biomedical Artificial Intelligence, National Tsing Hua University, Hsinchu 300044, Taiwan; Physics Division, National Center for Theoretical Sciences, Taipei 106319, Taiwan
- Ivet Bahar
  - Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA.

2
Stachiv I, Kuo CY, Li W. Protein adsorption by nanomechanical mass spectrometry: Beyond the real-time molecular weighting. Front Mol Biosci 2023; 9:1058441. [PMID: 36685281 PMCID: PMC9849248 DOI: 10.3389/fmolb.2022.1058441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 12/14/2022] [Indexed: 01/06/2023] Open
Abstract
Over the past decades, enormous progress has been made in understanding the mechanisms of intermolecular interactions between proteins and surfaces at the single-molecule level. These advances were made possible only by the ongoing development of highly sophisticated experimental methods such as atomic force microscopy, optical microscopy, surface plasmon resonance, ellipsometry, quartz crystal microbalance, conventional mass spectrometry, and, more recently, nanomechanical systems. Here, we highlight the main findings of recent studies on label-free single-molecule (protein) detection by nanomechanical systems, including those focusing on protein adsorption on various substrate surfaces. Since nanomechanical techniques are capable of detecting and manipulating proteins even at the single-molecule level, they are expected to open a new way of studying the dynamics of protein functions. It is noteworthy that, in contrast to other experimental methods, where only a given protein property such as molecular weight or protein stiffness can be determined, nanomechanical systems enable real-time measurement of multiple protein properties (e.g., mass, stiffness, and/or generated surface stress), making them suitable for the study of protein adsorption mechanisms. We also discuss possible future trends in label-free detection and analysis of the dynamics of protein complexes with these nanomechanical systems.
Affiliation(s)
- Ivo Stachiv
  - Department of Functional Materials, Institute of Physics, Czech Academy of Sciences, Prague, Czechia
  - *Correspondence: Ivo Stachiv
- Chih-Yun Kuo
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine and General University Hospital in Prague, Charles University, Prague, Czechia
- Wei Li
  - Department of Functional Materials, Institute of Physics, Czech Academy of Sciences, Prague, Czechia

3
Pacini L, Lesieur C. GCAT: A network model of mutational influences between amino acid positions in PSD95pdz3. Front Mol Biosci 2022; 9:1035248. [PMID: 36387271 PMCID: PMC9659846 DOI: 10.3389/fmolb.2022.1035248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 10/13/2022] [Indexed: 12/05/2022] Open
Abstract
Proteins have existed for more than 3 billion years: proof of a sustainable design. They have mechanisms for coping with internal perturbations (e.g., amino acid mutations), which tie genetic backgrounds to diseases or drug therapy failure. One difficulty in grasping these mechanisms is the asymmetry of amino acid mutational impact: a mutation at position i in the sequence that impacts position j does not imply that a mutation at position j impacts position i. Thus, to distinguish the influence of the mutation of i on j from the influence of the mutation of j on i, position mutational influences must be represented with directions. Using the X-ray structure of the third PDZ domain of PSD-95 (Protein Data Bank 1BE9) and in silico mutations, we build a directed network called GCAT that models position mutational influences. In the GCAT, a position is a node with edges that leave the node (out-edges) for the influences of the mutation of the position on other positions, and edges that enter the node (in-edges) for the influences of the mutation of other positions on the position. 1BE9 positions split into four influence categories called G, C, A and T, going from positions that on average influence fewer other positions and are influenced by fewer other positions (category C) to positions that on average influence more other positions and are influenced by more other positions (category T). The four categories depict position neighborhoods in the protein structure with different tolerance to mutations.
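The directed representation described in this abstract can be illustrated with a small sketch. It is not the authors' GCAT construction: the edge list is made up, the function name `influence_categories` is hypothetical, and splitting positions at the mean out-degree and mean in-degree is an assumption. The abstract only defines the two extreme categories (C and T), so the two mixed quadrants are given neutral labels here rather than guessed G/A assignments.

```python
from statistics import mean

def influence_categories(edges, positions):
    """Label positions by how many positions they influence (out-edges)
    and how many positions influence them (in-edges).

    edges: iterable of (i, j) pairs meaning "mutating i influences j".
    The mean-based threshold is an illustrative assumption.
    """
    out_deg = {p: 0 for p in positions}
    in_deg = {p: 0 for p in positions}
    for i, j in set(edges):          # ignore duplicate edges
        out_deg[i] += 1
        in_deg[j] += 1

    mean_out = mean(out_deg.values())
    mean_in = mean(in_deg.values())

    labels = {}
    for p in positions:
        high_out = out_deg[p] > mean_out
        high_in = in_deg[p] > mean_in
        if high_out and high_in:
            labels[p] = "T"              # influences more, influenced by more
        elif not high_out and not high_in:
            labels[p] = "C"              # influences fewer, influenced by fewer
        elif high_out:
            labels[p] = "out-dominant"   # mixed quadrant, name is hypothetical
        else:
            labels[p] = "in-dominant"    # mixed quadrant, name is hypothetical
    return out_deg, in_deg, labels

# Toy, made-up influences: (4, 1) is present but (1, 4) is not,
# which is the asymmetry the abstract emphasizes.
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 1), (4, 1)]
out_deg, in_deg, labels = influence_categories(toy_edges, [1, 2, 3, 4])
```

Keeping separate in-degree and out-degree counts is what lets the influence of i on j be distinguished from the influence of j on i; an undirected graph would merge the two.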
Affiliation(s)
- Lorenza Pacini
  - University Lyon, CNRS, INSA Lyon, Ecole Centrale de Lyon, UMR5005, Université Claude Bernard Lyon 1, Villeurbanne, France
  - Institut Rhônalpin des Systèmes Complexes, IXXI-ENS-Lyon, Lyon, France
- Claire Lesieur
  - University Lyon, CNRS, INSA Lyon, Ecole Centrale de Lyon, UMR5005, Université Claude Bernard Lyon 1, Villeurbanne, France
  - Institut Rhônalpin des Systèmes Complexes, IXXI-ENS-Lyon, Lyon, France
  - *Correspondence: Claire Lesieur

4
Wingert B, Doruker P, Bahar I. Activation and Speciation Mechanisms in Class A GPCRs. J Mol Biol 2022; 434:167690. [PMID: 35728652 PMCID: PMC10129049 DOI: 10.1016/j.jmb.2022.167690] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/19/2022] [Revised: 06/10/2022] [Accepted: 06/13/2022] [Indexed: 01/03/2023]
Abstract
Accurate development of allosteric modulators of GPCRs requires a thorough assessment of their sequence, structure, and dynamics, toward gaining insights into the mechanisms of action shared by family members, as well as the dynamic features that distinguish subfamilies. Building on recent progress in the characterization of the signature dynamics of proteins, we analyzed here a dataset of 160 Class A GPCRs to determine their sequence similarities, structural landscape, and dynamic features across different species (human, bovine, mouse, squid, and rat), different activation states (active/inactive), and different subfamilies. The two dominant directions of variability across experimentally resolved structures, identified by principal component analysis of the dataset, shed light on cooperative mechanisms of activation, subfamily differentiation, and speciation of Class A GPCRs. The analysis reveals the functional significance of the conformational flexibilities of specific structural elements, including: the dominant role of the intracellular loop 3 (ICL3) together with the cytoplasmic ends of the adjoining helices TM5 and TM6 in enabling allosteric activation; the role of particular structural motifs at the extracellular loop 2 (ECL2) connecting TM4 and TM5 in binding ligands specific to different subfamilies; or even the differentiation of the N-terminal conformation across different species. Detailed analyses of the modes of motion accessible to the members of the dataset and their variations across members demonstrate how the active and inactive states of GPCRs obey distinct conformational dynamics. The collective fluctuations of the GPCRs are robustly defined in the active state, while the inactive conformers exhibit broad variance among members.
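The principal component analysis step mentioned in this abstract can be sketched generically as follows. This is a minimal illustration with NumPy, not the authors' pipeline: it assumes the structures have already been aligned to a common set of atoms and superposed, the function name `ensemble_pca` is hypothetical, and the toy ensemble is synthetic.

```python
import numpy as np

def ensemble_pca(coords):
    """PCA of a structural ensemble.

    coords: array of shape (n_structures, n_atoms, 3), assumed to be
    already superposed onto a common reference frame.
    Returns (modes, explained, projections): modes are the principal
    directions (rows), explained is the variance fraction per mode,
    projections are the coordinates of each structure along the modes.
    """
    coords = np.asarray(coords, dtype=float)
    n_structures = coords.shape[0]
    X = coords.reshape(n_structures, -1)   # flatten to (n_structures, 3N)
    X = X - X.mean(axis=0)                 # deviations from the mean structure
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    variance = s**2 / (n_structures - 1)   # eigenvalues of the covariance matrix
    explained = variance / variance.sum()
    projections = U * s                    # same as X @ Vt.T
    return Vt, explained, projections

# Synthetic ensemble: one base structure displaced along a single direction.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 3))
direction = rng.normal(size=(5, 3))
amplitudes = np.linspace(-1.0, 1.0, 8)
ensemble = np.array([base + a * direction for a in amplitudes])

modes, explained, projections = ensemble_pca(ensemble)
```

In this toy case all variability lies along one direction, so the first mode carries essentially all of the variance; for a real dataset the first two entries of `explained` indicate how much of the structural variability the two dominant directions capture.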
Affiliation(s)
- Bentley Wingert
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Pemra Doruker
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Ivet Bahar
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.

5
Kaynak BT, Krieger JM, Dudas B, Dahmani ZL, Costa MGS, Balog E, Scott AL, Doruker P, Perahia D, Bahar I. Sampling of Protein Conformational Space Using Hybrid Simulations: A Critical Assessment of Recent Methods. Front Mol Biosci 2022; 9:832847. [PMID: 35187088 PMCID: PMC8855042 DOI: 10.3389/fmolb.2022.832847] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/10/2021] [Accepted: 01/12/2022] [Indexed: 12/17/2022] Open
Abstract
Recent years have seen several hybrid simulation methods for exploring the conformational space of proteins and their complexes or assemblies. These methods often combine fast analytical approaches with computationally expensive full atomic molecular dynamics (MD) simulations with the goal of rapidly sampling large and cooperative conformational changes at full atomic resolution. We present here a systematic comparison of the utility and limits of four such hybrid methods that have been introduced in recent years: MD with excited normal modes (MDeNM), collective modes-driven MD (CoMD), and elastic network model (ENM)-based generation, clustering, and relaxation of conformations (ClustENM) as well as its updated version integrated with MD simulations (ClustENMD). We analyzed the predicted conformational spaces using each of these four hybrid methods, applied to four well-studied proteins, triosephosphate isomerase (TIM), 3-phosphoglycerate kinase (PGK), HIV-1 protease (PR) and HIV-1 reverse transcriptase (RT), which provide extensive ensembles of experimental structures for benchmarking and comparing the methods. We show that a rigorous multi-faceted comparison and multiple metrics are necessary to properly assess the differences between conformational ensembles and provide an optimal protocol for achieving good agreement with experimental data. While all four hybrid methods perform well in general, being especially useful as computationally efficient methods that retain atomic resolution, the systematic analysis of the same systems by these four hybrid methods highlights the strengths and limitations of the methods and provides guidance for parameters and protocols to be adopted in future studies.
Affiliation(s)
- Burak T. Kaynak
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- James M. Krieger
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Balint Dudas
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Laboratoire de Biologie et Pharmacologie Appliquée, Ecole Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
  - Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary
- Zakaria L. Dahmani
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Mauricio G. S. Costa
  - Programa de Computação Científica, Vice-Presidência de Educação, Informação e Comunicação, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
- Erika Balog
  - Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary
- Ana Ligia Scott
  - Laboratory of Bioinformatics and Computational Biology, Center of Mathematics, Computation and Cognition, Federal University of ABC-UFABC, Santo André, Brazil
- Pemra Doruker
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- David Perahia
  - Laboratoire de Biologie et Pharmacologie Appliquée, Ecole Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
- Ivet Bahar
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- *Correspondence: Ivet Bahar; David Perahia; Pemra Doruker

6
Pacini L, Dorantes-Gilardi R, Vuillon L, Lesieur C. Mapping Function from Dynamics: Future Challenges for Network-Based Models of Protein Structures. Front Mol Biosci 2021; 8:744646. [PMID: 34708077 PMCID: PMC8543124 DOI: 10.3389/fmolb.2021.744646] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2021] [Accepted: 08/19/2021] [Indexed: 11/25/2022] Open
Abstract
Proteins fulfill complex and diverse biological functions through the controlled atomic motions of their structures (functional dynamics). The protein composition is given by its amino-acid sequence, which was assumed to encode the function. However, the discovery of functional sequence variants proved that the functional encoding does not come down to the sequence; otherwise, a change in the sequence would mean a change of function. Likewise, the discovery that function is fulfilled by a set of structures and not by a unique structure showed that the functional encoding does not come down to the structure either. That leaves us with the possibility that a set of atomic motions, achievable by different sequences and different structures, encodes a specific function. Thanks to the exponential growth in annual depositions in the Protein Data Bank of three-dimensional protein structures at atomic resolution, network models using the Cartesian coordinates of the atoms of a protein structure as input have been used for over 20 years to investigate protein features. Combining networks with experimental measures or with Molecular Dynamics (MD) simulations, and using typical or ad hoc network measures, is well suited to deciphering the link between protein dynamics and function. One perspective is to consider static structures alone as an alternative way to address the question, and to find network measures relevant to dynamics that can subsequently be used for mining and classifying dynamic sequence changes as functionally robust, adaptable or faulty. In this way, the set of dynamics that fulfill a function over a diversity of sequences and structures will be determined.
Affiliation(s)
- Lorenza Pacini
  - Ecole Centrale de Lyon, Ampère, UMR5005, Univ. Lyon, CNRS, INSA Lyon, Université Claude Bernard Lyon 1, Villeurbanne, France
  - Institut Rhônalpin des Systèmes Complexes, IXXI-ENS-Lyon, Lyon, France
- Rodrigo Dorantes-Gilardi
  - Institut Rhônalpin des Systèmes Complexes, IXXI-ENS-Lyon, Lyon, France
  - USMB, CNRS, LAMA UMR5127, Le Bourget du Lac, France
- Claire Lesieur
  - Ecole Centrale de Lyon, Ampère, UMR5005, Univ. Lyon, CNRS, INSA Lyon, Université Claude Bernard Lyon 1, Villeurbanne, France
  - Institut Rhônalpin des Systèmes Complexes, IXXI-ENS-Lyon, Lyon, France

7
Echave J. Fast computational mutation-response scanning of proteins. PeerJ 2021; 9:e11330. [PMID: 33976988 PMCID: PMC8067912 DOI: 10.7717/peerj.11330] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/18/2020] [Accepted: 03/31/2021] [Indexed: 12/21/2022] Open
Abstract
Studying the effect of perturbations on protein structure is a basic approach in protein research. Important problems, such as predicting pathological mutations and understanding patterns of structural evolution, have been addressed by computational simulations that model mutations using forces and predict the resulting deformations. In single mutation-response scanning simulations, a sensitivity matrix is obtained by averaging deformations over point mutations. In double mutation-response scanning simulations, a compensation matrix is obtained by minimizing deformations over pairs of mutations. These very useful simulation-based methods may be too slow to deal with large proteins, protein complexes, or large protein databases. To address this issue, I derived analytical closed formulas to calculate the sensitivity and compensation matrices directly, without simulations. Here, I present these derivations and show that the resulting analytical methods are much faster than their simulation counterparts.
Affiliation(s)
- Julian Echave
  - Instituto de Ciencias Físicas, Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina

8
Zhang S, Gong W, Han Z, Liu Y, Li C. Insight into Shared Properties and Differential Dynamics and Specificity of Secretory Phospholipase A2 Family Members. J Phys Chem B 2021; 125:3353-3363. [PMID: 33780247 DOI: 10.1021/acs.jpcb.1c01315] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Understanding the generic mechanisms of function shared by the secretory phospholipase A2 (sPLA2) family, which is involved in lipid metabolism and cell signaling, and the molecular basis of the functional specificity of family members is an intriguing but challenging problem for biologists. Here, we explore the issue through extensive analyses using a combination of structure-based methods and bioinformatics tools on 130 sPLA2 family members. Principal component analysis of the structure ensemble reveals that the enzyme has an open-close motion that helps widen the substrate binding channel, facilitating its binding to phospholipid. Elastic network model and sequence analyses show that the residues critical for family functions, such as cysteine and catalytic residues, are highly conserved and undergo minimal movements, which is evolutionarily essential as their perturbation would impact the function, while the four residue regions involved in the association with the calcium ion/membrane are poorly conserved, highly mobile, and show large variations in low-to-intermediate frequency modes, which reflects the specificity of members. Perturbation response scanning analyses also reveal that these four regions, which are highly sensitive to an external perturbation, are member-specific, suggesting different roles in allosteric modulation, while the minimally sensitive residues are shared across family members and play an important role in maintaining structural stability as the folding core. This study is helpful for understanding how the sequences, structures, and dynamics of sPLA2 family members evolve to ensure their common and specific functions, and can provide a guide for the accurate design of proteins with finely tuned activities.
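The link between an elastic network model and per-residue mobility can be made concrete with a short sketch. The abstract does not say which elastic network variant was used, so this example assumes the Gaussian network model (GNM), in which residue fluctuations are proportional to the diagonal of the pseudoinverse of the Kirchhoff (connectivity) matrix. The function name `gnm_fluctuations`, the 7.0 Å cutoff, and the toy straight-chain coordinates are illustrative assumptions, not values from the study.

```python
import numpy as np

def gnm_fluctuations(ca_coords, cutoff=7.0):
    """Relative mean-square fluctuations from a Gaussian network model.

    ca_coords: (n_residues, 3) array of C-alpha coordinates in angstroms.
    cutoff: contact distance in angstroms (assumed value).
    Returns (msf, eigvals, eigvecs); msf is in arbitrary units.
    """
    coords = np.asarray(ca_coords, dtype=float)
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (dist <= cutoff) & ~np.eye(n, dtype=bool)

    # Kirchhoff matrix: -1 for residues in contact, contact count on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    nonzero = eigvals > 1e-8   # drop zero modes (one per connected component)
    pinv = (eigvecs[:, nonzero] / eigvals[nonzero]) @ eigvecs[:, nonzero].T
    return np.diag(pinv), eigvals, eigvecs

# Toy input: ten pseudo-residues on a straight line, 3.8 angstroms apart,
# so each one only contacts its immediate neighbours.
toy_coords = np.column_stack([3.8 * np.arange(10), np.zeros(10), np.zeros(10)])
msf, eigvals, eigvecs = gnm_fluctuations(toy_coords)
```

In this toy chain the two ends are the most mobile and the middle the least, which mirrors the qualitative statement in the abstract: well-connected positions move little, while loosely connected regions show high mobility.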
Affiliation(s)
- Shan Zhang
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Zhongjie Han
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Yang Liu
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China