1
Jilani M, Turcan A, Haspel N, Jagodzinski F. Elucidating the Structural Impacts of Protein InDels. Biomolecules 2022; 12:1435. [PMID: 36291643 PMCID: PMC9599607 DOI: 10.3390/biom12101435] [Received: 08/12/2022] [Revised: 09/23/2022] [Accepted: 09/27/2022] [Indexed: 09/17/2023]
Abstract
The effects of amino acid insertions and deletions (InDels) remain a rather under-explored area of structural biology. These variations are often the cause of numerous disease phenotypes. In spite of this, research on InDels and their structural significance remains limited, primarily due to a lack of experimental information and computational methods. In this work, we fill this gap by modeling InDels computationally; we investigate the rigidity differences between the wildtype and a mutant variant with one or more InDels. Further, we compare how structural effects due to InDels differ from the effects of amino acid substitutions, which are another type of amino acid mutation. We finish by performing a correlation analysis between our rigidity-based metrics and wet lab data to assess their ability to infer the effects of InDels on protein fitness.
Affiliation(s)
- Muneeba Jilani
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
- Alistair Turcan
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA
- Nurit Haspel
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
- Filip Jagodzinski
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA

2
Valgardson J, Cosbey R, Houser P, Rupp M, Van Bronkhorst R, Lee M, Jagodzinski F, Amacher JF. MotifAnalyzer-PDZ: A computational program to investigate the evolution of PDZ-binding target specificity. Protein Sci 2019; 28:2127-2143. [PMID: 31599029 DOI: 10.1002/pro.3741] [Received: 08/20/2019] [Revised: 09/27/2019] [Accepted: 09/30/2019] [Indexed: 12/15/2022]
Abstract
Recognition of short linear motifs (SLiMs) or peptides by proteins is an important component of many cellular processes. However, due to limited and degenerate binding motifs, prediction of cellular targets is challenging. In addition, many of these interactions are transient and of relatively low affinity. Here, we focus on one of the largest families of SLiM-binding domains in the human proteome, the PDZ domain. These domains bind the extreme C-terminus of target proteins, and are involved in many signaling and trafficking pathways. To predict endogenous targets of PDZ domains, we developed MotifAnalyzer-PDZ, a program that filters and compares all motif-satisfying sequences in any publicly available proteome. This approach enables us to determine possible PDZ binding targets in humans and other organisms. Using this program, we predicted and biochemically tested novel human PDZ targets by looking for strong sequence conservation in evolution. We also identified three C-terminal sequences in choanoflagellates that bind a choanoflagellate PDZ domain, the Monosiga brevicollis SHANK1 PDZ domain (mbSHANK1), with endogenously-relevant affinities, despite a lack of conservation with the targets of a homologous human PDZ domain, SHANK1. All three are predicted to be signaling proteins, with strong sequence homology to cytosolic and receptor tyrosine kinases. Finally, we analyzed and compared the positional amino acid enrichments in PDZ motif-satisfying sequences from over a dozen organisms. Overall, MotifAnalyzer-PDZ is a versatile program to investigate potential PDZ interactions. This proof-of-concept work is poised to enable similar types of analyses for other SLiM-binding domains (e.g., MotifAnalyzer-Kinase). MotifAnalyzer-PDZ is available at http://motifAnalyzerPDZ.cs.wwu.edu.
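The idea of filtering a proteome for motif-satisfying C-terminal sequences can be illustrated with a small sketch. This is not the MotifAnalyzer-PDZ implementation: the function name `c_terminal_matches`, the toy sequences, and the example pattern (serine or threonine, any residue, then one of V/I/L/F at the extreme C-terminus) are all assumptions chosen for illustration only.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical, illustrative pattern (an assumption, not taken from the paper):
# S or T, then any residue, then V/I/L/F as the final residue of the protein.
EXAMPLE_MOTIF = re.compile(r"[ST].[VILF]$")


def c_terminal_matches(
    proteome: Dict[str, str], motif: re.Pattern, tail_length: int = 6
) -> List[Tuple[str, str]]:
    """Return (name, C-terminal tail) for every sequence whose tail satisfies the motif."""
    hits = []
    for name, sequence in proteome.items():
        tail = sequence[-tail_length:]  # the extreme C-terminus
        if motif.search(tail):
            hits.append((name, tail))
    return hits


# Toy "proteome" with made-up sequences.
toy_proteome = {
    "protA": "MKTAYIAKQRETSV",
    "protB": "MSTNPKPQRKTKRN",
    "protC": "MGLSDGEWQLVTDL",
}
hits = c_terminal_matches(toy_proteome, EXAMPLE_MOTIF)
```

Anchoring the pattern with `$` restricts matches to the end of the sequence, which mirrors the statement that PDZ domains bind the extreme C-terminus of their targets.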
Affiliation(s)
- Jordan Valgardson
  - Department of Computer Science, Western Washington University, Bellingham, Washington
  - Department of Chemistry, Western Washington University, Bellingham, Washington
- Robin Cosbey
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Paul Houser
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Milo Rupp
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Raiden Van Bronkhorst
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Michael Lee
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Filip Jagodzinski
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Jeanine F Amacher
  - Department of Chemistry, Western Washington University, Bellingham, Washington

3
Olney R, Tuor A, Jagodzinski F, Hutchinson B. A systematic exploration of ΔΔG cutoff ranges in machine learning models for protein mutation stability prediction. J Bioinform Comput Biol 2018; 16:1840022. [DOI: 10.1142/s021972001840022x] [Indexed: 11/18/2022]
Abstract
Discerning how a mutation affects the stability of a protein is central to the study of a wide range of diseases. Mutagenesis experiments on physical proteins provide precise insights about the effects of amino acid substitutions, but such studies are time and cost prohibitive. Computational approaches for informing experimentalists where to allocate wet-lab resources are available, including a variety of machine learning models. Assessing the accuracy of machine learning models for predicting the effects of mutations is dependent on experiments for amino acid substitutions performed in vitro. When similar experiments on physical proteins have been performed by multiple laboratories, the use of the data near the juncture of stabilizing and destabilizing mutations is questionable. In this work, we explore a systematic and principled alternative to discarding experimental data close to the juncture of stabilizing and destabilizing mutations. We model the inconclusive range of experimental ΔΔG values via 3- and 5-way classifiers, and systematically explore potential boundaries for the range of inconclusive experimental values. We demonstrate the effectiveness of potential boundaries through confusion matrices and heat map visualizations. We explore two novel metrics for assessing viable cutoff ranges, and find that under these metrics, a lower cutoff near [Formula: see text] and an upper cutoff near [Formula: see text] are optimal across multiple machine learning models.
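A 3-way labeling with a lower and an upper cutoff can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the cutoff values of -0.5 and 0.5 are placeholders (the actual values are not given in this listing), and the class names are deliberately neutral because the sign convention for ΔΔG is not stated here.

```python
from collections import Counter
from typing import Iterable, Tuple


def three_way_label(ddg: float, lower: float, upper: float) -> str:
    """Map a ΔΔG value to 'below', 'inconclusive', or 'above' given two cutoffs."""
    if lower >= upper:
        raise ValueError("lower cutoff must be smaller than upper cutoff")
    if ddg < lower:
        return "below"
    if ddg > upper:
        return "above"
    return "inconclusive"  # values on either boundary count as inconclusive


def confusion_counts(
    experimental: Iterable[float],
    predicted: Iterable[float],
    lower: float,
    upper: float,
) -> Counter:
    """Count (experimental class, predicted class) pairs, i.e. confusion matrix cells."""
    pairs: Iterable[Tuple[str, str]] = (
        (three_way_label(e, lower, upper), three_way_label(p, lower, upper))
        for e, p in zip(experimental, predicted)
    )
    return Counter(pairs)


# Made-up values and placeholder cutoffs, for illustration only.
cells = confusion_counts([-1.2, 0.1, 0.9], [-0.8, 0.7, 1.1], lower=-0.5, upper=0.5)
```

Sweeping `lower` and `upper` over a grid and recomputing the counts for each pair is one way to explore candidate boundaries in the spirit of the abstract.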
Affiliation(s)
- Aaron Tuor
  - Pacific Northwest National Laboratory, Seattle, WA, USA
- Brian Hutchinson
  - Western Washington University, Bellingham, WA, USA
  - Pacific Northwest National Laboratory, Seattle, WA, USA

4
Abstract
The geometry of cavities in the surfaces of proteins facilitates a variety of biochemical functions. To better understand the biochemical nature of protein cavities, the shape, size, chemical properties, and evolutionary nature of functional and nonfunctional surface cavities have been exhaustively surveyed in protein structures. The rigidity of surface cavities, however, is not immediately available as a characteristic of structure data, and is thus more difficult to examine. Using rigidity analysis for assessing and analyzing molecular rigidity, this paper performs the first survey of the relationships between cavity properties, such as size and residue content, and how they correspond to cavity rigidity. Our survey measured a variety of rigidity metrics on 120,323 cavities from 12,785 sequentially non-redundant protein chains. We used VASP-E, a volume-based algorithm for analyzing cavity geometry. Our results suggest that rigidity properties of protein cavities are dependent on cavity surface area.
Affiliation(s)
- Stephanie Mason
  - Department of Computer Science, Western Washington University, 516 High Street, Bellingham, WA 98225, USA.
- Brian Y Chen
  - Department of Computer Science and Engineering, Lehigh University, 19 Memorial Drive West, Bethlehem, PA 18015, USA.
- Filip Jagodzinski
  - Department of Computer Science, Western Washington University, 516 High Street, Bellingham, WA 98225, USA.

5
Dehghanpoor R, Ricks E, Hursh K, Gunderson S, Farhoodi R, Haspel N, Hutchinson B, Jagodzinski F. Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability. Molecules 2018; 23:251. [PMID: 29382060 PMCID: PMC6017198 DOI: 10.3390/molecules23020251] [Received: 12/24/2017] [Revised: 01/15/2018] [Accepted: 01/19/2018] [Indexed: 01/06/2023]
Abstract
Predicting how a point mutation alters a protein’s stability can guide pharmaceutical drug design initiatives which aim to counter the effects of serious diseases. Conducting mutagenesis studies in physical proteins can give insights about the effects of amino acid substitutions, but such wet-lab work is prohibitive due to the time as well as financial resources needed to assess the effect of even a single amino acid substitution. Computational methods for predicting the effects of a mutation on a protein structure can complement wet-lab work, and varying approaches are available with promising accuracy rates. In this work we compare and assess the utility of several machine learning methods and their ability to predict the effects of single and double mutations. We generate mutant protein structures in silico, and compute several rigidity metrics for each of them. We use these as features for our Support Vector Regression (SVR), Random Forest (RF), and Deep Neural Network (DNN) methods. We validate the predictions of our in silico mutations against experimental ΔΔG stability data, and attain Pearson Correlation values upwards of 0.71 for single mutations, and 0.81 for double mutations. We perform ablation studies to assess which features contribute most to a model’s success, and also introduce a voting scheme to synthesize a single prediction from the individual predictions of the three models.
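The evaluation described above can be illustrated with a short sketch: a Pearson correlation between predicted and experimental values, and a way to merge three per-model predictions into one. The abstract does not specify the voting rule, so the per-mutation median used here is an assumption standing in for it; the function names and numbers are hypothetical.

```python
import math
from statistics import median
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def combine_by_median(
    svr: Sequence[float], rf: Sequence[float], dnn: Sequence[float]
) -> List[float]:
    """Merge three models' predictions per mutation by taking the median.

    Assumption: the median is a stand-in, not the paper's voting scheme.
    """
    return [median(triple) for triple in zip(svr, rf, dnn)]


# Made-up predictions for two mutations.
combined = combine_by_median([1.0, 0.0], [2.0, 5.0], [3.0, -1.0])
```

A median is robust to one model producing an outlier, which is one plausible motivation for combining models; a weighted mean or majority vote over classes would be equally valid choices.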
Affiliation(s)
- Ramin Dehghanpoor
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA.
- Evan Ricks
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA.
- Katie Hursh
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA.
- Sarah Gunderson
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA.
- Roshanak Farhoodi
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA.
- Nurit Haspel
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA.
- Brian Hutchinson
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA.
  - Computing and Analytics Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA.
- Filip Jagodzinski
  - Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA.

6
Siderius M, Jagodzinski F. Mutation Sensitivity Maps: Identifying Residue Substitutions That Impact Protein Structure Via a Rigidity Analysis In Silico Mutation Approach. J Comput Biol 2017; 25:89-102. [PMID: 29035580 DOI: 10.1089/cmb.2017.0165] [Indexed: 01/23/2023]
Abstract
Understanding how an amino acid substitution affects a protein's structure can aid in the design of pharmaceutical drugs that aim at countering diseases caused by protein mutants. Unfortunately, performing even a few amino acid substitutions in vitro is both time and cost prohibitive, whereas an exhaustive analysis that involves systematically mutating all amino acids in the physical protein is infeasible. Computational methods have been developed to predict the effects of mutations, but many of them are computationally intensive or else depend on homology or experimental data that may not be available for the protein being studied. In this work, we motivate and present a computation pipeline whose only input is a Protein Data Bank file containing the 3D coordinates of the atoms of a biomolecule. Our high-throughput approach uses our ProMuteHT algorithm to exhaustively generate in silico amino acid substitutions at each residue, and it also includes an energy minimization option. This is in contrast to our previous work, where we analyzed the effects of in silico mutations to Alanine, Serine, and Glycine only. We exploit the speed of a fast rigidity analysis approach to analyze our protein variants, and develop a Mutation Sensitivity (MuSe) Map, to permit identifying residues that are most sensitive to mutations. We present a case study to show the degree to which a MuSe Map and whisker plots are able to locate amino acids whose mutations most affect a protein's structure as inferred from a rigidity analysis approach.
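The exhaustive enumeration behind a sensitivity map can be sketched at the sequence level. This is not ProMuteHT, which operates on 3D structures: the function names are hypothetical, and the `score` callback is a placeholder for whatever rigidity-based metric would be computed on each mutant structure.

```python
from typing import Callable, Dict, Iterator, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def enumerate_substitutions(sequence: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, wild-type residue, mutant residue) for every substitution."""
    for position, wild_type in enumerate(sequence):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield position, wild_type, mutant


def sensitivity_map(
    sequence: str, score: Callable[[int, str, str], float]
) -> Dict[Tuple[int, str], float]:
    """Fill a (position, mutant residue) -> score grid using a caller-supplied metric.

    Assumption: `score` stands in for a rigidity analysis of the mutant structure.
    """
    return {
        (position, mutant): score(position, wild_type, mutant)
        for position, wild_type, mutant in enumerate_substitutions(sequence)
    }


# Toy usage with a dummy metric that only flags mutations to glycine.
grid = sensitivity_map("ACD", lambda pos, wt, mut: 1.0 if mut == "G" else 0.0)
```

For a sequence of standard residues this produces 19 entries per position, so the grid can be rendered as a positions-by-residues heat map.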
Affiliation(s)
- Michael Siderius
  - Department of Computer Science, Western Washington University, Bellingham, Washington
- Filip Jagodzinski
  - Department of Computer Science, Western Washington University, Bellingham, Washington

7
Akbal-Delibas B, Jagodzinski F, Haspel N. A conservation and rigidity based method for detecting critical protein residues. BMC Struct Biol 2013; 13 Suppl 1:S6. [PMID: 24565061 PMCID: PMC3952096 DOI: 10.1186/1472-6807-13-s1-s6] [Indexed: 11/21/2022]
Abstract
Background: Certain amino acids in proteins play a critical role in determining their structural stability and function. Examples include flexible regions such as hinges which allow domain motion, and highly conserved residues on functional interfaces which allow interactions with other proteins. Detecting these regions can aid in the analysis and simulation of protein rigidity and conformational changes, and help characterize protein binding and docking. We present an analysis of critical residues in proteins using a combination of two complementary techniques. One method performs in silico mutations and analyzes the protein's rigidity to infer the role of a point substitution to Glycine or Alanine. The other method uses evolutionary conservation to find functional interfaces in proteins.
Results: We applied the two methods to a dataset of proteins, including biomolecules with experimentally known critical residues as determined by the free energy of unfolding. Our results show that the combination of the two methods can detect the vast majority of critical residues in tested proteins.
Conclusions: Our results show that the combination of the two methods has the potential to detect more information than each method separately. Future work will provide a confidence level for the criticalness of a residue to improve the accuracy of our method and eliminate false positives. Once the combined methods are integrated into one scoring function, it can be applied to other domains such as estimating functional interfaces.

8
Jagodzinski F, Clark P, Grant J, Liu T, Monastra S, Streinu I. Rigidity analysis of protein biological assemblies and periodic crystal structures. BMC Bioinformatics 2013; 14 Suppl 18:S2. [PMID: 24564201 PMCID: PMC3817814 DOI: 10.1186/1471-2105-14-s18-s2] [Indexed: 11/10/2022]
Abstract
Background: We initiate in silico rigidity-theoretical studies of biological assemblies and small crystals for protein structures. The goal is to determine if, and how, the interactions among neighboring cells and subchains affect the flexibility of a molecule in its crystallized state. We use experimental X-ray crystallography data from the Protein Data Bank (PDB). The analysis relies on an efficient graph-based algorithm. Computational experiments were performed using new protein rigidity analysis tools available in the new release of our KINARI-Web server http://kinari.cs.umass.edu.
Results: We provide two types of results: on biological assemblies and on crystals. We found that when only isolated subchains are considered, structural and functional information may be missed. Indeed, the rigidity of biological assemblies is sometimes dependent on the count and placement of hydrogen bonds and other interactions among the individual subchains of the biological unit. Similarly, the rigidity of small crystals may be affected by the interactions between atoms belonging to different unit cells. We have analyzed a dataset of approximately 300 proteins, from which we generated 982 crystals (some of which are biological assemblies). We identified two types of behaviors. (a) Some crystals and/or biological assemblies will aggregate into rigid bodies that span multiple unit cells/asymmetric units. Some of them create substantially larger rigid clusters in the crystal/biological assembly form, while in other cases, the aggregation has a smaller effect just at the interface between the units. (b) In other cases, the rigidity properties of the asymmetric units are retained, because the rigid bodies did not combine. We also identified two interesting cases where rigidity analysis may be correlated with the functional behavior of the protein. This type of information, identified here for the first time, depends critically on the ability to create crystals and biological assemblies, and would not have been observed only from the asymmetric unit. For the Ribonuclease A protein (PDB file 5RSA), which is functionally active in the crystallized form, we found that the individual protein and its crystal form retain the flexibility parameters between the two states. In contrast, a derivative of Ribonuclease A (PDB file 9RSA) has no functional activity, and the protein in both the asymmetric and crystalline forms is very rigid. For the vaccinia virus D13 scaffolding protein (PDB file 3SAQ), which has two biological assemblies, we observed a striking asymmetry in the rigidity cluster decomposition of one of them, which seems implausible, given its symmetry. Upon careful investigation, we tracked the cause to a placement decision by the Reduce software concerning the hydrogen atoms, thus affecting the distribution of certain hydrogen bonds. The surprising result is that the presence or absence of a very few, but critical, hydrogen bonds can drastically affect the rigid cluster decomposition of the biological assembly.
Conclusion: The rigidity analysis of a single asymmetric unit may not accurately reflect the protein's behavior in the tightly packed crystal environment. Using our KINARI software, we demonstrated that additional functional and rigidity information can be gained by analyzing a protein's biological assembly and/or crystal structure. However, performing a larger scale study would be computationally expensive (due to the size of the molecules involved). Overcoming this limitation will require novel mathematical and computational extensions to our software.

9
Abstract
Predicting the effect of a single amino acid substitution on the stability of a protein structure is a fundamental task in macromolecular modeling. It has relevance to drug design and understanding of disease-causing protein variants. We present KINARI-Mutagen, a web server for performing in silico mutation experiments on protein structures from the Protein Data Bank. Our rigidity-theoretical approach permits fast evaluation of the effects of mutations that may not be easy to perform in vitro, because it is not always possible to express a protein with a specific amino acid substitution. We use KINARI-Mutagen to identify critical residues, and we show that our predictions correlate with destabilizing mutations to glycine. In two in-depth case studies we show that the mutated residues identified by KINARI-Mutagen as critical correlate with experimental data, and would not have been identified by other methods such as Solvent Accessible Surface Area measurements or residue ranking by contributions to stabilizing interactions. We also generate 48 mutants for 14 proteins, and compare our rigidity-based results against experimental mutation stability data. KINARI-Mutagen is available at http://kinari.cs.umass.edu.
Affiliation(s)
- Filip Jagodzinski
  - Department of Computer Science, 140 Governors Drive, University of Massachusetts Amherst, Amherst, MA 01002, USA

10
Abstract
KINARI-Web is an interactive web server for performing rigidity analysis and visually exploring rigidity properties of proteins. It also provides tools for pre-processing the input data, such as selecting relevant chains from PDB files, adding hydrogen atoms and identifying stabilizing interactions. KINARI-Web offers a quick-start option for beginners, and highly customizable features for the experienced user. Chains, residues or atoms, as well as stabilizing constraints can be selected, removed or added, and the user can designate how different chemical interactions should be modeled during rigidity analysis. The enhanced Jmol-based visualizer allows for zooming in, highlighting or investigating different calculated rigidity properties of a molecular structure. KINARI-Web is freely available at http://kinari.cs.umass.edu.
Affiliation(s)
- Naomi Fox
  - Department of Computer Science, 140 Governors Drive, University of Massachusetts, Amherst, MA 01003, USA