1
Nikam R, Jemimah S, Gromiha MM. DeepPPAPredMut: deep ensemble method for predicting the binding affinity change in protein-protein complexes upon mutation. Bioinformatics 2024; 40:btae309. [PMID: 38718170 PMCID: PMC11112046 DOI: 10.1093/bioinformatics/btae309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 04/08/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024] Open
Abstract
MOTIVATION: Protein-protein interactions underpin many cellular processes and their disruption due to mutations can lead to diseases. With the evolution of protein structure prediction methods like AlphaFold2 and the availability of extensive experimental affinity data, there is a pressing need for updated computational tools that can efficiently predict changes in binding affinity caused by mutations in protein-protein complexes.
RESULTS: We developed a deep ensemble model that leverages protein sequences, predicted structure-based features, and protein functional classes to accurately predict the change in binding affinity due to mutations. The model achieved a correlation of 0.97 and a mean absolute error (MAE) of 0.35 kcal/mol on the training dataset, and maintained robust performance on the test set with a correlation of 0.72 and a MAE of 0.83 kcal/mol. Further validation using Leave-One-Out Complex (LOOC) cross-validation exhibited a correlation of 0.83 and a MAE of 0.51 kcal/mol, indicating consistent performance.
AVAILABILITY AND IMPLEMENTATION: https://web.iitm.ac.in/bioinfo2/DeepPPAPredMut/index.html.
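The correlation and MAE figures quoted in the abstract above are standard regression metrics. As a minimal sketch (not the authors' code; the function names and the ΔΔG values below are illustrative assumptions), they could be computed like this:

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs (e.g. kcal/mol)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical experimental vs. predicted values, for illustration only.
experimental = [0.5, 1.2, -0.3, 2.1, 0.0]
predicted = [0.7, 1.0, -0.1, 1.8, 0.4]
print("r   =", round(pearson_r(experimental, predicted), 3))
print("MAE =", round(mean_absolute_error(experimental, predicted), 3))
```

A high correlation with a large MAE would indicate predictions that rank mutations well but are offset or mis-scaled, which is why the two measures are reported together.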
Affiliation(s)
- Rahul Nikam
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Sherlyn Jemimah
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
  - Department of Biomedical Engineering, Khalifa University, P.O. Box: 127788, Abu Dhabi, United Arab Emirates
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
  - Department of Computer Science, Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan
2
Turina P, Fariselli P, Capriotti E. K-Pro: Kinetics Data on Proteins and Mutants. J Mol Biol 2023; 435:168245. [PMID: 37625584 DOI: 10.1016/j.jmb.2023.168245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 08/16/2023] [Accepted: 08/17/2023] [Indexed: 08/27/2023]
Abstract
The study of protein folding plays a crucial role in improving our understanding of protein function and of the relationship between genetics and phenotypes. In particular, understanding the thermodynamics and kinetics of the folding process is important for uncovering the mechanisms behind human disorders caused by protein misfolding. To address this issue, it is essential to collect and curate experimental kinetic and thermodynamic data on protein folding. K-Pro is a new database designed for collecting and storing experimental kinetic data on monomeric proteins, with a two-state folding mechanism. With 1,529 records from 62 proteins corresponding to 65 structures, K-Pro contains various kinetic parameters such as the logarithm of the folding and unfolding rates, Tanford's β and the ϕ values. When available, the database also includes thermodynamic parameters associated with the kinetic data. K-Pro features a user-friendly interface that allows browsing and downloading kinetic data of interest. The graphical interface provides a visual representation of the protein and mutants, and it is cross-linked to key databases such as PDB, UniProt, and PubMed. K-Pro is open and freely accessible through https://folding.biofold.org/k-pro and supports the latest versions of popular browsers.
Affiliation(s)
- Paola Turina
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Emidio Capriotti
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
3
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. Front Bioinform 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023] Open
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, Sweden
4
Yang Y, Chong Z, Vihinen M. PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate. Int J Mol Sci 2023; 24:13023. [PMID: 37629203 PMCID: PMC10455311 DOI: 10.3390/ijms241613023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 08/27/2023] Open
Abstract
Most proteins fold into characteristic three-dimensional structures. The rate of folding and unfolding varies widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for the prediction of the folding rate effects of amino acid substitutions in two-state folding proteins. We collected a data set of experimentally defined folding rates for variants and used them to train a gradient boosting algorithm starting with 1161 features. Two predictors were designed. The three-class classifier had, in blind tests, specificity and sensitivity ranging from 0.324 to 0.419 and from 0.256 to 0.451, respectively. The other tool was a regression predictor that showed a Pearson correlation coefficient of 0.525. The error measures, mean absolute error and mean squared error, were 0.581 and 0.603, respectively. One of the previously presented tools could be used for comparison with the blind test data set; our method, called PON-Fold, showed superior performance on all used measures. The applicability of the tool was tested by predicting all possible substitutions in a protein domain. Predictions for different conformations of proteins, open and closed forms of a protein kinase, and apo and holo forms of an enzyme indicated that the choice of the structure had a large impact on the outcome. PON-Fold is freely available.
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
- Zhang Chong
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
5
Vila JA. Protein folding rate evolution upon mutations. Biophys Rev 2023; 15:661-669. [PMID: 37681091 PMCID: PMC10480377 DOI: 10.1007/s12551-023-01088-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 06/24/2023] [Indexed: 09/09/2023] Open
Abstract
Despite the spectacular success of cutting-edge protein fold prediction methods, many critical questions remain unanswered, including why proteins can reach their native state in a biologically reasonable time. A satisfactory answer to this simple question could shed light on the slowest folding rate of proteins as well as how mutations (amino-acid substitutions and/or post-translational modifications) might affect it. Preliminary results indicate that (i) Anfinsen's dogma validity ensures that proteins reach their native state on a reasonable timescale regardless of their sequence or length, and (ii) it is feasible to determine the evolution of protein folding rates without accounting for epistasis effects or the mutational trajectories between the starting and target sequences. These results have direct implications for evolutionary biology because they lay the groundwork for a better understanding of why, and to what extent, mutations (a crucial element of evolution and a factor influencing it) affect protein evolvability. Furthermore, they may spur significant progress in our efforts to solve crucial structural biology problems, such as how a sequence encodes its folding.
Affiliation(s)
- Jorge A. Vila
  - IMASL-CONICET, Universidad Nacional de San Luis, Ejército de Los Andes 950, 5700 San Luis, Argentina
6
PCA-MutPred: Prediction of binding free energy change upon missense mutation in protein-carbohydrate complexes. J Mol Biol 2022; 434:167526. [DOI: 10.1016/j.jmb.2022.167526] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/04/2021] [Revised: 02/26/2022] [Accepted: 03/01/2022] [Indexed: 11/22/2022]
7
Rajabi F, Bereshneh AH, Ramezanzadeh M, Garshasbi M. Novel compound heterozygous variants in XYLT1 gene caused Desbuquois dysplasia type 2 in an aborted fetus: a case report. BMC Pediatr 2022; 22:63. [PMID: 35081921 PMCID: PMC8790879 DOI: 10.1186/s12887-022-03132-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/15/2021] [Accepted: 01/15/2022] [Indexed: 11/16/2022] Open
Abstract
Background: Desbuquois dysplasia type 2 (DBQD2) is an infrequent dysplasia with a wide range of symptoms, including facial deformities, growth retardation and short long bones. It is an autosomal recessive disorder caused by mutations in the XYLT1 gene that encodes xylosyltransferase-1.
Case presentation: We studied an aborted fetus from Iranian non-consanguineous parents who was therapeutically aborted at 19 weeks of gestation. Ultrasound examinations at 18 weeks of gestation revealed growth retardation in her long bones and some facial problems. Whole-exome sequencing was performed on the aborted fetus which revealed compound heterozygous XYLT1 mutations: c.742G>A; p.(Glu248Lys) and c.1537C>A; p.(Leu513Met). Sanger sequencing and segregation analysis confirmed the compound heterozygosity of these variants in XYLT1.
Conclusion: The c.1537C>A; p.(Leu513Met) variant has not been reported in any databases so far and therefore is novel. This is the third compound heterozygote report in XYLT1 and further supports the high heterogeneity of this disease.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12887-022-03132-5.
Affiliation(s)
- Fatemeh Rajabi
  - Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Ali Hosseini Bereshneh
  - Prenatal Diagnosis and Genetic Research Center, Dastgheib Hospital, Shiraz University of Medical Sciences, Shiraz, Iran
- Mahboubeh Ramezanzadeh
  - Department of Genetics and Molecular Medicine, Faculty of Medicine, Bushehr University of Medical Sciences, Bushehr, Iran
- Masoud Garshasbi
  - Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
8
Zimmermann RC, Sardiu ME, Manton CA, Miah MS, Banks CAS, Adams MK, Koestler DC, Hurst DR, Edmonds MD, Washburn MP, Welch DR. Perturbation of BRMS1 interactome reveals pathways that impact metastasis. PLoS One 2021; 16:e0259128. [PMID: 34788285 PMCID: PMC8598058 DOI: 10.1371/journal.pone.0259128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2021] [Accepted: 10/12/2021] [Indexed: 11/25/2022] Open
Abstract
Breast Cancer Metastasis Suppressor 1 (BRMS1) expression is associated with longer patient survival in multiple cancer types. Understanding BRMS1 functionality will provide insights into both mechanism of action and will enhance potential therapeutic development. In this study, we confirmed that the C-terminus of BRMS1 is critical for metastasis suppression and hypothesized that critical protein interactions in this region would explain its function. Phosphorylation status at S237 regulates BRMS1 protein interactions related to a variety of biological processes, phenotypes [cell cycle (e.g., CDKN2A), DNA repair (e.g., BRCA1)], and metastasis [(e.g., TCF2 and POLE2)]. Presence of S237 also directly decreased MDA-MB-231 breast carcinoma migration in vitro and metastases in vivo. The results add significantly to our understanding of how BRMS1 interactions with Sin3/HDAC complexes regulate metastasis and expand insights into BRMS1's molecular role, as they demonstrate BRMS1 C-terminus involvement in distinct protein-protein interactions.
Affiliation(s)
- Rosalyn C. Zimmermann
  - Department of Cancer Biology, The Kansas University Medical Center, Kansas City, KS, United States of America
- Mihaela E. Sardiu
  - Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
  - Department of Biostatistics and Data Science, The Kansas University Medical Center, Kansas City, KS, United States of America
  - The University of Kansas Cancer Center, Kansas City, KS, United States of America
- Christa A. Manton
  - Department of Cancer Biology, The Kansas University Medical Center, Kansas City, KS, United States of America
  - Pathology Department, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
  - Department of Biology, Baker University, Baldwin City, KS, United States of America
- Md. Sayem Miah
  - Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Health Sciences, Little Rock, AR, United States of America
- Charles A. S. Banks
  - Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Mark K. Adams
  - Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Devin C. Koestler
  - Department of Biostatistics and Data Science, The Kansas University Medical Center, Kansas City, KS, United States of America
  - The University of Kansas Cancer Center, Kansas City, KS, United States of America
- Douglas R. Hurst
  - Pathology Department, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Mick D. Edmonds
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Michael P. Washburn
  - Department of Cancer Biology, The Kansas University Medical Center, Kansas City, KS, United States of America
  - Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
  - The University of Kansas Cancer Center, Kansas City, KS, United States of America
- Danny R. Welch
  - Department of Cancer Biology, The Kansas University Medical Center, Kansas City, KS, United States of America
  - The University of Kansas Cancer Center, Kansas City, KS, United States of America
9
Rawat P, Prabakaran R, Kumar S, Gromiha MM. AggreRATE-Pred: a mathematical model for the prediction of change in aggregation rate upon point mutation. Bioinformatics 2020; 36:1439-1444. [PMID: 31599925 DOI: 10.1093/bioinformatics/btz764] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 11/09/2018] [Revised: 09/30/2019] [Accepted: 10/05/2019] [Indexed: 01/09/2023] Open
Abstract
MOTIVATION: Protein aggregation is a major unsolved problem in biochemistry with implications for several human diseases, biotechnology and biomaterial sciences. A majority of sequence-structural properties known for their mechanistic roles in protein aggregation do not correlate well with the aggregation kinetics. This limits the practical utility of predictive algorithms.
RESULTS: We analyzed experimental data on 183 unique single point mutations that lead to change in aggregation rates for 23 polypeptides and proteins. Our initial mathematical model obtained a correlation coefficient of 0.43 between predicted and experimental change in aggregation rate upon mutation (P-value <0.0001). However, when the dataset was classified based on protein length and conformation at the mutation sites, the average correlation coefficient almost doubled to 0.82 (range: 0.74-0.87; P-value <0.0001). We observed that distinct sequence and structure-based properties determine protein aggregation kinetics in each class. In conclusion, the protein aggregation kinetics are impacted by local factors and not by global ones, such as overall three-dimensional protein fold, or mechanistic factors such as the presence of aggregation-prone regions.
AVAILABILITY AND IMPLEMENTATION: The web server is available at http://www.iitm.ac.in/bioinfo/aggrerate-pred/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Puneet Rawat
  - Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
- R Prabakaran
  - Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
- Sandeep Kumar
  - Biotherapeutics Discovery, Boehringer-Ingelheim Pharmaceutical Inc., Ridgefield, CT, USA
- M Michael Gromiha
  - Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
  - Advanced Computational Drug Discovery Unit (ACDD), Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, Japan
10
Jemimah S, Sekijima M, Gromiha MM. ProAffiMuSeq: sequence-based method to predict the binding free energy change of protein-protein complexes upon mutation using functional classification. Bioinformatics 2020; 36:1725-1730. [PMID: 31713585 DOI: 10.1093/bioinformatics/btz829] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/17/2019] [Revised: 10/23/2019] [Accepted: 11/11/2019] [Indexed: 12/20/2022] Open
Abstract
MOTIVATION: Protein-protein interactions are essential for the cell and mediate various functions. However, mutations can disrupt these interactions and may cause diseases. Currently available computational methods require a complex structure as input for predicting the change in binding affinity. Further, they have not included the functional class information for the protein-protein complex. To address this, we have developed a method, ProAffiMuSeq, which predicts the change in binding free energy using sequence-based features and functional class.
RESULTS: Our method shows an average correlation between predicted and experimentally determined ΔΔG of 0.73 and mean absolute error (MAE) of 0.86 kcal/mol in 10-fold cross-validation and correlation of 0.75 with MAE of 0.94 kcal/mol in the test dataset. ProAffiMuSeq was also tested on an external validation set and showed results comparable to structure-based methods. Our method can be used for large-scale analysis of disease-causing mutations in protein-protein complexes without structural information.
AVAILABILITY AND IMPLEMENTATION: Users can access the method at https://web.iitm.ac.in/bioinfo2/proaffimuseq/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sherlyn Jemimah
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Masakazu Sekijima
  - Advanced Computational Drug Discovery Unit, Tokyo Institute of Technology, Midori-ku, Kanagawa 226-8503, Yokohama, Japan
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
  - Advanced Computational Drug Discovery Unit, Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, Midori-ku, Kanagawa 226-8503, Yokohama, Japan
11
Zhang H, Liao L, Saravanan KM, Yin P, Wei Y. DeepBindRG: a deep learning based method for estimating effective protein-ligand affinity. PeerJ 2019; 7:e7362. [PMID: 31380152 PMCID: PMC6661145 DOI: 10.7717/peerj.7362] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 03/19/2019] [Accepted: 06/27/2019] [Indexed: 12/24/2022] Open
Abstract
Proteins interact with small molecules to modulate several important cellular functions. Many acute diseases were cured by small molecule binding in the active site of protein either by inhibition or activation. Currently, there are several docking programs to estimate the binding position and the binding orientation of protein–ligand complex. Many scoring functions were developed to estimate the binding strength and predict the effective protein–ligand binding. While the accuracy of current scoring function is limited by several aspects, the solvent effect, entropy effect, and multibody effect are largely ignored in traditional machine learning methods. In this paper, we proposed a new deep neural network-based model named DeepBindRG to predict the binding affinity of protein–ligand complex, which learns all the effects, binding mode, and specificity implicitly by learning protein–ligand interface contact information from a large protein–ligand dataset. During the initial data processing step, the critical interface information was preserved to make sure the input is suitable for the proposed deep learning model. While validating our model on three independent datasets, DeepBindRG achieves root mean squared error (RMSE) value of pKa (−logKd or −logKi) about 1.6–1.8 and R value around 0.5–0.6, which is better than the autodock vina whose RMSE value is about 2.2–2.4 and R value is 0.42–0.57. We also explored the detailed reasons for the performance of DeepBindRG, especially for several failed cases by vina. Furthermore, DeepBindRG performed better for four challenging datasets from DUD.E database with no experimental protein–ligand complexes. The better performance of DeepBindRG than autodock vina in predicting protein–ligand binding affinity indicates that deep learning approach can greatly help with the drug discovery process. 
We also compare the performance of DeepBindRG with a 4D based deep learning method “pafnucy”, the advantage and limitation of both methods have provided clues for improving the deep learning based protein–ligand prediction model in the future.
Affiliation(s)
- Haiping Zhang
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Linbu Liao
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Konda Mani Saravanan
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Peng Yin
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Yanjie Wei
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
12
An in-silico method for identifying aggregation rate enhancer and mitigator mutations in proteins. Int J Biol Macromol 2018; 118:1157-1167. [DOI: 10.1016/j.ijbiomac.2018.06.102] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/29/2018] [Revised: 06/19/2018] [Accepted: 06/20/2018] [Indexed: 12/27/2022]
13
PhosphoPredict: A bioinformatics tool for prediction of human kinase-specific phosphorylation substrates and sites by integrating heterogeneous feature selection. Sci Rep 2017; 7:6862. [PMID: 28761071 PMCID: PMC5537252 DOI: 10.1038/s41598-017-07199-4] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 11/24/2016] [Accepted: 06/27/2017] [Indexed: 12/31/2022] Open
Abstract
Protein phosphorylation is a major form of post-translational modification (PTM) that regulates diverse cellular processes. In silico methods for phosphorylation site prediction can provide a useful and complementary strategy for complete phosphoproteome annotation. Here, we present a novel bioinformatics tool, PhosphoPredict, that combines protein sequence and functional features to predict kinase-specific substrates and their associated phosphorylation sites for 12 human kinases and kinase families, including ATM, CDKs, GSK-3, MAPKs, PKA, PKB, PKC, and SRC. To elucidate critical determinants, we identified feature subsets that were most informative and relevant for predicting substrate specificity for each individual kinase family. Extensive benchmarking experiments based on both five-fold cross-validation and independent tests indicated that the performance of PhosphoPredict is competitive with that of several other popular prediction tools, including KinasePhos, PPSP, GPS, and Musite. We found that combining protein functional and sequence features significantly improves phosphorylation site prediction performance across all kinases. Application of PhosphoPredict to the entire human proteome identified 150 to 800 potential phosphorylation substrates for each of the 12 kinases or kinase families. PhosphoPredict significantly extends the bioinformatics portfolio for kinase function analysis and will facilitate high-throughput identification of kinase-specific phosphorylation sites, thereby contributing to both basic and translational research programs.
14
Structure of the homodimeric androgen receptor ligand-binding domain. Nat Commun 2017; 8:14388. [PMID: 28165461 PMCID: PMC5303882 DOI: 10.1038/ncomms14388] [Citation(s) in RCA: 138] [Impact Index Per Article: 17.3] [Received: 06/21/2016] [Accepted: 12/22/2016] [Indexed: 01/20/2023] Open
Abstract
The androgen receptor (AR) plays a crucial role in normal physiology, development and metabolism as well as in the aetiology and treatment of diverse pathologies such as androgen insensitivity syndromes (AIS), male infertility and prostate cancer (PCa). Here we show that dimerization of AR ligand-binding domain (LBD) is induced by receptor agonists but not by antagonists. The 2.15-Å crystal structure of homodimeric, agonist- and coactivator peptide-bound AR-LBD unveils a 1,000-Å² large dimerization surface, which harbours over 40 previously unexplained AIS- and PCa-associated point mutations. An AIS mutation in the self-association interface (P767A) disrupts dimer formation in vivo, and has a detrimental effect on the transactivating properties of full-length AR, despite retained hormone-binding capacity. The conservation of essential residues suggests that the unveiled dimerization mechanism might be shared by other nuclear receptors. Our work defines AR-LBD homodimerization as an essential step in the proper functioning of this important transcription factor. The androgen receptor is crucial for the development and physiology of reproductive organs. Here the authors present the structure of the androgen receptor ligand-binding domain bound to dihydrotestosterone, identifying a homodimerization interface that is crucial for receptor activity in vivo.
15
Gromiha MM, Yugandhar K, Jemimah S. Protein-protein interactions: scoring schemes and binding affinity. Curr Opin Struct Biol 2016; 44:31-38. [PMID: 27866112 DOI: 10.1016/j.sbi.2016.10.016] [Citation(s) in RCA: 87] [Impact Index Per Article: 9.7] [Received: 07/29/2016] [Revised: 09/30/2016] [Accepted: 10/25/2016] [Indexed: 01/16/2023]
Abstract
Protein-protein interactions mediate several cellular functions, which can be understood from the information obtained using the three-dimensional structures of protein-protein complexes and binding affinity data. This review focuses on computational aspects of predicting the best native-like complex structure and binding affinities. The first part covers the prediction of protein-protein complex structures and the advantages of conformational searching and scoring functions in protein-protein docking. The second part is devoted to various aspects of protein-protein interaction thermodynamics, such as databases for binding affinities and other thermodynamic parameters, computational methods to predict the binding affinity using either the three-dimensional structures of complexes or amino acid sequences, and change in binding affinities of the complexes upon mutations. We provide the latest developments on protein-protein docking and binding affinity studies along with a list of available computational resources for understanding protein-protein interactions.
Affiliation(s)
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- K Yugandhar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Sherlyn Jemimah
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
16
Prediction of change in protein unfolding rates upon point mutations in two state proteins. Biochim Biophys Acta Proteins Proteom 2016; 1864:1104-1109. [DOI: 10.1016/j.bbapap.2016.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/26/2016] [Revised: 05/05/2016] [Accepted: 06/01/2016] [Indexed: 11/23/2022]
17
|
Chang CCH, Li C, Webb GI, Tey B, Song J, Ramanan RN. Periscope: quantitative prediction of soluble protein expression in the periplasm of Escherichia coli. Sci Rep 2016; 6:21844. [PMID: 26931649 PMCID: PMC4773868 DOI: 10.1038/srep21844] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/02/2015] [Accepted: 01/28/2016] [Indexed: 12/20/2022]
Abstract
Periplasmic expression of soluble proteins in Escherichia coli not only offers a much-simplified downstream purification process, but also enhances the probability of obtaining correctly folded and biologically active proteins. Different combinations of signal peptides and target proteins lead to different soluble protein expression levels, ranging from negligible to several grams per litre. Accurate algorithms for rational selection of promising candidates can serve as a powerful tool to complement current trial-and-error approaches. Accordingly, proteomics studies can be conducted with greater efficiency and cost-effectiveness. Here, we developed a predictor with a two-stage architecture to predict the real-valued expression level of a target protein in the periplasm. The output of the first-stage support vector machine (SVM) classifier determines which second-stage support vector regression (SVR) classifier is used. When tested on an independent test dataset, the predictor achieved an overall prediction accuracy of 78% and a Pearson's correlation coefficient (PCC) of 0.77. We further illustrate the relative importance of various features with respect to different models. The results indicate that the occurrence of the dipeptide glutamine and aspartic acid is the most important feature for the classification model. Finally, we provide access to the implemented predictor through the Periscope webserver, freely accessible at http://lightning.med.monash.edu/periscope/.
Affiliation(s)
- Catherine Ching Han Chang
- Chemical Engineering Discipline, School of Engineering, Monash University, Jalan Lagoon Selatan 46150, Bandar Sunway, Selangor, Malaysia
- Department of Biochemistry and Molecular Biology, Monash University, Melbourne VIC 3800, Australia
| | - Chen Li
- Department of Biochemistry and Molecular Biology, Monash University, Melbourne VIC 3800, Australia
| | - Geoffrey I. Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne VIC 3800, Australia
| | - BengTi Tey
- Chemical Engineering Discipline, School of Engineering, Monash University, Jalan Lagoon Selatan 46150, Bandar Sunway, Selangor, Malaysia
- Advanced Engineering Platform, School of Engineering, Monash University, Jalan Lagoon Selatan 46150, Bandar Sunway, Selangor, Malaysia
| | - Jiangning Song
- Department of Biochemistry and Molecular Biology, Monash University, Melbourne VIC 3800, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne VIC 3800, Australia
- National Engineering Laboratory for Industrial Enzymes, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
| | - Ramakrishnan Nagasundara Ramanan
- Chemical Engineering Discipline, School of Engineering, Monash University, Jalan Lagoon Selatan 46150, Bandar Sunway, Selangor, Malaysia
- Advanced Engineering Platform, School of Engineering, Monash University, Jalan Lagoon Selatan 46150, Bandar Sunway, Selangor, Malaysia
- School of Chemistry, Monash University, Melbourne VIC 3800, Australia
| |
|
18
|
Zhai J, Tang Y, Yuan H, Wang L, Shang H, Ma C. A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function. Front Plant Sci 2016; 7:1914. [PMID: 28018423 PMCID: PMC5156684 DOI: 10.3389/fpls.2016.01914] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/01/2016] [Accepted: 12/02/2016] [Indexed: 05/10/2023]
Abstract
The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is that of unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus, a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Toward this goal, we developed a novel gene prioritization method named RafSee, to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the ranks of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The "leave-one-out" cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software.
|
19
|
Mallik S, Das S, Kundu S. Predicting protein folding rate change upon point mutation using residue-level coevolutionary information. Proteins 2015; 84:3-8. [DOI: 10.1002/prot.24960] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 07/15/2015] [Revised: 11/11/2015] [Accepted: 11/11/2015] [Indexed: 11/10/2022]
Affiliation(s)
- Saurav Mallik
- Department of Biophysics; Molecular Biology and Bioinformatics, University of Calcutta; Kolkata 700009 India
- Center of Excellence in Systems Biology and Biomedical Engineering (TEQIP Phase-II), University of Calcutta; Kolkata 700009 India
| | - Smita Das
- Department of Biophysics; Molecular Biology and Bioinformatics, University of Calcutta; Kolkata 700009 India
| | - Sudip Kundu
- Department of Biophysics; Molecular Biology and Bioinformatics, University of Calcutta; Kolkata 700009 India
- Center of Excellence in Systems Biology and Biomedical Engineering (TEQIP Phase-II), University of Calcutta; Kolkata 700009 India
| |
|
20
|
Huang X, Hernick M. Molecular Determinants of N-Acetylglucosamine Recognition and Turnover by N-Acetyl-1-d-myo-inosityl-2-amino-2-deoxy-α-d-glucopyranoside Deacetylase (MshB). Biochemistry 2015; 54:3784-90. [DOI: 10.1021/acs.biochem.5b00068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Affiliation(s)
- Xinyi Huang
- Department of Biochemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Marcy Hernick
- Department of Biochemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department of Pharmaceutical Sciences, Appalachian College of Pharmacy, Oakwood, Virginia 24631, United States
| |
|
21
|
A novel mutation in PNLIP causes pancreatic triglyceride lipase deficiency through protein misfolding. Biochim Biophys Acta Mol Basis Dis 2015; 1852:1372-9. [PMID: 25862608 DOI: 10.1016/j.bbadis.2015.04.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 02/16/2015] [Revised: 03/30/2015] [Accepted: 04/01/2015] [Indexed: 01/28/2023]
Abstract
Congenital pancreatic triglyceride lipase (PNLIP) deficiency is a rare disorder with uncertain genetic background as most cases were described before gene sequencing was readily available. Recently, two brothers with PNLIP deficiency were found to carry a homozygous missense mutation, c.662C>T (p.T221M) in the PNLIP gene (J. Lipid Res. 2014. 55:307-312). Molecular modeling suggested the substitution would change the orientation of residues in the catalytic site and disrupt the function of p.T221M PNLIP. To test the effect of the p.T221M mutation on PNLIP function, we expressed wild-type and p.T221M PNLIP in human embryonic kidney (HEK) 293A cells and dexamethasone-differentiated AR42J rat acinar cells. In both cellular models, wild-type PNLIP was secreted into the conditioned medium where it was readily detectable by protein staining, immunoblot or lipase activity assays. In contrast, mutant p.T221M was not secreted into the medium, but it was present in cell lysates where it accumulated in the insoluble fraction. Intracellular retention of mutant p.T221M resulted in endoplasmic reticulum (ER) stress as measured by elevated XBP1 splicing and increased levels of ER chaperones. Our results demonstrate that the presence of methionine at position 221 in the PNLIP protein sequence causes misfolding and aggregation of the p.T221M mutant inside the cell. The consequent loss of enzyme secretion adequately explains the clinical phenotype of PNLIP deficiency reported for homozygous carriers of p.T221M. Furthermore, the ability of mutant p.T221M to induce ER stress suggests that this form of PNLIP deficiency might cause acinar cell damage as well.
|