1
Ntallis C, Tzoupis H, Tselios T, Chasapis CT, Vlamis-Gardikas A. Distinct or Overlapping Areas of Mitochondrial Thioredoxin 2 May Be Used for Its Covalent and Strong Non-Covalent Interactions with Protein Ligands. Antioxidants (Basel) 2023; 13:15. [PMID: 38275635 PMCID: PMC10812433 DOI: 10.3390/antiox13010015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 12/09/2023] [Accepted: 12/16/2023] [Indexed: 01/27/2024]
Abstract
In silico approaches were employed to examine the characteristics of interactions between human mitochondrial thioredoxin 2 (HsTrx2) and its 38 previously identified mitochondrial protein ligands. All interactions appeared driven mainly by electrostatic forces. The statistically significant residues of HsTrx2 for interactions were characterized as "contact hot spots". Since these were identical/adjacent to putative thermodynamic hot spots, an energy network approach identified their neighbors to highlight possible contact interfaces. Three distinct areas for binding emerged: (i) one around the active site for covalent interactions, (ii) another antipodal to the active site for strong non-covalent interactions, and (iii) a third area involved in both kinds of interactions. The contact interfaces of HsTrx2 were projected as respective interfaces for Escherichia coli Trx1 (EcoTrx1), 2, and HsTrx1. Comparison of the interfaces and contact hot spots of HsTrx2 to the contact residues of EcoTrx1 and HsTrx1 from existing crystal complexes with protein ligands supported the hypothesis, except for a part of the cleft/groove adjacent to Trp30 preceding the active site. The outcomes of this study raise the possibility for the rational design of selective inhibitors for the interactions of HsTrx2 with specific protein ligands without affecting the entirety of the functions of the Trx system.
Affiliation(s)
- Charalampos Ntallis
  - Department of Chemistry, University of Patras, 26504 Rion, Greece
- Haralampos Tzoupis
  - Department of Chemistry, University of Patras, 26504 Rion, Greece
- Theodore Tselios
  - Department of Chemistry, University of Patras, 26504 Rion, Greece
- Christos T. Chasapis
  - Institute of Chemical Biology, National Hellenic Research Foundation, Vas. Constantinou 48, 11635 Athens, Greece
2
Kim Y, Yoon T, Park WB, Na S. Predicting mechanical properties of silk from its amino acid sequences via machine learning. J Mech Behav Biomed Mater 2023; 140:105739. [PMID: 36871478 DOI: 10.1016/j.jmbbm.2023.105739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2022] [Revised: 02/12/2023] [Accepted: 02/21/2023] [Indexed: 02/25/2023]
Abstract
The silk fiber is increasingly being sought for its superior mechanical properties, biocompatibility, and eco-friendliness, making it promising as a base material for various applications. One of the characteristics of protein fibers, such as silk, is that their mechanical properties are significantly dependent on the amino acid sequence. Numerous studies have been conducted to determine the specific relationship between the amino acid sequence of silk and its mechanical properties. Still, the relationship between the amino acid sequence of silk and its mechanical properties is yet to be clarified. Other fields have adopted machine learning (ML) to establish a relationship between the inputs, such as the ratio of different input material compositions and the resulting mechanical properties. We have proposed a method to convert the amino acid sequence into numerical values for input and succeeded in predicting the mechanical properties of silk from its amino acid sequences. Our study sheds light on predicting mechanical properties of silk fiber from respective amino acid sequences.
3
Zhou H, Wang W, Jin J, Zheng Z, Zhou B. Graph Neural Network for Protein-Protein Interaction Prediction: A Comparative Study. Molecules 2022; 27:6135. [PMID: 36144868 DOI: 10.3390/molecules27186135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Revised: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/17/2022]
Abstract
Proteins are the fundamental biological macromolecules which underlie practically all biological activities. Protein-protein interactions (PPIs), as they are known, are how proteins interact with other proteins in their environment to perform biological functions. Understanding PPIs reveals how cells behave and operate, such as the antigen recognition and signal transduction in the immune system. In the past decades, many computational methods have been developed to predict PPIs automatically, requiring less time and resources than experimental techniques. In this paper, we present a comparative study of various graph neural networks for protein-protein interaction prediction. Five network models are analyzed and compared, including neural networks (NN), graph convolutional neural networks (GCN), graph attention networks (GAT), hyperbolic neural networks (HNN), and hyperbolic graph convolutions (HGCN). By utilizing the protein sequence information, all of these models can predict the interaction between proteins. Fourteen PPI datasets are extracted and utilized to compare the prediction performance of all these methods. The experimental results show that hyperbolic graph neural networks tend to have a better performance than the other methods on the protein-related datasets.
4
Abstract
Proteins are the essential biological macromolecules required to perform nearly all biological processes and cellular functions. Proteins rarely carry out their tasks in isolation but interact with other proteins (known as protein–protein interaction) present in their surroundings to complete biological activities. The knowledge of protein–protein interactions (PPIs) unravels the cellular behavior and its functionality. Computational methods automate the prediction of PPIs and are less expensive than experimental methods in terms of resources and time. So far, most of the works on PPI have mainly focused on sequence information. Here, we use graph convolutional network (GCN) and graph attention network (GAT) to predict the interaction between proteins by utilizing proteins' structural information and sequence features. We build the graphs of proteins from their PDB files, which contain 3D coordinates of atoms. The protein graph represents the amino acid network, also known as residue contact network, where each node is a residue. Two nodes are connected if they have a pair of atoms (one from each node) within the threshold distance. To extract the node/residue features, we use the protein language model. The input to the language model is the protein sequence, and the output is the feature vector for each amino acid of the underlying sequence. We validate the predictive capability of the proposed graph-based approach on two PPI datasets: Human and S. cerevisiae. Obtained results demonstrate the effectiveness of the proposed approach as it outperforms the previous leading methods. The source code for training and data to train the model are available at https://github.com/JhaKanchan15/PPI_GNN.git.
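The edge rule in this abstract (two residues are linked when any pair of their atoms lies within a threshold distance) can be illustrated with a minimal sketch. This is not the authors' code: the function name `residue_contact_graph`, the toy coordinates, and the 6.0 Å default are assumptions made for illustration, since the abstract does not state the threshold it uses.

```python
from itertools import combinations
from math import dist


def residue_contact_graph(residues, threshold=6.0):
    """Build the edge set of a residue contact network.

    residues:  dict mapping a residue id to a list of (x, y, z) atom coordinates.
    threshold: distance cutoff in angstroms (hypothetical default, not from the abstract).
    Returns a set of (id_a, id_b) pairs with id_a < id_b.
    """
    edges = set()
    for a, b in combinations(sorted(residues), 2):
        # Connect two residues if at least one atom pair (one atom from each
        # residue) is within the threshold distance.
        if any(dist(p, q) <= threshold for p in residues[a] for q in residues[b]):
            edges.add((a, b))
    return edges


# Toy coordinates, invented for illustration only.
toy = {
    1: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    2: [(4.0, 0.0, 0.0)],
    3: [(20.0, 0.0, 0.0)],
}
edges = residue_contact_graph(toy, threshold=6.0)
```

In a real pipeline the coordinates would be parsed from a PDB file and the per-residue features would come from a protein language model, as the abstract describes; both steps are outside this sketch.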
5
Santa-Coloma TA. Overlapping synthetic peptides as a tool to map protein-protein interactions - FSH as a model system of nonadditive interactions. Biochim Biophys Acta Gen Subj 2022; 1866:130153. [DOI: 10.1016/j.bbagen.2022.130153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Revised: 04/06/2022] [Accepted: 04/12/2022] [Indexed: 10/18/2022]
6
Li S, Wu S, Wang L, Li F, Jiang H, Bai F. Recent advances in predicting protein-protein interactions with the aid of artificial intelligence algorithms. Curr Opin Struct Biol 2022; 73:102344. [PMID: 35219216 DOI: 10.1016/j.sbi.2022.102344] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/20/2021] [Revised: 01/02/2022] [Accepted: 01/17/2022] [Indexed: 12/15/2022]
Abstract
Protein-protein interactions (PPIs) are essential in the regulation of biological functions and cell events; understanding PPIs has therefore become a key issue for understanding molecular mechanisms and investigating the design of drugs. Here we highlight the major developments in computational methods for predicting PPIs using different types of artificial intelligence algorithms. The first part introduces the sources of experimental PPI data. The second part is devoted to PPI prediction methods based on sequence information. The third part covers representative methods using structural information as the input feature. The last part covers methods that combine different types of features. For each part, the state-of-the-art computational PPI prediction methods are reviewed in an inclusive view. Finally, we discuss the flaws existing in this area and future directions of next-generation algorithms.
Affiliation(s)
- Shiwei Li
  - Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Sanan Wu
  - Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Lin Wang
  - Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Fenglei Li
  - Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China; School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Hualiang Jiang
  - Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Pudong, Shanghai, 201203, China
- Fang Bai
  - Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China; School of Information Science and Technology, ShanghaiTech University, Shanghai, China
7
Abstract
Single-point mutations of certain residues (so-called hot spots) impair/disrupt protein-protein interactions (PPIs), leading to pathogenesis and drug resistance. Conventionally, a PPI-hot spot is identified when its replacement decreased the binding free energy significantly, generally by ≥2 kcal/mol. The relatively few mutations with such a significant binding free energy drop limited the number of distinct PPI-hot spots. By defining PPI-hot spots based on mutations that have been manually curated in UniProtKB to significantly impair/disrupt PPIs in addition to binding free energy changes, we have greatly expanded the number of distinct PPI-hot spots by an order of magnitude. These experimentally determined PPI-hot spots along with available structures have been collected in a database called PPI-HotspotDB. We have applied the PPI-HotspotDB to create a nonredundant benchmark, PPI-Hotspot+PDBBM, for assessing methods to predict PPI-hot spots using the free structure as input. PPI-HotspotDB will benefit the design of mutagenesis experiments and development of PPI-hot spot prediction methods. The database and benchmark are freely available at https://ppihotspot.limlab.dnsalias.org.
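The conventional criterion quoted in this abstract (a residue counts as a PPI-hot spot when its replacement costs at least 2 kcal/mol of binding free energy) amounts to a simple threshold test, sketched below. The sign convention (positive values mean weaker binding), the function name, and the residue labels and values are all assumptions for illustration; none of them are taken from PPI-HotspotDB.

```python
HOTSPOT_DDG_KCAL_MOL = 2.0  # conventional cutoff quoted in the abstract


def is_conventional_hotspot(ddg_kcal_mol, cutoff=HOTSPOT_DDG_KCAL_MOL):
    """Return True if a mutation's loss in binding free energy meets the cutoff.

    ddg_kcal_mol: loss in binding free energy upon mutation, in kcal/mol,
                  with positive values meaning weaker binding (assumed convention).
    """
    return ddg_kcal_mol >= cutoff


# Invented example values, for illustration only.
mutations = {"A12": 0.4, "W30": 2.0, "K57": 3.1}
hotspots = sorted(res for res, ddg in mutations.items() if is_conventional_hotspot(ddg))
```

The database described in the abstract goes beyond this test by also admitting mutations that curated UniProtKB annotations report as significantly impairing or disrupting an interaction, which a purely energetic cutoff would miss.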
Affiliation(s)
- Yao Chi Chen
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Yu-Hsien Chen
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Jon D Wright
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Carmay Lim
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan; Department of Chemistry, National Tsing Hua University, Hsinchu 300, Taiwan
8
Khetan R, Curtis R, Deane CM, Hadsund JT, Kar U, Krawczyk K, Kuroda D, Robinson SA, Sormanni P, Tsumoto K, Warwicker J, Martin ACR. Current advances in biopharmaceutical informatics: guidelines, impact and challenges in the computational developability assessment of antibody therapeutics. MAbs 2022; 14:2020082. [PMID: 35104168 PMCID: PMC8812776 DOI: 10.1080/19420862.2021.2020082] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 01/03/2023]
Abstract
Therapeutic monoclonal antibodies and their derivatives are key components of clinical pipelines in the global biopharmaceutical industry. The availability of large datasets of antibody sequences, structures, and biophysical properties is increasingly enabling the development of predictive models and computational tools for the "developability assessment" of antibody drug candidates. Here, we provide an overview of the antibody informatics tools applicable to the prediction of developability issues such as stability, aggregation, immunogenicity, and chemical degradation. We further evaluate the opportunities and challenges of using biopharmaceutical informatics for drug discovery and optimization. Finally, we discuss the potential of developability guidelines based on in silico metrics that can be used for the assessment of antibody stability and manufacturability.
Affiliation(s)
- Rahul Khetan
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
- Robin Curtis
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
- Uddipan Kar
  - Department of Biological Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Daisuke Kuroda
  - Department of Bioengineering, School of Engineering, The University of Tokyo, Tokyo, Japan; Medical Device Development and Regulation Research Center, School of Engineering, The University of Tokyo, Tokyo, Japan; Department of Chemistry and Biotechnology, School of Engineering, The University of Tokyo, Tokyo, Japan
- Pietro Sormanni
  - Chemistry of Health, Yusuf Hamied Department of Chemistry, University of Cambridge
- Kouhei Tsumoto
  - Department of Bioengineering, School of Engineering, The University of Tokyo, Tokyo, Japan; Medical Device Development and Regulation Research Center, School of Engineering, The University of Tokyo, Tokyo, Japan; Department of Chemistry and Biotechnology, School of Engineering, The University of Tokyo, Tokyo, Japan; The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Jim Warwicker
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
- Andrew C R Martin
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK
9
Yao S, Zheng C, Wang B, Chen P. A two-step ensemble learning for predicting protein hot spot residues from whole protein sequence. Amino Acids. [DOI: 10.1007/s00726-022-03129-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2020] [Accepted: 01/17/2022] [Indexed: 11/26/2022]
10
Abstract
Molecular docking excels at creating a plethora of potential models of protein-protein complexes. To correctly distinguish the favorable, native-like models from the remaining ones remains, however, a challenge. We assessed here if a protocol based on molecular dynamics (MD) simulations would allow distinguishing native from non-native models to complement scoring functions used in docking. To this end, the first models for 25 protein-protein complexes were generated using HADDOCK. Next, MD simulations complemented with machine learning were used to discriminate between native and non-native complexes based on a combination of metrics reporting on the stability of the initial models. Native models showed higher stability in almost all measured properties, including the key ones used for scoring in the Critical Assessment of PRedicted Interaction (CAPRI) competition, namely the positional root mean square deviations and fraction of native contacts from the initial docked model. A random forest classifier was trained, reaching a 0.85 accuracy in correctly distinguishing native from non-native complexes. Reasonably modest simulation lengths of the order of 50-100 ns are sufficient to reach this accuracy, which makes this approach applicable in practice.
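The two stability metrics named in this abstract, positional root mean square deviation and fraction of native contacts relative to the initial docked model, can be sketched as follows. This is a simplified illustration rather than the authors' protocol: the function names are hypothetical, the toy inputs are invented, and the RMSD here is computed without the structural superposition that CAPRI-style evaluations normally perform first.

```python
from math import sqrt


def rmsd(coords_ref, coords_frame):
    """Root mean square deviation between two equally ordered coordinate lists.

    No superposition is done; the inputs are assumed to be already aligned.
    """
    if len(coords_ref) != len(coords_frame) or not coords_ref:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xr - xf) ** 2 + (yr - yf) ** 2 + (zr - zf) ** 2
        for (xr, yr, zr), (xf, yf, zf) in zip(coords_ref, coords_frame)
    )
    return sqrt(squared / len(coords_ref))


def fraction_native_contacts(reference_contacts, frame_contacts):
    """Fraction of the reference model's contacts that are still present in a frame."""
    reference = set(reference_contacts)
    if not reference:
        raise ValueError("reference model has no contacts")
    return len(reference & set(frame_contacts)) / len(reference)


# Toy inputs, invented for illustration only.
ref_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
ref_pairs = {("A10", "B5"), ("A11", "B7"), ("A12", "B9"), ("A13", "B2")}
frame_pairs = {("A10", "B5"), ("A11", "B7"), ("A20", "B1")}

drift = rmsd(ref_xyz, frame_xyz)
fnat = fraction_native_contacts(ref_pairs, frame_pairs)
```

Time series of such metrics over a trajectory are the kind of features the abstract describes feeding to a random forest classifier; the classifier itself is not reproduced here.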
Affiliation(s)
- Zuzana Jandova
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
- Attilio Vittorio Vargiu
  - Physics Department, University of Cagliari, Cittadella Universitaria, S.P. 8 km 0.700, 09042 Monserrato, Italy
- Alexandre M. J. J. Bonvin
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
11
Yang W, Wang K, Wu H, Shao H, Chen H, Zhu J. Peptide scaffold‐derived peptidomimetic farnesyltransferase inhibitors. J CHIN CHEM SOC-TAIP 2021. [DOI: 10.1002/jccs.202100037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Wei Yang
  - Department of Infectious Diseases, Taizhou Hospital Zhejiang University Taizhou China
- Kuifeng Wang
  - Department of Infectious Diseases, Taizhou Hospital Zhejiang University Taizhou China
- Hongwei Wu
  - Department of Infectious Diseases Affiliated Taizhou Hospital of Wenzhou Medical University Taizhou China
- Hui Shao
  - Department of Infectious Diseases, Taizhou Hospital Zhejiang University Taizhou China
- Huazhong Chen
  - Department of Infectious Diseases, Taizhou Hospital Zhejiang University Taizhou China
- Jiansheng Zhu
  - Department of Infectious Diseases, Taizhou Hospital Zhejiang University Taizhou China
12
Schoeps B, Eckfeld C, Flüter L, Keppler S, Mishra R, Knolle P, Bayerl F, Böttcher J, Hermann CD, Häußler D, Krüger A. Identification of invariant chain CD74 as a functional receptor of tissue inhibitor of metalloproteinases-1 (TIMP-1). J Biol Chem 2021; 297:101072. [PMID: 34391782 PMCID: PMC8429975 DOI: 10.1016/j.jbc.2021.101072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/28/2021] [Revised: 08/04/2021] [Accepted: 08/10/2021] [Indexed: 11/29/2022]
Abstract
Multifunctionality of tissue inhibitor of metalloproteinases-1 (TIMP-1) comprising antiproteolytic as well as cytokinic activity has been attributed to its N-terminal and C-terminal domains, respectively. The molecular basis of the emerging proinflammatory cytokinic activity of TIMP-1 is still not completely understood. The cytokine receptor invariant chain (CD74) is involved in many inflammation-associated diseases and is highly expressed by immune cells. CD74 triggers zeta chain–associated protein kinase-70 (ZAP-70) signaling–associated activation upon interaction with its only known ligand, the macrophage migration inhibitory factor. Here, we demonstrate TIMP-1–CD74 interaction by coimmunoprecipitation and confocal microscopy in cells engineered to overexpress CD74. In silico docking in HADDOCK predicted regions of the N-terminal domain of TIMP-1 (N-TIMP-1) to interact with CD74. This was experimentally confirmed by confocal microscopy demonstrating that recombinant N-TIMP-1 lacking the entire C-terminal domain was sufficient to bind CD74. Interaction of TIMP-1 with endogenously expressed CD74 was demonstrated in the Namalwa B lymphoma cell line by dot blot binding assays as well as confocal microscopy. Functionally, we demonstrated that TIMP-1–CD74 interaction triggered intracellular ZAP-70 activation. N-TIMP-1 was sufficient to induce ZAP-70 activation, and interference with the cytokine-binding site of CD74 using a synthetic peptide abrogated TIMP-1-mediated ZAP-70 activation. Altogether, we here identified CD74 as a receptor and mediator of cytokinic TIMP-1 activity and revealed TIMP-1 as a moonlighting protein harboring both cytokinic and antiproteolytic activity within its N-terminal domain. Recognition of this functional TIMP-1–CD74 interaction may shed new light on clinical attempts to therapeutically target ligand-induced CD74 activity in cancer and other inflammatory diseases.
Affiliation(s)
- Benjamin Schoeps
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Celina Eckfeld
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Laura Flüter
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Selina Keppler
  - School of Medicine, Institute of Clinical Chemistry and Pathobiochemistry, Technical University of Munich, Munich, Germany; TranslaTUM, Center for Translational Cancer Research, Technical University Munich, Munich, Germany
- Ritu Mishra
  - School of Medicine, Institute of Clinical Chemistry and Pathobiochemistry, Technical University of Munich, Munich, Germany; TranslaTUM, Center for Translational Cancer Research, Technical University Munich, Munich, Germany
- Percy Knolle
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Felix Bayerl
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Jan Böttcher
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Chris D Hermann
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Daniel Häußler
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Achim Krüger
  - School of Medicine, Institutes of Molecular Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
13
Honorato RV, Koukos PI, Jiménez-García B, Tsaregorodtsev A, Verlato M, Giachetti A, Rosato A, Bonvin AMJJ. Structural Biology in the Clouds: The WeNMR-EOSC Ecosystem. Front Mol Biosci 2021; 8:729513. [PMID: 34395534 PMCID: PMC8356364 DOI: 10.3389/fmolb.2021.729513] [Citation(s) in RCA: 252] [Impact Index Per Article: 84.0] [Received: 06/23/2021] [Accepted: 07/13/2021] [Indexed: 12/05/2022]
Abstract
Structural biology aims at characterizing the structural and dynamic properties of biological macromolecules in atomic detail. Gaining insight into three-dimensional structures of biomolecules and their interactions is critical for understanding the vast majority of cellular processes, with direct applications in health and food sciences. Since 2010, the WeNMR project (www.wenmr.eu) has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the high throughput computing infrastructure provided by EGI. These services have been further developed in subsequent initiatives under H2020 projects and are now operating as Thematic Services in the European Open Science Cloud portal (www.eosc-portal.eu), sending more than 12 million jobs and using around 4,000 CPU-years per year. Here we review 10 years of successful e-infrastructure solutions serving a large worldwide community of over 23,000 users to date, providing them with user-friendly, web-based solutions that run complex workflows in structural biology. The current set of active WeNMR portals is described, together with the complex backend machinery that allows distributed computing resources to be harvested efficiently.
Affiliation(s)
- Rodrigo V Honorato
  - Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
- Panagiotis I Koukos
  - Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
- Brian Jiménez-García
  - Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
- Andrea Giachetti
  - Department of Chemistry and Magnetic Resonance Center, University of Florence, and C.I.R.M.M.P, Fiorentino, Italy
- Antonio Rosato
  - Department of Chemistry and Magnetic Resonance Center, University of Florence, and C.I.R.M.M.P, Fiorentino, Italy
- Alexandre M J J Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
14
Liu Y, Gong W, Zhao Y, Deng X, Zhang S, Li C. aPRBind: protein-RNA interface prediction by combining sequence and I-TASSER model-based structural features learned with convolutional neural networks. Bioinformatics 2021; 37:937-942. [PMID: 32821925 DOI: 10.1093/bioinformatics/btaa747] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/07/2020] [Revised: 07/26/2020] [Accepted: 08/17/2020] [Indexed: 12/13/2022]
Abstract
MOTIVATION: Protein-RNA interactions play a critical role in various biological processes. The accurate prediction of RNA-binding residues in proteins has been one of the most challenging and intriguing problems in the field of computational biology. The existing methods still have a relatively low accuracy, especially the sequence-based ab-initio methods.
RESULTS: In this work, we propose aPRBind, a convolutional neural network-based ab-initio method for RNA-binding residue prediction. aPRBind is trained with sequence features and structural ones (particularly including residue dynamics information and residue-nucleotide propensity developed by us) that are extracted from the predicted structures by I-TASSER. The analysis of feature contributions indicates the sequence features are most important, followed by dynamics information, and the sequence and structural features are complementary in binding site prediction. The performance comparison of our method with other peer ones on a benchmark dataset shows that aPRBind outperforms some state-of-the-art ab-initio methods. Additionally, aPRBind can give a better prediction for the modeled structures with TM-score≥0.5, and meanwhile since the structural features are not very sensitive to the refined 3D structures, aPRBind has only a marginal dependence on the accuracy of the structure model, which allows aPRBind to be applied to the RNA-binding site prediction for the modeled or unbound structures.
AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/ChunhuaLiLab/aPRbind.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yang Liu
  - Department of Biomedical Engineering, Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Weikang Gong
  - Department of Biomedical Engineering, Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Yanpeng Zhao
  - Department of Biomedical Engineering, Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Xueqing Deng
  - Department of Biomedical Engineering, Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Shan Zhang
  - Department of Biomedical Engineering, Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
  - Department of Biomedical Engineering, Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
15
Matos-Filipe P, Preto AJ, Koukos PI, Mourão J, Bonvin AMJJ, Moreira IS. MENSAdb: a thorough structural analysis of membrane protein dimers. Database (Oxford) 2021; 2021:baab013. [PMID: 33822911 PMCID: PMC8023553 DOI: 10.1093/database/baab013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2020] [Revised: 01/19/2021] [Accepted: 03/01/2021] [Indexed: 11/14/2022]
Abstract
Membrane proteins (MPs) are key players in a variety of different cellular processes and constitute the target of around 60% of all Food and Drug Administration-approved drugs. Despite their importance, there is still a massive lack of relevant structural, biochemical and mechanistic information mainly due to their localization within the lipid bilayer. To help fill this gap, we developed the MEmbrane protein dimer Novel Structure Analyser database (MENSAdb). This interactive web application summarizes the evolutionary and physicochemical properties of dimeric MPs to expand the available knowledge on the fundamental principles underlying their formation. Currently, MENSAdb contains features of 167 unique MPs (63% homo- and 37% heterodimers) and brings insights into the conservation of residues, accessible solvent area descriptors, average B-factors, intermolecular contacts at 2.5 Å and 4.0 Å distance cut-offs, hydrophobic contacts, hydrogen bonds, salt bridges, π-π stacking, T-stacking and cation-π interactions. The regular update and organization of all these data into a unique platform will allow a broad community of researchers to collect and analyse a large number of features efficiently, thus facilitating their use in the development of prediction models associated with MPs. Database URL: http://www.moreiralab.com/resources/mensadb.
Affiliation(s)
- Pedro Matos-Filipe
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
- António J Preto
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
  - PhD Programme in Experimental Biology and Biomedicine, Institute for Interdisciplinary Research, University of Coimbra, Coimbra, 3030-789, Portugal
- Panagiotis I Koukos
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584, CH, Netherlands
- Joana Mourão
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
- Alexandre M J J Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584, CH, Netherlands
- Irina S Moreira
  - Department of Life Sciences, University of Coimbra, Coimbra, 3000-456, Portugal
  - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
16
|
Pluska L, Jarosch E, Zauber H, Kniss A, Waltho A, Bagola K, von Delbrück M, Löhr F, Schulman BA, Selbach M, Dötsch V, Sommer T. The UBA domain of conjugating enzyme Ubc1/Ube2K facilitates assembly of K48/K63-branched ubiquitin chains. EMBO J 2021; 40:e106094. [PMID: 33576509 PMCID: PMC7957398 DOI: 10.15252/embj.2020106094] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/29/2020] [Revised: 12/22/2020] [Accepted: 01/05/2021] [Indexed: 12/23/2022] Open
Abstract
The assembly of a specific polymeric ubiquitin chain on a target protein is a key event in the regulation of numerous cellular processes. Yet, the mechanisms that govern the selective synthesis of particular polyubiquitin signals remain enigmatic. The homologous ubiquitin-conjugating (E2) enzymes Ubc1 (budding yeast) and Ube2K (mammals) exclusively generate polyubiquitin linked through lysine 48 (K48). Uniquely among E2 enzymes, Ubc1 and Ube2K harbor a ubiquitin-binding UBA domain with unknown function. We found that this UBA domain preferentially interacts with ubiquitin chains linked through lysine 63 (K63). Based on structural modeling, in vitro ubiquitination experiments, and NMR studies, we propose that the UBA domain aligns Ubc1 with K63-linked polyubiquitin and facilitates the selective assembly of K48/K63-branched ubiquitin conjugates. Genetic and proteomics experiments link the activity of the UBA domain, and hence the formation of this unusual ubiquitin chain topology, to the maintenance of cellular proteostasis.
Affiliation(s)
- Lukas Pluska
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
- Ernst Jarosch
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
- Henrik Zauber
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
- Andreas Kniss
  - Institute of Biophysical Chemistry and Center for Biomolecular Magnetic Resonance, Goethe University, Frankfurt am Main, Germany
- Anita Waltho
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
- Katrin Bagola
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
- Frank Löhr
  - Institute of Biophysical Chemistry and Center for Biomolecular Magnetic Resonance, Goethe University, Frankfurt am Main, Germany
- Brenda A Schulman
  - Department of Molecular Machines and Signaling, Max Planck Institute of Biochemistry, Martinsried, Germany
- Matthias Selbach
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
  - Charité – Universitätsmedizin Berlin, Berlin, Germany
- Volker Dötsch
  - Institute of Biophysical Chemistry and Center for Biomolecular Magnetic Resonance, Goethe University, Frankfurt am Main, Germany
- Thomas Sommer
  - Max‐Delbrück‐Center for Molecular Medicine in the Helmholtz Association, Berlin‐Buch, Germany
  - Institute for Biology, Humboldt‐Universität zu Berlin, Berlin, Germany
17
Sitani D, Giorgetti A, Alfonso-Prieto M, Carloni P. Robust principal component analysis-based prediction of protein-protein interaction hot spots. Proteins 2021; 89:639-647. [PMID: 33458895 DOI: 10.1002/prot.26047] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/25/2020] [Revised: 12/28/2020] [Accepted: 12/31/2020] [Indexed: 12/21/2022]
Abstract
Proteins often exert their function by binding to other cellular partners. Hot spots are key residues for protein-protein binding. Their identification may shed light on the impact of disease-associated mutations on protein complexes and help design protein-protein interaction inhibitors for therapy. Unfortunately, current machine learning methods to predict hot spots suffer from limitations caused by gross errors in the data matrices. Here, we present a novel data pre-processing pipeline that overcomes this problem by recovering a low-rank matrix with reduced noise using Robust Principal Component Analysis. Application to existing databases shows the predictive power of the method.
Affiliation(s)
- Divya Sitani
  - JARA-Institute: Molecular Neuroscience and Neuroimaging, Institute for Neuroscience and Medicine INM-11/JARA-BRAIN Institute JBI-2, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Department of Biology, RWTH Aachen University, Aachen, Germany
- Alejandro Giorgetti
  - Institute for Advanced Simulations IAS-5 / Institute for Neuroscience and Medicine INM-9, Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Department of Biotechnology, University of Verona, Verona, Italy
- Mercedes Alfonso-Prieto
  - Institute for Advanced Simulations IAS-5 / Institute for Neuroscience and Medicine INM-9, Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Cécile and Oskar Vogt Institute for Brain Research, University Hospital Düsseldorf, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Paolo Carloni
  - JARA-Institute: Molecular Neuroscience and Neuroimaging, Institute for Neuroscience and Medicine INM-11/JARA-BRAIN Institute JBI-2, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Institute for Advanced Simulations IAS-5 / Institute for Neuroscience and Medicine INM-9, Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Department of Physics, RWTH Aachen University, Aachen, Germany
  - JARA-HPC, IAS-5/INM-9 Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany
18
Abstract
Targeting protein-protein interactions is a challenging and crucial task in the drug discovery process. A good starting point for rational drug design is the identification of hot spots (HS) at protein-protein interfaces, typically conserved residues that contribute most significantly to the binding. In this chapter, we describe step by step an in-house pipeline for HS prediction that uses only sequence-based features from the well-known SpotOn dataset of soluble proteins (Moreira et al., Sci Rep 7:8007, 2017) and is implemented as a deep neural network. The presented pipeline is divided into three steps: (1) feature extraction, (2) deep learning classification, and (3) model evaluation. We present all the available resources, including code snippets, the main dataset, and the free and open-source modules/packages necessary for full replication of the protocol. Users should be able to develop an HS prediction model with accuracy, precision, recall, and AUROC of 0.96, 0.93, 0.91, and 0.86, respectively.
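For readers unfamiliar with the evaluation metrics quoted in this abstract, the following minimal sketch shows how accuracy, precision, and recall follow from a binary confusion matrix. The labels are made-up toy data (1 = hot spot, 0 = null spot) and `classification_metrics` is a hypothetical helper, not code from the chapter.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Toy example: three true hot spots among eight interface residues.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Because hot spots are usually the minority class, accuracy alone can look high even when recall is poor, which is why the three values are typically reported together.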
Affiliation(s)
- António J Preto
  - Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
  - Institute for Interdisciplinary Research, University of Coimbra, Coimbra, Portugal
- Pedro Matos-Filipe
  - Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- José G de Almeida
  - Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Joana Mourão
  - Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
  - Institute for Interdisciplinary Research, University of Coimbra, Coimbra, Portugal
- Irina S Moreira
  - Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
  - Department of Life Sciences, University of Coimbra, Coimbra, Portugal
19
Abstract
Immunogenicity against biotherapeutic proteins (BPs) and the potential outcome for the patient are difficult to predict. In vitro assays that can help to assess the immunogenic potential of BPs are not yet used routinely during drug development. MAPPs (MHC-associated peptide proteomics) is one of the best-characterized assays in terms of its value for assessing immunogenicity potential. This review focuses on recent studies that have employed human HLA class II-MAPPs assays to rank biotherapeutic candidates, investigate clinical immunogenicity, and understand mechanistic root causes of immunogenicity. Advantages and challenges of the technology are discussed, as well as the different areas of application.
Affiliation(s)
- Anette C Karle
  - Novartis Institute for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
20
Yang W, Sun X, Zhang C, Lai L. Discovery of novel helix binding sites at protein-protein interfaces. Comput Struct Biotechnol J 2019; 17:1396-1403. [PMID: 31768230 PMCID: PMC6872852 DOI: 10.1016/j.csbj.2019.11.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/07/2019] [Revised: 10/29/2019] [Accepted: 11/01/2019] [Indexed: 01/09/2023] Open
Abstract
Protein-protein interactions (PPIs) play a key role in numerous biological processes. Many efforts have been undertaken to develop PPI modulators for therapeutic applications; however, to date, most of the peptide binders designed to target PPIs are derived from native binding helices or use the native helix binding site, which has limited the applications of protein-protein interface binding peptide design. Here, we developed a general computational algorithm, HPer (Helix Positioner), that locates single-helix binding sites at protein-protein interfaces based on the structure of protein targets. HPer performed well on known single-helix-mediated PPIs and recaptured the key interactions and hot-spot residues of native helical binders. We also screened non-helical-mediated PPIs in the PDBbind database and identified 17 PPIs that were suitable for helical peptide binding, and the helical binding sites in these PPIs were also predicted for designing novel peptide ligands. The L2 domain of EGFR, which was the top ranked, was selected as an example to show the protocol and results of designing novel helical peptide ligands on the searched binding site. The binding stability of the designed sequences was further investigated using molecular dynamics simulations.
Affiliation(s)
- Wei Yang
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Center for Quantitative Biology, AAIS, Peking University, Beijing 100871, China
- Xiangyu Sun
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Changsheng Zhang
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Luhua Lai
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, AAIS, Peking University, Beijing 100084, China
  - Center for Quantitative Biology, AAIS, Peking University, Beijing 100871, China
21
Jankauskaite J, Jiménez-García B, Dapkunas J, Fernández-Recio J, Moal IH. SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 2019; 35:462-469. [PMID: 30020414 PMCID: PMC6361233 DOI: 10.1093/bioinformatics/bty635] [Citation(s) in RCA: 141] [Impact Index Per Article: 28.2] [Received: 05/16/2018] [Accepted: 07/17/2018] [Indexed: 11/18/2022] Open
Abstract
Motivation: Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein–protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering. Results: We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein–protein interactions. This version now contains manually curated binding data for 7085 mutations, an increase of 133%, including changes in kinetics for 1844 mutations, enthalpy and entropy changes for 443 mutations, and 440 mutations that abolish detectable binding. Availability and implementation: The database is available as supplementary data and at https://life.bsc.es/pid/skempi2/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Justina Jankauskaite
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Brian Jiménez-García
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, Utrecht, the Netherlands
- Justas Dapkunas
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Juan Fernández-Recio
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Institut de Biologia Molecular de Barcelona (IBMB), CSIC, Barcelona, Spain
- Iain H Moal
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
22
Schlee S, Straub K, Schwab T, Kinateder T, Merkl R, Sterner R. Prediction of quaternary structure by analysis of hot spot residues in protein-protein interfaces: the case of anthranilate phosphoribosyltransferases. Proteins 2019; 87:815-825. [PMID: 31134642 DOI: 10.1002/prot.25744] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/04/2019] [Revised: 05/06/2019] [Accepted: 05/22/2019] [Indexed: 12/13/2022]
Abstract
It is an important goal of computational biology to correctly predict the association state of a protein based on its amino acid sequence and the structures of known homologues. We have pursued this goal using the example of anthranilate phosphoribosyltransferase (AnPRT), an enzyme that is involved in the biosynthesis of the amino acid tryptophan. First, known crystal structures of naturally occurring homodimeric AnPRTs were analyzed using the Protein Interfaces, Surfaces, and Assemblies (PISA) service of the European Bioinformatics Institute (EBI). This led to the identification of two hydrophobic "hot spot" amino acids in the protein-protein interface that were predicted to be essential for self-association. Next, in a comprehensive multiple sequence alignment (MSA), naturally occurring AnPRT variants with hydrophilic or charged amino acids in place of hydrophobic residues in the two hot spot positions were identified. Representative variants were characterized in terms of thermal stability, enzymatic activity, and quaternary structure. We found that AnPRT variants with charged residues in both hot spot positions exist exclusively as monomers in solution. Variants with hydrophilic amino acids in one hot spot position occur in both forms, monomer and dimer. The results of the present study provide a detailed characterization of the determinants of the AnPRT monomer-dimer equilibrium and show that analysis of hot spots in combination with MSAs can be a valuable tool in the prediction of protein quaternary structures.
Affiliation(s)
- Sandra Schlee
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Kristina Straub
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Thomas Schwab
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Thomas Kinateder
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Rainer Merkl
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Reinhard Sterner
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
23
Korani W, Clevenger JP, Chu Y, Ozias-Akins P. Machine Learning as an Effective Method for Identifying True Single Nucleotide Polymorphisms in Polyploid Plants. Plant Genome 2019; 12:180023. [PMID: 30951095 DOI: 10.3835/plantgenome2018.05.0023] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 05/25/2023]
Abstract
Single nucleotide polymorphisms (SNPs) have many advantages as molecular markers since they are ubiquitous and codominant. However, the discovery of true SNPs in polyploid species is difficult. Peanut (Arachis hypogaea L.) is an allopolyploid, which has a very low rate of true SNP calling. A large set of true and false SNPs identified from the Axiom_ 58k array was leveraged to train machine-learning models to enable identification of true SNPs directly from sequence data to reduce ascertainment bias. These models achieved accuracy rates above 80% using real peanut RNA sequencing (RNA-seq) and whole-genome shotgun (WGS) resequencing data, which is higher than previously reported for polyploids and at least a twofold improvement for peanut. A 48K SNP array, Axiom_2, was designed using this approach, resulting in 75% accuracy of calling SNPs from different tetraploid peanut genotypes. Using the method to simulate SNP variation in several polyploids, models achieved >98% accuracy in selecting true SNPs. Additionally, models built with simulated genotypes were able to select true SNPs at >80% accuracy using real peanut data. This work accomplished the objective of creating an effective approach for calling highly reliable SNPs from polyploids using machine learning. A novel tool, designated SNP machine learning (SNP-ML), was developed for predicting true SNPs from sequence data using the described models. SNP-ML additionally provides functionality to train new models not included in this study for customized use, designated SNP machine learner (SNP-MLer). SNP-ML is publicly available.
24
Liu Q, Chen P, Wang B, Zhang J, Li J. dbMPIKT: a database of kinetic and thermodynamic mutant protein interactions. BMC Bioinformatics 2018; 19:455. [PMID: 30482172 PMCID: PMC6260753 DOI: 10.1186/s12859-018-2493-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 03/18/2018] [Accepted: 11/13/2018] [Indexed: 02/06/2023] Open
Abstract
Background: Protein-protein interactions (PPIs) play important roles in biological functions. Studies of the effects of mutants on protein interactions can provide further understanding of PPIs. Currently, many databases collect experimental mutants to assess protein interactions, but most of these databases are old and have not been updated for several years. Results: To address this issue, we manually curated a kinetic and thermodynamic database of mutant protein interactions (dbMPIKT) that is freely accessible at our website. This database contains 5291 mutants in protein interactions collected from previous databases and the literature published within the last three years. Furthermore, some data analysis, such as mutation number, mutation type, protein pair source and network map construction, can be performed online. Conclusion: Our work can promote the study of PPIs, and novel information can be mined from the new database. Our database is available at http://DeepLearner.ahu.edu.cn/web/dbMPIKT/ for use by all, including both academics and non-academics. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2493-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Quanya Liu
  - Institute of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
- Peng Chen
  - Institute of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
- Bing Wang
  - School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan, 243032, Anhui, China
- Jun Zhang
  - School of Electronic Engineering & Automation, Anhui University, Hefei, 230601, Anhui, China
- Jinyan Li
  - Advanced Analytics Institute and Centre for Health Technologies, University of Technology, Broadway, Sydney, NSW, 2007, Australia
25
Liu S, Liu C, Deng L. Machine Learning Approaches for Protein-Protein Interaction Hot Spot Prediction: Progress and Comparative Assessment. Molecules 2018; 23:E2535. [PMID: 30287797 DOI: 10.3390/molecules23102535] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 08/30/2018] [Revised: 09/27/2018] [Accepted: 10/02/2018] [Indexed: 12/27/2022] Open
Abstract
Hot spots are the subset of interface residues that account for most of the binding free energy, and they play essential roles in the stability of protein binding. Effectively identifying which specific interface residues of protein–protein complexes form the hot spots is critical for understanding the principles of protein interactions, and it has broad application prospects in protein design and drug development. Experimental methods like alanine scanning mutagenesis are labor-intensive and time-consuming. At present, the experimentally measured hot spots are very limited. Hence, the use of computational approaches to predicting hot spots is becoming increasingly important. Here, we describe the basic concepts and recent advances of machine learning applications in inferring the protein–protein interaction hot spots, and assess the performance of widely used features, machine learning algorithms, and existing state-of-the-art approaches. We also discuss the challenges and future directions in the prediction of hot spots.
26
Daberdaku S, Ferrari C. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction. BMC Bioinformatics 2018; 19:35. [PMID: 29409446 PMCID: PMC5802066 DOI: 10.1186/s12859-018-2043-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/29/2017] [Accepted: 01/24/2018] [Indexed: 12/22/2022] Open
Abstract
Background: The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. Results: In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). Conclusions: The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2043-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sebastian Daberdaku
  - Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
- Carlo Ferrari
  - Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
27
Simões ICM, Coimbra JTS, Neves RPP, Costa IPD, Ramos MJ, Fernandes PA. Properties that rank protein:protein docking poses with high accuracy. Phys Chem Chem Phys 2018; 20:20927-20942. [DOI: 10.1039/c8cp03888k] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/23/2022]
Abstract
The development of docking algorithms to predict near-native structures of protein:protein complexes from the structure of the isolated monomers is of paramount importance for molecular biology and drug discovery.
Affiliation(s)
- Inês C. M. Simões
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- João T. S. Coimbra
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Rui P. P. Neves
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Inês P. D. Costa
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Maria J. Ramos
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Pedro A. Fernandes
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
28
Sanchez-Garcia R, Sorzano COS, Carazo JM, Segura J. 3DCONS-DB: A Database of Position-Specific Scoring Matrices in Protein Structures. Molecules 2017; 22:E2230. [PMID: 29244774 DOI: 10.3390/molecules22122230] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/31/2017] [Revised: 12/11/2017] [Accepted: 12/13/2017] [Indexed: 11/16/2022] Open
Abstract
Many studies have used position-specific scoring matrices (PSSM) profiles to characterize residues in protein structures and to predict a broad range of protein features. Moreover, PSSM profiles of Protein Data Bank (PDB) entries have been recalculated in many works for different purposes. Although the computational cost of calculating a single PSSM profile is affordable, many statistical studies or machine learning-based methods used thousands of profiles to achieve their goals, thereby leading to a substantial increase of the computational cost. In this work we present a new database compiling PSSM profiles for the proteins of the PDB. Currently, the database contains 333,532 protein chain profiles involving 123,135 different PDB entries.
29
Almeida JG, Preto AJ, Koukos PI, Bonvin AM, Moreira IS. Membrane proteins structures: A review on computational modeling tools. Biochimica et Biophysica Acta (BBA) - Biomembranes 2017; 1859:2021-39. [DOI: 10.1016/j.bbamem.2017.07.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 03/23/2017] [Revised: 07/04/2017] [Accepted: 07/13/2017] [Indexed: 01/02/2023]
30
Moreira IS, Koukos PI, Melo R, Almeida JG, Preto AJ, Schaarschmidt J, Trellet M, Gümüş ZH, Costa J, Bonvin AMJJ. SpotOn: High Accuracy Identification of Protein-Protein Interface Hot-Spots. Sci Rep 2017; 7:8007. [PMID: 28808256 PMCID: PMC5556074 DOI: 10.1038/s41598-017-08321-2] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 04/18/2017] [Accepted: 07/07/2017] [Indexed: 12/21/2022] Open
Abstract
We present SpotOn, a web server to identify and classify interfacial residues as Hot-Spots (HS) and Null-Spots (NS). SpotOn implements a robust algorithm with a demonstrated accuracy of 0.95 and sensitivity of 0.98 on an independent test set. The predictor was developed using an ensemble machine learning approach with up-sampling of the minority class. It was trained on 53 complexes using various features, based on both protein 3D structure and sequence. The SpotOn web interface is freely available at: http://milou.science.uu.nl/services/SPOTON/.
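The up-sampling step mentioned in this abstract can be pictured with the minimal sketch below, which randomly duplicates samples of the smaller class until all classes are the same size. The function `upsample_minority`, the feature values, and the fixed seed are illustrative assumptions; the abstract does not say which up-sampling scheme SpotOn uses.

```python
import random

def upsample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw extra copies with replacement from the existing samples of this class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels

# Toy data: four null spots (NS) and one hot spot (HS), one hypothetical feature each.
samples = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["NS", "NS", "NS", "NS", "HS"]
balanced_x, balanced_y = upsample_minority(samples, labels)
```

Up-sampling should be applied to the training split only; duplicating samples before splitting would leak copies of training points into the test set.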
Affiliation(s)
- Irina S Moreira
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1°andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Panagiotis I Koukos
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Rita Melo
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1°andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
  - Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, Estrada Nacional 10 (ao km 139,7), 2695-066, Bobadela LRS, Portugal
- Jose G Almeida
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1°andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
- Antonio J Preto
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1°andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
- Joerg Schaarschmidt
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Mikael Trellet
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Zeynep H Gümüş
  - Department of Genetics and Genomics and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Joaquim Costa
  - CMUP/FCUP, Centro de Matemática da Universidade do Porto, Faculdade de Ciências, Rua do Campo Alegre, 4169-007, Porto, Portugal
- Alexandre M J J Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
|
31
|
Todeschini R, Pazos A, Arrasate S, González-Díaz H. Data Analysis in Chemistry and Bio-Medical Sciences. Int J Mol Sci 2016; 17:ijms17122105. [PMID: 27983646 PMCID: PMC5187905 DOI: 10.3390/ijms17122105] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/10/2016] [Revised: 12/05/2016] [Accepted: 12/07/2016] [Indexed: 01/04/2023] Open
Affiliation(s)
- Roberto Todeschini
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, 20126 Milano, Italy.
- Alejandro Pazos
  - Research Center on Information and Communication Technologies (CITIC), Institute of Biomedical Research (INIBIC), University of Coruña (UDC), Campus de Elviña s/n, 15071 A Coruña, Spain.
- Sonia Arrasate
  - Department of Organic Chemistry II, University of the Basque Country UPV/EHU, Sarriena w/n, 48940 Leioa, Bizkaia, Spain.
- Humberto González-Díaz
  - Department of Organic Chemistry II, University of the Basque Country UPV/EHU, Sarriena w/n, 48940 Leioa, Bizkaia, Spain.
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain.
|
32
|
Li ZW, You ZH, Chen X, Gui J, Nie R. Highly Accurate Prediction of Protein-Protein Interactions via Incorporating Evolutionary Information and Physicochemical Characteristics. Int J Mol Sci 2016; 17:ijms17091396. [PMID: 27571061 PMCID: PMC5037676 DOI: 10.3390/ijms17091396] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/04/2016] [Revised: 08/13/2016] [Accepted: 08/16/2016] [Indexed: 12/19/2022] Open
Abstract
Protein-protein interactions (PPIs) occur at almost all levels of cell function and play crucial roles in various cellular processes. Thus, identification of PPIs is critical for deciphering molecular mechanisms and providing further insight into biological processes. Although a variety of high-throughput experimental techniques have been developed to identify PPIs, the PPI pairs identified experimentally cover only a small fraction of the whole PPI networks, and those approaches have inherent disadvantages, such as being time-consuming and expensive and having a high false positive rate. Therefore, it is urgent and imperative to develop automatic in silico approaches to predict PPIs efficiently and accurately. In this article, we propose a novel feature extraction method that combines physicochemical and evolutionary-based information for predicting PPIs using our newly developed discriminative vector machine (DVM) classifier. The improvements of the proposed method consist mainly in introducing an effective feature extraction method that can capture discriminative features from the evolutionary-based information and physicochemical characteristics, and in then employing a powerful and robust DVM classifier. To the best of our knowledge, this is the first time that the DVM model has been applied to the field of bioinformatics. When applying the proposed method to the Yeast and Helicobacter pylori (H. pylori) datasets, we obtain excellent prediction accuracies of 94.35% and 90.61%, respectively. The computational results indicate that our method is effective and robust for predicting PPIs, and can be taken as a useful supplementary tool to the traditional experimental methods for future proteomics research.
Affiliation(s)
- Zheng-Wei Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China.
- Zhu-Hong You
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China.
- Xing Chen
  - School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 21116, China.
- Jie Gui
  - Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, China.
- Ru Nie
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China.
|