1
|
Li K, Quan L, Jiang Y, Li Y, Zhou Y, Wu T, Lyu Q. ctP 2ISP: Protein-Protein Interaction Sites Prediction Using Convolution and Transformer With Data Augmentation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:297-306. [PMID: 35213314 DOI: 10.1109/tcbb.2022.3154413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Protein-protein interactions are the basis of many cellular biological processes, such as cellular organization, signal transduction, and immune response. Identifying protein-protein interaction sites is essential for understanding the mechanisms of various biological processes, disease development, and drug design. However, it remains a challenging task to make accurate predictions, as the small amount of training data and severe imbalanced classification reduce the performance of computational methods. We design a deep learning method named ctP2ISP to improve the prediction of protein-protein interaction sites. ctP2ISP employs Convolution and Transformer to extract information and enhance information perception so that semantic features can be mined to identify protein-protein interaction sites. A weighting loss function with different sample weights is designed to suppress the preference of the model toward multi-category prediction. To efficiently reuse the information in the training set, a preprocessing of data augmentation with an improved sample-oriented sampling strategy is applied. The trained ctP2ISP was evaluated against current state-of-the-art methods on six public datasets. The results show that ctP2ISP outperforms all other competing methods on the balance metrics: F1, MCC, and AUPRC. In particular, our prediction on open tests related to viruses may also be consistent with biological insights. The source code and data can be obtained from https://github.com/lennylv/ctP2ISP.
Collapse
|
2
|
Biswas P, Das M, Pal S, Ghosh R, Dam S. EhSir2c, a Sir2 homolog from the human pathogen Entamoeba histolytica interacts with a DNA repair protein, EhRAD23: Protein-protein interaction, docking and functional study. J Biomol Struct Dyn 2023; 41:263-279. [PMID: 34809531 DOI: 10.1080/07391102.2021.2004925] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Chromosome segregation is a crucial phenomenon in the cell cycle and defects in genome segregation result in an abnormality in various cellular events. Unlike higher eukaryotes, chromosome segregation and a number of cell cycle events are unusual in the protozoan parasite Entamoeba histolytica (E. histolytica). Characterization of Sir2 proteins from E. histolytica may reveal its unique cellular events as they play role in diverse cellular processes including chromosome segregation. E. histolytica has four homologs of Sir2 proteins. EhSir2a and EhSir2b show sequence similarity towards eukaryotic Sir2 homologs, whereas EhSir2c and EhSir2d are more like prokaryotic sirtuins. Using both computational and experimental methods, EhSir2c has been characterized in this study. The three-dimensional structure of EhSir2c is predicted by homology modelling. The protein interactors of EhSir2c have been identified by yeast-two-hybrid screening against the cDNA library of E. histolytica. We have identified a novel interactor, EhRAD23 which is a homolog of UV excision repair protein RAD23. The interaction of EhSir2c and EhRAD23 was validated by pull-down assay. UV-C irradiation up-regulates the relative expression of EhSir2c, suggesting the necessity of EhSir2c in UV-induced stress in this parasite.Communicated by Ramaswamy H. Sarma.
Collapse
Affiliation(s)
- Pinaki Biswas
- Department of Microbiology, The University of Burdwan, Burdwan, India
| | - Moubonny Das
- Department of Microbiology, The University of Burdwan, Burdwan, India
| | - Suchetana Pal
- Department of Microbiology, The University of Burdwan, Burdwan, India
| | - Raktim Ghosh
- Department of Microbiology, The University of Burdwan, Burdwan, India
| | - Somasri Dam
- Department of Microbiology, The University of Burdwan, Burdwan, India
| |
Collapse
|
3
|
ISPRED-SEQ: Deep neural networks and embeddings for predicting interaction sites in protein sequences. J Mol Biol 2023. [DOI: 10.1016/j.jmb.2023.167963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]
|
4
|
de Oliveira Matos A, dos Santos Dantas PH, Colmenares MTC, Sartori GR, Silva-Sales M, Da Silva JHM, Neves BJ, Andrade CH, Sales-Campos H. The CDR3 region as the major driver of TREM-1 interaction with its ligands, an in silico characterization. Comput Struct Biotechnol J 2023; 21:2579-2590. [PMID: 37122631 PMCID: PMC10130352 DOI: 10.1016/j.csbj.2023.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2022] [Revised: 04/07/2023] [Accepted: 04/12/2023] [Indexed: 05/02/2023] Open
Abstract
The triggering receptor expressed on myeloid cells-1 (TREM-1) is a pattern recognition receptor heavily investigated in infectious and non-infectious diseases. Because of its role in amplifying inflammation, TREM-1 has been explored as a diagnostic/prognostic biomarker. Further, as the receptor has been implicated in the pathophysiology of several diseases, therapies aiming at modulating its activity represent a promising strategy to constrain uncontrolled inflammatory or infectious diseases. Despite this, several aspects concerning its interaction with ligands and activation process, remain unclear. Although many molecules have been suggested as TREM-1 ligands, only five have been confirmed to interact with the receptor: actin, eCIRP, HMGB1, Hsp70 and PGLYRP1. However, the domains involved in the interaction between the receptor and these proteins are not clarified yet. Therefore, here we used in silico approaches to investigate the putative binding domains in the receptor, using hot spots analysis, molecular docking and molecular dynamics simulations between TREM-1 and its five known ligands. Our results indicated the complementarity-determining regions (CDRs) of the receptor as the main mediators of antigen recognition, especially the CDR3 loop. We believe that our study could be used as structural basis for the elucidation of TREM-1's recognition process, and may be useful for prospective in silico and biological investigations exploring the receptor in different contexts.
Collapse
Affiliation(s)
| | | | | | | | - Marcelle Silva-Sales
- Instituto de Patologia Tropical e Saúde Pública, Universidade Federal de Goiás, Goiânia, Brazil
| | | | - Bruno Junior Neves
- LabMol – Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, Brazil
| | - Carolina Horta Andrade
- LabMol – Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, Brazil
| | - Helioswilton Sales-Campos
- Instituto de Patologia Tropical e Saúde Pública, Universidade Federal de Goiás, Goiânia, Brazil
- Correspondence to: Universidade Federal de Goiás – UFG, Instituto de Patologia Tropical e Saúde Pública – IPTSP, Rua 235, S/N, sala 332, Setor Leste Universitário, Goiânia, Goiás 746050-05, Brazil.
| |
Collapse
|
5
|
Chen R, Li X, Yang Y, Song X, Wang C, Qiao D. Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 2022; 9:985022. [PMID: 36250006 PMCID: PMC9567019 DOI: 10.3389/fmolb.2022.985022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2022] [Accepted: 07/27/2022] [Indexed: 11/25/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) participate in many biological processes by interacting with other proteins, including the regulation of transcription, translation, and the cell cycle. With the increasing amount of disorder sequence data available, it is thus crucial to identify the IDP binding sites for functional annotation of these proteins. Over the decades, many computational approaches have been developed to predict protein-protein binding sites of IDP (IDP-PPIS) based on protein sequence information. Moreover, there are new IDP-PPIS predictors developed every year with the rapid development of artificial intelligence. It is thus necessary to provide an up-to-date overview of these methods in this field. In this paper, we collected 30 representative predictors published recently and summarized the databases, features and algorithms. We described the procedure how the features were generated based on public data and used for the prediction of IDP-PPIS, along with the methods to generate the feature representations. All the predictors were divided into three categories: scoring functions, machine learning-based prediction, and consensus approaches. For each category, we described the details of algorithms and their performances. Hopefully, our manuscript will not only provide a full picture of the status quo of IDP binding prediction, but also a guide for selecting different methods. More importantly, it will shed light on the inspirations for future development trends and principles.
Collapse
Affiliation(s)
- Ranran Chen
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Xinlu Li
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Yaqing Yang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Xixi Song
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Cheng Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- *Correspondence: Cheng Wang, ; Dongdong Qiao,
| | - Dongdong Qiao
- Shandong Mental Health Center, Shandong University, Jinan, China
- *Correspondence: Cheng Wang, ; Dongdong Qiao,
| |
Collapse
|
6
|
Capel H, Weiler R, Dijkstra M, Vleugels R, Bloem P, Feenstra KA. ProteinGLUE multi-task benchmark suite for self-supervised protein modeling. Sci Rep 2022; 12:16047. [PMID: 36163232 PMCID: PMC9512797 DOI: 10.1038/s41598-022-19608-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Accepted: 08/31/2022] [Indexed: 11/09/2022] Open
Abstract
Self-supervised language modeling is a rapidly developing approach for the analysis of protein sequence data. However, work in this area is heterogeneous and diverse, making comparison of models and methods difficult. Moreover, models are often evaluated only on one or two downstream tasks, making it unclear whether the models capture generally useful properties. We introduce the ProteinGLUE benchmark for the evaluation of protein representations: a set of seven per-amino-acid tasks for evaluating learned protein representations. We also offer reference code, and we provide two baseline models with hyperparameters specifically trained for these benchmarks. Pre-training was done on two tasks, masked symbol prediction and next sentence prediction. We show that pre-training yields higher performance on a variety of downstream tasks such as secondary structure and protein interaction interface prediction, compared to no pre-training. However, the larger base model does not outperform the smaller medium model. We expect the ProteinGLUE benchmark dataset introduced here, together with the two baseline pre-trained models and their performance evaluations, to be of great value to the field of protein sequence-based property prediction. Availability: code and datasets from https://github.com/ibivu/protein-glue .
Collapse
Affiliation(s)
- Henriette Capel
- Informatics Institute, Vrije Universiteit, 1081 HV, Amsterdam, The Netherlands
| | - Robin Weiler
- Informatics Institute, Vrije Universiteit, 1081 HV, Amsterdam, The Netherlands
| | - Maurits Dijkstra
- Informatics Institute, Vrije Universiteit, 1081 HV, Amsterdam, The Netherlands
| | - Reinier Vleugels
- Informatics Institute, Vrije Universiteit, 1081 HV, Amsterdam, The Netherlands
| | - Peter Bloem
- Informatics Institute, Vrije Universiteit, 1081 HV, Amsterdam, The Netherlands
| | - K Anton Feenstra
- Informatics Institute, Vrije Universiteit, 1081 HV, Amsterdam, The Netherlands.
| |
Collapse
|
7
|
Multi-task learning to leverage partially annotated data for PPI interface prediction. Sci Rep 2022; 12:10487. [PMID: 35729253 PMCID: PMC9213449 DOI: 10.1038/s41598-022-13951-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Accepted: 05/31/2022] [Indexed: 11/29/2022] Open
Abstract
Protein protein interactions (PPI) are crucial for protein functioning, nevertheless predicting residues in PPI interfaces from the protein sequence remains a challenging problem. In addition, structure-based functional annotations, such as the PPI interface annotations, are scarce: only for about one-third of all protein structures residue-based PPI interface annotations are available. If we want to use a deep learning strategy, we have to overcome the problem of limited data availability. Here we use a multi-task learning strategy that can handle missing data. We start with the multi-task model architecture, and adapted it to carefully handle missing data in the cost function. As related learning tasks we include prediction of secondary structure, solvent accessibility, and buried residue. Our results show that the multi-task learning strategy significantly outperforms single task approaches. Moreover, only the multi-task strategy is able to effectively learn over a dataset extended with structural feature data, without additional PPI annotations. The multi-task setup becomes even more important, if the fraction of PPI annotations becomes very small: the multi-task learner trained on only one-eighth of the PPI annotations—with data extension—reaches the same performances as the single-task learner on all PPI annotations. Thus, we show that the multi-task learning strategy can be beneficial for a small training dataset where the protein’s functional properties of interest are only partially annotated.
Collapse
|
8
|
Casadio R, Martelli PL, Savojardo C. Machine learning solutions for predicting protein–protein interactions. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Affiliation(s)
- Rita Casadio
- Biocomputing Group University of Bologna Bologna Italy
| | | | | |
Collapse
|
9
|
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Collapse
Affiliation(s)
| | - Tâp Ha-Duong
- Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France.
| |
Collapse
|
10
|
Stringer B, de Ferrante H, Abeln S, Heringa J, Feenstra KA, Haydarlou R. PIPENN: protein interface prediction from sequence with an ensemble of neural nets. Bioinformatics 2022; 38:2111-2118. [PMID: 35150231 PMCID: PMC9004643 DOI: 10.1093/bioinformatics/btac071] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2021] [Revised: 01/16/2022] [Accepted: 02/04/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION The interactions between proteins and other molecules are essential to many biological and cellular processes. Experimental identification of interface residues is a time-consuming, costly and challenging task, while protein sequence data are ubiquitous. Consequently, many computational and machine learning approaches have been developed over the years to predict such interface residues from sequence. However, the effectiveness of different Deep Learning (DL) architectures and learning strategies for protein-protein, protein-nucleotide and protein-small molecule interface prediction has not yet been investigated in great detail. Therefore, we here explore the prediction of protein interface residues using six DL architectures and various learning strategies with sequence-derived input features. RESULTS We constructed a large dataset dubbed BioDL, comprising protein-protein interactions from the PDB, and DNA/RNA and small molecule interactions from the BioLip database. We also constructed six DL architectures, and evaluated them on the BioDL benchmarks. This shows that no single architecture performs best on all instances. An ensemble architecture, which combines all six architectures, does consistently achieve peak prediction accuracy. We confirmed these results on the published benchmark set by Zhang and Kurgan (ZK448), and on our own existing curated homo- and heteromeric protein interaction dataset. Our PIPENN sequence-based ensemble predictor outperforms current state-of-the-art sequence-based protein interface predictors on ZK448 on all interaction types, achieving an AUC-ROC of 0.718 for protein-protein, 0.823 for protein-nucleotide and 0.842 for protein-small molecule. AVAILABILITY AND IMPLEMENTATION Source code and datasets are available at https://github.com/ibivu/pipenn/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
| | - Hans de Ferrante
- Department of Computer Science, IBIVU—Center for Integrative Bioinformatics, Vrije Universiteit, 1081HV Amsterdam, The Netherlands
| | - Sanne Abeln
- Department of Computer Science, IBIVU—Center for Integrative Bioinformatics, Vrije Universiteit, 1081HV Amsterdam, The Netherlands
| | - Jaap Heringa
- Department of Computer Science, IBIVU—Center for Integrative Bioinformatics, Vrije Universiteit, 1081HV Amsterdam, The Netherlands
| | - K Anton Feenstra
- Department of Computer Science, IBIVU—Center for Integrative Bioinformatics, Vrije Universiteit, 1081HV Amsterdam, The Netherlands
| | | |
Collapse
|
11
|
Hou Q, Stringer B, Waury K, Capel H, Haydarlou R, Xue F, Abeln S, Heringa J, Feenstra KA. SeRenDIP-CE: Sequence-based Interface Prediction for Conformational Epitopes. Bioinformatics 2021; 37:3421-3427. [PMID: 33974039 PMCID: PMC8136078 DOI: 10.1093/bioinformatics/btab321] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2020] [Revised: 03/26/2021] [Accepted: 04/26/2021] [Indexed: 11/21/2022] Open
Abstract
Motivation Antibodies play an important role in clinical research and biotechnology, with their specificity determined by the interaction with the antigen’s epitope region, as a special type of protein–protein interaction (PPI) interface. The ubiquitous availability of sequence data, allows us to predict epitopes from sequence in order to focus time-consuming wet-lab experiments toward the most promising epitope regions. Here, we extend our previously developed sequence-based predictors for homodimer and heterodimer PPI interfaces to predict epitope residues that have the potential to bind an antibody. Results We collected and curated a high quality epitope dataset from the SAbDab database. Our generic PPI heterodimer predictor obtained an AUC-ROC of 0.666 when evaluated on the epitope test set. We then trained a random forest model specifically on the epitope dataset, reaching AUC 0.694. Further training on the combined heterodimer and epitope datasets, improves our final predictor to AUC 0.703 on the epitope test set. This is better than the best state-of-the-art sequence-based epitope predictor BepiPred-2.0. On one solved antibody–antigen structure of the COVID19 virus spike receptor binding domain, our predictor reaches AUC 0.778. We added the SeRenDIP-CE Conformational Epitope predictors to our webserver, which is simple to use and only requires a single antigen sequence as input, which will help make the method immediately applicable in a wide range of biomedical and biomolecular research. Availability and implementation Webserver, source code and datasets at www.ibi.vu.nl/programs/serendipwww/. Supplementary information Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China.,National institute of health data science of China, Shandong University, Shandong 250002, P. R. China
| | - Bas Stringer
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Katharina Waury
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Henriette Capel
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Reza Haydarlou
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Fuzhong Xue
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China.,National institute of health data science of China, Shandong University, Shandong 250002, P. R. China
| | - Sanne Abeln
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Jaap Heringa
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands.,AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam
| | - K Anton Feenstra
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands.,AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam
| |
Collapse
|
12
|
Kagami L, Roca-Martínez J, Gavaldá-García J, Ramasamy P, Feenstra KA, Vranken WF. Online biophysical predictions for SARS-CoV-2 proteins. BMC Mol Cell Biol 2021; 22:23. [PMID: 33892639 PMCID: PMC8062939 DOI: 10.1186/s12860-021-00362-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Accepted: 04/01/2021] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND The SARS-CoV-2 virus, the causative agent of COVID-19, consists of an assembly of proteins that determine its infectious and immunological behavior, as well as its response to therapeutics. Major structural biology efforts on these proteins have already provided essential insights into the mode of action of the virus, as well as avenues for structure-based drug design. However, not all of the SARS-CoV-2 proteins, or regions thereof, have a well-defined three-dimensional structure, and as such might exhibit ambiguous, dynamic behaviour that is not evident from static structure representations, nor from molecular dynamics simulations using these structures. MAIN: We present a website ( https://bio2byte.be/sars2/ ) that provides protein sequence-based predictions of the backbone and side-chain dynamics and conformational propensities of these proteins, as well as derived early folding, disorder, β-sheet aggregation, protein-protein interaction and epitope propensities. These predictions attempt to capture the inherent biophysical propensities encoded in the sequence, rather than context-dependent behaviour such as the final folded state. In addition, we provide the biophysical variation that is observed in homologous proteins, which gives an indication of the limits of their functionally relevant biophysical behaviour. CONCLUSION The https://bio2byte.be/sars2/ website provides a range of protein sequence-based predictions for 27 SARS-CoV-2 proteins, enabling researchers to form hypotheses about their possible functional modes of action.
Collapse
Affiliation(s)
- Luciano Kagami
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
| | - Joel Roca-Martínez
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
- VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
| | - Jose Gavaldá-García
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
- VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
| | - Pathmanaban Ramasamy
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
- VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000, Ghent, Belgium
| | - K Anton Feenstra
- IBIVU - Center for Integrative Bioinformatics, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, 1081HV, The Netherlands
- AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, 1081HV, The Netherlands
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium.
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium.
- VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium.
| |
Collapse
|
13
|
Waman VP, Sen N, Varadi M, Daina A, Wodak SJ, Zoete V, Velankar S, Orengo C. The impact of structural bioinformatics tools and resources on SARS-CoV-2 research and therapeutic strategies. Brief Bioinform 2021; 22:742-768. [PMID: 33348379 PMCID: PMC7799268 DOI: 10.1093/bib/bbaa362] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2020] [Revised: 11/06/2020] [Accepted: 11/09/2020] [Indexed: 01/18/2023] Open
Abstract
SARS-CoV-2 is the causative agent of COVID-19, the ongoing global pandemic. It has posed a worldwide challenge to human health as no effective treatment is currently available to combat the disease. Its severity has led to unprecedented collaborative initiatives for therapeutic solutions against COVID-19. Studies resorting to structure-based drug design for COVID-19 are plethoric and show good promise. Structural biology provides key insights into 3D structures, critical residues/mutations in SARS-CoV-2 proteins, implicated in infectivity, molecular recognition and susceptibility to a broad range of host species. The detailed understanding of viral proteins and their complexes with host receptors and candidate epitope/lead compounds is the key to developing a structure-guided therapeutic design. Since the discovery of SARS-CoV-2, several structures of its proteins have been determined experimentally at an unprecedented speed and deposited in the Protein Data Bank. Further, specialized structural bioinformatics tools and resources have been developed for theoretical models, data on protein dynamics from computer simulations, impact of variants/mutations and molecular therapeutics. Here, we provide an overview of ongoing efforts on developing structural bioinformatics tools and resources for COVID-19 research. We also discuss the impact of these resources and structure-based studies, to understand various aspects of SARS-CoV-2 infection and therapeutic development. These include (i) understanding differences between SARS-CoV-2 and SARS-CoV, leading to increased infectivity of SARS-CoV-2, (ii) deciphering key residues in the SARS-CoV-2 involved in receptor-antibody recognition, (iii) analysis of variants in host proteins that affect host susceptibility to infection and (iv) analyses facilitating structure-based drug and vaccine design against SARS-CoV-2.
Collapse
Affiliation(s)
| | | | | | - Antoine Daina
- Molecular Modeling Group at SIB, Swiss Institute of Bioinformatics
| | | | - Vincent Zoete
- Department of Fundamental Oncology at the University of Lausanne and Group leader at SIB
| | | | | |
Collapse
|
14
|
Savojardo C, Martelli PL, Casadio R. Protein–Protein Interaction Methods and Protein Phase Separation. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-011720-104428] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
In the last decade, newly developed experimental methods have made it possible to highlight that macromolecules in the cell milieu physically interact to support physiology. This has shifted the problem of protein–protein interaction from a microscopic, electron-density scale to a mesoscopic one. Further, nowadays there is increasing evidence that proteins in the nucleus and in the cytoplasm can aggregate in membraneless organelles for different physiological reasons. In this scenario, it is urgent to face the problem of biomolecule functional annotation with efficient computational methods, suited to extract knowledge from reliable data and transfer information across different domains of investigation. Here, we revise the present state of the art of our knowledge of protein–protein interaction and the computational methods that differently implement it. Furthermore, we explore experimental and computational features of a set of proteins involved in phase separation.
Collapse
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
| | - Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
| | - Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Institute of Biomembranes, Bioenergetics, and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), 70126 Bari, Italy
| |
Collapse
|
15
|
Barreto CAV, Baptista SJ, Preto AJ, Matos-Filipe P, Mourão J, Melo R, Moreira I. Prediction and targeting of GPCR oligomer interfaces. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020; 169:105-149. [PMID: 31952684 DOI: 10.1016/bs.pmbts.2019.11.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
GPCR oligomerization has emerged as a hot topic in the GPCR field in the last years. Receptors that are part of these oligomers can influence each other's function, although it is not yet entirely understood how these interactions work. The existence of such a highly complex network of interactions between GPCRs generates the possibility of alternative targets for new therapeutic approaches. However, challenges still exist in the characterization of these complexes, especially at the interface level. Different experimental approaches, such as FRET or BRET, are usually combined to study GPCR oligomer interactions. Computational methods have been applied as a useful tool for retrieving information from GPCR sequences and the few X-ray-resolved oligomeric structures that are accessible, as well as for predicting new and trustworthy GPCR oligomeric interfaces. Machine-learning (ML) approaches have recently helped with some hindrances of other methods. By joining and evaluating multiple structure-, sequence- and co-evolution-based features on the same algorithm, it is possible to dilute the issues of particular structures and residues that arise from the experimental methodology into all-encompassing algorithms capable of accurately predict GPCR-GPCR interfaces. All these methods used as a single or a combined approach provide useful information about GPCR oligomerization and its role in GPCR function and dynamics. Altogether, we present experimental, computational and machine-learning methods used to study oligomers interfaces, as well as strategies that have been used to target these dynamic complexes.
Collapse
Affiliation(s)
- Carlos A V Barreto
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
| | - Salete J Baptista
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
| | - António José Preto
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
| | - Pedro Matos-Filipe
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
| | - Joana Mourão
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Institute for Interdisciplinary Research, University of Coimbra, Coimbra, Portugal
| | - Rita Melo
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
| | - Irina Moreira
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Science and Technology Faculty, University of Coimbra, Coimbra, Portugal.
| |
Collapse
|