1. Pir MS, Timucin E. AFFIPred: AlphaFold2 structure-based Functional Impact Prediction of missense variations. Protein Sci 2025; 34:e70030. [PMID: 39840793 PMCID: PMC11751861 DOI: 10.1002/pro.70030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 12/23/2024] [Accepted: 12/24/2024] [Indexed: 01/23/2025]
Abstract
Protein structure holds immense potential for pathogenicity prediction, although structure-based predictors remain limited compared with their sequence-based counterparts because of the "structure knowledge gap" between the large number of available protein sequences and the relatively limited number of structures. Leveraging the highly accurate protein structures predicted by AlphaFold2 (AF2), we introduce AFFIPred, an ensemble machine learning classifier that combines sequence and AF2-based structural characteristics to predict missense variant pathogenicity. In assessments on unseen datasets, AFFIPred performed comparably to state-of-the-art predictors such as AlphaMissense. We also showed that using AF2 structures, which are full-length and represent the unbound state, yields more precise solvent-accessible surface area (SASA) calculations than using experimental structures. In line with the completeness of the AF2 structures, their use provides a more comprehensive view of the structural characteristics of missense variation datasets by capturing all variants. AFFIPred maintains high accuracy without the limitations of PDB-based classifiers. AFFIPred predictions for over 210 million variations of the human proteome are accessible at https://affipred.timucinlab.com/.
Affiliation(s)
- Mustafa S Pir
  - Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acibadem University, Atasehir, Istanbul, Turkey
- Emel Timucin
  - Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acibadem University, Atasehir, Istanbul, Turkey
  - Department of Biostatistics and Medical Informatics, School of Medicine, Acibadem University, Atasehir, Istanbul, Turkey
2. Harris CT, Cohen S. Reducing Immunogenicity by Design: Approaches to Minimize Immunogenicity of Monoclonal Antibodies. BioDrugs 2024; 38:205-226. [PMID: 38261155 PMCID: PMC10912315 DOI: 10.1007/s40259-023-00641-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Accepted: 12/13/2023] [Indexed: 01/24/2024]
Abstract
Monoclonal antibodies (mAbs) have transformed therapeutic strategies for various diseases. Their high specificity for target antigens makes them ideal therapeutic agents for certain diseases. However, a challenge to their clinical application is their potential to induce an unwanted immune response, termed immunogenicity. This challenge drives continued efforts to deimmunize these protein therapeutics while maintaining their pharmacokinetic properties and therapeutic efficacy. Because mAbs hold a central position in therapeutic strategies against an array of diseases, the importance of conducting comprehensive immunogenicity risk assessment during the drug development process cannot be overstated. Such assessment requires in silico, in vitro, and in vivo strategies to evaluate the immunogenicity risk of mAbs. Understanding the mechanisms that drive mAb immunogenicity is crucial to improving their therapeutic efficacy and safety and to developing the most effective strategies to determine and mitigate their immunogenic risk. This review highlights recent advances in immunogenicity prediction strategies, with a focus on protein engineering strategies used throughout development to reduce immunogenicity.
Affiliation(s)
- Chantal T Harris
  - Department of BioAnalytical Sciences, Genentech Inc., South San Francisco, CA, 94080-4990, USA
- Sivan Cohen
  - Department of BioAnalytical Sciences, Genentech Inc., South San Francisco, CA, 94080-4990, USA
3. Ruiz-De-La-Cruz G, Sifuentes-Rincón AM, Paredes-Sánchez FA, Parra-Bracamonte GM, Casas E, Riley DG, Perry GA, Welsh TH, Randel RD. Analysis of nonsynonymous SNPs in candidate genes that influence bovine temperament and evaluation of their effect in Brahman cattle. Mol Biol Rep 2024; 51:285. [PMID: 38324050 PMCID: PMC10850011 DOI: 10.1007/s11033-024-09264-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 01/17/2024] [Indexed: 02/08/2024]
Abstract
BACKGROUND: Temperament is an important production trait in cattle, and multiple strategies have been developed to generate molecular markers to assist animal selection. As nonsynonymous single nucleotide polymorphisms (nsSNPs) are markers with the potential to affect gene functions, they could be useful to predict phenotypic effects. Genetic selection of less stress-responsive, temperamental animals is desirable from an economic and welfare point of view.
METHODS AND RESULTS: Two nsSNPs identified in the HTR1B and SLC18A2 candidate genes for temperament were analyzed in silico to determine their effects on protein structure. Those nsSNPs allowing changes in proteins were selected for a temperament association analysis in a Brahman population. Transversion effects on protein structure were evaluated in silico for each amino acid change model, revealing structural changes in the proteins of the HTR1B and SLC18A2 genes. The selected nsSNPs were genotyped in a Brahman population (n = 138), and their genotypic effects on three temperament traits were analyzed: exit velocity, pen score, and temperament score. Only the SNP rs209984404-HTR1B (C/A) showed a significant association (P = 0.0144) with pen score. The heterozygous genotype showed a pen score value 1.17 points lower than that of the homozygous CC genotype.
CONCLUSION: The results showed that in silico analysis could direct the selection of nsSNPs with the potential to change the protein. nsSNPs causing structural changes and reduced protein stability were identified. Only rs209984404-HTR1B shows that the allele affecting protein stability was associated with the genotype linked to docility in cattle.
Affiliation(s)
- Gilberto Ruiz-De-La-Cruz
  - Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, Tamaulipas, 88710, México
- Ana María Sifuentes-Rincón
  - Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, Tamaulipas, 88710, México
- Gaspar Manuel Parra-Bracamonte
  - Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, Tamaulipas, 88710, México
- Eduardo Casas
  - National Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Ames, IA, 50010, USA
- David G Riley
  - Department of Animal Science, Texas A&M University, College Station, TX, 77843, USA
- Thomas H Welsh
  - Department of Animal Science, Texas A&M University, College Station, TX, 77843, USA
4. Sellés Vidal L, Isalan M, Heap JT, Ledesma-Amaro R. A primer to directed evolution: current methodologies and future directions. RSC Chem Biol 2023; 4:271-291. [PMID: 37034405 PMCID: PMC10074555 DOI: 10.1039/d2cb00231k] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/21/2022] [Accepted: 01/18/2023] [Indexed: 01/30/2023] Open
Abstract
Directed evolution is one of the most powerful tools for protein engineering and functions by harnessing natural evolution, but on a shorter timescale. It enables the rapid selection of variants of biomolecules with properties that make them more suitable for specific applications. Since the first in vitro evolution experiments performed by Sol Spiegelman in 1967, a wide range of techniques have been developed to tackle the main two steps of directed evolution: genetic diversification (library generation), and isolation of the variants of interest. This review covers the main modern methodologies, discussing the advantages and drawbacks of each, and hence the considerations for designing directed evolution experiments. Furthermore, the most recent developments are discussed, showing how advances in the handling of ever larger library sizes are enabling new research questions to be tackled.
Affiliation(s)
- Lara Sellés Vidal
  - Imperial College Centre for Synthetic Biology, Imperial College London, London SW7 2AZ, UK
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
- Mark Isalan
  - Imperial College Centre for Synthetic Biology, Imperial College London, London SW7 2AZ, UK
  - Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
- John T Heap
  - Imperial College Centre for Synthetic Biology, Imperial College London, London SW7 2AZ, UK
  - Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
  - School of Life Sciences, The University of Nottingham, University Park, Nottingham NG7 2RD, UK
- Rodrigo Ledesma-Amaro
  - Imperial College Centre for Synthetic Biology, Imperial College London, London SW7 2AZ, UK
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
5. Rajapaksa S, Konagurthu AS, Lesk AM. Sequence and structure alignments in post-AlphaFold era. Curr Opin Struct Biol 2023; 79:102539. [PMID: 36753924 DOI: 10.1016/j.sbi.2023.102539] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/13/2022] [Accepted: 01/02/2023] [Indexed: 02/09/2023]
Abstract
Sequence alignment is fundamental for analyzing protein structure and function. For all but closely related proteins, alignments based on structures are more accurate than alignments based purely on amino-acid sequences. However, the disparity between the large amount of sequence data and the relative paucity of experimentally determined structures has precluded the general applicability of structure alignment. Based on the success of AlphaFold (and similar methods) in producing high-quality structure predictions, we suggest that when aligning homologous proteins that lack experimental structures, better results can be obtained by a structural alignment of predicted structures than by an alignment based only on amino-acid sequences. We present a quantitative evaluation, based on pairwise alignments of sequences and structures (both predicted and experimental), to support this hypothesis.
Affiliation(s)
- Sandun Rajapaksa
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, 3800, Victoria, Australia
- Arun S Konagurthu
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, 3800, Victoria, Australia
- Arthur M Lesk
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, 16802, Pennsylvania, USA
6. Sarkar M, Saha S. Modeling of SARS-CoV-2 Virus Proteins: Implications on Its Proteome. Methods Mol Biol 2023; 2627:265-299. [PMID: 36959453 DOI: 10.1007/978-1-0716-2974-1_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
COronaVIrus Disease 19 (COVID-19) is a severe acute respiratory syndrome (SARS) caused by a group of beta coronaviruses, SARS-CoV-2. The SARS-CoV-2 virus is similar to previous SARS- and MERS-causing strains and has infected nearly six hundred and fifty million people worldwide, while the death toll has passed the six million mark (as of December 2022). In this chapter, we look at how computational modeling of the viral proteins could help us understand the various processes in the viral life cycle inside the host, an understanding that might provide key insights for mitigating this and future threats. This understanding helps us identify key targets for drug discovery and vaccine development.
Affiliation(s)
- Manish Sarkar
  - Hochschule für Technik und Wirtschaft (HTW) Berlin, Berlin, Germany
  - MedInsights SAS, Paris, France
- Soham Saha
  - MedInsights, Veuilly la Poterie, France
  - MedInsights SAS, Paris, France
7. Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022; 21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open
Abstract
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods that make use of structures are scattered across the literature and cover a number of different applications and scopes: some try to address questions and tasks within a single protein family, while others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.
Affiliation(s)
- Janani Durairaj
  - Biozentrum, University of Basel, Basel, Switzerland
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Dick de Ridder
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Aalt D.J. van Dijk
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
8. Varadi M, Nair S, Sillitoe I, Tauriello G, Anyango S, Bienert S, Borges C, Deshpande M, Green T, Hassabis D, Hatos A, Hegedus T, Hekkelman ML, Joosten R, Jumper J, Laydon A, Molodenskiy D, Piovesan D, Salladini E, Salzberg SL, Sommer MJ, Steinegger M, Suhajda E, Svergun D, Tenorio-Ku L, Tosatto S, Tunyasuvunakool K, Waterhouse AM, Žídek A, Schwede T, Orengo C, Velankar S. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources. Gigascience 2022; 11:giac118. [PMID: 36448847 PMCID: PMC9709962 DOI: 10.1093/gigascience/giac118] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/11/2022] [Revised: 09/20/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
Affiliation(s)
- Mihaly Varadi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Sreenath Nair
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Ian Sillitoe
  - Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
- Gerardo Tauriello
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Stephen Anyango
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Stefan Bienert
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Clemente Borges
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
  - European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
- Mandar Deshpande
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Andras Hatos
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
  - Department of Oncology, Lausanne University Hospital, Lausanne 1015, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Swiss Cancer Center Leman, Lausanne 1005, Switzerland
- Tamas Hegedus
  - Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
- Robbie Joosten
  - Netherlands Cancer Institute, Amsterdam 1066 CX, The Netherlands
- Dmitry Molodenskiy
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
  - European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Edoardo Salladini
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Steven L Salzberg
  - Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
- Markus J Sommer
  - Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
- Martin Steinegger
  - School of Biology, Seoul National University, Seoul 82-2-880-6971, 6977, South Korea
- Erzsebet Suhajda
  - Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
- Dmitri Svergun
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
  - European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
- Luiggi Tenorio-Ku
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Silvio Tosatto
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Andrew Mark Waterhouse
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Christine Orengo
  - Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
- Sameer Velankar
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
9. Pak MA, Ivankov DN. Best templates outperform homology models in predicting the impact of mutations on protein stability. Bioinformatics 2022; 38:4312-4320. [PMID: 35894930 DOI: 10.1093/bioinformatics/btac515] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/14/2021] [Revised: 05/31/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Prediction of protein stability change upon mutation (ΔΔG) is crucial for facilitating protein engineering and understanding protein folding principles. Robust prediction of protein folding free energy change requires knowledge of the protein three-dimensional (3D) structure. If the protein 3D structure is not available, one can predict the structure from the protein sequence; however, the prospects of ΔΔG predictions for predicted protein structures are unknown. The accuracy of using 3D structures of the best templates for ΔΔG prediction is also unclear.
RESULTS: To investigate these questions, we used a representative set of seven diverse and accurate publicly available tools (FoldX, Eris, Rosetta, DDGun, ACDC-NN, ThermoNet and DynaMut) for stability change prediction, combined with AlphaFold or I-Tasser for protein 3D structure prediction. We found that best templates perform consistently better than (or similar to) homology models for all ΔΔG predictors. Our findings suggest using the best template structure for the prediction of protein stability change upon mutation if the protein 3D structure is not available.
AVAILABILITY AND IMPLEMENTATION: The data are available at https://github.com/ivankovlab/template-vs-model.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marina A Pak
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
- Dmitry N Ivankov
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
10. Uttarotai T, Mukjang N, Chaisoung N, Pathom-Aree W, Pekkoh J, Pumas C, Sattayawat P. Putative Protein Discovery from Microalgal Genomes as a Synthetic Biology Protein Library for Heavy Metal Bio-Removal. BIOLOGY 2022; 11:biology11081226. [PMID: 36009852 PMCID: PMC9405338 DOI: 10.3390/biology11081226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 08/06/2022] [Accepted: 08/12/2022] [Indexed: 11/22/2022]
Abstract
Simple Summary: Nowadays, heavy metal-polluted wastewater is one of the global challenges leading to an insufficient supply of clean water. Taking advantage of what nature has to offer, several organisms, including microalgae, can natively bioremediate these heavy metals. However, the effectiveness of such processes does not meet expectations, especially with the increasing amount of pollution in today’s world. Therefore, with the goal of creating effective strains, synthetic biology via bioengineering is widely used as a strategy to enhance the heavy metal bio-removing capability, either by directly engineering the native ability of organisms or by transferring the ability to a more suitable host. In order to do so, a list of genes or proteins involved in the processes is crucial for stepwise engineering. Yet, a large amount of information remains to be discovered. In this work, a comprehensive library of putative proteins from microalgae that are involved in heavy metal bio-removal was constructed. Moreover, the 3D structures of these proteins were also predicted using machine learning-based methods, to further aid the use of synthetic biology.
Abstract: Synthetic biology is a principle that aims to create new biological systems with particular functions or to redesign the existing ones through bioengineering. Therefore, this principle is often utilized as a tool to put the knowledge learned to practical use in actual fields. However, there is still a great deal of information remaining to be found, and this limits the possible utilization of synthetic biology, particularly on the topic that is the focus of the present work: heavy metal bio-removal. In this work, we aim to construct a comprehensive library of putative proteins that might support heavy metal bio-removal. Hypothetical proteins were discovered from Chlorella and Scenedesmus genomes and extensively annotated. The protein structures of these putative proteins were also modeled with AlphaFold2. Although a portion of this workflow has previously been demonstrated to annotate hypothetical proteins from whole genome sequences, the adaptation of such steps is yet to be done for library construction purposes. We also demonstrated further downstream steps that allow a more accurate function prediction of the hypothetical proteins by subjecting the generated models to structure-based annotation. In conclusion, a total of 72 newly discovered putative proteins were annotated, with ready-to-use predicted structures available for further investigation.
Affiliation(s)
- Toungporn Uttarotai
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Nilita Mukjang
  - Department of Entomology and Plant Pathology, Faculty of Agriculture, Chiang Mai University, Chiang Mai 50200, Thailand
- Natcha Chaisoung
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Wasu Pathom-Aree
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Jeeraporn Pekkoh
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Chayakorn Pumas
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Pachara Sattayawat
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
  - Research Center in Bioresources for Agriculture, Industry and Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
  - Research Center of Microbial Diversity and Sustainable Utilization, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
11. FRTpred: A novel approach for accurate prediction of protein folding rate and type. Comput Biol Med 2022; 149:105911. [DOI: 10.1016/j.compbiomed.2022.105911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 07/08/2022] [Accepted: 07/23/2022] [Indexed: 11/20/2022]
12. Mahtarin R, Islam S, Islam MJ, Ullah MO, Ali MA, Halim MA. Structure and dynamics of membrane protein in SARS-CoV-2. J Biomol Struct Dyn 2022; 40:4725-4738. [PMID: 33353499 PMCID: PMC7784837 DOI: 10.1080/07391102.2020.1861983] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 09/22/2020] [Accepted: 12/05/2020] [Indexed: 12/15/2022]
Abstract
SARS-CoV-2 membrane (M) protein performs a variety of critical functions in the virus infection cycle. However, expressing and purifying the membrane protein for structure determination is difficult despite tremendous progress. In this study, the 3D structure is modeled, followed by intensive validation and molecular dynamics simulation. The lack of suitable homologous templates (>30% sequence identity) led us to construct the membrane protein models using a template-free modeling (de novo or ab initio) approach with the Robetta and trRosetta servers. Compared with other model structures, trRosetta (TM-score: 0.64; TM region RMSD: 2 Å) provides a better model than Robetta (TM-score: 0.61; TM region RMSD: 3.3 Å) and I-TASSER (TM-score: 0.45; TM region RMSD: 6.5 Å). 100 ns molecular dynamics simulations are performed on the model structures in a membrane environment. Moreover, secondary structure element analysis and principal component analysis (PCA) have also been performed on the MD simulation data. Finally, the trRosetta model is used for interpretation and visualization of interacting residues during protein-protein interactions. The common interacting residues, including Phe103, Arg107, Met109, Trp110, Arg131, and Glu135 in the C-terminal domain of M protein, are identified in membrane-spike and membrane-nucleocapsid protein complexes. The active site residues are also predicted for potential drug and peptide binding. Overall, this study might help in designing drugs and peptides against the modeled membrane protein of SARS-CoV-2 to accelerate further investigation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Rumana Mahtarin
  - Division of Infectious Diseases and Division of Computer Aided Drug Design, The Red-Green Research Centre, BICCB, Tejgaon, Dhaka, Bangladesh
- Shafiqul Islam
  - Division of Infectious Diseases and Division of Computer Aided Drug Design, The Red-Green Research Centre, BICCB, Tejgaon, Dhaka, Bangladesh
- Md. Jahirul Islam
  - Division of Infectious Diseases and Division of Computer Aided Drug Design, The Red-Green Research Centre, BICCB, Tejgaon, Dhaka, Bangladesh
- M Obayed Ullah
  - Division of Infectious Diseases and Division of Computer Aided Drug Design, The Red-Green Research Centre, BICCB, Tejgaon, Dhaka, Bangladesh
- Md Ackas Ali
  - Division of Infectious Diseases and Division of Computer Aided Drug Design, The Red-Green Research Centre, BICCB, Tejgaon, Dhaka, Bangladesh
- Mohammad A. Halim
  - Division of Infectious Diseases and Division of Computer Aided Drug Design, The Red-Green Research Centre, BICCB, Tejgaon, Dhaka, Bangladesh
  - Department of Physical Sciences, University of Arkansas - Fort Smith, Fort Smith, AR, USA
13. Guo HB, Perminov A, Bekele S, Kedziora G, Farajollahi S, Varaljay V, Hinkle K, Molinero V, Meister K, Hung C, Dennis P, Kelley-Loughnane N, Berry R. AlphaFold2 models indicate that protein sequence determines both structure and dynamics. Sci Rep 2022; 12:10696. [PMID: 35739160 PMCID: PMC9226352 DOI: 10.1038/s41598-022-14382-9] [Citation(s) in RCA: 80] [Impact Index Per Article: 26.7] [Received: 02/28/2022] [Accepted: 06/06/2022] [Indexed: 12/29/2022] Open
Abstract
AlphaFold 2 (AF2) has placed Molecular Biology in a new era where we can visualize, analyze and interpret the structures and functions of all proteins solely from their primary sequences. We performed AF2 structure predictions for various protein systems, including globular proteins, a multi-domain protein, an intrinsically disordered protein (IDP), a randomized protein, two larger proteins (> 1000 AA), a heterodimer and a homodimer protein complex. Our results show that along with the three dimensional (3D) structures, AF2 also decodes protein sequences into residue flexibilities via both the predicted local distance difference test (pLDDT) scores of the models, and the predicted aligned error (PAE) maps. We show that PAE maps from AF2 are correlated with the distance variation (DV) matrices from molecular dynamics (MD) simulations, which reveals that the PAE maps can predict the dynamical nature of protein residues. Here, we introduce the AF2-scores, which are simply derived from pLDDT scores and are in the range of [0, 1]. We found that for most protein models, including large proteins and protein complexes, the AF2-scores are highly correlated with the root mean square fluctuations (RMSF) calculated from MD simulations. However, for an IDP and a randomized protein, the AF2-scores do not correlate with the RMSF from MD, especially for the IDP. Our results indicate that the protein structures predicted by AF2 also convey information of the residue flexibility, i.e., protein dynamics.
Affiliation(s)
- Hao-Bo Guo
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- UES Inc., Dayton, OH, USA
| | - Alexander Perminov
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Computer Science Department, Miami University, Oxford, OH, USA
| | - Selemon Bekele
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- UES Inc., Dayton, OH, USA
| | - Gary Kedziora
- General Dynamics Information Technology, Inc., Wright-Patterson Air Force Base, 45433, OH, USA
| | - Sanaz Farajollahi
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- UES Inc., Dayton, OH, USA
| | - Vanessa Varaljay
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
| | - Kevin Hinkle
- Department of Chemical and Materials Engineering, Dayton University, Dayton, OH, USA
| | - Valeria Molinero
- Department of Chemistry, The University of Utah, Salt Lake City, UT, USA
| | - Konrad Meister
- Department of Natural Sciences, University of Alaska Southeast, Juneau, AK, USA
- Max Planck Institute for Polymer Research, Mainz, Germany
| | - Chia Hung
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
| | - Patrick Dennis
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
| | - Nancy Kelley-Loughnane
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
| | - Rajiv Berry
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
|
14
|
Araujo-Arcos LE, Montaño S, Bello-Rios C, Garibay-Cerdenares OL, Leyva-Vázquez MA, Illades-Aguiar B. Molecular insights into the interaction of HPV-16 E6 variants against MAGI-1 PDZ1 domain. Sci Rep 2022; 12:1898. [PMID: 35115618 PMCID: PMC8814009 DOI: 10.1038/s41598-022-05995-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/23/2021] [Accepted: 01/19/2022] [Indexed: 11/21/2022] Open
Abstract
Oncogenic protein E6 from Human Papilloma Virus 16 (HPV-16) mediates the degradation of Membrane-associated guanylate kinase with inverted domain structure-1 (MAGI-1) through the interaction of its protein binding motif (PBM) with the Discs-large homologous regions 1 (PDZ1) domain of MAGI-1. Genetic variation in the E6 gene that translates to changes in the protein’s amino acid sequence modifies the interaction of E6 with the cellular protein MAGI-1. MAGI-1 is a scaffolding protein found at tight junctions of epithelial cells, where it interacts with a variety of proteins regulating signaling pathways. MAGI-1 is a multidomain protein containing two WW (rsp-domain-9), one guanylate kinase-like, and six PDZ domains. PDZ domains play an important role in the function of MAGI-1 and serve as targets for several viral proteins, including the HPV-16 E6. The aim of this work was to evaluate, with an in silico approach employing molecular dynamics simulation and protein–protein docking, the interaction of the intragenic variants E-G350 (L83V), E-C188/G350 (E29Q/L83V), E-A176/G350 (D25N/L83V), E6-AAa (Q14H/H78Y/L83V), E6-AAc (Q14H/I27R/H78Y/L83V) and E6-reference of HPV-16 with MAGI-1. We found that variants E-G350, E-C188/G350, E-A176/G350, AAa and AAc show increased affinity for our two models of MAGI-1 compared to E6-reference.
Affiliation(s)
- Lilian Esmeralda Araujo-Arcos
- Laboratorio de Biomedicina Molecular, Facultad de Ciencias Químico-Biológicas, Universidad Autónoma de Guerrero, 39090, Chilpancingo, CP, México
| | - Sarita Montaño
- Laboratorio de Bioinformática y Simulación Molecular, Facultad de Ciencias Químico Biológicas, Universidad Autónoma de Sinaloa, 80030, Culiacán Sinaloa, CP, México
| | - Ciresthel Bello-Rios
- Laboratorio de Biomedicina Molecular, Facultad de Ciencias Químico-Biológicas, Universidad Autónoma de Guerrero, 39090, Chilpancingo, CP, México
| | - Olga Lilia Garibay-Cerdenares
- Laboratorio de Biomedicina Molecular, Facultad de Ciencias Químico-Biológicas, Universidad Autónoma de Guerrero, 39090, Chilpancingo, CP, México; CONACyT-Universidad Autónoma de Guerrero, 39087, Chilpancingo, CP, México
| | - Marco Antonio Leyva-Vázquez
- Laboratorio de Biomedicina Molecular, Facultad de Ciencias Químico-Biológicas, Universidad Autónoma de Guerrero, 39090, Chilpancingo, CP, México
| | - Berenice Illades-Aguiar
- Laboratorio de Biomedicina Molecular, Facultad de Ciencias Químico-Biológicas, Universidad Autónoma de Guerrero, 39090, Chilpancingo, CP, México
|
15
|
Kaushik R, Zhang KYJ. ProFitFun: a protein tertiary structure fitness function for quantifying the accuracies of model structures. Bioinformatics 2022; 38:369-376. [PMID: 34542606 DOI: 10.1093/bioinformatics/btab666] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/23/2021] [Revised: 09/06/2021] [Accepted: 09/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION An accurate estimation of the quality of protein model structures is a cornerstone of protein structure prediction regimes. Despite the recent groundbreaking success in the field of protein structure prediction, there are certain prospects for the improvement in model quality estimation at multiple stages of protein structure prediction and thus, to further push the prediction accuracy. Here, a novel approach, named ProFitFun, for assessing the quality of protein models is proposed by harnessing the sequence and structural features of experimental protein structures in terms of the preferences of backbone dihedral angles and relative surface accessibility of their amino acid residues at the tripeptide level. The proposed approach leverages the backbone dihedral angle and surface accessibility preferences of each residue by accounting for its N-terminal and C-terminal neighbors in the protein structure. These preferences are used to evaluate protein structures through a machine learning approach and tested on an extensive dataset of diverse proteins. RESULTS The approach was extensively validated on a large test dataset (n = 25 005) of protein structures, comprising 23 661 models of 82 non-homologous proteins and 1344 non-homologous experimental structures. In addition, an external dataset of 40 000 models of 200 non-homologous proteins was also used for the validation of the proposed method. Both datasets were further used for benchmarking the proposed method with four different state-of-the-art methods for protein structure quality assessment. In the benchmarking, the proposed method outperformed some state-of-the-art methods in terms of Spearman's and Pearson's correlation coefficients, average GDT-TS loss, sum of z-scores and average absolute difference of predictions over corresponding observed values. The high accuracy of the proposed approach promises a potential use of the sequence and structural features in computational protein design. AVAILABILITY AND IMPLEMENTATION http://github.com/KYZ-LSB/ProTerS-FitFun. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rahul Kaushik
- Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, Yokohama, Kanagawa 230-0045, Japan
| | - Kam Y J Zhang
- Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, Yokohama, Kanagawa 230-0045, Japan
|
16
|
Mabonga L, Masamba P, Kappo AP. Inhibitory potential of a benzoxazole derivative, 4FI against SNRPG∼RING finger domain protein complex as a lead compound in the discovery of anti-cancer drugs: A molecular dynamics simulation approach. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022] Open
|
17
|
Localization of Energetic Frustration in Proteins. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2021; 2376:387-398. [PMID: 34845622 DOI: 10.1007/978-1-0716-1716-8_22] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 10/19/2022]
Abstract
We present a detailed heuristic method to quantify the degree of local energetic frustration manifested by protein molecules. Current applications are realized in computational experiments where a protein structure is visualized highlighting the energetic conflicts or the concordance of the local interactions in that structure. Minimally frustrated linkages highlight the stable folding core of the molecule. Sites of high local frustration, in contrast, often indicate functionally relevant regions such as binding, active, or allosteric sites.
|
18
|
Sarma H, Upadhyaya M, Gogoi B, Phukan M, Kashyap P, Das B, Devi R, Sharma HK. Cardiovascular Drugs: an Insight of In Silico Drug Design Tools. J Pharm Innov 2021. [DOI: 10.1007/s12247-021-09587-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
19
|
Petrosyan R, Narayan A, Woodside MT. Single-Molecule Force Spectroscopy of Protein Folding. J Mol Biol 2021; 433:167207. [PMID: 34418422 DOI: 10.1016/j.jmb.2021.167207] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 06/29/2021] [Revised: 08/11/2021] [Accepted: 08/11/2021] [Indexed: 10/20/2022]
Abstract
The use of force probes to induce unfolding and refolding of single molecules through the application of mechanical tension, known as single-molecule force spectroscopy (SMFS), has proven to be a powerful tool for studying the dynamics of protein folding. Here we provide an overview of what has been learned about protein folding using SMFS, from small, single-domain proteins to large, multi-domain proteins. We highlight the ability of SMFS to measure the energy landscapes underlying folding, to map complex pathways for native and non-native folding, to probe the mechanisms of chaperones that assist with native folding, to elucidate the effects of the ribosome on co-translational folding, and to monitor the folding of membrane proteins.
Affiliation(s)
- Rafayel Petrosyan
- Department of Physics, University of Alberta, Edmonton, AB T6G 2E1, Canada
| | - Abhishek Narayan
- Department of Physics, University of Alberta, Edmonton, AB T6G 2E1, Canada
| | - Michael T Woodside
- Department of Physics, University of Alberta, Edmonton, AB T6G 2E1, Canada
|
20
|
Hou Q, Stringer B, Waury K, Capel H, Haydarlou R, Xue F, Abeln S, Heringa J, Feenstra KA. SeRenDIP-CE: Sequence-based Interface Prediction for Conformational Epitopes. Bioinformatics 2021; 37:3421-3427. [PMID: 33974039 PMCID: PMC8136078 DOI: 10.1093/bioinformatics/btab321] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/19/2020] [Revised: 03/26/2021] [Accepted: 04/26/2021] [Indexed: 11/21/2022] Open
Abstract
Motivation Antibodies play an important role in clinical research and biotechnology, with their specificity determined by the interaction with the antigen’s epitope region, as a special type of protein–protein interaction (PPI) interface. The ubiquitous availability of sequence data allows us to predict epitopes from sequence in order to focus time-consuming wet-lab experiments toward the most promising epitope regions. Here, we extend our previously developed sequence-based predictors for homodimer and heterodimer PPI interfaces to predict epitope residues that have the potential to bind an antibody. Results We collected and curated a high-quality epitope dataset from the SAbDab database. Our generic PPI heterodimer predictor obtained an AUC-ROC of 0.666 when evaluated on the epitope test set. We then trained a random forest model specifically on the epitope dataset, reaching AUC 0.694. Further training on the combined heterodimer and epitope datasets improves our final predictor to AUC 0.703 on the epitope test set. This is better than the best state-of-the-art sequence-based epitope predictor BepiPred-2.0. On one solved antibody–antigen structure of the COVID19 virus spike receptor binding domain, our predictor reaches AUC 0.778. We added the SeRenDIP-CE Conformational Epitope predictors to our webserver, which is simple to use and only requires a single antigen sequence as input, which will help make the method immediately applicable in a wide range of biomedical and biomolecular research. Availability and implementation Webserver, source code and datasets at www.ibi.vu.nl/programs/serendipwww/. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China; National institute of health data science of China, Shandong University, Shandong 250002, P. R. China
| | - Bas Stringer
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Katharina Waury
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Henriette Capel
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Reza Haydarlou
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Fuzhong Xue
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China; National institute of health data science of China, Shandong University, Shandong 250002, P. R. China
| | - Sanne Abeln
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Jaap Heringa
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands; AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam
| | - K Anton Feenstra
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands; AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam
|
21
|
Nanni L, Brahnam S. Robust ensemble of handcrafted and learned approaches for DNA-binding proteins. APPLIED COMPUTING AND INFORMATICS 2021. [DOI: 10.1108/aci-03-2021-0051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Purpose
Automatic DNA-binding protein (DNA-BP) classification is now an essential proteomic technology. Unfortunately, many systems reported in the literature are tested on only one or two datasets/tasks. The purpose of this study is to create the most optimal and universal system for DNA-BP classification, one that performs competitively across several DNA-BP classification tasks.
Design/methodology/approach
Efficient DNA-BP classifier systems require the discovery of powerful protein representations and feature extraction methods. Experiments were performed that combined and compared descriptors extracted from state-of-the-art matrix/image protein representations. These descriptors were trained on separate support vector machines (SVMs) and evaluated. Convolutional neural networks with different parameter settings were fine-tuned on two matrix representations of proteins. Decisions were fused with the SVMs using the weighted sum rule and evaluated to experimentally derive the most powerful general-purpose DNA-BP classifier system.
Findings
The best ensemble proposed here produced comparable, if not superior, classification results on a broad and fair comparison with the literature across four different datasets representing a variety of DNA-BP classification tasks, thereby demonstrating both the power and generalizability of the proposed system.
Originality/value
Most DNA-BP methods proposed in the literature are only validated on one (rarely two) datasets/tasks. In this work, the authors report the performance of their general-purpose DNA-BP system on four datasets representing different DNA-BP classification tasks. The excellent results of the proposed best classifier system demonstrate the power of the proposed approach. These results can now be used for baseline comparisons by other researchers in the field.
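The weighted sum rule mentioned under Design/methodology/approach can be sketched in a few lines of Python. This is a minimal illustration of the general fusion technique, not the authors' implementation: the function name `weighted_sum_fusion`, the weights, the score values and the 0.5 decision threshold are all assumptions, and the sketch presumes each classifier's scores have already been normalized to a comparable [0, 1] scale.

```python
def weighted_sum_fusion(score_lists, weights):
    """Fuse per-sample scores from several classifiers by a weighted sum.

    score_lists: one list of scores per classifier, all of equal length.
    weights: one non-negative weight per classifier.
    The result is divided by the total weight so that fused scores stay
    on the same scale as the inputs (a design choice of this sketch).
    """
    if not score_lists or len(score_lists) != len(weights):
        raise ValueError("need one weight per classifier")
    n_samples = len(score_lists[0])
    if any(len(scores) != n_samples for scores in score_lists):
        raise ValueError("all classifiers must score the same samples")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_lists)) / total_weight
        for i in range(n_samples)
    ]


# Hypothetical normalized scores for three proteins from two classifiers.
svm_scores = [0.9, 0.2, 0.6]
cnn_scores = [0.7, 0.4, 0.2]

# The CNN is weighted three times as heavily as the SVM (arbitrary choice).
fused = weighted_sum_fusion([svm_scores, cnn_scores], [1.0, 3.0])
labels = [score >= 0.5 for score in fused]  # assumed decision threshold
print(fused, labels)
```

In practice the weights would typically be tuned on a validation set rather than fixed by hand.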
|
22
|
Chung SS, Ng JCF, Laddach A, Thomas NSB, Fraternali F. Short loop functional commonality identified in leukaemia proteome highlights crucial protein sub-networks. NAR Genom Bioinform 2021; 3:lqab010. [PMID: 33709075 PMCID: PMC7936661 DOI: 10.1093/nargab/lqab010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2020] [Revised: 12/19/2020] [Accepted: 01/26/2021] [Indexed: 11/13/2022] Open
Abstract
Direct drug targeting of mutated proteins in cancer is not always possible and efficacy can be nullified by compensating protein-protein interactions (PPIs). Here, we establish an in silico pipeline to identify specific PPI sub-networks containing mutated proteins as potential targets, which we apply to mutation data of four different leukaemias. Our method is based on extracting cyclic interactions of a small number of proteins topologically and functionally linked in the Protein-Protein Interaction Network (PPIN), which we call short loop network motifs (SLM). We uncover a new property of PPINs named 'short loop commonality' to measure indirect PPIs occurring via common SLM interactions. This detects 'modules' of PPI networks enriched with annotated biological functions of proteins containing mutation hotspots, exemplified by FLT3 and other receptor tyrosine kinase proteins. We further identify functional dependency or mutual exclusivity of short loop commonality pairs in large-scale cellular CRISPR-Cas9 knockout screening data. Our pipeline provides a new strategy for identifying new therapeutic targets for drug discovery.
Affiliation(s)
- Sun Sook Chung
- Department of Haematological Medicine, King's College London, London, SE5 9NU, UK
| | - Joseph C F Ng
- Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
| | - Anna Laddach
- Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
| | - N Shaun B Thomas
- Department of Haematological Medicine, King's College London, London, SE5 9NU, UK
| | - Franca Fraternali
- Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
|
23
|
Huang TC, Fischer WB. Sequence–function correlation of the transmembrane domains in NS4B of HCV using a computational approach. AIMS BIOPHYSICS 2021. [DOI: 10.3934/biophy.2021013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
24
|
Santana CA, Silveira SDA, Moraes JPA, Izidoro SC, de Melo-Minardi RC, Ribeiro AJM, Tyzack JD, Borkakoti N, Thornton JM. GRaSP: a graph-based residue neighborhood strategy to predict binding sites. Bioinformatics 2020; 36:i726-i734. [DOI: 10.1093/bioinformatics/btaa805] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Accepted: 09/08/2020] [Indexed: 01/22/2023] Open
Abstract
Motivation
The discovery of protein–ligand-binding sites is a major step for elucidating protein function and for investigating new functional roles. Detecting protein–ligand-binding sites experimentally is time-consuming and expensive. Thus, a variety of in silico methods to detect and predict binding sites have been proposed, as they can be scalable, fast and low-cost.
Results
We proposed Graph-based Residue neighborhood Strategy to Predict binding sites (GRaSP), a novel residue-centric and scalable method to predict ligand-binding site residues. It is based on a supervised learning strategy that models the residue environment as a graph at the atomic level. Results show that GRaSP made comparable or superior predictions when compared with methods described in the literature. GRaSP outperformed six other residue-centric methods, including the one considered as state-of-the-art. Also, our method achieved better results than the method from the CAMEO independent assessment. GRaSP ranked second when compared with five state-of-the-art pocket-centric methods, which we consider a significant result, as it was not devised to predict pockets. Finally, our method proved scalable as it took 10–20 s on average to predict the binding site for a protein complex whereas the state-of-the-art residue-centric method takes 2–5 h on average.
Availability and implementation
The source code and datasets are available at https://github.com/charles-abreu/GRaSP.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Charles A Santana
- Department of Biochemistry and Immunology
- Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
| | - Sabrina de A Silveira
- Department of Computer Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Institute of Technological Sciences (ICT), Advanced Campus at Itabira, Universidade Federal de Itajubá, Itabira 35903-087, Brazil
| | - João P A Moraes
- Institute of Technological Sciences (ICT), Advanced Campus at Itabira, Universidade Federal de Itajubá, Itabira 35903-087, Brazil
| | - Sandro C Izidoro
- Institute of Technological Sciences (ICT), Advanced Campus at Itabira, Universidade Federal de Itajubá, Itabira 35903-087, Brazil
| | - Raquel C de Melo-Minardi
- Department of Biochemistry and Immunology
- Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
| | - António J M Ribeiro
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jonathan D Tyzack
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Neera Borkakoti
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Janet M Thornton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
25
|
Hameduh T, Haddad Y, Adam V, Heger Z. Homology modeling in the time of collective and artificial intelligence. Comput Struct Biotechnol J 2020; 18:3494-3506. [PMID: 33304450 PMCID: PMC7695898 DOI: 10.1016/j.csbj.2020.11.007] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 08/14/2020] [Revised: 11/04/2020] [Accepted: 11/04/2020] [Indexed: 12/12/2022] Open
Abstract
Homology modeling is a method for building protein 3D structures using protein primary sequence and utilizing prior knowledge gained from structural similarities with other proteins. The homology modeling process is done in sequential steps where sequence/structure alignment is optimized, then a backbone is built and later, side-chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community is sharing computational resources, whereas RosettaCommons is an example of an initiative where a community is sharing a codebase for the development of computational algorithms. Foldit is another initiative where participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning of contact maps was introduced to the 3D structure prediction process, adding more information and increasing the accuracy of models significantly. In this review, we will take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.
Affiliation(s)
- Tareq Hameduh
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
| | - Yazan Haddad
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
| | - Vojtech Adam
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
| | - Zbynek Heger
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
|
26
|
Studer G, Rempfer C, Waterhouse AM, Gumienny R, Haas J, Schwede T. QMEANDisCo-distance constraints applied on model quality estimation. Bioinformatics 2020; 36:1765-1771. [PMID: 31697312 PMCID: PMC7075525 DOI: 10.1093/bioinformatics/btz828] [Citation(s) in RCA: 526] [Impact Index Per Article: 105.2] [Received: 06/19/2019] [Revised: 10/24/2019] [Accepted: 11/06/2019] [Indexed: 01/13/2023] Open
Abstract
Motivation Methods that estimate the quality of a 3D protein structure model in absence of an experimental reference structure are crucial to determine a model’s utility and potential applications. Single model methods assess individual models whereas consensus methods require an ensemble of models as input. In this work, we extend the single model composite score QMEAN that employs statistical potentials of mean force and agreement terms by introducing a consensus-based distance constraint (DisCo) score. Results DisCo exploits distance distributions from experimentally determined protein structures that are homologous to the model being assessed. Feed-forward neural networks are trained to adaptively weigh contributions by the multi-template DisCo score and classical single model QMEAN parameters. The result is the composite score QMEANDisCo, which combines the accuracy of consensus methods with the broad applicability of single model approaches. We also demonstrate that, despite being the de-facto standard for structure prediction benchmarking, CASP models are not the ideal data source to train predictive methods for model quality estimation. For performance assessment, QMEANDisCo is continuously benchmarked within the CAMEO project and participated in CASP13. For both, it ranks among the top performers and excels with low response times. Availability and implementation QMEANDisCo is available as web-server at https://swissmodel.expasy.org/qmean. The source code can be downloaded from https://git.scicore.unibas.ch/schwede/QMEAN. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gabriel Studer
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Christine Rempfer
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Andrew M Waterhouse
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Rafal Gumienny
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Juergen Haas
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
|
27
|
Hou Q, De Geest PFG, Griffioen CJ, Abeln S, Heringa J, Feenstra KA. SeRenDIP: SEquential REmasteriNg to DerIve profiles for fast and accurate predictions of PPI interface positions. Bioinformatics 2020; 35:4794-4796. [PMID: 31116381 DOI: 10.1093/bioinformatics/btz428] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/09/2019] [Revised: 05/12/2019] [Accepted: 05/17/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Interpretation of ubiquitous protein sequence data has become a bottleneck in biomolecular research, due to a lack of structural and other experimental annotation data for these proteins. Prediction of protein interaction sites from sequence may be a viable substitute. We therefore recently developed a sequence-based random forest method for protein-protein interface prediction, which yielded significantly better performance than other methods on both homomeric and heteromeric protein-protein interactions. Here, we present a webserver that implements this method efficiently. RESULTS With the aim of accelerating our previous approach, we obtained sequence conservation profiles by re-mastering the alignment of homologous sequences found by PSI-BLAST. This yielded a more than 10-fold speedup and at least the same accuracy as reported previously for our method; these results allowed us to offer the method as a webserver. The web-server interface is targeted to the non-expert user. The input is simply a sequence of the protein of interest, and the output is a table with scores indicating the likelihood of having an interaction interface at a certain position. As the method is sequence-based and not sensitive to the type of protein interaction, we expect this webserver to be of interest to many biological researchers in academia and in industry. AVAILABILITY AND IMPLEMENTATION Webserver, source code and datasets are available at www.ibi.vu.nl/programs/serendipwww/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qingzhen Hou
- Department of BioModeling, BioInformatics & BioProcesses, Université Libre de Bruxelles, Brussels 1050, Belgium
| | - Paul F G De Geest
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Christian J Griffioen
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Sanne Abeln
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - Jaap Heringa
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands.,AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| | - K Anton Feenstra
- IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands.,AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
| |
|
28
|
High-yield production of L-serine through a novel identified exporter combined with synthetic pathway in Corynebacterium glutamicum. Microb Cell Fact 2020; 19:115. [PMID: 32471433 PMCID: PMC7260847 DOI: 10.1186/s12934-020-01374-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 03/09/2020] [Accepted: 05/25/2020] [Indexed: 01/08/2023] Open
Abstract
Background l-Serine has wide and increasing applications in industries with fast-growing market demand. Although strategies for achieving and improving l-serine production in Corynebacterium glutamicum (C. glutamicum) have focused on inhibiting its degradation and enhancing its biosynthetic pathway, l-serine yield has remained relatively low. Exporters play an essential role in the fermentative production of amino acids. To achieve higher l-serine yield, l-serine export from the cell should be improved. In C. glutamicum, ThrE, which can export l-threonine and l-serine, is the only identified l-serine exporter so far. Results In this study, a novel l-serine exporter NCgl0580 was identified and characterized in C. glutamicum ΔSSAAI (SSAAI), and named as SerE (encoded by serE). Deletion of serE in SSAAI led to a 56.5% decrease in l-serine titer, whereas overexpression of serE compensated for the lack of serE with respect to l-serine titer. A fusion protein with SerE and enhanced green fluorescent protein (EGFP) was constructed to confirm that SerE localized at the plasma membrane. The function of SerE was studied by peptide feeding approaches, and the results showed that SerE is a novel exporter for l-serine and l-threonine in C. glutamicum. Subsequently, the interaction of a known l-serine exporter ThrE and SerE was studied, and the results suggested that SerE is more important than ThrE in l-serine export in SSAAI. In addition, probe plasmid and electrophoretic mobility shift assays (EMSA) revealed NCgl0581 as the transcriptional regulator of SerE. Comparative transcriptomics between SSAAI and the NCgl0581 deletion strain showed that NCgl0581 is a positive regulator of NCgl0580. Finally, by overexpressing the novel exporter SerE, combined with l-serine synthetic pathway key enzyme serAΔ197, serC, and serB, the resulting strain presented an l-serine titer of 43.9 g/L with a yield of 0.44 g/g sucrose, which is the highest l-serine titer and yield reported so far in C. glutamicum. Conclusions This study provides a novel target for l-serine and l-threonine export engineering as well as a new global transcriptional regulator NCgl0581 in C. glutamicum.
|
29
|
McGuffin LJ, Adiyaman R, Maghrabi AHA, Shuid AN, Brackenridge DA, Nealon JO, Philomina LS. IntFOLD: an integrated web resource for high performance protein structure and function prediction. Nucleic Acids Res 2020; 47:W408-W413. [PMID: 31045208 PMCID: PMC6602432 DOI: 10.1093/nar/gkz322] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 02/14/2019] [Revised: 04/05/2019] [Accepted: 04/23/2019] [Indexed: 12/14/2022] Open
Abstract
The IntFOLD server provides a unified resource for the automated prediction of: protein tertiary structures with built-in estimates of model accuracy (EMA), protein structural domain boundaries, natively unstructured or disordered regions in proteins, and protein–ligand interactions. The component methods have been independently evaluated via the successive blind CASP experiments and the continual CAMEO benchmarking project. The IntFOLD server has established its ranking as one of the best performing publicly available servers, based on independent official evaluation metrics. Here, we describe significant updates to the server back end, where we have focused on performance improvements in tertiary structure predictions, in terms of global 3D model quality and accuracy self-estimates (ASE), which we achieve using our newly improved ModFOLD7_rank algorithm. We also report on various upgrades to the front end including: a streamlined submission process, enhanced visualization of models, new confidence scores for ranking, and links for accessing all annotated model data. Furthermore, we now include an option for users to submit selected models for further refinement via convenient push buttons. The IntFOLD server is freely available at: http://www.reading.ac.uk/bioinf/IntFOLD/.
Affiliation(s)
- Liam J McGuffin
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
| | - Recep Adiyaman
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
| | - Ali H A Maghrabi
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
| | - Ahmad N Shuid
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK.,Infectomics cluster, Advanced Medical and Dental Institute, University of Science, Malaysia, Bertam, 13200, Kepala Batas, Pulau Pinang, Malaysia
| | | | - John O Nealon
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
| | - Limcy S Philomina
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
| |
|
30
|
Li Z, Jiang Y, Guengerich FP, Ma L, Li S, Zhang W. Engineering cytochrome P450 enzyme systems for biomedical and biotechnological applications. J Biol Chem 2020; 295:833-849. [PMID: 31811088 PMCID: PMC6970918 DOI: 10.1074/jbc.rev119.008758] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Indexed: 12/22/2022] Open
Abstract
Cytochrome P450 enzymes (P450s) are broadly distributed among living organisms and play crucial roles in natural product biosynthesis, degradation of xenobiotics, steroid biosynthesis, and drug metabolism. P450s are considered as the most versatile biocatalysts in nature because of the vast variety of substrate structures and the types of reactions they catalyze. In particular, P450s can catalyze regio- and stereoselective oxidations of nonactivated C-H bonds in complex organic molecules under mild conditions, making P450s useful biocatalysts in the production of commodity pharmaceuticals, fine or bulk chemicals, bioremediation agents, flavors, and fragrances. Major efforts have been made in engineering improved P450 systems that overcome the inherent limitations of the native enzymes. In this review, we focus on recent progress of different strategies, including protein engineering, redox-partner engineering, substrate engineering, electron source engineering, and P450-mediated metabolic engineering, in efforts to more efficiently produce pharmaceuticals and other chemicals. We also discuss future opportunities for engineering and applications of the P450 systems.
Affiliation(s)
- Zhong Li
- Shandong Provincial Key Laboratory of Synthetic Biology and CAS Key Laboratory of Biofuels at Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong 266101, China
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, Shandong 266237, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yuanyuan Jiang
- Shandong Provincial Key Laboratory of Synthetic Biology and CAS Key Laboratory of Biofuels at Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong 266101, China
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, Shandong 266237, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - F Peter Guengerich
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee 37232-0146
| | - Li Ma
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, Shandong 266237, China
| | - Shengying Li
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, Shandong 266237, China
- Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266237 Shandong, China
| | - Wei Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, Shandong 266237, China
- Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266237 Shandong, China
| |
|
31
|
Abstract
There is a large gap between the numbers of known protein-protein interactions and the corresponding experimentally solved structures of protein complexes. Fortunately, this gap can be in part bridged by computational structure modeling methods. Currently, template-based modeling is the most accurate means to predict both individual protein structures and protein complexes. One of the major issues in template-based modeling is to identify homologous structures that could be utilized as templates. To simplify this task, we have developed the PPI3D web server. The server is not only able to search for homologous protein complexes, but also provides means to analyze identified interactions and to model protein complexes. In recent CASP and CAPRI experiments, PPI3D proved to be a useful tool for homology modeling of multimeric proteins. In this chapter, we provide a brief description of the PPI3D web server capabilities and how to use the server for modeling of protein complexes.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania.
| |
|
32
|
|
33
|
Revisiting the "satisfaction of spatial restraints" approach of MODELLER for protein homology modeling. PLoS Comput Biol 2019; 15:e1007219. [PMID: 31846452 PMCID: PMC6938380 DOI: 10.1371/journal.pcbi.1007219] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/25/2019] [Revised: 12/31/2019] [Accepted: 11/13/2019] [Indexed: 01/02/2023] Open
Abstract
The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the “modeling by satisfaction of spatial restraints” strategy and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the level of accuracy in the estimation of structural variability between a target protein and its templates in the form of σ values profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated to the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program’s predictions, especially in multiple-template modeling. Secondly, we have inquired into how the incorporation of statistical potential terms (such as the DOPE potential) in the MODELLER’s objective function impacts positively 3D modeling quality by providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is a large room for improving MODELLER in terms of 3D modeling quality and we propose strategies that could be pursued in order to further increase its performance.
Proteins are fundamental biological molecules that carry out countless activities in living beings. Since the function of proteins is dictated by their three-dimensional atomic structures, acquiring structural details of proteins provides deep insights into their function. Currently, the most frequently used computational approach for protein structure prediction is template-based modeling. In this approach, a target protein is modeled using the experimentally-derived structural information of a template protein assumed to have a similar structure to the target. MODELLER is the most frequently used program for template-based 3D model building. Despite its success, its predictions are not always accurate enough to be useful in Biomedical Research. Here, we show that it is possible to greatly increase the performance of MODELLER by modifying two aspects of its algorithm. First, we demonstrate that providing the program with accurate estimations of local target-template structural divergence greatly increases the quality of its predictions. Additionally, we show that modifying MODELLER’s scoring function with statistical potential energetic terms also helps to improve modeling quality. This work will be useful in future research, since it reports practical strategies to improve the performance of this core tool in Structural Bioinformatics.
|
34
|
Theoretical and Experimental Approaches Aimed at Drug Design Targeting Neurodegenerative Diseases. Processes (Basel) 2019. [DOI: 10.3390/pr7120940] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022] Open
Abstract
In recent years, green chemistry has been strengthening, showing how basic and applied sciences advance globally, protecting the environment and human health. A clear example of this evolution is the synergy that now exists between theoretical and computational methods to design new drugs in the most efficient possible way, using the minimum of reagents and obtaining the maximum yield. The development of compounds with potential therapeutic activity against multiple targets associated with neurodegenerative diseases/disorders (NDD) such as Alzheimer’s disease is a hot topic in medical chemistry, where different scientists from various disciplines collaborate to find safe, active, and effective drugs. NDD are a public health problem, affecting mainly the population over 60 years old. To generate significant progress in the pharmacological treatment of NDD, it is necessary to employ different experimental strategies of green chemistry, medical chemistry, and molecular biology, coupled with computational and theoretical approaches such as molecular simulations and chemoinformatics, all framed in the rational drug design targeting NDD. Here, we review how green chemistry and computational approaches have been used to develop new compounds with the potential application against NDD, as well as the challenges and new directions of the drug development multidisciplinary process.
|
35
|
Mutational Analysis of a Highly Conserved PLSSMXP Sequence in the Small Subunit of Bacillus licheniformis γ-Glutamyltranspeptidase. Biomolecules 2019; 9:biom9090508. [PMID: 31546955 PMCID: PMC6769717 DOI: 10.3390/biom9090508] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/07/2019] [Revised: 09/18/2019] [Accepted: 09/19/2019] [Indexed: 01/13/2023] Open
Abstract
A highly conserved 458PLSSMXP464 sequence in the small subunit (S-subunit) of an industrially important Bacillus licheniformis γ-glutamyltranspeptidase (BlGGT) was identified by sequence alignment. Molecular structures of the precursor mimic and the mature form of BlGGT clearly reveal that this peptide sequence is in close spatial proximity to the self-processing and catalytic sites of the enzyme. To probe the role of this conserved sequence, ten mutant enzymes of BlGGT were created through a series of deletion and alanine-scanning mutagenesis. SDS-PAGE and densitometric analyses showed that the intrinsic ability of BlGGT to undergo autocatalytic processing was detrimentally affected by the deletion-associated mutations. However, loss of self-activating capacity was not obviously observed in most of the Ala-replacement mutants. The Ala-replacement mutants had a specific activity comparable to or greater than that of the wild-type enzyme; conversely, all deletion mutants completely lost their enzymatic activity. As compared with BlGGT, S460A and S461S showed greatly enhanced kcat/Km values by 2.73- and 2.67-fold, respectively. The intrinsic tryptophan fluorescence and circular dichroism spectral profiles of Ala-replacement and deletion mutants were typically similar to those of BlGGT. However, heat and guanidine hydrochloride-induced unfolding transitions of the deletion-associated mutant proteins were severely reduced as compared with the wild-type enzyme. The predictive mutant models suggest that the microenvironments required for both self-activation and catalytic reaction of BlGGT can be altered upon mutations.
|
36
|
Sumbalova L, Stourac J, Martinek T, Bednar D, Damborsky J. HotSpot Wizard 3.0: web server for automated design of mutations and smart libraries based on sequence input information. Nucleic Acids Res 2019; 46:W356-W362. [PMID: 29796670 PMCID: PMC6030891 DOI: 10.1093/nar/gky417] [Citation(s) in RCA: 191] [Impact Index Per Article: 31.8] [Received: 02/04/2018] [Accepted: 05/07/2018] [Indexed: 11/30/2022] Open
Abstract
HotSpot Wizard is a web server used for the automated identification of hotspots in semi-rational protein design to give improved protein stability, catalytic activity, substrate specificity and enantioselectivity. Since there are three orders of magnitude fewer protein structures than sequences in bioinformatic databases, the major limitation to the usability of previous versions was the requirement for the protein structure to be a compulsory input for the calculation. HotSpot Wizard 3.0 now accepts the protein sequence as input data. The protein structure for the query sequence is obtained either from eight repositories of homology models or is modeled using Modeller and I-Tasser. The quality of the models is then evaluated using three quality assessment tools—WHAT_CHECK, PROCHECK and MolProbity. During follow-up analyses, the system automatically warns the users whenever they attempt to redesign poorly predicted parts of their homology models. The second main limitation of HotSpot Wizard’s predictions is that it identifies suitable positions for mutagenesis, but does not provide any reliable advice on particular substitutions. A new module for the estimation of thermodynamic stabilities using the Rosetta and FoldX suites has been introduced which prevents destabilizing mutations among pre-selected variants entering experimental testing. HotSpot Wizard is freely available at http://loschmidt.chemi.muni.cz/hotspotwizard.
Affiliation(s)
- Lenka Sumbalova
- Loschmidt Laboratories, Department of Experimental Biology, Masaryk University, 62500 Brno, Czech Republic.,IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, Bozetechova 2, 61266 Brno, Czech Republic
| | - Jan Stourac
- Loschmidt Laboratories, Department of Experimental Biology, Masaryk University, 62500 Brno, Czech Republic.,International Centre for Clinical Research, St. Anne's University Hospital Brno, 65691 Brno, Czech Republic
| | - Tomas Martinek
- IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, Bozetechova 2, 61266 Brno, Czech Republic
| | - David Bednar
- Loschmidt Laboratories, Department of Experimental Biology, Masaryk University, 62500 Brno, Czech Republic.,International Centre for Clinical Research, St. Anne's University Hospital Brno, 65691 Brno, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology, Masaryk University, 62500 Brno, Czech Republic.,International Centre for Clinical Research, St. Anne's University Hospital Brno, 65691 Brno, Czech Republic
| |
|
37
|
Uddin R, Siddiqui QN, Sufian M, Azam SS, Wadood A. Proteome-wide subtractive approach to prioritize a hypothetical protein of XDR-Mycobacterium tuberculosis as potential drug target. Genes Genomics 2019; 41:1281-1292. [PMID: 31388979 DOI: 10.1007/s13258-019-00857-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 06/08/2019] [Accepted: 07/25/2019] [Indexed: 12/12/2022]
Abstract
BACKGROUND Among the resistant isolates of MTB, multidrug resistant tuberculosis (MDR-TB) and extensively drug resistant tuberculosis (XDR-TB) are areas of growing concern. Genomic analysis showed that more than 30% of the XDR-MTB proteome consists of hypothetical proteins for which no functions have been annotated yet. This class of proteins is presumably important for completing genome and proteome information. Advances in bioinformatics have helped to annotate these hypothetical proteins using various computational tools and have the potential to classify them functionally. OBJECTIVE The objective of this study was to propose a new and unique drug target against the deadly Mycobacterium tuberculosis using bioinformatics approaches to characterize the hypothetical proteins. RESULTS We reduced the hypothetical proteins (total number: 1256) in the complete proteome stepwise to only 26 essential hypothetical proteins. Out of those 26 proteins, the protein WP_003401246.1 was computationally characterized as the druggable target. CONCLUSION The study proposed a hypothetical protein from the complete proteome of XDR-MTB as a new drug target against which new drug candidates can be proposed. Hence, the study opens up new avenues in drug discovery against the deadly M. tuberculosis.
Affiliation(s)
- Reaz Uddin
- Lab 103 PCMD ext. Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan.
| | - Quratulain Nehal Siddiqui
- Lab 103 PCMD ext. Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
| | - Muhammad Sufian
- Lab 103 PCMD ext. Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
| | - Syed Sikander Azam
- National Centre for Bioinformatics, Quaid-i-Azam University, Islamabad, Pakistan
| | - Abdul Wadood
- Department of Biochemistry, Abdul Wali Khan University, Mardan, Pakistan
| |
|
38
|
Mead DJT, Lunagomez S, Gatherer D. Visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling. J Mol Graph Model 2019; 92:180-191. [PMID: 31377535 PMCID: PMC7110651 DOI: 10.1016/j.jmgm.2019.07.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2019] [Revised: 07/23/2019] [Accepted: 07/25/2019] [Indexed: 11/15/2022]
Abstract
The protein sequence-structure gap results from the contrast between rapid, low-cost deep sequencing, and slow, expensive experimental structure determination techniques. Comparative homology modelling may have the potential to close this gap by predicting protein structure in target sequences using existing experimentally solved structures as templates. This paper presents the first use of force-directed graphs for the visualization of sequence space in two dimensions, and applies them to the choice of suitable RNA-dependent RNA polymerase (RdRP) target-template pairs within human-infective RNA virus genera. Measures of centrality in protein sequence space for each genus were also derived and used to identify centroid nearest-neighbour sequences (CNNs) potentially useful for production of homology models most representative of their genera. Homology modelling was then carried out for target-template pairs in different species, different genera and different families, and model quality assessed using several metrics. Reconstructed ancestral RdRP sequences for individual genera were also used as templates for the production of ancestral RdRP homology models. High quality ancestral RdRP models were consistently produced, as were good quality models for target-template pairs in the same genus. Homology modelling between genera in the same family produced mixed results and inter-family modelling was unreliable. We present a protocol for the production of optimal RdRP homology models for use in further experiments, e.g. docking to discover novel anti-viral compounds.
The first use of force-directed graphs for the visualization of multidimensional protein sequence space in two dimensions.
Measures of centrality in protein sequence space to identify sequences for production of homology models.
Homology modelling for RNA-dependent RNA polymerase (RdRP) target-template pairs in different species, genera and families.
A protocol for the production of optimal RdRP homology models for use in further experiments.
Affiliation(s)
- Dylan J T Mead
- Division of Biomedical & Life Sciences, Faculty of Health & Medicine, Lancaster University, Lancaster, LA1 4YT, UK.
| | - Simón Lunagomez
- Department of Mathematics & Statistics, Lancaster University, Lancaster, LA1 4YF, UK.
| | - Derek Gatherer
- Division of Biomedical & Life Sciences, Faculty of Health & Medicine, Lancaster University, Lancaster, LA1 4YT, UK.
| |
|
39
|
Gonzalez TL, Rae JM, Colacino JA, Richardson RJ. Homology models of mouse and rat estrogen receptor- α ligand-binding domain created by in silico mutagenesis of a human template: molecular docking with 17ß-estradiol, diethylstilbestrol, and paraben analogs. COMPUTATIONAL TOXICOLOGY (AMSTERDAM, NETHERLANDS) 2019; 10:1-16. [PMID: 30740556 PMCID: PMC6363358 DOI: 10.1016/j.comtox.2018.11.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022]
Abstract
Crystal structures exist for human, but not rodent, estrogen receptor-α ligand-binding domain (ERα-LBD). Consequently, rodent studies involving binding of compounds to ERα-LBD are limited in their molecular-level interpretation and extrapolation to humans. Because the sequences of rodent and human ERα-LBDs are >95% identical, we expected their 3D structures and ligand binding to be highly similar. To test this hypothesis, we used the human ERα-LBD structure (PDB 3UUD) as a template to produce rat and mouse homology models. Employing the rodent models and human structure, we generated docking poses of 23 Group A ligands (17ß-estradiol, diethylstilbestrol, and 21 paraben analogs) in AutoDock Vina for interspecies comparisons. Ligand RMSDs (Å) (median, 95% CI) were 0.49 (0.21-1.82) (human-mouse) and 1.19 (0.22-1.82) (human-rat), well below the 2.0-2.5 Å range for equivalent docking poses. Numbers of interspecies ligand-receptor residue contacts were highly similar, with Sorensen Sc (%) = 96.8 (90.0-100) (human-mouse) and 97.7 (89.5-100) (human-rat). Likewise, numbers of interspecies ligand-receptor residue contacts were highly correlated: Pearson r = 0.913 (human-mouse) and 0.925 (human-rat). Numbers of interspecies ligand-receptor atom contacts were even more tightly correlated: r = 0.979 (human-mouse) and 0.986 (human-rat). Pyramid plots of numbers of ligand-receptor atom contacts by residue exhibited high interspecies symmetry and had Spearman r_s = 0.977 (human-mouse) and 0.966 (human-rat). Group B ligands included 15 ring-substituted parabens recently shown experimentally to exhibit decreased binding to human ERα and to exert increased antimicrobial activity. Ligand efficiencies calculated from docking ligands into human ERα-LBD were well correlated with those derived from published experimental data (Pearson partial r_p = 0.894 and 0.918; Groups A and B, respectively). Overall, the results indicate that our constructed rodent ERα-LBDs interact with ligands in like manner to the human receptor, thus providing a high level of confidence in extrapolations of rodent to human ligand-receptor interactions.
Affiliation(s)
- Thomas L. Gonzalez
- Department of Environmental Health Sciences, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - James M. Rae
- Division of Hematology and Oncology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
| | - Justin A. Colacino
- Department of Environmental Health Sciences, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
- Department of Nutritional Sciences, University of Michigan School of Public Health, Ann Arbor, MI 48109 USA
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109 USA
| | - Rudy J. Richardson
- Department of Environmental Health Sciences, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109 USA
- Department of Neurology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| |
|
40
|
Downard KM, Maleknia SD. Mass spectrometry in structural proteomics: The case for radical probe protein footprinting. Trends Analyt Chem 2019. [DOI: 10.1016/j.trac.2018.11.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/11/2023]
|
41
|
Baig MH, Ahmad K, Rabbani G, Danishuddin M, Choi I. Computer Aided Drug Design and its Application to the Development of Potential Drugs for Neurodegenerative Disorders. Curr Neuropharmacol 2018; 16:740-748. [PMID: 29046156 PMCID: PMC6080097 DOI: 10.2174/1570159x15666171016163510] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Received: 01/17/2017] [Revised: 09/24/2017] [Accepted: 10/10/2017] [Indexed: 12/11/2022] Open
Abstract
Background: Neurodegenerative disorders (NDs) are a diverse group of disorders characterized by escalating loss of neurons (structural and functional). The development of potential therapeutics for NDs presents an important challenge, as traditional treatments are inefficient and usually are unable to stop or retard the process of neurodegeneration. Computer-Aided Drug Design (CADD) has emerged as an efficient means of developing candidate drugs for the treatment of many disease types. Applications of the CADD approach to drug discovery are progressing day by day. The recent tendency in drug design is to rationally design potent therapeutics with multi-targeting effects, higher efficacies, and fewer side effects, especially in terms of toxicity. Methods: A wide literature search was performed for writing this review. An updated view of the different types of NDs and their effect on the human population, together with a brief introduction to CADD and the various approaches involved in this technique, ranging from structure-based to ligand-based drug design, is discussed. The successful application of CADD approaches for the treatment of neurodegenerative disorders is also included in this review. Results: In this review, we have briefly described CADD and its use in the development of therapeutic drug candidates against NDs. The successful applications, limitations and future prospects of this approach have also been discussed. Conclusion: CADD can assist researchers studying interactions between drugs and receptors. We believe this review will be helpful for a better understanding of CADD and its applications towards the discovery of new drug candidates against various fatal NDs.
Affiliation(s)
| | - Khurshid Ahmad
- Department of Medical Biotechnology, Yeungnam University, Gyeongsan, Korea
| | - Gulam Rabbani
- Department of Medical Biotechnology, Yeungnam University, Gyeongsan, Korea
| | - Mohd Danishuddin
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi-110067, India
| | - Inho Choi
- Department of Medical Biotechnology, Yeungnam University, Gyeongsan, Korea
|
42
|
Pereira GRC, Da Silva ANR, Do Nascimento SS, De Mesquita JF. In silico analysis and molecular dynamics simulation of human superoxide dismutase 3 (SOD3) genetic variants. J Cell Biochem 2018; 120:3583-3598. [DOI: 10.1002/jcb.27636] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/23/2018] [Accepted: 08/16/2018] [Indexed: 01/05/2023]
Affiliation(s)
- G. R. C. Pereira
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| | - A. N. R. Da Silva
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| | - S. S. Do Nascimento
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| | - J. F. De Mesquita
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
|
43
|
Glusman G, Rose PW, Prlić A, Dougherty J, Duarte JM, Hoffman AS, Barton GJ, Bendixen E, Bergquist T, Bock C, Brunk E, Buljan M, Burley SK, Cai B, Carter H, Gao J, Godzik A, Heuer M, Hicks M, Hrabe T, Karchin R, Leman JK, Lane L, Masica DL, Mooney SD, Moult J, Omenn GS, Pearl F, Pejaver V, Reynolds SM, Rokem A, Schwede T, Song S, Tilgner H, Valasatava Y, Zhang Y, Deutsch EW. Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework. Genome Med 2017; 9:113. [PMID: 29254494 PMCID: PMC5735928 DOI: 10.1186/s13073-017-0509-y] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Indexed: 11/10/2022]
Abstract
The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.
Affiliation(s)
| | - Peter W Rose
- San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA
| | - Andreas Prlić
- San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA; RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
| | | | - José M Duarte
- RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
| | - Andrew S Hoffman
- Human Centered Design & Engineering, University of Washington, Seattle, WA, 98195, USA
| | - Geoffrey J Barton
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK
| | - Emøke Bendixen
- Department of Molecular Biology and Genetics, Aarhus University, 8000, Aarhus, Denmark
| | - Timothy Bergquist
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
| | - Christian Bock
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
| | - Elizabeth Brunk
- University of California San Diego, La Jolla, CA, 92093, USA
| | - Marija Buljan
- Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Zurich, Switzerland
| | - Stephen K Burley
- San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA; RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
| | - Binghuang Cai
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
| | - Hannah Carter
- University of California San Diego, La Jolla, CA, 92093, USA
| | - JianJiong Gao
- Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - Adam Godzik
- SBP Medical Discovery Institute, La Jolla, CA, 92037, USA
| | - Michael Heuer
- AMPLab, University of California, Berkeley, CA, 94720, USA
| | | | - Thomas Hrabe
- SBP Medical Discovery Institute, La Jolla, CA, 92037, USA
| | - Rachel Karchin
- Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA; Department of Oncology, Johns Hopkins Medicine, Baltimore, MD, 21287, USA
| | - Julia Koehler Leman
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, 10010, USA; Department of Biology and Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - Lydie Lane
- SIB Swiss Institute of Bioinformatics and University of Geneva, CH-1211, Geneva, Switzerland
| | - David L Masica
- Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, 20850, USA; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, 20742, USA
| | - Gilbert S Omenn
- Institute for Systems Biology, Seattle, WA, 98109, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
| | - Frances Pearl
- School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK
| | - Vikas Pejaver
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA; The University of Washington eScience Institute, Seattle, WA, 98195, USA
| | | | - Ariel Rokem
- The University of Washington eScience Institute, Seattle, WA, 98195, USA
| | - Torsten Schwede
- SIB Swiss Institute of Bioinformatics and Biozentrum University of Basel, CH-4056, Basel, Switzerland
| | - Sicheng Song
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
| | - Hagen Tilgner
- Brain and Mind Research Institute, Weill Cornell Medicine, New York City, NY, 10021, USA
| | - Yana Valasatava
- RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
| | - Yang Zhang
- Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
|
44
|
Shanmugam G, Jeon J. Computer-Aided Drug Discovery in Plant Pathology. The Plant Pathology Journal 2017; 33:529-542. [PMID: 29238276 PMCID: PMC5720600 DOI: 10.5423/ppj.rw.04.2017.0084] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/12/2017] [Revised: 09/06/2017] [Accepted: 09/06/2017] [Indexed: 05/31/2023]
Abstract
Control of plant diseases is largely dependent on the use of agrochemicals. However, there are widening gaps between our knowledge of plant diseases gained from genetic/mechanistic studies and the rapid translation of that knowledge into target-oriented development of effective agrochemicals. Here we propose that the time is ripe for computer-aided drug discovery/design (CADD) in molecular plant pathology. CADD has played a pivotal role in the development of medically important molecules over the last three decades. Now, an explosive increase in information on genome sequences and three-dimensional structures of biological molecules, in combination with advances in computational and informational technologies, opens up exciting possibilities for the application of CADD in the discovery and development of agrochemicals. In this review, we outline two categories of drug discovery strategies, structure- and ligand-based CADD, and the relevant computational approaches that are being employed in modern drug discovery. To help readers dive into CADD, we explain the concepts of homology modelling, molecular docking, virtual screening, and de novo ligand design in structure-based CADD, and pharmacophore modelling, ligand-based virtual screening, quantitative structure-activity relationship modelling and de novo ligand design for ligand-based CADD. We also provide the important resources available to carry out CADD. Finally, we present a case study showing how the CADD approach can be implemented in practice for the identification of potent chemical compounds against the important plant pathogens Pseudomonas syringae and Colletotrichum gloeosporioides.
Affiliation(s)
| | - Junhyun Jeon
- Corresponding author. Phone) +82-53-810-3030, FAX) +82-53-810-4769, E-mail)
|
45
|
Somody JC, MacKinnon SS, Windemuth A. Structural coverage of the proteome for pharmaceutical applications. Drug Discov Today 2017; 22:1792-1799. [DOI: 10.1016/j.drudis.2017.08.004] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 05/31/2017] [Revised: 08/16/2017] [Accepted: 08/17/2017] [Indexed: 01/09/2023]
|
46
|
Sahlgren C, Meinander A, Zhang H, Cheng F, Preis M, Xu C, Salminen TA, Toivola D, Abankwa D, Rosling A, Karaman DŞ, Salo-Ahen OMH, Österbacka R, Eriksson JE, Willför S, Petre I, Peltonen J, Leino R, Johnson M, Rosenholm J, Sandler N. Tailored Approaches in Drug Development and Diagnostics: From Molecular Design to Biological Model Systems. Adv Healthc Mater 2017; 6. [PMID: 28892296 DOI: 10.1002/adhm.201700258] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 02/26/2017] [Revised: 05/04/2017] [Indexed: 12/13/2022]
Abstract
Approaches to increase the efficiency of developing drugs and diagnostic tools, including new drug delivery and diagnostic technologies, are needed for improved diagnosis and treatment of major diseases and health problems such as cancer, inflammatory diseases, chronic wounds, and antibiotic resistance. Development within several areas of research, ranging from computational sciences, material sciences, and bioengineering to biomedical sciences and bioimaging, is needed to realize innovative drug development and diagnostic (DDD) approaches. Here, an overview is provided of recent progress within key areas that can offer customizable solutions to improve the processes and approaches taken within DDD. Due to the breadth of the area, not all relevant aspects, such as the pharmacokinetics of bioactive molecules and delivery systems, can be covered. The focus is on tailored approaches within (i) bioinformatics and computer-aided drug design, (ii) nanotechnology, (iii) novel materials and technologies for drug delivery and diagnostic systems, and (iv) disease models to predict the safety and efficacy of medicines under development. Current developments and challenges ahead are discussed. The broad scope reflects the multidisciplinary nature of the field of DDD and aims to highlight the convergence of biological, pharmaceutical, and medical disciplines needed to meet the societal challenges of the 21st century.
Affiliation(s)
- Cecilia Sahlgren
- Faculty of Science and Engineering; Cell Biology; Åbo Akademi University; FI-20520 Turku Finland
- Turku Centre for Biotechnology; Åbo Akademi University and University of Turku; FI-20520 Turku Finland
- Department of Biomedical Engineering; Technical University of Eindhoven; 5613 DR Eindhoven Netherlands
| | - Annika Meinander
- Faculty of Science and Engineering; Cell Biology; Åbo Akademi University; FI-20520 Turku Finland
| | - Hongbo Zhang
- Faculty of Science and Engineering; Pharmaceutical Sciences Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Fang Cheng
- Faculty of Science and Engineering; Cell Biology; Åbo Akademi University; FI-20520 Turku Finland
| | - Maren Preis
- Faculty of Science and Engineering; Pharmaceutical Sciences Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Chunlin Xu
- Faculty of Science and Engineering; Natural Materials Technology; Åbo Akademi University; FI-20500 Turku Finland
| | - Tiina A. Salminen
- Faculty of Science and Engineering; Structural Bioinformatics Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Diana Toivola
- Faculty of Science and Engineering; Cell Biology; Åbo Akademi University; FI-20520 Turku Finland
- Turku Center for Disease Modeling; University of Turku; FI-20520 Turku Finland
| | - Daniel Abankwa
- Department of Biomedical Engineering; Technical University of Eindhoven; 5613 DR Eindhoven Netherlands
| | - Ari Rosling
- Faculty of Science and Engineering; Polymer Technologies; Åbo Akademi University; FI-20500 Turku Finland
| | - Didem Şen Karaman
- Faculty of Science and Engineering; Pharmaceutical Sciences Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Outi M. H. Salo-Ahen
- Faculty of Science and Engineering; Pharmaceutical Sciences Laboratory; Åbo Akademi University; FI-20520 Turku Finland
- Faculty of Science and Engineering; Structural Bioinformatics Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Ronald Österbacka
- Faculty of Science and Engineering; Physics; Åbo Akademi University; FI-20500 Turku Finland
| | - John E. Eriksson
- Faculty of Science and Engineering; Cell Biology; Åbo Akademi University; FI-20520 Turku Finland
- Turku Centre for Biotechnology; Åbo Akademi University and University of Turku; FI-20520 Turku Finland
| | - Stefan Willför
- Faculty of Science and Engineering; Natural Materials Technology; Åbo Akademi University; FI-20500 Turku Finland
| | - Ion Petre
- Faculty of Science and Engineering; Computer Science; Åbo Akademi University; FI-20500 Turku Finland
| | - Jouko Peltonen
- Faculty of Science and Engineering; Physical Chemistry; Åbo Akademi University; FI-20500 Turku Finland
| | - Reko Leino
- Faculty of Science and Engineering; Organic Chemistry; Johan Gadolin Process Chemistry Centre; Åbo Akademi University; FI-20500 Turku Finland
| | - Mark Johnson
- Faculty of Science and Engineering; Structural Bioinformatics Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Jessica Rosenholm
- Faculty of Science and Engineering; Pharmaceutical Sciences Laboratory; Åbo Akademi University; FI-20520 Turku Finland
| | - Niklas Sandler
- Faculty of Science and Engineering; Pharmaceutical Sciences Laboratory; Åbo Akademi University; FI-20520 Turku Finland
|
47
|
Sahlgren C, Meinander A, Zhang H, Cheng F, Preis M, Xu C, Salminen TA, Toivola D, Abankwa D, Rosling A, Karaman DŞ, Salo-Ahen OMH, Österbacka R, Eriksson JE, Willför S, Petre I, Peltonen J, Leino R, Johnson M, Rosenholm J, Sandler N. Tailored Approaches in Drug Development and Diagnostics: From Molecular Design to Biological Model Systems. Adv Healthc Mater 2017. [DOI: 10.1002/adhm.201700258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/26/2023]
|
48
|
Monzon AM, Zea DJ, Marino-Buslje C, Parisi G. Homology modeling in a dynamical world. Protein Sci 2017; 26:2195-2206. [PMID: 28815769 DOI: 10.1002/pro.3274] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/06/2017] [Revised: 08/09/2017] [Accepted: 08/09/2017] [Indexed: 12/31/2022]
Abstract
A key concept in template-based modeling (TBM) is the high correlation between sequence and structural divergence, with the practical consequence that homologous proteins that are similar at the sequence level will also be similar at the structural level. However, conformational diversity of the native state will reduce the correlation between structural and sequence divergence, because structural variation can appear without sequence diversity. In this work, we explore the impact that conformational diversity has on the relationship between structural and sequence divergence. We find that the extent of conformational diversity can be as high as the maximum structural divergence among families. Also, as expected, conformational diversity impairs the well-established correlation between sequence and structural divergence, which is noisier than previously suggested. However, we found that this noise can be resolved using a priori information coming from the structure-function relationship. We show that protein families with low conformational diversity show a well-correlated relationship between sequence and structural divergence, which is severely reduced in proteins with larger conformational diversity. This lack of correlation could impair TBM results in highly dynamical proteins. Finally, we also find that the presence of order/disorder can provide useful prior information for better TBM performance.
Affiliation(s)
- Alexander Miguel Monzon
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, B1876BXD, Bernal, Argentina
| | - Diego Javier Zea
- Structural Bioinformatics Unit, Fundación Instituto Leloir, CONICET, C1405BWE Ciudad Autónoma de Buenos Aires, Argentina
| | - Cristina Marino-Buslje
- Structural Bioinformatics Unit, Fundación Instituto Leloir, CONICET, C1405BWE Ciudad Autónoma de Buenos Aires, Argentina
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, B1876BXD, Bernal, Argentina
|
49
|
Dapkūnas J, Olechnovič K, Venclovas Č. Modeling of protein complexes in CAPRI Round 37 using template-based approach combined with model selection. Proteins 2017; 86 Suppl 1:292-301. [DOI: 10.1002/prot.25378] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/14/2017] [Revised: 08/25/2017] [Accepted: 09/10/2017] [Indexed: 01/14/2023]
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Vilnius University, Saulėtekio 7; Vilnius LT-10257 Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Vilnius University, Saulėtekio 7; Vilnius LT-10257 Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Vilnius University, Saulėtekio 7; Vilnius LT-10257 Lithuania
|
50
|
Franzoi M, Sturlese M, Bellanda M, Mammi S. A molecular dynamics strategy for CSαβ peptides disulfide-assisted model refinement. J Biomol Struct Dyn 2017; 35:2736-2744. [PMID: 27581488 DOI: 10.1080/07391102.2016.1231081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Many cysteine-stabilized antimicrobial peptides from a variety of living organisms could be good candidates for the development of anti-infective agents. In the absence of experimentally obtained structural data, peptide modeling is an essential tool for understanding structure-activity relationships and for optimizing the bioactive moieties. Focusing on cysteine-rich peptide structures, we reproduced the case of structure predictions in the so-called midnight zone. We developed our protocol on a training set derived by clustering the available cysteine-stabilized αβ (CSαβ) structures in nine different representative families and tested it on peptides randomly selected from each family. Starting from draft models, we tested a structure-based disulfide predictor and we used cysteine distances as constraints during molecular dynamics. Finally, we proposed an analysis for final structure selection. Accordingly, we obtained a mean root mean square deviation improvement of 21% for the test set. Our findings demonstrate that it is possible to predict the network of disulfide bridges in cysteine-stabilized peptides and to use this result to improve the accuracy of structural predictions. Finally, we applied the methods to predict the structure of royalisin, a cysteine-rich peptide with unknown structure.
Affiliation(s)
- Marco Franzoi
- Department of Biology, University of Padova, Via Ugo Bassi 58/B, Padova 35131, Italy
| | - Mattia Sturlese
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, Via Marzolo 5, Padova 35131, Italy
| | - Massimo Bellanda
- Department of Chemical Sciences, University of Padova, Via Marzolo 1, Padova 35131, Italy
| | - Stefano Mammi
- Department of Chemical Sciences, University of Padova, Via Marzolo 1, Padova 35131, Italy
|