1
Cretin G, Périn C, Zimmermann N, Galochkina T, Gelly JC. ICARUS: flexible protein structural alignment based on Protein Units. Bioinformatics 2023; 39:btad459. [PMID: 37498544 PMCID: PMC10400377 DOI: 10.1093/bioinformatics/btad459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 07/04/2023] [Accepted: 07/26/2023] [Indexed: 07/28/2023] Open
Abstract
MOTIVATION: Alignment of protein structures is a major problem in structural biology. The most common first approach is to treat proteins as rigid bodies. However, alignment can be very complex because of conformational variability or complex evolutionary relationships between proteins, such as insertions, circular permutations or repetitions. In such cases, introducing flexibility is useful for two reasons: (i) it helps compare two protein chains that have adopted different conformational states, for example because of protein-ligand interactions or post-translational modifications, and (ii) it aids the identification of conserved regions in proteins that may have distant evolutionary relationships. RESULTS: We propose ICARUS, a new approach for flexible structural alignment based on the identification of Protein Units, evolutionarily preserved structural descriptors of intermediate size, between secondary structures and domains. ICARUS significantly outperforms reference methods on a dataset of very difficult structural alignments. AVAILABILITY AND IMPLEMENTATION: Code is freely available online at https://github.com/DSIMB/ICARUS.
Affiliation(s)
- Gabriel Cretin
  - Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Charlotte Périn
  - Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
  - TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Nicolas Zimmermann
  - Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Tatiana Galochkina
  - Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Jean-Christophe Gelly
  - Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
2
Jessen-Howard D, Pan Q, Ascher DB. Identifying the Molecular Drivers of Pathogenic Aldehyde Dehydrogenase Missense Mutations in Cancer and Non-Cancer Diseases. Int J Mol Sci 2023; 24:10157. [PMID: 37373306 DOI: 10.3390/ijms241210157] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/30/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open
Abstract
Human aldehyde dehydrogenases (ALDHs), comprising 19 isoenzymes, play a vital role in both endogenous and exogenous aldehyde metabolism. This NAD(P)-dependent catalytic process relies on intact cofactor binding, substrate interaction, and oligomerization of ALDHs. Disruption of ALDH activity, however, can result in the accumulation of cytotoxic aldehydes, which have been linked with a wide range of diseases, including cancers as well as neurological and developmental disorders. In our previous work, we successfully characterised the structure-function relationships of missense variants of other proteins. We therefore applied a similar analysis pipeline to identify potential molecular drivers of pathogenic ALDH missense mutations. Variant data were first carefully curated and labelled as cancer-risk, non-cancer disease, or benign. We then used various computational biophysical methods to describe the changes caused by missense mutations, which indicated a bias of detrimental mutations towards destabilising effects. Building on these insights, several machine learning approaches were used to investigate combinations of features, revealing the importance of conservation in ALDHs. Our work aims to provide important biological perspectives on the pathogenic consequences of missense mutations in ALDHs, which could be an invaluable resource in the development of cancer treatments.
Affiliation(s)
- Dana Jessen-Howard
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Qisheng Pan
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
- David B Ascher
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
3
Satvati S, Ghasemi Y, Najafipour S, Eskandari S, Mahmoodi S, Nezafat N, Hashemzaei M. Finding and engineering the newly found bacterial superoxide dismutase enzyme to increase its thermostability and decrease the immunogenicity: a computational and experimental research. Arch Microbiol 2023; 205:260. [PMID: 37291420 DOI: 10.1007/s00203-023-03601-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 05/23/2023] [Accepted: 05/29/2023] [Indexed: 06/10/2023]
Abstract
Superoxide dismutase (SOD) is one of the most important antioxidant enzymes that can reduce oxidative stress in the cell environment. Bacterial sources of the enzyme are now used commercially in the cosmetics and pharmaceutical industries, but the allergenic effect of proteins from non-human sources has been cited as a disadvantage of these enzymes. In this study, to find a suitable bacterial SOD candidate with decreased immunogenicity, the sequences of five thermophilic bacteria were selected as reference species. Linear and conformational B-cell epitopes of the SOD were then analyzed with different servers. The stability and immunogenicity of mutant positions were also evaluated. The mutant gene was inserted into the pET-23a expression vector and transformed into E. coli BL21 (DE3) for expression of the recombinant enzyme. Expression of the mutant enzyme was then evaluated by SDS-PAGE analysis, and the activity of the recombinant enzyme was assessed. Anoxybacillus gonensis was selected as a reasonable SOD source according to BLAST search, physicochemical properties analysis, and prediction of allergenic features. According to our results, five residues (E84, E142, K144, G147, and M148) were predicted as candidates for mutagenesis. K144A was chosen as the final modification because it increased the stability of the enzyme and decreased its immunogenicity. The enzyme activity was 240 U/ml at room temperature. Substitution of K144 with alanine increased the stability of the enzyme. In silico studies confirmed that the protein was non-antigenic after mutation.
Affiliation(s)
- Saha Satvati
  - Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Younes Ghasemi
  - Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
  - Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Sohrab Najafipour
  - Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
  - Department of Tissue Engineering, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Sedigheh Eskandari
  - Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Shirin Mahmoodi
  - Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Navid Nezafat
  - Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
  - Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
  - Pharmaceutical Science Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Masoud Hashemzaei
  - Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
  - Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
4
Estrada A, Suárez-Díaz E, Becerra A. Reconstructing the Last Common Ancestor: Epistemological and Empirical Challenges. Acta Biotheor 2022; 70:15. [PMID: 35575816 DOI: 10.1007/s10441-022-09439-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Accepted: 04/25/2022] [Indexed: 11/24/2022]
Abstract
Reconstructing the genetic traits of the Last Common Ancestor (LCA) and the Tree of Life (TOL) are two examples of the reach of contemporary molecular phylogenetics. Nevertheless, the whole enterprise has led to paradoxical results. The presence of Lateral Gene Transfer (LGT) poses epistemic and empirical challenges to meeting these goals; the discussion around this subject has been enriched by arguments from philosophers and historians of science. At the same time, a few influential research groups have aimed to reconstruct the LCA with richly detailed hypotheses and high-resolution catalogs of genes and metabolic traits. We argue that LGT poses insurmountable challenges for richly detailed reconstructions and propose, instead, a middle-ground position: the reconstruction of a slim LCA based on traits under strong pressures of Negative Natural Selection, together with the need for consilience with evidence from organismal biology and geochemistry. We defend a cautionary perspective that goes beyond the statistical analysis of gene similarities and accepts the broader consequences of evolving empirical data and epistemic pluralism in the reconstruction of early life.
Affiliation(s)
- Amadeo Estrada
  - Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México, Coyoacán, Mexico
- Edna Suárez-Díaz
  - Facultad de Ciencias, Universidad Nacional Autónoma de México, Circuito Exterior Ciudad Universitaria, 04510, Coyoacán, DF, Mexico
- Arturo Becerra
  - Facultad de Ciencias, Universidad Nacional Autónoma de México, Circuito Exterior Ciudad Universitaria, 04510, Coyoacán, DF, Mexico
5
Sanchez-Pulido L, Ponting CP. Extending the Horizon of Homology Detection with Coevolution-based Structure Prediction. J Mol Biol 2021; 433:167106. [PMID: 34139218 PMCID: PMC8527833 DOI: 10.1016/j.jmb.2021.167106] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/28/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 12/12/2022]
Abstract
Traditional sequence analysis algorithms fail to identify distant homologies when they lie beyond a detection horizon. In this review, we discuss how co-evolution-based contact and distance prediction methods are pushing back this homology detection horizon, thereby yielding new functional insights and experimentally testable hypotheses. Based on correlated substitutions, these methods divine three-dimensional constraints among amino acids in protein sequences that were previously devoid of all annotated domains and repeats. The new algorithms discern hidden structure in an otherwise featureless sequence landscape. Their revelatory impact promises to be as profound as the use, by archaeologists, of ground-penetrating radar to discern long-hidden, subterranean structures. As examples of this, we describe how triplicated structures reflecting longin domains in MON1A-like proteins, or UVR-like repeats in DISC1, emerge from their predicted contact and distance maps. These methods also help to resolve structures that do not conform to a "beads-on-a-string" model of protein domains. In one such example, we describe CFAP298 whose ubiquitin-like domain was previously challenging to perceive owing to a large sequence insertion within it. More generally, the new algorithms permit an easier appreciation of domain families and folds whose evolution involved structural insertion or rearrangement. As we exemplify with α1-antitrypsin, coevolution-based predicted contacts may also yield insights into protein dynamics and conformational change. This new combination of structure prediction (using innovative co-evolution based methods) and homology inference (using more traditional sequence analysis approaches) shows great promise for bringing into view a sea of evolutionary relationships that had hitherto lain far beyond the horizon of homology detection.
Affiliation(s)
- Luis Sanchez-Pulido
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
- Chris P Ponting
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
6
Waqas M, Haider A, Rehman A, Qasim M, Umar A, Sufyan M, Akram HN, Mir A, Razzaq R, Rasool D, Tahir RA, Sehgal SA. Immunoinformatics and Molecular Docking Studies Predicted Potential Multiepitope-Based Peptide Vaccine and Novel Compounds against Novel SARS-CoV-2 through Virtual Screening. BioMed Research International 2021; 2021:1596834. [PMID: 33728324 PMCID: PMC7910514 DOI: 10.1155/2021/1596834] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/14/2020] [Revised: 08/13/2020] [Accepted: 02/08/2021] [Indexed: 02/06/2023]
Abstract
BACKGROUND: Coronaviruses (CoVs) are enveloped positive-strand RNA viruses that have club-like spikes on their surface and a unique replication process. Coronaviruses are categorized as major pathogenic viruses causing a variety of diseases in birds and mammals, including lethal respiratory dysfunction in humans. A new coronavirus strain has recently been identified and named SARS-CoV-2. Cases of SARS-CoV-2 infection are being reported all over the world. SARS-CoV-2 has shown a high death rate; however, no specific treatment is available against it. METHODS: In the current study, immunoinformatics approaches were employed to predict antigenic epitopes of SARS-CoV-2 for the development of a coronavirus vaccine. Cytotoxic T-lymphocyte (CTL) and B-cell epitopes were predicted for SARS-CoV-2 coronavirus protein. Multiple sequence alignment of three genomes (SARS-CoV, MERS-CoV, and SARS-CoV-2) was used for conserved binding domain analysis. RESULTS: The docking complexes of 4 CTL epitopes with antigenic sites were analyzed, followed by binding affinity and binding interaction analyses of the top-ranked predicted peptides with the MHC-I HLA molecule. Molecular docking (Food and Drug Regulatory Authority library) was performed, and the four compounds exhibiting the lowest binding energy were identified. The designed epitopes were docked against MHC-I, and the interactions of the selected docked complexes were investigated. In conclusion, four CTL epitopes (GTDLEGNFY, TVNVLAWLY, GSVGFNIDY, and QTFSVLACY) and four FDA-scrutinized compounds showed potential as peptide vaccines and as biomolecules against deadly SARS-CoV-2, respectively. A multiepitope vaccine was also designed from different epitopes of coronavirus proteins joined by linkers and led by an adjuvant. CONCLUSION: Our investigations predicted epitopes and reported molecules that may have the potential to inhibit the SARS-CoV-2 virus. These findings can be a step towards the development of a peptide-based vaccine or natural compound drug target against SARS-CoV-2.
Affiliation(s)
- Muhammad Waqas
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Ali Haider
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Abdur Rehman
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Muhammad Qasim
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Ahitsham Umar
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Muhammad Sufyan
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Hafiza Nisha Akram
  - Department of Environmental Sciences, Quaid-e-Azam University, Islamabad, Pakistan
- Asif Mir
  - Department of Biological Sciences, International Islamic University, Islamabad, Pakistan
- Roha Razzaq
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Danish Rasool
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
- Rana Adnan Tahir
  - Department of Biosciences, COMSATS University, Sahiwal Campus, Islamabad, Pakistan
- Sheikh Arslan Sehgal
  - Department of Bioinformatics and Biotechnology, Government College University, Faisalabad, Pakistan
  - Department of Bioinformatics, University of Okara, Okara, Pakistan
7
Bugge K, Staby L, Salladini E, Falbe-Hansen RG, Kragelund BB, Skriver K. αα-Hub domains and intrinsically disordered proteins: A decisive combo. J Biol Chem 2021; 296:100226. [PMID: 33361159 PMCID: PMC7948954 DOI: 10.1074/jbc.rev120.012928] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/07/2020] [Revised: 12/22/2020] [Accepted: 12/22/2020] [Indexed: 01/02/2023] Open
Abstract
Hub proteins are central nodes in protein-protein interaction networks with critical importance to all living organisms. Recently, a new group of folded hub domains, the αα-hubs, was defined based on a shared αα-hairpin supersecondary structural foundation. The members PAH, RST, TAFH, NCBD, and HHD are found in large proteins such as Sin3, RCD1, TAF4, CBP, and harmonin, which organize disordered transcriptional regulators and membrane scaffolds in interactomes of importance to human diseases and plant quality. In this review, studies of structures, functions, and complexes across the αα-hubs are described and compared to provide a unified description of the group. This analysis expands the associated molecular concepts of "one domain-one binding site", motif-based ligand binding, and coupled folding and binding of intrinsically disordered ligands to additional concepts of importance to signal fidelity. These include context, motif reversibility, multivalency, complex heterogeneity, synergistic αα-hub:ligand folding, accessory binding sites, and supramodules. We propose that these multifaceted protein-protein interaction properties are made possible by the characteristics of the αα-hub fold, including supersite properties, dynamics, variable topologies, accessory helices, and malleability, and are abetted by the adaptability of the disordered ligands. Critically, these features provide additional filters for specificity. By presenting new concepts, this review opens up new research questions addressing properties across the group, which are derived from concepts discovered in studies of the individual members. Combined, the members of the αα-hubs are ideal models for deconvoluting signal fidelity maintained by folded hubs and their interactions with intrinsically disordered ligands.
Affiliation(s)
- Katrine Bugge
  - REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Lasse Staby
  - REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Edoardo Salladini
  - REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Rasmus G Falbe-Hansen
  - REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Birthe B Kragelund
  - REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Karen Skriver
  - REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
8
Trivedi R, Nagarajaram HA. Substitution scoring matrices for proteins - An overview. Protein Sci 2020; 29:2150-2163. [PMID: 32954566 DOI: 10.1002/pro.3954] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/17/2020] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 01/17/2023]
Abstract
Sequence analysis is the primary and simplest approach to discovering structural, functional and evolutionary details of related proteins. All alignment-based approaches to sequence analysis make use of amino acid substitution matrices, and the accuracy of the results largely depends on the type of scoring matrix used to perform the alignment. An amino acid substitution matrix is a 20 × 20 matrix in which the individual elements encapsulate the rates at which each of the 20 amino acid residues in proteins is substituted by other amino acid residues over time. In contrast to most globular/ordered proteins, whose amino acid composition is considered standard, there are several classes of proteins (e.g., transmembrane proteins) in which certain types of amino acid (e.g., hydrophobic residues) are enriched. These compositional differences among classes of proteins are manifested in their underlying residue substitution frequencies. Therefore, each compositionally distinct class of proteins or protein segments should be studied using specific scoring matrices that reflect its distinct residue substitution pattern. In this review, we describe the development and application of various substitution scoring matrices specific to proteins with standard and biased compositions. Along with the most commonly used standard matrices (PAM, BLOSUM, MD and VTML), which act as default parameters in various homolog search and alignment tools, different substitution scoring matrices specific to compositionally distinct classes of proteins are discussed in detail.
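As an illustration of how substitution frequencies can become alignment scores, here is a minimal Python sketch of the log-odds construction commonly used for BLOSUM-style matrices, s(a, b) = log2(q_ab / (p_a p_b)). It is not taken from the review. The two-letter alphabet, the frequencies, and the function names `log_odds_matrix` and `score_ungapped` are all hypothetical, chosen only to keep the example small; a real matrix covers all 20 residues.

```python
import math


def log_odds_matrix(pair_freq, background):
    """Return log2-odds scores s(a, b) = log2(q_ab / (p_a * p_b)).

    pair_freq maps ordered residue pairs to observed pair frequencies q_ab;
    background maps each residue to its background frequency p_a.
    """
    scores = {}
    for (a, b), q_ab in pair_freq.items():
        expected = background[a] * background[b]  # frequency expected by chance
        scores[(a, b)] = math.log2(q_ab / expected)
    return scores


def score_ungapped(seq1, seq2, scores):
    """Sum the substitution scores over an ungapped pairwise alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped alignment requires equal-length sequences")
    return sum(scores[(x, y)] for x, y in zip(seq1, seq2))


# Hypothetical toy data: two residues instead of 20, invented frequencies.
background = {"L": 0.5, "K": 0.5}
pair_freq = {
    ("L", "L"): 0.4,
    ("K", "K"): 0.4,
    ("L", "K"): 0.1,
    ("K", "L"): 0.1,
}
toy_scores = log_odds_matrix(pair_freq, background)

# Identical pairs occur more often than chance (positive score);
# mismatched pairs occur less often than chance (negative score).
alignment_score = score_ungapped("LLK", "LKK", toy_scores)
```

In this sketch, a composition-specific matrix of the kind the review discusses would correspond to estimating `pair_freq` and `background` from alignments of the protein class of interest (for example, transmembrane segments) rather than from a general-purpose dataset.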
Affiliation(s)
- Rakesh Trivedi
  - Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics, Uppal, Hyderabad, Telangana, India
  - Graduate School, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Hampapathalu Adimurthy Nagarajaram
  - Laboratory of Computational Biology, Department of Systems and Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, India
  - Centre for Modelling, Simulation and Design, University of Hyderabad, Hyderabad, Telangana, India
9
Alvarez-Carreño C, Coello G, Arciniega M. FiRES: A computational method for the de novo identification of internal structure similarity in proteins. Proteins 2020; 88:1169-1179. [PMID: 32112578 DOI: 10.1002/prot.25886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2019] [Revised: 11/12/2019] [Accepted: 02/24/2020] [Indexed: 11/08/2022]
Abstract
Internal structure similarity in proteins can be observed at the domain and subdomain levels. From an evolutionary perspective, structurally similar elements may arise divergently by gene duplication and fusion events but may also be the product of convergent evolution under physicochemical constraints. The characterization of proteins that contain repeated structural elements has implications for many fields of protein science including protein domain evolution, structure classification, structure prediction, and protein engineering. FiRES (Find Repeated Elements in Structure) is an algorithm that relies on a topology-independent structure alignment method to identify repeating elements in protein structure. FiRES was tested against two hand curated databases of protein repeats: MALIDUP, for very divergent duplicated domains; and RepeatsDB for short tandem repeats. The performance of FiRES was compared to that of lalign, RADAR, HHrepID, CE-symm, ReUPred, and Swelfe. FiRES was the method that most accurately detected proteins either with duplicated domains (accuracy = 0.86) or with multiple repeated units (accuracy = 0.92). FiRES is a new methodology for the discovery of proteins containing structurally similar elements. The FiRES web server is publicly available at http://fires.ifc.unam.mx. The scripts, results, and benchmarks from this study can be downloaded from https://github.com/Claualvarez/fires.
Affiliation(s)
- Claudia Alvarez-Carreño
  - Department of Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA
- Gerardo Coello
  - Unidad de Cómputo, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Marcelino Arciniega
  - Department of Bioquímica y Biología Estructural, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Mexico City, Mexico
10
Li SH, Guan ZX, Zhang D, Zhang ZM, Huang J, Yang W, Lin H. Recent Advancement in Predicting Subcellular Localization of Mycobacterial Protein with Machine Learning Methods. Med Chem 2019; 16:605-619. [PMID: 31584379 DOI: 10.2174/1573406415666191004101913] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/01/2019] [Revised: 06/25/2019] [Accepted: 08/23/2019] [Indexed: 01/28/2023]
Abstract
Mycobacterium tuberculosis (MTB) causes tuberculosis (TB), which is reported to be one of the most dreadful epidemics. Although many drugs have been developed to cope with this disease, drug resistance, especially multidrug resistance (MDR) and extensive drug resistance (XDR), poses a huge threat to treatment. Moreover, traditional biochemical experimental methods for tackling TB are time-consuming and costly. Thanks to the availability of enormous amounts of genomic and proteomic sequence data, TB can now be tackled with sequence-based computational approaches, that is, bioinformatics. Studies on predicting the subcellular localization of mycobacterial proteins (MBPs) with high precision and efficiency may help determine the biological function of these proteins and thereby provide useful insights for protein function annotation as well as drug design. In this review, we report the progress that has been made in the computational prediction of the subcellular localization of MBPs, covering the following aspects: 1) construction of benchmark datasets; 2) methods of feature extraction; 3) techniques of feature selection; 4) application of several published prediction algorithms; 5) the published results; 6) further study on the prediction of subcellular localization of MBPs.
Affiliation(s)
- Shi-Hao Li
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zheng-Xing Guan
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Dan Zhang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zi-Mei Zhang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Jian Huang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Wuritu Yang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - Development and Planning Department, Inner Mongolia University, Hohhot, P.R. China
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
11
Uribe Acosta M, Villa Restrepo AF. In silico analysis of phag-like protein in Ralstonia Euthropa H16, potentially involved in polyhydroxyalkanoates synthesis. Revista Politécnica 2019. [DOI: 10.33571/rpolitec.v15n29a5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
Abstract
Polyhydroxyalkanoates (PHA) are synthesised by bacteria as carbon storage material. The protein PhaG links the metabolism of non-related carbon sources such as glycerol, processed through the fatty acid de novo synthesis (FAS) pathway, with PHA synthesis. The gene that encodes this protein has not yet been found in the genome of Ralstonia eutropha H16, a model organism. By bioinformatic comparison with already known PhaG proteins, a PhaG-like protein encoded by gene H16_A0147 was found, and the presence of the gene was preliminarily confirmed by PCR. This is the first study to show the presence and characteristics of a PhaG-like protein in R. eutropha H16, and it represents the first step towards identifying a connection between the FAS and PHA pathways in this model bacterium. Further gene deletion and enzymatic activity studies are necessary to confirm this potential relationship, which could improve industrial PHA production and the utilisation of agro-industrial residues such as glycerol.
|
12
|
Molecular cloning, expression, and functional characterization of 70-kDa heat shock protein, DnaK, from Bacillus halodurans. Int J Biol Macromol 2019; 137:151-159. [PMID: 31260773 DOI: 10.1016/j.ijbiomac.2019.06.217] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/13/2019] [Revised: 06/21/2019] [Accepted: 06/27/2019] [Indexed: 11/20/2022]
Abstract
In the present study, we report the cloning, sequencing, and functional characterization of the dnaK gene of B. halodurans, whose product is the central component in the cellular network of molecular chaperones. The 3D structures of DnaK obtained by the I-TASSER server showed that the overall structures of DnaK from B. halodurans and the human HSP70 chaperone BiP are very similar, with a homology of 88.8%. The purified recombinant DnaK carries a His-tag at the C-terminus and shows a band at approximately 70 kDa in SDS-PAGE. The refolding assay revealed that the refolding rate of heat-denatured carbonic anhydrase was considerably improved by the addition of the novel DnaK chaperone. Also, salt resistance experiments indicated that survival of E. coli + DnaK was enhanced 4.4-fold compared with control cells in 0.4 M NaCl. The number of E. coli + DnaK colonies was 2.5-fold higher than that of control colonies at pH 9.5. We showed that DnaK refolding function decreased with increasing nanomolar concentrations of Cd2+. Hg2+ had a biphasic effect on recombinant DnaK refolding function: inhibition at low and stimulation at high concentrations. It was concluded that the DnaK from B. halodurans can potentially be employed for improving functional properties of proteins in various applications.
|
13
|
Abstract
The pentameric γ-aminobutyric acid type A receptors are ion channels activated by ligands, which intervene in the rapid inhibitory transmission in the mammalian CNS. Due to their rich pharmacology and therapeutic potential, it is essential to understand their structure and function thoroughly. This deep characterization was hampered by the lack of experimental structural information for many years. Thus, computational techniques have been extensively combined with experimental data, in order to undertake the study of γ-aminobutyric acid type A receptors and their interaction with drugs. Here, we review the exciting journey made to assess the structures of these receptors and outline major outcomes. Finally, we discuss the brand new structure of the α1β2γ2 subtype and the amazing advances it brings to the field.
|
14
|
Zhang Z, Wang J, Gong Y, Li Y. Contributions of substitutions and indels to the structural variations in ancient protein superfamilies. BMC Genomics 2018; 19:771. [PMID: 30355304 PMCID: PMC6201574 DOI: 10.1186/s12864-018-5178-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/02/2018] [Accepted: 10/16/2018] [Indexed: 11/10/2022] Open
Abstract
Background: Quantitative evaluation of protein structural evolution is important for our understanding of protein biological functions and their evolutionary adaptation, and is useful in guiding protein engineering. However, compared to the models for sequence evolution, the quantitative models for protein structural evolution received less attention. Ancient protein superfamilies are often considered versatile, allowing genetic and functional diversifications during long-term evolution. In this study, we investigated the quantitative impacts of sequence variations on the structural evolution of homologues in 68 ancient protein superfamilies that exist widely in sequenced eukaryotic, bacterial and archaeal genomes. Results: We found that the accumulated structural variations within ancient superfamilies could be explained largely by a bilinear model that simultaneously considers amino acid substitution and insertion/deletion (indel). Both substitutions and indels are essential for explaining the structural variations within ancient superfamilies. For those ancient superfamilies with high bilinear multiple correlation coefficients, the influence of each unit of substitution or indel on structural variations is almost constant within each superfamily, but varies greatly among different superfamilies. The influence of each unit indel on structural variations is always larger than that of each unit substitution within each superfamily, but the accumulated contributions of indels to structural variations are lower than those of substitutions in most superfamilies. The total contributions of sequence indels and substitutions (46% and 54%, respectively) to the structural variations that result from sequence variations are slightly different in ancient superfamilies. Conclusions: Structural variations within ancient protein superfamilies accumulated under the significantly bilinear influence of amino acid substitutions and indels in sequences. Both substitutions and indels are essential for explaining the structural variations within ancient superfamilies. For those structural variations resulting from sequence variations, the total contribution of indels is slightly lower than that of amino acid substitutions. The regular clock exists not only in protein sequences, but also probably in protein structures.
Affiliation(s)
- Zheng Zhang
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao, 266237, China
| | - Jinlan Wang
- Physical Examination Office of Shandong Province, Health and Family Planning Commission of Shandong Province, Jinan, 250014, China
| | - Ya Gong
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao, 266237, China
| | - Yuezhong Li
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao, 266237, China.
| |
|
15
|
Shukla S, Bafna K, Gullett C, Myles DAA, Agarwal PK, Cuneo MJ. Differential Substrate Recognition by Maltose Binding Proteins Influenced by Structure and Dynamics. Biochemistry 2018; 57:5864-5876. [PMID: 30204415 PMCID: PMC6189639 DOI: 10.1021/acs.biochem.8b00783] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 01/25/2023]
Abstract
The genome of the hyperthermophile Thermotoga maritima contains three isoforms of maltose binding protein (MBP) that are high-affinity receptors for di-, tri-, and tetrasaccharides. Two of these proteins (tmMBP1 and tmMBP2) share significant sequence identity, approximately 90%, while the third (tmMBP3) shares less than 40% identity. MBP from Escherichia coli (ecMBP) shares 35% sequence identity with the tmMBPs. This subset of MBP isoforms offers an interesting opportunity to investigate the mechanisms underlying the evolution of substrate specificity and affinity profiles in a genome where redundant MBP genes are present. In this study, the X-ray crystal structures of tmMBP1, tmMBP2, and tmMBP3 are reported in the absence and presence of oligosaccharides. tmMBP1 and tmMBP2 have binding pockets that are larger than that of tmMBP3, enabling them to bind to larger substrates, while tmMBP1 and tmMBP2 also undergo substrate-induced hinge bending motions (∼52°) that are larger than that of tmMBP3 (∼35°). Small-angle X-ray scattering was used to compare protein behavior in solution, and computer simulations provided insights into dynamics of these proteins. Comparing quantitative protein-substrate interactions and dynamical properties of tmMBPs with those of the promiscuous ecMBP and disaccharide selective Thermococcus litoralis MBP provides insights into the features that enable selective binding. Collectively, the results provide insights into how the structure and dynamics of tmMBP homologues enable them to differentiate between a myriad of chemical entities while maintaining their common fold.
Affiliation(s)
- Shantanu Shukla
- Graduate School of Genome Science and Technology, The University of Tennessee, Knoxville, Tennessee
- Neutron Sciences Directorate, Oak Ridge National Laboratory, Oak Ridge, Tennessee
| | - Khushboo Bafna
- Graduate School of Genome Science and Technology, The University of Tennessee, Knoxville, Tennessee
| | - Caeley Gullett
- Neutron Sciences Directorate, Oak Ridge National Laboratory, Oak Ridge, Tennessee
| | - Dean A. A. Myles
- Graduate School of Genome Science and Technology, The University of Tennessee, Knoxville, Tennessee
- Neutron Sciences Directorate, Oak Ridge National Laboratory, Oak Ridge, Tennessee
| | - Pratul K. Agarwal
- Department of Biochemistry & Cellular and Molecular Biology, The University of Tennessee, Knoxville, Tennessee
| | - Matthew J. Cuneo
- Neutron Sciences Directorate, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, Tennessee
| |
|
16
|
How is structural divergence related to evolutionary information? Mol Phylogenet Evol 2018; 127:859-866. [DOI: 10.1016/j.ympev.2018.06.033] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 08/18/2017] [Revised: 06/01/2018] [Accepted: 06/19/2018] [Indexed: 12/15/2022]
|
17
|
Monzon AM, Zea DJ, Marino-Buslje C, Parisi G. Homology modeling in a dynamical world. Protein Sci 2017; 26:2195-2206. [PMID: 28815769 DOI: 10.1002/pro.3274] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/06/2017] [Revised: 08/09/2017] [Accepted: 08/09/2017] [Indexed: 12/31/2022]
Abstract
A key concept in template-based modeling (TBM) is the high correlation between sequence and structural divergence, with the practical consequence that homologous proteins that are similar at the sequence level will also be similar at the structural level. However, conformational diversity of the native state will reduce the correlation between structural and sequence divergence, because structural variation can appear without sequence diversity. In this work, we explore the impact that conformational diversity has on the relationship between structural and sequence divergence. We find that the extent of conformational diversity can be as high as the maximum structural divergence among families. Also, as expected, conformational diversity impairs the well-established correlation between sequence and structural divergence, which is noisier than previously suggested. However, we found that this noise can be resolved using a priori information coming from the structure-function relationship. We show that protein families with low conformational diversity show a well-correlated relationship between sequence and structural divergence, which is severely reduced in proteins with larger conformational diversity. This lack of correlation could impair TBM results in highly dynamical proteins. Finally, we also find that the presence of order/disorder can provide useful prior information for better TBM performance.
Affiliation(s)
- Alexander Miguel Monzon
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, B1876BXD, Bernal, Argentina
| | - Diego Javier Zea
- Structural Bioinformatics Unit, Fundación Instituto Leloir, CONICET, C1405BWE Ciudad Autónoma de Buenos Aires, Argentina
| | - Cristina Marino-Buslje
- Structural Bioinformatics Unit, Fundación Instituto Leloir, CONICET, C1405BWE Ciudad Autónoma de Buenos Aires, Argentina
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, B1876BXD, Bernal, Argentina
| |
|
18
|
Gupta SK, Chaudhary KK, Mishra N. Bioinformatics and Its Therapeutic Applications. PHARMACEUTICAL SCIENCES 2017. [DOI: 10.4018/978-1-5225-1762-7.ch016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Bioinformatics has emerged as a major element in contemporary biomedical and pharmaceutical region. Bioinformatics deals with growth in biological data and has led to development of many databases. Bioinformatics deals with collection of data that is relevant clinically and these days separate term clinical information has come up. Data mimics are another field which is gaining importance. This chapter shall deal with introduction of bioinformatics and its applications in medicine and health care.
|
19
|
Leelananda SP, Kloczkowski A, Jernigan RL. Fold-specific sequence scoring improves protein sequence matching. BMC Bioinformatics 2016; 17:328. [PMID: 27578239 PMCID: PMC5006591 DOI: 10.1186/s12859-016-1198-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/21/2016] [Accepted: 08/24/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Sequence matching is extremely important for applications throughout biology, particularly for discovering information such as functional and evolutionary relationships, and also for discriminating between unimportant and disease mutants. At present the functions of a large fraction of genes are unknown; improvements in sequence matching will improve gene annotations. Universal amino acid substitution matrices such as Blosum62 are used to measure sequence similarities and to identify distant homologues, regardless of the structure class. However, such single matrices do not take into account important structural information evident within the different topologies of proteins and treat substitutions within all protein folds identically. Others have suggested that the use of structural information can lead to significant improvements in sequence matching but this has not yet been very effective. Here we develop novel substitution matrices that include not only general sequence information but also have a topology specific component that is unique for each CATH topology. This novel feature of using a combination of sequence and structure information for each protein topology significantly improves the sequence matching scores for the sequence pairs tested. We have used a novel multi-structure alignment method for each homology level of CATH in order to extract topological information. Results: We obtain statistically significant improved sequence matching scores for 73 % of the alpha helical test cases. On average, 61 % of the test cases showed improvements in homology detection when structure information was incorporated into the substitution matrices. On average z-scores for homology detection are improved by more than 54 % for all cases, and some individual cases have z-scores more than twice those obtained using generic matrices. Our topology specific similarity matrices also outperform other traditional similarity matrices and single matrix based structure methods. When the default amino acid substitution matrix in the Psi-blast algorithm is replaced by our structure-based matrices, the structure matching is significantly improved over conventional Psi-blast. It also outperforms results obtained for the corresponding HMM profiles generated for each topology. Conclusions: We show that by incorporating topology-specific structure information in addition to sequence information into specific amino acid substitution matrices, the sequence matching scores and homology detection are significantly improved. Our topology specific similarity matrices outperform other traditional similarity matrices and single matrix based structure methods, and also show improvement over conventional Psi-blast and HMM profile based methods in sequence matching. The results support the discriminatory ability of the new amino acid similarity matrices to distinguish between distant homologs and structurally dissimilar pairs.
Affiliation(s)
- Sumudu P Leelananda
- Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA; Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA; Present Address: 2120 Newman and Wolfrom Laboratory, The Ohio State University, 100 W 18th Ave, Columbus, OH, 43210, USA; Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
| | - Andrzej Kloczkowski
- Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA; Present Address: Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA
| | - Robert L Jernigan
- Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA; Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
| |
|
20
|
Bidaux G, Sgobba M, Lemonnier L, Borowiec AS, Noyer L, Jovanovic S, Zholos AV, Haider S. Functional and Modeling Studies of the Transmembrane Region of the TRPM8 Channel. Biophys J 2016; 109:1840-51. [PMID: 26536261 DOI: 10.1016/j.bpj.2015.09.027] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/27/2015] [Revised: 09/18/2015] [Accepted: 09/28/2015] [Indexed: 12/15/2022] Open
Abstract
Members of the transient receptor potential (TRP) ion channel family act as polymodal cellular sensors, which aid in regulating Ca2+ homeostasis. Within the TRP family, TRPM8 is the cold receptor that forms a nonselective homotetrameric cation channel. In the absence of a TRPM8 crystal structure, little is known about the relationship between structure and function. Inferences of TRPM8 structure have come from mutagenesis experiments coupled to electrophysiology, mainly regarding the fourth transmembrane helix (S4), which constitutes a moderate voltage-sensing domain, and about cold sensor and phosphatidylinositol 4,5-bisphosphate binding sites, which are both located in the C-terminus of TRPM8. In this study, we use a combination of molecular modeling and experimental techniques to examine the structure of the TRPM8 transmembrane and pore helix region including the conducting conformation of the selectivity filter. The model is consistent with a large amount of functional data and was further tested by mutagenesis. We present structural insight into the role of residues involved in intra- and intersubunit interactions and their link with the channel activity, sensitivity to icilin, menthol and cold, and impact on channel oligomerization.
Affiliation(s)
- Gabriel Bidaux
- Inserm, U1003, Laboratoire de Physiologie Cellulaire, Equipe labellisée par la Ligue contre le Cancer, Villeneuve d'Ascq, France; Laboratory of Excellence, Ion Channels Science and Therapeutics, Université de Lille 1, Villeneuve d'Ascq, France; Laboratoire Biophotonique Cellulaire Fonctionnelle. Institut de Recherche Interdisciplinaire, Villeneuve d'Ascq, France
| | - Miriam Sgobba
- Centre for Cancer Research and Cell Biology, Queen's University of Belfast, Belfast, United Kingdom
| | - Loic Lemonnier
- Inserm, U1003, Laboratoire de Physiologie Cellulaire, Equipe labellisée par la Ligue contre le Cancer, Villeneuve d'Ascq, France; Laboratory of Excellence, Ion Channels Science and Therapeutics, Université de Lille 1, Villeneuve d'Ascq, France
| | - Anne-Sophie Borowiec
- Inserm, U1003, Laboratoire de Physiologie Cellulaire, Equipe labellisée par la Ligue contre le Cancer, Villeneuve d'Ascq, France; Laboratory of Excellence, Ion Channels Science and Therapeutics, Université de Lille 1, Villeneuve d'Ascq, France
| | - Lucile Noyer
- Inserm, U1003, Laboratoire de Physiologie Cellulaire, Equipe labellisée par la Ligue contre le Cancer, Villeneuve d'Ascq, France; Laboratory of Excellence, Ion Channels Science and Therapeutics, Université de Lille 1, Villeneuve d'Ascq, France
| | | | - Alexander V Zholos
- Department of Biophysics, Educational and Scientific Centre, "Institute of Biology" Taras Shevchenko, Kiev National University, Kiev, Ukraine.
| | | |
|
21
|
Iacoangeli A, Marcatili P, Tramontano A. Exploiting Homology Information in Nontemplate Based Prediction of Protein Structures. J Chem Theory Comput 2015; 11:5045-51. [DOI: 10.1021/acs.jctc.5b00371] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Affiliation(s)
- Alfredo Iacoangeli
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| | - Paolo Marcatili
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| | - Anna Tramontano
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
- Istituto Pasteur Fondazione Cenci Bolognetti, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| |
|
22
|
Gorai B, Prabhavadhni A, Sivaraman T. Unfolding stabilities of two structurally similar proteins as probed by temperature-induced and force-induced molecular dynamics simulations. J Biomol Struct Dyn 2014; 33:2037-47. [DOI: 10.1080/07391102.2014.986668] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/24/2022]
|
23
|
Zhu S, Gao B. Nematode-derived drosomycin-type antifungal peptides provide evidence for plant-to-ecdysozoan horizontal transfer of a disease resistance gene. Nat Commun 2014; 5:3154. [DOI: 10.1038/ncomms4154] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 10/26/2013] [Accepted: 12/19/2013] [Indexed: 11/09/2022] Open
|
24
|
Kopec KO, Lupas AN. β-Propeller blades as ancestral peptides in protein evolution. PLoS One 2013; 8:e77074. [PMID: 24143202 PMCID: PMC3797127 DOI: 10.1371/journal.pone.0077074] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Received: 06/25/2013] [Accepted: 09/05/2013] [Indexed: 12/04/2022] Open
Abstract
Proteins of the β-propeller fold are ubiquitous in nature and widely used as structural scaffolds for ligand binding and enzymatic activity. This fold comprises between four and twelve four-stranded β-meanders, the so called blades that are arranged circularly around a central funnel-shaped pore. Despite the large size range of β-propellers, their blades frequently show sequence similarity indicative of a common ancestry and it has been proposed that the majority of β-propellers arose divergently by amplification and diversification of an ancestral blade. Given the structural versatility of β-propellers and the hypothesis that the first folded proteins evolved from a simpler set of peptides, we investigated whether this blade may have given rise to other folds as well. Using sequence comparisons, we identified proteins of four other folds as potential homologs of β-propellers: the luminal domain of inositol-requiring enzyme 1 (IRE1-LD), type II β-prisms, β-pinwheels, and WW domains. Because, with increasing evolutionary distance and decreasing sequence length, the statistical significance of sequence comparisons becomes progressively harder to distinguish from the background of convergent similarities, we complemented our analyses with a new method that evaluates possible homology based on the correlation between sequence and structure similarity. Our results indicate a homologous relationship of IRE1-LD and type II β-prisms with β-propellers, and an analogous one for β-pinwheels and WW domains. Whereas IRE1-LD most likely originated by fold-changing mutations from a fully formed PQQ motif β-propeller, type II β-prisms originated by amplification and differentiation of a single blade, possibly also of the PQQ type. We conclude that both β-propellers and type II β-prisms arose by independent amplification of a blade-sized fragment, which represents a remnant of an ancient peptide world.
Affiliation(s)
- Klaus O. Kopec
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen, Baden-Württemberg, Germany
| | - Andrei N. Lupas
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen, Baden-Württemberg, Germany
| |
|
25
|
Kalinina OV, Oberwinkler H, Glass B, Kräusslich HG, Russell RB, Briggs JAG. Computational identification of novel amino-acid interactions in HIV Gag via correlated evolution. PLoS One 2012; 7:e42468. [PMID: 22879995 PMCID: PMC3411748 DOI: 10.1371/journal.pone.0042468] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/04/2012] [Accepted: 07/09/2012] [Indexed: 12/31/2022] Open
Abstract
Pairs of amino acid positions that evolve in a correlated manner are proposed to play important roles in protein structure or function. Methods to detect them might fare better with families for which sequences of thousands of closely related homologs are available than families with only a few distant relatives. We applied co-evolution analysis to thousands of sequences of HIV Gag, finding that the most significantly co-evolving positions are proximal in the quaternary structures of the viral capsid. A reduction in infectivity caused by mutating one member of a significant pair could be rescued by a compensatory mutation of the other.
Affiliation(s)
- Olga V. Kalinina
- CellNetworks, Bioquant, University of Heidelberg, Heidelberg, Germany
| | - Heike Oberwinkler
- Department of Infectious Diseases, Virology, Universitätsklinikum Heidelberg, Heidelberg, Germany
| | - Bärbel Glass
- Department of Infectious Diseases, Virology, Universitätsklinikum Heidelberg, Heidelberg, Germany
| | - Hans-Georg Kräusslich
- CellNetworks, Bioquant, University of Heidelberg, Heidelberg, Germany
- Department of Infectious Diseases, Virology, Universitätsklinikum Heidelberg, Heidelberg, Germany
| | - Robert B. Russell
- CellNetworks, Bioquant, University of Heidelberg, Heidelberg, Germany
| | - John A. G. Briggs
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| |
|
26
|
Hassan S, Debnath A, Mahalingam V, Hanna LE. Computational structural analysis of proteins of Mycobacterium tuberculosis and a resource for identifying off-targets. J Mol Model 2012; 18:3993-4004. [DOI: 10.1007/s00894-012-1412-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/14/2011] [Accepted: 03/20/2012] [Indexed: 10/28/2022]
|
27
|
Abstract
Accurate all-atom energy functions are crucial for successful high-resolution protein structure prediction. In this chapter, we review both physics-based force fields and knowledge-based potentials used in protein modeling. Because it is important to calculate the energy as accurately as possible given the limitations imposed by sampling convergence, different components of the energy, and force fields representing them to varying degrees of detail and complexity are discussed. Force fields using Cartesian as well as torsion angle representations of protein geometry are covered. Since solvent is important for protein energetics, different aqueous and membrane solvation models for protein simulations are also described. Finally, we summarize recent progress in protein structure refinement using new force fields.
|
28
|
Szilágyi A, Zhang Y, Závodszky P. Intra-chain 3D segment swapping spawns the evolution of new multidomain protein architectures. J Mol Biol 2011; 415:221-35. [PMID: 22079367 DOI: 10.1016/j.jmb.2011.10.045] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 08/05/2011] [Revised: 10/07/2011] [Accepted: 10/27/2011] [Indexed: 10/15/2022]
Abstract
Multidomain proteins form in evolution through the concatenation of domains, but structural domains may comprise multiple segments of the chain. In this work, we demonstrate that new multidomain architectures can evolve by an apparent three-dimensional swap of segments between structurally similar domains within a single-chain monomer. By a comprehensive structural search of the current Protein Data Bank (PDB), we identified 32 well-defined segment-swapped proteins (SSPs) belonging to 18 structural families. Nearly 13% of all multidomain proteins in the PDB may have a segment-swapped evolutionary precursor as estimated by more permissive searching criteria. The formation of SSPs can be explained by two principal evolutionary mechanisms: (i) domain swapping and fusion (DSF) and (ii) circular permutation (CP). By large-scale comparative analyses using structural alignment and hidden Markov model methods, it was found that the majority of SSPs have evolved via the DSF mechanism, and a much smaller fraction, via CP. Functional analyses further revealed that segment swapping, which results in two linkers connecting the domains, may impart directed flexibility to multidomain proteins and contributes to the development of new functions. Thus, inter-domain segment swapping represents a novel general mechanism by which new protein folds and multidomain architectures arise in evolution, and SSPs have structural and functional properties that make them worth defining as a separate group.
Affiliation(s)
- András Szilágyi
- Institute of Enzymology, Hungarian Academy of Sciences, Karolina út 29, H-1113 Budapest, Hungary
| | | | | |
|
29
|
Chakraborty J, Dutta TK. From lipid transport to oxygenation of aromatic compounds: evolution within the Bet v1-like superfamily. J Biomol Struct Dyn 2011; 29:67-78. [PMID: 21696226 DOI: 10.1080/07391102.2011.10507375] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/09/2023]
Abstract
In the absence of significant sequence similarity, remote homology between proteins can be confused with analogy, and in such cases shared ancestry can be inferred from certain unique and common features. In the present study, to understand the evolutionary origin of the catalytic domain of the large subunit of ring-hydroxylating oxygenases (RHOs), belonging to the Bet v1-like superfamily, structure-based phylogenies have been derived from structural alignment of representative proteins of the superfamily. A careful inspection of the structural relatedness of RHOs with the rest of the families showed the closest similarity between the RHO catalytic domain and PA1206-like protein. In addition, the phylogenetic relationship of the Rieske domain of the large subunit of RHOs with functionally and structurally similar proteins has also been elucidated so as to postulate the most probable events leading to the genesis of the large subunit of RHOs.
Affiliation(s)
- Joydeep Chakraborty
- Department of Microbiology, Bose Institute, P-1/12 C.I.T. Scheme VII M, Kolkata 700054, India
30
Using increment of diversity to predict mitochondrial proteins of malaria parasite: integrating pseudo-amino acid composition and structural alphabet. Amino Acids 2010; 42:1309-16. [DOI: 10.1007/s00726-010-0825-7] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 10/12/2010] [Accepted: 12/17/2010] [Indexed: 11/29/2022]
31
Structural bioinformatics: deriving biological insights from protein structures. Interdiscip Sci 2010; 2:347-66. [PMID: 21153779 DOI: 10.1007/s12539-010-0045-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/23/2010] [Revised: 06/18/2010] [Accepted: 06/21/2010] [Indexed: 12/27/2022]
Abstract
Structural bioinformatics can be described as an approach that will help decipher biological insights from protein structures. As an important component of structural biology, this area promises to provide a high resolution understanding of biology by assisting comprehension and interpretation of a large amount of structural data. Biological function of protein molecules can be inferred from their three-dimensional structures by comparing structures, classifying them and transferring function from a related protein or family. It is well known now that the structure space of protein molecules is more conserved than the sequence space, making it important to seek functional associations at the structural level. An added advantage of structural bioinformatics over simpler sequence-based methods is that the former also provides ultimate insights into the mechanisms by which various biological events take place. A bird's eye-view of the different aspects of structural bioinformatics is given here along with various recent advances in the area including how knowledge obtained from structural bioinformatics can be applied in drug discovery.
32
Zhang Z, Wang Y, Wang L, Gao P. The combined effects of amino acid substitutions and indels on the evolution of structure within protein families. PLoS One 2010; 5:e14316. [PMID: 21179197 PMCID: PMC3001449 DOI: 10.1371/journal.pone.0014316] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 06/22/2010] [Accepted: 11/16/2010] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND In the process of protein evolution, sequence variations within protein families can cause changes in protein structures and functions. However, structures tend to be more conserved than sequences and functions. This leads to an intriguing question: what is the evolutionary mechanism by which sequence variations produce structural changes? To investigate this question, we focused on the most common types of sequence variations: amino acid substitutions and insertions/deletions (indels). Here their combined effects on protein structure evolution within protein families are studied. RESULTS Sequence-structure correlation analysis on 75 homologous structure families (from SCOP) that contain 20 or more non-redundant structures shows that in most of these families there is, statistically, a bilinear correlation between the amount of substitutions and indels versus the degree of structure variations. Bilinear regression of percent sequence non-identity (PNI) and standardized number of gaps (SNG) versus RMSD was performed. The coefficients from the regression analysis could be used to estimate the structure changes caused by each unit of substitution (structural substitution sensitivity, SSS) and by each unit of indel (structural indel sensitivity, SIDS). An analysis on 52 families with high bilinear fitting multiple correlation coefficients and statistically significant regression coefficients showed that SSS is mainly constrained by disulfide bonds, which almost have no effects on SIDS. CONCLUSIONS Structural changes in homologous protein families could be rationally explained by a bilinear model combining amino acid substitutions and indels. These results may further improve our understanding of the evolutionary mechanisms of protein structures.
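The bilinear regression described in this abstract can be illustrated with a minimal sketch. It assumes a model of the form RMSD ≈ SSS·PNI + SIDS·SNG + intercept fitted by ordinary least squares; the intercept term, the function name `fit_bilinear`, and the solver are illustrative assumptions, not the authors' published procedure.

```python
def fit_bilinear(pni, sng, rmsd):
    """Least-squares fit of rmsd ~ sss * pni + sids * sng + intercept.

    Hypothetical helper: the model form (with an intercept) is an assumption.
    Returns [sss, sids, intercept].
    """
    # Design matrix rows: (PNI, SNG, 1) for each structure pair.
    rows = [(p, g, 1.0) for p, g in zip(pni, sng)]
    # Normal equations: (X^T X) beta = X^T y.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, rmsd)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 3):
            factor = a[row][col] / a[col][col]
            for k in range(col, 3):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(a[i][k] * beta[k] for k in range(i + 1, 3))
        beta[i] = (b[i] - tail) / a[i][i]
    return beta
```

Under this reading, the first coefficient plays the role of the structural substitution sensitivity (SSS) and the second that of the structural indel sensitivity (SIDS) for one family; the fit needs at least three pairs whose PNI and SNG values are not collinear.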
Affiliation(s)
- Zheng Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- Yuxiao Wang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- Division of Basic Science, UT Southwestern, Dallas, Texas, United States of America
- Lushan Wang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- * E-mail: (LW); (PG)
- Peiji Gao
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- * E-mail: (LW); (PG)
33
Bryant DH, Moll M, Chen BY, Fofanov VY, Kavraki LE. Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction. BMC Bioinformatics 2010; 11:242. [PMID: 20459833 PMCID: PMC2885373 DOI: 10.1186/1471-2105-11-242] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/13/2009] [Accepted: 05/11/2010] [Indexed: 12/02/2022] Open
Abstract
Background Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels. Results This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs. Conclusions FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. 
The application of MESH allows for an automated, statistically rigorous procedure for incorporating structural variation data into protein function prediction pipelines. Our work provides an unbiased, automated assessment of the structural variability of identified binding site substructures among protein structure families and a technique for exploring the relation of substructural variation to protein function. As available proteomic data continues to expand, the techniques proposed will be indispensable for the large-scale analysis and interpretation of structural data.
Affiliation(s)
- Drew H Bryant
- Department of Computer Science, Rice University, Houston, TX, USA
34
Konc J, Janezic D. ProBiS algorithm for detection of structurally similar protein binding sites by local structural alignment. Bioinformatics 2010; 26:1160-8. [PMID: 20305268 PMCID: PMC2859123 DOI: 10.1093/bioinformatics/btq100] [Citation(s) in RCA: 184] [Impact Index Per Article: 13.1] [Indexed: 11/30/2022]
Abstract
Motivation: Exploitation of locally similar 3D patterns of physicochemical properties on the surface of a protein for detection of binding sites that may lack sequence and global structural conservation. Results: An algorithm, ProBiS, is described that detects structurally similar sites on protein surfaces by local surface structure alignment. It compares the query protein to members of a database of protein 3D structures and detects, with sub-residue precision, structurally similar sites as patterns of physicochemical properties on the protein surface. Using an efficient maximum clique algorithm, the program identifies proteins that share local structural similarities with the query protein and generates structure-based alignments of these proteins with the query. Structural similarity scores are calculated for the query protein's surface residues, and are expressed as different colors on the query protein surface. The algorithm has been used successfully for the detection of protein–protein, protein–small ligand and protein–DNA binding sites. Availability: The software is available, as a web tool, free of charge for academic users at http://probis.cmm.ki.si Contact: dusa@cmm.ki.si Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Janez Konc
- National Institute of Chemistry, Ljubljana, Slovenia
35
Kleinman CL, Rodrigue N, Lartillot N, Philippe H. Statistical potentials for improved structurally constrained evolutionary models. Mol Biol Evol 2010; 27:1546-60. [PMID: 20159780 DOI: 10.1093/molbev/msq047] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 12/27/2022] Open
Abstract
Assessing the influence of three-dimensional protein structure on sequence evolution is a difficult task, mainly because of the assumption of independence between sites required by probabilistic phylogenetic methods. Recently, models that include an explicit treatment of protein structure and site interdependencies have been developed: a statistical potential (an energy-like scoring system for sequence-structure compatibility) is used to evaluate the probability of fixation of a given mutation, assuming a coarse-grained protein structure that is constant through evolution. Yet, due to the novelty of these models and the small degree of overlap between the fields of structural and evolutionary biology, only simple representations of protein structure have been used so far. In this work, we present new forms of statistical potentials using a probabilistic framework recently developed for evolutionary studies. Terms related to pairwise distance interactions, torsion angles, solvent accessibility, and flexibility of the residues are included in the potentials, so as to study the effects of the main factors known to influence protein structure. The new potentials, with a more detailed representation of the protein structure, yield a better fit than the previously used scoring functions, with pairwise interactions contributing to more than half of this improvement. In a phylogenetic context, however, the structurally constrained models are still outperformed by some of the available site-independent models in terms of fit, possibly indicating that alternatives to coarse-grained statistical potentials should be explored in order to better model structural constraints.
Affiliation(s)
- Claudia L Kleinman
- Département de Biochimie, Centre Robert Cedergren, Université de Montréal, Montreal, Quebec, Canada.
36
Remmert M, Biegert A, Linke D, Lupas AN, Söding J. Evolution of outer membrane beta-barrels from an ancestral beta beta hairpin. Mol Biol Evol 2010; 27:1348-58. [PMID: 20106904 DOI: 10.1093/molbev/msq017] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Indexed: 11/14/2022] Open
Abstract
Outer membrane beta-barrels (OMBBs) are the major class of outer membrane proteins from Gram-negative bacteria, mitochondria, and plastids. Their transmembrane domains consist of 8-24 beta-strands forming a closed, barrel-shaped beta-sheet around a central pore. Despite their obvious structural regularity, evidence for an origin by duplication or for a common ancestry has not been found. We use three complementary approaches to show that all OMBBs from Gram-negative bacteria evolved from a single, ancestral beta beta hairpin. First, we link almost all families of known single-chain bacterial OMBBs with each other through transitive profile searches. Second, we identify a clear repeat signature in the sequences of many OMBBs in which the repeating sequence unit coincides with the structural beta beta hairpin repeat. Third, we show that the observed sequence similarity between OMBB hairpins cannot be explained by structural or membrane constraints on their sequences. The third approach addresses a longstanding problem in protein evolution: how to distinguish between a very remotely homologous relationship and the opposing scenario of "sequence convergence." The origin of a diverse group of proteins from a single hairpin module supports the hypothesis that, around the time of transition from the RNA to the protein world, proteins arose by amplification and recombination of short peptide modules that had previously evolved as cofactors of RNAs.
Affiliation(s)
- M Remmert
- Department of Biochemistry, Gene Center Munich and Center for Integrated Protein Science (CIPSM), Ludwig-Maximilians-Universität München, Munich, Germany
37
Prediction of subcellular location of mycobacterial protein using feature selection techniques. Mol Divers 2009; 14:667-71. [DOI: 10.1007/s11030-009-9205-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Received: 05/20/2009] [Accepted: 10/20/2009] [Indexed: 11/26/2022]
38
Pokarowski P, Kloczkowski A, Nowakowski S, Pokarowska M, Jernigan RL, Kolinski A. Ideal amino acid exchange forms for approximating substitution matrices. Proteins 2009; 69:379-93. [PMID: 17623859 DOI: 10.1002/prot.21509] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
We have analyzed 29 published substitution matrices (SMs) and five statistical protein contact potentials (CPs) for comparison. We find that popular, 'classical' SMs obtained mainly from sequence alignments of globular proteins are mostly correlated by at least a value of 0.9. The BLOSUM62 is the central element of this group. A second group includes SMs derived from alignments of remote homologs or transmembrane proteins. These matrices correlate better with classical SMs (0.8) than among themselves (0.7). A third group consists of intermediate links between SMs and CPs - matrices and potentials that exhibit mutual correlations of at least 0.8. Next, we show that SMs can be approximated with a correlation of 0.9 by expressions c(0) + x(i)x(j) + y(i)y(j) + z(i)z(j), 1 ≤ i, j ≤ 20, where c(0) is a constant and the vectors (x(i)), (y(i)), (z(i)) correlate highly with hydrophobicity, molecular volume and coil preferences of amino acids, respectively. The present paper is the continuation of our work (Pokarowski et al., Proteins 2005;59:49-57), where similar approximations were used to derive ideal amino acid interaction forms from CPs. Both approximations allow us to understand general trends in amino acid similarity and can help improve multiple sequence alignments using the fast Fourier transform (MAFFT), fast threading or other methods based on alignments of physicochemical profiles of protein sequences. The use of this approximation in sequence alignments instead of a classical SM yields results that differ by less than 5%. Intermediate links between SMs and CPs, new formulas for approximating these matrices, and the highly significant dependence of classical SMs on coil preferences are new findings.
Affiliation(s)
- Piotr Pokarowski
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, Warsaw University, 02-097 Warsaw, Poland.
39
Peterson ME, Chen F, Saven JG, Roos DS, Babbitt PC, Sali A. Evolutionary constraints on structural similarity in orthologs and paralogs. Protein Sci 2009; 18:1306-15. [PMID: 19472362 PMCID: PMC2774440 DOI: 10.1002/pro.143] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Received: 12/09/2008] [Revised: 03/29/2009] [Accepted: 03/30/2009] [Indexed: 11/10/2022]
Abstract
Although a quantitative relationship between sequence similarity and structural similarity has long been established, little is known about the impact of orthology on the relationship between protein sequence and structure. Among homologs, orthologs (derived by speciation) more frequently have similar functions than paralogs (derived by duplication). Here, we hypothesize that an orthologous pair will tend to exhibit greater structural similarity than a paralogous pair at the same level of sequence similarity. To test this hypothesis, we used 284,459 pairwise structure-based alignments of 12,634 unique domains from SCOP as well as orthology and paralogy assignments from OrthoMCL DB. We divided the comparisons by sequence identity and determined whether the sequence-structure relationship differed between the orthologs and paralogs. We found that at levels of sequence identity between 30 and 70%, orthologous domain pairs indeed tend to be significantly more structurally similar than paralogous pairs at the same level of sequence identity. An even larger difference is found when comparing ligand binding residues instead of whole domains. These differences between orthologs and paralogs are expected to be useful for selecting template structures in comparative modeling and target proteins in structural genomics.
Affiliation(s)
- Mark E Peterson
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California 94158
- Feng Chen
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104
- Department of Biology and Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104
- Jeffery G Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104
- David S Roos
- Department of Biology and Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104
- Patricia C Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California 94158
- Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California 94158
40
Williams SG, Lovell SC. The effect of sequence evolution on protein structural divergence. Mol Biol Evol 2009; 26:1055-65. [PMID: 19193735 DOI: 10.1093/molbev/msp020] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 12/14/2022] Open
Abstract
The complex constraints imposed by protein structure and function result in varied rates of sequence and structural divergence in proteins. Analysis of sequence differences between homologous proteins can advance our understanding of structural divergence and some of the constraints that govern the evolution of these molecules. Here, we assess the relationship between amino acid sequence and structural divergence. Firstly, we demonstrate that the relationship between protein sequence and structural divergence is governed by a variety of evolutionary constraints, including solvent exposure and secondary structure. Secondly, although compensatory substitutions are widespread, we find many radical size-changing mutations that are not compensated by neighboring complementary changes. Instead, these noncompensated substitutions are mitigated by alteration of protein structure. These results suggest a combined mechanism of accommodating substitutions in proteins, involving both coevolution and structural accommodation. Such a mechanism can explain previously observed correlated substitutions of residues that are distant both in sequence and structure, allowing an integrated view of sequence and structural divergence of proteins.
Affiliation(s)
- Simon G Williams
- Faculty of Life Sciences, University of Manchester, Manchester, UK
41
Nuutinen T, Tossavainen H, Fredriksson K, Pirilä P, Permi P, Pospiech H, Syvaoja JE. The solution structure of the amino-terminal domain of human DNA polymerase epsilon subunit B is homologous to C-domains of AAA+ proteins. Nucleic Acids Res 2008; 36:5102-10. [PMID: 18676977 PMCID: PMC2528186 DOI: 10.1093/nar/gkn497] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022] Open
Abstract
DNA polymerases α, δ and ε are large multisubunit complexes that replicate the bulk of the DNA in the eukaryotic cell. In addition to the homologous catalytic subunits, these enzymes possess structurally related B subunits, characterized by a carboxyterminal calcineurin-like and an aminoproximal oligonucleotide/oligosaccharide binding-fold domain. The B subunits also share homology with the exonuclease subunit of archaeal DNA polymerases D. Here, we describe a novel domain specific to the N-terminus of the B subunit of eukaryotic DNA polymerases ε. The N-terminal domain of human DNA polymerases ε (Dpoe2NT) expressed in Escherichia coli was characterized. Circular dichroism studies demonstrated that Dpoe2NT forms a stable, predominantly α-helical structure. The solution structure of Dpoe2NT revealed a domain that consists of a left-handed superhelical bundle. Four helices are arranged in two hairpins and the connecting loops contain short β-strand segments that form a short parallel sheet. DALI searches demonstrated a striking structural similarity of the Dpoe2NT with the α-helical subdomains of ATPase associated with various cellular activity (AAA+) proteins (the C-domain). Like C-domains, Dpoe2NT is rich in charged amino acids. The biased distribution of the charged residues is reflected by a polarization and a considerable dipole moment across the Dpoe2NT. Dpoe2NT represents the first C-domain fold not associated with an AAA+ protein.
42
Gao B, Zhu SY. Differential potency of drosomycin to Neurospora crassa and its mutant: implications for evolutionary relationship between defensins from insects and plants. Insect Mol Biol 2008; 17:405-411. [PMID: 18651922 DOI: 10.1111/j.1365-2583.2008.00810.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 05/26/2023]
Abstract
Drosomycin, the first inducible antifungal peptide isolated from Drosophila, belongs to the superfamily of CSalphabeta-type defensins. In the present study we report a modified approach for high-level expression of drosomycin, which allows us to evaluate its differential potency on the filamentous fungus Neurospora crassa WT (wild type) and N. crassa MUT16, a specific resistance mutant strain to plant defensins, by using different approaches. The results presented here show for the first time that N. crassa MUT16 is resistant to our recombinant drosomycin. Differential survival rates of Drosophila larvae infected by N. crassa WT and MUT16 further confirm the key antifungal role of drosomycin in vivo. The absence of activity against MUT16 suggests a mechanical commonality between drosomycin and plant defensins, which provides additional evidence in favor of their homologous relationship. Furthermore, the existence of drosomycin-like molecules in fungi suggests that all these peptides could originate from a common ancestry rather than horizontal gene transfer between plants and insects, which is further strengthened by the monophyletic origin of these peptides from plants, fungi and insects.
Affiliation(s)
- B Gao
- Group of Animal Innate Immunity, State Key Laboratory of Integrated Management of Pest Insects & Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
43
Choi SC, Stone EA, Kishino H, Thorne JL. Estimates of natural selection due to protein tertiary structure inform the ancestry of biallelic loci. Gene 2008; 441:45-52. [PMID: 18725272 DOI: 10.1016/j.gene.2008.07.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/01/2008] [Accepted: 07/10/2008] [Indexed: 10/21/2022]
Abstract
We consider the inference of which of two alleles is ancestral when the alleles have a single nonsynonymous difference and when natural selection acts via protein tertiary structure. Whereas the probability that an allele is ancestral under neutrality is equal to its frequency, under selection this probability depends on allele frequency and on the magnitude and direction of selection pressure. Although allele frequencies can be well estimated from intraspecific data, small fitness differences have a large evolutionary impact but can be difficult to estimate with only intraspecific data. Methods for predicting aspects of phenotype from genotype can supplement intraspecific sequence data. Recently developed statistical techniques can assess effects of phenotypes, such as protein tertiary structure on molecular evolution. While these techniques were initially designed for comparing protein-coding genes from different species, the resulting interspecific inferences can be assigned population genetic interpretations to assess the effect of selection pressure, and we use them here along with intraspecific allele frequency data to estimate the probability that an allele is ancestral. We focus on 140 nonsynonymous single nucleotide polymorphisms of humans that are in proteins with known tertiary structures. We find that our technique for employing protein tertiary structure information yields some biologically plausible results but that it does not substantially improve the inference of ancestral human allele types.
Affiliation(s)
- Sang Chul Choi
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695-7566, USA
44
Ahmed S, Kapoor D, Singh B, Guptasarma P. Conformational behavior of polypeptides derived through simultaneous global conservative site-directed mutagenesis of chymotrypsin inhibitor 2. Biochim Biophys Acta 2008; 1784:796-805. [PMID: 18359306 DOI: 10.1016/j.bbapap.2008.01.023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/15/2007] [Revised: 01/30/2008] [Accepted: 01/31/2008] [Indexed: 05/26/2023]
Abstract
The natural occurrence of conservative residue substitutions in proteins suggests that side-chain packing schemes in protein interiors can accommodate mutational replacements of residues by others of similar nature. To explore the extent to which such substitutions are tolerated, especially when introduced simultaneously and globally over the entire length of a polypeptide chain, we examined the conformational behavior of a model 65 residues-long protein, wild-type chymotrypsin inhibitor 2 (WTCI2), and two globally-mutated (GM) variants named GMCI2-1 and GMCI2-2, each incorporating 55 conservative residue substitutions. GMCI2-1 was soluble over a wide range of pH, and folded into a compact, spherical monomer marked by (i) complete absence of surface hydrophobicity, (ii) a WTCI2-like betaII-type CD spectrum, (iii) high WTCI2-like thermal stability, and (iv) 1D and 2D NMR spectra characteristic of folded protein structure. GMCI2-2 was insoluble over a wide range of pH, and could be solubilized only at pH 4.0, showing non-WTCI2-like far-UV CD spectra characterized by high helical content. These results tentatively indicate that polypeptides incorporating residues of identical nature at equivalent chain locations can show the potential to fold with similar characteristics. However, further detailed investigations would be required to determine whether indeed the structural fold of GMCI2-1 resembles that of WTCI2, and to evaluate the extent to which it does so.
Affiliation(s)
- Shubbir Ahmed
- Division of Protein Science and Engineering, Institute of Microbial Technology, Sector 39-A, Chandigarh 160 036, India
45
Cheng H, Kim BH, Grishin NV. Discrimination between distant homologs and structural analogs: lessons from manually constructed, reliable data sets. J Mol Biol 2008; 377:1265-78. [PMID: 18313074 PMCID: PMC4494761 DOI: 10.1016/j.jmb.2007.12.076] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 11/07/2007] [Accepted: 12/20/2007] [Indexed: 10/22/2022]
Abstract
A natural way to study protein sequence, structure, and function is to put them in the context of evolution. Homologs inherit similarities from their common ancestor, while analogs converge to similar structures due to a limited number of energetically favorable ways to pack secondary structural elements. Using novel strategies, we previously assembled two reliable databases of homologs and analogs. In this study, we compare these two data sets and develop a support vector machine (SVM)-based classifier to discriminate between homologs and analogs. The classifier uses a number of well-known similarity scores. We observe that although both structure scores and sequence scores contribute to SVM performance, profile sequence scores computed based on structural alignments are the best discriminators between remote homologs and structural analogs. We apply our classifier to a representative set from the expert-constructed database, Structural Classification of Proteins (SCOP). The SVM classifier recovers 76% of the remote homologs defined as domains in the same SCOP superfamily but from different families. More importantly, we also detect and discuss interesting homologous relationships between SCOP domains from different superfamilies, folds, and even classes.
Affiliation(s)
- Hua Cheng
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390-9050, USA.
|
46
|
Clapier CR, Chakravarthy S, Petosa C, Fernández-Tornero C, Luger K, Müller CW. Structure of the Drosophila nucleosome core particle highlights evolutionary constraints on the H2A-H2B histone dimer. Proteins 2008; 71:1-7. [PMID: 17957772 PMCID: PMC2443955 DOI: 10.1002/prot.21720] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
Abstract
We determined the 2.45 Å crystal structure of the nucleosome core particle from Drosophila melanogaster and compared it to that of Xenopus laevis bound to the identical 147 base-pair DNA fragment derived from human α-satellite DNA. Differences between the two structures primarily reflect 16 amino acid substitutions between species, 15 of which are in histones H2A and H2B. Four of these involve histone tail residues, resulting in subtly altered protein–DNA interactions that exemplify the structural plasticity of these tails. Of the 12 substitutions occurring within the histone core regions, five involve small, solvent-exposed residues not involved in intraparticle interactions. The remaining seven involve buried hydrophobic residues, and appear to have coevolved so as to preserve the volume of side chains within the H2A hydrophobic core and H2A-H2B dimer interface. Thus, apart from variations in the histone tails, amino acid substitutions that differentiate Drosophila from Xenopus histones occur in mutually compensatory combinations. This highlights the tight evolutionary constraints exerted on histones since the vertebrate and invertebrate lineages diverged.
Affiliation(s)
- Cedric R Clapier
- European Molecular Biology Laboratory, Grenoble Outstation, 38042 Grenoble Cedex 9, France
|
47
|
Abdel-Halim H, Hanrahan JR, Hibbs DE, Johnston GAR, Chebib M. A molecular basis for agonist and antagonist actions at GABA(C) receptors. Chem Biol Drug Des 2008; 71:306-27. [PMID: 18312293 DOI: 10.1111/j.1747-0285.2008.00642.x] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Indexed: 11/30/2022]
Abstract
We modelled the N-terminal ligand-binding domain of the rho1 GABA(C) receptor on the Lymnaea stagnalis acetylcholine-binding protein (L-AChBP) crystal structure using comparative modelling, and validated the models using flexible docking guided by known mutagenesis studies. A range of known rho1 GABA(C) receptor ligands comprising seven full agonists, 10 partial agonists, 43 antagonists and 12 inactive molecules was used to evaluate and validate the models. Of the 50 models identified, six that allowed flexible ligand docking in accordance with the experimental data were selected and used to study detailed receptor-ligand interactions. The most refined model able to accommodate all known active ligands featured a cavity with a volume of 488 Å³. A detailed analysis of the interaction between the rho1 GABA(C) receptor model and the docked ligands revealed possible H-bonds and cation-pi interactions between the different ligands and binding site residues. Based on quantum mechanical/molecular mechanical (QM/MM) calculations, the model showed distinctive conformations of loop C that provided a molecular basis for agonist and antagonist actions. Agonists elicit loop C closure, while a more open loop C was observed upon antagonist binding. The model differentiates the roles of key residues known to be involved in binding, gating, or both.
Affiliation(s)
- Heba Abdel-Halim
- Faculty of Pharmacy, The University of Sydney, Sydney, NSW 2006, Australia
|
48
|
Russell RB. Classification of protein folds. Mol Biotechnol 2007; 36:238-47. [PMID: 17873410 DOI: 10.1007/s12033-007-0032-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/30/1999] [Revised: 11/30/1999] [Accepted: 11/30/1999] [Indexed: 11/26/2022]
Abstract
The diversity and complexity of bioinformatics tools currently available for protein sequence analysis can make it difficult to know where to begin when presented with a new sequence. In this article, we present a protocol outlining one approach to sequence analysis that should give as comprehensive a picture as possible as to the likely structure and function of a protein given the limits of available tools. We also provide worked examples showing how these tools can have an impact on the understanding of protein function prior to experimental studies.
Affiliation(s)
- Robert B Russell
- Structural Bioinformatics, EMBL, Meyerhofstrasse 1, Heidelberg, Germany.
|
49
|
Bártová I, Koca J, Otyepka M. Functional flexibility of human cyclin-dependent kinase-2 and its evolutionary conservation. Protein Sci 2007; 17:22-33. [PMID: 18042686 DOI: 10.1110/ps.072951208] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 10/22/2022]
Abstract
Cyclin-dependent kinase 2 (CDK2) is the most thoroughly studied of the cyclin-dependent kinases that regulate essential cellular processes, including the cell cycle, and it has become a model for studies of regulatory mechanisms at the molecular level. This contribution identifies flexible and rigid regions of CDK2 based on temperature B-factors acquired from both X-ray data and molecular dynamics simulations. In addition, the biological relevance of the identified flexible regions and their motions is explored using information from the essential dynamics analysis related to conformational changes of CDK2 and knowledge of its biological function(s). The conserved regions of the primary sequences of CMGC protein kinases are located in the most rigid regions identified in our analyses, with the sole exception of the absolutely conserved G13 in the tip of the glycine-rich loop. The conserved rigid regions are important for nucleotide binding, catalysis, and substrate recognition. In contrast, the most flexible regions correlate with those where large conformational changes occur during CDK2 regulation processes. The rigid regions flank and form a rigid skeleton for the flexible regions, which appear to provide the plasticity required for CDK2 regulation. Unlike the rigid regions, which are highly conserved, the flexible regions showed no evidence of evolutionary conservation.
Affiliation(s)
- Iveta Bártová
- Department of Physical Chemistry, Palacky University, 771 46 Olomouc, Czech Republic
|
50
|
Improving pairwise sequence alignment between distantly related proteins. Methods Mol Biol 2007. [PMID: 17993679 DOI: 10.1007/978-1-59745-514-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Sequence alignment between remotely related proteins has been one of the more difficult problems in structural biology. Improvements have been achieved by incorporating information that enhances the diversity of the substitution matrices. NdPASA is a web-based server that optimizes sequence alignments between proteins sharing low percentages of sequence identity. The program integrates structure information of the template sequence into a global alignment algorithm by employing amino acids' neighbor-dependent propensities for secondary structure as unique parameters for alignment. NdPASA optimizes alignment by evaluating the likelihood of a residue pair in the query sequence matching against a corresponding residue pair adopting a particular secondary structure in the template sequence. The server is designed to aid homologous protein structure modeling. It is most effective when the structure of the template sequence is known. NdPASA can be accessed online at www.fenglab.org/bioserver.html.
|