1. Itoh T, Ogawa T, Hibi T, Kimoto H. Characterization of the extracellular domain of sensor histidine kinase NagS from Paenibacillus sp. str. FPU-7: nagS interacts with oligosaccharide binding protein NagB1 in complexes with N, N'-diacetylchitobiose. Biosci Biotechnol Biochem 2024; 88:294-304. [PMID: 38059852 DOI: 10.1093/bbb/zbad173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 11/27/2023] [Indexed: 12/08/2023]
Abstract
We have previously isolated the Gram-positive chitin-degrading bacterium Paenibacillus sp. str. FPU-7. This bacterium traps chitin disaccharide (GlcNAc)2 on its cell surface using two homologous solute-binding proteins, NagB1 and NagB2. Bacteria use histidine kinase (HK) of the two-component regulatory system as an extracellular environment sensor. In this study, we found that nagS, which encodes a HK, is located next to the nagB1 gene. Biochemical experiments revealed that the NagS sensor domain (NagS30-294) interacts with the NagB1-(GlcNAc)2 complex. However, proof of NagS30-294 interacting with NagB1 without (GlcNAc)2 is currently unavailable. In contrast to NagB1, no complex formation was observed between NagS30-294 and NagB2, even in the presence of (GlcNAc)2. The NagS30-294 crystal structure at 1.8 Å resolution suggested that the canonical tandem-Per-Arnt-Sim fold recognizes the NagB1-(GlcNAc)2 complex. This study provides insight into the recognition of chitin oligosaccharides by bacteria.
Affiliation(s)
- Takafumi Itoh
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, Fukui, Japan
- Tomoki Ogawa
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, Fukui, Japan
- Takao Hibi
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, Fukui, Japan
- Hisashi Kimoto
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, Fukui, Japan
2. Yariv B, Yariv E, Kessel A, Masrati G, Chorin AB, Martz E, Mayrose I, Pupko T, Ben‐Tal N. Using evolutionary data to make sense of macromolecules with a "face-lifted" ConSurf. Protein Sci 2023; 32:e4582. [PMID: 36718848 PMCID: PMC9942591 DOI: 10.1002/pro.4582] [Citation(s) in RCA: 175] [Impact Index Per Article: 87.5] [Received: 09/12/2022] [Revised: 01/21/2023] [Accepted: 01/27/2023] [Indexed: 02/01/2023]
Abstract
The ConSurf web server for the analysis of proteins, RNA, and DNA provides a quick and accurate estimate of the per-site evolutionary rate among homologues. The analysis reveals functionally important regions, such as catalytic and ligand-binding sites, which often evolve slowly. Since the last report in 2016, ConSurf has been improved in multiple ways. It now has a user-friendly interface that makes it easier to perform the analysis and to visualize the results. Evolutionary rates are calculated based on a set of homologous sequences, collected using hidden Markov model-based search tools, recently embedded in the pipeline. Using these, and following the removal of redundancy, ConSurf assembles a representative set of effective homologues for protein and nucleic acid queries to enable informative analysis of the evolutionary patterns. The analysis is particularly insightful when the evolutionary rates are mapped on the macromolecule structure. In this respect, the availability of AlphaFold model structures of essentially all UniProt proteins makes ConSurf particularly relevant to the research community. The UniProt ID of a query protein with an available AlphaFold model can now be used to start a calculation. Another important improvement is the Python re-implementation of the entire computational pipeline, making it easier to maintain. This Python pipeline is now available for download as a standalone version. We demonstrate some of ConSurf's key capabilities by the analysis of caveolin-1, the main protein of membrane invaginations called caveolae.
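As a rough illustration of what a per-site conservation score looks like, the sketch below scores each column of a toy alignment by Shannon entropy. This is an assumption made for illustration only: the abstract describes evolutionary-rate estimates from homologous sequences, which is a more elaborate calculation than column entropy, and none of ConSurf's own code or API is reproduced here. The function name `column_entropies` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Return the Shannon entropy (in bits) of each alignment column.

    Low entropy means a conserved column. Gap characters ('-') are ignored.
    This is a crude proxy, not an evolutionary-rate estimate.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            # Column made only of gaps: report zero rather than divide by zero.
            entropies.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(entropy)
    return entropies


# Hypothetical toy alignment: four sequences, five columns.
toy_alignment = ["MKTAY", "MKSAY", "MRTA-", "MKTGY"]
scores = column_entropies(toy_alignment)
```

Columns with a score of zero (here the first and last) are fully conserved among the non-gap residues; higher values mark more variable positions.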
Affiliation(s)
- Barak Yariv
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Elon Yariv
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Amit Kessel
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Gal Masrati
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Adi Ben Chorin
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Eric Martz
  - Department of Microbiology, University of Massachusetts, Amherst, Massachusetts, USA
- Itay Mayrose
  - George S. Wise Faculty of Life Sciences, School of Plant Sciences and Food Security, Tel Aviv University, Tel Aviv, Israel
- Tal Pupko
  - George S. Wise Faculty of Life Sciences, The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv, Israel
- Nir Ben‐Tal
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
3. Itoh T, Yaguchi M, Nakaichi A, Yoda M, Hibi T, Kimoto H. Structural characterization of two solute-binding proteins for N,N'-diacetylchitobiose/ N,N',N''-triacetylchitotoriose of the gram-positive bacterium, Paenibacillus sp. str. FPU-7. J Struct Biol X 2021; 5:100049. [PMID: 34195603 PMCID: PMC8233162 DOI: 10.1016/j.yjsbx.2021.100049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2021] [Accepted: 05/28/2021] [Indexed: 10/27/2022] Open
Abstract
The chitinolytic bacterium Paenibacillus sp. str. FPU-7 efficiently degrades chitin into oligosaccharides such as N-acetyl-D-glucosamine (GlcNAc) and disaccharides (GlcNAc)2 through multiple secretory chitinases. Transport of these oligosaccharides by Paenibacillus sp. str. FPU-7 has not yet been clarified. In this study, we identified nagB1, predicted to encode a sugar solute-binding protein (SBP), which is a component of the ABC transport system. However, the genes next to nagB1 were predicted to encode two-component regulatory system proteins rather than transmembrane domains (TMDs). We also identified nagB2, which is highly homologous to nagB1. Adjacent to nagB2, two genes were predicted to encode TMDs. Binding experiments of the recombinant NagB1 and NagB2 to several oligosaccharides using differential scanning fluorimetry and surface plasmon resonance confirmed that both proteins are SBPs of (GlcNAc)2 and (GlcNAc)3. We determined their crystal structures complexed with and without chitin oligosaccharides at a resolution of 1.2 to 2.0 Å. The structures shared typical SBP structural folds and were classified as subcluster D-I. Large domain motions were observed in the structures, suggesting that they were induced by ligand binding via the "Venus flytrap" mechanism. These structures also revealed chitin oligosaccharide recognition mechanisms. In conclusion, our study provides insight into the recognition and transport of chitin oligosaccharides in bacteria.
Key Words
- ABC transporter
- ABC, ATP-binding cassette
- Chitin oligosaccharide
- DSF, differential scanning fluorimetry
- GH, glycoside hydrolase
- GlcN, D-glucosamine
- GlcNAc, N-acetyl-D-glucosamine
- OD600, optical density at 600 nm
- PDB, Protein Data Bank
- PTS, phosphoenolpyruvate phosphotransferase system
- Paenibacillus
- RU, response unit
- SBP, solute binding protein
- Se-Met, selenomethionine
- Solute binding protein
- TMD, transmembrane domain
- Two-component regulatory system
- a.a., amino acid
- r.m.s.d., root mean-square deviation
Affiliation(s)
- Takafumi Itoh
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, 4-1-1 Matsuokakenjyoujima, Eiheiji-cho, Yoshida-gun, Fukui 910-1195, Japan
- Misaki Yaguchi
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, 4-1-1 Matsuokakenjyoujima, Eiheiji-cho, Yoshida-gun, Fukui 910-1195, Japan
- Akari Nakaichi
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, 4-1-1 Matsuokakenjyoujima, Eiheiji-cho, Yoshida-gun, Fukui 910-1195, Japan
- Moe Yoda
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, 4-1-1 Matsuokakenjyoujima, Eiheiji-cho, Yoshida-gun, Fukui 910-1195, Japan
- Takao Hibi
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, 4-1-1 Matsuokakenjyoujima, Eiheiji-cho, Yoshida-gun, Fukui 910-1195, Japan
- Hisashi Kimoto
  - Department of Bioscience and Biotechnology, Fukui Prefectural University, 4-1-1 Matsuokakenjyoujima, Eiheiji-cho, Yoshida-gun, Fukui 910-1195, Japan
4. Structural effects driven by rare point mutations in amylin hormone, the type II diabetes-associated peptide. Biochim Biophys Acta Gen Subj 2021; 1865:129935. [PMID: 34044067 DOI: 10.1016/j.bbagen.2021.129935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2021] [Revised: 05/11/2021] [Accepted: 05/20/2021] [Indexed: 11/27/2022]
Abstract
BACKGROUND: Amylin is a 37-amino-acid peptide hormone co-secreted with insulin, which participates in glucose homeostasis. This hormone is able to aggregate in a β-sheet conformation and deposit in islet amyloids, a hallmark in type II diabetes. Since amylin is a gene-encoded hormone, this peptide has variants caused by point mutations that can impact its functions. METHODS: Here, we analyzed the structural effects caused by S20G and G33R point mutations which, according to the 1000 Genomes Project, have frequency in East Asian and European populations, respectively. The analyses were performed by means of aggrescan server, SNP functional effect predictors, and molecular dynamics. RESULTS: We found that both mutations have aggregation potential and cause changes in the monomeric forms when compared with wild-type amylin. Furthermore, comparative analyses with pramlintide, an amylin drug analogue, allowed us to infer that second α-helix maintenance may be related to the aggregation potential. CONCLUSIONS: The S20G mutation has been described as pathologically related, which is in agreement with our findings. In addition, our data suggest that the G33R mutation might have a deleterious effect. The data presented here also provide new therapy opportunities, whether for creating more effective drugs for diabetes or implementing specific treatment for patients with these mutations. GENERAL SIGNIFICANCE: Our data could help to better understand the impact of mutations on the wild-type amylin sequence, as a starting point for the evaluation and characterization of other variations. Moreover, these findings could improve the health of patients with type II diabetes.
5. Ferreira KCDV, Fialho LF, Franco OL, de Alencar SA, Porto WF. Benchmarking analysis of deleterious SNP prediction tools on CYP2D6 enzyme. Chem Biol Drug Des 2020; 96:984-994. [DOI: 10.1111/cbdd.13676] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/28/2019] [Revised: 02/15/2020] [Accepted: 03/03/2020] [Indexed: 12/19/2022]
Affiliation(s)
- Karla Cristina do Vale Ferreira
  - Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
  - Centro de Análises Proteômicas e Bioquímicas, Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
- Leonardo Ferreira Fialho
  - Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
- Octávio Luiz Franco
  - Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
  - Centro de Análises Proteômicas e Bioquímicas, Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
  - S‐Inova Biotech, Pós Graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, Brazil
- Sérgio Amorim de Alencar
  - Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
- William Farias Porto
  - S‐Inova Biotech, Pós Graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, Brazil
  - Porto Reports, Brasília, Brazil
6. Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 2019; 20:473. [PMID: 31521110 PMCID: PMC6744700 DOI: 10.1186/s12859-019-3019-7] [Citation(s) in RCA: 634] [Impact Index Per Article: 105.7] [Received: 02/19/2019] [Accepted: 08/02/2019] [Indexed: 01/06/2023] Open
Abstract
Background: HH-suite is a widely used open source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile Hidden Markov models (HMMs), which represent multiple sequence alignments of homologous proteins. Results: We developed a single-instruction multiple-data (SIMD) vectorized implementation of the Viterbi algorithm for profile HMM alignment and introduced various other speed-ups. These accelerated the search methods HHsearch by a factor of 4 and HHblits by a factor of 2 over the previous version 2.0.16. HHblits3 is ∼10× faster than PSI-BLAST and ∼20× faster than HMMER3. Jobs to perform HHsearch and HHblits searches with many query profile HMMs can be parallelized over cores and over cluster servers using OpenMP and message passing interface (MPI). The free, open-source, GPLv3-licensed software is available at https://github.com/soedinglab/hh-suite. Conclusion: The added functionalities and increased speed of HHsearch and HHblits should facilitate their use in large-scale protein structure and function prediction, e.g. in metagenomics and genomics projects.
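The Viterbi recursion named in this abstract can be sketched on a small, ordinary HMM. The example below is a plain scalar, log-space Viterbi decoder for a single observation sequence; it is not the pairwise profile-HMM alignment or the SIMD implementation of HH-suite, and the two-state model, its probabilities, and the function name `viterbi` are hypothetical values chosen only to make the recursion concrete.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_state_path, best_log_probability) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


def to_log(table):
    """Convert a nested dict of probabilities into log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical two-state model: state A prefers symbol 'x', state B prefers 'y'.
states = ["A", "B"]
log_start = {"A": math.log(0.5), "B": math.log(0.5)}
log_trans = to_log({"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}})
log_emit = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.1, "y": 0.9}})

path, log_prob = viterbi("xxyy", states, log_start, log_trans, log_emit)
```

Working in log space turns products of probabilities into sums, which avoids numerical underflow on long sequences. Each cell depends only on the previous time step, which is the kind of regular inner loop that vectorized implementations target.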
Affiliation(s)
- Martin Steinegger
  - Quantitative and Computational Biology Group, Max-Planck Institute for Biophysical Chemistry, Am Fassberg 11, Munich, 81379, Germany; Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Markus Meier
  - Quantitative and Computational Biology Group, Max-Planck Institute for Biophysical Chemistry, Am Fassberg 11, Munich, 81379, Germany
- Milot Mirdita
  - Quantitative and Computational Biology Group, Max-Planck Institute for Biophysical Chemistry, Am Fassberg 11, Munich, 81379, Germany
- Harald Vöhringer
  - Quantitative and Computational Biology Group, Max-Planck Institute for Biophysical Chemistry, Am Fassberg 11, Munich, 81379, Germany; European Bioinformatics Institute, Cambridge, CB10 1SD, United Kingdom
- Johannes Söding
  - Quantitative and Computational Biology Group, Max-Planck Institute for Biophysical Chemistry, Am Fassberg 11, Munich, 81379, Germany
7. Monteiro LLS, Franco OL, Alencar SA, Porto WF. Deciphering the structural basis for glucocorticoid resistance caused by missense mutations in the ligand binding domain of glucocorticoid receptor. J Mol Graph Model 2019; 92:216-226. [PMID: 31401440 DOI: 10.1016/j.jmgm.2019.07.020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/11/2019] [Revised: 07/01/2019] [Accepted: 07/31/2019] [Indexed: 11/25/2022]
Abstract
The hereditary glucocorticoid resistance condition may emerge from the occurrence of point mutations in the glucocorticoid receptor (GR), which could impair its functionality. Because the main feature of this pathology is the resistance of the hypothalamic-pituitary-adrenal axis to the hormone cortisol, we used the three-dimensional structure of the GR ligand binding domain to perform computational analysis of eight variants known to cause this clinical condition (I559N, V571A, D641V, G679S, F737L, I747M, L753F and L773P), aiming to understand, at the atomic scale, how they cause glucocorticoid resistance. We observed that the mutations reduced the affinity for cortisol and altered some loop conformations, which could be a consequence of changes in protein motion, which in turn could result from the reduced stability of mutant GR structures. Therefore, the analyzed mutations compromise the GR ligand binding domain structure and cortisol binding, which could characterize the glucocorticoid resistance phenotype.
Affiliation(s)
- L L S Monteiro
  - Programa de Pós-Graduação Em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, DF, Brazil
- O L Franco
  - Programa de Pós-Graduação Em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, DF, Brazil; Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação Em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, DF, Brazil; S-Inova Biotech, Pós-Graduação Em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, MS, Brazil
- S A Alencar
  - Programa de Pós-Graduação Em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, DF, Brazil
- W F Porto
  - Porto Reports, Brasília, DF, Brazil; S-Inova Biotech, Pós-Graduação Em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, MS, Brazil
8. Yamada KD, Kinoshita K. De novo profile generation based on sequence context specificity with the long short-term memory network. BMC Bioinformatics 2018; 19:272. [PMID: 30021530 PMCID: PMC6052547 DOI: 10.1186/s12859-018-2284-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/17/2018] [Accepted: 07/11/2018] [Indexed: 11/24/2022] Open
Abstract
Background: Long short-term memory (LSTM) is one of the most attractive deep learning methods to learn time series or contexts of input data. A growing number of studies, including biological sequence analyses in bioinformatics, utilize this architecture. Amino acid sequence profiles are widely used for bioinformatics studies, such as sequence similarity searches, multiple alignments, and evolutionary analyses. Currently, many biological sequences are becoming available, and the rapidly increasing amount of sequence data emphasizes the importance of scalable generators of amino acid sequence profiles. Results: We employed the LSTM network and developed a novel profile generator to construct profiles without any assumptions, except for input sequence context. Our method could generate better profiles than existing de novo profile generators, including CSBuild and RPS-BLAST, on the basis of profile-sequence similarity search performance with linear calculation costs against input sequence size. In addition, we analyzed the effects of the memory power of LSTM and found that LSTM had high potential power to detect long-range interactions between amino acids, as in the case of beta-strand formation, which has been a difficult problem in protein bioinformatics using sequence information. Conclusion: We demonstrated the importance of sequence context and the feasibility of LSTM on biological sequence analyses. Our results demonstrated the effectiveness of memories in LSTM and showed that our de novo profile generator, SPBuild, achieved higher performance than that of existing methods for profile prediction of beta-strands, where long-range interactions of amino acids are important and are known to be difficult for the existing window-based prediction methods. Our findings will be useful for the development of other prediction methods related to biological sequences by machine learning methods.
Electronic supplementary material: The online version of this article (10.1186/s12859-018-2284-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Kazunori D Yamada
  - Graduate School of Information Sciences, Tohoku University, Sendai, Japan; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, Sendai, Japan; Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan; Institute of Development, Aging, and Cancer, Tohoku University, Sendai, Japan
9. Rigden DJ, Thomas JMH, Simkovic F, Simpkin A, Winn MD, Mayans O, Keegan RM. Ensembles generated from crystal structures of single distant homologues solve challenging molecular-replacement cases in AMPLE. Acta Crystallogr D Struct Biol 2018; 74:183-193. [PMID: 29533226 PMCID: PMC5947759 DOI: 10.1107/s2059798318002310] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/11/2017] [Accepted: 02/07/2018] [Indexed: 01/17/2023] Open
Abstract
Molecular replacement (MR) is the predominant route to solution of the phase problem in macromolecular crystallography. Although routine in many cases, it becomes more effortful and often impossible when the available experimental structures typically used as search models are only distantly homologous to the target. Nevertheless, with current powerful MR software, relatively small core structures shared between the target and known structure, of 20-40% of the overall structure for example, can succeed as search models where they can be isolated. Manual sculpting of such small structural cores is rarely attempted and is dependent on the crystallographer's expertise and understanding of the protein family in question. Automated search-model editing has previously been performed on the basis of sequence alignment, in order to eliminate, for example, side chains or loops that are not present in the target, or on the basis of structural features (e.g. solvent accessibility) or crystallographic parameters (e.g. B factors). Here, based on recent work demonstrating a correlation between evolutionary conservation and protein rigidity/packing, novel automated ways to derive edited search models from a given distant homologue over a range of sizes are presented. A variety of structure-based metrics, many readily obtained from online webservers, can be fed to the MR pipeline AMPLE to produce search models that succeed with a set of test cases where expertly manually edited comparators, further processed in diverse ways with MrBUMP, fail. Further significant performance gains result when the structure-based distance geometry method CONCOORD is used to generate ensembles from the distant homologue. To our knowledge, this is the first such approach whereby a single structure is meaningfully transformed into an ensemble for the purposes of MR. Additional cases further demonstrate the advantages of the approach. 
CONCOORD is freely available and computationally inexpensive, so these novel methods offer readily available new routes to solve difficult MR cases.
Affiliation(s)
- Daniel J. Rigden
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Jens M. H. Thomas
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Felix Simkovic
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Adam Simpkin
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Martyn D. Winn
  - Science and Technology Facilities Council, Daresbury Laboratory, Warrington WA4 4AD, England
- Olga Mayans
  - Fachbereich Biologie, Universität Konstanz, 78457 Konstanz, Germany
- Ronan M. Keegan
  - Research Complex at Harwell, STFC Rutherford Appleton Laboratory, Didcot OX11 0FA, England
10. Yamada KD. Derivative-free neural network for optimizing the scoring functions associated with dynamic programming of pairwise-profile alignment. Algorithms Mol Biol 2018; 13:5. [PMID: 29467815 PMCID: PMC5815186 DOI: 10.1186/s13015-018-0123-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/11/2017] [Accepted: 02/06/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A profile-comparison method with position-specific scoring matrix (PSSM) is among the most accurate alignment methods. Currently, cosine similarity and correlation coefficients are used as scoring functions of dynamic programming to calculate similarity between PSSMs. However, it is unclear whether these functions are optimal for profile alignment methods. By definition, these functions cannot capture nonlinear relationships between profiles. Therefore, we attempted to discover a novel scoring function, which was more suitable for the profile-comparison method than existing functions, using neural networks. RESULTS: Although neural networks required derivative-of-cost functions, the problem being addressed in this study lacked them. Therefore, we implemented a novel derivative-free neural network by combining a conventional neural network with an evolutionary strategy optimization method used as a solver. Using this novel neural network system, we optimized the scoring function to align remote sequence pairs. Our results showed that the pairwise-profile aligner using the novel scoring function significantly improved both alignment sensitivity and precision relative to aligners using existing functions. CONCLUSIONS: We developed and implemented a novel derivative-free neural network and aligner (Nepal) for optimizing sequence alignments. Nepal improved alignment quality by adapting to remote sequence alignments and increasing the expressiveness of similarity scores. Additionally, this novel scoring function can be realized using a simple matrix operation and easily incorporated into other aligners. Moreover, our scoring function could potentially improve the performance of homology detection and/or multiple-sequence alignment of remote homologous sequences. The goal of the study was to provide a novel scoring function for profile alignment method and develop a novel learning system capable of addressing derivative-free problems. Our system is capable of optimizing the performance of other sophisticated methods and solving problems without derivative-of-cost functions, which do not always exist in practical problems. Our results demonstrated the usefulness of this optimization method for derivative-free problems.
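The two existing scoring functions mentioned in this abstract, cosine similarity and the correlation coefficient, can be written out for a pair of PSSM columns as follows. This is a minimal sketch: the function names are hypothetical, the toy vectors have four entries rather than one per amino acid, and the learned scoring function of the paper (Nepal) is not reproduced.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two score vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson correlation, computed as the cosine similarity of mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])


# Hypothetical toy "PSSM columns" (a real column has one score per amino acid).
col_a = [1.0, 2.0, 3.0, 4.0]
col_b = [2.0, 4.0, 6.0, 8.0]   # col_a scaled by 2
col_c = [4.0, 3.0, 2.0, 1.0]   # col_a reversed

same_direction = cosine_similarity(col_a, col_b)
linear_match = pearson_correlation(col_a, col_b)
linear_opposite = pearson_correlation(col_a, col_c)
```

Both scores are unchanged when a column is rescaled, and the correlation reaches its extremes exactly when one column is a linear function of the other, which illustrates the abstract's point that these functions measure linear relationships only.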
11. Singh P, Ganjiwale A, Howlett AC, Cowsik SM. In silico interaction analysis of cannabinoid receptor interacting protein 1b (CRIP1b) - CB1 cannabinoid receptor. J Mol Graph Model 2017; 77:311-321. [PMID: 28918320 PMCID: PMC5816684 DOI: 10.1016/j.jmgm.2017.09.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/06/2017] [Revised: 09/01/2017] [Accepted: 09/02/2017] [Indexed: 01/16/2023]
Abstract
Cannabinoid Receptor Interacting Protein isoform 1b (CRIP1b) is known to interact with the CB1 receptor. Alternative splicing of the CNRIP1 gene produces CRIP1a and CRIP1b with a difference in the third exon only. Exons 1 and 2 encode a functional domain in both proteins. CRIP1a is involved in regulating CB1 receptor internalization, but the function of CRIP1b is not very well characterized. Since there are significant identities in functional domains of these proteins, CRIP1b is a potential target for drug discovery. We report here the predicted structure of CRIP1b followed by its interaction analysis with the CB1 receptor by in silico methods. A number of complementary computational techniques, including homology modeling, ab initio modeling and protein threading, were applied to generate three-dimensional molecular models for CRIP1b. The computed model of CRIP1b was refined, followed by docking with the C terminus of the CB1 receptor to generate a model for the CRIP1b-CB1 receptor interaction. The structure of CRIP1b obtained by homology modelling using RHO_GDI-2 as template is a sandwich fold structure having beta sheets connected by loops, similar to the predicted CRIP1a structure. The best scoring refined model of CRIP1b in complex with the CB1 receptor C terminus peptide showed favourable polar interactions. The overall binding pocket of CRIP1b was found to overlap with that of CRIP1a. The Arg82 and Cys126 of CRIP1b are involved in the majority of hydrogen bond interactions with the CB1 receptor and are possible key residues required for interactions between the CB1 receptor and CRIP1b.
Affiliation(s)
- Pratishtha Singh
  - School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Anjali Ganjiwale
  - Department of Life Sciences, Bangalore University, Bangalore 560056, India
- Allyn C Howlett
  - Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA
- Sudha M Cowsik
  - School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
12. Porto WF, Marques FA, Pogue HB, de Oliveira Cardoso MT, do Vale MGR, da Silva Pires Á, Franco OL, de Alencar SA, Pogue R. Computational Investigation of Growth Hormone Receptor Trp169Arg Heterozygous Mutation in a Child With Short Stature. J Cell Biochem 2017; 118:4762-4771. [DOI: 10.1002/jcb.26144] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/16/2017] [Accepted: 05/17/2017] [Indexed: 11/07/2022]
Affiliation(s)
- William Farias Porto
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
- Centro de Análises Proteômicas e Bioquímicas, Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
- Porto Reports, Brasília – DF, Brazil
| | - Felipe Albuquerque Marques
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
- Departamento de Farmácia, Universidade CEUMA, São‐Luis – MA, Brazil
- Departamento de Biomedicina, Universidade CEUMA, São‐Luis – MA, Brazil
| | - Huri Brito Pogue
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
| | - Maria Teresinha de Oliveira Cardoso
- Curso de Medicina, Universidade Católica de Brasília, Brasília – DF, Brazil
- Núcleo de Genética da Secretaria de Saúde do Distrito Federal, Brasília – DF, Brazil
| | | | - Állan da Silva Pires
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
- Centro de Análises Proteômicas e Bioquímicas, Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
| | - Octavio Luiz Franco
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
- Centro de Análises Proteômicas e Bioquímicas, Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
- S‐Inova Biotech, Pós‐graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, MS, Brazil
| | - Sérgio Amorim de Alencar
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
| | - Robert Pogue
- Programa de Pós‐Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília – DF, Brazil
| |
|
13
|
Oda T, Lim K, Tomii K. Simple adjustment of the sequence weight algorithm remarkably enhances PSI-BLAST performance. BMC Bioinformatics 2017; 18:288. [PMID: 28578660 PMCID: PMC5455086 DOI: 10.1186/s12859-017-1686-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/01/2017] [Accepted: 05/15/2017] [Indexed: 11/13/2022]
Abstract
Background: PSI-BLAST, an extremely popular tool for sequence similarity search, features the utilization of Position-Specific Scoring Matrix (PSSM) constructed from a multiple sequence alignment (MSA). PSSM allows the detection of more distant homologs than a general amino acid substitution matrix does. An accurate estimation of the weights for sequences in an MSA is crucially important for PSSM construction. PSI-BLAST divides a given MSA into multiple blocks, for which sequence weights are calculated. When the block width becomes very narrow, the sequence weight calculation can be odd. Results: We demonstrate that PSI-BLAST indeed generates a significant fraction of blocks having width less than 5, thereby degrading the PSI-BLAST performance. We revised the code of PSI-BLAST to prevent the blocks from being narrower than a given minimum block width (MBW). We designate the modified application of PSI-BLAST as PSI-BLASTexB. When MBW is 25, PSI-BLASTexB notably outperforms PSI-BLAST consistently for three independent benchmark sets. The performance boost is even more drastic when an MSA, instead of a sequence, is used as a query. Conclusions: Our results demonstrate that the generation of narrow-width blocks during the sequence weight calculation is a critically important factor that restricts the PSI-BLAST search performance. By preventing narrow blocks, PSI-BLASTexB upgrades the PSI-BLAST performance remarkably. Binaries and source codes of PSI-BLASTexB (MBW = 25) are available at https://github.com/kyungtaekLIM/PSI-BLASTexB.
Affiliation(s)
- Toshiyuki Oda
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
| | - Kyungtaek Lim
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
| | - Kentaro Tomii
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan; Biotechnology Research Institute for Drug Discovery, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
| |
|
14
|
Pires AS, Porto WF, Franco OL, Alencar SA. In silico analyses of deleterious missense SNPs of human apolipoprotein E3. Sci Rep 2017; 7:2509. [PMID: 28559539 PMCID: PMC5449402 DOI: 10.1038/s41598-017-01737-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/21/2016] [Accepted: 03/27/2017] [Indexed: 12/23/2022]
Abstract
ApoE3 is the major chylomicron apolipoprotein, binding in a specific liver peripheral cell receptor, allowing transport and normal catabolism of triglyceride-rich lipoprotein constituents. Point mutations in ApoE3 have been associated with Alzheimer's disease, type III hyperlipoproteinemia, atherosclerosis, telomere shortening and impaired cognitive function. Here, we evaluate the impact of missense SNPs in APOE retrieved from dbSNP through 16 computational prediction tools, and further evaluate the structural impact of convergent deleterious changes using 100 ns molecular dynamics simulations. We have found structural changes in four analyzed variants (Pro102Arg, Arg132Ser, Arg176Cys and Trp294Cys), two of them (Pro102Arg and Arg176Cys) being previously associated with human diseases. In all cases, except for Trp294Cys, there was a loss in the number of hydrogen bonds between CT and NT domains that could result in their detachment. In conclusion, data presented here could increase the knowledge of ApoE3 activity and be a starting point for the study of the impact of variations on APOE gene.
Affiliation(s)
- Allan S Pires
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
- Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
| | - William F Porto
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
- Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
- Porto Reports, Brasília-DF, Brazil
| | - Octavio L Franco
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
- Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
- S-Inova Biotech, Pós-graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, MS, Brazil
| | - Sérgio A Alencar
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil.
| |
|
15
|
Kim J, Ahuja LG, Chao FA, Xia Y, McClendon CL, Kornev AP, Taylor SS, Veglia G. A dynamic hydrophobic core orchestrates allostery in protein kinases. Sci Adv 2017; 3:e1600663. [PMID: 28435869 PMCID: PMC5384802 DOI: 10.1126/sciadv.1600663] [Citation(s) in RCA: 81] [Impact Index Per Article: 10.1] [Received: 03/28/2016] [Accepted: 03/15/2017] [Indexed: 05/05/2023]
Abstract
Eukaryotic protein kinases (EPKs) constitute a class of allosteric switches that mediate a myriad of signaling events. It has been postulated that EPKs' active and inactive states depend on the structural architecture of their hydrophobic cores, organized around two highly conserved spines: C-spine and R-spine. How the spines orchestrate the transition of the enzyme between catalytically uncommitted and committed states remains elusive. Using relaxation dispersion nuclear magnetic resonance spectroscopy, we found that the hydrophobic core of the catalytic subunit of protein kinase A, a prototypical and ubiquitous EPK, moves synchronously to poise the C subunit for catalysis in response to binding adenosine 5'-triphosphate. In addition to completing the C-spine, the adenine ring fuses the β structures of the N-lobe and the C-lobe. Additional residues that bridge the two spines (I150 and V104) are revealed as part of the correlated hydrophobic network; their importance was validated by mutagenesis, which led to inactivation. Because the hydrophobic architecture of the catalytic core is conserved throughout the EPK superfamily, the present study suggests a universal mechanism for dynamically driven allosteric activation of kinases mediated by coordinated signal transmission through ordered motifs in their hydrophobic cores.
Affiliation(s)
- Jonggul Kim
- Department of Chemistry, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
| | - Lalima G. Ahuja
- Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
| | - Fa-An Chao
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
| | - Youlin Xia
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
| | - Christopher L. McClendon
- Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA 92093, USA
| | - Alexandr P. Kornev
- Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
| | - Susan S. Taylor
- Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA 92093, USA
| | - Gianluigi Veglia
- Department of Chemistry, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Corresponding author.
| |
|
16
|
Pires ÁS, Porto WF, Castro PO, Franco OL, Alencar SA. Theoretical structural characterization of lymphoguanylin: A potential candidate for the development of drugs to treat gastrointestinal disorders. J Theor Biol 2017; 419:193-200. [PMID: 28214543 DOI: 10.1016/j.jtbi.2017.02.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/18/2016] [Revised: 01/18/2017] [Accepted: 02/13/2017] [Indexed: 10/20/2022]
Abstract
Guanylin peptides (GPs) are small cysteine-rich peptide hormones involved in salt absorption, regulation of fluids and electrolyte homeostasis. This family presents four members: guanylin (GN), uroguanylin (UGN), lymphoguanylin (LGN) and renoguanylin (RGN). GPs have been used as templates for the development of drugs for the treatment of gastrointestinal disorders. Currently, LGN is the only GP with only one disulfide bridge, making it a remarkable member of this family and a potential drug template; however, there is no structural information about this peptide. In fact, LGN is predicted to be highly disordered and flexible, making it difficult to obtain structural information using in vitro methods. Therefore, this study applied a series of 1μs molecular dynamics simulations in order to understand the structural behavior of LGN, comparing it to the C115Y variant of GN, which shows the same Cys to Tyr modification. LGN showed to be more flexible than GN C115Y. While the negatively charged N-terminal, despite its repellent behavior, seems to be involved mainly in pH-dependent activity, the hydrophobic core showed to be the determinant factor in LGN's flexibility, which could be essential in its activity. These findings may be determinant in the development of new medicines to help in the treatment of gastrointestinal disorders. Moreover, our investigation of LGN structure clarified some issues in the structure-activity relationship of this peptide, providing new knowledge of guanylin peptides and clarifying the differences between GN C115Y and LGN.
Affiliation(s)
- Állan S Pires
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
| | - William F Porto
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; Porto Reports, Brasília-DF, Brazil
| | - Pryscilla O Castro
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
| | - Octavio L Franco
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; S-Inova Biotech, Pós-graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, MS, Brazil
| | - Sérgio A Alencar
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil.
| |
|
17
|
Ovchinnikov S, Park H, Varghese N, Huang PS, Pavlopoulos GA, Kim DE, Kamisetty H, Kyrpides NC, Baker D. Protein structure determination using metagenome sequence data. Science 2017; 355:294-298. [PMID: 28104891 PMCID: PMC5493203 DOI: 10.1126/science.aah4043] [Citation(s) in RCA: 351] [Impact Index Per Article: 43.9] [Received: 06/22/2016] [Accepted: 11/22/2016] [Indexed: 01/30/2023]
Abstract
Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families and that metagenome sequence data more than triple the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact-based structure matching, and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the Protein Data Bank. This approach provides the representative models for large protein families originally envisioned as the goal of the Protein Structure Initiative at a fraction of the cost.
Affiliation(s)
- Sergey Ovchinnikov
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA 98195, USA
| | - Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | | | - Po-Ssu Huang
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | | | - David E Kim
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98105, USA
| | | | - Nikos C Kyrpides
- Joint Genome Institute, Walnut Creek, CA 94598, USA
- Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA.
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98105, USA
| |
|
18
|
One-step design of a stable variant of the malaria invasion protein RH5 for use as a vaccine immunogen. Proc Natl Acad Sci U S A 2017; 114:998-1002. [PMID: 28096331 DOI: 10.1073/pnas.1616903114] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Indexed: 12/13/2022]
Abstract
Many promising vaccine candidates from pathogenic viruses, bacteria, and parasites are unstable and cannot be produced cheaply for clinical use. For instance, Plasmodium falciparum reticulocyte-binding protein homolog 5 (PfRH5) is essential for erythrocyte invasion, is highly conserved among field isolates, and elicits antibodies that neutralize in vitro and protect in an animal model, making it a leading malaria vaccine candidate. However, functional RH5 is only expressible in eukaryotic systems and exhibits moderate temperature tolerance, limiting its usefulness in hot and low-income countries where malaria prevails. Current approaches to immunogen stabilization involve iterative application of rational or semirational design, random mutagenesis, and biochemical characterization. Typically, each round of optimization yields minor improvement in stability, and multiple rounds are required. In contrast, we developed a one-step design strategy using phylogenetic analysis and Rosetta atomistic calculations to design PfRH5 variants with improved packing and surface polarity. To demonstrate the robustness of this approach, we tested three PfRH5 designs, all of which showed improved stability relative to wild type. The best, bearing 18 mutations relative to PfRH5, expressed in a folded form in bacteria at >1 mg of protein per L of culture, and had 10-15 °C higher thermal tolerance than wild type, while also retaining ligand binding and immunogenic properties indistinguishable from wild type, proving its value as an immunogen for a future generation of vaccines against the malaria blood stage. We envision that this efficient computational stability design methodology will also be used to enhance the biophysical properties of other recalcitrant vaccine candidates from emerging pathogens.
|
19
|
Lim K, Yamada KD, Frith MC, Tomii K. Protein sequence-similarity search acceleration using a heuristic algorithm with a sensitive matrix. J Struct Funct Genomics 2017; 17:147-154. [PMID: 28083762 PMCID: PMC5274646 DOI: 10.1007/s10969-016-9210-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/31/2015] [Accepted: 12/05/2016] [Indexed: 12/28/2022]
Abstract
Protein database search for public databases is a fundamental step in the target selection of proteins in structural and functional genomics and also for inferring protein structure, function, and evolution. Most database search methods employ amino acid substitution matrices to score amino acid pairs. The choice of substitution matrix strongly affects homology detection performance. We earlier proposed a substitution matrix named MIQS that was optimized for distant protein homology search. Herein we further evaluate MIQS in combination with LAST, a heuristic and fast database search tool with a tunable sensitivity parameter m, where larger m denotes higher sensitivity. Results show that MIQS substantially improves the homology detection and alignment quality performance of LAST across diverse m parameters. Against a protein database consisting of approximately 15 million sequences, LAST with m = 10^5 achieves better homology detection performance than BLASTP, and completes the search 20 times faster. Compared to the most sensitive existing methods being used today, CS-BLAST and SSEARCH, LAST with MIQS and m = 10^6 shows comparable homology detection performance at 2.0 and 3.9 times greater speed, respectively. Results demonstrate that MIQS-powered LAST is a time-efficient method for sensitive and accurate homology search.
Affiliation(s)
- Kyungtaek Lim
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
| | - Kazunori D Yamada
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Graduate School of Information Sciences, Tohoku University, 6-3-9 Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8579, Japan
| | - Martin C Frith
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Department of Computational Biology and Medical Sciences, University of Tokyo, 5-1-5 Kashiwa-no-ha, Kashiwa, Chiba, 227-8561, Japan
| | - Kentaro Tomii
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
- Biotechnology Research Institute for Drug Discovery, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
| |
|
20
|
Untiveros M, Olspert A, Artola K, Firth AE, Kreuze JF, Valkonen JPT. A novel sweet potato potyvirus open reading frame (ORF) is expressed via polymerase slippage and suppresses RNA silencing. Mol Plant Pathol 2016; 17:1111-23. [PMID: 26757490 PMCID: PMC4979677 DOI: 10.1111/mpp.12366] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 09/08/2015] [Revised: 12/10/2015] [Accepted: 12/17/2015] [Indexed: 05/20/2023]
Abstract
The single-stranded, positive-sense RNA genome of viruses in the genus Potyvirus encodes a large polyprotein that is cleaved to yield 10 mature proteins. The first three cleavage products are P1, HCpro and P3. An additional short open reading frame (ORF), called pipo, overlaps the P3 region of the polyprotein ORF. Four related potyviruses infecting sweet potato (Ipomoea batatas) are predicted to contain a third ORF, called pispo, which overlaps the 3' third of the P1 region. Recently, pipo has been shown to be expressed via polymerase slippage at a conserved GA6 sequence. Here, we show that pispo is also expressed via polymerase slippage at a GA6 sequence, with higher slippage efficiency (∼5%) than at the pipo site (∼1%). Transient expression of recombinant P1 or the 'transframe' product, P1N-PISPO, in Nicotiana benthamiana suppressed local RNA silencing (RNAi), but only P1N-PISPO inhibited short-distance movement of the silencing signal. These results reveal that polymerase slippage in potyviruses is not limited to pipo expression, but can be co-opted for the evolution and expression of further novel gene products.
Affiliation(s)
- Milton Untiveros
- Department of Agricultural Sciences, University of Helsinki, FI-00014, Helsinki, Finland
| | - Allan Olspert
- Department of Pathology, Division of Virology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EA, UK
| | - Katrin Artola
- Department of Agricultural Sciences, University of Helsinki, FI-00014, Helsinki, Finland
| | - Andrew E Firth
- Department of Pathology, Division of Virology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
| | | | - Jari P T Valkonen
- Department of Agricultural Sciences, University of Helsinki, FI-00014, Helsinki, Finland
| |
|
21
|
Burmann BM, Holdbrook DA, Callon M, Bond PJ, Hiller S. Revisiting the interaction between the chaperone Skp and lipopolysaccharide. Biophys J 2015; 108:1516-1526. [PMID: 25809264 DOI: 10.1016/j.bpj.2015.01.029] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/15/2014] [Revised: 12/22/2014] [Accepted: 01/28/2015] [Indexed: 10/23/2022]
Abstract
The bacterial outer membrane comprises two main classes of components, lipids and membrane proteins. These nonsoluble compounds are conveyed across the aqueous periplasm along specific molecular transport routes: the lipid lipopolysaccharide (LPS) is shuttled by the Lpt system, whereas outer membrane proteins (Omps) are transported by chaperones, including the periplasmic Skp. In this study, we revisit the specificity of the chaperone-lipid interaction of Skp and LPS. High-resolution NMR spectroscopy measurements indicate that LPS interacts with Skp nonspecifically, accompanied by destabilization of the Skp trimer and similar to denaturation by the nonnatural detergent lauryldimethylamine-N-oxide (LDAO). Bioinformatic analysis of amino acid conservation, structural analysis of LPS-binding proteins, and MD simulations further confirm the absence of a specific LPS binding site on Skp, making a biological relevance of the interaction unlikely. Instead, our analysis reveals a highly conserved salt-bridge network, which likely has a role for Skp function.
Affiliation(s)
| | | | | | - Peter J Bond
- Bioinformatics Institute (A*STAR), Singapore; Department of Biological Sciences, National University of Singapore, Singapore
| | | |
|
22
|
Systematic Exploration of an Efficient Amino Acid Substitution Matrix: MIQS. Methods Mol Biol 2016. [PMID: 27115635 DOI: 10.1007/978-1-4939-3572-7_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Amino acid sequence comparisons to find similarities between proteins are fundamental sequence information analyses for inferring protein structure and function. In this study, we improve amino acid substitution matrices to identify distantly related proteins. We systematically sampled and benchmarked substitution matrices generated from the principal component analysis (PCA) subspace based on a set of typical existing matrices. Based on the benchmark results, we identified a region of highly sensitive matrices in the PCA subspace using kernel density estimation (KDE). Using the PCA subspace, we were able to deduce a novel sensitive matrix, called MIQS, which shows better detection performance for detecting distantly related proteins than those of existing matrices. This approach to derive an efficient amino acid substitution matrix might influence many fields of protein sequence analysis. MIQS is available at http://csas.cbrc.jp/Ssearch/.
|
23
|
Hess M, Keul F, Goesele M, Hamacher K. Addressing inaccuracies in BLOSUM computation improves homology search performance. BMC Bioinformatics 2016; 17:189. [PMID: 27122148 PMCID: PMC4849092 DOI: 10.1186/s12859-016-1060-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/26/2015] [Accepted: 04/21/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND: BLOSUM matrices belong to the most commonly used substitution matrix series for protein homology search and sequence alignments since their publication in 1992. In 2008, Styczynski et al. discovered miscalculations in the clustering step of the matrix computation. Still, the RBLOSUM64 matrix based on the corrected BLOSUM code was reported to perform worse at a statistically significant level than the BLOSUM62. Here, we present a further correction of the (R)BLOSUM code and provide a thorough performance analysis of BLOSUM-, RBLOSUM- and the newly derived CorBLOSUM-type matrices. Thereby, we assess homology search performance of these matrix-types derived from three different BLOCKS databases on all versions of the ASTRAL20, ASTRAL40 and ASTRAL70 subsets resulting in 51 different benchmarks in total. Our analysis is focused on two of the most popular BLOSUM matrices - BLOSUM50 and BLOSUM62. RESULTS: Our study shows that fixing small errors in the BLOSUM code results in substantially different substitution matrices with a beneficial influence on homology search performance when compared to the original matrices. The CorBLOSUM matrices introduced here performed at least as good as their BLOSUM counterparts in ∼75 % of all test cases. On up-to-date ASTRAL databases BLOSUM matrices were even outperformed by CorBLOSUM matrices in more than 86 % of the times. In contrast to the study by Styczynski et al., the tested RBLOSUM matrices also outperformed the corresponding BLOSUM matrices in most of the cases. Comparing the CorBLOSUM with the RBLOSUM matrices revealed no general performance advantages for either on older ASTRAL releases. On up-to-date ASTRAL databases however CorBLOSUM matrices performed better than their RBLOSUM counterparts in ∼74 % of the test cases.
CONCLUSIONS: Our results imply that CorBLOSUM type matrices outperform the BLOSUM matrices on a statistically significant level in most of the cases, especially on up-to-date databases such as ASTRAL ≥2.01. Additionally, CorBLOSUM matrices are closer to those originally intended by Henikoff and Henikoff on a conceptual level. Hence, we encourage the usage of CorBLOSUM over (R)BLOSUM matrices for the task of homology search.
Affiliation(s)
- Martin Hess
- Graphics, Capture and Massively Parallel Computing, Department of Computer Science, Technische Universität Darmstadt, Rundeturmstraße 12, Darmstadt, 64283, Germany; Computational Biology and Simulation, Department of Biology, Technische Universität Darmstadt, Schnittspahnstraße 2, Darmstadt, 64287, Germany
| | - Frank Keul
- Computational Biology and Simulation, Department of Biology, Technische Universität Darmstadt, Schnittspahnstraße 2, Darmstadt, 64287, Germany.
| | - Michael Goesele
- Graphics, Capture and Massively Parallel Computing, Department of Computer Science, Technische Universität Darmstadt, Rundeturmstraße 12, Darmstadt, 64283, Germany
| | - Kay Hamacher
- Computational Biology and Simulation, Department of Biology, Technische Universität Darmstadt, Schnittspahnstraße 2, Darmstadt, 64287, Germany
| |
|
24
|
Porto WF, Nolasco DO, Pires ÁS, Fernandes GR, Franco OL, Alencar SA. HD5 and HBD1 variants’ solvation potential energy correlates with their antibacterial activity against Escherichia coli. Biopolymers 2016; 106:43-50. [DOI: 10.1002/bip.22763] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/27/2015] [Revised: 10/22/2015] [Accepted: 11/02/2015] [Indexed: 11/06/2022]
Affiliation(s)
- William F. Porto
- Programa De Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
- Centro De Análises Proteômicas E Bioquímicas, Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
| | - Diego O. Nolasco
- Programa De Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
- Curso De Física; Universidade Católica De Brasília; Brasília DF Brazil
| | - Állan S. Pires
- Programa De Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
- Centro De Análises Proteômicas E Bioquímicas, Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
| | - Gabriel R. Fernandes
- Programa De Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
| | - Octávio L. Franco
- Programa De Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
- Centro De Análises Proteômicas E Bioquímicas, Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
- S-Inova Biotech; Pos Graduação em Biotecnologia; Universidade Catolica Dom Bosco; Campo Grande Brazil
| | - Sérgio A. Alencar
- Programa De Pós-Graduação Em Ciências Genômicas E Biotecnologia; Universidade Católica De Brasília; Brasília- DF Brazil
| |
|
25
|
Heavner ME, Qiu WG, Cheng HP. Phylogenetic Co-Occurrence of ExoR, ExoS, and ChvI, Components of the RSI Bacterial Invasion Switch, Suggests a Key Adaptive Mechanism Regulating the Transition between Free-Living and Host-Invading Phases in Rhizobiales. PLoS One 2015; 10:e0135655. [PMID: 26309130 PMCID: PMC4550343 DOI: 10.1371/journal.pone.0135655] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 04/15/2015] [Accepted: 07/23/2015] [Indexed: 11/18/2022]
Abstract
Both bacterial symbionts and pathogens rely on their host-sensing mechanisms to activate the biosynthetic pathways necessary for their invasion into host cells. The Gram-negative bacterium Sinorhizobium meliloti relies on its RSI (ExoR-ExoS-ChvI) Invasion Switch to turn on the production of succinoglycan, an exopolysaccharide required for its host invasion. Recent whole-genome sequencing efforts have uncovered putative components of RSI-like invasion switches in many other symbiotic and pathogenic bacteria. To explore the possibility of the existence of a common invasion switch, we have conducted a phylogenomic survey of orthologous ExoR, ExoS, and ChvI tripartite sets in more than ninety proteobacterial genomes. Our analyses suggest that functional orthologs of the RSI invasion switch co-exist in Rhizobiales, an order characterized by numerous invasive species, but not in the order’s close relatives. Phylogenomic analyses and reconstruction of orthologous sets of the three proteins in Alphaproteobacteria confirm Rhizobiales-specific gene synteny and congruent RSI evolutionary histories. Evolutionary analyses further revealed site-specific substitutions correlated specifically to either animal-bacteria or plant-bacteria associations. Lineage restricted conservation of any one specialized gene is in itself an indication of species adaptation. However, the orthologous phylogenetic co-occurrence of all interacting partners within this single signaling pathway strongly suggests that the development of the RSI switch was a key adaptive mechanism. The RSI invasion switch, originally found in S. meliloti, is a characteristic of the Rhizobiales, and potentially a conserved crucial activation step that may be targeted to control host invasion by pathogenic bacterial species.
Affiliation(s)
- Mary Ellen Heavner
- Biochemistry Program, The Graduate Center, City University of New York, New York, New York, United States of America
| | - Wei-Gang Qiu
- Biological Sciences Department, Hunter College, City University of New York, New York, New York, United States of America
| | - Hai-Ping Cheng
- Biochemistry Program, The Graduate Center, City University of New York, New York, New York, United States of America
- Biological Sciences Department, Lehman College, City University of New York, Bronx, New York, United States of America
| |
|
26
|
Ndhlovu A, Hazelhurst S, Durand PM. Robust sequence alignment using evolutionary rates coupled with an amino acid substitution matrix. BMC Bioinformatics 2015; 16:255. [PMID: 26269100 PMCID: PMC4535666 DOI: 10.1186/s12859-015-0688-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2015] [Accepted: 07/29/2015] [Indexed: 11/27/2022]
Abstract
Background: Selective pressures at the DNA level shape genes into profiles consisting of patterns of rapidly evolving sites and sites withstanding change. These profiles remain detectable even when protein sequences become extensively diverged. A common task in molecular biology is to infer functional, structural or evolutionary relationships by querying a database using an algorithm. However, problems arise when sequence similarity is low. This study presents an algorithm that uses the evolutionary rate at codon sites, the dN/dS (ω) parameter, coupled to a substitution matrix as an alignment metric for detecting distantly related proteins. The algorithm, called BLOSUM-FIRE, couples a newer and improved version of the original FIRE (Functional Inference using Rates of Evolution) algorithm with an amino acid substitution matrix in a dynamic scoring function. The enigmatic hepatitis B virus X protein was used as a test case for BLOSUM-FIRE and its associated database EvoDB.
Results: The evolutionary rate based approach was coupled with a conventional BLOSUM substitution matrix. The two approaches are combined in a dynamic scoring function, which uses the selective pressure to score aligned residues. The dynamic scoring function is based on a coupled additive approach that scores aligned sites based on the level of conservation inferred from the ω values. Evaluation of the accuracy of this new implementation, BLOSUM-FIRE, using MAFFT alignments as reference alignments has shown that it is more accurate than its predecessor FIRE. Comparison of the alignment quality with widely used algorithms (MUSCLE, T-COFFEE, and CLUSTAL Omega) revealed that the BLOSUM-FIRE algorithm performs as well as conventional algorithms. Its main strength lies in that it provides greater potential for aligning divergent sequences and addresses the problem of low specificity inherent in the original FIRE algorithm. The utility of this algorithm is demonstrated using the Hepatitis B virus X (HBx) protein, a protein of unknown function, as a test case.
Conclusion: This study describes the utility of an evolutionary rate based approach coupled to the BLOSUM62 amino acid substitution matrix in inferring protein domain function. We demonstrate that such an approach is robust and performs as well as an array of conventional algorithms.
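The "coupled additive" scoring idea described in this abstract can be illustrated with a small sketch. This is not the published BLOSUM-FIRE scoring function: the bonus formula, the `weight` parameter, and the function names are assumptions made for illustration, and only a handful of BLOSUM62-style substitution scores are included.

```python
# Illustrative sketch only: NOT the published BLOSUM-FIRE scoring function.
# A few BLOSUM62-style entries, keyed by unordered residue pair.
SUBSTITUTION = {
    frozenset("A"): 4, frozenset("L"): 4, frozenset("I"): 4,
    frozenset("LI"): 2, frozenset("AL"): -1, frozenset("AI"): -1,
}

def conservation_bonus(omega_a, omega_b, weight=2.0):
    """Hypothetical bonus: reward aligned sites where both codons show
    purifying selection (omega = dN/dS < 1); no bonus otherwise."""
    worst = max(omega_a, omega_b)
    return weight * (1.0 - worst) if worst < 1.0 else 0.0

def coupled_score(res_a, res_b, omega_a, omega_b, weight=2.0):
    """Additive coupling: substitution score plus the omega-derived bonus."""
    base = SUBSTITUTION[frozenset((res_a, res_b))]
    return base + conservation_bonus(omega_a, omega_b, weight)
```

Under these assumptions, a conservative L/I pair at two strongly constrained sites scores higher than the substitution matrix alone would give, while a pair involving a site with ω ≥ 1 falls back to the plain matrix score.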
Affiliation(s)
- Andrew Ndhlovu
- Evolutionary Medicine Laboratory, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa; Sydney Brenner Institute of Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa.
| | - Scott Hazelhurst
- School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa; Sydney Brenner Institute of Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa.
| | - Pierre M Durand
- Evolutionary Medicine Laboratory, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa; Sydney Brenner Institute of Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa; Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA; Department of Biodiversity and Conservation Biology, Faculty of Natural Sciences, University of the Western Cape, Private Bag X17, Belville, Cape Town, 7530, South Africa.
| |
|
27
|
Porto WF, Franco OL, Alencar SA. Computational analyses and prediction of guanylin deleterious SNPs. Peptides 2015; 69:92-102. [PMID: 25899674 DOI: 10.1016/j.peptides.2015.04.013] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 03/03/2015] [Revised: 04/10/2015] [Accepted: 04/12/2015] [Indexed: 01/01/2023]
Abstract
Human guanylin, coded by the GUCA2A gene, is a member of a peptide family that activates intestinal membrane guanylate cyclase, regulating electrolyte and water transport in intestinal and renal epithelia. Deregulation of guanylin peptide activity has been associated with colon adenocarcinoma, adenoma and intestinal polyps. In addition, mutations in guanylin receptors are known to be potentially involved in meconium ileus. However, there are no previous works regarding the alterations driven by single nucleotide polymorphisms in guanylin peptides. A comprehensive in silico analysis of missense SNPs present in the GUCA2A gene was performed using 16 prediction tools in order to select the deleterious variations for further evaluation by molecular dynamics simulations (50 ns). Molecular dynamics data suggest that three out of five variants (Cys104Arg, Cys112Ser and Cys115Tyr) have undergone structural modifications in terms of flexibility, volume and/or solvation. In addition, two nonsense SNPs were identified, both preventing the formation of disulfide bonds and resulting in the synthesis of truncated proteins. In summary, the structural analysis of missense SNPs is important for reducing the number of potential mutations to be evaluated in vitro for association with genetic diseases. In addition, the data reported here could lead to a better understanding of structural and functional aspects of guanylin peptides.
Affiliation(s)
- William F Porto
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil
| | - Octávio L Franco
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil; C S-Inova, Pos-Graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, MS, Brazil.
| | - Sérgio A Alencar
- Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF, Brazil.
| |
|
28
|
Ahola T, Karlin DG. Sequence analysis reveals a conserved extension in the capping enzyme of the alphavirus supergroup, and a homologous domain in nodaviruses. Biol Direct 2015; 10:16. [PMID: 25886938 PMCID: PMC4392871 DOI: 10.1186/s13062-015-0050-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 12/29/2014] [Accepted: 03/24/2015] [Indexed: 12/16/2022]
Abstract
Background: Members of the alphavirus supergroup include human pathogens such as chikungunya virus, hepatitis E virus and rubella virus. They encode a capping enzyme with methyltransferase-guanylyltransferase (MTase-GTase) activity, which is an attractive drug target owing to its unique mechanism. However, its experimental study has proven very difficult.
Results: We examined over 50 genera of viruses by sequence analyses. Earlier studies showed that the MTase-GTase contains a “Core” region conserved in sequence. We show that it is followed by a long extension, which we termed “Iceberg” region, whose secondary structure, but not sequence, is strikingly conserved throughout the alphavirus supergroup. Sequence analyses strongly suggest that the minimal capping domain corresponds to the Core and Iceberg regions combined, which is supported by earlier experimental data. The Iceberg region contains all known membrane association sites that contribute to the assembly of viral replication factories. We predict that it may also contain an overlooked, widely conserved membrane-binding amphipathic helix. Unexpectedly, we detected a sequence homolog of the alphavirus MTase-GTase in taxa related to nodaviruses and to chronic bee paralysis virus. The presence of a capping enzyme in nodaviruses is biologically consistent, since they have capped genomes but replicate in the cytoplasm, where no cellular capping enzyme is present. The putative MTase-GTase domain of nodaviruses also contains membrane-binding sites that may drive the assembly of viral replication factories, revealing an unsuspected parallel with the alphavirus supergroup.
Conclusions: Our work will guide the functional analysis of the alphaviral MTase-GTase and the production of domains for structure determination. The identification of a homologous domain in a simple model system, nodaviruses, which replicate in numerous eukaryotic cell systems (yeast, flies, worms, mammals, and plants), can further help crack the function and structure of the enzyme.
Reviewers: This article was reviewed by Valerian Dolja, Eugene Koonin and Sebastian Maurer-Stroh.
Electronic supplementary material: The online version of this article (doi:10.1186/s13062-015-0050-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Tero Ahola
- Department of Food and Environmental Sciences, University of Helsinki, 00014, Helsinki, Finland.
| | - David G Karlin
- Department of Zoology, University of Oxford, Oxford, OX1 3PS, UK; The Division of Structural Biology, Henry Wellcome Building, Roosevelt Drive, Oxford, OX3 7BN, UK.
| |
|
29
|
Meier A, Söding J. Context similarity scoring improves protein sequence alignments in the midnight zone. Bioinformatics 2014; 31:674-81. [PMID: 25338715 DOI: 10.1093/bioinformatics/btu697] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
MOTIVATION: High-quality protein sequence alignments are essential for a number of downstream applications such as template-based protein structure prediction. In addition to the similarity score between sequence profile columns, many current profile-profile alignment tools use extra terms that compare 1D-structural properties such as secondary structure and solvent accessibility, which are predicted from short profile windows around each sequence position. Such scores add non-redundant information by evaluating the conservation of local patterns of hydrophobicity and other amino acid properties and thus exploiting correlations between profile columns. RESULTS: Here, instead of predicting and comparing known 1D properties, we follow an agnostic approach. We learn in an unsupervised fashion a set of maximally conserved patterns represented by 13-residue sequence profiles, without the need to know the cause of the conservation of these patterns. We use a maximum likelihood approach to train a set of 32 such profiles that can best represent patterns conserved within pairs of remotely homologous, structurally aligned training profiles. We include the new context score in our HMM-HMM alignment tool HHsearch and significantly improve alignment quality, especially for difficult alignments. CONCLUSION: The context similarity score improves the quality of homology models and other methods that depend on accurate pairwise alignments.
Affiliation(s)
- Armin Meier
- Gene Center, LMU Munich, 81377 Munich and Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| | - Johannes Söding
- Gene Center, LMU Munich, 81377 Munich and Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| |
|
30
|
Ma J, Wang S, Wang Z, Xu J. MRFalign: protein homology detection through alignment of Markov random fields. PLoS Comput Biol 2014; 10:e1003500. [PMID: 24675572 PMCID: PMC3967925 DOI: 10.1371/journal.pcbi.1003500] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 10/27/2013] [Accepted: 01/08/2014] [Indexed: 11/24/2022]
Abstract
Sequence-based protein homology detection has been extensively studied, and so far the most sensitive method is based upon comparison of protein sequence profiles, which are derived from multiple sequence alignment (MSA) of sequence homologs in a protein family. A sequence profile is usually represented as a position-specific scoring matrix (PSSM) or an HMM (Hidden Markov Model), and accordingly PSSM-PSSM or HMM-HMM comparison is used for homolog detection. This paper presents a new homology detection method, MRFalign, consisting of three key components: 1) a Markov Random Fields (MRF) representation of a protein family; 2) a scoring function measuring similarity of two MRFs; and 3) an efficient ADMM (Alternating Direction Method of Multipliers) algorithm aligning two MRFs. Compared to HMMs, which can only model very short-range residue correlation, MRFs can model long-range residue interaction patterns and thus encode information for the global 3D structure of a protein family. Consequently, MRF-MRF comparison for remote homology detection shall be much more sensitive than HMM-HMM or PSSM-PSSM comparison. Experiments confirm that MRFalign outperforms several popular HMM or PSSM-based methods in terms of both alignment accuracy and remote homology detection and that MRFalign works particularly well for mainly beta proteins. For example, tested on the benchmark SCOP40 (8353 proteins) for homology detection, PSSM-PSSM and HMM-HMM succeed on 48% and 52% of proteins, respectively, at superfamily level, and on 15% and 27% of proteins, respectively, at fold level. In contrast, MRFalign succeeds on 57.3% and 42.5% of proteins at superfamily and fold level, respectively. This study implies that long-range residue interaction patterns are very helpful for sequence-based homology detection. The software is available for download at http://raptorx.uchicago.edu/download/. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2–5.
Sequence-based protein homology detection has been extensively studied, but it remains very challenging for remote homologs with divergent sequences. So far the most sensitive methods employ HMM-HMM comparison, which models a protein family using an HMM (Hidden Markov Model) and then detects homologs using HMM-HMM alignment. An HMM cannot model long-range residue interaction patterns and thus carries very little information regarding the global 3D structure of a protein family. As such, HMM comparison is not sensitive enough for distantly related homologs. In this paper, we present an MRF-MRF comparison method for homology detection. In particular, we model a protein family using Markov Random Fields (MRF) and then detect homologs by MRF-MRF alignment. Compared to HMMs, MRFs are able to model long-range residue interaction patterns and thus contain information for the overall 3D structure of a protein family. Consequently, MRF-MRF comparison is much more sensitive than HMM-HMM comparison. To implement MRF-MRF comparison, we have developed a new scoring function to measure the similarity of two MRFs and also an efficient ADMM algorithm to optimize the scoring function. Experiments confirm that MRF-MRF comparison indeed outperforms HMM-HMM comparison in terms of both alignment accuracy and remote homology detection, especially for mainly beta proteins.
Affiliation(s)
- Jianzhu Ma
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Sheng Wang
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Zhiyong Wang
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| |
|
31
|
Friedman R. Drug resistance missense mutations in cancer are subject to evolutionary constraints. PLoS One 2013; 8:e82059. [PMID: 24376513 PMCID: PMC3869674 DOI: 10.1371/journal.pone.0082059] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 09/06/2013] [Accepted: 10/29/2013] [Indexed: 12/20/2022]
Abstract
Several tumour types are sensitive to deactivation of just one or very few genes that are constantly active in the cancer cells, a phenomenon that is termed ‘oncogene addiction’. Drugs that target the products of those oncogenes can yield a temporary relief, and even complete remission. Unfortunately, many patients receiving oncogene-targeted therapies relapse on treatment. This often happens due to somatic mutations in the oncogene (‘resistance mutations’). ‘Compound mutations’, which in the context of cancer drug resistance are defined as two or more mutations of the drug target in the same clone, may lead to enhanced resistance against the most selective inhibitors. Here, it is shown that the vast majority of the resistance mutations occurring in cancer patients treated with tyrosine kinase inhibitors aimed at three different proteins follow an evolutionary pathway. Using bioinformatic analysis tools, it is found that the drug-resistance mutations in the tyrosine kinase domains of Abl1, ALK and exons 20 and 21 of EGFR favour transformations to residues that can be identified in similar positions in evolutionarily related proteins. The results demonstrate that evolutionary pressure shapes the mutational landscape in the case of drug-resistance somatic mutations. The constraints on the mutational landscape suggest that it may be possible to counter single drug-resistance point mutations. The observation of relatively many resistance mutations in Abl1, but not in the other genes, is explained by the fact that mutations in Abl1 tend to be biochemically conservative, whereas mutations in EGFR and ALK tend to be radical. Analysis of Abl1 compound mutations suggests that such mutations are more prevalent than hitherto reported and may be more difficult to counter. This supports the notion that such mutations may provide an escape route for targeted cancer drug resistance.
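The central test in this abstract, whether a resistance mutation changes a residue into one already seen at the equivalent position in related proteins, can be sketched as follows. This is a toy illustration, not the paper's analysis pipeline: the function names, positions, alignment columns, and mutations are all invented for the example.

```python
def seen_in_homologs(column, mutant):
    """True if the mutant residue occurs at this aligned position in any
    related protein. `column` holds one character per homolog; '-' is a gap."""
    return mutant in set(column) - {"-"}

def fraction_evolutionarily_observed(mutations, columns):
    """Share of mutations whose new residue is found in homologs.
    `mutations` is a list of (position, wild_type, mutant) tuples and
    `columns` maps each position to its alignment column."""
    hits = sum(seen_in_homologs(columns[pos], mut) for pos, _wt, mut in mutations)
    return hits / len(mutations)

# Hypothetical toy data: two alignment columns and three mutations.
columns = {10: "TIITV", 20: "YYFH-"}
mutations = [(10, "T", "I"), (20, "Y", "H"), (20, "Y", "W")]
observed = fraction_evolutionarily_observed(mutations, columns)
```

With this made-up input, two of the three mutations land on residues present in the homolog columns. A fraction well above what random substitution would give is the kind of signal the abstract describes as an evolutionary constraint.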
Affiliation(s)
- Ran Friedman
- Department of Chemistry and Biomedical Sciences, Linnæus University, Kalmar, Sweden
- Linnæus University Centre for Biomaterials Chemistry, Linnæus University, Kalmar, Sweden
| |
|
32
|
Yamada K, Tomii K. Revisiting amino acid substitution matrices for identifying distantly related proteins. Bioinformatics 2013; 30:317-25. [PMID: 24281694 PMCID: PMC3904525 DOI: 10.1093/bioinformatics/btt694] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/04/2022]
Abstract
Motivation: Although many amino acid substitution matrices have been developed, it has not been well understood which is the best for similarity searches, especially for remote homology detection. Therefore, we collected information related to existing matrices, condensed it and derived a novel matrix that can detect more remote homology than ever. Results: Using principal component analysis with existing matrices and benchmarks, we developed a novel matrix, which we designate as MIQS. The detection performance of MIQS is validated and compared with that of existing general purpose matrices using SSEARCH with optimized gap penalties for each matrix. Results show that MIQS is able to detect more remote homology than the existing matrices on an independent dataset. In addition, the performance of our developed matrix was superior to that of CS-BLAST, which was a novel similarity search method with no amino acid matrix. We also evaluated the alignment quality of matrices and methods, which revealed that MIQS shows higher alignment sensitivity than the existing matrix series and CS-BLAST. Fundamentally, these results are expected to constitute good proof of the availability and/or importance of amino acid matrices in sequence analysis. Moreover, with our developed matrix, sophisticated similarity search methods such as sequence–profile and profile–profile comparison methods can be improved further. Availability and implementation: Newly developed matrices and datasets used for this study are available at http://csas.cbrc.jp/Ssearch/. Contact: k-tomii@aist.go.jp Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kazunori Yamada
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
| | | |
|
33
|
Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently "orphan" viral proteins. J Virol 2013; 88:10-20. [PMID: 24155369 PMCID: PMC3911697 DOI: 10.1128/jvi.02595-13] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.9] [Indexed: 12/20/2022]
Abstract
The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.
|
34
|
Celniker G, Nimrod G, Ashkenazy H, Glaser F, Martz E, Mayrose I, Pupko T, Ben-Tal N. ConSurf: Using Evolutionary Data to Raise Testable Hypotheses about Protein Function. Isr J Chem 2013. [DOI: 10.1002/ijch.201200096] [Citation(s) in RCA: 397] [Impact Index Per Article: 33.1] [Indexed: 11/08/2022]
|
35
|
Pierce S, Gersak K, Michaelson-Cohen R, Walsh T, Lee M, Malach D, Klevit R, King MC, Levy-Lahad E. Mutations in LARS2, encoding mitochondrial leucyl-tRNA synthetase, lead to premature ovarian failure and hearing loss in Perrault syndrome. Am J Hum Genet 2013; 92:614-20. [PMID: 23541342 DOI: 10.1016/j.ajhg.2013.03.007] [Citation(s) in RCA: 163] [Impact Index Per Article: 13.6] [Received: 12/08/2012] [Revised: 02/25/2013] [Accepted: 03/11/2013] [Indexed: 11/25/2022]
Abstract
The genetic causes of premature ovarian failure (POF) are highly heterogeneous, and causative mutations have been identified in more than ten genes so far. In two families affected by POF accompanied by hearing loss (together, these symptoms compose Perrault syndrome), exome sequencing revealed mutations in LARS2, encoding mitochondrial leucyl-tRNA synthetase: homozygous c.1565C>A (p.Thr522Asn) in a consanguineous Palestinian family and compound heterozygous c.1077delT and c.1886C>T (p.Thr629Met) in a nonconsanguineous Slovenian family. LARS2 c.1077delT leads to a frameshift at codon 360 of the 901 residue protein. LARS2 p.Thr522Asn occurs in the LARS2 catalytic domain at a site conserved from bacteria through mammals. LARS2 p.Thr629Met occurs in the LARS2 leucine-specific domain, which is adjacent to a catalytic loop critical in all species but for which primary sequence is not well conserved. A recently developed method of detecting remote homologies revealed threonine at this site in consensus sequences derived from multiple-species alignments seeded by human and E. coli residues at this region. Yeast complementation indicated that LARS2 c.1077delT is nonfunctional and that LARS2 p.Thr522Asn is partially functional. LARS2 p.Thr629Met was functional in this assay but might be insufficient as a heterozygote with the fully nonfunctional LARS2 c.1077delT allele. A known C. elegans strain with the protein-truncating alteration LARS-2 p.Trp247Ter was confirmed to be sterile. After HARS2, LARS2 is the second gene encoding mitochondrial tRNA synthetase to be found to harbor mutations leading to Perrault syndrome, further supporting a critical role for mitochondria in the maintenance of ovarian function and hearing.
|