1
Hernández-Zambrano LJ, Alfonso-González H, Buitrago SP, Castro-Cavadía CJ, Garzón-Ospina D. Exploring the genetic diversity pattern of PvEBP/DBP2: A promising candidate for an effective Plasmodium vivax vaccine. Acta Trop 2024; 255:107231. [PMID: 38685340 DOI: 10.1016/j.actatropica.2024.107231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 04/25/2024] [Accepted: 04/26/2024] [Indexed: 05/02/2024]
Abstract
Malaria remains a public health challenge. Since many control strategies have proven ineffective in eradicating this disease, new strategies are required, among which the design of a multivalent vaccine stands out. However, the effectiveness of this strategy has been hindered, among other reasons, by the genetic diversity observed in parasite antigens. In Plasmodium vivax, the Erythrocyte Binding Protein (PvEBP, also known as DBP2) is an alternate ligand to Duffy Binding Protein (DBP); given its structural resemblance to DBP, EBP/DBP2 is proposed as a promising antigen for inclusion in vaccine design. However, the extent of genetic diversity within the locus encoding this protein has not been comprehensively assessed. Thus, this study aimed to characterize the genetic diversity of the locus encoding the P. vivax EBP/DBP2 protein and to determine the evolutionary mechanisms modulating this diversity. Several intrapopulation genetic variation parameters were estimated using 36 gene sequences of PvEBP/DBP2 from Colombian P. vivax clinical isolates and 186 sequences available in databases. The study then evaluated the worldwide genetic structure and the evolutionary forces that may influence the observed patterns of genetic variation. It was found that the PvEBP/DBP2 gene exhibits one of the lowest levels of genetic diversity compared to other vaccine-candidate antigens. Four major haplotypes were shared worldwide. Analysis of the protein's 3D structure and epitope prediction identified five regions with potential antigenic properties. The results suggest that the PvEBP/DBP2 protein possesses ideal characteristics to be considered when designing a multivalent effective antimalarial vaccine against P. vivax.
Affiliation(s)
- Laura J Hernández-Zambrano
- Grupo de Estudios en Genética y Biología Molecular (GEBIMOL), School of Biological Sciences, Universidad Pedagógica y Tecnológica de Colombia - UPTC, Tunja, Boyacá, Colombia; Population Genetics And Molecular Evolution (PGAME), Fundación Scient, Tunja, Boyacá, Colombia
- Heliairis Alfonso-González
- Grupo de Estudios en Genética y Biología Molecular (GEBIMOL), School of Biological Sciences, Universidad Pedagógica y Tecnológica de Colombia - UPTC, Tunja, Boyacá, Colombia; Population Genetics And Molecular Evolution (PGAME), Fundación Scient, Tunja, Boyacá, Colombia
- Sindy P Buitrago
- Grupo de Estudios en Genética y Biología Molecular (GEBIMOL), School of Biological Sciences, Universidad Pedagógica y Tecnológica de Colombia - UPTC, Tunja, Boyacá, Colombia; Population Genetics And Molecular Evolution (PGAME), Fundación Scient, Tunja, Boyacá, Colombia
- Carlos J Castro-Cavadía
- Grupo de Investigaciones Microbiológicas y Biomédicas de Córdoba (GIMBIC), School of Health Sciences, Universidad de Córdoba, Montería, Córdoba, Colombia
- Diego Garzón-Ospina
- Grupo de Estudios en Genética y Biología Molecular (GEBIMOL), School of Biological Sciences, Universidad Pedagógica y Tecnológica de Colombia - UPTC, Tunja, Boyacá, Colombia; Population Genetics And Molecular Evolution (PGAME), Fundación Scient, Tunja, Boyacá, Colombia.
2
Ji J, Carpentier B, Chakraborty A, Nangia S. An Affordable Topography-Based Protocol for Assigning a Residue's Character on a Hydropathy (PARCH) Scale. J Chem Theory Comput 2024; 20:1656-1672. [PMID: 37018141 PMCID: PMC10902853 DOI: 10.1021/acs.jctc.3c00106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Indexed: 04/06/2023]
Abstract
The hydropathy of proteins or quantitative assessment of protein-water interactions has been a topic of interest for decades. Most hydropathy scales use a residue-based or atom-based approach to assign fixed numerical values to the 20 amino acids and categorize them as hydrophilic, hydroneutral, or hydrophobic. These scales overlook the protein's nanoscale topography, such as bumps, crevices, cavities, clefts, pockets, and channels, in calculating the hydropathy of the residues. Some recent studies have included protein topography in determining hydrophobic patches on protein surfaces, but these methods do not provide a hydropathy scale. To overcome the limitations in the existing methods, we have developed a Protocol for Assigning a Residue's Character on the Hydropathy (PARCH) scale that adopts a holistic approach to assigning the hydropathy of a residue. The parch scale evaluates the collective response of the water molecules in the protein's first hydration shell to increasing temperatures. We performed the parch analysis of a set of well-studied proteins that include the following: enzymes, immune proteins, and integral membrane proteins, as well as fungal and virus capsid proteins. Since the parch scale evaluates every residue based on its location, a residue may have very different parch values inside a crevice versus a surface bump. Thus, a residue can have a range of parch values (or hydropathies) dictated by the local geometry. The parch scale calculations are computationally inexpensive and can compare hydropathies of different proteins. The parch analysis can affordably and reliably aid in designing nanostructured surfaces, identifying hydrophilic and hydrophobic patches, and drug discovery.
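As a point of contrast with the topography-aware approach in this abstract, the sketch below illustrates the conventional kind of scale the authors describe, in which each amino acid receives one fixed value regardless of its location. It uses the Kyte-Doolittle values as an example (an assumption; the abstract does not name a specific scale), and the function names are hypothetical. It is not the PARCH protocol.

```python
# Kyte-Doolittle hydropathy values: one fixed number per amino acid,
# independent of where the residue sits in the folded protein.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average the fixed per-residue values over a sequence (hypothetical helper)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def sliding_hydropathy(sequence, window=9):
    """Mean hydropathy of every full-length window along the sequence."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    peptide = "MKTLLVAGG"  # toy sequence, for illustration only
    print(round(mean_hydropathy(peptide), 2))
    print([round(v, 2) for v in sliding_hydropathy(peptide, window=5)])
```

Because the lookup depends only on residue identity, the same residue always scores the same in a crevice or on a surface bump, which is the limitation the abstract points to.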
Affiliation(s)
- Jingjing Ji
- Department of Biomedical and Chemical Engineering, Syracuse University, Syracuse, New York 13244, United States
- Britnie Carpentier
- Department of Biomedical and Chemical Engineering, Syracuse University, Syracuse, New York 13244, United States
- Arindam Chakraborty
- Department of Chemistry, Syracuse University, Syracuse, New York 13244, United States
- Shikha Nangia
- Department of Biomedical and Chemical Engineering, Syracuse University, Syracuse, New York 13244, United States
3
Kumar N, Bajiya N, Patiyal S, Raghava GPS. Multi-perspectives and challenges in identifying B-cell epitopes. Protein Sci 2023; 32:e4785. [PMID: 37733481 PMCID: PMC10578127 DOI: 10.1002/pro.4785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/11/2023] [Accepted: 09/16/2023] [Indexed: 09/23/2023]
Abstract
The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).
Affiliation(s)
- Nishant Kumar
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Nisha Bajiya
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P. S. Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
4
Satvati S, Ghasemi Y, Najafipour S, Eskandari S, Mahmoodi S, Nezafat N, Hashemzaei M. Finding and engineering the newly found bacterial superoxide dismutase enzyme to increase its thermostability and decrease the immunogenicity: a computational and experimental research. Arch Microbiol 2023; 205:260. [PMID: 37291420 DOI: 10.1007/s00203-023-03601-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 05/23/2023] [Accepted: 05/29/2023] [Indexed: 06/10/2023]
Abstract
Superoxide dismutase (SOD) is one of the most important antioxidant enzymes that can reduce oxidative stress in the cell environment. Nowadays, bacterial sources of the enzyme are commercially applicable in the cosmetics and pharmaceutical industries, but the allergenic effect of proteins from non-human sources has been mentioned as a disadvantage of these kinds of enzymes. In this study, to find a suitable bacterial SOD candidate for decreasing immunogenicity, the sequences of five thermophilic bacteria were selected as reference species. Then, linear and conformational B-cell epitopes of the SOD were analyzed by different servers. The stability and immunogenicity of mutant positions were also evaluated. The mutant gene was inserted into the pET-23a expression vector and transformed into E. coli BL21 (DE3) for expression of the recombinant enzyme. Afterward, the expression of the mutant enzyme was evaluated by SDS-PAGE analysis and the recombinant enzyme activity was assessed. Anoxybacillus gonensis was selected as a reasonable SOD source according to BLAST search, physicochemical properties analysis, and prediction of allergenic features. According to our results, five residues, E84, E142, K144, G147, and M148, were predicted as candidates for mutagenesis. Finally, K144A was chosen as the final modification because it increased the stability of the enzyme and decreased its immunogenicity. The enzyme activity was 240 U/ml at room temperature. Alteration of K144 to alanine caused increased stability of the enzyme. In silico studies confirmed that the protein was non-antigenic after mutation.
Affiliation(s)
- Saha Satvati
- Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Younes Ghasemi
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Sohrab Najafipour
- Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Department of Tissue Engineering, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Sedigheh Eskandari
- Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Shirin Mahmoodi
- Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Navid Nezafat
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Pharmaceutical Science Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Masoud Hashemzaei
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Computational vaccine and Drug Design Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
5
Hollebrands B, Hageman JA, van de Sande JW, Albada B, Janssen HG. Improved LC-MS identification of short homologous peptides using sequence-specific retention time predictors. Anal Bioanal Chem 2023; 415:2715-2726. [PMID: 37000211 PMCID: PMC10185643 DOI: 10.1007/s00216-023-04670-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 03/17/2023] [Accepted: 03/21/2023] [Indexed: 04/01/2023]
Abstract
Peptides are an important group of compounds contributing to the desired, as well as the undesired taste of a food product. Their taste impressions can include aspects of sweetness, bitterness, savoury, umami and many other impressions depending on the amino acids present as well as their sequence. Identification of short peptides in foods is challenging. We developed a method to assign identities to short peptides including homologous structures, i.e. peptides containing the same amino acids with a different sequence order, by accurate prediction of the retention times during reversed phase separation. To train the method, a large set of well-defined short peptides with systematic variations in the amino acid sequence was prepared by a novel synthesis strategy called 'swapped-sequence synthesis'. Additionally, several proteins were enzymatically digested to yield short peptides. Experimental retention times were determined after reversed phase separation and peptide MS2 data was acquired using a high-resolution mass spectrometer operated in data-dependent acquisition mode (DDA). A support vector regression model was trained using a combination of existing sequence-independent peptide descriptors and a newly derived set of selected amino acid index derived sequence-specific peptide (ASP) descriptors. The model was trained and validated using the experimental retention times of the 713 small food-relevant peptides prepared. Whilst selecting the most useful ASP descriptors for our model, special attention was given to predict the retention time differences between homologous peptide structures. Inclusion of ASP descriptors greatly improved the ability to accurately predict retention times, including retention time differences between 157 homologous peptide pairs. The final prediction model had a goodness-of-fit (Q2) of 0.94; moreover for 93% of the short peptides, the elution order was correctly predicted.
Affiliation(s)
- Boudewijn Hollebrands
- Unilever Foods Innovation Centre - Hive, Bronland 14, 6708 WH, Wageningen, the Netherlands
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
- Jos A Hageman
- Wageningen University & Research, Biometris, P.O. Box 16, 6700 AA, Wageningen, the Netherlands
- Jasper W van de Sande
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
- Bauke Albada
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
- Hans-Gerd Janssen
- Unilever Foods Innovation Centre - Hive, Bronland 14, 6708 WH, Wageningen, the Netherlands
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
6
Ataides LS, de Moraes Maia F, Conte FP, Isaac L, Barbosa AS, da Costa Lima-Junior J, Avelar KES, Rodrigues-da-Silva RN. Sph2 (176-191) and Sph2 (446-459): Identification of B-Cell Linear Epitopes in Sphingomyelinase 2 (Sph2), Naturally Recognized by Patients Infected by Pathogenic Leptospires. Vaccines (Basel) 2023; 11:vaccines11020359. [PMID: 36851237 PMCID: PMC9959207 DOI: 10.3390/vaccines11020359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 01/30/2023] [Accepted: 02/01/2023] [Indexed: 02/09/2023] Open
Abstract
Sphingomyelin is a major constituent of eukaryotic cell membranes, and its degradation by bacterial sphingomyelinases may contribute to the pathogenesis of infection. Among Leptospira spp., there are five sphingomyelinases exclusively expressed by pathogenic leptospires, among which Sph2 is expressed during natural infections, is cytotoxic, and is implicated in the hemorrhagic complications of leptospirosis. Considering this and the lack of information about associations between Sph2 and leptospirosis severity, we use a combination of immunoinformatics approaches to identify its B-cell epitopes, evaluate their reactivity against samples from leptospirosis patients, and investigate the role of anti-Sph2 antibodies in protection against severe leptospirosis. Two B-cell epitopes, Sph2(176-191) and Sph2(446-459), were predicted in Sph2 from L. interrogans serovar Lai, presenting different levels of identity when compared with other pathogenic leptospires. These epitopes were recognized by about 40% of studied patients with a prevalence of IgG antibodies against both Sph2(176-191) and Sph2(446-459). Remarkably, only individuals with low reactivity to Sph2(176-191) presented clinical complications, while high responders had only mild symptoms. Therefore, we identified two B-cell linear epitopes, recognized by antibodies of patients with leptospirosis, that could be further explored in the development of multi-epitope vaccines against leptospirosis.
Affiliation(s)
- Laura Sant’Anna Ataides
- Laboratório de Tecnologia Imunológica, Instituto de Tecnologia em Imunobiológicos, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Fernanda de Moraes Maia
- Laboratório de Tecnologia Imunológica, Instituto de Tecnologia em Imunobiológicos, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Fernando Paiva Conte
- Laboratório Piloto Eucariotos, Instituto de Tecnologia em Imunobiológicos, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Lourdes Isaac
- Departamento de Imunologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo 05508-000, SP, Brazil
- Angela Silva Barbosa
- Laboratório de Bacteriologia, Instituto Butantan, São Paulo 05503-900, SP, Brazil
- Josué da Costa Lima-Junior
- Laboratório de Imunoparasitologia, Instituto Oswaldo Cruz, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Kátia Eliane Santos Avelar
- Laboratório de Referência Nacional para Leptospirose, Instituto Oswaldo Cruz, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Rodrigo Nunes Rodrigues-da-Silva
- Laboratório de Tecnologia Imunológica, Instituto de Tecnologia em Imunobiológicos, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Correspondence: Tel.: +55-21982054291
7
Carballo GM, Vázquez KG, García-González LA, Rio GD, Brizuela CA. Embedded-AMP: A Multi-Thread Computational Method for the Systematic Identification of Antimicrobial Peptides Embedded in Proteome Sequences. Antibiotics (Basel) 2023; 12:antibiotics12010139. [PMID: 36671338 PMCID: PMC9854971 DOI: 10.3390/antibiotics12010139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 01/03/2023] [Accepted: 01/05/2023] [Indexed: 01/12/2023] Open
Abstract
Antimicrobial peptides (AMPs) have gained the attention of the research community for being an alternative to conventional antimicrobials to fight antibiotic resistance and for displaying other pharmacologically relevant activities, such as cell penetration, autophagy induction, and immunomodulation, among others. The identification of AMPs has been accomplished by combining computational and experimental approaches and has been mostly restricted to self-contained peptides, despite accumulated evidence indicating that AMPs may be found embedded within proteins whose functions are not necessarily associated with antimicrobials. To address this limitation, we propose a machine-learning (ML)-based pipeline to identify AMPs that are embedded in proteomes. Our method performs an in-silico digestion of every protein in the proteome to generate unique k-mers of different lengths, computes a set of molecular descriptors for each k-mer, and performs an antimicrobial activity prediction. To show the efficiency of the method we used the shrimp proteome, and the pipeline analyzed all k-mers between 10 and 60 amino acids in length to predict all AMPs in less than 20 min. As an application example we predicted AMPs in different rodents (common cuy, common rat, and naked mole rat) with different reported longevities and found a relation between species longevity and the number of predicted AMPs. The analysis shows that the higher the longevity of the species, the higher the number of predicted AMPs. The pipeline is available as a web service.
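The in-silico digestion step described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `unique_kmers`, the dictionary-based proteome, and the toy sequences are assumptions, and the descriptor computation and activity prediction steps are omitted.

```python
def unique_kmers(proteome, k_min=10, k_max=60):
    """Collect every distinct k-mer with k_min <= k <= k_max.

    `proteome` is assumed to map protein identifiers to amino acid
    sequences; a set removes k-mers shared between proteins.
    """
    kmers = set()
    for sequence in proteome.values():
        # A protein shorter than k_max only yields k-mers up to its own length.
        for k in range(k_min, min(k_max, len(sequence)) + 1):
            for start in range(len(sequence) - k + 1):
                kmers.add(sequence[start:start + k])
    return kmers


if __name__ == "__main__":
    # Toy proteome with short windows so the result is easy to inspect.
    toy_proteome = {"protA": "MKTLLV", "protB": "KTLLVA"}
    for kmer in sorted(unique_kmers(toy_proteome, k_min=5, k_max=6)):
        print(kmer)
```

The default window of 10 to 60 residues matches the range quoted in the abstract. Each resulting k-mer would then be passed to a descriptor and prediction stage, which is outside the scope of this sketch.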
Affiliation(s)
- Karen Guerrero Vázquez
- Computer Science Department, CICESE Research Center, Ensenada 22860, Mexico
- School of Mathematical & Statistical Sciences, University of Galway, H91 TK33 Galway, Ireland
- Gabriel Del Rio
- Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, UNAM, Mexico City 04510, Mexico
- Correspondence: (G.D.R.); (C.A.B.)
- Carlos A. Brizuela
- Computer Science Department, CICESE Research Center, Ensenada 22860, Mexico
- Correspondence: (G.D.R.); (C.A.B.)
8
Fantini J, Chahinian H, Yahi N. A Vaccine Strategy Based on the Identification of an Annular Ganglioside Binding Motif in Monkeypox Virus Protein E8L. Viruses 2022; 14:v14112531. [PMID: 36423140 PMCID: PMC9693861 DOI: 10.3390/v14112531] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/08/2022] [Revised: 11/11/2022] [Accepted: 11/14/2022] [Indexed: 11/18/2022] Open
Abstract
The recent outbreak of Monkeypox virus requires the development of a vaccine specifically directed against this virus as quickly as possible. We propose here a new strategy based on a two-step analysis combining (i) the search for binding domains of viral proteins to gangliosides present in lipid rafts of host cells, and (ii) B epitope predictions. Based on previous studies of HIV and SARS-CoV-2 proteins, we show that the Monkeypox virus cell surface-binding protein E8L possesses a ganglioside-binding motif consisting of several subsites forming a ring structure. The binding of the E8L protein to a cluster of gangliosides GM1 mimicking a lipid raft domain is driven by both shape and electrostatic surface potential complementarities. An induced-fit mechanism unmasks selected amino acid side chains of the motif without significantly affecting the secondary structure of the protein. The ganglioside-binding motif overlaps three potential linear B epitopes that are well exposed on the unbound E8L surface that faces the host cell membrane. This situation is ideal for generating neutralizing antibodies. We thus suggest using these three sequences derived from the E8L protein as immunogens in a vaccine formulation (recombinant protein, synthetic peptides or genetically based) specific for Monkeypox virus. This lipid raft/ganglioside-based strategy could be used for developing therapeutic and vaccine responses to future virus outbreaks, in parallel to existing solutions.
9
Mitogenome selection in the evolution of key ecological strategies in the ancient hexapod class Collembola. Sci Rep 2022; 12:14810. [PMID: 36045215 PMCID: PMC9433435 DOI: 10.1038/s41598-022-18407-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2021] [Accepted: 08/10/2022] [Indexed: 11/09/2022] Open
Abstract
A longstanding question in evolutionary biology is how natural selection and environmental pressures shape the mitochondrial genomic architectures of organisms. Mitochondria play a pivotal role in cellular respiration and aerobic metabolism, making their genomes functionally highly constrained. Evaluating selective pressures on mitochondrial genes can provide functional and ecological insights into the evolution of organisms. Collembola (springtails) are an ancient hexapod group that includes the oldest terrestrial arthropods in the fossil record, and that are closely associated with soil environments. Of interest is the diversity of habitat stratification preferences (life forms) exhibited by different species within the group. To understand whether signals of positive selection are linked to the evolution of life forms, we analysed 32 published Collembola mitogenomes in a phylomitogenomic framework. We found no evidence that signatures of selection are correlated with the evolution of novel life forms, but rather that mutations have accumulated as a function of time. Our results highlight the importance of nuclear-mitochondrial interactions in the evolution of collembolan life forms and that mitochondrial genomic data should be interpreted with caution, as complex selection signals may complicate evolutionary inferences.
10
Abstract
Antibodies and T cell receptors (TCRs) are the fundamental building blocks of adaptive immunity. Repertoire-scale functionality derives from their epitope-binding properties, just as macroscopic properties like temperature derive from microscopic molecular properties. However, most approaches to repertoire-scale measurement, including sequence diversity and entropy, are not based on antibody or TCR function in this way. Thus, they potentially overlook key features of immunological function. Here we present a framework that describes repertoires in terms of the epitope-binding properties of their constituent antibodies and TCRs, based on analysis of thousands of antibody-antigen and TCR-peptide-major-histocompatibility-complex binding interactions and over 400 high-throughput repertoires. We show that repertoires consist of loose overlapping classes of antibodies and TCRs with similar binding properties. We demonstrate the potential of this framework to distinguish specific responses vs. bystander activation in influenza vaccinees, stratify cytomegalovirus (CMV)-infected cohorts, and identify potential immunological "super-agers." Classes add a valuable dimension to the assessment of immune function.
11
Caldararo F, Di Giulio M. The genetic code is very close to a global optimum in a model of its origin taking into account both the partition energy of amino acids and their biosynthetic relationships. Biosystems 2022; 214:104613. [DOI: 10.1016/j.biosystems.2022.104613] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/19/2021] [Revised: 01/16/2022] [Accepted: 01/17/2022] [Indexed: 01/23/2023]
12
Yang Y, Zeng L, Vihinen M. PON-Sol2: Prediction of Effects of Variants on Protein Solubility. Int J Mol Sci 2021; 22:8027. [PMID: 34360790 PMCID: PMC8348231 DOI: 10.3390/ijms22158027] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/11/2021] [Revised: 07/19/2021] [Accepted: 07/22/2021] [Indexed: 01/13/2023] Open
Abstract
Genetic variations have a multitude of effects on proteins. A substantial number of variations affect protein-solvent interactions, either aggregation or solubility. Aggregation is often related to structural alterations, whereas solubilizable proteins in the solid phase can be made again soluble by dilution. Solubility is a central protein property and when reduced can lead to diseases. We developed a prediction method, PON-Sol2, to identify amino acid substitutions that increase, decrease, or have no effect on the protein solubility. The method is a machine learning tool utilizing gradient boosting algorithm and was trained on a large dataset of variants with different outcomes after the selection of features among a large number of tested properties. The method is fast and has high performance. The normalized correct prediction rate for three states is 0.656, and the normalized GC2 score is 0.312 in 10-fold cross-validation. The corresponding numbers in the blind test were 0.545 and 0.157. The performance was superior in comparison to previous methods. The PON-Sol2 predictor is freely available. It can be used to predict the solubility effects of variants for any organism, even in large-scale projects.
Affiliation(s)
- Yang Yang
- School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Lianjie Zeng
- School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
13
Chen KH, Hu YJ. Residue-Residue Interaction Prediction via Stacked Meta-Learning. Int J Mol Sci 2021; 22:ijms22126393. [PMID: 34203772 PMCID: PMC8232778 DOI: 10.3390/ijms22126393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Revised: 06/06/2021] [Accepted: 06/13/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions (PPIs) are the basis of most biological functions determined by residue-residue interactions (RRIs). Predicting residue pairs responsible for the interaction is crucial for understanding the cause of a disease and drug design. Computational approaches, considered inexpensive and faster solutions for RRI prediction, have been widely used to predict protein interfaces for further analysis. This study presents RRI-Meta, an ensemble meta-learning-based method for RRI prediction. Its hierarchical learning structure comprises four base classifiers and one meta-classifier to integrate predictive strengths from different classifiers. It considers multiple feature types, including sequence-, structure-, and neighbor-based features, for characterizing other properties of a residue interaction environment to better distinguish between noninteracting and interacting residues. We conducted the same experiments using the same data as previously reported in the literature to demonstrate RRI-Meta's performance. Experimental results show that RRI-Meta is superior to several current prediction tools. Additionally, to analyze the factors that affect the performance of RRI-Meta, we conducted a comparative case study using different protein complexes.
Affiliation(s)
- Kuan-Hsi Chen
- College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
- Yuh-Jyh Hu
- Institute of Biomedical Engineering, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
- Correspondence: Tel.: +886-3-571-2121
14
Wan X, Tan X. A protein structural study based on the centrality analysis of protein sequence feature networks. PLoS One 2021; 16:e0248861. [PMID: 33780482 PMCID: PMC8006989 DOI: 10.1371/journal.pone.0248861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2020] [Accepted: 03/05/2021] [Indexed: 11/19/2022]
Abstract
In this paper, we use network approaches to analyze the relations between protein sequence features for the top hierarchical classes of CATH and SCOP. We use fundamental connectivity measures such as correlation (CR), normalized mutual information rate (nMIR), and transfer entropy (TE) to analyze the pairwise relationships between the protein sequence features, and use centrality measures to analyze weighted networks constructed from the relationship matrices. In the centrality analysis, we find both commonalities and differences between the different protein 3D structural classes. Results show that all top hierarchical classes of CATH and SCOP present strong non-deterministic interactions for the composition and arrangement features of Cysteine (C), Methionine (M), Tryptophan (W), and also for the arrangement features of Histidine (H). The different protein 3D structural classes present different preferences in terms of their centrality distributions and significant features.
Affiliation(s)
- Xiaogeng Wan
- College of Mathematics and Physics, Beijing University of Chemical Technology, Beijing, China
- Xinying Tan
- The Fourth Center of PLA General Hospital, Beijing, China
15
Hooshmand N, Fayazi J, Tabatabaei S, Ghaleh Golab Behbahan N. Prediction of B cell and T-helper cell epitopes candidates of bovine leukaemia virus (BLV) by in silico approach. Vet Med Sci 2020; 6:730-739. [PMID: 32592322 PMCID: PMC7738742 DOI: 10.1002/vms3.307] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/14/2020] [Revised: 05/04/2020] [Accepted: 05/22/2020] [Indexed: 01/22/2023]
Abstract
The bovine leukaemia virus (BLV) is a retrovirus responsible for enzootic bovine leukaemia (EBL) disease, the most common cattle disease leading to high annual economic losses to the cattle breeding industry. Virus monitoring among the sheep and cattle herds is usually done by vaccination. Inactivated virus vaccines can partially protect the livestock from viral challenge. However, vaccinated animals are likely to be infected. So, there is an essential need for producing vaccine by other methods. Gp60 SU, encoded by Env gene, is the surface glycoprotein of BLV detected to be the major target for the host immunity against the virus. Different stages were performed to predict the potential B and T-helper cell epitopes. The general framework of the method includes retrieving the amino acid sequence of gp60 SU, conducting the sequence alignment, getting the entropy plot, retrieving the previously found epitopes, predicting the hydropathy parameters, modelling the tertiary structure of the glycoprotein, minimizing the structure energy, validating the model by Ramachandran plot, predicting the linear and discontinuous epitopes by various servers and eventually choosing the consensus immunogenic regions. Ramachandran plot scrutiny has demonstrated that the modelled prediction is accurate and suitable. By surveying overlaps of various results, 4 and 2 immunogenic regions were selected as linear and conformational epitopes respectively. Amino acids 35-53, 67-97, 288-302 and 410-421 and those of numbers 37-58 and 72-100 were the regions selected as linear and conformational epitopes respectively. The tertiary structure of the final epitope was modelled as well. A comparison of the predicted epitopes structure with that of gp60 SU envelope, illustrated that the tertiary structure of these epitopes does not change after being separated from the primary complete one. 
The present achievements will lead to a better interpretation of the antigen-antibody interactions against gp60 in the designing process of safe and efficient vaccines.
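The consensus step described in this abstract (keeping regions where several prediction servers overlap) can be sketched as a per-residue vote. This is only an illustration under stated assumptions: the function name `consensus_regions`, the vote threshold, and the input ranges are hypothetical and are not taken from the paper, which does not specify how overlaps were scored.

```python
def consensus_regions(predictions, min_votes):
    """Merge residue ranges predicted by several tools into consensus regions.

    predictions: one list of (start, end) inclusive residue ranges per predictor.
    min_votes: minimum number of predictors that must cover a residue.
    """
    votes = {}
    for ranges in predictions:
        covered = set()
        for start, end in ranges:
            covered.update(range(start, end + 1))
        # Each predictor contributes at most one vote per residue.
        for pos in covered:
            votes[pos] = votes.get(pos, 0) + 1

    kept = sorted(pos for pos, n in votes.items() if n >= min_votes)

    # Join consecutive residues into contiguous regions.
    regions = []
    for pos in kept:
        if regions and pos == regions[-1][1] + 1:
            regions[-1][1] = pos
        else:
            regions.append([pos, pos])
    return [tuple(r) for r in regions]


# Hypothetical outputs of three epitope predictors (illustrative values only).
predictor_a = [(30, 55), (280, 300)]
predictor_b = [(35, 60), (290, 305)]
predictor_c = [(33, 53), (400, 420)]

two_of_three = consensus_regions([predictor_a, predictor_b, predictor_c], min_votes=2)
all_three = consensus_regions([predictor_a, predictor_b, predictor_c], min_votes=3)
print(two_of_three)  # [(33, 55), (290, 300)]
print(all_three)     # [(35, 53)]
```

Raising `min_votes` trades coverage for agreement: a stricter threshold keeps only the residues every tool supports.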
Affiliation(s)
- Negar Hooshmand
- Animal Science Department, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran
- Jamal Fayazi
- Animal Science Department, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran
- Saleh Tabatabaei
- Animal Science Department, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran
- Nader Ghaleh Golab Behbahan
- Razi Vaccine and Serum Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran
16
Timmons PB, Hewage CM. HAPPENN is a novel tool for hemolytic activity prediction for therapeutic peptides which employs neural networks. Sci Rep 2020; 10:10869. [PMID: 32616760 PMCID: PMC7331684 DOI: 10.1038/s41598-020-67701-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 03/03/2020] [Accepted: 06/09/2020] [Indexed: 12/11/2022]
Abstract
The growing prevalence of resistance to antibiotics motivates the search for new antibacterial agents. Antimicrobial peptides are a diverse class of well-studied membrane-active peptides which function as part of the innate host defence system, and form a promising avenue in antibiotic drug research. Some antimicrobial peptides exhibit toxicity against eukaryotic membranes, typically characterised by hemolytic activity assays, but currently, the understanding of what differentiates hemolytic and non-hemolytic peptides is limited. This study leverages advances in machine learning research to produce a novel artificial neural network classifier for the prediction of hemolytic activity from a peptide's primary sequence. The classifier achieves best-in-class performance, with cross-validated accuracy of [Formula: see text] and Matthews correlation coefficient of 0.71. This innovative classifier is available as a web server at https://research.timmons.eu/happenn, allowing the research community to utilise it for in silico screening of peptide drug candidates for high therapeutic efficacies.
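For readers unfamiliar with the Matthews correlation coefficient reported above, the following minimal sketch computes it, together with accuracy, from binary labels. The helper names `mcc` and `accuracy` and the label vectors are hypothetical illustrations; they are not HAPPENN's code or data.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = hemolytic, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for eight peptides (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

print(accuracy(y_true, y_pred))  # 0.75
print(mcc(y_true, y_pred))       # 0.5
```

Unlike accuracy, the coefficient uses all four cells of the confusion matrix, so it stays informative when hemolytic and non-hemolytic peptides are unevenly represented.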
Affiliation(s)
- Patrick Brendan Timmons
- UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
- Chandralal M Hewage
- UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
17
iBitter-SCM: Identification and characterization of bitter peptides using a scoring card method with propensity scores of dipeptides. Genomics 2020; 112:2813-2822. [DOI: 10.1016/j.ygeno.2020.03.019] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 02/21/2020] [Revised: 03/19/2020] [Accepted: 03/22/2020] [Indexed: 12/21/2022]
18
Application of Meta Learning to B-Cell Conformational Epitope Prediction. Methods Mol Biol 2020. [PMID: 32162268 DOI: 10.1007/978-1-0716-0389-5_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
One of the major challenges in the field of vaccine design is identifying B-cell epitopes in continuously evolving viruses. Various tools have been developed to predict linear or conformational epitopes, each relying on different physicochemical properties and adopting distinct search strategies. In this chapter, we propose different ensemble meta-learning approaches for epitope prediction based on stacked, cascade generalizations, and meta decision trees. Through meta learning, we expect a meta learner to be able to integrate multiple prediction models and outperform the single best-performing model. The objective of this chapter is twofold: (1) to promote the complementary predictive strengths in different prediction tools and (2) to introduce computational models to exploit the synergy among various prediction tools. Our primary goal is not to develop any particular classifier for B-cell epitope prediction, but to advocate the feasibility of meta learning to epitope prediction. With the flexibility of meta learning, the researcher can construct various meta classification hierarchies that are applicable to epitope prediction in different protein domains.
19
Chen KH, Wang TF, Hu YJ. Protein-protein interaction prediction using a hybrid feature representation and a stacked generalization scheme. BMC Bioinformatics 2019; 20:308. [PMID: 31182027 PMCID: PMC6558856 DOI: 10.1186/s12859-019-2907-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 03/10/2019] [Accepted: 05/17/2019] [Indexed: 12/11/2022]
Abstract
BACKGROUND Although various machine learning-based predictors have been developed for estimating protein-protein interactions, their performances vary with dataset and species, and are affected by two primary aspects: choice of learning algorithm, and the representation of protein pairs. To improve the performance of predicting protein-protein interactions, we exploit the synergy of multiple learning algorithms, and utilize the expressiveness of different protein-pair features. RESULTS We developed a stacked generalization scheme that integrates five learning algorithms. We also designed three types of protein-pair features based on the physicochemical properties of amino acids, gene ontology annotations, and interaction network topologies. When tested on 19 published datasets collected from eight species, the proposed approach achieved a significantly higher or comparable overall performance, compared with seven competitive predictors. CONCLUSION We introduced an ensemble learning approach for PPI prediction that integrated multiple learning algorithms and different protein-pair representations. The extensive comparisons with other state-of-the-art prediction tools demonstrated the feasibility and superiority of the proposed method.
Affiliation(s)
- Kuan-Hsi Chen
- College of Computer Science, National Chiao Tung University, Hsinchu, 300, Taiwan
- Tsai-Feng Wang
- Institute of Data Science and Engineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Yuh-Jyh Hu
- Institute of Biomedical Engineering, College of Computer Science, National Chiao Tung University, Hsinchu, 300, Taiwan
20
Burdukiewicz M, Sobczyk P, Chilimoniuk J, Gagat P, Mackiewicz P. Prediction of Signal Peptides in Proteins from Malaria Parasites. Int J Mol Sci 2018; 19:E3709. [PMID: 30469512 PMCID: PMC6321056 DOI: 10.3390/ijms19123709] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/23/2018] [Revised: 11/15/2018] [Accepted: 11/17/2018] [Indexed: 01/08/2023]
Abstract
Signal peptides are N-terminal presequences responsible for targeting proteins to the endomembrane system, and subsequent subcellular or extracellular compartments, and consequently condition their proper function. The significance of signal peptides stimulates development of new computational methods for their detection. These methods employ learning systems trained on datasets comprising signal peptides from different types of proteins and taxonomic groups. As a result, the accuracy of predictions is high in the case of signal peptides that are well-represented in databases, but might be low in other, atypical cases. Such atypical signal peptides are present in proteins found in apicomplexan parasites, causative agents of malaria and toxoplasmosis. Apicomplexan proteins have a unique amino acid composition due to their AT-biased genomes. Therefore, we designed a new, more flexible and universal probabilistic model for recognition of atypical eukaryotic signal peptides. Our approach, called signalHsmm, includes knowledge about the structure of signal peptides and physicochemical properties of amino acids. It is able to recognize signal peptides from the malaria parasites and related species more accurately than popular programs. Moreover, it is still universal enough to provide prediction of other signal peptides on par with the best performing predictors.
Affiliation(s)
- Michał Burdukiewicz
- Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-661 Warszawa, Poland
- Piotr Sobczyk
- Department of Mathematics, Wrocław University of Technology, 50-370 Wrocław, Poland
- Przemysław Gagat
- Department of Genomics, University of Wrocław, 50-383 Wrocław, Poland
- Paweł Mackiewicz
- Department of Genomics, University of Wrocław, 50-383 Wrocław, Poland
21
Zimmer D, Schneider K, Sommer F, Schroda M, Mühlhaus T. Artificial Intelligence Understands Peptide Observability and Assists With Absolute Protein Quantification. Front Plant Sci 2018; 9:1559. [PMID: 30483279 PMCID: PMC6242780 DOI: 10.3389/fpls.2018.01559] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/30/2018] [Accepted: 10/04/2018] [Indexed: 05/20/2023]
Abstract
Targeted mass spectrometry has become the method of choice to gain absolute quantification information of high quality, which is essential for a quantitative understanding of biological systems. However, the design of absolute protein quantification assays remains challenging due to variations in peptide observability and incomplete knowledge about factors influencing peptide detectability. Here, we present a deep learning algorithm for peptide detectability prediction, d::pPop, which allows the informed selection of synthetic proteotypic peptides for the successful design of targeted proteomics quantification assays. The deep neural network is able to learn a regression model that relates the physicochemical properties of a peptide to its ion intensity detected by mass spectrometry. The approach makes use of experimentally detected deviations from the assumed equimolar abundance of all peptides derived from a given protein. Trained on extensive proteomics datasets, d::pPop's plant and non-plant specific models can predict the quality of proteotypic peptides for not yet experimentally identified proteins. Interrogating the deep neural network after learning from ~76,000 peptides per model organism makes it possible to investigate the impact of different physicochemical properties on the observability of a peptide, thus providing insights into peptide observability as a multifaceted process. Empirical evaluation with rank accuracy metrics showed that our prediction approach outperforms existing algorithms. We circumvent the delicate step of selecting positive and negative training sets and at the same time also more closely reflect the need for selecting the most promising peptides for targeting a protein of interest. Further, we used an artificial QconCAT protein to experimentally validate the observability prediction.
Our proteotypic peptide prediction approach not only facilitates the design of absolute protein quantification assays via a user-friendly web interface but also enables the selection of proteotypic peptides for not yet observed proteins, hence rendering the tool especially useful for plant research.
Affiliation(s)
- David Zimmer
- Computational Systems Biology, TU Kaiserslautern, Kaiserslautern, Germany
- Kevin Schneider
- Computational Systems Biology, TU Kaiserslautern, Kaiserslautern, Germany
- Frederik Sommer
- Molekulare Biotechnologie & Systembiologie, TU Kaiserslautern, Kaiserslautern, Germany
- Michael Schroda
- Molekulare Biotechnologie & Systembiologie, TU Kaiserslautern, Kaiserslautern, Germany
- Timo Mühlhaus
- Computational Systems Biology, TU Kaiserslautern, Kaiserslautern, Germany
22
Role of solvent accessibility for aggregation-prone patches in protein folding. Sci Rep 2018; 8:12896. [PMID: 30150761 PMCID: PMC6110721 DOI: 10.1038/s41598-018-31289-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/05/2018] [Accepted: 08/15/2018] [Indexed: 11/21/2022]
Abstract
The arrangement of amino acids in a protein sequence encodes its native folding. However, the same arrangement in aggregation-prone regions may cause misfolding as a result of local environmental stress. Under normal physiological conditions, such regions congregate in the protein’s interior to avoid aggregation and attain the native fold. We have used solvent accessibility of aggregation patches (SAAPp) to determine the packing of aggregation-prone residues. Our results showed that SAAPp has low values for native crystal structures, consistent with protein folding as a mechanism to minimize the solvent accessibility of aggregation-prone residues. SAAPp also shows an average correlation of 0.76 with the global distance test (GDT) score on CASP12 template-based protein models. Using SAAPp scores and five structural features, a random forest machine learning quality assessment tool, SAAP-QA, showed 2.32 average GDT loss between best model predicted and actual best based on GDT score on independent CASP test data, with the ability to discriminate native-like folds having an AUC of 0.94. Overall, the Pearson correlation coefficient (PCC) between true and predicted GDT scores on independent CASP data was 0.86 while on the external CAMEO dataset, comprising high quality protein structures, PCC and average GDT loss were 0.71 and 4.46 respectively. SAAP-QA can be used to detect the quality of models and iteratively improve them to native or near-native structures.
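The two evaluation measures quoted in this abstract, the Pearson correlation between true and predicted GDT scores and the GDT loss of the top-ranked model, can be illustrated with a short sketch. It assumes that GDT loss means the true GDT of the actual best model minus the true GDT of the model the predictor ranks first, which matches the abstract's wording; the function names and the scores below are hypothetical, not SAAP-QA's code or CASP data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def gdt_loss(true_gdt, predicted_gdt):
    """True GDT of the actual best model minus true GDT of the predicted-best model."""
    top_ranked = max(range(len(predicted_gdt)), key=lambda i: predicted_gdt[i])
    return max(true_gdt) - true_gdt[top_ranked]


# Made-up scores for four candidate models of one target (illustrative only).
true_gdt = [60.0, 72.5, 70.0, 55.0]
predicted_gdt = [58.0, 69.0, 71.0, 50.0]

print(gdt_loss(true_gdt, predicted_gdt))  # 2.5
print(round(pearson(true_gdt, predicted_gdt), 3))
```

The two numbers answer different questions: correlation measures how well the whole ranking tracks quality, while GDT loss measures only the cost of trusting the single top pick.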
23
Nojoomi S, Koehl P. A weighted string kernel for protein fold recognition. BMC Bioinformatics 2017; 18:378. [PMID: 28841820 PMCID: PMC5574112 DOI: 10.1186/s12859-017-1795-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/07/2017] [Accepted: 08/15/2017] [Indexed: 11/10/2022]
Abstract
Background Alignment-free methods for comparing protein sequences have proved to be viable alternatives to approaches that first rely on an alignment of the sequences to be compared. Much work, however, needs to be done before those methods provide reliable fold recognition for proteins whose sequences share little similarity. We have recently proposed an alignment-free method based on the concept of string kernels, SeqKernel (Nojoomi and Koehl, BMC Bioinformatics, 2017, 18:137). In this previous study, we have shown that while SeqKernel performs better than standard alignment-based methods, its applications are potentially limited, because of biases due mostly to sequence length effects. Methods In this study, we propose improvements to SeqKernel that follow two directions. First, we developed a weighted version of the kernel, WSeqKernel. Second, we expand the concept of string kernels into a novel framework for deriving information on amino acids from protein sequences. Results Using a dataset that only contains remote homologs, we have shown that WSeqKernel performs remarkably well in fold recognition experiments. We have shown that with the appropriate weighting scheme, we can remove the length effects on the kernel values. WSeqKernel, just like any alignment-based sequence comparison method, depends on a substitution matrix. We have shown that this matrix can be optimized so that sequence similarity scores correlate well with structure similarity scores. Starting from no information on amino acid similarity, we have shown that we can derive a scoring matrix that echoes the physico-chemical properties of amino acids. Conclusion We have made progress in characterizing and parametrizing string kernels as alignment-based methods for comparing protein sequences, and we have shown that they provide a framework for extracting sequence information from structure.
24
Fassio AV, Martins PM, Guimarães SDS, Junior SSA, Ribeiro VS, de Melo-Minardi RC, Silveira SDA. Vermont: a multi-perspective visual interactive platform for mutational analysis. BMC Bioinformatics 2017; 18:403. [PMID: 28929973 PMCID: PMC5606220 DOI: 10.1186/s12859-017-1789-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
BACKGROUND A huge amount of data about genomes and sequence variation is available and continues to grow on a large scale, which makes experimentally characterizing these mutations infeasible regarding disease association and effects on protein structure and function. Therefore, reliable computational approaches are needed to support the understanding of mutations and their impacts. Here, we present VERMONT 2.0, a visual interactive platform that combines sequence and structural parameters with interactive visualizations to make the impact of protein point mutations more understandable. RESULTS We aimed to contribute a novel visual analytics oriented method to analyze and gain insight on the impact of protein point mutations. To assess the ability of VERMONT to do this, we visually examined a set of mutations that were experimentally characterized to determine if VERMONT could identify damaging mutations and why they can be considered so. CONCLUSIONS VERMONT allowed us to understand mutations by interpreting position-specific structural and physicochemical properties. Additionally, we note some specific positions we believe have an impact on protein function/structure in the case of mutation.
Affiliation(s)
- Alexandre V Fassio
- Department of Computer Science, Universidade Federal de Minas Gerais, 6627, Antônio Carlos avenue, Pampulha, Belo Horizonte, 31270-901, Brazil; Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, 6627, Antônio Carlos avenue, Pampulha, Belo Horizonte, 31270-901, Brazil
- Pedro M Martins
- Department of Computer Science, Universidade Federal de Minas Gerais, 6627, Antônio Carlos avenue, Pampulha, Belo Horizonte, 31270-901, Brazil; Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, 6627, Antônio Carlos avenue, Pampulha, Belo Horizonte, 31270-901, Brazil
- Samuel da S Guimarães
- Department of Computer Science, Universidade Federal de Viçosa, Peter Henry Rolfs avenue, Campus Universitário, Viçosa, 36570-900, Brazil
- Sócrates S A Junior
- Department of Computer Science, Universidade Federal de Viçosa, Peter Henry Rolfs avenue, Campus Universitário, Viçosa, 36570-900, Brazil
- Vagner S Ribeiro
- Department of Computer Science, Universidade Federal de Viçosa, Peter Henry Rolfs avenue, Campus Universitário, Viçosa, 36570-900, Brazil
- Raquel C de Melo-Minardi
- Department of Computer Science, Universidade Federal de Minas Gerais, 6627, Antônio Carlos avenue, Pampulha, Belo Horizonte, 31270-901, Brazil
- Sabrina de A Silveira
- Department of Computer Science, Universidade Federal de Viçosa, Peter Henry Rolfs avenue, Campus Universitário, Viçosa, 36570-900, Brazil
25
Abstract
Peptide antibodies, with their high specificities and affinities, are invaluable reagents for peptide and protein recognition in biological specimens. Depending on the application and the assay in which the peptide antibody is to be used, several factors influence successful antibody production, including peptide selection and antibody screening. Peptide antibodies have been used in clinical laboratory diagnostics with great success for decades, primarily because they can be produced to multiple targets, recognizing native wildtype proteins, denatured proteins, and newly generated epitopes. Mutation-specific peptide antibodies in particular have become important as diagnostic tools in the detection of various cancers. In addition to their use as diagnostic tools in malignant and premalignant conditions, peptide antibodies are applied in all other areas of clinical laboratory diagnostics, including endocrinology, hematology, neurodegenerative diseases, cardiovascular diseases, infectious diseases, and amyloidoses.
26
Abstract
AIM Toxicity arising from the hemolytic activity of peptides hinders their further progress as drug candidates. MATERIALS & METHODS This study describes a sequence-based predictor, named HemoPred, built on a random forest classifier using amino acid composition, dipeptide composition and physicochemical descriptors. RESULTS This approach outperformed a previously reported method and typical classification methods (e.g., support vector machine and decision tree), as verified by fivefold cross-validation and external validation, with accuracy and Matthews correlation coefficient in excess of 95% and 0.91, respectively. Results revealed the importance for hemolytic activity of hydrophobic residues on α-helices and Cys residues on β-sheets. CONCLUSION A sequence-based predictor, publicly available as the HemoPred web service, is proposed to predict and analyze the hemolytic activity of peptides.
27
Kavianpour H, Vasighi M. Structural classification of proteins using texture descriptors extracted from the cellular automata image. Amino Acids 2016; 49:261-271. [DOI: 10.1007/s00726-016-2354-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/17/2016] [Accepted: 10/18/2016] [Indexed: 12/12/2022]
28
Synthetic Peptides are Better Than Native Antigens for Development of ELISA Assay for Diagnosis of Tuberculosis. Int J Pept Res Ther 2016. [DOI: 10.1007/s10989-016-9556-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
29
Simm S, Einloft J, Mirus O, Schleiff E. 50 years of amino acid hydrophobicity scales: revisiting the capacity for peptide classification. Biol Res 2016; 49:31. [PMID: 27378087 PMCID: PMC4932767 DOI: 10.1186/s40659-016-0092-5] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 01/14/2016] [Accepted: 06/17/2016] [Indexed: 11/28/2022]
Abstract
BACKGROUND Physicochemical properties are frequently analyzed to characterize protein-sequences of known and unknown function. Especially the hydrophobicity of amino acids is often used for structural prediction or for the detection of membrane associated or embedded β-sheets and α-helices. For this purpose many scales classifying amino acids according to their physicochemical properties have been defined over the past decades. In parallel, several hydrophobicity parameters have been defined for calculation of peptide properties. We analyzed the performance of separating sequence pools using 98 hydrophobicity scales and five different hydrophobicity parameters, namely the overall hydrophobicity, the hydrophobic moment for detection of the α-helical and β-sheet membrane segments, the alternating hydrophobicity and the exact ß-strand score. RESULTS Most of the scales are capable of discriminating between transmembrane α-helices and transmembrane β-sheets, but assignment of peptides to pools of soluble peptides of different secondary structures is not achieved at the same quality. The separation capacity as measure of the discrimination between different structural elements is best by using the five different hydrophobicity parameters, but addition of the alternating hydrophobicity does not provide a large benefit. An in silico evolutionary approach shows that scales have limitation in separation capacity with a maximal threshold of 0.6 in general. We observed that scales derived from the evolutionary approach performed best in separating the different peptide pools when values for arginine and tyrosine were largely distinct from the value of glutamate. Finally, the separation of secondary structure pools via hydrophobicity can be supported by specific detectable patterns of four amino acids. CONCLUSION It could be assumed that the quality of separation capacity of a certain scale depends on the spacing of the hydrophobicity value of certain amino acids. 
Irrespective of the wealth of hydrophobicity scales a scale separating all different kinds of secondary structures or between soluble and transmembrane peptides does not exist reflecting that properties other than hydrophobicity affect secondary structure formation as well. Nevertheless, application of hydrophobicity scales allows distinguishing between peptides with transmembrane α-helices and β-sheets. Furthermore, the overall separation capacity score of 0.6 using different hydrophobicity parameters could be assisted by pattern search on the protein sequence level for specific peptides with a length of four amino acids.
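Two of the hydrophobicity parameters named in this abstract, the overall hydrophobicity and the hydrophobic moment, can be sketched as follows. The sketch makes several assumptions that do not come from the paper: it uses the Kyte-Doolittle scale as one example of the 98 scales, it normalizes the moment by peptide length, and it takes 100° per residue for an α-helix and an idealized 180° for a β-strand. The function names and test peptides are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (one commonly used scale, chosen here as an example).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(peptide, scale=KYTE_DOOLITTLE):
    """Average scale value over all residues (overall hydrophobicity)."""
    return sum(scale[aa] for aa in peptide) / len(peptide)


def hydrophobic_moment(peptide, angle_deg, scale=KYTE_DOOLITTLE):
    """Length-normalized hydrophobic moment for a given rotation angle per residue."""
    step = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * step) for i, aa in enumerate(peptide))
    cos_sum = sum(scale[aa] * math.cos(i * step) for i, aa in enumerate(peptide))
    return math.hypot(sin_sum, cos_sum) / len(peptide)


# An alternating hydrophobic/charged peptide has a high strand-like moment.
alternating = "LKLK"
print(mean_hydrophobicity(alternating))
print(hydrophobic_moment(alternating, 180))  # strand-like periodicity
print(hydrophobic_moment(alternating, 100))  # helix-like periodicity
```

For the alternating peptide the mean hydrophobicity is close to zero while the 180° moment is large, which is why periodicity-aware parameters can separate peptides that a plain average cannot.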
Affiliation(s)
- Stefan Simm
- Department of Biosciences, Molecular Cell Biology of Plants, Goethe University, Max von Laue Str. 9, 60438 Frankfurt/Main, Germany
- Jens Einloft
- Molecular Bioinformatics, Cluster of Excellence Frankfurt “Macromolecular Complexes”, Institute of Computer Science, Faculty of Computer Science and Mathematics, Goethe-University Frankfurt, Robert-Mayer-Str. 11-15, 60325 Frankfurt/Main, Germany
- Oliver Mirus
- Department of Biosciences, Molecular Cell Biology of Plants, Goethe University, Max von Laue Str. 9, 60438 Frankfurt/Main, Germany
- Enrico Schleiff
- Department of Biosciences, Molecular Cell Biology of Plants, Cluster of Excellence Frankfurt (CEF) and Buchmann Institute of Molecular Life Sciences (BMLS), Goethe University, Max von Laue Str. 9, 60438 Frankfurt/Main, Germany
30
Nagarajan R, Archana A, Thangakani AM, Jemimah S, Velmurugan D, Gromiha MM. PDBparam: Online Resource for Computing Structural Parameters of Proteins. Bioinform Biol Insights 2016; 10:73-80. [PMID: 27330281 PMCID: PMC4909059 DOI: 10.4137/bbi.s38423] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 01/19/2016] [Revised: 04/20/2016] [Accepted: 04/24/2016] [Indexed: 02/07/2023]
Abstract
Understanding the structure-function relationship in proteins is a longstanding goal in molecular and computational biology. The development of structure-based parameters has helped to relate the structure with the function of a protein. Although several structural features have been reported in the literature, no single server can calculate a wide-ranging set of structure-based features from protein three-dimensional structures. In this work, we have developed a web-based tool, PDBparam, for computing more than 50 structure-based features for any given protein structure. These features are classified into four major categories: (i) interresidue interactions, which include short-, medium-, and long-range interactions, contact order, long-range order, total contact distance, contact number, and multiple contact index, (ii) secondary structure propensities such as α-helical propensity, β-sheet propensity, and propensity of amino acids to exist at various positions of α-helix and amino acid compositions in high B-value regions, (iii) physicochemical properties containing ionic interactions, hydrogen bond interactions, hydrophobic interactions, disulfide interactions, aromatic interactions, surrounding hydrophobicity, and buriedness, and (iv) identification of binding site residues in protein-protein, protein-nucleic acid, and protein-ligand complexes. The server can be freely accessed at http://www.iitm.ac.in/bioinfo/pdbparam/. We suggest the use of PDBparam as an effective tool for analyzing protein structures.
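One of the interresidue features listed above, contact order, can be sketched as follows. This is a minimal illustration rather than PDBparam's implementation: it assumes one Cα coordinate per residue, an 8 Å contact cutoff, and the common definition of relative contact order as the mean sequence separation of contacting pairs divided by chain length. The function name and parameters are hypothetical.

```python
import math


def relative_contact_order(ca_coords, cutoff=8.0, min_sep=1):
    """Mean sequence separation of contacting residue pairs, divided by length.

    ca_coords: list of (x, y, z) tuples, one per residue (assumed C-alpha atoms).
    cutoff:    distance in angstroms below which two residues are in contact.
    min_sep:   smallest sequence separation |i - j| that is counted.
    """
    n = len(ca_coords)
    total_sep = 0
    contacts = 0
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                total_sep += j - i
                contacts += 1
    if contacts == 0:
        return 0.0
    return total_sep / (n * contacts)
```

Raising `min_sep` excludes near neighbours along the chain, which is a common refinement; the definitions used by any particular server should be checked against its documentation.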
Affiliation(s)
- R. Nagarajan
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- A. Archana
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- A. Mary Thangakani
  - CAS in Crystallography and Biophysics, University of Madras, Chennai, India
  - Bioinformatics Infrastructure Facility, University of Madras, Chennai, India
- S. Jemimah
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- D. Velmurugan
  - CAS in Crystallography and Biophysics, University of Madras, Chennai, India
  - Bioinformatics Infrastructure Facility, University of Madras, Chennai, India
- M. Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
|
31
|
Gunnell MK, Robison RA, Adams BJ. Natural Selection in Virulence Genes of Francisella tularensis. J Mol Evol 2016; 82:264-78. [PMID: 27177502 DOI: 10.1007/s00239-016-9743-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/08/2014] [Accepted: 04/29/2016] [Indexed: 02/06/2023]
Abstract
A fundamental tenet of evolution is that alleles that are under negative selection are often deleterious and confer no evolutionary advantage. Negatively selected alleles are removed from the gene pool and are eventually extinguished from the population. Conversely, alleles under positive selection do confer an evolutionary advantage and lead to an increase in the overall fitness of the organism. These alleles increase in frequency until they eventually become fixed in the population. Francisella tularensis is a zoonotic pathogen and a potential biothreat agent. The most virulent type of F. tularensis, Type A, is distributed across North America with Type A.I occurring mainly in the east and Type A.II appearing mainly in the west. F. tularensis is thought to be a genome in decay (losing genes) because of the relatively large number of pseudogenes present in its genome. We hypothesized that the observed frequency of gene loss/pseudogenes may be an artifact of evolution in response to a changing environment, and that genes involved in virulence should be under strong positive selection. To test this hypothesis, we sequenced and compared whole genomes of Type A.I and A.II isolates. We analyzed a subset of virulence and housekeeping genes from several F. tularensis subspecies genomes to ascertain the presence and extent of positive selection. Eleven previously identified virulence genes were screened for positive selection along with 10 housekeeping genes. Analyses of selection yielded one housekeeping gene and 7 virulence genes which showed significant evidence of positive selection at loci implicated in cell surface structures and membrane proteins, metabolism and biosynthesis, transcription, translation and cell separation, and substrate binding and transport. 
Our results suggest that while the loss of functional genes through disuse could be accelerated by negative selection, the genome decay in Francisella could also be the byproduct of adaptive evolution driven by complex interactions between host, pathogen, and their environment, as evidenced by several of its virulence genes which are undergoing strong, positive selection.
Affiliation(s)
- Mark K Gunnell
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT, 84602, USA
  - Microbiology Branch, Life Sciences Division, Dugway Proving Ground, Dugway, UT, 84022, USA
- Richard A Robison
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT, 84602, USA
- Byron J Adams
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
|
32
|
Dong P, Fan Y, Sun J, Lv M, Yi M, Tan X, Liu S. A dynamic interaction process between KaiA and KaiC is critical to the cyanobacterial circadian oscillator. Sci Rep 2016; 6:25129. [PMID: 27113386 PMCID: PMC4844972 DOI: 10.1038/srep25129] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/12/2015] [Accepted: 04/12/2016] [Indexed: 11/09/2022] Open
Abstract
The core circadian oscillator of cyanobacteria consists of three proteins, KaiA, KaiB, and KaiC. This circadian oscillator can be functionally reconstituted in vitro with these three proteins, and has therefore been a very important model in circadian rhythm research. KaiA can bind to KaiC and then stimulate its phosphorylation, but their interaction mechanism remains elusive. In this study, we followed the "second-site suppressor" strategy to investigate the interaction mechanism of KaiA and KaiC. Using protein sequence analyses, we showed that there are co-varying residues in the binding interface of KaiA and KaiC. The subsequent mutagenesis study verified that these residues are important to the functions of KaiA and KaiC, but their roles could not be fully explained by the reported complex structures of KaiA and KaiC-derived peptides. Combining our data with previous reports, we suggested a dynamic mechanism for the KaiA-KaiC interaction, in which both KaiA and the intrinsically disordered tail of KaiC undergo significant structural changes through conformational selection and induced fit during the binding process. Finally, we presented a mathematical model to support this hypothesis and explained the importance of this interaction mechanism for the KaiABC circadian oscillator.
Affiliation(s)
- Pei Dong
  - Hubei Key Laboratory of Tumor Microenvironment and Immunotherapy, China Three Gorges University, Yichang 443002, China
  - College of Medical Science, China Three Gorges University, Yichang 443002, China
- Ying Fan
  - College of Medical Science, China Three Gorges University, Yichang 443002, China
- Jianqiang Sun
  - School of Statistics, Shandong Institute of Business and Technology, Yantai, 264005, China
- Mengting Lv
  - College of Medical Science, China Three Gorges University, Yichang 443002, China
- Ming Yi
  - Department of Physics, College of Sciences, Huazhong Agricultural University, Wuhan 430070, China
- Xiao Tan
  - Hubei Key Laboratory of Tumor Microenvironment and Immunotherapy, China Three Gorges University, Yichang 443002, China
  - College of Medical Science, China Three Gorges University, Yichang 443002, China
- Sen Liu
  - Hubei Key Laboratory of Tumor Microenvironment and Immunotherapy, China Three Gorges University, Yichang 443002, China
  - College of Medical Science, China Three Gorges University, Yichang 443002, China
|
33
|
Generation of monoclonal antibodies reactive against subtype specific conserved B-cell epitopes on haemagglutinin protein of influenza virus H5N1. Virus Res 2015; 199:46-55. [DOI: 10.1016/j.virusres.2015.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/22/2014] [Revised: 12/12/2014] [Accepted: 01/10/2015] [Indexed: 11/19/2022]
|
34
|
Thompson JJ, Tabatabaei Ghomi H, Lill MA. Application of information theory to a three-body coarse-grained representation of proteins in the PDB: insights into the structural and evolutionary roles of residues in protein structure. Proteins 2014; 82:3450-65. [PMID: 25269778 DOI: 10.1002/prot.24698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/23/2014] [Revised: 09/09/2014] [Accepted: 09/19/2014] [Indexed: 01/03/2023]
Abstract
Knowledge-based methods for analyzing protein structures, such as statistical potentials, primarily consider the distances between pairs of bodies (atoms or groups of atoms). Considerations of several bodies simultaneously are generally used to characterize bonded structural elements or those in close contact with each other, but historically do not consider atoms that are not in direct contact with each other. In this report, we introduce an information-theoretic method for detecting and quantifying distance-dependent through-space multibody relationships between the sidechains of three residues. The technique introduced is capable of producing convergent and consistent results when applied to a sufficiently large database of randomly chosen, experimentally solved protein structures. The results of our study can be shown to reproduce established physico-chemical properties of residues as well as more recently discovered properties and interactions. These results offer insight into the numerous roles that residues play in protein structure, as well as relationships between residue function, protein structure, and evolution. The techniques and insights presented in this work should be useful in the future development of novel knowledge-based tools for the evaluation of protein structure.
Affiliation(s)
- Jared J Thompson
  - Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana
|
35
|
Zhang J, Zhao X, Sun P, Gao B, Ma Z. Conformational B-cell epitopes prediction from sequences using cost-sensitive ensemble classifiers and spatial clustering. Biomed Res Int 2014; 2014:689219. [PMID: 25045691 PMCID: PMC4083607 DOI: 10.1155/2014/689219] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 01/03/2014] [Revised: 05/02/2014] [Accepted: 05/10/2014] [Indexed: 12/20/2022]
Abstract
B-cell epitopes are regions of the antigen surface which can be recognized by certain antibodies and elicit the immune response. Identification of epitopes for a given antigen chain finds vital applications in vaccine and drug research. Experimental prediction of B-cell epitopes is time-consuming and resource intensive, which may benefit from the computational approaches to identify B-cell epitopes. In this paper, a novel cost-sensitive ensemble algorithm is proposed for predicting the antigenic determinant residues and then a spatial clustering algorithm is adopted to identify the potential epitopes. Firstly, we explore various discriminative features from primary sequences. Secondly, cost-sensitive ensemble scheme is introduced to deal with imbalanced learning problem. Thirdly, we adopt spatial algorithm to tell which residues may potentially form the epitopes. Based on the strategies mentioned above, a new predictor, called CBEP (conformational B-cell epitopes prediction), is proposed in this study. CBEP achieves good prediction performance with the mean AUC scores (AUCs) of 0.721 and 0.703 on two benchmark datasets (bound and unbound) using the leave-one-out cross-validation (LOOCV). When compared with previous prediction tools, CBEP produces higher sensitivity and comparable specificity values. A web server named CBEP which implements the proposed method is available for academic use.
Affiliation(s)
- Jian Zhang
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- Xiaowei Zhao
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- Pingping Sun
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
  - The Engineering Laboratory for Drug-Gene and Protein Screening, Northeast Normal University, Changchun 1300117, China
- Bo Gao
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- Zhiqiang Ma
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
|
36
|
Huang JT, Huang W, Huang SR, Li X. How the folding rates of two- and multistate proteins depend on the amino acid properties. Proteins 2014; 82:2375-82. [DOI: 10.1002/prot.24599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/14/2014] [Revised: 04/27/2014] [Accepted: 05/05/2014] [Indexed: 01/05/2023]
Affiliation(s)
- Jitao T. Huang
  - Department of Chemistry and State Key Laboratory of EOC; College of Chemistry, Nankai University; Tianjin 300071 China
- Wei Huang
  - Department of Chemistry and State Key Laboratory of EOC; College of Chemistry, Nankai University; Tianjin 300071 China
- Shanran R. Huang
  - Department of Chemistry and State Key Laboratory of EOC; College of Chemistry, Nankai University; Tianjin 300071 China
- Xin Li
  - Department of Chemistry and State Key Laboratory of EOC; College of Chemistry, Nankai University; Tianjin 300071 China
|
37
|
N-terminal in coat protein of Garlic virus X is indispensible for its serological detection. Virus Genes 2013; 48:128-32. [PMID: 24136255 DOI: 10.1007/s11262-013-0990-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/03/2013] [Accepted: 09/25/2013] [Indexed: 10/26/2022]
Abstract
Conserved coat protein region of plant viruses is often used as source of antigen for production of polyclonal antibodies for broad-based detection of closely related viruses. Antigenic region in coat protein is located either on N-terminal, and/or C-terminal or in the middle of coat protein. A study was undertaken to determine if antigenic region resides in N-terminal in Garlic virus X (GarV-X) of Allexivirus. In allexiviruses, N-terminal of coat protein region (1-57 amino acids) was highly variable. A complete coat protein of 27 kDa and a truncated protein without N-terminal (20 kDa) of GarV-X were expressed in pET expression vector and confirmed in western blotting using anti-His antisera. These expressed proteins were purified and used for antisera production. Specific and strong reaction was obtained for antisera generated against GarV-X full CP and GarV-X was detected in field-grown allium crops viz., onion, garlic, leek, and bunching onion and chives in ELISA. Antisera against GarV-X CPΔ1-61 (truncated CP) did not show reaction for GarV-X detection in immunoassay. Epitope mapping also indicated N-terminal as major antigenic determinant region with highest antigenic signal score. Our studies confirm that antigenic signals or epitopes reside in the N-terminal region of GarV-X which can be synthesized and used for production of monoclonal antibodies for specific detection purposes.
|
38
|
Abstract
Background Molecular evolution is a very active field of research, with several complementary approaches, including dN/dS, HON90, MM01, and others. Each has documented strengths and weaknesses, and no one approach provides a clear picture of how natural selection works at the molecular level. The purpose of this work is to present a simple new method that uses quantitative amino acid properties to identify and characterize directional selection in proteins. Methods Inferred amino acid replacements are viewed through the prism of a single physicochemical property to determine the amount and direction of change caused by each replacement. This allows the calculation of the probability that the mean change in the single property associated with the amino acid replacements is equal to zero (H0: μ = 0; i.e., no net change) using a simple two-tailed t-test. Results Example data from calanoid and cyclopoid copepod cytochrome oxidase subunit I sequence pairs are presented to demonstrate how directional selection may be linked to major shifts in adaptive zones, and that convergent evolution at the whole organism level may be the result of convergent protein adaptations. Conclusions Rather than replace previous methods, this new method further complements existing methods to provide a holistic glimpse of how natural selection shapes protein structure and function over evolutionary time.
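The test described in the Methods can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' code: each inferred replacement is scored as the change in one physicochemical property, and a one-sample t statistic is computed against H0: μ = 0. The function name and the example property values (a subset of the Kyte-Doolittle hydropathy scale) are assumptions made for the example.

```python
import math
import statistics


def directional_t(replacements, prop):
    """One-sample t statistic for the mean property change (H0: mean = 0).

    replacements: list of (ancestral_aa, derived_aa) pairs.
    prop:         dict mapping amino acid letter to a property value.
    Returns (t, degrees_of_freedom). Needs at least two replacements
    and a non-zero spread in the property changes.
    """
    deltas = [prop[new] - prop[old] for old, new in replacements]
    n = len(deltas)
    sd = statistics.stdev(deltas)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all property changes are identical; t is undefined")
    t = statistics.mean(deltas) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical example: replacements that all increase hydropathy.
HYDROPATHY = {"A": 1.8, "V": 4.2, "L": 3.8, "S": -0.8, "T": -0.7, "G": -0.4}
t_stat, df = directional_t(
    [("S", "A"), ("T", "V"), ("G", "L"), ("A", "V")], HYDROPATHY
)
```

The two-tailed p-value then follows from the t distribution with `df` degrees of freedom; if SciPy is available, `scipy.stats.ttest_1samp(deltas, 0.0)` returns the statistic and p-value directly.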
|
39
|
Lin SYH, Cheng CW, Su ECY. Prediction of B-cell epitopes using evolutionary information and propensity scales. BMC Bioinformatics 2013; 14 Suppl 2:S10. [PMID: 23484214 PMCID: PMC3549808 DOI: 10.1186/1471-2105-14-s2-s10] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 11/26/2022] Open
Abstract
Background Development of computational tools that can accurately predict presence and location of B-cell epitopes on pathogenic proteins has a valuable application to the field of vaccinology. Because of the highly variable yet enigmatic nature of B-cell epitopes, their prediction presents a great challenge to computational immunologists. Methods We propose a method, BEEPro (B-cell epitope prediction by evolutionary information and propensity scales), which adapts a linear averaging scheme on 16 properties using a support vector machine model to predict both linear and conformational B-cell epitopes. These 16 properties include position specific scoring matrix (PSSM), an amino acid ratio scale, and a set of 14 physicochemical scales obtained via a feature selection process. Finally, a three-way data split procedure is used during the validation process to prevent over-estimation of prediction performance and avoid bias in our experiment results. Results In our experiment, first we use a non-redundant linear B-cell epitope dataset curated by Sollner et al. for feature selection and parameter optimization. Evaluated by a three-way data split procedure, BEEPro achieves significant improvement with the area under the receiver operating characteristic curve (AUC) = 0.9987, accuracy = 99.29%, Matthews correlation coefficient (MCC) = 0.9281, sensitivity = 0.9604, specificity = 0.9946, positive predictive value (PPV) = 0.9042 for the Sollner dataset. In addition, when the same parameters are used to evaluate performance on other independent linear B-cell epitope test datasets, BEEPro attains an AUC which ranges from 0.9874 to 0.9950 and an accuracy which ranges from 93.73% to 97.31%. Moreover, five-fold cross-validation on one benchmark conformational B-cell epitope dataset yields an accuracy of 92.14% and AUC of 0.9066. Conclusions Compared with other current models, our method achieves a significant improvement with respect to AUC, accuracy, MCC, sensitivity, specificity, and PPV.
Thus, we have shown that an appropriate combination of evolutionary information and propensity scales with a support vector machine model can significantly enhance the prediction performance of both linear and conformational B-cell epitopes.
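For readers comparing the metrics quoted above, the following sketch shows how accuracy, sensitivity, specificity, PPV, and MCC are derived from a binary confusion matrix. It uses the standard definitions only; the function name and the example counts are hypothetical and are not taken from the BEEPro evaluation.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes at least one actual positive, one actual negative, and one
    predicted positive, so that no ratio below divides by zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical, imbalanced example: 100 epitope and 900 non-epitope residues.
metrics = binary_metrics(tp=90, fp=10, tn=890, fn=10)
```

With class imbalance of this kind, accuracy can stay high while PPV and MCC are noticeably lower, which is one reason several metrics are usually reported together.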
Affiliation(s)
- Scott Yi-Heng Lin
  - School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
|
40
|
A realistic model under which the genetic code is optimal. J Mol Evol 2013; 77:170-84. [PMID: 23877342 DOI: 10.1007/s00239-013-9571-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 12/20/2012] [Accepted: 06/27/2013] [Indexed: 01/23/2023]
Abstract
The genetic code has a high level of error robustness. Using values of hydrophobicity scales as a proxy for amino acid character, and the mean square measure as a function quantifying error robustness, a value can be obtained for a genetic code which reflects the error robustness of that code. By comparing this value with a distribution of values belonging to codes generated by random permutations of amino acid assignments, the level of error robustness of a genetic code can be quantified. We present a calculation in which the standard genetic code is shown to be optimal. We obtain this result by (1) using recently updated values of polar requirement as input; (2) fixing seven assignments (Ile, Trp, His, Phe, Tyr, Arg, and Leu) based on aptamer considerations; and (3) using known biosynthetic relations of the 20 amino acids. This last point is reflected in an approach of subdivision (restricting the random reallocation of assignments to amino acid subgroups, the set of 20 being divided in four such subgroups). The three approaches to explain robustness of the code (specific selection for robustness, amino acid-RNA interactions leading to assignments, or a slow growth process of assignment patterns) are reexamined in light of our findings. We offer a comprehensive hypothesis, stressing the importance of biosynthetic relations, with the code evolving from an early stage with just glycine and alanine, via intermediate stages, towards 64 codons carrying today's meaning.
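The general procedure described above (a mean square measure compared against randomly permuted codes) can be sketched as follows. This is a simplified illustration and not the calculation of the cited paper: it substitutes the Kyte-Doolittle hydropathy scale for polar requirement, permutes all 20 amino acids freely among the synonymous codon blocks, and does not implement the fixed assignments or the subdivision into subgroups. All function names are hypothetical.

```python
import itertools
import random

BASES = "TCAG"
# Standard genetic code in TCAG order (third base varies fastest); "*" = stop.
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(itertools.product(BASES, repeat=3), AAS)}

# Kyte-Doolittle hydropathy (assumed stand-in for polar requirement).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_square(code, prop):
    """Mean squared property change over all single-base substitutions.

    Substitutions to or from a stop codon are skipped.
    """
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = code[codon[:pos] + base + codon[pos + 1:]]
                if mutant != "*":
                    total += (prop[aa] - prop[mutant]) ** 2
                    count += 1
    return total / count


def shuffled_code(code, rng):
    """Permute amino acids among synonymous codon blocks; stops stay fixed."""
    aas = sorted(set(code.values()) - {"*"})
    perm = aas[:]
    rng.shuffle(perm)
    mapping = dict(zip(aas, perm), **{"*": "*"})
    return {codon: mapping[aa] for codon, aa in code.items()}


def fraction_better(prop, trials=1000, seed=0):
    """Fraction of random codes with a lower mean square value than CODE."""
    rng = random.Random(seed)
    ref = mean_square(CODE, prop)
    better = sum(
        mean_square(shuffled_code(CODE, rng), prop) < ref for _ in range(trials)
    )
    return better / trials
```

A small value from `fraction_better(KD)` would indicate that few random codes are more error-robust than the standard code under this particular scale; the outcome depends on the scale and on the constraints imposed on the permutations.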
|
41
|
Zhang W, Niu Y, Xiong Y, Zhao M, Yu R, Liu J. Computational prediction of conformational B-cell epitopes from antigen primary structures by ensemble learning. PLoS One 2012; 7:e43575. [PMID: 22927994 PMCID: PMC3424238 DOI: 10.1371/journal.pone.0043575] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 12/16/2011] [Accepted: 07/26/2012] [Indexed: 11/18/2022] Open
Abstract
MOTIVATION The conformational B-cell epitopes are the specific sites on the antigens that have immune functions. The identification of conformational B-cell epitopes is of great importance to immunologists for facilitating the design of peptide-based vaccines. As an attempt to narrow the search for experimental validation, various computational models have been developed for the epitope prediction by using antigen structures. However, the application of these models is undermined by the limited number of available antigen structures. In contrast to the most of available structure-based methods, we here attempt to accurately predict conformational B-cell epitopes from antigen sequences. METHODS In this paper, we explore various sequence-derived features, which have been observed to be associated with the location of epitopes or ever used in the similar tasks. These features are evaluated and ranked by their discriminative performance on the benchmark datasets. From the perspective of information science, the combination of various features can usually lead to better results than the individual features. In order to build the robust model, we adopt the ensemble learning approach to incorporate various features, and develop the ensemble model to predict conformational epitopes from antigen sequences. RESULTS Evaluated by the leave-one-out cross validation, the proposed method gives out the mean AUC scores of 0.687 and 0.651 on two datasets respectively compiled from the bound structures and unbound structures. When compared with publicly available servers by using the independent dataset, our method yields better or comparable performance. The results demonstrate the proposed method is useful for the sequence-based conformational epitope prediction. AVAILABILITY The web server and datasets are freely available at http://bcell.whu.edu.cn.
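The AUC values reported above can be understood through the rank interpretation of the statistic: the probability that a randomly chosen epitope residue receives a higher score than a randomly chosen non-epitope residue. The sketch below computes that quantity directly; it is a generic illustration with a hypothetical function name and made-up scores, not the evaluation code of the cited study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    labels: sequence of 1 (epitope residue) or 0 (non-epitope residue).
    scores: predicted scores, higher meaning more epitope-like.
    Requires at least one positive and one negative example.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical residue scores for one held-out antigen.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Under leave-one-out cross-validation, one plausible way to obtain a mean AUC is to compute this value for each held-out antigen and average the results; whether the cited study averages in exactly this way is an assumption.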
Affiliation(s)
- Wen Zhang
  - School of Computer, Wuhan University, Wuhan, China
|
42
|
Anbouhi MH, Abolhassani M, Bouzari S, Khanahmad H, Aghasadeghi MR, Madadkar-Sobhani A, Amanzadeh A, Behdani M, Shokrgozar MA. Immunological evaluation of predicted linear B-cell epitopes of human CD20 antigen. Biotechnol Appl Biochem 2012; 59:186-92. [DOI: 10.1002/bab.1012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/16/2011] [Accepted: 02/07/2012] [Indexed: 11/09/2022]
|
43
|
Evaluation of hydropathy of amino acids from a comparison of their viscosities inside vesicles and on supported lipid bilayers. Colloids Surf B Biointerfaces 2012; 91:63-7. [PMID: 22118892 DOI: 10.1016/j.colsurfb.2011.10.038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2011] [Revised: 10/20/2011] [Accepted: 10/20/2011] [Indexed: 11/21/2022]
Abstract
The viscosity of amino acids enclosed in giant lipid vesicles (η(out)) subjected to a shear flow near a solid surface has been studied using quartz crystal microbalance (QCM). This viscosity has been compared with shear viscosity for the different amino acids adsorbed on supported bilayers (SLBs) (η(in)) of the lipids on quartz. Using a first approximation of vesicles as model rigid spheres, the measured viscosities and the extent of deformation of vesicles observed using optical microscopy, two non-dimensional parameters: the reduced volume and the ratio of (η(in))/(η(out)) have been analyzed as a function of physical parameters: vesicle substrate distance (vesicle vs. supported lipid bilayers), vesicle size and their variation as a function of the viscosity. The kinematics of the vesicles with the amino acids compared with the shear at supported lipid bilayers seems to describe a reasonable hydropathy scale for the amino acids. The results show that there is a direct correlation between the above parameters and the polarity variations in amino acids suggesting that the viscous force may be an important parameter and should be taken into account in studies on membrane proteins interacting with cells and cell adhesion in flow chambers where cell membrane and the adhesive substrate are in relative motion.
|
44
|
Su CH, Pal NR, Lin KL, Chung IF. Identification of amino acid propensities that are strong determinants of linear B-cell epitope using neural networks. PLoS One 2012; 7:e30617. [PMID: 22347389 PMCID: PMC3275595 DOI: 10.1371/journal.pone.0030617] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 08/23/2011] [Accepted: 12/22/2011] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND Identification of amino acid propensities that are strong determinants of linear B-cell epitope is very important to enrich our knowledge about epitopes. This can also help to obtain better epitope prediction. Typical linear B-cell epitope prediction methods combine various propensities in different ways to improve prediction accuracies. However, fewer but better features may yield better prediction. Moreover, for a propensity, when the sequence length is k, there will be k values, which should be treated as a single unit for feature selection and hence usual feature selection method will not work. Here we use a novel Group Feature Selecting Multilayered Perceptron, GFSMLP, which treats a group of related information as a single entity and selects useful propensities related to linear B-cell epitopes, and uses them to predict epitopes. METHODOLOGY/ PRINCIPAL FINDINGS We use eight widely known propensities and four data sets. We use GFSMLP to rank propensities by the frequency with which they are selected. We find that Chou's beta-turn and Ponnuswamy's polarity are better features for prediction of linear B-cell epitope. We examine the individual and combined discriminating power of the selected propensities and analyze the correlation between paired propensities. Our results show that the selected propensities are indeed good features, which also cooperate with other propensities to enhance the discriminating power for predicting epitopes. We find that individually polarity is not the best predictor, but it collaborates with others to yield good prediction. Usual feature selection methods cannot provide such information. CONCLUSIONS/ SIGNIFICANCE Our results confirm the effectiveness of active (group) feature selection by GFSMLP over the traditional passive approaches of evaluating various combinations of propensities. 
The GFSMLP-based feature selection can be extended to more than 500 remaining propensities to enhance our biological knowledge about epitopes and to obtain better prediction. A graphical-user-interface version of GFSMLP is available at: http://bio.classcloud.org/GFSMLP/.
Affiliation(s)
- Chun-Hung Su
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan, Republic of China
- Nikhil R. Pal
  - Electronics and Communication Sciences Unit, Indian Statistical Institute, Calcutta, India
- Ken-Li Lin
  - Computer Center, Chung Hua University, Hsinchu, Taiwan, Republic of China
- I-Fang Chung
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan, Republic of China
|
45
|
Huang JT, Xing DJ, Huang W. Relationship between protein folding kinetics and amino acid properties. Amino Acids 2011; 43:567-72. [DOI: 10.1007/s00726-011-1189-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 06/19/2011] [Accepted: 11/29/2011] [Indexed: 10/14/2022]
|
46
|
Wang Y, Wu W, Negre NN, White KP, Li C, Shah PK. Determinants of antigenicity and specificity in immune response for protein sequences. BMC Bioinformatics 2011; 12:251. [PMID: 21693021 PMCID: PMC3133554 DOI: 10.1186/1471-2105-12-251] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Received: 12/16/2010] [Accepted: 06/21/2011] [Indexed: 11/22/2022] Open
Abstract
Background Target specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, studies on proteomics for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. Therefore, it is important to understand the properties of protein sequences that are important for antigenicity and to identify small peptide epitopes and large regions in the linear sequence of the proteins whose utilization result in specific antibodies. Results Our analysis using protein properties suggested that sequence composition combined with evolutionary information and predicted secondary structure, as well as solvent accessibility is sufficient to predict successful peptide epitopes. The antigenicity and the specificity in immune response were also found to depend on the epitope length. We trained the B-Cell Epitope Oracle (BEOracle), a support vector machine (SVM) classifier, for the identification of continuous B-Cell epitopes with these protein properties as learning features. The BEOracle achieved an F1-measure of 81.37% on a large validation set. The BEOracle classifier outperformed the classical methods based on propensity and sophisticated methods like BCPred and Bepipred for B-Cell epitope prediction. The BEOracle classifier also identified peptides for the ChIP-grade antibodies from the modENCODE/ENCODE projects with 96.88% accuracy. High BEOracle score for peptides showed some correlation with the antibody intensity on Immunofluorescence studies done on fly embryos. Finally, a second SVM classifier, the B-Cell Region Oracle (BROracle) was trained with the BEOracle scores as features to predict the performance of antibodies generated with large protein regions with high accuracy. The BROracle classifier achieved accuracies of 75.26-63.88% on a validation set with immunofluorescence, immunohistochemistry, protein arrays and western blot results from Protein Atlas database. 
Conclusions Together our results suggest that antigenicity is a local property of the protein sequences and that protein sequence properties of composition, secondary structure, solvent accessibility and evolutionary conservation are the determinants of antigenicity and specificity in immune response. Moreover, specificity in immune response could also be accurately predicted for large protein regions without the knowledge of the protein tertiary structure or the presence of discontinuous epitopes. The dataset prepared in this work and the classifier models are available for download at https://sites.google.com/site/oracleclassifiers/.
Affiliation(s)
- Yulong Wang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute & Harvard School of Public Health, Boston 02115 MA, USA.
|
47
|
Singh AK, Rath SK, Misra K. Identification of epitopes in Indian human papilloma virus 16 E6: a bioinformatics approach. J Virol Methods 2011; 177:26-30. [PMID: 21699918 DOI: 10.1016/j.jviromet.2011.06.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/26/2011] [Revised: 05/31/2011] [Accepted: 06/07/2011] [Indexed: 11/16/2022]
Abstract
HPV-16 is reported as the cause of cervical and other related carcinomas. The early expressed protein E6 in cancer cells is found to be the target for immune therapeutic methods. The sequence of HPV-16 E6 (Accession No: ABK32509) from the NCBI databank was taken for this study. Hydrophilicity, flexibility, accessibility, turns, exposed surface, polarity and antigenic propensity scales were used for the B cell epitope prediction. MHC Class I and Class II alleles for the accession were predicted by the MHCPred 2.0 program. The epitope sequences were also identified. The computer-based prediction results show that the A0203 and DRB0101 alleles had lower IC50 values than other alleles. The peptide with the best binding affinity for the A0203 allele was 21HLCTELQTT30. For the DRB0101 allele, the peptide found was 39YCKQQLLRR48. Different structural features of the protein have also been predicted, including glycosylation, kinase C phosphorylation, casein kinase II phosphorylation and N-myristylation sites. These computational prediction programs show four glycosylation, five kinase C phosphorylation, two casein kinase II phosphorylation, zero N-myristylation and seven disulphide sites. Development and approval of new vaccines are the keys for control of cancer. Epitopes and other predicted structural features of the protein could be the best source of information and can help in molecular and medical studies of viral infection and development of HPV-associated cancer drugs.
Affiliation(s)
- Ajay Kumar Singh
- Centre for Biomedical Magnetic Research, SGPGI Campus, Lucknow, India.
|
48
|
Mittal A, Jayaram B. Backbones of Folded Proteins Reveal Novel Invariant Amino Acid Neighborhoods. J Biomol Struct Dyn 2011; 28:443-54. [DOI: 10.1080/073911011010524954] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/08/2023]
|
49
|
Moosavi F, Mohabatkar H, Mohsenzadeh S. Computer-aided analysis of structural properties and epitopes of Iranian HPV-16 E7 oncoprotein. Interdiscip Sci 2010; 2:367-72. [PMID: 21153780 DOI: 10.1007/s12539-010-0040-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 05/12/2009] [Revised: 01/16/2010] [Accepted: 01/18/2010] [Indexed: 11/30/2022]
Abstract
Infection by human papillomavirus type 16 (HPV-16) is the cause of 50% or more of cervical cancers in women. The E7 oncoprotein of HPV-16 has long been known as a potent immortalizing and transforming agent. We used different servers like PseAAC, MHC_binding, MHC_II_binding and Expasy for the present computational prediction. The results for T cell epitopes showed that the B1501, A0203, A0201, A0202, A6801 and DRB0405 alleles had lower IC50 than other alleles. We also predicted several peptides with the best binding affinities for the most frequent MHC class I and II alleles of the various ethnic groups living in different regions of Iran. Two peptides (26-35) and (44-52) were predicted as B-cell epitopes. According to this analysis, 1 N-glycosylation site, 2 PKC sites, 4 CK2 sites and 3 disulfide sites were predicted. Our computational study predicted that B cell epitope 1 was casein kinase II phosphorylated (site No. 31) and glycosylated (site No. 29). Putative MHC-I epitopes 3 and 5 and MHC-II epitopes 19, 21 and 26 were predicted to be casein kinase II phosphorylated. MHC-II epitopes 19 and 21 were predicted to be glycosylated. T cell epitopes 1, 13, 16 and 24 were demonstrated to be kinase C phosphorylated. The result of this analysis for Iranian HPV-16 E7 also indicated that 21.43%, 18.37% and 60.20% of the protein were in the α-helix, extended strand and random coil, respectively.
Affiliation(s)
- Fatemeh Moosavi
- Department of Biology, College of Sciences, Shiraz University, Iran
|
50
|
Ansari HR, Raghava GP. Identification of conformational B-cell Epitopes in an antigen from its primary sequence. Immunome Res 2010; 6:6. [PMID: 20961417 PMCID: PMC2974664 DOI: 10.1186/1745-7580-6-6] [Citation(s) in RCA: 201] [Impact Index Per Article: 14.4] [Received: 05/20/2010] [Accepted: 10/20/2010] [Indexed: 11/10/2022]
Abstract
Background One of the major challenges in the field of vaccine design is to predict conformational B-cell epitopes in an antigen. In the past, several methods have been developed for predicting conformational B-cell epitopes in an antigen from its tertiary structure. This is the first attempt in this area to predict conformational B-cell epitopes in an antigen from its amino acid sequence. Results All support vector machine (SVM) models were trained and tested on 187 non-redundant protein chains consisting of 2261 antibody-interacting residues of B-cell epitopes. Models were developed using binary profile of patterns (BPP) and physicochemical profile of patterns (PPP) and achieved a maximum MCC of 0.22 and 0.17, respectively. In this study, for the first time, an SVM model was developed using composition profile of patterns (CPP) and achieved a maximum MCC of 0.73 with an accuracy of 86.59%. We compared our CPP-based model with existing structure-based methods and observed that our sequence-based model is as good as structure-based methods. Conclusion This study demonstrates that prediction of conformational B-cell epitopes in an antigen is possible from its primary sequence. This study will be very useful in predicting conformational B-cell epitopes in antigens whose tertiary structures are not available. A web server, CBTOPE, has been developed for predicting B-cell epitopes: http://www.imtech.res.in/raghava/cbtope/.
Affiliation(s)
- Hifzur Rahman Ansari
- Bioinformatics Center, Institute of Microbial Technology, Sector 39-A, Chandigarh, India.
|