1
Caoili SEC. B-Cell Epitope Prediction for Antipeptide Paratopes with the HAPTIC2/HEPTAD User Toolkit (HUT). Methods Mol Biol 2024; 2821:9-32. [PMID: 38997477 DOI: 10.1007/978-1-0716-3914-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/14/2024]
Abstract
B-cell epitope prediction is key to developing peptide-based vaccines and immunodiagnostics along with antibodies for prophylactic, therapeutic and/or diagnostic use. This entails estimating paratope binding affinity for variable-length peptidic sequences subject to constraints on both paratope accessibility and antigen conformational flexibility, as described herein for the HAPTIC2/HEPTAD User Toolkit (HUT). HUT comprises the Heuristic Affinity Prediction Tool for Immune Complexes 2 (HAPTIC2), the HAPTIC2-like Epitope Prediction Tool for Antigen with Disulfide (HEPTAD) and the HAPTIC2/HEPTAD Input Preprocessor (HIP). HIP enables tagging of residues (e.g., in hydrophobic blobs, ordered regions and glycosylation motifs) for exclusion from downstream analyses by HAPTIC2 and HEPTAD. HAPTIC2 estimates paratope binding affinity for disulfide-free disordered peptidic antigens (by analogy between flexible-ligand docking and protein folding), from terms attributed to compaction (in view of sequence length, charge and temperature-dependent polyproline-II helical propensity), collapse (disfavored by residue bulkiness) and contact (with glycine and proline regarded as polar residues that hydrogen bond with paratopes). HEPTAD analyzes antigen sequences that each contain two cysteine residues for which the impact of disulfide pairing is estimated as a correction to the free-energy penalty of compaction. All of HUT is freely accessible online (https://freeshell.de/~badong/hut.htm).
Affiliation(s)
- Salvador Eugenio C Caoili
- Biomedical Innovations Research for Translational Health Science (BIRTHS) Laboratory, Department of Biochemistry and Molecular Biology, College of Medicine, University of the Philippines Manila, Ermita, Manila, Philippines.
2
Liu Y, Liu Y, Wang S, Zhu X. LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings. Interdiscip Sci 2023; 15:293-305. [PMID: 36646842 DOI: 10.1007/s12539-023-00549-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 12/28/2022] [Accepted: 01/03/2023] [Indexed: 01/18/2023]
Abstract
Accurate detection of linear B-cell epitopes (BCEs) is valuable for vaccine design, immunodiagnostic testing, antibody production, and disease prevention and treatment. Wet-lab experiments for determining linear BCEs are expensive and laborious, and cannot keep pace with the volume of modern protein sequence data. Computational methods, by contrast, can identify linear BCEs efficiently and at low cost. Although several computational methods are available, their performance is still not satisfactory. Thus, we propose a new method, LBCE-XGB, to predict linear BCEs based on the XGBoost algorithm. To represent the biological information concealed in peptide sequences, residue embeddings were obtained from a pre-trained domain-specific BERT model. In addition, five other types of attributes, including amino acid composition and amino acid antigenicity scale, were also extracted. The best feature combination was determined according to the cross-validation results. Compared with models developed using other deep learning and machine learning algorithms, LBCE-XGB achieves the top performance, with an AUROC of 0.845 in fivefold cross-validation. On the independent test set, our model attains an AUROC of 0.838, which is substantially higher than that of other state-of-the-art methods. These outcomes indicate that BERT representations can be an effective feature for predicting linear BCEs, and we believe that LBCE-XGB could be a useful tool for detecting linear B-cell epitopes with high accuracy and low cost.
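The AUROC values quoted in this abstract summarize how well predicted scores rank epitopes above non-epitopes. As a minimal sketch (not the authors' code), the Python function below computes AUROC from the pairwise-ranking identity. The function name `auroc` and the toy labels and scores are hypothetical and were invented purely for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) identity.

    labels: sequence of 0/1 values (1 = epitope); scores: predicted scores.
    A tie between a positive and a negative counts as half a correct ordering.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not from the paper).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

The quadratic loop keeps the definition visible; for large data sets a rank-based implementation would be the more practical choice.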
Affiliation(s)
- Yufeng Liu
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Yinbo Liu
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Shuyu Wang
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Xiaolei Zhu
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
3
Lu S, Li Y, Ma Q, Nan X, Zhang S. A Structure-Based B-cell Epitope Prediction Model Through Combing Local and Global Features. Front Immunol 2022; 13:890943. [PMID: 35844532 PMCID: PMC9283778 DOI: 10.3389/fimmu.2022.890943] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/07/2022] [Accepted: 05/23/2022] [Indexed: 11/24/2022] Open
Abstract
B-cell epitopes (BCEs) are a set of specific sites on the surface of an antigen that bind to an antibody produced by a B cell. The recognition of BCEs is a major challenge for drug design and vaccine development. Compared with experimental methods, computational approaches have strong potential for BCE prediction at much lower cost. However, most current methods focus on local information around the target residue without taking the global information of the whole antigen sequence into consideration. We propose a novel deep learning method that combines local and global features for BCE prediction. In our model, two parallel modules are built to extract local and global features from the antigen separately. For local features, we use Graph Convolutional Networks (GCNs) to capture information from the spatial neighbors of a target residue. For global features, Attention-Based Bidirectional Long Short-Term Memory (Att-BLSTM) networks are applied to extract information from the whole antigen sequence. The local and global features are then combined to predict BCEs. The experiments show that the proposed method achieves superior performance over state-of-the-art BCE prediction methods on benchmark datasets. We also compare performance with and without global features. The experimental results show that global features play an important role in BCE prediction. Our detailed case study on BCE prediction for the SARS-CoV-2 receptor binding domain confirms that our method is effective for predicting and clustering true BCEs.
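The local branch described in this abstract gathers information from the spatial neighbors of each residue. The sketch below is a deliberately simplified stand-in for one graph-convolution step: it builds a neighbor list from 3D coordinates and averages features over each neighborhood, omitting the learned weights and nonlinearity of a real GCN. The distance cutoff, coordinates, feature vectors and function names are all assumptions made for illustration and are not taken from the paper.

```python
def spatial_neighbors(coords, cutoff):
    """Return, for each residue, the indices of residues within `cutoff` (self included)."""
    neighbors = []
    for ci in coords:
        row = []
        for j, cj in enumerate(coords):
            dist_sq = sum((a - b) ** 2 for a, b in zip(ci, cj))
            if dist_sq <= cutoff ** 2:
                row.append(j)  # a residue is always its own neighbor (distance 0)
        neighbors.append(row)
    return neighbors


def mean_aggregate(features, neighbors):
    """One propagation step: replace each residue's features by its neighborhood mean."""
    dim = len(features[0])
    return [
        [sum(features[j][k] for j in row) / len(row) for k in range(dim)]
        for row in neighbors
    ]


# Toy antigen with three residues (hypothetical coordinates and features).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
toy_neighbors = spatial_neighbors(toy_coords, cutoff=10.0)
toy_local = mean_aggregate(toy_features, toy_neighbors)
```

In this toy case the first two residues are neighbors and share an averaged feature vector, while the distant third residue keeps its own features.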
Affiliation(s)
- Shuai Lu
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
- Yuguang Li
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
- Qiang Ma
- School of Life Sciences, Zhengzhou University, Zhengzhou, China
- Xiaofei Nan
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
- *Correspondence: Xiaofei Nan; Shoutao Zhang
- Shoutao Zhang
- School of Life Sciences, Zhengzhou University, Zhengzhou, China
- Longhu Laboratory of Advanced Immunology, Zhengzhou, China
- *Correspondence: Xiaofei Nan; Shoutao Zhang
4
Caoili SEC. Prediction of Variable-Length B-Cell Epitopes for Antipeptide Paratopes Using the Program HAPTIC. Protein Pept Lett 2022; 29:328-339. [PMID: 35125075 DOI: 10.2174/0929866529666220203101808] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/19/2021] [Revised: 12/02/2021] [Accepted: 12/13/2021] [Indexed: 01/05/2023]
Abstract
BACKGROUND: B-cell epitope prediction for antipeptide antibody responses enables peptide-based vaccine design and related translational applications. This entails estimating epitope-paratope binding free-energy changes from antigen sequence; but attempts to do so assuming uniform epitope length (e.g., of hexapeptide sequences, each spanning a typical paratope diameter when fully extended) have neglected empirically established variation in epitope length.
OBJECTIVE: This work aimed to develop a sequence-based physicochemical approach to variable-length B-cell epitope prediction for antipeptide paratopes recognizing flexibly disordered targets.
METHODS: Said approach was developed by analogy between epitope-paratope binding and protein folding modeled as polymer collapse, treating paratope structure implicitly. Epitope-paratope binding was thus conceptually resolved into processes of epitope compaction, collapse and contact, with epitope collapse presenting the main entropic barrier limiting epitope length among nonpolyproline sequences. The resulting algorithm was implemented as a computer program, namely the Heuristic Affinity Prediction Tool for Immune Complexes (HAPTIC), which is freely accessible via an online interface (http://badong.freeshell.org/haptic.htm). This was used in conjunction with published data on representative known peptide immunogens.
RESULTS: HAPTIC predicted immunodominant epitope sequences with lengths limited by penalties for both compaction and collapse, consistent with known paratope-bound structures of flexibly disordered epitopes. In most cases, the predicted association constant was greater than its experimentally determined counterpart but below the predicted upper bound for affinity maturation in vivo.
CONCLUSION: HAPTIC provides a physicochemically plausible means for estimating the affinity of antipeptide paratopes for sterically accessible and flexibly disordered peptidic antigen sequences by explicitly considering candidate B-cell epitopes of variable length.
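This abstract relates binding free-energy changes to association constants. For orientation, the standard thermodynamic relation ΔG° = -RT ln K_a (1 M standard state) links the two quantities. The sketch below uses only that textbook relation and does not reproduce HAPTIC's compaction, collapse and contact terms; the function names and the default temperature of 310.15 K are assumptions chosen for illustration.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def association_constant(delta_g_j_per_mol, temperature_k=310.15):
    """Association constant K_a (relative to a 1 M standard state) from a binding free energy."""
    return math.exp(-delta_g_j_per_mol / (GAS_CONSTANT * temperature_k))


def binding_free_energy(k_a, temperature_k=310.15):
    """Standard binding free-energy change in J/mol from an association constant."""
    if k_a <= 0:
        raise ValueError("association constant must be positive")
    return -GAS_CONSTANT * temperature_k * math.log(k_a)


# Hypothetical input: a binding free-energy change of -40 kJ/mol.
example_k_a = association_constant(-40_000.0)
example_delta_g = binding_free_energy(example_k_a)  # recovers the input within rounding
```

A more negative free-energy change gives a larger association constant, and a value of zero corresponds to K_a = 1.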
Affiliation(s)
- Salvador E C Caoili
- Biomedical Innovations Research for Translational Health Science (BIRTHS) Laboratory, Department of Biochemistry and Molecular Biology, College of Medicine, University of the Philippines Manila, Manila, Philippines
5
Delineating Surface Epitopes of Lyme Disease Pathogen Targeted by Highly Protective Antibodies of New Zealand White Rabbits. Infect Immun 2019; 87:IAI.00246-19. [PMID: 31085705 DOI: 10.1128/iai.00246-19] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/15/2019] [Accepted: 05/07/2019] [Indexed: 11/20/2022] Open
Abstract
Lyme disease (LD), the most prevalent vector-borne illness in the United States and Europe, is caused by Borreliella burgdorferi. No vaccine is available for humans. Dogmatically, B. burgdorferi can establish a persistent infection in the mammalian host (e.g., mice) due to a surface antigen, VlsE. This antigenically variable protein allows the spirochete to continually evade borreliacidal antibodies. However, our recent study has shown that the B. burgdorferi spirochete is effectively cleared by anti-B. burgdorferi antibodies of New Zealand White (NZW) rabbits, despite the surface expression of VlsE. Besides homologous protection, the rabbit antibodies also cross-protect against heterologous B. burgdorferi spirochetes and significantly reduce the pathology of LD arthritis in persistently infected mice. Thus, this finding that NZW rabbits develop a unique repertoire of very potent antibodies targeting the protective surface epitopes, despite abundant VlsE, prompted us to identify the specificities of the protective rabbit antibodies and their respective targets. By applying subtractive reverse vaccinology, which involved the use of random peptide phage display libraries coupled with next-generation sequencing and our computational algorithms, repertoires of nonprotective (early) and protective (late) rabbit antibodies were identified and directly compared. Consequently, putative surface epitopes that are unique to the protective rabbit sera were mapped. Importantly, the relevance of newly identified protection-associated epitopes for their surface exposure has been strongly supported by prior empirical studies. This study is significant because it now allows us to systematically test the putative epitopes for their protective efficacy with an ultimate goal of selecting the most efficacious targets for development of a long-awaited LD vaccine.
6
Manavalan B, Govindaraj RG, Shin TH, Kim MO, Lee G. iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction. Front Immunol 2018; 9:1695. [PMID: 30100904 PMCID: PMC6072840 DOI: 10.3389/fimmu.2018.01695] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Received: 04/23/2018] [Accepted: 07/10/2018] [Indexed: 11/13/2022] Open
Abstract
Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Due to the avalanche of protein sequence data discovered in the postgenomic age, it is essential to develop an automated computational method to enable fast and accurate identification of novel BCEs within a vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction called iBCE-EL, a fusion of two independent predictors, namely, extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which, respectively, use a combination of physicochemical properties (PCP) and amino acid composition and a combination of dipeptide and PCP as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on the independent data set. Results show that iBCE-EL significantly outperformed the state-of-the-art method with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented in a web-based platform, which is available at http://thegleelab.org/iBCE-EL. iBCE-EL contains two prediction modes. The first identifies peptide sequences as BCEs or non-BCEs, while the second gives users the option of mining potential BCEs from protein sequences.
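The Matthews correlation coefficient reported in this abstract condenses a binary confusion matrix into one value between -1 and 1. The sketch below shows the standard MCC formula only; the function name and the toy confusion-matrix counts are hypothetical and are not results from the paper.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Toy counts (hypothetical): 40 true positives, 10 false positives,
# 35 true negatives, 15 false negatives.
toy_mcc = matthews_corrcoef(tp=40, fp=10, tn=35, fn=15)
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when the BCE and non-BCE classes differ in size.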
Affiliation(s)
- Rajiv Gandhi Govindaraj
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong Ok Kim
- Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
7
Expressing Redundancy among Linear-Epitope Sequence Data Based on Residue-Level Physicochemical Similarity in the Context of Antigenic Cross-Reaction. Adv Bioinformatics 2016; 2016:1276594. [PMID: 27274725 PMCID: PMC4870339 DOI: 10.1155/2016/1276594] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/27/2015] [Revised: 03/29/2016] [Accepted: 04/10/2016] [Indexed: 01/15/2023] Open
Abstract
Epitope-based design of vaccines, immunotherapeutics, and immunodiagnostics is complicated by structural changes that radically alter immunological outcomes. This is obscured by expressing redundancy among linear-epitope data as fractional sequence-alignment identity, which fails to account for potentially drastic loss of binding affinity due to single-residue substitutions even where these might be considered conservative in the context of classical sequence analysis. From the perspective of immune function based on molecular recognition of epitopes, functional redundancy of epitope data (FRED) thus may be defined in a biologically more meaningful way based on residue-level physicochemical similarity in the context of antigenic cross-reaction, with functional similarity between epitopes expressed as the Shannon information entropy for differential epitope binding. Such similarity may be estimated in terms of structural differences between an immunogen epitope and an antigen epitope with reference to an idealized binding site of high complementarity to the immunogen epitope, by analogy between protein folding and ligand-receptor binding; but this underestimates potential for cross-reactivity, suggesting that epitope-binding site complementarity is typically suboptimal as regards immunologic specificity. The apparently suboptimal complementarity may reflect a tradeoff to attain optimal immune function that favors generation of immune-system components each having potential for cross-reactivity with a variety of epitopes.
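This abstract expresses functional similarity through Shannon information entropy. It does not state which probability distribution is used, so the sketch below shows only the generic entropy formula, applied under the assumption of a two-outcome distribution (a paratope bound to one epitope versus the other). The function name and the example probabilities are hypothetical.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    if any(p < 0 for p in probabilities) or abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    # Terms with p == 0 contribute nothing (the limit of p*log2(p) as p -> 0 is 0).
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical two-outcome cases: equal binding to both epitopes,
# versus exclusive binding to one of them.
indistinguishable = shannon_entropy([0.5, 0.5])  # maximal entropy, 1 bit
fully_selective = shannon_entropy([1.0, 0.0])    # zero entropy
```

Under this assumed reading, entropy near 1 bit would indicate that the two epitopes are bound almost interchangeably, and entropy near 0 would indicate strong discrimination between them.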
8
Caoili SEC. An integrative structure-based framework for predicting biological effects mediated by antipeptide antibodies. J Immunol Methods 2015; 427:19-29. [PMID: 26410103 DOI: 10.1016/j.jim.2015.09.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/29/2015] [Revised: 08/30/2015] [Accepted: 09/20/2015] [Indexed: 01/18/2023]
Abstract
A general framework is presented for predicting quantitative biological effects mediated by antipeptide antibodies, primarily on the basis of antigen structure (possibly featuring intrinsic disorder) analyzed to estimate epitope-paratope binding affinities, which in turn is considered within the context of dose-response relationships as regards antibody concentration. This is illustrated mainly using an approach based on protein structural energetics, whereby expected amounts of solvent-accessible surface area buried upon epitope-paratope binding are related to the corresponding binding affinity, which is estimated from putative B-cell epitope structure with implicit treatment of paratope structure, for antipeptide antibodies either reacting with peptides or cross-reacting with cognate protein antigens. Key methods described are implemented in SAPPHIRE/SUITE (Structural-energetic Analysis Program for Predicting Humoral Immune Response Epitopes/SAPPHIRE User Interface Tool Ensemble; publicly accessible via http://freeshell.de/~badong/suite.htm). Representative results thus obtained are compared with published experimental data on binding affinities and quantitative biological effects, with special attention to loss of paratope sidechain conformational entropy (neglected in previous analyses) and in light of key in-vivo constraints on antigen-antibody binding affinity and antibody-mediated effects. Implications for further refinement of B-cell epitope prediction methods are discussed as regards envisioned biomedical applications including the development of prophylactic and therapeutic antibodies, peptide-based vaccines and immunodiagnostics.
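This abstract places binding affinity in the context of dose-response relationships with respect to antibody concentration. One simple way to picture that link is the one-site equilibrium binding model, in which the fraction of epitope occupied is [Ab] / ([Ab] + K_d). The sketch below assumes that simple model, with free antibody concentration approximated by total antibody concentration; it is not the SAPPHIRE/SUITE method, and the function name and example concentrations are hypothetical.

```python
def fraction_bound(antibody_conc_molar, k_d_molar):
    """Fraction of epitope occupied at equilibrium under a one-site binding model."""
    if antibody_conc_molar < 0 or k_d_molar <= 0:
        raise ValueError("concentration must be >= 0 and K_d must be > 0")
    return antibody_conc_molar / (antibody_conc_molar + k_d_molar)


# Hypothetical antibody with a dissociation constant of 1 nM.
K_D = 1e-9
half_occupancy = fraction_bound(1e-9, K_D)  # antibody concentration equal to K_d
high_occupancy = fraction_bound(9e-9, K_D)  # ninefold excess over K_d
```

In this model, occupancy reaches one half when the antibody concentration equals K_d, so a higher-affinity antibody (lower K_d) reaches a given occupancy at a lower concentration.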
Affiliation(s)
- Salvador Eugenio C Caoili
- Department of Biochemistry and Molecular Biology, College of Medicine, University of the Philippines Manila, Manila, Philippines.
9
Lian Y, Ge M, Pan XM. EPMLR: sequence-based linear B-cell epitope prediction method using multiple linear regression. BMC Bioinformatics 2014; 15:414. [PMID: 25523327 PMCID: PMC4307399 DOI: 10.1186/s12859-014-0414-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 07/10/2014] [Accepted: 12/09/2014] [Indexed: 11/10/2022] Open
Abstract
Background: B-cell epitopes have been studied extensively due to their immunological applications, such as peptide-based vaccine development, antibody production, and disease diagnosis and therapy. Despite several decades of research, the accurate prediction of linear B-cell epitopes has remained a challenging task.
Results: In this work, based on the antigen’s primary sequence information, a novel linear B-cell epitope prediction model was developed using multiple linear regression (MLR). A 10-fold cross-validation test on a large non-redundant dataset was performed to evaluate the performance of our model. To alleviate the problem caused by noise in the negative dataset, 300 experiments utilizing 300 sub-datasets were performed. We achieved an overall sensitivity of 81.8%, a precision of 64.1% and an area under the receiver operating characteristic curve (AUC) of 0.728.
Conclusions: We have presented a reliable method for the identification of linear B-cell epitopes using the antigen’s primary sequence information. Moreover, a web server, EPMLR, has been developed for linear B-cell epitope prediction: http://www.bioinfo.tsinghua.edu.cn/epitope/EPMLR/.
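The sensitivity and precision figures in this abstract are ratios of confusion-matrix counts. The sketch below shows the standard definitions only; the function names and the toy counts are hypothetical and do not reproduce the paper's data.

```python
def sensitivity(tp, fn):
    """Fraction of true epitopes that the model recovers (also called recall)."""
    if tp + fn == 0:
        raise ValueError("no positive examples")
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of predicted epitopes that are true epitopes."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


# Toy counts (hypothetical): 8 true positives, 2 false negatives, 4 false positives.
toy_sensitivity = sensitivity(tp=8, fn=2)  # 8 of 10 true epitopes found
toy_precision = precision(tp=8, fp=4)      # 8 of 12 predictions correct
```

A model can score well on one measure and poorly on the other, which is why the two are usually reported together.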
Affiliation(s)
- Yao Lian
- The Key Laboratory of Bioinformatics, Ministry of Education, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
- Meng Ge
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
- Xian-Ming Pan
- The Key Laboratory of Bioinformatics, Ministry of Education, School of Life Sciences, Tsinghua University, Beijing, 100084, China.