1
Angaitkar P, Ram Janghel R, Prasad Sahu T. An MCDM approach for Reverse vaccinology model to predict bacterial protective antigens. Vaccine 2024:S0264-410X(24)00517-6. [PMID: 38704249 DOI: 10.1016/j.vaccine.2024.04.078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 01/26/2024] [Accepted: 04/20/2024] [Indexed: 05/06/2024]
Abstract
Reverse vaccinology (RV) is a significant step in sensible vaccine design. In recent years, many machine learning (ML) methods have been used to improve RV prediction accuracy. However, there are still issues with prediction accuracy and programme accessibility in ML-based RV. This paper presents a supervised ML-based method to classify bacterial protective antigens (BPAgs) and identify the model(s) that consistently perform well for the training dataset. Six ML classifiers are tested with physicochemical features extracted from a comprehensive training dataset. Selecting the best performing model from different performance metrics (accuracy, precision, recall, F1-score, and AUC-ROC) has not been easy, because all the metrics are equally important for predicting BPAgs. To address this issue, we propose a soft and hard ranking model based on a multi-criteria decision-making (MCDM) approach for selecting the best performing ML method that classifies BPAgs. First, our proposed model uses homologous proteins (positive and negative samples) from the Protegen and UniProt databases. Second, we apply four strategies of the Synthetic Minority Oversampling Technique and Edited Nearest Neighbour (SMOTE-ENN) to handle the data imbalance problem and train the model using ML methods. Third, we use the MCDM-based Technique for Order Preference by Similarity to the Ideal Solution (TOPSIS), integrated with the soft and hard ranking model. Entropy is used to obtain weighted evaluation criteria for ranking the models. Our experimental evaluations on benchmark datasets show that the proposed method, with the best performing models (Random Forest and Extreme Gradient Boosting), outperforms existing open-source RV methods.
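The ranking step described in this abstract can be illustrated with a minimal sketch of entropy-weighted TOPSIS. This is not the authors' implementation: the abstract does not specify the soft and hard ranking model, so only the generic entropy-weighting and TOPSIS steps are shown, and the classifier scores below are hypothetical placeholder values, not results from the paper. All five metrics are treated as benefit criteria (higher is better).

```python
import numpy as np

def entropy_weights(scores):
    """Entropy weights for a (models x criteria) matrix of positive scores."""
    scores = np.asarray(scores, dtype=float)
    m = scores.shape[0]
    p = scores / scores.sum(axis=0)                    # column-wise proportions
    safe_p = np.where(p > 0, p, 1.0)                   # avoid log(0); 0*log(0) counts as 0
    entropy = -(p * np.log(safe_p)).sum(axis=0) / np.log(m)
    divergence = 1.0 - entropy                         # more spread => more weight
    return divergence / divergence.sum()

def topsis(scores, weights):
    """Closeness of each model to the ideal solution (all benefit criteria)."""
    scores = np.asarray(scores, dtype=float)
    norm = scores / np.linalg.norm(scores, axis=0)     # vector normalisation per criterion
    v = norm * weights                                 # weighted normalised matrix
    ideal, anti = v.max(axis=0), v.min(axis=0)
    d_pos = np.linalg.norm(v - ideal, axis=1)          # distance to ideal solution
    d_neg = np.linalg.norm(v - anti, axis=1)           # distance to anti-ideal solution
    return d_neg / (d_pos + d_neg)

# Hypothetical scores: rows are classifiers, columns are
# accuracy, precision, recall, F1-score, AUC-ROC (illustrative only).
scores = np.array([
    [0.92, 0.91, 0.93, 0.92, 0.96],
    [0.88, 0.90, 0.85, 0.87, 0.93],
    [0.80, 0.78, 0.82, 0.80, 0.85],
])
weights = entropy_weights(scores)
closeness = topsis(scores, weights)
ranking = np.argsort(-closeness)                       # best model first
print(weights, closeness, ranking)
```

A model that is best on every criterion coincides with the ideal solution and gets a closeness of 1, and one that is worst on every criterion gets 0. The sketch assumes the criteria columns are not all identical, since otherwise the entropy weights are undefined.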
Affiliation(s)
- Pratik Angaitkar
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010, India.
- Rekh Ram Janghel
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010, India.
- Tirath Prasad Sahu
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010, India.
2
Hourigan D, Stefanovic E, Hill C, Ross RP. Promiscuous, persistent and problematic: insights into current enterococcal genomics to guide therapeutic strategy. BMC Microbiol 2024; 24:103. [PMID: 38539119 PMCID: PMC10976773 DOI: 10.1186/s12866-024-03243-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 02/28/2024] [Indexed: 04/19/2024] Open
Abstract
Vancomycin-resistant enterococci (VRE) are major opportunistic pathogens and the causative agents of serious diseases, such as urinary tract infections and endocarditis. VRE strains mainly include species of Enterococcus faecium and E. faecalis which can colonise the gastrointestinal tract (GIT) of patients and, following growth and persistence in the gut, can transfer to blood resulting in systemic dissemination in the body. Advancements in genomics have revealed that hospital-associated VRE strains are characterised by increased numbers of mobile genetic elements, higher numbers of antibiotic resistance genes and often lack active CRISPR-Cas systems. Additionally, comparative genomics have increased our understanding of dissemination routes among patients and healthcare workers. Since the efficiency of currently available antibiotics is rapidly declining, new measures to control infection and dissemination of these persistent pathogens are urgently needed. These approaches include combinatory administration of antibiotics, strengthening colonisation resistance of the gut microbiota to reduce VRE proliferation through commensals or probiotic bacteria, or switching to non-antibiotic bacterial killers, such as bacteriophages or bacteriocins. In this review, we discuss the current knowledge of the genomics of VRE isolates and state-of-the-art therapeutic advances against VRE infections.
Affiliation(s)
- David Hourigan
  - APC Microbiome Ireland, Biosciences Institute, Biosciences Research Institute, College Rd, University College, Cork, Ireland
  - School of Microbiology, University College Cork, College Rd, University College, Cork, Ireland
- Ewelina Stefanovic
  - APC Microbiome Ireland, Biosciences Institute, Biosciences Research Institute, College Rd, University College, Cork, Ireland
  - Teagasc Food Research Centre, Moorepark, Moorepark West, Fermoy, Co. Cork, Ireland
- Colin Hill
  - APC Microbiome Ireland, Biosciences Institute, Biosciences Research Institute, College Rd, University College, Cork, Ireland
  - School of Microbiology, University College Cork, College Rd, University College, Cork, Ireland
- R Paul Ross
  - APC Microbiome Ireland, Biosciences Institute, Biosciences Research Institute, College Rd, University College, Cork, Ireland.
  - School of Microbiology, University College Cork, College Rd, University College, Cork, Ireland.
  - Teagasc Food Research Centre, Moorepark, Moorepark West, Fermoy, Co. Cork, Ireland.
3
Bravi B. Development and use of machine learning algorithms in vaccine target selection. NPJ Vaccines 2024; 9:15. [PMID: 38242890 PMCID: PMC10798987 DOI: 10.1038/s41541-023-00795-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024] Open
Abstract
Computer-aided discovery of vaccine targets has become a cornerstone of rational vaccine design. In this article, I discuss how Machine Learning (ML) can inform and guide key computational steps in rational vaccine design concerned with the identification of B and T cell epitopes and correlates of protection. I provide examples of ML models, as well as types of data and predictions for which they are built. I argue that interpretable ML has the potential to improve the identification of immunogens also as a tool for scientific discovery, by helping elucidate the molecular processes underlying vaccine-induced immune responses. I outline the limitations and challenges in terms of data availability and method development that need to be addressed to bridge the gap between advances in ML predictions and their translational application to vaccine design.
Affiliation(s)
- Barbara Bravi
  - Department of Mathematics, Imperial College London, London, SW7 2AZ, UK.
4
Guarra F, Colombo G. Computational Methods in Immunology and Vaccinology: Design and Development of Antibodies and Immunogens. J Chem Theory Comput 2023; 19:5315-5333. [PMID: 37527403 PMCID: PMC10448727 DOI: 10.1021/acs.jctc.3c00513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Indexed: 08/03/2023]
Abstract
The design of new biomolecules able to harness immune mechanisms for the treatment of diseases is a prime challenge for computational and simulative approaches. For instance, in recent years, antibodies have emerged as an important class of therapeutics against a spectrum of pathologies. In cancer, immune-inspired approaches are witnessing a surge thanks to a better understanding of tumor-associated antigens and the mechanisms of their engagement or evasion from the human immune system. Here, we provide a summary of the main state-of-the-art computational approaches that are used to design antibodies and antigens, and in parallel, we review key methodologies for epitope identification for both B- and T-cell mediated responses. A special focus is devoted to the description of structure- and physics-based models, privileged over purely sequence-based approaches. We discuss the implications of novel methods in engineering biomolecules with tailored immunological properties for possible therapeutic uses. Finally, we highlight the extraordinary challenges and opportunities presented by the possible integration of structure- and physics-based methods with emerging Artificial Intelligence technologies for the prediction and design of novel antigens, epitopes, and antibodies.
Affiliation(s)
- Federica Guarra
  - Department of Chemistry, University of Pavia, Via Taramelli 12, 27100 Pavia, Italy
- Giorgio Colombo
  - Department of Chemistry, University of Pavia, Via Taramelli 12, 27100 Pavia, Italy
5
Cocorullo M, Chiarelli LR, Stelitano G. Improving Protection to Prevent Bacterial Infections: Preliminary Applications of Reverse Vaccinology against the Main Cystic Fibrosis Pathogens. Vaccines (Basel) 2023; 11:1221. [PMID: 37515037 PMCID: PMC10384294 DOI: 10.3390/vaccines11071221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 07/04/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023] Open
Abstract
Reverse vaccinology is a powerful tool that was recently used to develop vaccines starting from a pathogen genome. Some bacterial infections need to be prevented rather than treated. For example, individuals with chronic pulmonary diseases, such as Cystic Fibrosis, are prone to develop infections and biofilms in the thick mucus that covers their lungs, mainly caused by Burkholderia cepacia complex, Haemophilus influenzae, Mycobacterium abscessus complex, Pseudomonas aeruginosa and Staphylococcus aureus. These infections are complicated to treat, and prevention remains the best strategy. Despite the availability of vaccines against some strains of those pathogens, it is necessary to improve the immunization of people with Cystic Fibrosis against all of them. An effective approach is to develop a broad-spectrum vaccine that utilizes proteins that are well conserved across different species. In this context, reverse vaccinology, a method based on computational analysis of the genome of various microorganisms, appears to be one of the most promising tools for the identification of putative targets for broad-spectrum vaccine development. This review provides an overview of the vaccines that are under development by reverse vaccinology against the aforementioned pathogens, as well as the progress made so far.
Affiliation(s)
- Mario Cocorullo
  - Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, Via A. Ferrata 9, 27100 Pavia, Italy
- Laurent R Chiarelli
  - Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, Via A. Ferrata 9, 27100 Pavia, Italy
- Giovanni Stelitano
  - Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, Via A. Ferrata 9, 27100 Pavia, Italy
6
Preeti P, Nath SK, Arambam N, Sharma T, Choudhury PR, Choudhury A, Khanna V, Strych U, Hotez PJ, Bottazzi ME, Rawal K. Vaxi-DL: An Artificial Intelligence-Enabled Platform for Vaccine Development. Methods Mol Biol 2023; 2673:305-316. [PMID: 37258923 DOI: 10.1007/978-1-0716-3239-0_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Vaccine development is a complex and long process. It involves several steps, including computational studies, experimental analyses, animal model system studies, and clinical trials. This process can be accelerated by using in silico antigen screening to identify potential vaccine candidates. In this chapter, we describe a deep learning-based technique which utilizes 18 biological and 9154 physicochemical properties of proteins for finding potential vaccine candidates. Using this technique, a new web-based system, named Vaxi-DL, was developed which helped in finding new vaccine candidates from bacteria, protozoa, viruses, and fungi. Vaxi-DL is available at: https://vac.kamalrawal.in/vaxidl/ .
Affiliation(s)
- P Preeti
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Swarsat Kaushik Nath
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Nevidita Arambam
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Trapti Sharma
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Priyanka Ray Choudhury
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Alakto Choudhury
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Vrinda Khanna
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India
- Ulrich Strych
  - Department of Pediatrics, Division of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA
  - Texas Children's Hospital Center for Vaccine Development, Houston, TX, USA
- Peter J Hotez
  - Department of Pediatrics, Division of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA
  - Texas Children's Hospital Center for Vaccine Development, Houston, TX, USA
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
  - Department of Biology, Baylor University, Waco, TX, USA
- Maria Elena Bottazzi
  - Department of Pediatrics, Division of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA
  - Texas Children's Hospital Center for Vaccine Development, Houston, TX, USA
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
  - Department of Biology, Baylor University, Waco, TX, USA
- Kamal Rawal
  - Centre for Computational Biology and Bioinformatics, AIB, Amity University, Noida, Uttar Pradesh, India.
7
Ishwarlall TZ, Adeleke VT, Maharaj L, Okpeku M, Adeniyi AA, Adeleke MA. Identification of potential candidate vaccines against Mycobacterium ulcerans based on the major facilitator superfamily transporter protein. Front Immunol 2022; 13:1023558. [PMID: 36426350 PMCID: PMC9679648 DOI: 10.3389/fimmu.2022.1023558] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/19/2022] [Accepted: 10/19/2022] [Indexed: 11/09/2023] Open
Abstract
Buruli ulcer is a neglected tropical disease that is characterized by non-fatal lesion development. The causative agent is Mycobacterium ulcerans (M. ulcerans). There are no known vectors or transmission methods, preventing the development of control methods. There are effective diagnostic techniques and treatment routines; however, several socioeconomic factors may limit patients' abilities to receive these treatments. The Bacillus Calmette-Guérin vaccine developed against tuberculosis has shown limited efficacy, and no conventionally designed vaccines have passed clinical trials. This study aimed to generate a multi-epitope vaccine against M. ulcerans from the major facilitator superfamily transporter protein using an immunoinformatics approach. Twelve M. ulcerans genome assemblies were analyzed, resulting in the identification of 11 CD8+ and 7 CD4+ T-cell epitopes and 2 B-cell epitopes. These conserved epitopes were computationally predicted to be antigenic, immunogenic, non-allergenic, and non-toxic. The CD4+ T-cell epitopes were capable of inducing interferon-gamma and interleukin-4. They successfully bound to their respective human leukocyte antigen alleles in in silico docking studies. The expected global population coverage of the T-cell epitopes and their restricted human leukocyte antigen alleles was 99.90%. The population coverage of endemic regions ranged from 99.99% (Papua New Guinea) to 21.81% (Liberia). Two vaccine constructs were generated using the Toll-like receptor 2 and 4 agonists, LprG and RpfE, respectively. Both constructs were antigenic, non-allergenic, non-toxic, thermostable, basic, and hydrophilic. The DNA sequences of the vaccine constructs underwent optimization and were successfully cloned in silico into the pET-28a(+) plasmid. The vaccine constructs were successfully docked to their respective Toll-like receptors. Molecular dynamics simulations were carried out to analyze the binding interactions within the complex. The generated binding energies indicate the stability of both complexes. The constructs generated in this study display several favorable properties, with construct one displaying a greater range of favorable properties. However, further analysis and laboratory validation are required.
Affiliation(s)
- Tamara Z. Ishwarlall
  - Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Durban, South Africa
- Victoria T. Adeleke
  - Department of Chemical Engineering, Mangosuthu University of Technology, Durban, South Africa
- Leah Maharaj
  - Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Durban, South Africa
- Moses Okpeku
  - Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Durban, South Africa
- Adebayo A. Adeniyi
  - Department of Chemistry, Faculty of Natural and Agricultural Sciences, University of the Free State, Bloemfontein, South Africa
  - Department of Industrial Chemistry, Federal University Oye Ekiti, Oye-Ekiti, Ekiti State, Nigeria
- Matthew A. Adeleke
  - Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Durban, South Africa
8
Huffman A, Ong E, Hur J, D’Mello A, Tettelin H, He Y. COVID-19 vaccine design using reverse and structural vaccinology, ontology-based literature mining and machine learning. Brief Bioinform 2022; 23:bbac190. [PMID: 35649389 PMCID: PMC9294427 DOI: 10.1093/bib/bbac190] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/30/2022] [Revised: 04/13/2022] [Accepted: 04/26/2022] [Indexed: 12/11/2022] Open
Abstract
Rational vaccine design, especially vaccine antigen identification and optimization, is critical to successful and efficient vaccine development against various infectious diseases including coronavirus disease 2019 (COVID-19). In general, computational vaccine design includes three major stages: (i) identification and annotation of experimentally verified gold standard protective antigens through literature mining, (ii) rational vaccine design using reverse vaccinology (RV) and structural vaccinology (SV) and (iii) post-licensure vaccine success and adverse event surveillance and its usage for vaccine design. Protegen is a database of experimentally verified protective antigens, which can be used as gold standard data for rational vaccine design. RV predicts protective antigen targets primarily from genome sequence analysis. SV refines antigens through structural engineering. Recently, RV and SV approaches, with the support of various machine learning methods, have been applied to COVID-19 vaccine design. The analysis of post-licensure vaccine adverse event report data also provides valuable results in terms of vaccine safety and how vaccines should be used or paused. Ontology standardizes and incorporates heterogeneous data and knowledge in a human- and computer-interpretable manner, further supporting machine learning and vaccine design. Future directions on rational vaccine design are discussed.
Affiliation(s)
- Anthony Huffman
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Junguk Hur
  - Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota 58202, USA
- Adonis D’Mello
  - Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Hervé Tettelin
  - Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Yongqun He
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
9
Goodswen SJ, Kennedy PJ, Ellis JT. Compilation of parasitic immunogenic proteins from 30 years of published research using machine learning and natural language processing. Sci Rep 2022; 12:10349. [PMID: 35725870 PMCID: PMC9208253 DOI: 10.1038/s41598-022-13790-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/26/2021] [Accepted: 05/18/2022] [Indexed: 12/02/2022] Open
Abstract
The World Health Organisation reported in 2020 that six of the top 10 sources of death in low-income countries are parasites. Parasites are microorganisms in a relationship with a larger organism, the host. They acquire all benefits at the host’s expense. A disease develops if the parasitic infection disrupts normal functioning of the host. This disruption can range from mild to severe, including death. Humans and livestock continue to be challenged by established and emerging infectious disease threats. Vaccination is the most efficient tool for preventing current and future threats. Immunogenic proteins sourced from the disease-causing parasite are worthwhile vaccine components (subunits) due to reliable safety and manufacturing capacity. Publications with ‘subunit vaccine’ in their title have accumulated to thousands over the last three decades. However, there are possibly thousands more reporting immunogenicity results without mentioning ‘subunit’ and/or ‘vaccine’. The exact number is unclear given the non-standardised keywords in publications. The study aim is to identify parasite proteins that induce a protective response in an animal model as reported in the scientific literature within the last 30 years using machine learning and natural language processing. Source code to fulfil this aim and the vaccine candidate list obtained is made available.
Affiliation(s)
- Stephen J Goodswen
  - School of Life Sciences, University of Technology Sydney, 15 Broadway, Ultimo, NSW, 2007, Australia
- Paul J Kennedy
  - School of Computer Science, Faculty of Engineering and Information Technology and the Australian Artificial Intelligence Institute, University of Technology Sydney, 15 Broadway, Ultimo, NSW, 2007, Australia
- John T Ellis
  - School of Life Sciences, University of Technology Sydney, 15 Broadway, Ultimo, NSW, 2007, Australia.
10
Ishwarlall TZ, Okpeku M, Adeniyi AA, Adeleke MA. The search for a Buruli Ulcer vaccine and the effectiveness of the Bacillus Calmette-Guérin vaccine. Acta Trop 2022; 228:106323. [PMID: 35065013 DOI: 10.1016/j.actatropica.2022.106323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/04/2021] [Revised: 01/16/2022] [Accepted: 01/18/2022] [Indexed: 11/01/2022]
Abstract
Buruli Ulcer is a neglected tropical disease that is caused by Mycobacterium ulcerans. It is not fatal; however, it manifests a range of devastating symptoms on the hosts' bodies. Various drugs and treatments are available for the disease; however, they are often costly and have adverse effects. There is still much uncertainty regarding the mode of transmission, vectors, and reservoir. At present, there are no official vector control methods or prevention methods, and no vaccine is licensed to prevent infection. The Bacillus Calmette-Guérin vaccine developed against tuberculosis has some effectiveness against M. ulcerans. However, it is unable to induce long-lasting protection. Various types of vaccines have been developed specifically against M. ulcerans; however, to date, none has entered clinical trials or been released for public use. Additional awareness and funding are needed for research in this field and the development of more treatments, diagnostic tools, and vaccines.
11
Rawal K, Sinha R, Nath SK, Preeti P, Kumari P, Gupta S, Sharma T, Strych U, Hotez P, Bottazzi ME. Vaxi-DL: A web-based deep learning server to identify potential vaccine candidates. Comput Biol Med 2022; 145:105401. [PMID: 35381451 DOI: 10.1016/j.compbiomed.2022.105401] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/24/2021] [Revised: 03/10/2022] [Accepted: 03/10/2022] [Indexed: 11/19/2022]
Abstract
The development of a new vaccine is a challenging exercise involving several steps including computational studies, experimental work, and animal studies followed by clinical studies. To accelerate the process, in silico screening is frequently used for antigen identification. Here, we present Vaxi-DL, web-based deep learning (DL) software that evaluates the potential of protein sequences to serve as vaccine target antigens. Four different DL pathogen models were trained to predict target antigens in bacteria, protozoa, fungi, and viruses that cause infectious diseases in humans. Datasets containing antigenic and non-antigenic sequences were derived from known vaccine candidates and the Protegen database. Biological and physicochemical properties were computed for the datasets using publicly available bioinformatics tools. For each of the four pathogen models, the datasets were divided into training, validation, and testing subsets and then scaled and normalised. The models were constructed using Fully Connected Layers (FCLs), hyper-tuned, and trained using the training subset. Accuracy, sensitivity, specificity, precision, recall, and AUC (Area under the Curve) were used as metrics to assess the performance of these models. The models were benchmarked using independent datasets of known target antigens against other prediction tools such as VaxiJen and Vaxign-ML. We also tested Vaxi-DL on 219 known potential vaccine candidates (PVC) from 37 different pathogens. Our tool predicted 175 PVCs correctly out of 219 sequences. We also tested Vaxi-DL on different datasets obtained from multiple resources. Our tool has demonstrated an average sensitivity of 93% and will thus be a useful tool for prioritising PVCs for preclinical studies.
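The evaluation metrics named in this abstract can be computed from confusion-matrix counts, as in the minimal sketch below. The function name `binary_metrics` is hypothetical and not part of Vaxi-DL. The worked case uses the one pair of counts the abstract reports (175 of 219 known candidates predicted correctly) under the assumption that all 219 sequences are true positives, so the fraction recovered equals the sensitivity on that set alone; the 93% figure in the abstract is an average over other datasets and is not reproduced here.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # sensitivity and recall are the same quantity: TP / (TP + FN)
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
    }

# Positives-only test set from the abstract: 175 recovered, 44 missed.
# With no negative examples, specificity is undefined (NaN).
pvc = binary_metrics(tp=175, fp=0, tn=0, fn=219 - 175)
print(round(pvc["sensitivity"], 3))
```

On a positives-only set such as this one, accuracy and sensitivity coincide, which is why an independent set containing negatives is needed to assess specificity and precision.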
Affiliation(s)
- Kamal Rawal
  - Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
- Robin Sinha
  - Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
- P Preeti
  - Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
- Priya Kumari
  - Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
- Srijanee Gupta
  - Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
- Trapti Sharma
  - Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
- Ulrich Strych
  - Texas Children's Center for Vaccine Development, Departments of Pediatrics and Molecular Virology and Microbiology, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA.
- Peter Hotez
  - Texas Children's Center for Vaccine Development, Departments of Pediatrics and Molecular Virology and Microbiology, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA; Department of Biology, Baylor University, Waco, TX, USA.
- Maria Elena Bottazzi
  - Texas Children's Center for Vaccine Development, Departments of Pediatrics and Molecular Virology and Microbiology, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA; Department of Biology, Baylor University, Waco, TX, USA.
12
Singh R, Capalash N, Sharma P. Vaccine development to control the rising scourge of antibiotic-resistant Acinetobacter baumannii: a systematic review. 3 Biotech 2022; 12:85. [PMID: 35261870 PMCID: PMC8890014 DOI: 10.1007/s13205-022-03148-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/05/2021] [Accepted: 02/11/2022] [Indexed: 03/02/2023] Open
Abstract
Acinetobacter baumannii has emerged as one of the major nosocomial pathogens, and the global emergence of multidrug-resistant strains has become a challenge for developing effective treatment options. A. baumannii has developed resistance to almost all antibiotics, viz. beta-lactams, carbapenems, tigecycline and now colistin, a last-resort antibiotic. The world is on the cusp of a post-antibiotic era, and the evolution of multi-, extreme- and pan–drug-resistant A. baumannii strains is its obvious harbinger. Various combinations of antibiotics have been investigated, but no successful treatment option is available. All these failed efforts have led researchers to develop and implement prophylactic vaccination for the prevention of infections caused by this pathogen. In this review, the advantages and disadvantages of active and passive immunization and the types of sub-unit and multi-component vaccine candidates investigated against A. baumannii, viz. whole cell organism, outer membrane vesicles, outer membrane complexes, conjugate vaccines and sub-unit vaccines, are discussed. In addition, the benefits of reverse vaccinology are emphasized, in which potential vaccine candidates are predicted using online bioinformatic tools prior to in vivo validation.
13
Ong E, He Y. Vaccine Design by Reverse Vaccinology and Machine Learning. Methods Mol Biol 2022; 2414:1-16. [PMID: 34784028 DOI: 10.1007/978-1-0716-1900-1_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Reverse vaccinology (RV) is the state-of-the-art vaccine development strategy that starts with predicting vaccine antigens by bioinformatics analysis of the whole genome of a pathogen of interest. Vaxign is the first web-based RV vaccine prediction method based on calculating and filtering different criteria of proteins. Vaxign-ML is a new Vaxign machine learning (ML) method that predicts vaccine antigens using extreme gradient boosting, building on advances in technology and the accumulation of protective antigen data. Using a benchmark dataset, Vaxign-ML showed superior performance in comparison to existing open-source RV tools. Vaxign-ML is also implemented within the web-based Vaxign platform to support easy and intuitive access. Vaxign-ML is also available as a command-based software package for more advanced and customizable vaccine antigen prediction. Both Vaxign and Vaxign-ML have been applied to predict SARS-CoV-2 (cause of COVID-19) and Brucella vaccine antigens to demonstrate the integrative approach to analyzing and selecting vaccine candidates using the Vaxign platform.
Affiliation(s)
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
  - GlaxoSmithKline Vaccines, Rixensart, Belgium
- Yongqun He
  - Center of Computational Medicine and Bioinformatics, Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, USA
14
Wang AYL. Modified mRNA-Based Vaccines Against Coronavirus Disease 2019. Cell Transplant 2022; 31:9636897221090259. [PMID: 35438579 PMCID: PMC9021518 DOI: 10.1177/09636897221090259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022] Open
Abstract
The coronavirus disease 2019 (COVID-19) pandemic continues to cause deaths worldwide and poses a considerable, historically unprecedented challenge to health care and economic systems. Many therapeutic medicines have primarily focused on preventing severe organ damage and complications, which can be fatal in some confirmed cases. Synthesized modified mRNA (modRNA) represents a nonviral, integration-free, zero-footprint, efficient, and safe strategy for vaccine discovery. modRNA-based technology has facilitated the rapid development of the first COVID-19 vaccines due to its cost- and time-saving properties, thus initiating a new era of prophylactic vaccines against infectious diseases. Recently, COVID-19 modRNA vaccines were approved, and a large-scale vaccination campaign began worldwide. To date, results suggest that the modRNA vaccines are highly effective against the viral infection that causes COVID-19. Although short-term studies have reported that their safety is acceptable, long-term safety and protective immunity remain unclear. In this review, we describe two major approved modRNA vaccines and discuss their potential myocarditis complications.
Affiliation(s)
- Aline Yen Ling Wang
  - Center for Vascularized Composite Allotransplantation, Chang Gung Memorial Hospital, Taoyuan, Taiwan
15
Ong E, Cooke MF, Huffman A, Xiang Z, Wong MU, Wang H, Seetharaman M, Valdez N, He Y. Vaxign2: the second generation of the first Web-based vaccine design program using reverse vaccinology and machine learning. Nucleic Acids Res 2021; 49:W671-W678. [PMID: 34009334 PMCID: PMC8218197 DOI: 10.1093/nar/gkab279] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 02/07/2021] [Revised: 03/29/2021] [Accepted: 04/15/2021] [Indexed: 01/12/2023] Open
Abstract
Vaccination is one of the most significant inventions in medicine. Reverse vaccinology (RV) is a state-of-the-art technique to predict vaccine candidates from a pathogen's genome(s). To promote vaccine development, we updated Vaxign2, the first web-based vaccine design program using reverse vaccinology with machine learning. Vaxign2 is a comprehensive web server for rational vaccine design, consisting of predictive and computational workflow components. The predictive part includes the original Vaxign filtering-based method and a new machine learning-based method, Vaxign-ML. The benchmarking results using a validation dataset showed that Vaxign-ML had superior prediction performance compared to other RV tools. Besides the prediction component, Vaxign2 implements various post-prediction analyses that significantly enhance users' capability to refine the prediction results based on different vaccine design rationales and considerably reduce the time users need to analyze the Vaxign/Vaxign-ML prediction results. Users provide proteome sequences as input data, select candidates based on Vaxign outputs and Vaxign-ML scores, and perform post-prediction analysis. Vaxign2 also includes precomputed results from approximately 1 million proteins in 398 proteomes of 36 pathogens. As a demonstration, Vaxign2 was used to effectively analyse SARS-CoV-2, the coronavirus causing COVID-19. The comprehensive framework of Vaxign2 can support better and more rational vaccine design. Vaxign2 is publicly accessible at http://www.violinet.org/vaxign2.
Affiliation(s)
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Michael F Cooke
  - School of Information, University of Michigan, Ann Arbor, MI 48109, USA
  - Undergraduate Research Opportunity Program, College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI 48109, USA
- Anthony Huffman
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Zuoshuang Xiang
  - Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Mei U Wong
  - Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Haihe Wang
  - Department of Pathogenobiology, Daqing Branch of Harbin Medical University, Daqing, Heilongjiang, China
- Meenakshi Seetharaman
  - Undergraduate Research Opportunity Program, College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI 48109, USA
- Ninotchka Valdez
  - Undergraduate Research Opportunity Program, College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI 48109, USA
- Yongqun He
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
16
Bacterial Immunogenicity Prediction by Machine Learning Methods. Vaccines (Basel) 2020; 8:vaccines8040709. [PMID: 33265930 PMCID: PMC7711804 DOI: 10.3390/vaccines8040709] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/02/2020] [Revised: 11/19/2020] [Accepted: 11/24/2020] [Indexed: 12/18/2022] Open
Abstract
The identification of protective immunogens is the most important and demanding initial step in the long and expensive process of vaccine design and development. Machine learning (ML) methods are very effective in data mining and in the analysis of big data such as microbial proteomes. They are able to significantly reduce the experimental work needed to discover novel vaccine candidates. Here, we applied six supervised ML methods (partial least squares-based discriminant analysis, k nearest neighbor (kNN), random forest (RF), support vector machine (SVM), random subspace method (RSM), and extreme gradient boosting) to a set of 317 known bacterial immunogens and 317 bacterial non-immunogens and derived models for immunogenicity prediction. The models were validated by internal cross-validation in 10 groups from the training set and by the external test set. All of them showed good predictive ability, but the xgboost model displayed the most prominent ability to identify immunogens, recognizing 84% of the known immunogens in the test set. The combined RSM-kNN model was the best at recognizing non-immunogens, identifying 92% of them in the test set. The three best performing ML models (xgboost, RSM-kNN, and RF) were implemented in the new version of the server VaxiJen, and the prediction of bacterial immunogens is now based on majority voting.
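The majority-voting step mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes three binary classifiers that each emit 1 (immunogen) or 0 (non-immunogen) per protein. The function name `majority_vote` and the toy predictions are hypothetical, and this is not the VaxiJen server's actual implementation.

```python
def majority_vote(model_predictions):
    """Combine binary labels from several models by simple majority.

    model_predictions: list of equal-length lists of 0/1 labels, one per model.
    Hypothetical helper for illustration only; not taken from VaxiJen.
    """
    if not model_predictions:
        raise ValueError("at least one model is required")
    n_items = len(model_predictions[0])
    if any(len(p) != n_items for p in model_predictions):
        raise ValueError("all models must label the same number of proteins")
    n_models = len(model_predictions)
    # A protein is called an immunogen when more than half of the models say 1.
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_predictions)]


# Toy predictions for four proteins from three assumed models.
xgboost_pred = [1, 0, 1, 0]
rsm_knn_pred = [1, 1, 0, 0]
rf_pred = [0, 1, 1, 0]
consensus = majority_vote([xgboost_pred, rsm_knn_pred, rf_pred])
print(consensus)
```

With an odd number of binary models there are no ties, which is one plausible reason to combine exactly three models.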
17
Keshavarzi Arshadi A, Webb J, Salem M, Cruz E, Calad-Thomson S, Ghadirian N, Collins J, Diez-Cecilia E, Kelly B, Goodarzi H, Yuan JS. Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development. Front Artif Intell 2020; 3:65. [PMID: 33733182 PMCID: PMC7861281 DOI: 10.3389/frai.2020.00065] [Citation(s) in RCA: 97] [Impact Index Per Article: 24.3] [Received: 05/09/2020] [Accepted: 07/17/2020] [Indexed: 12/31/2022] Open
Abstract
SARS-COV-2 has roused the scientific community with a call to action to combat the growing pandemic. At the time of this writing, there are as yet no novel antiviral agents or approved vaccines available for deployment as a frontline defense. Understanding the pathobiology of COVID-19 could aid scientists in their discovery of potent antivirals by elucidating unexplored viral pathways. One method for accomplishing this is the leveraging of computational methods to discover new candidate drugs and vaccines in silico. In the last decade, machine learning-based models, trained on specific biomolecules, have offered inexpensive and rapid implementation methods for the discovery of effective viral therapies. Given a target biomolecule, these models are capable of predicting inhibitor candidates in a structural-based manner. If enough data are presented to a model, it can aid the search for a drug or vaccine candidate by identifying patterns within the data. In this review, we focus on the recent advances of COVID-19 drug and vaccine development using artificial intelligence and the potential of intelligent training for the discovery of COVID-19 therapeutics. To facilitate applications of deep learning for SARS-COV-2, we highlight multiple molecular targets of COVID-19, inhibition of which may increase patient survival. Moreover, we present CoronaDB-AI, a dataset of compounds, peptides, and epitopes discovered either in silico or in vitro that can be potentially used for training models in order to extract COVID-19 treatment. The information and datasets provided in this review can be used to train deep learning-based models and accelerate the discovery of effective viral therapies.
Affiliation(s)
- Arash Keshavarzi Arshadi
  - Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States
- Julia Webb
  - Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States
- Milad Salem
  - Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, United States
- Niloofar Ghadirian
  - Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, United States
- Jennifer Collins
  - Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States
- Hani Goodarzi
  - Department of Biochemistry and Biophysics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
- Jiann Shiun Yuan
  - Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, United States
18
Ong E, Wong MU, Huffman A, He Y. COVID-19 Coronavirus Vaccine Design Using Reverse Vaccinology and Machine Learning. Front Immunol 2020; 11:1581. [PMID: 32719684 PMCID: PMC7350702 DOI: 10.3389/fimmu.2020.01581] [Citation(s) in RCA: 199] [Impact Index Per Article: 49.8] [Received: 05/14/2020] [Accepted: 06/15/2020] [Indexed: 12/16/2022] Open
Abstract
To ultimately combat the emerging COVID-19 pandemic, it is desired to develop an effective and safe vaccine against this highly contagious disease caused by the SARS-CoV-2 coronavirus. Our literature and clinical trial survey showed that the whole virus, as well as the spike (S) protein, nucleocapsid (N) protein, and membrane (M) protein, have been tested for vaccine development against SARS and MERS. However, these vaccine candidates might lack the induction of complete protection and have safety concerns. We then applied the Vaxign and the newly developed machine learning-based Vaxign-ML reverse vaccinology tools to predict COVID-19 vaccine candidates. Our Vaxign analysis found that the SARS-CoV-2 N protein sequence is conserved with SARS-CoV and MERS-CoV but not from the other four human coronaviruses causing mild symptoms. By investigating the entire proteome of SARS-CoV-2, six proteins, including the S protein and five non-structural proteins (nsp3, 3CL-pro, and nsp8-10), were predicted to be adhesins, which are crucial to the viral adhering and host invasion. The S, nsp3, and nsp8 proteins were also predicted by Vaxign-ML to induce high protective antigenicity. Besides the commonly used S protein, the nsp3 protein has not been tested in any coronavirus vaccine studies and was selected for further investigation. The nsp3 was found to be more conserved among SARS-CoV-2, SARS-CoV, and MERS-CoV than among 15 coronaviruses infecting human and other animals. The protein was also predicted to contain promiscuous MHC-I and MHC-II T-cell epitopes, and the predicted linear B-cell epitopes were found to be localized on the surface of the protein. Our predicted vaccine targets have the potential for effective and safe COVID-19 vaccine development. We also propose that an "Sp/Nsp cocktail vaccine" containing a structural protein(s) (Sp) and a non-structural protein(s) (Nsp) would stimulate effective complementary immune responses.
Affiliation(s)
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Mei U Wong
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, United States
- Anthony Huffman
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Yongqun He
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, United States
19
Ong E, Wang H, Wong MU, Seetharaman M, Valdez N, He Y. Vaxign-ML: supervised machine learning reverse vaccinology model for improved prediction of bacterial protective antigens. Bioinformatics 2020; 36:3185-3191. [PMID: 32096826 PMCID: PMC7214037 DOI: 10.1093/bioinformatics/btaa119] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 09/11/2019] [Revised: 02/10/2020] [Accepted: 02/18/2020] [Indexed: 01/19/2023] Open
Abstract
MOTIVATION: Reverse vaccinology (RV) is a milestone in rational vaccine design, and machine learning (ML) has been applied to enhance the accuracy of RV prediction. However, ML-based RV still faces challenges in prediction accuracy and program accessibility.
RESULTS: This study presents Vaxign-ML, a supervised ML classification to predict bacterial protective antigens (BPAgs). To identify the best ML method with optimized conditions, five ML methods were tested with biological and physiochemical features extracted from well-defined training data. Nested 5-fold cross-validation and leave-one-pathogen-out validation were used to ensure unbiased performance assessment and the capability to predict vaccine candidates against a new emerging pathogen. The best performing model (eXtreme Gradient Boosting) was compared to three publicly available programs (Vaxign, VaxiJen, and Antigenic), one SVM-based method, and one epitope-based method using a high-quality benchmark dataset. Vaxign-ML showed superior performance in predicting BPAgs. Vaxign-ML is hosted in a publicly accessible web server and a standalone version is also available.
AVAILABILITY AND IMPLEMENTATION: Vaxign-ML website at http://www.violinet.org/vaxign/vaxign-ml, Docker standalone Vaxign-ML available at https://hub.docker.com/r/e4ong1031/vaxign-ml and source code is available at https://github.com/VIOLINet/Vaxign-ML-docker.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
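The leave-one-pathogen-out validation named in this abstract can be sketched as a grouped split in which every protein from one pathogen is held out in turn. This is a minimal illustration under the assumption that each protein carries a pathogen label. The function name `leave_one_pathogen_out` and the toy labels are hypothetical and are not taken from the Vaxign-ML source code.

```python
def leave_one_pathogen_out(pathogen_labels):
    """Yield (held_out, train_indices, test_indices) for each pathogen.

    pathogen_labels: one pathogen name per protein, in dataset order.
    Hypothetical helper for illustration; not the Vaxign-ML implementation.
    """
    for held_out in sorted(set(pathogen_labels)):
        # Train on every protein that does NOT come from the held-out pathogen.
        train_idx = [i for i, p in enumerate(pathogen_labels) if p != held_out]
        # Test only on proteins from the held-out pathogen.
        test_idx = [i for i, p in enumerate(pathogen_labels) if p == held_out]
        yield held_out, train_idx, test_idx


# Toy dataset: four proteins from three assumed pathogens.
labels = ["Brucella", "Neisseria", "Brucella", "Yersinia"]
splits = list(leave_one_pathogen_out(labels))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

Because no protein from the test pathogen is seen during training, this kind of split mimics prediction for a newly emerging pathogen. scikit-learn's `LeaveOneGroupOut` provides the same style of grouped split when a library implementation is preferred.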
Affiliation(s)
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Haihe Wang
  - Department of Pathogenobiology, Daqing Branch of Harbin Medical University, Daqing 163319, China
  - Unit for Laboratory Animal Medicine
- Ninotchka Valdez
  - College of Literature, Science, and the Arts, University of Michigan
- Yongqun He
  - Unit for Laboratory Animal Medicine
  - Department of Microbiology and Immunology
  - Center of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
20
Ong E, Wong MU, Huffman A, He Y. COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning. bioRxiv: the preprint server for biology 2020:2020.03.20.000141. [PMID: 32511333 PMCID: PMC7239068 DOI: 10.1101/2020.03.20.000141] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 04/12/2023]
Abstract
To ultimately combat the emerging COVID-19 pandemic, it is desired to develop an effective and safe vaccine against this highly contagious disease caused by the SARS-CoV-2 coronavirus. Our literature and clinical trial survey showed that the whole virus, as well as the spike (S) protein, nucleocapsid (N) protein, and membrane protein, have been tested for vaccine development against SARS and MERS. We further used the Vaxign reverse vaccinology tool and the newly developed Vaxign-ML machine learning tool to predict COVID-19 vaccine candidates. The N protein was found to be conserved in the more pathogenic strains (SARS/MERS/COVID-19), but not in the other human coronaviruses that mostly cause mild symptoms. By investigating the entire proteome of SARS-CoV-2, six proteins, including the S protein and five non-structural proteins (nsp3, 3CL-pro, and nsp8-10) were predicted to be adhesins, which are crucial to the viral adhering and host invasion. The S, nsp3, and nsp8 proteins were also predicted by Vaxign-ML to induce high protective antigenicity. Besides the commonly used S protein, the nsp3 protein has not been tested in any coronavirus vaccine studies and was selected for further investigation. The nsp3 was found to be more conserved among SARS-CoV-2, SARS-CoV, and MERS-CoV than among 15 coronaviruses infecting human and other animals. The protein was also predicted to contain promiscuous MHC-I and MHC-II T-cell epitopes, and linear B-cell epitopes localized in specific locations and functional domains of the protein. Our predicted vaccine targets provide new strategies for effective and safe COVID-19 vaccine development.
Affiliation(s)
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Mei U Wong
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI 48109, USA
- Anthony Huffman
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Yongqun He
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI 48109, USA
21
Ong E, Wong MU, Huffman A, He Y. COVID-19 Coronavirus Vaccine Design Using Reverse Vaccinology and Machine Learning. Front Immunol 2020. [PMID: 32719684 DOI: 10.3389/fimmu.2020.01581/full] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023] Open
22
Heinson AI, Ewing RM, Holloway JW, Woelk CH, Niranjan M. An evaluation of different classification algorithms for protein sequence-based reverse vaccinology prediction. PLoS One 2019; 14:e0226256. [PMID: 31834914 PMCID: PMC6910663 DOI: 10.1371/journal.pone.0226256] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/14/2019] [Accepted: 11/22/2019] [Indexed: 12/03/2022] Open
Abstract
Previous work has shown that proteins that have the potential to be vaccine candidates can be predicted from features derived from their amino acid sequences. In this work, we make an empirical comparison across various machine learning classifiers on this sequence-based inference problem. Using systematic cross validation on a dataset of 200 known vaccine candidates and 200 negative examples, with a set of 525 features derived from the AA sequences and feature selection applied through a greedy backward elimination approach, we show that simple classification algorithms often perform as well as more complex support vector kernel machines. The work also includes a novel cross validation applied across bacterial species, i.e. the validation proteins all come from a specific species of bacterium not represented in the training set. We termed this type of validation Leave One Bacteria Out Validation (LOBOV).
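The greedy backward elimination mentioned in this abstract can be sketched generically: start from all features and repeatedly drop the single feature whose removal gives the best score, stopping once every removal makes the score worse. The abstract does not describe the scoring used, so `score` below is a hypothetical stand-in for something like cross-validated accuracy, and the feature names are invented for illustration.

```python
def backward_eliminate(features, score, min_features=1):
    """Greedy backward feature elimination.

    features: list of feature names.
    score: callable taking a list of features and returning a number
           (higher is better); an assumed stand-in for a cross-validated metric.
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        # Score every subset obtained by dropping exactly one feature.
        trials = [(score([f for f in selected if f != drop]), drop)
                  for drop in selected]
        trial_score, drop = max(trials, key=lambda t: t[0])
        if trial_score < best:
            break  # every single removal hurts, so stop here
        best = trial_score
        selected.remove(drop)
    return selected, best


# Toy score: reward two "useful" features and penalise subset size.
useful = {"hydrophobicity", "charge"}

def toy_score(subset):
    return len(useful & set(subset)) - 0.1 * len(subset)

kept, kept_score = backward_eliminate(
    ["hydrophobicity", "charge", "length", "noise"], toy_score)
print(kept, kept_score)
```

In a real setting `score` would retrain and evaluate a classifier for every candidate subset, so the cost grows quickly with the number of features (525 in the study described above).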
Affiliation(s)
- Ashley I. Heinson
  - Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Rob M. Ewing
  - Department of Biological Sciences, University of Southampton, Southampton, United Kingdom
- John W. Holloway
  - Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Mahesan Niranjan
  - Department of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom
23
Dalsass M, Brozzi A, Medini D, Rappuoli R. Comparison of Open-Source Reverse Vaccinology Programs for Bacterial Vaccine Antigen Discovery. Front Immunol 2019; 10:113. [PMID: 30837982 PMCID: PMC6382693 DOI: 10.3389/fimmu.2019.00113] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 10/18/2018] [Accepted: 01/15/2019] [Indexed: 12/14/2022] Open
Abstract
Reverse Vaccinology (RV) is a widely used approach to identify potential vaccine candidates (PVCs) by screening the proteome of a pathogen through computational analyses. Since its first application to the Group B meningococcus (MenB) vaccine in the early 1990's, several software programs have been developed implementing different flavors of the first RV protocol. However, there has been no comprehensive review to date of these different RV tools. We have compared six of these applications designed for bacterial vaccines (NERVE, Vaxign, VaxiJen, Jenner-predict, Bowman-Heinson, and VacSol) against a set of 11 pathogens for which a curated list of known bacterial protective antigens (BPAs) was available. We present results on: (1) the comparison of criteria and programs used for the selection of PVCs; (2) computational runtime; and (3) performance in terms of the fraction of the proteome identified as PVCs, and the fraction and enrichment of BPAs identified in the set of PVCs. This review demonstrates that none of the programs was able to recall 100% of the tested set of BPAs and that the output lists of proteins are in poor agreement, suggesting that the prioritization of vaccine candidates should not rely on the output of a single RV tool. Taken individually, the best balance between the fraction of a proteome predicted as good candidates and the recall of BPAs was achieved by the machine-learning approach proposed by Bowman (1) and enhanced by Heinson (2). Although it performs better than the other approaches, it has the disadvantages of limited accessibility to non-expert users and a strong dependence of the results on the composition of the a priori training dataset.
In conclusion, we believe that, to significantly enhance the performance of future RV methods, further studies should focus on improving the accuracy of existing protein annotation tools and should leverage machine-learning techniques applied to biological datasets that are expanded through the incorporation and curation of bacterial proteins characterized by negative experimental results.
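The three performance quantities listed in this abstract (fraction of the proteome called a PVC, recall of known BPAs, and enrichment of BPAs among the PVCs) can be made concrete with a small sketch. The abstract does not give the exact enrichment formula, so fold enrichment over the background BPA rate is used here as an assumption. The function name `rv_metrics` and the toy protein identifiers are hypothetical.

```python
def rv_metrics(proteome, predicted_pvcs, known_bpas):
    """Summarise one RV tool's output for one pathogen.

    Assumed definitions (not confirmed by the source):
      fraction_pvc = |PVC| / |proteome|
      recall       = |PVC and BPA| / |BPA|
      enrichment   = (|PVC and BPA| / |PVC|) / (|BPA| / |proteome|)
    """
    proteome = set(proteome)
    pvcs = set(predicted_pvcs) & proteome
    bpas = set(known_bpas) & proteome
    if not proteome or not pvcs or not bpas:
        raise ValueError("proteome, PVC set and BPA set must all be non-empty")
    hits = pvcs & bpas
    fraction_pvc = len(pvcs) / len(proteome)
    recall = len(hits) / len(bpas)
    # Same quantity as precision divided by background rate, written to
    # avoid an intermediate rounding step.
    enrichment = (len(hits) * len(proteome)) / (len(pvcs) * len(bpas))
    return {"fraction_pvc": fraction_pvc, "recall": recall,
            "enrichment": enrichment}


# Toy pathogen: 100 proteins, 10 predicted PVCs, 4 known BPAs (3 recovered).
proteome = [f"p{i}" for i in range(1, 101)]
predicted = [f"p{i}" for i in range(1, 11)]
known = ["p1", "p2", "p3", "p50"]
metrics = rv_metrics(proteome, predicted, known)
print(metrics)
```

A tool that flags most of the proteome would reach high recall trivially, which is why the recall is read together with the PVC fraction and the enrichment.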
Affiliation(s)
- Mattia Dalsass
  - GlaxoSmithKline, Siena, Italy
  - Dipartimento di Scienze Cliniche e Biologiche, Università degli Studi di Torino, Turin, Italy
24
Goodswen SJ, Kennedy PJ, Ellis JT. A Gene-Based Positive Selection Detection Approach to Identify Vaccine Candidates Using Toxoplasma gondii as a Test Case Protozoan Pathogen. Front Genet 2018; 9:332. [PMID: 30177953 PMCID: PMC6109633 DOI: 10.3389/fgene.2018.00332] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/23/2018] [Accepted: 08/02/2018] [Indexed: 11/22/2022] Open
Abstract
Over the last two decades, various in silico approaches have been developed and refined that attempt to identify protein and/or peptide vaccine candidates from informative signals encoded in the protein sequences of a target pathogen. To date, no signal has been identified that clearly indicates a protein will effectively contribute to a protective immune response in a host. The premise of this study is that proteins under positive selection from the immune system are more likely to be suitable vaccine candidates than proteins exposed to other selection pressures. Furthermore, our expectation is that protein sequence regions encoding major histocompatibility complex (MHC)-binding peptides will contain consecutive positive selection sites. Using freely available data and bioinformatic tools, we present a high-throughput approach through a pipeline that predicts positive selection sites, protein subcellular locations, and sequence locations of medium to high T-cell MHC class I binding peptides. Positive selection sites are estimated from a sequence alignment by comparing rates of synonymous (dS) and non-synonymous (dN) substitutions among protein coding sequences of orthologous genes in a phylogeny. The main pipeline output is a list of protein vaccine candidates predicted to be naturally exposed to the immune system and containing sites under positive selection. Candidates are ranked with respect to the number of consecutive sites located on protein sequence regions encoding MHC-I-binding peptides. Results are constrained by the reliability of prediction programs and the quality of input data. Protein sequences from the Toxoplasma gondii ME49 strain (TGME49) were used as a case study. Surface antigen (SAG), dense granule (GRA), microneme (MIC), and rhoptry (ROP) proteins are considered worthy T. gondii candidates. Given 8263 TGME49 protein sequences processed anonymously, the top 10 predicted candidates were all worthy candidates.
In particular, the top ten included ROP5 and ROP18, which are T. gondii virulence determinants. The chance of randomly selecting a ROP protein was 0.2% given 8263 sequences. We conclude that the approach described is a valuable addition to other in silico approaches for identifying vaccine candidates worthy of laboratory validation and could be adapted for other apicomplexan parasite species (with appropriate data).
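The ranking criterion described in this abstract, the number of consecutive positive selection sites that fall within regions encoding MHC-I-binding peptides, can be sketched as follows. The abstract does not define the exact counting rule, so this sketch assumes a site counts when it lies inside a binding region and has an adjacent selected site that is also inside a binding region. All names and toy coordinates are hypothetical and are not taken from the authors' pipeline.

```python
def consecutive_sites_in_regions(selected_sites, regions):
    """Count selected sites inside binding regions that have a selected neighbour.

    selected_sites: 1-based sequence positions predicted to be under
                    positive selection.
    regions: (start, end) inclusive intervals of predicted MHC-I-binding
             peptides.
    Assumed counting rule for illustration only.
    """
    in_region = {s for s in selected_sites
                 if any(start <= s <= end for start, end in regions)}
    return sum(1 for s in in_region
               if (s - 1) in in_region or (s + 1) in in_region)


def rank_candidates(candidates):
    """Order protein names by descending count, ties broken by name."""
    scored = [(consecutive_sites_in_regions(sites, regions), name)
              for name, (sites, regions) in candidates.items()]
    return [name for _, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Toy input: protein name -> (selected sites, binding regions).
candidates = {
    "protA": ([5, 6, 7, 9, 20, 30, 31], [(1, 10), (28, 35)]),
    "protB": ([3, 40, 41], [(1, 10)]),
}
ranking = rank_candidates(candidates)
print(ranking)
```

In the toy data, site 9 of `protA` is inside a region but isolated, and site 20 lies outside every region, so neither contributes to the count.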
Affiliation(s)
- Stephen J Goodswen
  - School of Life Sciences, University of Technology Sydney, Ultimo, NSW, Australia
- Paul J Kennedy
  - School of Software, Faculty of Engineering and Information Technology, Centre for Artificial Intelligence, University of Technology Sydney, Ultimo, NSW, Australia
- John T Ellis
  - School of Life Sciences, University of Technology Sydney, Ultimo, NSW, Australia
25
Abstract
The present incidence of leptospirosis in China is significantly lower than past rates, although small localized outbreaks continue to occur in epidemic regions. Improvements in sanitation, as well as vaccination of high-risk populations, have played crucial roles in reducing the disease burden. Several types of human leptospirosis vaccines have been developed, including inactivated whole-cell, outer-envelope, and recombinant vaccines. Of these, only a multivalent inactivated leptospirosis vaccine is available in China, which was added to the Chinese Expanded Program on Immunization in 2007. However, this vaccine elicits serogroup-specific immunity, and serogroup epidemiology should continue to be monitored to enhance vaccine coverage and distribution. On the other hand, the efficiency of the inactivated vaccine should be further improved by optimizing the formulation, and by expanding the target population. More importantly, additional investments should be made to develop universal recombinant vaccines.
Affiliation(s)
- Yinghua Xu
  - Key Laboratory of the Ministry of Health for Research on Quality and Standardization of Biotech Products, National Institutes for Food and Drug Control, Bio-pharmaceutical Industrial Base, Daxing District, Beijing, People's Republic of China
- Qiang Ye
  - Key Laboratory of the Ministry of Health for Research on Quality and Standardization of Biotech Products, National Institutes for Food and Drug Control, Bio-pharmaceutical Industrial Base, Daxing District, Beijing, People's Republic of China
|
26
|
Ong E, Wong MU, He Y. Identification of New Features from Known Bacterial Protective Vaccine Antigens Enhances Rational Vaccine Design. Front Immunol 2017; 8:1382. [PMID: 29123525 PMCID: PMC5662880 DOI: 10.3389/fimmu.2017.01382] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 08/22/2017] [Accepted: 10/06/2017] [Indexed: 11/13/2022]
Abstract
With many protective vaccine antigens reported in the literature and verified experimentally, how to use the knowledge mined from these antigens to support rational vaccine design and study underlying design mechanism remains unclear. In order to address the problem, a systematic bioinformatics analysis was performed on 291 Gram-positive and Gram-negative bacterial protective antigens with experimental evidence manually curated in the Protegen database. The bioinformatics analyses evaluated included subcellular localization, adhesin probability, peptide signaling, transmembrane α-helix and β-barrel, conserved domain, Clusters of Orthologous Groups, and Gene Ontology functional annotations. Here we showed the critical role of adhesins, along with subcellular localization, peptide signaling, in predicting secreted extracellular or surface-exposed protective antigens, with mechanistic explanations supported by functional analysis. We also found a significant negative correlation of transmembrane α-helix to antigen protectiveness in Gram-positive and Gram-negative pathogens, while a positive correlation of transmembrane β-barrel was observed in Gram-negative pathogens. The commonly less-focused cytoplasmic and cytoplasmic membrane proteins could be potentially predicted with the help of other selection criteria such as adhesin probability and functional analysis. The significant findings in this study can support rational vaccine design and enhance our understanding of vaccine design mechanisms.
Affiliation(s)
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Mei U Wong
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, United States
- Yongqun He
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, United States; Center of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
|
27
|
Liu F, Tang X, Sheng X, Xing J, Zhan W. Comparative study of the vaccine potential of six outer membrane proteins of Edwardsiella tarda and the immune responses of flounder (Paralichthys olivaceus) after vaccination. Vet Immunol Immunopathol 2017; 185:38-47. [DOI: 10.1016/j.vetimm.2017.01.008] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 05/30/2016] [Revised: 01/18/2017] [Accepted: 01/26/2017] [Indexed: 01/10/2023]
|
28
|
Heinson AI, Gunawardana Y, Moesker B, Hume CCD, Vataga E, Hall Y, Stylianou E, McShane H, Williams A, Niranjan M, Woelk CH. Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology. Int J Mol Sci 2017; 18:312. [PMID: 28157153 PMCID: PMC5343848 DOI: 10.3390/ijms18020312] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 11/15/2016] [Accepted: 01/17/2017] [Indexed: 12/11/2022]
Abstract
Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.
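The two evaluation ideas named in this abstract, the area under the curve (AUC) and permutation analysis, can be sketched in plain Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, the AUC is computed directly from pairwise comparisons, and the permutation test only shuffles labels against fixed classifier scores. The published study may well have retrained classifiers on permuted labels, which this sketch does not do.

```python
import random
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def permutation_p_value(
    scores: Sequence[float],
    labels: Sequence[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """Estimate how often shuffled labels reach the observed AUC or better."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if auc(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up scores: label 1 marks a protective antigen, 0 a non-protective protein.
    demo_scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6,
                   0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05]
    demo_labels = [1] * 8 + [0] * 8
    print("AUC:", auc(demo_scores, demo_labels))
    print("p-value:", permutation_p_value(demo_scores, demo_labels, 200))
```

A small p-value here suggests that the separation between the two classes is unlikely to arise from label assignment by chance, which is the sense in which a "signal" can be said to exist in the curated data.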
Affiliation(s)
- Ashley I Heinson
  - Faculty of Medicine, University of Southampton, Southampton SO17 1BJ, UK.
- Bastiaan Moesker
  - Faculty of Medicine, University of Southampton, Southampton SO17 1BJ, UK.
- Carmen C Denman Hume
  - London School of Hygiene and Tropical Medicine (LSHTM), Department of Pathogen Molecular Biology, London WC1E 7HT, UK.
- Elena Vataga
  - Solutions, University of Southampton, Southampton SO17 1BJ, UK.
- Yper Hall
  - Public Health England, National Infection Service, Porton Down Salisbury, SP4 0JG, UK.
- Elena Stylianou
  - The Jenner Institute, University of Oxford, Oxford OX3 7DQ, UK.
- Helen McShane
  - The Jenner Institute, University of Oxford, Oxford OX3 7DQ, UK.
- Ann Williams
  - Public Health England, National Infection Service, Porton Down Salisbury, SP4 0JG, UK.
- Mahesan Niranjan
  - Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK.
|
29
|
Dellagostin OA, Grassmann AA, Rizzi C, Schuch RA, Jorge S, Oliveira TL, McBride AJA, Hartwig DD. Reverse Vaccinology: An Approach for Identifying Leptospiral Vaccine Candidates. Int J Mol Sci 2017; 18:158. [PMID: 28098813 PMCID: PMC5297791 DOI: 10.3390/ijms18010158] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 10/27/2016] [Revised: 01/05/2017] [Accepted: 01/06/2017] [Indexed: 12/01/2022]
Abstract
Leptospirosis is a major public health problem with an incidence of over one million human cases each year. It is a globally distributed, zoonotic disease and is associated with significant economic losses in farm animals. Leptospirosis is caused by pathogenic Leptospira spp. that can infect a wide range of domestic and wild animals. Given the inability to control the cycle of transmission among animals and humans, there is an urgent demand for a new vaccine. Inactivated whole-cell vaccines (bacterins) are routinely used in livestock and domestic animals; however, protection is serovar-restricted and short-term only. To overcome these limitations, efforts have focused on the development of recombinant vaccines, with partial success. Reverse vaccinology (RV) has been successfully applied to many infectious diseases. A growing number of leptospiral genome sequences are now available in public databases, providing an opportunity to search for prospective vaccine antigens using RV. Several promising leptospiral antigens were identified using this approach, although only a few have been characterized and evaluated in animal models. In this review, we summarize the use of RV for leptospirosis and discuss the need for potential improvements for the successful development of a new vaccine towards reducing the burden of human and animal leptospirosis.
Affiliation(s)
- Odir A Dellagostin
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- André A Grassmann
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- Caroline Rizzi
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- Rodrigo A Schuch
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- Sérgio Jorge
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- Thais L Oliveira
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- Alan J A McBride
  - Núcleo de Biotecnologia, Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
- Daiane D Hartwig
  - Departamento de Microbiologia e Parasitologia, Instituto de Biologia, Universidade Federal de Pelotas, Pelotas RS 96100-000, Brazil.
|
30
|
A review of reverse vaccinology approaches for the development of vaccines against ticks and tick borne diseases. Ticks Tick Borne Dis 2015; 7:573-85. [PMID: 26723274 DOI: 10.1016/j.ttbdis.2015.12.012] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Received: 09/07/2015] [Revised: 11/24/2015] [Accepted: 12/12/2015] [Indexed: 02/07/2023]
Abstract
The field of reverse vaccinology developed as an outcome of the genome sequence revolution. Following the introduction of live vaccinations in the western world by Edward Jenner in 1798 and the coining of the phrase 'vaccine', in 1881 Pasteur developed a rational design for vaccines. Pasteur proposed that, in order to make a vaccine, one should 'isolate, inactivate and inject the microorganism', and these basic rules of vaccinology were largely followed for the next 100 years, leading to the elimination of several highly infectious diseases. However, new technologies were needed to conquer many pathogens which could not be eliminated using these traditional technologies. Thus, increasingly, computers were used to mine genome sequences to rationally design recombinant vaccines. Several vaccines for bacterial and viral diseases (e.g. meningococcus and HIV) have been developed; however, the on-going challenge for parasite vaccines has been due to their comparatively larger genomes. Understanding the immune response is important in reverse vaccinology studies, as this knowledge will influence how the genome mining is to be conducted. Vaccine candidates for anaplasmosis, cowdriosis, theileriosis, leishmaniasis, malaria, schistosomiasis, and the cattle tick have been identified using reverse vaccinology approaches. Some challenges for parasite vaccine development include the ability to address antigenic variability as well as the understanding of the complex interplay between antibody, mucosal and/or T cell immune responses. In understanding the complex parasite interactions with the livestock host, a limitation is that algorithms for epitope mining developed using the human genome cannot be directly adapted for bovine hosts, for example in the prediction of peptide binding to major histocompatibility complex motifs. As the number of genomes for both hosts and parasites increases, the development of new algorithms for pan-genomic mining will continue to impact the future of parasite and rickettsial (and other tick borne pathogens) disease vaccine development.
|
31
|
Abstract
Reverse vaccinology (RV) is a computational approach that aims to identify putative vaccine candidates in the protein coding genome (proteome) of pathogens. RV has primarily been applied to bacterial pathogens to identify proteins that can be formulated into subunit vaccines, which consist of one or more protein antigens. An RV approach based on a filtering method has already been used to construct a subunit vaccine against Neisseria meningitidis serogroup B that is now registered in several countries (Bexsero). Recently, machine learning methods have been used to improve the ability of RV approaches to identify vaccine candidates. Further improvements related to the incorporation of epitope-binding annotation and gene expression data are discussed. In the future, it is envisaged that RV approaches will facilitate rapid vaccine design with less reliance on conventional animal testing and clinical trials in order to curb the threat of antibiotic resistance or newly emerged outbreaks of bacterial origin.
|
32
|
Goodswen SJ, Kennedy PJ, Ellis JT. A novel strategy for classifying the output from an in silico vaccine discovery pipeline for eukaryotic pathogens using machine learning algorithms. BMC Bioinformatics 2013; 14:315. [PMID: 24180526 PMCID: PMC3826511 DOI: 10.1186/1471-2105-14-315] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 06/20/2013] [Accepted: 10/28/2013] [Indexed: 12/13/2022]
Abstract
BACKGROUND An in silico vaccine discovery pipeline for eukaryotic pathogens typically consists of several computational tools to predict protein characteristics. The aim of the in silico approach to discovering subunit vaccines is to use predicted characteristics to identify proteins which are worthy of laboratory investigation. A major challenge is that these predictions inherently contain hidden inaccuracies and contradictions. This study focuses on how to reduce the number of false candidates using machine learning algorithms rather than relying on expensive laboratory validation. Proteins from Toxoplasma gondii, Plasmodium sp., and Caenorhabditis elegans were used as training and test datasets. RESULTS The results show that machine learning algorithms can effectively distinguish expected true from expected false vaccine candidates (with an average sensitivity and specificity of 0.97 and 0.98 respectively), for proteins observed to induce immune responses experimentally. CONCLUSIONS Vaccine candidates from an in silico approach can only be truly validated in a laboratory. Given any in silico output and appropriate training data, the number of false candidates allocated for validation can be dramatically reduced using a pool of machine learning algorithms. This will ultimately save time and money in the laboratory.
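The sensitivity and specificity figures reported in this abstract, and the idea of pooling several classifiers, can be illustrated with a short Python sketch. The majority-vote rule and the function names are assumptions made for illustration; the abstract says only that a pool of machine learning algorithms is used and does not state how their outputs are combined.

```python
from typing import List, Sequence, Tuple


def majority_vote(votes_per_model: Sequence[Sequence[int]]) -> List[int]:
    """Combine 0/1 predictions; a protein is kept when more than half the models vote 1."""
    n_models = len(votes_per_model)
    return [1 if 2 * sum(column) > n_models else 0 for column in zip(*votes_per_model)]


def sensitivity_specificity(
    truth: Sequence[int], predicted: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from paired 0/1 labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("truth must contain both positive and negative examples")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up labels: 1 marks an expected true vaccine candidate, 0 an expected false one.
    truth = [1, 1, 1, 0, 0, 0]
    pool = [
        [1, 1, 0, 0, 0, 1],  # hypothetical model 1
        [1, 1, 1, 0, 1, 0],  # hypothetical model 2
        [1, 0, 1, 0, 0, 0],  # hypothetical model 3
    ]
    combined = majority_vote(pool)
    print(combined, sensitivity_specificity(truth, combined))
```

In this toy example each model makes a different mistake, so the vote recovers every label even though no single model does; that is the intuition behind pooling, not a claim about the published results.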
Affiliation(s)
- John T Ellis
  - School of Medical and Molecular Biosciences, ithree institute at the University of Technology Sydney (UTS), Sydney, Australia.
|