1
Aggarwal SK, Singh A, Choudhary M, Kumar A, Rakshit S, Kumar P, Bohra A, Varshney RK. Pangenomics in Microbial and Crop Research: Progress, Applications, and Perspectives. Genes (Basel) 2022; 13:598. [PMID: 35456404 PMCID: PMC9031676 DOI: 10.3390/genes13040598] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/14/2022] [Revised: 03/16/2022] [Accepted: 03/25/2022] [Indexed: 01/25/2023] Open
Abstract
Advances in sequencing technologies and bioinformatics tools have fueled a renewed interest in whole genome sequencing efforts in many organisms. The growing availability of multiple genome sequences has advanced our understanding of within-species diversity, in the form of a pangenome. Pangenomics has opened new avenues for future research, such as allowing dissection of complex molecular mechanisms and increased confidence in genome mapping. To comprehensively capture the genetic diversity for improving plant performance, the pangenome concept is further extended from species to genus level by the inclusion of wild species, constituting a super-pangenome. Characterization of the pangenome has implications for both basic and applied research. The concept of the pangenome has transformed the way biological questions are addressed. From understanding evolution and adaptation to elucidating host–pathogen interactions, and from finding novel genes or breeding targets for crop improvement to designing effective vaccines for human prophylaxis, the increasing availability of pangenomes has revolutionized several aspects of biological research. The future availability of high-resolution pangenomes based on reference-level near-complete genome assemblies would greatly improve our ability to address complex biological problems.
2
Machado KCT, Fortuin S, Tomazella GG, Fonseca AF, Warren RM, Wiker HG, de Souza SJ, de Souza GA. On the Impact of the Pangenome and Annotation Discrepancies While Building Protein Sequence Databases for Bacteria Proteogenomics. Front Microbiol 2019; 10:1410. [PMID: 31281302 PMCID: PMC6596428 DOI: 10.3389/fmicb.2019.01410] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/26/2019] [Accepted: 06/05/2019] [Indexed: 01/19/2023] Open
Abstract
In proteomics, peptide information within mass spectrometry (MS) data from a specific organism sample is routinely matched against a protein sequence database that best represents that organism. However, if the species/strain in the sample is unknown or genetically poorly characterized, it becomes challenging to choose a database that can represent the sample. Building customized protein sequence databases that merge multiple strains of a given species has become a strategy to overcome such restrictions. However, as more genetic information becomes publicly available and interesting genetic features such as the existence of pan- and core genes within a species are revealed, we questioned how efficiently such merging strategies report relevant information. To test this assumption, we constructed databases containing conserved and unique sequences for 10 different species. Features that are relevant for probabilistic-based protein identification by proteomics were then monitored. As expected, the increase in database complexity correlates with pangenomic complexity. However, Mycobacterium tuberculosis and Bordetella pertussis generated very complex databases despite having low pangenomic complexity. We further tested database performance by using MS data from eight clinical strains of M. tuberculosis, and from two published datasets from Staphylococcus aureus. We show that by using an approach where database size is controlled by removing repeated identical tryptic sequences across strains/species, computational time can be reduced drastically as database complexity increases.
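The size-control idea in this abstract, keeping each identical tryptic sequence only once across strains, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 6-residue minimum length, the simple cleavage rule (after K or R unless followed by P, no missed cleavages), and the toy sequences are all assumptions made for illustration.

```python
def tryptic_peptides(protein, min_len=6):
    """In silico trypsin digest: cleave after K or R unless the next residue is P.

    Simplified rule with no missed cleavages; min_len is an assumed cutoff.
    """
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return [p for p in peptides if len(p) >= min_len]


def nonredundant_peptides(strains):
    """Map each distinct tryptic peptide to the sorted list of strains containing it.

    `strains` is a dict: strain name -> list of protein sequences (hypothetical input).
    """
    seen = {}
    for strain, proteins in strains.items():
        for protein in proteins:
            for pep in tryptic_peptides(protein):
                seen.setdefault(pep, set()).add(strain)
    return {pep: sorted(names) for pep, names in seen.items()}


# Toy example: two strains whose proteins differ only in the last residue.
strains = {"A": ["MKTAYIAKQRPLLVEESGK"], "B": ["MKTAYIAKQRPLLVEESGR"]}
db = nonredundant_peptides(strains)
print(len(db))  # 3 distinct peptides instead of 4 redundant entries
```

In this toy case the shared peptide `TAYIAK` is stored once and annotated with both strains, while the two strain-specific C-terminal peptides remain separate entries.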
Affiliation(s)
- Karla C T Machado
  - Bioinformatics Multidisciplinary Environment, Universidade Federal do Rio Grande do Norte, Natal, Brazil
- Suereta Fortuin
  - DST/NRF Centre of Excellence for Biomedical Tuberculosis Research/SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Stellenbosch, South Africa
- Gisele Guicardi Tomazella
  - Bioinformatics Multidisciplinary Environment, Universidade Federal do Rio Grande do Norte, Natal, Brazil
  - The Gade Research Group for Infection and Immunity, Department of Clinical Science, University of Bergen, Bergen, Norway
  - The Institute of Bioinformatics and Biotechnology, Natal, Brazil
- Andre F Fonseca
  - Bioinformatics Multidisciplinary Environment, Universidade Federal do Rio Grande do Norte, Natal, Brazil
- Robin Mark Warren
  - DST/NRF Centre of Excellence for Biomedical Tuberculosis Research/SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Stellenbosch, South Africa
- Harald G Wiker
  - The Gade Research Group for Infection and Immunity, Department of Clinical Science, University of Bergen, Bergen, Norway
- Sandro Jose de Souza
  - Bioinformatics Multidisciplinary Environment, Universidade Federal do Rio Grande do Norte, Natal, Brazil
  - The Brain Institute, Universidade Federal do Rio Grande do Norte, Natal, Brazil
- Gustavo Antonio de Souza
  - Bioinformatics Multidisciplinary Environment, Universidade Federal do Rio Grande do Norte, Natal, Brazil
  - Department of Biochemistry, Federal University of Rio Grande do Norte (UFRN), Natal, Brazil
3
Omasits U, Varadarajan AR, Schmid M, Goetze S, Melidis D, Bourqui M, Nikolayeva O, Québatte M, Patrignani A, Dehio C, Frey JE, Robinson MD, Wollscheid B, Ahrens CH. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. Genome Res 2017; 27:2083-2095. [PMID: 29141959 PMCID: PMC5741054 DOI: 10.1101/gr.218255.116] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 11/11/2016] [Accepted: 10/25/2017] [Indexed: 12/18/2022]
Abstract
Accurate annotation of all protein-coding sequences (CDSs) is an essential prerequisite to fully exploit the rapidly growing repertoire of completely sequenced prokaryotic genomes. However, large discrepancies among the number of CDSs annotated by different resources, missed functional short open reading frames (sORFs), and overprediction of spurious ORFs represent serious limitations. Our strategy toward accurate and complete genome annotation consolidates CDSs from multiple reference annotation resources, ab initio gene prediction algorithms and in silico ORFs (a modified six-frame translation considering alternative start codons) in an integrated proteogenomics database (iPtgxDB) that covers the entire protein-coding potential of a prokaryotic genome. By extending the PeptideClassifier concept of unambiguous peptides for prokaryotes, close to 95% of the identifiable peptides imply one distinct protein, largely simplifying downstream analysis. Searching a comprehensive Bartonella henselae proteomics data set against such an iPtgxDB allowed us to unambiguously identify novel ORFs uniquely predicted by each resource, including lipoproteins, differentially expressed and membrane-localized proteins, novel start sites and wrongly annotated pseudogenes. Most novelties were confirmed by targeted, parallel reaction monitoring mass spectrometry, including unique ORFs and single amino acid variations (SAAVs) identified in a re-sequenced laboratory strain that are not present in its reference genome. We demonstrate the general applicability of our strategy for genomes with varying GC content and distinct taxonomic origin. We release iPtgxDBs for B. henselae, Bradyrhizobium diazoefficiens and Escherichia coli and the software to generate both proteogenomics search databases and integrated annotation files that can be viewed in a genome browser for any prokaryote.
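To make the "six-frame translation considering alternative start codons" mentioned above concrete, here is a rough Python sketch. It does not reproduce the iPtgxDB software: the start-codon set (ATG, GTG, TTG), the choice to report only the ORF from the first start codon in each open stretch, the `min_aa` cutoff, and all function names are assumptions. Input is expected to be uppercase A/C/G/T only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed set of start codons commonly used in bacteria.
START_CODONS = {"ATG", "GTG", "TTG"}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_orfs(dna, min_aa=2):
    """Return (strand, frame, protein) for start-to-stop ORFs in all six frames.

    `frame` is the 0-based offset into the (possibly reverse-complemented) strand.
    """
    results = []
    for strand, seq in (("+", dna), ("-", revcomp(dna))):
        for frame in range(3):
            codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
            start = None
            for idx, codon in enumerate(codons):
                if start is None and codon in START_CODONS:
                    start = idx
                elif CODON_TABLE[codon] == "*" and start is not None:
                    # The initiator codon is translated as Met whatever its sequence.
                    body = "".join(CODON_TABLE[c] for c in codons[start + 1:idx])
                    protein = "M" + body
                    if len(protein) >= min_aa:
                        results.append((strand, frame, protein))
                    start = None
    return results


# GTG start in frame 2 of the forward strand, followed by AAA, TTT, TAG (stop).
print(six_frame_orfs("CCGTGAAATTTTAGCC"))  # [('+', 2, 'MKF')]
```

An ATG-only scan would miss this toy ORF entirely, which is the kind of case alternative start codons are meant to recover.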
Affiliation(s)
- Ulrich Omasits
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
- Adithi R Varadarajan
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
  - Department of Health Sciences and Technology, Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, CH-8093 Zurich, Switzerland
- Michael Schmid
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
- Sandra Goetze
  - Department of Health Sciences and Technology, Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, CH-8093 Zurich, Switzerland
- Damianos Melidis
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
- Marc Bourqui
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
- Olga Nikolayeva
  - Institute for Molecular Life Sciences & SIB Swiss Institute of Bioinformatics, University of Zurich, CH-8057 Zurich, Switzerland
- Andrea Patrignani
  - Functional Genomics Center Zurich, ETH & UZH Zurich, CH-8057 Zurich, Switzerland
- Juerg E Frey
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
- Mark D Robinson
  - Institute for Molecular Life Sciences & SIB Swiss Institute of Bioinformatics, University of Zurich, CH-8057 Zurich, Switzerland
- Bernd Wollscheid
  - Department of Health Sciences and Technology, Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, CH-8093 Zurich, Switzerland
- Christian H Ahrens
  - Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland
4
Abstract
Proteogenomics is a research area that combines proteomics and genomics in a multi-omics setup using both mass spectrometry and high-throughput sequencing technologies. Currently, the main goals of the field are to aid genome annotation or to unravel the proteome complexity. Mass spectrometry based identifications of matching or homologous peptides can further refine gene models. The identification of novel proteoforms is also made possible based on detection of novel translation initiation sites (cognate or near-cognate), novel transcript isoforms, sequence variation or novel (small) open reading frames in intergenic or un-translated genic regions by analyzing high-throughput sequencing data from RNAseq or ribosome profiling experiments. Other proteogenomics studies using a combination of proteomics and genomics techniques focus on antibody sequencing, the identification of immunogenic peptides or venom peptides. Over the years, a growing number of bioinformatics tools and databases became available to help streamline these cross-omics studies. Some of these solutions only help in specific steps of the proteogenomics studies, e.g. building custom sequence databases (based on next generation sequencing output) for mass spectrometry fragmentation spectrum matching. Over the last few years a handful of integrative tools also became available that can execute complete proteogenomics analyses. Some of these are presented as stand-alone solutions, whereas others are implemented in a web-based framework such as Galaxy. In this review we aimed at sketching a comprehensive overview of all the bioinformatics solutions that are available for this growing research area. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:584-599, 2017.
Affiliation(s)
- Gerben Menschaert
  - Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
  - To whom correspondence should be addressed. Tel: +32 9 264 99 22; Fax: +32 9 264 6220;
- Fenyö David
  - Center for Health Informatics and Bioinformatics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York, USA
5
Schmidt F, Meyer T, Sundaramoorthy N, Michalik S, Surmann K, Depke M, Dhople V, Gesell Salazar M, Holtappels G, Zhang N, Bröker BM, Bachert C, Völker U. Characterization of human and Staphylococcus aureus proteins in respiratory mucosa by in vivo- and immunoproteomics. J Proteomics 2017; 155:31-39. [DOI: 10.1016/j.jprot.2017.01.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/27/2016] [Revised: 11/28/2016] [Accepted: 01/13/2017] [Indexed: 10/20/2022]
6
Abstract
Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on the final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that the choice of database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.
Affiliation(s)
- Dhirendra Kumar
  - G.N. Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi, 110025, India
- Amit Kumar Yadav
  - G.N. Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi, 110025, India
- Debasis Dash
  - G.N. Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi, 110025, India
7
Ischenko D, Alexeev D, Shitikov E, Kanygina A, Malakhova M, Kostryukova E, Larin A, Kovalchuk S, Pobeguts O, Butenko I, Anikanov N, Altukhov I, Ilina E, Govorun V. Large scale analysis of amino acid substitutions in bacterial proteomics. BMC Bioinformatics 2016; 17:450. [PMID: 27821049 PMCID: PMC5100282 DOI: 10.1186/s12859-016-1301-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/05/2016] [Accepted: 10/21/2016] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Proteomics of bacterial pathogens is a developing field exploring microbial physiology, gene expression and the complex interactions between bacteria and their hosts. One of the complications of the proteomic approach is the micro- and macro-heterogeneity of bacterial species, which makes it impossible to build a comprehensive database of bacterial genomes for identification, while most of the existing algorithms rely largely on genomic data. RESULTS Here we present a large scale study of identification of single amino acid polymorphisms (SAPs) between bacterial strains. An ad hoc method was developed based on MS/MS spectra comparison without the support of a genomic database. Whole-genome sequencing was used to validate the accuracy of polymorphism detection. Several approaches presented earlier to the proteomics community as useful for polymorphism detection were tested on isolates of Helicobacter pylori, Neisseria gonorrhoeae and Escherichia coli. CONCLUSION The developed method represents a promising approach in the field of bacterial proteomics, allowing the identification of hundreds of peptides with novel SAPs from a single proteome.
Affiliation(s)
- Dmitry Ischenko
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
  - Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, 141700, Russian Federation
- Dmitry Alexeev
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
  - Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, 141700, Russian Federation
- Egor Shitikov
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Alexandra Kanygina
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
  - Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, 141700, Russian Federation
- Maja Malakhova
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Elena Kostryukova
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Andrey Larin
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Sergey Kovalchuk
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Olga Pobeguts
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Ivan Butenko
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Nikolay Anikanov
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Ilya Altukhov
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
  - Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, 141700, Russian Federation
- Elena Ilina
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
- Vadim Govorun
  - Research Institute of Physical Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow, 119435, Russian Federation
8
Strobel M, Pförtner H, Tuchscherr L, Völker U, Schmidt F, Kramko N, Schnittler HJ, Fraunholz MJ, Löffler B, Peters G, Niemann S. Post-invasion events after infection with Staphylococcus aureus are strongly dependent on both the host cell type and the infecting S. aureus strain. Clin Microbiol Infect 2016; 22:799-809. [PMID: 27393124 DOI: 10.1016/j.cmi.2016.06.020] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Received: 02/18/2016] [Revised: 06/28/2016] [Accepted: 06/29/2016] [Indexed: 10/21/2022]
Abstract
Host cell invasion is a major feature of Staphylococcus aureus and contributes to infection development. The intracellular metabolically active bacteria can induce host cell activation and death but they can also persist for long time periods. In this study a comparative analysis was performed of different well-characterized S. aureus strains in their interaction with a variety of host cell types. Staphylococcus aureus (strains 6850, USA300, LS1, SH1000, Cowan1) invasion was compared in different human cell types (epithelial and endothelial cells, keratinocytes, fibroblasts, osteoblasts). The number of intracellular bacteria was determined, and cell inflammation, cell death and phagosomal escape of bacteria were investigated. To explain strain-dependent differences in the secretome, a proteomic approach was used. Barrier cells took up high amounts of bacteria and were killed by aggressive strains. These strains expressed high levels of toxins, and possessed the ability to escape from phagolysosomes. Osteoblasts and keratinocytes ingested fewer bacteria, and were not killed, even though the primary osteoblasts were strongly activated by S. aureus. In all cell types S. aureus was able to persist. Strong differences in uptake, cytotoxicity, and inflammatory response were observed between primary cells and their corresponding cell lines, demonstrating that cell lines reflect only partially the functions and physiology of primary cells. This study contributes to a better understanding of the pathomechanisms of S. aureus infections. The proteomic data provide important basic knowledge on strains commonly used in the analysis of S. aureus-host cell interaction.
Affiliation(s)
- M Strobel
  - University Hospital of Muenster, Institute of Medical Microbiology, Muenster, Germany
- H Pförtner
  - Interfaculty Institute of Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- L Tuchscherr
  - Institute of Medical Microbiology, Jena University Hospital, Germany
- U Völker
  - Interfaculty Institute of Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- F Schmidt
  - Interfaculty Institute of Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- N Kramko
  - Westfaelische-Wilhelms University, Institute of Anatomy and Vascular Biology, Muenster, Germany
- H-J Schnittler
  - Westfaelische-Wilhelms University, Institute of Anatomy and Vascular Biology, Muenster, Germany
- M J Fraunholz
  - Department of Microbiology, Biocenter, University of Wuerzburg, Wuerzburg, Germany
- B Löffler
  - Institute of Medical Microbiology, Jena University Hospital, Germany
- G Peters
  - University Hospital of Muenster, Institute of Medical Microbiology, Muenster, Germany; Cluster of Excellence EXC 1003, Cells in Motion, Muenster, Germany
- S Niemann
  - University Hospital of Muenster, Institute of Medical Microbiology, Muenster, Germany
9
Stentzel S, Teufelberger A, Nordengrün M, Kolata J, Schmidt F, van Crombruggen K, Michalik S, Kumpfmüller J, Tischer S, Schweder T, Hecker M, Engelmann S, Völker U, Krysko O, Bachert C, Bröker BM. Staphylococcal serine protease-like proteins are pacemakers of allergic airway reactions to Staphylococcus aureus. J Allergy Clin Immunol 2016; 139:492-500.e8. [PMID: 27315768 DOI: 10.1016/j.jaci.2016.03.045] [Citation(s) in RCA: 86] [Impact Index Per Article: 10.8] [Received: 10/04/2015] [Revised: 02/15/2016] [Accepted: 03/22/2016] [Indexed: 12/20/2022]
Abstract
BACKGROUND A substantial subgroup of asthmatic patients have "nonallergic" or idiopathic asthma, which often takes a severe course and is difficult to treat. The cause might be allergic reactions to the gram-positive pathogen Staphylococcus aureus, a frequent colonizer of the upper airways. However, the driving allergens of S aureus have remained elusive. OBJECTIVE We sought to search for potentially allergenic S aureus proteins and characterize the immune response directed against them. METHODS S aureus extracellular proteins targeted by human serum IgG4 were identified by means of immunoblotting to screen for potential bacterial allergens. Candidate antigens were expressed as recombinant proteins and used to analyze the established cellular and humoral immune responses in healthy adults and asthmatic patients. The ability to induce a type 2 immune response in vivo was tested in a mouse asthma model. RESULTS We identified staphylococcal serine protease-like proteins (Spls) as dominant IgG4-binding S aureus proteins. SplA through SplF are extracellular proteases of unknown function expressed by S aureus in vivo. Spls elicited IgE antibody responses in most asthmatic patients. In healthy S aureus carriers and noncarriers, peripheral blood T cells elaborated TH2 cytokines after stimulation with Spls, as is typical for allergens. In contrast, TH1/TH17 cytokines, which dominated the response to S aureus α-hemolysin, were of low concentration or absent. In mice inhalation of SplD without adjuvant induced lung inflammation characterized by TH2 cytokines and eosinophil infiltration. CONCLUSION We identify Spls as triggering allergens released by S aureus, opening prospects for diagnosis and causal therapy of asthma.
Affiliation(s)
- Sebastian Stentzel
  - Department of Immunology, University Medicine Greifswald, Greifswald, Germany
- Maria Nordengrün
  - Department of Immunology, University Medicine Greifswald, Greifswald, Germany
- Julia Kolata
  - Department of Immunology, University Medicine Greifswald, Greifswald, Germany; Medical Microbiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Frank Schmidt
  - Interfaculty Institute for Genetics and Functional Genomics, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany; Junior Group Applied Proteomics, ZIK FunGene, University Medicine Greifswald, Greifswald, Germany
- Stephan Michalik
  - Interfaculty Institute for Genetics and Functional Genomics, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany; Junior Group Applied Proteomics, ZIK FunGene, University Medicine Greifswald, Greifswald, Germany
- Jana Kumpfmüller
  - Department of Pharmaceutical Biotechnology, Institute of Pharmacy, Ernst-Moritz-Arndt-University Greifswald, Greifswald, Germany; Department of Biomolecular Chemistry, Leibniz Institute for Natural Product Research and Infection Biology, HKI, Jena, Germany
- Sebastian Tischer
  - Department of Immunology, University Medicine Greifswald, Greifswald, Germany
- Thomas Schweder
  - Department of Pharmaceutical Biotechnology, Institute of Pharmacy, Ernst-Moritz-Arndt-University Greifswald, Greifswald, Germany
- Michael Hecker
  - Institute for Microbiology, Ernst-Moritz-Arndt-University Greifswald, Greifswald, Germany
- Susanne Engelmann
  - Institute for Microbiology, Ernst-Moritz-Arndt-University Greifswald, Greifswald, Germany; Institute for Microbiology, University of Braunschweig, Braunschweig, Germany; Helmholtz Center for Infection Research, Microbial Proteomics, Braunschweig, Germany
- Uwe Völker
  - Interfaculty Institute for Genetics and Functional Genomics, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- Olga Krysko
  - Upper Airways Research Laboratory, Ghent University, Ghent, Belgium
- Claus Bachert
  - Upper Airways Research Laboratory, Ghent University, Ghent, Belgium; Division of Ear, Nose, and Throat Diseases, Clintec, Karolinska Institute, Stockholm, Sweden
- Barbara M Bröker
  - Department of Immunology, University Medicine Greifswald, Greifswald, Germany
10
Kumar D, Mondal AK, Kutum R, Dash D. Proteogenomics of rare taxonomic phyla: A prospective treasure trove of protein coding genes. Proteomics 2015; 16:226-40. [PMID: 26773550 DOI: 10.1002/pmic.201500263] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/30/2015] [Revised: 09/18/2015] [Accepted: 09/28/2015] [Indexed: 01/04/2023]
Abstract
Sustainable innovations in sequencing technologies have resulted in a torrent of microbial genome sequencing projects. However, the prokaryotic genomes sequenced so far are unequally distributed along their phylogenetic tree; a few phyla contain the majority, the rest only a few representatives. Accurate genome annotation lags far behind genome sequencing. While automated computational prediction, aided by comparative genomics, remains a popular choice for genome annotation, a substantial fraction of these annotations are erroneous. Proteogenomics utilizes protein level experimental observations to annotate protein coding genes on a genome wide scale. Benefits of proteogenomics include discovery and correction of gene annotations regardless of their phylogenetic conservation. This not only allows detection of common, conserved proteins but also the discovery of protein products of rare genes that may be horizontally transferred or taxonomy specific. The chances of encountering such genes are higher in rare phyla that comprise a small number of complete genome sequences. We collated all bacterial and archaeal proteogenomic studies carried out to date and reviewed them in the context of genome sequencing projects. Here, we present a comprehensive list of microbial proteogenomic studies and their taxonomic distribution, and urge targeted proteogenomics of underexplored taxa to build an extensive reference of protein coding genes.
Affiliation(s)
- Dhirendra Kumar
  - G. N. Ramachandran Knowledge Center of Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Delhi, India
- Anupam Kumar Mondal
  - G. N. Ramachandran Knowledge Center of Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Delhi, India
- Rintu Kutum
  - G. N. Ramachandran Knowledge Center of Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Delhi, India
- Debasis Dash
  - G. N. Ramachandran Knowledge Center of Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Delhi, India
11
Kucharova V, Wiker HG. Proteogenomics in microbiology: taking the right turn at the junction of genomics and proteomics. Proteomics 2014; 14:2360-675. [PMID: 25263021 DOI: 10.1002/pmic.201400168] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 04/28/2014] [Revised: 08/18/2014] [Accepted: 09/23/2014] [Indexed: 12/14/2022]
Abstract
High-accuracy and high-throughput proteomic methods have completely changed the way we can identify and characterize proteins. MS-based proteomics can now provide a unique supplement to genomic data and add a new level of information to the interpretation of genomic sequences. Proteomics-driven genome annotation has become especially relevant in microbiology where genomes are sequenced on a daily basis and limitations of an in silico driven annotation process are well recognized. In this review paper, we outline different strategies on how one can design a proteogenomic experiment, for example on genome-sequenced (synonymous proteogenomics) versus unsequenced organisms (ortho-proteogenomics) or with the aid of other "omic" data such as RNA-seq. We touch upon many challenges that are encountered during a typical proteogenomic study, mostly concerning bioinformatics methods and downstream data analysis, but also related to creation and use of sequence databases. A large list of proteogenomic case studies of different microorganisms is provided to illustrate the mapping of MS/MS-derived peptide spectra to genomic DNA sequences. These investigations have led to accurate determination of translational initiation sites, pointed out eventual read-throughs or programmed frameshifts, detected signal peptide processing or other protein maturation events, removed questionable annotation assignments, and provided evidence for predicted hypothetical proteins.
Affiliation(s)
- Veronika Kucharova
- Department of Clinical Science, The Gade Research Group for Infection and Immunity, University of Bergen, Norway
|
12
|
Fleurbaaij F, Heemskerk AAM, Russcher A, Klychnikov OI, Deelder AM, Mayboroda OA, Kuijper EJ, van Leeuwen HC, Hensbergen PJ. Capillary-electrophoresis mass spectrometry for the detection of carbapenemases in (multi-)drug-resistant Gram-negative bacteria. Anal Chem 2014; 86:9154-61. [PMID: 25155175 DOI: 10.1021/ac502049p] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/15/2022]
Abstract
At a time when the spread of multidrug-resistant microorganisms is ever increasing, there is a need for fast and unequivocal identification of suspect organisms to supplement existing techniques in the clinical laboratory, especially in single bacterial colonies. Mass spectrometry coupled with efficient peptide separation techniques offers great potential for identification of resistance-related proteins in complex microbiological samples in an unbiased manner. Here, we developed a capillary electrophoresis-electrospray ionization-tandem mass spectrometry (CE-ESI-MS/MS) bottom-up proteomics workflow for sensitive and specific peptide analysis with the emphasis on the identification of β-lactamases (carbapenemases OXA-48 and KPC in particular) in bacterial species. For this purpose, tryptic peptides from whole cell lysates were analyzed by sheathless CE-ESI-MS/MS and proteins were identified after searching the spectral data against bacterial protein databases. The CE-ESI-MS/MS workflow was first evaluated using a recombinant TEM-1 β-lactamase, resulting in 68% of the amino acid sequence being covered by 20 different unique peptides. Subsequently, a resistant and a susceptible Escherichia coli lab strain were analyzed, and based on the observed β-lactamase peptides, the two strains could easily be discriminated. Finally, the method was tested in an unbiased setup using a collection of in-house characterized OXA-48 (n = 17) and KPC (n = 10) clinical isolates. The developed CE-ESI-MS/MS method was able to identify the presence of OXA-48 and KPC in all of the carbapenemase-positive samples, independent of species and degree of susceptibility. Four negative controls were tested and classified as negative by this method. Furthermore, a number of extended-spectrum beta-lactamases (ESBL) were identified in the same analyses, confirming the multiresistant character in 19 out of 27 clinical isolates. Importantly, the method performed equally well on protein lysates from single colonies. As such, it demonstrates CE-ESI-MS/MS as a potential next generation mass spectrometry platform within the clinical microbiology laboratory.
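The sequence coverage figure quoted in the abstract above (68% of TEM-1 covered by 20 unique peptides) is the fraction of residues that fall inside at least one identified peptide. A minimal sketch of that calculation follows; the function name `sequence_coverage` and the toy sequences are illustrative assumptions, not taken from the paper, and exact substring matching is assumed.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Fraction of protein residues covered by at least one peptide.

    Assumes exact substring matches; every occurrence of a peptide counts.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            # Look for further (possibly overlapping) occurrences.
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Hypothetical toy protein and peptides, for illustration only.
toy_protein = "MKTAYIAKQR"
toy_peptides = ["MKT", "AKQR"]
print(f"coverage = {sequence_coverage(toy_protein, toy_peptides):.0%}")
```

Overlapping peptides are counted once per residue, so coverage never exceeds 100%.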
Affiliation(s)
- Frank Fleurbaaij
- Department of Medical Microbiology, Section Experimental Microbiology, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
|
13
|
de Keijzer J, de Haas PE, de Ru AH, van Veelen PA, van Soolingen D. Disclosure of selective advantages in the "modern" sublineage of the Mycobacterium tuberculosis Beijing genotype family by quantitative proteomics. Mol Cell Proteomics 2014; 13:2632-45. [PMID: 25022876 DOI: 10.1074/mcp.m114.038380] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 01/08/2023] Open
Abstract
The Mycobacterium tuberculosis Beijing genotype, consisting of the more ancient (atypical) and the modern (typical) emerging sublineage, is one of the most prevalent and genetically conserved genotype families and has often been associated with multidrug resistance. In this study, we employed a 2D-LC-FTICR MS approach, combined with dimethylation of tryptic peptides, to systematically compare protein abundance levels of ancient and modern Beijing strains and identify differences that could be associated with successful spread of the modern sublineage. The data are available via ProteomeXchange using the identifier PXD000931. Despite the highly uniform protein abundance ratios in both sublineages, we identified four proteins as differentially regulated between the two sublineages, which could explain the apparent increased adaptation of the modern Beijing strains. These proteins are Rv0450c/MmpL4, Rv1269c, Rv3137, and Rv3283/sseA. Transcriptional and functional analysis of these proteins in a large cohort of 29 Beijing strains showed that the mRNA levels of Rv0450c/MmpL4 are significantly higher in modern Beijing strains, and we also provide evidence that Rv3283/sseA is less abundant in the modern Beijing sublineage. Our findings provide a possible explanation for the increased virulence and success of the modern Beijing sublineage. In addition, in the established dataset of 1817 proteins, we demonstrate the pre-existence of several, possibly unique, antibiotic efflux pumps in the proteome of the Beijing strains. This may reflect an increased ability of Beijing strains to escape exposure to antituberculosis drugs.
Affiliation(s)
- Jeroen de Keijzer
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Centre (LUMC), Leiden, 2300 RC, The Netherlands
- Petra E de Haas
- Tuberculosis Reference Laboratory, National Institute for Public Health and the Environment (RIVM), Bilthoven, 3720 BA, The Netherlands
- Arnoud H de Ru
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Centre (LUMC), Leiden, 2300 RC, The Netherlands
- Peter A van Veelen
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Centre (LUMC), Leiden, 2300 RC, The Netherlands
- Dick van Soolingen
- Tuberculosis Reference Laboratory, National Institute for Public Health and the Environment (RIVM), Bilthoven, 3720 BA, The Netherlands
- Departments of Pulmonary Diseases and Medical Microbiology, Radboud University Medical Centre, Nijmegen, 6500 HB, The Netherlands
|
14
|
Wu X, Xu L, Gu W, Xu Q, He QY, Sun X, Zhang G. Iterative Genome Correction Largely Improves Proteomic Analysis of Nonmodel Organisms. J Proteome Res 2014; 13:2724-34. [PMID: 24809469 DOI: 10.1021/pr500369b] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/18/2022]
Affiliation(s)
- Xiaohui Wu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
- Lina Xu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
- Wei Gu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
- Qian Xu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
- Qing-Yu He
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
- Xuesong Sun
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
- Gong Zhang
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Huang-Pu Avenue West 601, Guangzhou 510632, China
|
15
|
Abstract
Proteogenomics has the potential to advance genome annotation through high-quality peptide identifications derived from mass spectrometry experiments, which demonstrate that a given gene or isoform is expressed and translated at the protein level. This can advance our understanding of genome function, discovering novel genes and gene structure that have not yet been identified or validated. Because of the high-throughput shotgun nature of most proteomics experiments, it is essential to carefully control for false positives and prevent any potential misannotation. A number of statistical procedures to deal with this are in wide use in proteomics, calculating false discovery rate (FDR) and posterior error probability (PEP) values for groups and individual peptide spectrum matches (PSMs). These methods control for multiple testing and exploit decoy databases to estimate statistical significance. Here, we show that database choice has a major effect on these confidence estimates, leading to significant differences in the number of PSMs reported. We note that standard target:decoy approaches using six-frame translations of nucleotide sequences, such as assembled transcriptome data, apparently underestimate the confidence assigned to the PSMs. This error stems from the inflated and unusual nature of the six-frame database, where for every target sequence there exist five "incorrect" targets that are unlikely to code for protein. The attendant FDR and PEP estimates lead to fewer accepted PSMs at fixed thresholds, and we show that this effect is a product of the database and statistical modeling and not the search engine. A variety of approaches to limit database size and remove noncoding target sequences are examined and discussed in terms of the altered statistical estimates generated and PSMs reported. These results are of importance to groups carrying out proteogenomics, aiming to maximize the validation and discovery of gene structure in sequenced genomes, while still controlling for false positives.
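The target:decoy FDR thresholding that the abstract above refers to can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a concatenated target and decoy search of equal size, scores where higher is better, and the simple estimator FDR ≈ decoys / targets among PSMs at or above a score cutoff. The names `PSM` and `accepted_targets` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    score: float     # search-engine score; higher is assumed to be better
    is_decoy: bool   # True if the match is to a decoy sequence


def accepted_targets(psms: list[PSM], fdr_threshold: float = 0.01) -> int:
    """Count target PSMs in the largest score-ranked set whose estimated
    FDR (decoys / targets) does not exceed fdr_threshold."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = best = 0
    for psm in ranked:
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= fdr_threshold:
            best = targets  # deepest cutoff so far that still passes
    return best


# Toy data: five target hits and one decoy hit (illustrative values only).
toy_psms = [
    PSM(10.0, False), PSM(9.0, False), PSM(8.0, False),
    PSM(7.0, False), PSM(6.0, True), PSM(5.0, False),
]
print(accepted_targets(toy_psms, fdr_threshold=0.1))
```

Under this estimator, any search space that lets more decoys outscore true matches, as the abstract argues an inflated six-frame database does, pushes the cutoff higher and reduces the number of accepted PSMs at a fixed threshold.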
Affiliation(s)
- Paul Blakeley
- Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK
|
16
|
Pawar H, Sahasrabuddhe NA, Renuse S, Keerthikumar S, Sharma J, Kumar GSS, Venugopal A, Sekhar NR, Kelkar DS, Nemade H, Khobragade SN, Muthusamy B, Kandasamy K, Harsha HC, Chaerkady R, Patole MS, Pandey A. A proteogenomic approach to map the proteome of an unsequenced pathogen - Leishmania donovani. Proteomics 2012; 12:832-44. [DOI: 10.1002/pmic.201100505] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022]
Affiliation(s)
- Harsh Pawar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Rajiv Gandhi University of Health Sciences; Bangalore Karnataka India
- Nandini A. Sahasrabuddhe
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Manipal University; Madhav Nagar Manipal Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Santosh Renuse
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam Kerala India
- Jyoti Sharma
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Manipal University; Madhav Nagar Manipal Karnataka India
- Ghantasala. S. Sameer Kumar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Kuvempu University; Shimoga Karnataka India
- Abhilash Venugopal
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Kuvempu University; Shimoga Karnataka India
- Nirujogi Raja Sekhar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry India
- Dhanashree S. Kelkar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam Kerala India
- Harshal Nemade
- National Centre for Cell Sciences; Pune Maharashtra India
- Babylakshmi Muthusamy
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry India
- Kumaran Kandasamy
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- H. C. Harsha
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Raghothama Chaerkady
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Oncology; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Pathology; Johns Hopkins University School of Medicine; Baltimore MD USA
|
17
|
Tomazella GG, Risberg K, Mylvaganam H, Lindemann PC, Thiede B, de Souza GA, Wiker HG. Proteomic analysis of a multi-resistant clinical Escherichia coli isolate of unknown genomic background. J Proteomics 2012; 75:1830-7. [DOI: 10.1016/j.jprot.2011.12.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 10/19/2011] [Revised: 12/15/2011] [Accepted: 12/16/2011] [Indexed: 11/18/2022]
|
18
|
Abstract
Tuberculosis, the disease caused by Mycobacterium tuberculosis, remains a relevant public health issue. This is due mostly to the coepidemiology with HIV/AIDS, the appearance of multidrug-resistant strains globally, and the failure of BCG (bacillus Calmette-Guerin) vaccination to confer complete protection. This bacterium was one of the first to have its genome sequenced, yet over a decade after the release of the genomic information, the characterization of its phylogenetic tree and of different strain variants within the species revealed that much remains to be done for a full understanding of the M. tuberculosis genome and proteome. Current methods using LC-MS/MS and hybrid high-resolution mass spectrometers can identify 2400-2800 proteins of the 4000 predicted genes in M. tuberculosis. In this article, we review relevant details of this bacterium's pathology and immunology, describing articles where proteomics helped the community to address aspects of the organism's biology, including strain diversity, cellular structure composition, immunogenicity, and host-pathogen interactions. Finally, we discuss the challenges that remain in order to better characterize M. tuberculosis by proteomics.
Affiliation(s)
- Gustavo A de Souza
- The Gade Institute, Section for Microbiology and Immunology, University of Bergen, Bergen, Norway
|
19
|
Wiker HG, Tomazella GG, de Souza GA. A quantitative view on Mycobacterium leprae antigens by proteomics. J Proteomics 2011; 74:1711-9. [PMID: 21278007 DOI: 10.1016/j.jprot.2011.01.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 09/03/2010] [Revised: 12/09/2010] [Accepted: 01/10/2011] [Indexed: 11/29/2022]
Abstract
Leprosy is an ancient disease and the focus of the researchers' scrutiny for more than a century. However, many of the molecular aspects related to transmission, virulence, antigens and immune responses are far from known. Initially, the implementation of recombinant DNA library screens raised interesting antigen candidates. Finally, the availability of Mycobacterium leprae genomic information showed an intriguing genome reduction which is now largely used in comparative genomics. While predictive in silico tools are commonly used to identify possible antigens, proteomic approaches have not yet been explored fully to study M. leprae biology. Quantitative information obtained at the protein level, and its analysis as part of a complex system, would be a key feature to be used to help researchers to validate and understand many of such in silico predictions. Through a re-analysis of data from a previous publication of our group, we could easily tackle many questions regarding antigen prediction and pseudogene expression. Several well known antigens are among the quantitatively dominant proteins, while several major proteins have not been explored as antigens. We argue that combining proteomic approaches together with bioinformatic workflows is a required step in the characterization of important pathogens.
Affiliation(s)
- Harald G Wiker
- The Gade Institute, Section for Microbiology and Immunology, University of Bergen, Norway.
|
20
|
Målen H, De Souza GA, Pathak S, Søfteland T, Wiker HG. Comparison of membrane proteins of Mycobacterium tuberculosis H37Rv and H37Ra strains. BMC Microbiol 2011; 11:18. [PMID: 21261938 PMCID: PMC3033788 DOI: 10.1186/1471-2180-11-18] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Received: 09/27/2010] [Accepted: 01/24/2011] [Indexed: 01/24/2023] Open
Abstract
Background: The potential causes of variation in virulence between distinct M. tuberculosis strains are still not fully known. However, differences in protein expression are probably an important factor. In this study we used a label-free quantitative proteomic approach to estimate differences in protein abundance between two closely related M. tuberculosis strains: the virulent H37Rv strain and its attenuated counterpart H37Ra. Results: We were able to identify more than 1700 proteins from both strains. As expected, the majority of the identified proteins had similar relative abundance in the two strains. However, 29 membrane-associated proteins were observed with a 5-fold or greater difference in their relative abundance in one strain compared to the other. Of note, 19 membrane- and lipo-proteins had higher abundance in H37Rv, while another 10 proteins had a higher abundance in H37Ra. Interestingly, the possible protein-export membrane protein SecF (Rv2586c) and three ABC-transporter proteins (Rv0933, Rv1273c and Rv1819c) were among the more abundant proteins in M. tuberculosis H37Rv. Conclusion: Our data suggest that the bacterial secretion system and the transmembrane transport system may be important determinants of the ability of distinct M. tuberculosis strains to cause disease.
Affiliation(s)
- Hiwa Målen
- Section for Microbiology and Immunology, the Gade Institute, University of Bergen, Bergen, Norway
|
21
|
de Souza GA, Arntzen MØ, Fortuin S, Schürch AC, Målen H, McEvoy CRE, van Soolingen D, Thiede B, Warren RM, Wiker HG. Proteogenomic analysis of polymorphisms and gene annotation divergences in prokaryotes using a clustered mass spectrometry-friendly database. Mol Cell Proteomics 2010; 10:M110.002527. [PMID: 21030493 PMCID: PMC3013451 DOI: 10.1074/mcp.m110.002527] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 01/04/2023] Open
Abstract
Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact on further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied to any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover, we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains.
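The general idea of merging protein databases from several strains, as described in the abstract above, can be illustrated with a much simpler stand-in than MSMSpdbb itself: collapse identical sequences across strains and record which strain entries each sequence came from. This sketch is not the MSMSpdbb algorithm; the function name, the input layout, and the toy sequences and identifiers are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_protein_databases(
    databases: dict[str, dict[str, str]],
) -> dict[str, list[str]]:
    """Merge per-strain protein databases into one non-redundant set.

    databases maps strain name -> {protein id -> amino acid sequence}.
    Returns a mapping of each distinct sequence to the sorted list of
    "strain|protein_id" labels that carry it.
    """
    merged: dict[str, list[str]] = defaultdict(list)
    for strain, proteins in databases.items():
        for protein_id, sequence in proteins.items():
            # Identical sequences collapse into one entry; variants
            # (for example single amino acid changes) stay separate.
            merged[sequence.upper()].append(f"{strain}|{protein_id}")
    return {seq: sorted(sources) for seq, sources in merged.items()}


# Hypothetical toy input: one shared protein and one strain-specific variant.
toy_databases = {
    "strainA": {"A_0001": "MTDDPGSG"},
    "strainB": {"B_0001": "MTDDPGSG", "B_0002": "MTDEPGSG"},
}
for sequence, sources in merge_protein_databases(toy_databases).items():
    print(sequence, sources)
```

Keeping variant sequences as separate entries is what lets a search against the merged database report strain-specific peptides, at the cost of a somewhat larger search space.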
Affiliation(s)
- Gustavo A de Souza
- The Gade Institute, Section for Microbiology and Immunology, University of Bergen, N-5021 Bergen, Norway
|